004 Data processing; Computer science
Social networks are ubiquitous structures that we generate and enrich every day while connecting with people through social media platforms, emails, and any other type of interaction. While these structures are intangible to us, they carry important information. For instance, the political leaning of our friends can be a proxy to identify our own political preferences. Similarly, the credit score of our friends can be decisive in the approval or rejection of our own loans. This explanatory power is being leveraged in public policy, business decision-making and scientific research because it helps machine learning techniques to make accurate predictions. However, these generalizations often benefit the majority of people who shape the general structure of the network, and put under-represented groups at a disadvantage by limiting their resources and opportunities. Therefore, it is crucial to first understand how social networks form and then to verify to what extent their mechanisms of edge formation contribute to reinforcing social inequalities in machine learning algorithms.
To this end, in the first part of this thesis, I propose HopRank and Janus, two methods to characterize the mechanisms of edge formation in real-world undirected social networks. HopRank is a model of information foraging on networks. Its key component is a biased random walker based on transition probabilities between k-hop neighborhoods. Janus is a Bayesian framework that makes it possible to identify and rank plausible hypotheses of edge formation in cases where nodes possess additional information. In the second part of this thesis, I investigate the implications of these mechanisms, which explain edge formation in social networks, for machine learning. Specifically, I study the influence of homophily, preferential attachment, edge density, fraction of minorities, and the directionality of links on both performance and bias of collective classification, and on the visibility of minorities in top-k ranks. My findings demonstrate a strong correlation between network structure and machine learning outcomes. This suggests that systematic discrimination against certain people can be: (i) anticipated by the type of network, and (ii) mitigated by connecting strategically in the network.
In this thesis the possibilities for real-time visualization of OpenVDB
files are investigated. The basics of OpenVDB, its possibilities, as well
as NanoVDB and its GPU port, were studied. A system was developed
using PNanoVDB, the graphics API port of OpenVDB. Techniques were
explored to improve and accelerate a single ray approach of ray tracing.
To prove real-time capability, two single scattering approaches were
also implemented. One of these was selected, further investigated and
optimized to achieve interactive real-time rendering.
It is important to give artists immediate feedback on their adjustments, as
well as the possibility to change all parameters to ensure a user-friendly
creation process.
In addition to the optical rendering, corresponding benchmarks were
collected to compare different improvement approaches and to prove
their relevance. Attention was paid to the rendering times and memory
consumption on the GPU to ensure optimal use. A special focus, when
rendering OpenVDB files, was put on the integrability and extensibility of
the program to allow easy integration into an existing real-time renderer
like U-Render.
Semantic Web technologies have been recognized to be key for the integration of distributed and heterogeneous data sources on the Web, as they provide means to define typed links between resources in a dynamic manner and following the principles of dataspaces. The widespread adoption of these technologies in recent years has led to a large volume and variety of data sets published as machine-readable RDF data, which, once linked, constitute the so-called Web of Data. Given the large scale of the data, these links are typically generated by computational methods that, given a set of RDF data sets, analyze their content and identify the entities and schema elements that should be connected via the links. As with any other kind of data, in order to be truly useful and ready to be consumed, links need to comply with the criteria of high quality data (e.g., syntactically and semantically accurate, consistent, up-to-date). Despite the progress in the field of machine learning, human intelligence is still essential in the quest for high quality links: humans can train algorithms by labeling reference examples, validate the output of algorithms to verify their performance on a data set basis, as well as augment the resulting set of links. Humans, especially experts, however, have limited availability. Hence, extending data quality management processes from data owners/publishers to a broader audience can significantly improve the data quality management life cycle.
Recent advances in human computation and peer-production technologies have opened new avenues for human-machine data management techniques, making it possible to involve non-experts in certain tasks and providing methods for cooperative approaches. The research work presented in this thesis takes advantage of such technologies and investigates human-machine methods that aim at facilitating link quality management in the Semantic Web. Firstly, and focusing on the dimension of link accuracy, a method for crowdsourcing ontology alignment is presented. This method, also applicable to entities, is implemented as a complement to automatic ontology alignment algorithms. Secondly, novel measures for the dimension of information gain facilitated by the links are introduced. These entropy-centric measures provide data managers with information about the extent to which the entities in the linked data set gain information in terms of entity description, connectivity and schema heterogeneity. Thirdly, taking Wikidata (the most successful case of a linked data set curated, linked and maintained by a community of humans and bots) as a case study, we apply descriptive and predictive data mining techniques to study participation inequality and user attrition. Our findings and method can help community managers make decisions on when/how to intervene with user retention plans. Lastly, an ontology to model the history of crowd contributions across marketplaces is presented. While the field of human-machine data management poses complex social and technical challenges, the work in this thesis aims to contribute to the development of this still emerging field.
This paper describes the robot Lisa used by team homer@UniKoblenz of the University of Koblenz-Landau, Germany, to participate in RoboCup@Home 2016 in Leipzig, Germany. A special focus is put on novel system components and the open source contributions of our team. We have released packages for object recognition, a robot face including speech synthesis, mapping and navigation, a speech recognition interface via Android and a GUI. The packages are available (and new packages will be released) on http://wiki.ros.org/agas-ros-pkg.
Information systems research has recently started to use crowdsourcing platforms such as Amazon Mechanical Turk (MTurk) for scientific research. In particular, MTurk provides a scalable, cheap workforce that can also be used as a pool of potential respondents for online survey research. In light of the increasing use of crowdsourcing platforms for survey research, the authors aim to contribute to the understanding of its appropriate usage. Therefore, they assess whether samples drawn from MTurk deviate from those drawn via conventional online surveys (COS) in terms of answers in relation to relevant e-commerce variables and test the data in a nomological network for assessing differences in effects.
The authors compare responses from 138 MTurk workers with those of 150 German shoppers recruited via COS. The findings indicate, inter alia, that MTurk workers tend to exhibit more positive word-of-mouth, perceived risk, customer orientation and commitment to the focal company. The authors discuss the study results, point to limitations, and provide avenues for further research.
The aim of this paper is to identify and understand the risks and issues companies are experiencing from the business use of social media and to develop a framework for describing and categorising those social media risks. The goal is to contribute to the evolving theorisation of social media risk and to provide a foundation for the further development of social media risk management strategies and processes. The study findings identify thirty risk types organised into five categories (technical, human, content, compliance and reputational). A risk-chain is used to illustrate the complex interrelated, multi-stakeholder nature of these risks and directions for future work are identified.
The way information is presented to users in online community platforms has an influence on the way the users create new information. This is the case, for instance, in question-answering fora, crowdsourcing platforms or other social computation settings. To better understand the effects of presentation policies on user activity, we introduce a generative model of user behaviour in this paper. Running simulations based on this user behaviour we demonstrate the ability of the model to evoke macro phenomena comparable to the ones observed on real world data.
Modeling and publishing Linked Open Data (LOD) involves the choice of which vocabulary to use. This choice is far from trivial and poses a challenge to a Linked Data engineer. It covers the search for appropriate vocabulary terms, making decisions regarding the number of vocabularies to consider in the design process, as well as the way of selecting and combining vocabularies. To date, no study has investigated the different strategies of reusing vocabularies for LOD modeling and publishing. In this paper, we present the results of a survey with 79 participants that examines the most preferred vocabulary reuse strategies of LOD modeling. Participants of our survey are LOD publishers and practitioners. Their task was to assess different vocabulary reuse strategies and explain their ranking decision. We found significant differences between the modeling strategies that range from reusing popular vocabularies, minimizing the number of vocabularies, and staying within one domain vocabulary. A very interesting insight is that popularity in the sense of how frequently a vocabulary is used in a data source is more important than how often individual classes and properties are used in the LOD cloud. Overall, the results of this survey help in understanding the strategies by which data engineers reuse vocabularies, and they may also be used to develop future vocabulary engineering tools.
This paper presents a method for the evolution of SHI ABoxes which is based on a compilation technique of the knowledge base. For this the ABox is regarded as an interpretation of the TBox which is close to a model. It is shown that the ABox can be used for a semantically guided transformation resulting in an equisatisfiable knowledge base. We use the result of this transformation to efficiently delete assertions from the ABox. Furthermore, insertion of assertions as well as repair of inconsistent ABoxes is addressed. For the computation of the necessary actions for deletion, insertion and repair, the E-KRHyper theorem prover is used.
Large amounts of qualitative data make the utilization of computer-assisted methods for their analysis inevitable. In this thesis Text Mining as an interdisciplinary approach, as well as the methods established in the empirical social sciences for analyzing written utterances, are introduced. On this basis a process of extracting concept networks from texts is outlined and the possibilities of utilizing natural language processing methods within it are highlighted. The core of this process is text processing, for whose execution software solutions supporting manual as well as automated work are necessary. The requirements to be met by these solutions, against the background of the initiating project GLODERS, which is devoted to investigating extortion racket systems as part of the global financial system, are presented, and their fulfilment by the two most preeminent candidates is reviewed. The gap between theory and practical application is closed by a prototypical application of the method to a data set of the research project utilizing the two given software solutions.
This thesis describes the implementation of a path-planning algorithm for multi-axle vehicles using machine learning algorithms. For that purpose, a general overview of Genetic Algorithms is given and alternative machine learning algorithms are briefly explained. The software developed for this purpose is based on the EZSystem Simulation Software developed by the AG Echtzeitsysteme at the University of Koblenz-Landau and a path correction algorithm developed by Christian Schwarz, which is also detailed in this thesis. This also includes a description of the vehicle used in these simulations. Genetic Algorithms as a solution for path-planning in complex scenarios are then evaluated based on the results of the developed simulation software and compared to alternative, non-machine learning solutions, which are also briefly presented.
We present the conceptual and technological foundations of a distributed natural language interface employing a graph-based parsing approach. The parsing model developed in this thesis generates a semantic representation of a natural language query in a 3-staged, transition-based process using probabilistic patterns. The semantic representation of a natural language query is modeled in terms of a graph, which represents entities as nodes connected by edges representing relations between entities. The presented system architecture provides the concept of a natural language interface that is both independent in terms of the included vocabularies for parsing the syntax and semantics of the input query, as well as the knowledge sources that are consulted for retrieving search results. This functionality is achieved by modularizing the system's components, addressing external data sources by flexible modules which can be modified at runtime. We evaluate the system's performance by testing the accuracy of the syntactic parser, the precision of the retrieved search results as well as the speed of the prototype.
Iterative Signing of RDF(S) Graphs, Named Graphs, and OWL Graphs: Formalization and Application
(2013)
When publishing graph data on the web such as vocabularies using RDF(S) or OWL, one has only limited means to verify the authenticity and integrity of the graph data. Today's approaches require a high signature overhead and do not allow for an iterative signing of graph data. This paper presents a formally defined framework for signing arbitrary graph data provided in RDF(S), Named Graphs, or OWL. Our framework supports signing graph data at different levels of granularity: minimum self-contained graphs (MSG), sets of MSGs, and entire graphs. It supports iterative signing of graph data, e.g., when different parties provide different parts of a common graph, and allows for signing multiple graphs. Both can be done with a constant, low overhead for the signature graph, even when iteratively signing graph data.
Autonomous systems such as robots are already part of our daily life. In contrast to these machines, humans can react appropriately to their counterparts. People can hear and interpret human speech, and interpret facial expressions of other people.
This thesis presents a system for automatic facial expression recognition with emotion mapping. The system is image-based and employs feature-based feature extraction. This thesis analyzes the common steps of an emotion recognition system and presents state-of-the-art methods. The approach presented is based on 2D features. These features are detected in the face. No neutral face is needed as reference. The system extracts two types of facial parameters. The first type consists of distances between the feature points. The second type comprises angles between lines connecting the feature points. Both types of parameters are implemented and tested. The parameters which provide the best results for expression recognition are used to compare the system with state-of-the-art approaches. A multiclass Support Vector Machine classifies the parameters.
The results are codes of Action Units of the Facial Action Coding System. These codes are mapped to a facial emotion. This thesis addresses the six basic emotions (happy, surprised, sad, fearful, angry, and disgusted) plus the neutral facial expression. The system presented is implemented in C++ and is provided with an interface to the Robot Operating System (ROS).
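The two parameter types named in the abstract above (distances between feature points, and angles between lines connecting them) can be illustrated with a small sketch. This is only a minimal illustration in Python, not the thesis's C++ implementation; the function names and the sample landmark coordinates are hypothetical.

```python
import math

def distance(p, q):
    """Euclidean distance between two 2D feature points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_between_lines(p, q, r, s):
    """Angle in degrees (0 to 90) between the line through p, q and the line through r, s."""
    a1 = math.atan2(q[1] - p[1], q[0] - p[0])
    a2 = math.atan2(s[1] - r[1], s[0] - r[0])
    diff = abs(a1 - a2) % math.pi          # lines have no direction
    return math.degrees(min(diff, math.pi - diff))

# Hypothetical pixel coordinates of four facial landmarks.
mouth_left, mouth_right = (40.0, 80.0), (70.0, 80.0)
brow_inner, brow_outer = (45.0, 30.0), (30.0, 35.0)

features = [
    distance(mouth_left, mouth_right),
    angle_between_lines(mouth_left, mouth_right, brow_inner, brow_outer),
]
print(features)
```

A feature vector of this kind is what a classifier such as a multiclass Support Vector Machine would receive as input.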
This paper examines existing first aid applications for smartphones and compares them to a first aid application developed by the University of Koblenz called "Defi Now!". The main focus lies on examining "Defi Now!" with respect to its usability, based on the seven dialogue principles of software ergonomics according to the ISO 9241-110 standard. These are known as suitability for learning, controllability, error tolerance, self-descriptiveness, conformity with user expectations, suitability for the task, and suitability for individualization.
To this end, a usability study was conducted with 74 participants. A questionnaire was developed, which was to be filled out by the test participants anonymously. The test results were used to optimize the app with respect to its usability.
Various best practices and principles guide an ontology engineer when modeling Linked Data. The choice of appropriate vocabularies is one essential aspect in the guidelines, as it leads to better interpretation, querying, and consumption of the data by Linked Data applications and users.
In this paper, we present the various types of support features for an ontology engineer to model a Linked Data dataset, discuss existing tools and services with respect to these support features, and propose LOVER: a novel approach to support the ontology engineer in modeling a Linked Data dataset. We demonstrate that none of the existing tools and services incorporate all types of supporting features and illustrate the concept of LOVER, which supports the engineer by recommending appropriate classes and properties from existing and actively used vocabularies. Hereby, the recommendations are made on the basis of an iterative multimodal search. LOVER uses different, orthogonal information sources for finding terms, e.g. based on a best string match or schema information on other datasets published in the Linked Open Data cloud. We describe LOVER's recommendation mechanism in general and illustrate it along a real-life example from the social sciences domain.
Concept for a Knowledge Base on ICT for Governance and Policy Modelling regarding eGovPoliNet
(2013)
The EU project eGovPoliNet is engaged in research and development in the field of information and communication technologies (ICT) for governance and policy modelling. Numerous communities pursue similar goals in this field of IT-based, strategic decision making and simulation of social problem areas. However, the existing research approaches and results are so far quite fragmented. The aim of eGovPoliNet is to overcome the fragmentation across disciplines and to establish an international, open dialogue by fostering the cooperation between research and practice. This dialogue will advance the discussion and development of various problem areas with the help of researchers from different disciplines, who share knowledge, expertise and best practice supporting policy analysis, modelling and governance. To support this dialogue, eGovPoliNet will provide a knowledge base, whose conceptual development is the subject of this thesis. The knowledge base is to be filled with content from the area of ICT for strategic decision making and social simulation, such as publications, ICT solutions and project descriptions. This content needs to be structured, organised and managed in such a way that it generates added value and the knowledge base is used as a source of accumulated knowledge, which consolidates the previously fragmented research and development results in a central location.
The aim of this thesis is the development of a concept for a knowledge base, which provides the structure and the necessary functionalities to gather and process knowledge concerning ICT solutions for governance and policy modelling. This knowledge needs to be made available to users and thereby motivate them to contribute to the development and maintenance of the knowledge base.
E-KRHyper is a versatile theorem prover and model generator for first-order logic that natively supports equality. Inequality of constants, however, has to be given by explicitly adding facts. As the number of these facts grows quadratically in the number of distinct constants, the knowledge base is blown up. This makes it harder for a human reader to focus on the actual problem, and impairs the reasoning process. We extend the E-hyper tableau calculus underlying E-KRHyper to avoid this blow-up by implementing a native handling for inequality of constants. This is done by introducing the unique name assumption for a subset of the constants (the so-called distinct object identifiers). The obtained calculus is shown to be sound and complete and is implemented into the E-KRHyper system. Synthetic benchmarks, situated in the theory of arrays, are used to back up the benefits of the new calculus.
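To make the quadratic growth mentioned in the abstract above concrete: stating pairwise inequality explicitly for n distinct constants takes one fact per unordered pair, i.e. n(n-1)/2 facts. The following is a minimal Python sketch; the function name and the `a != b` notation are illustrative assumptions and are not E-KRHyper input syntax.

```python
from itertools import combinations

def inequality_facts(constants):
    """Return one explicit inequality fact per unordered pair of constants."""
    return [f"{a} != {b}" for a, b in combinations(constants, 2)]

for n in (10, 100, 1000):
    constants = [f"c{i}" for i in range(n)]
    # The number of facts equals n * (n - 1) / 2.
    print(n, len(inequality_facts(constants)))
```

Under a unique name assumption for these constants, none of the facts would need to be written down.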
Dualizing marked Petri nets results in tokens for transitions (t-tokens). A marked transition can strictly not be enabled, even if there are sufficient "enabling" tokens (p-tokens) on its input places. On the other hand, t-tokens can be moved by the firing of places. This permits flows of t-tokens which describe sequences of non-events. Their benefit to simulation is the possibility to model (and observe) causes and effects of non-events, e.g. if something is broken down.
In this paper, we demonstrate by means of two examples how to work with probability propagation nets (PPNs). The first, which comes from the book by Peng and Reggia [1], is a small example of medical diagnosis. The second one comes from [2]. It is an example of operational risk and is to show how the evidence flow in PPNs gives hints to reduce high losses. In terms of Bayesian networks, both examples contain cycles which are resolved by the conditioning technique [3].
The paper deals with a specific introduction into probability propagation nets. Starting from dependency nets (which in a way can be considered the maximum information which follows from the directed graph structure of Bayesian networks), the probability propagation nets are constructed by joining a dependency net and (a slightly adapted version of) its dual net. Probability propagation nets are the Petri net version of Bayesian networks. In contrast to Bayesian networks, Petri nets are transparent and easy to operate. The high degree of transparency is due to the fact that every state in a process is visible as a marking of the Petri net. The convenient operability consists in the fact that there is no algorithm apart from the firing rule of Petri net transitions. Besides the structural importance of the Petri net duality there is a semantic matter: common sense in the form of probabilities and evidence-based likelihoods are dual to each other.
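The firing rule referred to in the abstract above can be sketched for an ordinary place/transition net. This is only the basic rule with integer token counts; probability propagation nets carry richer tokens, which this sketch does not model. All names are hypothetical.

```python
def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in pre.items())

def fire(marking, pre, post):
    """Return the marking reached by firing a transition with the given arc weights."""
    if not enabled(marking, pre):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in pre.items():           # consume tokens from input places
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in post.items():          # produce tokens on output places
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking

# Hypothetical net: one transition moving a token from "cause" to "effect".
start = {"cause": 1, "effect": 0}
print(fire(start, pre={"cause": 1}, post={"effect": 1}))
```

Every intermediate state is an explicit marking, which is the transparency property the abstract points to.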
Virtual Goods + ODRL 2012
(2012)
This is the 10th international workshop for technical, economic, and legal aspects of business models for virtual goods incorporating the 8th ODRL community group meeting. This year we did not call for completed research results, but we invited PhD students to present and discuss their ongoing research work. In the traditional international group of virtual goods and ODRL researchers we discussed PhD research from Belgium, Brazil, and Germany. The topics focused on research questions about rights management in the Internet and e-business stimulation. In the center of rights management stands the conception of a formal policy expression that can be used for human readable policy transparency, as well as for machine readable support of policy conformant systems behavior up to automatic policy enforcement. ODRL has proven to be an ideal basis for policy expressions, not only for digital copy rights, but also for the more general "Policy Awareness in the World of Virtual Goods". In this sense, policies support the communication of virtual goods, and they are a virtualization of rules-governed behavior themselves.
Given the rapidly growing amount of data produced every year and the increasing acceptance of Enterprise 2.0, enterprises have to care more and more about the management of their data. Content created and stored in an uncoordinated manner can lead to data silos (Williams & Hardy 2011, p.57), which result in long search times, inaccessible data and, in consequence, monetary losses. The "expanding digital universe" forces enterprises to develop new archiving solutions and records management policies (Gantz et al. 2007, p.13). Enterprise Content Management (ECM) is the research field that deals with these challenges. It is placed in the scientific context of Enterprise Information Management. This thesis aims to find out to what extent current Enterprise Content Management Systems (ECMS) support these new requirements, especially concerning the archiving of Enterprise 2.0 data. For this purpose, three scenarios were created to evaluate two different kinds of ECMS (one open source and one proprietary system) chosen on the basis of a short market research. The application of the scenarios reveals that the system vendors actually face the industry's concerns: both tools provide functionality for the archiving of data arising from online collaboration and also business records management capabilities, but the integration of those topics is not, or is only inconsistently, solved. At this point new questions, such as "Which data generated in an Enterprise 2.0 is worth being a record?", arise and should be examined in future research.
Procedural content generation, the generation of video game content using pseudo-random algorithms, is a field of increasing business and academic interest due to its suitability for reducing development time and cost as well as the possibility of creating interesting, unique game spaces. Although many contemporary games feature procedurally generated content, the author perceived a lack of games using this approach to create realistic outer-space game environments, and the feasibility of employing procedural content generation in such a game was examined. Using current scientific models, a real-time astronomical simulation was developed in Python which procedurally generates star and planet objects in a fictional galaxy to serve as the game space of a simple 2D space exploration game where the player has to search for intelligent life.
Schema information about resources in the Linked Open Data (LOD) cloud can be provided in two ways: it can be defined explicitly by attaching RDF types to the resources, or it can be provided implicitly via the definition of the resources' properties.
In this paper, we analyze the correlation between the two sources of schema information. To this end, we have extracted schema information regarding the types and properties defined in two datasets of different size. One dataset is a LOD crawl from TimBL's FOAF profile (11 million triples) and the second is an extract from the Billion Triples Challenge 2011 dataset (500 million triples). We have conducted an in-depth analysis and have computed various entropy measures as well as the mutual information encoded in these two manifestations of schema information.
Our analysis provides insights into the information encoded in the different schema characteristics. It shows that a schema based on either types or properties alone will capture only about 75% of the information contained in the data. From these observations, we derive conclusions about the design of future schemas for LOD.
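As an illustration of the kind of measures named in the abstract above, the following Python sketch computes Shannon entropy and mutual information for the joint distribution of type sets and property sets. It uses the standard textbook definitions; the exact measures used in the paper are not given here, and the toy resources are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a distribution given as a Counter of frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)

def mutual_information(pairs):
    """I(T;P) = H(T) + H(P) - H(T,P) over observed (type set, property set) pairs."""
    joint = Counter(pairs)
    types = Counter(t for t, _ in pairs)
    props = Counter(p for _, p in pairs)
    return entropy(types) + entropy(props) - entropy(joint)

# Hypothetical toy data: one (type set, property set) pair per resource.
resources = [
    ("Person", "name,knows"),
    ("Person", "name,knows"),
    ("Person", "name"),
    ("Document", "title"),
]
print(mutual_information(resources))
```

In this toy data the property set fully determines the type, so the mutual information equals the entropy of the type distribution.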
Magnetic resonance (MR) tomography is an imaging method that is used to expose the structure and function of tissues and organs in the human body for medical diagnosis. Diffusion weighted (DW) imaging is a specific MR imaging technique, which enables us to gain insight into the connectivity of white matter pathways noninvasively and in vivo. It allows for making predictions about the structure and integrity of those connections. In clinical routine this modality finds application in the planning phase of neurosurgical operations, such as in tumor resections. This is especially helpful if the lesion is deeply seated in a functionally important area, where there is a risk of damage. This work reviews the concepts of MR imaging and DW imaging. Generally, at the current resolution of diffusion weighted data, single white matter axons cannot be resolved. The captured signal rather describes whole fiber bundles. Besides this, complex fiber configurations such as crossings, splittings and fannings often occur within a single voxel. For this reason, the main goal is to assist tractography algorithms, which are often confounded in such complex regions. Tractography is a method which uses local information to reconstruct global connectivities, i.e. fiber tracts. In the course of this thesis, existing reconstruction methods such as diffusion tensor imaging (DTI) and q-ball imaging (QBI) are evaluated on synthetically generated data and real human brain data, and the amount of valuable information provided by the individual reconstruction methods and their corresponding limitations are investigated. The output of QBI is the orientation distribution function (ODF), whose local maxima coincide with the underlying fiber architecture. We determine those local maxima. Furthermore, we propose a new voxel-based classification scheme conducted on diffusion tensor metrics.
The main contribution of this work is the combination of voxel-based classification, local maxima from the ODF and global information from a voxel's neighborhood, which leads to the development of a global classifier. This classifier validates the detected ODF maxima and enhances them with neighborhood information. Hence, specific asymmetric fibrous architectures can be determined. The outcome of the global classifier is a set of potential tracking directions. Subsequently, a fiber tractography algorithm is designed that integrates along the potential tracking directions and is able to reproduce splitting fiber tracts.
The Multimedia Metadata Ontology (M3O) provides a generic modeling framework for representing multimedia metadata. It has been designed based on an analysis of existing metadata standards and metadata formats. The M3O abstracts from the existing metadata standards and formats and provides generic modeling solutions for annotations, decompositions, and provenance of metadata. Being a generic modeling framework, the M3O aims at integrating the existing metadata standards and metadata formats rather than replacing them. This is in particular useful as today's multimedia applications often need to combine and use more than one existing metadata standard or metadata format at the same time. However, applying and specializing the abstract and powerful M3O modeling framework in concrete application domains and integrating it with existing metadata formats and metadata standards is not always straightforward. Thus, we have developed a step-by-step alignment method that describes how to integrate existing multimedia metadata standards and metadata formats with the M3O in order to use them in a concrete application. We demonstrate our alignment method by integrating seven different existing metadata standards and metadata formats with the M3O and describe the experiences made during the integration process.
In this thesis the feasibility of a GPGPU (general-purpose computing on graphics processing units) approach to natural feature description on mobile phone GPUs is assessed. To this end, the SURF descriptor [4] has been implemented with OpenGL ES 2.0/GLSL ES 1.0 and evaluated across different mobile devices. The implementation is multiple times faster than a comparable CPU variant on the same device. The results prove the feasibility of modern mobile graphics accelerators for GPGPU tasks, especially for the detection phase in natural feature tracking used in augmented reality applications. Extensive analysis and benchmarking of this approach in comparison to state-of-the-art methods have been undertaken. Insights into the modifications necessary to adapt the SURF algorithm to the limitations of a mobile GPU are presented. Further, an outlook for a GPGPU-based tracking pipeline on a mobile device is provided.
Particle swarm optimization is an optimization technique based on simulation of the social behavior of swarms.
The goal of this thesis is to solve 6DOF local pose estimation using a modified particle swarm technique introduced by Khan et al. in 2010. Local pose estimation is achieved by using continuous depth and color data from an RGB-D sensor. Datasets are acquired from different camera poses and registered into a common model. Accuracy and computation time of the implementation are compared to state-of-the-art algorithms and evaluated in different configurations.
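To make the underlying optimization technique concrete, here is a minimal sketch of a standard global-best particle swarm in Python. It is not the modified variant by Khan et al. that the thesis uses, and the function name `pso_minimize`, the parameter defaults and the test objective are assumptions chosen for illustration only.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO (illustrative; not the Khan et al. variant)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random start positions inside the search box, zero start velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]      # best position seen by each particle
    best_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards the personal best + pull towards the swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


if __name__ == "__main__":
    # Stand-in objective: squared distance from the origin in six dimensions.
    sphere = lambda x: sum(v * v for v in x)
    print(pso_minimize(sphere, dim=6, bounds=(-5.0, 5.0)))
```

For pose estimation, each particle would presumably hold six parameters (three for translation, three for rotation), and `f` would be a registration error between the current RGB-D frame and the model. That mapping is an assumption here; the abstract does not specify it.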
The natural and the artificial environment of mankind is of enormous complexity, and our means of understanding this complex environment are restricted unless we make use of simplified (but not oversimplified) dynamical models with the help of which we can explicate and communicate what we have understood in order to discuss among ourselves how to re-shape reality according to what our simulation models make us believe to be possible. Being both a science and an art, modelling and simulation is still one of the core tools of extended thought experiments, and its use is still spreading into new application areas, particularly as the increasing availability of massive computational resources allows for simulating more and more complex target systems.
In the early summer of 2012, the 26th European Conference on Modelling and Simulation (ECMS) once again brings together the best experts and scientists in the field to present their ideas and research, and to discuss new challenges and directions for the field.
The 2012 edition of ECMS includes three new tracks, namely Simulation-Based Business Research, Policy Modelling and Social Dynamics and Collective Behaviour, and extended the classical Finance and Economics track with Social Science. It attracted more than 110 papers, 125 participants from 21 countries and backgrounds ranging from electrical engineering to sociology.
This book was inspired by the event, and it was prepared to compile the most recent concepts, advances, challenges and ideas associated with modelling and computer simulation. It contains all papers carefully selected from the large number of submissions by the programme committee for presentation during the conference and is organised according to the still growing number of tracks which shaped the event. The book is complemented by two invited pieces from other experts that discussed an emerging approach to modelling and a specialised application.
We hope these proceedings will serve as a reference to researchers and practitioners in the ever growing field as well as an inspiration to newcomers to the area of modelling and computer simulation. The editors are honoured and proud to present you with this carefully compiled selection of topics and publications in the field.
The purpose of this master thesis is to enable the robot Lisa to process complex commands and extract the necessary information in order to perform a complex task as a sequence of smaller tasks. This is to be achieved by improving Lisa's understanding of her environment through adding semantics to the maps that she builds. The complex command itself is expected to be already parsed; the way the input is processed to become a parsed command is therefore outside the scope of this work. Maps that Lisa builds will be improved by the addition of semantic annotations that can include any kind of information that might be useful for the performance of generic tasks. This can include (but is not necessarily limited to) hierarchical classifications of locations, objects and surfaces. The processing of the command, together with some information about the environment, shall trigger the performance of a sequence of actions. These actions are expected to be among Lisa's currently implemented tasks and will rely on the existing modules that perform them.
Nevertheless, the aim of this work is not only to be able to use currently implemented tasks in a more complex sequence of actions but also to make it easier to add new tasks to the complex commands that Lisa can perform.
The World Wide Web (WWW) has become a very important communication channel, and its usage has grown steadily over the years. Website owners have been interested in identifying user behaviour since Tim Berners-Lee developed the first web browser in 1990. But as the influence of the online channel today eclipses all other media, the interest in monitoring website usage and user activities has intensified as well. Gathering and analysing data about the usage of websites can help to understand customer behaviour, improve services and potentially increase profit.
It is further essential for ensuring effective website design and management, efficient mass customization and effective marketing. Web Analytics (WA) is the area addressing these considerations. However, changing technologies and evolving Web Analytic methods and processes present a challenge to organisations starting with Web Analytic programmes. Because they lack resources in various areas and operate other types of websites, small and medium-sized enterprises (SMEs) and non-profit organisations in particular struggle to operate WA effectively.
This research project aims to identify the existing gap between theory, tool possibilities and business needs for undertaking Web Analytic programmes. The topic was therefore examined from three different perspectives: the academic literature, Web Analytic tools and an interpretative case study. The researcher used an action research approach to investigate Web Analytics, presenting a holistic overview and identifying the gaps that exist. The outcome of this research project is an overall framework that provides guidance for SMEs operating information websites on how to proceed in a Web Analytic programme.
In this paper, we compare two approaches for exploring large, hierarchical data spaces of social media data on mobile devices using facets. While the first approach arranges the facets in a 3x3 grid, the second approach makes use of a scrollable list of facets for exploring the data. We have conducted a between-group experiment of the two approaches with 24 subjects (20 male, 4 female) executing the same set of tasks representing typical mobile users' information needs. The results show that the grid-based approach requires significantly more clicks, but subjects need less time for completing the tasks. Furthermore, they show that the additional clicks do not hamper the subjects' satisfaction. Thus, the results suggest that the grid-based approach is a better choice for faceted search on touchscreen mobile devices. To the best of our knowledge, such a summative evaluation of different approaches for faceted search on mobile devices has not been done so far.
Web-programming is a huge field of different technologies and concepts. Each technology implements a web-application requirement such as content generation or client-server communication. Different technologies within one application are organized by concepts, for example architectural patterns. The thesis describes an approach for creating a taxonomy of these web-programming components using the free encyclopaedia Wikipedia. Our 101companies project uses implementations to identify and classify the different technology sets and concepts behind a web-application framework. These classifications can be used to create taxonomies and ontologies within the project. The thesis also describes how we prioritize useful web-application frameworks with the help of Wikipedia. Finally, the created implementations concerning web-programming are documented.
The goal of this Bachelor thesis is to implement and evaluate the "Simulating of Collective Misbelief" model in the NetLogo programming language. To this end, the model requirements have to be specified and implemented in the NetLogo environment. Further tool-related requirements have to be specified to enable the model to work in NetLogo. After implementation, several simulations will be conducted to answer the research question stated above.
Software projects typically rely on several external libraries. The interface provided by such a library is called an API (application programming interface). APIs often evolve over time, which implies the need to adapt applications that use them. There are also reasons that may call for the replacement of one library by another, which likewise results in a need to adapt the applications in which the library is replaced. The process of adapting applications to use a different API is called API migration. Doing API migration manually is a cumbersome task. Automated API migration is an active research field. A related field of research is API analysis, which can also provide data for developing API migration tools.
The following thesis investigates techniques and technologies for API analysis and API migration frameworks. To this end, design patterns are leveraged. These patterns are based on experience with API analysis and migration within the Software Languages Team.
In automated theorem proving, there are some problems that need information on the inequality of certain constants. In most cases this information is provided by adding facts which explicitly state that two constants are unequal. Depending on the number of constants, a huge number of these facts can clutter the knowledge base and distract the author and readers of the problem from its actual proposition. In most cases it is safe to assume that a larger knowledge base reduces the performance of a theorem prover, which is another drawback of explicit inequality facts. Using the unique name assumption in those reasoning tasks renders the introduction of inequality facts obsolete, as the unique name assumption states that two constants are identical iff their interpretation is identical. Implicit handling of non-identical constants makes the problems easier to comprehend and reduces the execution time of reasoning. In this thesis we will show how to integrate the unique name assumption into the E-hyper tableau calculus and that the modified calculus is sound and complete. The calculus will be implemented in the E-KRHyper theorem prover and we will show, by empirical evaluation, that the changed implementation, which is able to use the unique name assumption, is superior to the traditional version of E-KRHyper.
In this thesis I describe the results of my investigation into extending the LogAnswer system with user-specific profile information. LogAnswer is a natural-language, open-domain question answering system; that is, it answers questions on arbitrary topics and returns concrete answers that are as concise and correct as possible. The system is being developed in a joint project of the Artificial Intelligence working group of Professor Ulrich Furbach at the Universität Koblenz-Landau and the Intelligent Information and Communication Systems (IICS) working group of Professor Hermann Helbig at the Fernuniversität Hagen. The motivation for my work was the idea that the answer-finding process can be optimized if the topic area targeted by a question can be determined in advance. To this end, I tried to determine users' areas of interest from profile information. The Semantic Desktop system NEPOMUK was used to obtain this profile information. NEPOMUK is used to structure all the data, documents and information that a user has on their computer. For this purpose the system uses a so-called Personal Information Model (PIMO) in the form of an ontology. Among other things, this ontology contains a class "Topic", which formed the most important basis for building the user profiles used in my work. Specifically, the RDF query language SPARQL was used to filter a list of all topics relevant to the user out of the ontology. The central idea of my work was then to use this profile information to optimize the ranking of answer candidates. For every question asked, LogAnswer extracts up to 200 potentially relevant text passages from the German Wikipedia. These passages are ordered on the basis of features (such as lexical overlap between question and passage), since not all candidates can be processed within the available time limit.
My approach aimed to extend this algorithm with user profiles so that answer candidates containing information relevant to the user are placed higher in the ranking. To implement this idea, a method had to be found for determining whether an answer candidate matches the profile. Since the information in a text passage usually relates to the overarching topic of the article without explicitly mentioning the article's name, my implementation used the article name to determine the topic area about which the passage provides information. As an additional aid, the DBpedia ontology was used, which contains the information of Wikipedia in structured form in RDF format. With the help of this ontology it was possible to assign each article to categories, which were then compared with the keywords contained in the profile. To study the effect of the approach on the ranking procedure, several test runs with 200 test questions each were carried out. The first test set consisted of randomly selected questions, which were tested with my own user profile. This test run produced hardly any usable results, since for only 29 of the tested questions could an answer candidate be linked to the profile at all. Moreover, a potential improvement of the results was observed for only one of these 29 questions, which led to the conclusion that the use of profile data is not suitable for use cases in which the questions show no correlation with the profile used.
Since the basic assumption of my work was that users primarily ask questions about the areas of interest that can be derived from their profile, the further test runs were meant to examine exactly this case. For this purpose, 200 test questions from the domain of sports were selected and tested with a profile containing keywords on various sports. The tests with the sports questions were considerably more informative. Here, too, the results indicated that the approach has no great potential for improving the ranking. A closer look at some selected examples showed, however, that integrating profile data can indeed improve results for certain use cases, such as open questions for which there is more than one correct answer. It was also found that many of the poor results stem from inconsistencies in the DBpedia ontology and from fundamental problems in dealing with natural-language knowledge bases.
The conclusion of my work is that the approach to integrating profile information presented here is not suitable for LogAnswer's current use case, since it is mainly factual knowledge from very different domains that is queried and open questions make up only a small share.
Robotics research today is primarily about enabling autonomous, mobile robots to seamlessly interact with arbitrary, previously unknown environments. One of the most basic problems to be solved in this context is the question of where the robot is, and what the world around it and in previously visited places looks like: the so-called simultaneous localization and mapping (SLAM) problem. We present a GraphSLAM system, which is a graph-based approach to this problem. This system consists of a frontend and a backend. The frontend's task is to incrementally construct a graph from the sensor data that models the spatial relationship between measurements. These measurements may contradict each other, and therefore the graph is inconsistent in general. The backend is responsible for optimizing this graph, i.e. finding a configuration of the nodes that is least contradictory. The nodes represent poses, which do not form a regular vector space due to the rotations they contain. We respect this fact by treating them as what they really are mathematically: manifolds. This leads to a very efficient and elegant optimization algorithm.
Augmented reality means extending a real environment with virtual, mostly graphical, content. Often, however, the virtual content of the scene is merely an overlay and does not interact with the real components of the scene. This creates an authenticity problem for augmented reality applications. This thesis considers augmented reality in a special environment that makes a more authentic presentation possible. The goal of this thesis was to build a system that extends drawings with virtual content using augmented reality techniques. By building a representation, the application should be able to let the virtual scene elements interact with the drawing. To this end, various methods from the fields of pose tracking and sketch recognition were discussed and selected for implementation in a prototype system. The target hardware is an Android smartphone. The context of the drawings is a dungeon map of the kind found in role-playing games. The virtual content takes the form of inhabitants of the dungeon, which are managed by an agent simulation. The agent simulation is the subject of a separate diploma thesis [18]. For pose tracking, ARToolkitPlus was used, an optical tracking system that works on the basis of markers. Sketch recognition is responsible for recognizing and interpreting the content of the drawing. For this, a custom approach was implemented that combines techniques from various sketch recognition systems. The evaluation focuses on the technical aspects of the system that are important for an authentic extension of the drawing with virtual content.
Little information is available about the diffusion of cloud computing in German higher educational institutions. A better understanding of the state of the art in this field would support the modernization of the higher educational institutions in Germany and allow the development of more adequate cloud products and more appropriate business models for this niche. For this purpose, a literature review on Cloud Computing and IT diffusion will be conducted, and an empirical investigation with an online questionnaire addressed to higher educational institutions in Germany will be performed. Together they are intended to illustrate the state of the art of Cloud Computing in German higher educational institutions as well as the threats and opportunities that employees of the institutions' data centers associate with the usage of the cloud.
In addition, experts from universities and businesses will be interviewed to complement the knowledge and information collected through the online questionnaire and during the research phase. The expected results will serve to create a recommendation for higher educational institutions in Germany on whether or not they should migrate to the cloud, and to introduce a list of guiding questions on critical issues to consider before using cloud-computing technologies.
In this thesis we exercise a wide variety of libraries, frameworks and other technologies that are available for the Haskell programming language. We show various applications of Haskell in real-world scenarios and contribute implementations and taxonomy entities to the 101companies system. That is, we cover a broad range of the 101companies feature model and define related terms and technologies. The implementations illustrate how different language concepts of Haskell, such as a very strong typing system, polymorphism, higher-order functions and monads, can be effectively used in the development of information systems. In this context we demonstrate both advantages and limitations of different Haskell technologies.
Distance vector routing protocols are interior gateway protocols in which every router sets up a routing table with the help of the information it receives from its neighboring routers. The routing table contains the next hops and associated distances on the shortest paths to every other router in the network. Security mechanisms implemented in distance vector routing protocols are insufficient. It is rather assumed that the environment is trustworthy. However, routers can be malicious for several reasons and manipulate routing by injecting false routing updates. Authenticity and integrity of transmitted routing updates have to be guaranteed and at the same time performance and benefits should be well-balanced.
In this paper several approaches that aim at meeting the above mentioned conditions are examined and their advantages and disadvantages are compared.
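The table-building step described above, and why it depends on trustworthy neighbours, can be illustrated with a short Python sketch. The function `dv_update`, the table layout and the router names are hypothetical and not taken from the paper; the sketch shows a plain Bellman-Ford style update with no authentication of any kind.

```python
INF = float("inf")


def dv_update(table, neighbor, link_cost, advertised):
    """Merge one neighbour's advertised distances into a routing table.

    table:      dict mapping destination -> (distance, next_hop)
    advertised: dict mapping destination -> distance as claimed by `neighbor`
    Returns True if any entry changed. Nothing is verified: whatever the
    neighbour claims is taken at face value.
    """
    changed = False
    for dest, dist in advertised.items():
        candidate = link_cost + dist
        current_dist, current_hop = table.get(dest, (INF, None))
        # Accept a shorter path, or any change reported by the current next hop.
        if candidate < current_dist or (current_hop == neighbor
                                        and candidate != current_dist):
            table[dest] = (candidate, neighbor)
            changed = True
    return changed


if __name__ == "__main__":
    table = {"A": (0, "A")}                        # router A knows only itself
    dv_update(table, "B", 1, {"A": 1, "B": 0, "C": 2})
    print(table)                                   # C is reached via B
    # A malicious neighbour M falsely claims to sit right next to C.
    dv_update(table, "M", 1, {"C": 0})
    print(table)                                   # traffic for C now goes via M
```

The second update shows the attack the abstract refers to: a single false advertisement is enough to redirect traffic, which is why authenticity and integrity of routing updates matter.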
Cloud Computing is a topic that has gained momentum in recent years. Current studies show that an increasing number of companies are evaluating the promised advantages and considering making use of cloud services. In this paper we investigate the phenomenon of cloud computing and its importance for the operation of ERP systems. We argue that the phenomenon of cloud computing could lead to a decisive change in the way business software is deployed in companies. Our reference framework contains three levels (IaaS, PaaS, SaaS) and clarifies the meaning of public, private and hybrid clouds. The three levels of cloud computing and their impact on ERP systems operation are discussed. From the literature we identify areas for future research and propose a research agenda.
This paper describes results of simulating social objects, specifically the dependence of schoolchildren's professional abilities on their personal characteristics. The simulation tool is the artificial neural network (ANN) technology. Results of a comparison of the time required for training the ANN and for calculating the weight coefficients with serial and parallel algorithms, respectively, are presented.
The number of multiplication and addition operations needed for training artificial neural networks by means of serial and parallel algorithms on a computer cluster is estimated, and the efficiency of these algorithms is evaluated. The multilayer perceptron, the Volterra network and the cascade-correlation network are used as structures of artificial neural networks. Different methods of non-linear programming, such as gradient and non-gradient methods, are used for the calculation of the weight coefficients.
Identifying reusable legacy code able to implement SOA services is still an open research issue. This master thesis presents an approach to identify legacy code for service implementation based on dynamic analysis and the application of data mining techniques.
As part of the SOAMIG project, code execution traces were mapped to business processes. Due to the large number of traces generated by dynamic analyses, the traces must be post-processed in order to provide useful information.
For this master thesis, two data mining techniques, cluster analysis and link analysis, were applied to the traces. First tests on a Java/Swing legacy system provided good results compared to an expert's allocation of legacy code.
MapReduce with Deltas
(2011)
The MapReduce programming model is extended slightly in order to use deltas. Because many MapReduce jobs are re-executed over slightly changing input, processing only those changes promises significant improvements. Reduced execution time allows for more frequent execution of tasks, yielding more up-to-date results in practical applications. In the context of compound MapReduce jobs, benefits even add up over the individual jobs, as each job gains from processing less input data. The individual steps necessary in working with deltas are analyzed and examined for efficiency. Several use cases have been implemented and tested on top of Hadoop. The correctness of the extended programming model relies on a simple correctness criterion.
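The idea of processing only the changes can be sketched with a word count in plain Python. This is an illustration, not the Hadoop-based implementation from the paper: the helper names, the representation of a delta as lists of added and removed lines, and the final check are all assumptions, since the abstract does not spell out its correctness criterion.

```python
from collections import Counter


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    return [(word, 1) for word in line.split()]


def word_count(lines):
    """Full job: map every input line, then sum the counts per word."""
    counts = Counter()
    for line in lines:
        for word, n in map_words(line):
            counts[word] += n
    return counts


def apply_delta(previous, added, removed):
    """Delta job: update an earlier result using only the changed lines."""
    counts = Counter(previous)
    for line in added:
        for word, n in map_words(line):
            counts[word] += n          # added lines contribute positively
    for line in removed:
        for word, n in map_words(line):
            counts[word] -= n          # removed lines contribute negatively
    # Drop words whose count fell to zero so the result matches a fresh run.
    return Counter({w: n for w, n in counts.items() if n != 0})


if __name__ == "__main__":
    old_input = ["a b", "b c"]
    new_input = ["b c", "c d"]          # "a b" removed, "c d" added
    updated = apply_delta(word_count(old_input), added=["c d"], removed=["a b"])
    # One natural check (assumed here): same result as recomputing from scratch.
    print(updated == word_count(new_input))
```

The delta job touches two lines instead of the whole input, which is where the promised savings would come from when the input changes only slightly between runs.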