OPUS 4 | 004 Datenverarbeitung; Informatik

Developing ‘EasyTalk’ – a writing system utilizing natural language processing for interactive generation of ‘Leichte Sprache’ (Easy-to-Read German) to assist low-literate users with intellectual or developmental disabilities and/or complex communication needs in writing (2023)

Steinmetz, Ina

Leichte Sprache (LS, easy-to-read German) is a simplified variety of German. It is used to provide barrier-free texts for a broad spectrum of people, including low-literate individuals with learning difficulties, intellectual or developmental disabilities (IDD) and/or complex communication needs (CCN). In general, LS authors are proficient in standard German and do not belong to the aforementioned group of people. Our goal is to empower the latter to participate in written discourse themselves. This requires a special writing system whose linguistic support and ergonomic software design meet the target group’s specific needs. We present EasyTalk, a system profoundly based on natural language processing (NLP) for assistive writing in an extended variant of LS (ELS). EasyTalk provides users with a personal vocabulary underpinned with customizable communication symbols and supports them in writing at their individual level of proficiency through interactive user guidance. The system minimizes the grammatical knowledge needed to produce correct and coherent complex contents by intuitively formulating linguistic decisions. It provides easy dialogs for selecting options from a natural-language paraphrase generator, which provides context-sensitive suggestions for sentence components and correctly inflected word forms. In addition, EasyTalk reminds users to add text elements that enhance text comprehensibility in terms of audience design (e.g., time and place of an event) and improve text coherence (e.g., explicit connectors to express discourse relations). To tailor the system to the needs of the target group, the development of EasyTalk followed the principles of human-centered design (HCD). Accordingly, we matured the system in iterative development cycles, combined with purposeful evaluations of specific aspects conducted with expert groups from the fields of CCN, LS, and IT, as well as L2 learners of the German language. 
In a final case study, members of the target audience tested the system in free writing sessions. The study confirmed that adults with IDD and/or CCN who have low reading, writing, and computer skills can write their own personal texts in ELS using EasyTalk. The positive feedback from all tests inspires future long-term studies with EasyTalk and further development of this prototypical system, such as the implementation of a so-called Schreibwerkstatt (writing workshop).

On the recognition of human activities and the evaluation of its imitation by robotic systems (2023)

Memmesheimer, Raphael

This thesis addresses the problem of action recognition through the analysis of human motion and the benchmarking of its imitation by robotic systems. In our action recognition work, we focus on approaches that generalize well across different sensor modalities. We transform multivariate signal streams from various sensors to a common image representation. The action recognition problem on sequential multivariate signal streams can then be reduced to an image classification task for which we utilize recent advances in machine learning. We demonstrate the broad applicability of our approaches formulated as a supervised classification task for action recognition, a semi-supervised classification task for one-shot action recognition, modality fusion and temporal action segmentation. For action classification, we use an EfficientNet Convolutional Neural Network (CNN) model to classify the image representations of various data modalities. Further, we present approaches for filtering and the fusion of various modalities on a representation level. We extend the approach to be applicable for semi-supervised classification and train a metric-learning model that encodes action similarity. During training, the encoder optimizes the distances in embedding space for self-, positive- and negative-pair similarities. The resulting encoder allows estimating action similarity by calculating distances in embedding space. At training time, no action classes from the test set are used. Graph Convolutional Networks (GCNs) generalize the concept of CNNs to non-Euclidean data structures and have shown great success in action recognition, operating directly on spatio-temporal sequences such as skeleton sequences. GCNs have recently shown state-of-the-art performance for skeleton-based action recognition but are currently widely neglected as the foundation for the fusion of various sensor modalities. 
We propose incorporating additional modalities, like inertial measurements or RGB features, into a skeleton graph through fusion on two different dimensionality levels. On a channel dimension, modalities are fused by introducing additional node attributes. On a spatial dimension, additional nodes are incorporated into the skeleton graph. Transformer models have shown excellent performance in the analysis of sequential data. We formulate the temporal action segmentation task as an object detection task and use a detection transformer model on our proposed motion image representations. Experiments for our action recognition approaches are executed on large-scale publicly available datasets. Our approaches for action recognition for various modalities, action recognition by fusion of various modalities, and one-shot action recognition demonstrate state-of-the-art results on some datasets. Finally, we present a hybrid imitation learning benchmark. The benchmark consists of a dataset, metrics, and a simulator integration. The dataset contains RGB-D image sequences of humans performing movements and executing manipulation tasks, as well as the corresponding ground truth. The RGB-D camera is calibrated against a motion-capturing system, and the resulting sequences serve as input for imitation learning approaches. The resulting policy is then executed in the simulated environment on different robots. We propose two metrics to assess the quality of the imitation. The trajectory metric gives insights into how close the execution was to the demonstration. The effect metric describes how closely the final state matches that of the demonstration. The Simitate benchmark can improve the comparability of imitation learning approaches.
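
The one-shot idea described in this abstract, estimating action similarity by distances in embedding space, can be illustrated with a minimal sketch. It assumes a trained encoder already exists and has produced the embedding vectors; the vectors, labels, and function names below are hypothetical and are not taken from the thesis.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_shot_classify(query, references):
    """Return the label of the reference embedding closest to the query.

    `references` maps each unseen action class to the embedding of its
    single reference sample (hypothetical values, standing in for the
    output of a trained metric-learning encoder).
    """
    return min(references, key=lambda label: euclidean(query, references[label]))


# Hypothetical 2-D embeddings; real embeddings would be much larger.
references = {"wave": [0.9, 0.1], "kick": [0.1, 0.8]}
predicted = one_shot_classify([0.8, 0.2], references)
print(predicted)
```

Because classification only needs one reference embedding per class, new action classes can be added without retraining the encoder, which is the property the abstract relies on when it states that no test-set classes are seen during training.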

Edge Formation and its Influence in Machine Learning (2022)

Espín-Noboa, Lisette

Social networks are ubiquitous structures that we generate and enrich every day while connecting with people through social media platforms, emails, and any other type of interaction. While these structures are intangible to us, they carry important information. For instance, the political leaning of our friends can be a proxy to identify our own political preferences. Similarly, the credit score of our friends can be decisive in the approval or rejection of our own loans. This explanatory power is being leveraged in public policy, business decision-making and scientific research because it helps machine learning techniques to make accurate predictions. However, these generalizations often benefit the majority of people who shape the general structure of the network, and put under-represented groups at a disadvantage by limiting their resources and opportunities. Therefore, it is crucial to first understand how social networks form, and then to verify to what extent their mechanisms of edge formation contribute to reinforcing social inequalities in machine learning algorithms. To this end, in the first part of this thesis, I propose HopRank and Janus, two methods to characterize the mechanisms of edge formation in real-world undirected social networks. HopRank is a model of information foraging on networks. Its key component is a biased random walker based on transition probabilities between k-hop neighborhoods. Janus is a Bayesian framework that allows identifying and ranking plausible hypotheses of edge formation in cases where nodes possess additional information. In the second part of this thesis, I investigate the implications of these mechanisms, which explain edge formation in social networks, for machine learning. Specifically, I study the influence of homophily, preferential attachment, edge density, fraction of minorities, and the directionality of links on both performance and bias of collective classification, and on the visibility of minorities in top-k ranks. 
My findings demonstrate a strong correlation between network structure and machine learning outcomes. This suggests that systematic discrimination against certain people can be: (i) anticipated by the type of network, and (ii) mitigated by connecting strategically in the network.
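
A random walker biased by k-hop transition probabilities, as mentioned in this abstract, could look roughly like the sketch below. This is an illustrative reading only, not the HopRank model itself: the graph, the `hop_probs` table, and the rule of picking uniformly among nodes at the chosen hop distance are all assumptions made for the example.

```python
import random
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: shortest hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def biased_step(graph, current, hop_probs, rng):
    """Pick a hop distance k with probability proportional to hop_probs[k],
    then move to a uniformly chosen node exactly k hops away (assumed rule)."""
    by_hop = {}
    for node, d in hop_distances(graph, current).items():
        if d > 0:
            by_hop.setdefault(d, []).append(node)
    hops = [k for k in sorted(by_hop) if hop_probs.get(k, 0) > 0]
    if not hops:
        return current  # nowhere to go under the given probabilities
    k = rng.choices(hops, weights=[hop_probs[h] for h in hops])[0]
    return rng.choice(sorted(by_hop[k]))


# Hypothetical undirected path graph a - b - c - d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
rng = random.Random(0)
walk = ["a"]
for _ in range(5):
    walk.append(biased_step(graph, walk[-1], {1: 0.7, 2: 0.3}, rng))
print(walk)
```

Setting all probability mass on k = 1 recovers an ordinary random walk over direct neighbors, while mass on larger k lets the walker jump to more distant neighborhoods.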

Real-Time Implementation of OpenVDB Rendering (2022)

Gaida, Sebastian

In this thesis, the possibilities for real-time visualization of OpenVDB files are investigated. The basics of OpenVDB, its possibilities, as well as NanoVDB and its GPU port, were studied. A system was developed using PNanoVDB, the graphics API port of OpenVDB. Techniques were explored to improve and accelerate a single-ray approach to ray tracing. To prove real-time capability, two single scattering approaches were also implemented. One of these was selected, further investigated and optimized to achieve interactive real-time rendering. It is important to give artists immediate feedback on their adjustments, as well as the possibility to change all parameters to ensure a user-friendly creation process. In addition to the optical rendering, corresponding benchmarks were collected to compare different improvement approaches and to prove their relevance. Attention was paid to the rendering times and memory consumption on the GPU to ensure optimal use. When rendering OpenVDB files, a special focus was put on the integrability and extensibility of the program to allow easy integration into an existing real-time renderer like U-Render.

Methods for Human-Machine Link Quality Management on the Web of Data (2022)

Sarasua, Cristina

Semantic Web technologies have been recognized to be key for the integration of distributed and heterogeneous data sources on the Web, as they provide means to define typed links between resources in a dynamic manner and following the principles of dataspaces. The widespread adoption of these technologies in recent years has led to a large volume and variety of data sets published as machine-readable RDF data, which, once linked, constitute the so-called Web of Data. Given the large scale of the data, these links are typically generated by computational methods that, given a set of RDF data sets, analyze their content and identify the entities and schema elements that should be connected via the links. As with any other kind of data, in order to be truly useful and ready to be consumed, links need to comply with the criteria of high quality data (e.g., syntactically and semantically accurate, consistent, up-to-date). Despite the progress in the field of machine learning, human intelligence is still essential in the quest for high quality links: humans can train algorithms by labeling reference examples, validate the output of algorithms to verify their performance on a data set basis, as well as augment the resulting set of links. Humans, especially experts, however, have limited availability. Hence, extending data quality management processes from data owners/publishers to a broader audience can significantly improve the data quality management life cycle. Recent advances in human computation and peer-production technologies have opened new avenues for human-machine data management techniques, making it possible to involve non-experts in certain tasks and providing methods for cooperative approaches. The research work presented in this thesis takes advantage of such technologies and investigates human-machine methods that aim at facilitating link quality management in the Semantic Web. 
Firstly, and focusing on the dimension of link accuracy, a method for crowdsourcing ontology alignment is presented. This method, also applicable to entities, is implemented as a complement to automatic ontology alignment algorithms. Secondly, novel measures for the dimension of information gain facilitated by the links are introduced. These entropy-centric measures provide data managers with information about the extent to which the entities in the linked data set gain information in terms of entity description, connectivity and schema heterogeneity. Thirdly, taking Wikidata (the most successful case of a linked data set curated, linked and maintained by a community of humans and bots) as a case study, we apply descriptive and predictive data mining techniques to study participation inequality and user attrition. Our findings and method can help community managers make decisions on when/how to intervene with user retention plans. Lastly, an ontology to model the history of crowd contributions across marketplaces is presented. While the field of human-machine data management poses complex social and technical challenges, the work in this thesis aims to contribute to the development of this still emerging field.

Nutzungsakzeptanz von digitalen Werkzeugen in den Geisteswissenschaften (2022)

Simon, Tobias

Currently, there are a variety of digital tools in the humanities, such as annotation, visualization, or analysis software, which support researchers in their work and offer them new opportunities to address different research questions. However, the use of these tools falls far short of expectations. In this thesis, twelve improvement measures are developed within the framework of a design science theory to counteract the lack of usage acceptance. By implementing the developed design science theory, software developers can increase the acceptance of their digital tools in the humanities context.

Knowledge engineering for software languages and software technologies (2022)

Heinz, Marcel

For software engineers, conceptually understanding the tools they are using in the context of their projects is a daily challenge and a prerequisite for complex tasks. Textual explanations and code examples serve as knowledge resources for understanding software languages and software technologies. This thesis describes research on integrating and interconnecting existing knowledge resources, which can then be used to assist with understanding and comparing software languages and software technologies on a conceptual level. We consider the following broad research questions that we later refine: What knowledge resources can be systematically reused for recovering structured knowledge and how? What vocabulary already exists in literature that is used to express conceptual knowledge? How can we reuse the online encyclopedia Wikipedia? How can we detect and report on instances of technology usage? How can we assure reproducibility as the central quality factor of any construction process for knowledge artifacts? As qualitative research, we describe methodologies to recover knowledge resources by i.) systematically studying literature, ii.) mining Wikipedia, iii.) mining available textual explanations and code examples of technology usage. The theoretical findings are backed by case studies. As research contributions, we have recovered i.) a reference semantics of vocabulary for describing software technology usage with an emphasis on software languages, ii.) an annotated corpus of Wikipedia articles on software languages, iii.) insights into technology usage on GitHub with regard to a catalog of patterns and iv.) megamodels of technology usage that are interconnected with existing textual explanations and code examples.

Improving Usability and Accessibility of the Web with Eye Tracking (2021)

Menges, Raphael

The Web is an essential component of moving our society to the digital age. We use it for communication, shopping, and doing our work. Most user interaction in the Web happens with Web page interfaces. Thus, the usability and accessibility of Web page interfaces are relevant areas of research to make the Web more useful. Eye tracking is a tool that can be helpful in both areas, performing usability testing and improving accessibility. It can be used to understand users' attention on Web pages and to support usability experts in their decision-making process. Moreover, eye tracking can be used as an input method to control an interface. This is especially useful for people with motor impairment, who cannot use traditional input devices like mouse and keyboard. However, interfaces on Web pages become more and more complex due to dynamics, i.e., changing contents like animated menus and photo carousels. We need general approaches to comprehend dynamics on Web pages, allowing for efficient usability analysis and enjoyable interaction with eye tracking. In the first part of this thesis, we report our work on improving gaze-based analysis of dynamic Web pages. Eye tracking can be used to collect the gaze signals of users, who browse a Web site and its pages. The gaze signals show a usability expert what parts in the Web page interface have been read, glanced at, or skipped. The aggregation of gaze signals allows a usability expert insight into the users' attention on a high-level, before looking into individual behavior. For this, all gaze signals must be aligned to the interface as experienced by the users. However, the user experience is heavily influenced by changing contents, as these may cover a substantial portion of the screen. We delineate unique states in Web page interfaces including changing contents, such that gaze signals from multiple users can be aggregated correctly. 
In the second part of this thesis, we report our work on improving the gaze-based interaction with dynamic Web pages. Eye tracking can be used to retrieve gaze signals while a user operates a computer. The gaze signals may be interpreted as input controlling an interface. Nowadays, eye tracking as an input method is mostly used to emulate mouse and keyboard functionality, hindering an enjoyable user experience. There exist a few Web browser prototypes that directly interpret gaze signals for control, but they do not work on dynamic Web pages. We have developed a method to extract interaction elements like hyperlinks and text inputs efficiently on Web pages, including changing contents. We adapt the interaction with those elements for eye tracking as the input method, such that a user can conveniently browse the Web hands-free. Both parts of this thesis conclude with user-centered evaluations of our methods, assessing the improvements in the user experience for usability experts and people with motor impairment, respectively.

The Line Space - a Directional Data Structure for Ray Tracing Acceleration (2021)

Keul, Kevin

Ray tracing acceleration through dedicated data structures has long been an important topic in computer graphics. In general, two different approaches are proposed: spatial and directional acceleration structures. The thesis at hand presents an innovative combined approach of these two areas, which enables a further acceleration of the tracing process of rays. State-of-the-art spatial data structures are used as base structures and enhanced by precomputed directional visibility information based on a sophisticated abstraction concept of shafts within an original structure, the Line Space. In the course of the work, novel approaches for the precomputed visibility information are proposed: a binary value that indicates whether a shaft is empty or non-empty as well as a single candidate approximating the actual surface as a representative candidate. It is shown how the binary value is used in a simple but effective empty space skipping technique, which allows a performance gain in ray tracing of up to 40% compared to the pure base data structure, regardless of the spatial structure that is actually used. In addition, it is shown that this binary visibility information provides a fast technique for calculating soft shadows and ambient occlusion based on blocker approximations. Although the results contain a certain inaccuracy error, which is also presented and discussed, it is shown that a further tracing acceleration of up to 300% compared to the base structure is achieved. As an extension of this approach, the representative candidate precomputation is demonstrated, which is used to accelerate the indirect lighting computation, resulting in a significant performance gain at the expense of image errors. Finally, techniques based on two-stage structures and a usage heuristic are proposed and evaluated. 
These reduce memory consumption and approximation errors while maintaining the performance gain and also enabling further possibilities with object instancing and rigid transformations. All performance and memory values as well as the approximation errors are measured, presented and discussed. Overall, the Line Space is shown to result in a considerable improvement in ray tracing performance at the cost of higher memory consumption and possible approximation errors. The presented findings thus demonstrate the capability of the combined approach and enable further possibilities for future work.

Ontologie-basierte Informationsintegration in der Form eines Social Network of Business Objects (SoNBO) (2021)

Gebel-Sauer, Berit

The flexible integration of information from distributed and complex information systems poses a major challenge for organisations. The ontology-based information integration concept SoNBO (Social Network of Business Objects) developed and presented in this dissertation addresses these challenges. In an ontology-based concept, the data structure in the source systems (e.g. operational application systems) is described with the help of a schema (= ontology). The ontology and the data from the source systems can be used to create a (virtualised or materialised) knowledge graph, which is used for information access. The schema can be flexibly adapted to the changing needs of a company regarding their information integration. SoNBO differs from existing concepts known from the Semantic Web (OBDA = Ontology-based Data Access, EKG = Enterprise Knowledge Graph) both in the structure of the company-specific ontology (= Social Network of Concepts) as well as in the structure of the user-specific knowledge graph (= Social Network of Business Objects) and makes use of social principles (known from Enterprise Social Software). Following a Design Science Research approach, the SoNBO framework was developed and the findings documented in this dissertation. The framework provides guidance for the introduction of SoNBO in a company and the knowledge gained from the evaluation (in the company KOSMOS Verlag) is used to demonstrate its viability. The results (SoNBO concept and SoNBO framework) are based on the synthesis of the findings from a structured literature review and the investigation of the status quo of ontology-based information integration in practice: For the status quo in practice, the basic idea of SoNBO is demonstrated in an in-depth case study about the engineering office Vössing, which has been using a self-developed SoNBO application for a few years. 
The status quo in the academic literature is presented in the form of a structured literature analysis on ontology-based information integration approaches. This dissertation adds to theory in the field of ontology-based information integration approaches (e. g. by an evaluated artefact) and provides an evaluated artefact (the SoNBO Framework) for practice.

Type-safe Programming for the Semantic Web (2021)

Leinberger, Martin

Graph-based data formats are flexible in representing data. In particular, semantic data models, where the schema is part of the data, have gained traction and commercial success in recent years. Semantic data models are also the basis for the Semantic Web, a Web of data governed by open standards in which computer programs can freely access the provided data. This thesis is concerned with the correctness of programs that access semantic data. While the flexibility of semantic data models is one of their biggest strengths, it can easily lead to programmers accidentally not accounting for unintuitive edge cases. Often, such exceptions surface during program execution as run-time errors or unintended side-effects. Depending on the exact condition, a program may run for a long time before the error occurs and the program crashes. This thesis defines type systems that can detect and avoid such run-time errors based on schema languages available for the Semantic Web. In particular, this thesis uses the Web Ontology Language (OWL) and its theoretic underpinnings, i.e., description logics, as well as the Shapes Constraint Language (SHACL) to define type systems that provide type-safe data access to semantic data graphs. Providing a safe type system is an established methodology for proving the absence of run-time errors in programs without requiring execution. Both schema languages are based on possible world semantics but differ in the treatment of incomplete knowledge. While OWL allows for modelling incomplete knowledge through an open-world semantics, SHACL relies on a fixed domain and closed-world semantics. We provide the formal underpinnings for type systems based on each of the two schema languages. In particular, we base our notion of types on sets of values, which allows us to specify a subtype relation based on subset semantics. In the case of description logics, subsumption is a routine problem. 
For the type system based on SHACL, we are able to translate it into a description logic subsumption problem.
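
The notion of types as sets of values with subtyping as subset inclusion, as described in this abstract, can be shown with a toy sketch. It works over a single, finite, hand-written graph, whereas the thesis decides subsumption by reasoning over all possible worlds; the triples, class names, and helper functions are hypothetical.

```python
def instances_of(triples, class_name):
    """All subjects typed with class_name in a small list of triples."""
    return frozenset(s for (s, p, o) in triples if p == "rdf:type" and o == class_name)


def is_subtype(sub_values, super_values):
    """Subset semantics: every value of the subtype is a value of the supertype."""
    return sub_values <= super_values


# Hypothetical data graph as (subject, predicate, object) triples.
triples = [
    ("alice", "rdf:type", "Student"),
    ("alice", "rdf:type", "Person"),
    ("bob", "rdf:type", "Person"),
]

student = instances_of(triples, "Student")
person = instances_of(triples, "Person")
print(is_subtype(student, person))
print(is_subtype(person, student))
```

A check like this only inspects one concrete graph, so it cannot prove that a program is safe for every graph conforming to a schema; that gap is what a schema-level subsumption check is meant to close.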

Untersuchung von Analyse-durch-Synthese Techniken im markerlosen Tracking (2020)

Schumann, Martin

In the context of augmented reality, we define tracking as a collection of methods to obtain the position and orientation (pose) of a user. By means of various display techniques, this ensures a correct visual overlay of graphical information onto the perceived reality. Precise results for the calculation of the camera pose are obtained by image processing methods, which usually analyze the pixels of an image and extract features that can be recognized over the image sequence. However, these methods do not take the process of image synthesis into account, or do so only in a very simplified way. In contrast, the class of model-based methods assumes a given 3D model of the observed scene. Based on the model data, features can be identified to establish correspondences in the camera image. From these feature correspondences the camera pose is calculated. An interesting approach is the strategy of analysis-by-synthesis, which considers the computer graphics rendering process in order to extend the knowledge about the model with information from image synthesis and other environment variables. In this thesis, the components of a tracking system are identified, and it is analyzed to what extent information about the model, the rendering process and the environment can contribute to these components and improve the tracking process using analysis-by-synthesis. In particular, by using knowledge such as topological information, lighting or perspective, the feature synthesis and correspondence finding should lead to visually unambiguous features that can be predicted and evaluated to be suitable for stable tracking of the camera pose.

Predicting foreign users from English conversations on social media (2020)

Winkens, Alexander

Social media platforms such as Twitter or Reddit allow users almost unrestricted access to publish their opinions on recent events or discuss trending topics. While the majority of users approach these platforms innocently, some groups have set their minds on spreading misinformation and influencing or manipulating public opinion. These groups disguise themselves as native users from various countries to spread manufactured articles and strongly polarizing political opinions, and possibly become sources of hate speech or extreme political positions. This thesis aims to implement an AutoML pipeline for identifying second language speakers from English social media texts. We investigate style differences of text in different topics and across the platforms Reddit and Twitter, and analyse linguistic features. We employ feature-based models with datasets from Reddit, which include mostly English conversation from European users, and Twitter, which was newly created by collecting English tweets from selected trending topics in different countries. The pipeline classifies language family, native language and origin (native or non-native English speakers) of a given textual input. We evaluate the resulting classifications by comparing prediction accuracy, precision and F1 scores of our classification pipeline to traditional machine learning processes. Lastly, we compare the results from each dataset and find differences in language use for topics and platforms. We obtained high prediction accuracy for all categories on the Twitter dataset and observed high variance in features such as average text length, especially for Balto-Slavic countries.

Data Protection Assurance by Design: Support for Conflict Detection, Requirements Traceability and Fairness Analysis (2020)

Ramadan, Qusai

Data-minimization and fairness are fundamental data protection requirements to avoid privacy threats and discrimination. Violations of data protection requirements often result from: First, conflicts between security, data-minimization and fairness requirements. Second, data protection requirements for the organizational and technical aspects of a system that are currently dealt with separately, giving rise to misconceptions and errors. Third, hidden data correlations that might lead to biases against protected characteristics of individuals, such as ethnicity, in decision-making software. For the effective assurance of data protection needs, it is important to avoid sources of violations right from the design modeling phase. However, a model-based approach that addresses the issues above is missing. To handle the issues above, this thesis introduces a model-based methodology called MoPrivFair (Model-based Privacy & Fairness). MoPrivFair comprises three sub-frameworks: First, a framework that extends the SecBPMN2 approach to allow detecting conflicts between security, data-minimization and fairness requirements. Second, a framework for enforcing an integrated data-protection management throughout the development process based on a business process model (i.e., SecBPMN2 model) and a software architecture model (i.e., UMLsec model) annotated with data protection requirements while establishing traceability. Third, the UML extension UMLfair to support individual fairness analysis and reporting discriminatory behaviors. Each of the proposed frameworks is backed by automated tool support. We validated the applicability and usability of our conflict detection technique based on a health care management case study, and an experimental user study, respectively. Based on an air traffic management case study, we reported on the applicability of our technique for enforcing an integrated data-protection management. 
We validated the applicability of our individual fairness analysis technique using three case studies featuring a school management system, a delivery management system and a loan management system. The results show a promising outlook on the applicability of our proposed frameworks in real-world settings.

Model-based privacy by design (2020)

Ahmadian, Amirshayan

Nowadays, almost any IT system involves personal data processing. In such systems, many privacy risks arise when privacy concerns are not properly addressed from the early phases of the system design. The General Data Protection Regulation (GDPR) prescribes the Privacy by Design (PbD) principle. At its core, PbD obliges protecting personal data from the onset of the system development, by effectively integrating appropriate privacy controls into the design. To operationalize the concept of PbD, a set of challenges emerges: First, we need a basis to define privacy concerns. Without such a basis, we are not able to verify whether personal data processing is authorized. Second, we need to identify where precisely in a system the controls have to be applied. This calls for system analysis concerning privacy concerns. Third, with a view to selecting and integrating appropriate controls, based on the results of system analysis, a mechanism to identify the privacy risks is required. Mitigating privacy risks is at the core of the PbD principle. Fourth, choosing and integrating appropriate controls into a system are complex tasks that, besides risks, have to consider potential interrelations among privacy controls and the costs of the controls. This thesis introduces a model-based privacy by design methodology to handle the above challenges. Our methodology relies on a precise definition of privacy concerns and comprises three sub-methodologies: model-based privacy analysis, model-based privacy impact assessment and privacy-enhanced system design modeling. First, we introduce a definition of privacy preferences, which provides a basis to specify privacy concerns and to verify whether personal data processing is authorized. Second, we present a model-based methodology to analyze a system model. The results of this analysis denote a set of privacy design violations. 
Third, taking into account the results of privacy analysis, we introduce a model-based privacy impact assessment methodology to identify concrete privacy risks in a system model. Fourth, concerning the risks, and taking into account the interrelations and the costs of the controls, we propose a methodology to select appropriate controls and integrate them into a system design. Using various practical case studies, we evaluate our concepts, showing a promising outlook on the applicability of our methodology in real-world settings.
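As a rough illustration of the first two steps described in the abstract above (privacy preferences as the basis for checking whether processing is authorized, and an analysis that reports design violations), the following minimal Python sketch may help. It is not the thesis's methodology: the classes `PrivacyPreference` and `ProcessingStep`, their fields (purposes and recipients), and the function `find_violations` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivacyPreference:
    """Hypothetical preference attached to one piece of personal data."""
    data: str
    allowed_purposes: frozenset
    allowed_recipients: frozenset


@dataclass(frozen=True)
class ProcessingStep:
    """Hypothetical system-model element that processes personal data."""
    name: str
    data: str
    purpose: str
    recipient: str


def find_violations(steps, preferences):
    """Return the names of steps whose processing is not authorized.

    A step counts as authorized only if a preference exists for its data
    and both its purpose and its recipient are allowed by that preference.
    """
    by_data = {pref.data: pref for pref in preferences}
    violations = []
    for step in steps:
        pref = by_data.get(step.data)
        authorized = (
            pref is not None
            and step.purpose in pref.allowed_purposes
            and step.recipient in pref.allowed_recipients
        )
        if not authorized:
            violations.append(step.name)
    return violations
```

In this simplified reading, data without any recorded preference is treated as unauthorized by default, which is an assumption of the sketch rather than a statement about the thesis.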

Time series influences in political communication (2019)

Thesing, Tobias

Current political issues are often reflected in social media discussions, gathering politicians and voters on common platforms. As these can affect the public perception of politics, the inner dynamics and backgrounds of such debates are of great scientific interest. This thesis treats user-generated messages from a recent and highly relevant dataset as time series, and applies a topic-based analysis of inspiration and agenda setting to them. The Institute for Web Science and Technologies of the University of Koblenz-Landau has collected Twitter data generated by candidates in the run-up to the European Parliament Election 2019. This work processes and analyzes the dataset for various properties, while focusing on the influence of politicians and media on online debates. An algorithm to cluster tweets into topical threads is introduced. Subsequently, Sequential Association Rules are mined, yielding a wide array of potential influence relations between both actors and topics. The elaborated methodology can be configured with different parameters and is extensible in functionality and scope of application.
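To make the idea of mining sequential rules over topical threads more concrete, here is a minimal Python sketch. It does not reproduce the thesis's algorithm: it assumes messages have already been assigned to topics, and the function `mine_influence_rules`, the `window` parameter, and the confidence measure are illustrative assumptions.

```python
from collections import Counter, defaultdict


def mine_influence_rules(messages, window, min_confidence):
    """Mine simple sequential rules "actor a posts, then actor b follows".

    messages: iterable of (timestamp, actor, topic) tuples, where the topic
    stands in for a topical thread (assumed to be assigned beforehand).
    Returns {(a, b): confidence}, where confidence is the share of a's
    messages that b followed on the same topic within `window` time units.
    """
    by_topic = defaultdict(list)
    for timestamp, actor, topic in messages:
        by_topic[topic].append((timestamp, actor))

    antecedent_count = Counter()
    followed_count = Counter()
    for events in by_topic.values():
        events.sort()
        for i, (ts_a, a) in enumerate(events):
            antecedent_count[a] += 1
            followers = set()
            for ts_b, b in events[i + 1:]:
                if ts_b - ts_a > window:
                    break  # events are sorted, so later ones are too late
                if b != a:
                    followers.add(b)
            for b in followers:
                followed_count[(a, b)] += 1

    return {
        (a, b): count / antecedent_count[a]
        for (a, b), count in followed_count.items()
        if count / antecedent_count[a] >= min_confidence
    }
```

A rule found this way only indicates temporal succession on a shared topic; whether it reflects actual influence is a question of interpretation that the code cannot answer.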

Recovering Security in Model-Based Software Engineering by Context-Driven Co-Evolution (2019)

Bürger, Jens

Software systems have an increasing impact on our daily lives. Many systems process sensitive data or control critical infrastructure. Providing secure software is therefore indispensable. Such systems are rarely renewed due to the high costs and effort involved. Oftentimes, systems that were planned and implemented to be secure become insecure because their context evolves. These systems are connected to the Internet and are therefore constantly subject to new types of attacks. The security requirements of these systems remain unchanged, while, for example, the discovery of a vulnerability in an encryption algorithm previously assumed to be secure requires a change of the system design. Some security requirements cannot be checked on the system’s design but only at run time. Furthermore, the sudden discovery of a security violation requires an immediate reaction to prevent a system shutdown. Knowledge regarding security best practices, attacks, and mitigations is generally available, yet it is rarely an integrated part of software development and rarely covers evolution. This thesis examines how the security of long-living software systems can be preserved, taking into account the influence of context evolutions. The goal of the proposed approach, S²EC²O, is to recover the security of model-based software systems using co-evolution. An ontology-based knowledge base is introduced, capable of managing common as well as system-specific knowledge relevant to security. A transformation connects the knowledge base to the UML system model. By using semantic differences, knowledge inference, and the detection of inconsistencies in the knowledge base, context knowledge evolutions are detected. A catalog containing rules to manage and recover security requirements uses detected context evolutions to propose potential co-evolutions to the system model which reestablish compliance with the security requirements.
S²EC²O uses security annotations to link models and executable code and provides support for run-time monitoring. The adaptation of running systems is considered, as is round-trip engineering, which integrates insights from the run time into the system model. S²EC²O is complemented by prototypical tool support. This tool is used to show S²EC²O’s applicability based on a case study targeting the medical information system iTrust. The thesis at hand contributes to the development and maintenance of long-living software systems with regard to their security. The proposed approach will aid security experts: It detects security-relevant changes to the system context, determines the impact on the system’s security, and facilitates co-evolutions to recover compliance with the security requirements.
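The general pattern in the abstract above (detect a change in context knowledge, then let a rule propose a change to the system model) can be sketched in a few lines of Python. This is a deliberately simplified assumption and not S²EC²O itself: the knowledge base is reduced to a flat dictionary instead of an ontology, and the functions `detect_context_evolutions` and `propose_co_evolutions`, the fact keys, and the single rule are hypothetical.

```python
def detect_context_evolutions(old_kb, new_kb):
    """Return {fact: (old_value, new_value)} for facts that changed or are new.

    Both knowledge bases are plain dicts here, e.g.
    {("algorithm", "DES"): "secure"}. Removed facts are not reported
    in this simplified sketch.
    """
    return {
        key: (old_kb.get(key), value)
        for key, value in new_kb.items()
        if old_kb.get(key) != value
    }


def propose_co_evolutions(evolutions, model_usage):
    """Apply one hypothetical rule: replace algorithms that became insecure.

    model_usage maps a model element name to the algorithm it uses.
    Returns a list of human-readable proposals.
    """
    proposals = []
    for (kind, name), (_, new_status) in sorted(evolutions.items()):
        if kind == "algorithm" and new_status == "insecure":
            for element, algorithm in sorted(model_usage.items()):
                if algorithm == name:
                    proposals.append(f"replace {name} in {element}")
    return proposals
```

The proposals are only suggestions; in line with the abstract, a security expert would still decide which co-evolution to apply.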

Internet of Things -Foodstuff Traceability and Transportation with Consideration of Logistic Processes in Cold Chain Management- (2019)

Schulz, Maurice

This bachelor thesis delivers a comprehensive overview of the topic Internet of Things (IoT). With the help of a first literature review, important characteristics, architectures, and properties have been identified. The main aim of this bachelor thesis is to determine whether the use of IoT in the transport of food, considering compliance with the cold chain, can help companies reduce food waste. For this purpose, a second literature review has been carried out covering food transport systems both with and without the use of IoT. Based on the literature review, it is possible to determine a theoretical ‘ideal’ system for food transport in refrigerated trucks. The technologies used in each case are also mentioned. The findings of several authors have shown that significant improvements can often be achieved in surveillance, transport in general, or traceability of food, and that food waste can ultimately be reduced. However, benefits can also be gained using new non-IoT-based technologies. Thus, the main finding of this bachelor thesis is that a theoretical ‘ideal’ transport system contains a sensible combination of technologies with and without IoT. This system includes the use of a Wireless Sensor Network (WSN) for real-time food monitoring, as well as an alarm function when the temperature exceeds a maximum. Real-time monitoring with GPS, coupled with a monitoring center to avoid traffic jams, is another component. Smart and energy-efficient packaging and, finally, the use of the new supercooling technology make the system significantly more efficient in reducing food waste. This highlights that, when choosing a refrigerated food transport system that is as efficient and profitable as possible, companies need not rely on IoT alone. On this basis, it is advisable to combine the systems and technologies used so far with IoT in order to avoid as much food waste as possible.

Object-oriented high-level data flow analysis (2019)

Mebus, David

Data flow models in the literature are often very fine-grained, which carries over to the data flow analyses performed on them and thus makes these analyses harder to understand. Since a data flow model that abstracts from the majority of implementation details of the modeled program allows for data flow analyses that are potentially easier to understand, this master thesis deals with the specification and construction of a highly abstracted data flow model and the application of data flow analyses to this model. The model and the analyses performed on it have been developed in a test-driven manner, so that a wide range of possible data flow scenarios could be covered. As a concrete data flow analysis, a static security check in the form of a detection of insufficient user input sanitization has been performed. To date, there is no data flow model on a similarly high level of abstraction. The proposed solution is therefore unique and enables developers without expertise in data flow analysis to perform such analyses.
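The kind of check mentioned in the abstract above, detecting insufficient sanitization of user input, can be illustrated on a coarse data flow graph. The following Python sketch is an assumption made for illustration and not the model defined in the thesis: nodes are plain strings, and the function `find_unsanitized_flows` and its parameters are hypothetical.

```python
def find_unsanitized_flows(edges, sources, sanitizers, sinks):
    """Report flows from user input to a sink that bypass all sanitizers.

    edges: {node: [successor, ...]} describing the data flow graph.
    sources, sanitizers, sinks: sets of node names.
    Returns a sorted list of (source, sink) pairs connected by at least
    one path that does not pass through a sanitizer node.
    """
    findings = set()
    for source in sources:
        stack, seen = [source], {source}
        while stack:
            node = stack.pop()
            if node in sinks:
                findings.add((source, node))
            for successor in edges.get(node, ()):
                # Sanitizers cut the traversal: data behind them counts as clean.
                if successor not in seen and successor not in sanitizers:
                    seen.add(successor)
                    stack.append(successor)
    return sorted(findings)
```

Treating every sanitizer as fully effective is a strong simplification; a real analysis would also have to judge whether a given sanitizer fits the sink it protects.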

Further development of the teaching unit Planspiel 2.0: “Who knows what about me on the Internet?” of the project Informatik im Kontext and its implementation in a basic computer science course (2019)

Noll, Christoph

This thesis aims to adapt the simulation game “Datenschutz 2.0”, developed by Dietz and Oppermann, to the present-day everyday life of students, to enable its use in upper secondary education (Sekundarstufe II), and to resolve the technical and legal problems of the simulation game. The topic of data protection addressed by the simulation game is anchored in the Rhineland-Palatinate computer science curriculum for Sekundarstufe II. There, the term data protection is mentioned in the unit “Assessing data collection from the perspective of data protection”. However, no data is collected in the simulation game; instead, the data traces that users leave behind themselves are examined. In the basic course, this form of data protection is found in the proposed unit “Explaining and observing data security with consideration of cryptological methods” under the topic Communication in Computer Networks. In the advanced course, data security appears in the unit and topic of the same name, and in the unit “Assessing data collection from the perspective of data protection” under the topic Interaction between Informatics Systems, Individual, and Society.
