OPUS 4 | 004 Data processing; Computer science

Edge Formation and its Influence in Machine Learning (2022)

Espín-Noboa, Lisette

Social networks are ubiquitous structures that we generate and enrich every day while connecting with people through social media platforms, emails, and any other type of interaction. While these structures are intangible to us, they carry important information. For instance, the political leaning of our friends can be a proxy to identify our own political preferences. Similarly, the credit score of our friends can be decisive in the approval or rejection of our own loans. This explanatory power is being leveraged in public policy, business decision-making and scientific research because it helps machine learning techniques to make accurate predictions. However, these generalizations often benefit the majority of people who shape the general structure of the network, and disadvantage under-represented groups by limiting their resources and opportunities. It is therefore crucial to first understand how social networks form, and then to verify to what extent their mechanisms of edge formation contribute to reinforcing social inequalities in machine learning algorithms. To this end, in the first part of this thesis, I propose HopRank and Janus, two methods to characterize the mechanisms of edge formation in real-world undirected social networks. HopRank is a model of information foraging on networks. Its key component is a biased random walker based on transition probabilities between k-hop neighborhoods. Janus is a Bayesian framework that makes it possible to identify and rank plausible hypotheses of edge formation in cases where nodes possess additional information. In the second part of this thesis, I investigate the implications of these edge-formation mechanisms for machine learning. Specifically, I study the influence of homophily, preferential attachment, edge density, fraction of minorities, and the directionality of links on both performance and bias of collective classification, and on the visibility of minorities in top-k ranks.
My findings demonstrate a strong correlation between network structure and machine learning outcomes. This suggests that systematic discrimination against certain people can be: (i) anticipated by the type of network, and (ii) mitigated by connecting strategically in the network.

Real-Time Implementation of OpenVDB Rendering (2022)

Gaida, Sebastian

This thesis investigates the possibilities for real-time visualization of OpenVDB files. The basics of OpenVDB, its possibilities, as well as NanoVDB and its GPU port, were studied. A system was developed using PNanoVDB, the graphics API port of OpenVDB. Techniques were explored to improve and accelerate a single-ray approach to ray tracing. To prove real-time capability, two single-scattering approaches were also implemented. One of these was selected, further investigated and optimized to achieve interactive real-time rendering. It is important to give artists immediate feedback on their adjustments, as well as the possibility to change all parameters, to ensure a user-friendly creation process. In addition to the optical rendering, corresponding benchmarks were collected to compare different improvement approaches and to prove their relevance. Attention was paid to rendering times and memory consumption on the GPU to ensure optimal use. When rendering OpenVDB files, a special focus was put on the integrability and extensibility of the program to allow easy integration into an existing real-time renderer like U-Render.

Methods for Human-Machine Link Quality Management on the Web of Data (2022)

Sarasua, Cristina

Semantic Web technologies have been recognized as key for the integration of distributed and heterogeneous data sources on the Web, as they provide means to define typed links between resources in a dynamic manner, following the principles of dataspaces. The widespread adoption of these technologies in recent years has led to a large volume and variety of data sets published as machine-readable RDF data, which, once linked, constitute the so-called Web of Data. Given the large scale of the data, these links are typically generated by computational methods that, given a set of RDF data sets, analyze their content and identify the entities and schema elements that should be connected via the links. As with any other kind of data, in order to be truly useful and ready to be consumed, links need to comply with the criteria of high-quality data (e.g., syntactically and semantically accurate, consistent, up-to-date). Despite the progress in the field of machine learning, human intelligence is still essential in the quest for high-quality links: humans can train algorithms by labeling reference examples, validate the output of algorithms to verify their performance on a per-data-set basis, and augment the resulting set of links. Humans, especially expert humans, however, have limited availability. Hence, extending data quality management processes from data owners/publishers to a broader audience can significantly improve the data quality management life cycle. Recent advances in human computation and peer-production technologies have opened new avenues for human-machine data management techniques, making it possible to involve non-experts in certain tasks and providing methods for cooperative approaches. The research work presented in this thesis takes advantage of such technologies and investigates human-machine methods that aim at facilitating link quality management in the Semantic Web.
Firstly, focusing on the dimension of link accuracy, a method for crowdsourcing ontology alignment is presented. This method, also applicable to entities, is implemented as a complement to automatic ontology alignment algorithms. Secondly, novel measures for the dimension of information gain facilitated by the links are introduced. These entropy-centric measures provide data managers with information about the extent to which the entities in the linked data set gain information in terms of entity description, connectivity and schema heterogeneity. Thirdly, taking Wikidata (the most successful case of a linked data set curated, linked and maintained by a community of humans and bots) as a case study, we apply descriptive and predictive data mining techniques to study participation inequality and user attrition. Our findings and method can help community managers make decisions on when and how to intervene with user retention plans. Lastly, an ontology to model the history of crowd contributions across marketplaces is presented. While the field of human-machine data management poses complex social and technical challenges, the work in this thesis aims to contribute to the development of this still emerging field.

RoboCup 2016 – homer@UniKoblenz (Germany) (2018)

Memmesheimer, Raphael

This paper describes the robot Lisa used by team homer@UniKoblenz of the University of Koblenz-Landau, Germany, for participation in RoboCup@Home 2016 in Leipzig, Germany. A special focus is put on novel system components and the open source contributions of our team. We have released packages for object recognition, a robot face including speech synthesis, mapping and navigation, a speech recognition interface via Android, and a GUI. The packages are available (and new packages will be released) on http://wiki.ros.org/agas-ros-pkg.

Research and Teaching Report 2015/2016, Department 4: Computer Science, University of Koblenz-Landau (2017)

Department 4 (Computer Science) consists of twenty-five research groups, each led by a professor, which collaborate on research and teaching across six institutes. In each annual report, the research groups present themselves following a uniform template: their staff composition, the projects that fall within the reporting period, and the scientific achievements made. The following chapters list individual parameters that describe the department in quantitative terms, namely third-party funding acquired, teaching coverage, graduates, and publications.

Research and Teaching Report 2014/2015, Department 4: Computer Science, University of Koblenz-Landau (2017)

Department 4 (Computer Science) consists of twenty-five research groups, each led by a professor, which collaborate on research and teaching across six institutes. In each annual report, the research groups present themselves following a uniform template: their staff composition, the projects that fall within the reporting period, and the scientific achievements made. The following chapters list individual parameters that describe the department in quantitative terms, namely third-party funding acquired, teaching coverage, graduates, and publications.

Crowdsourcing for Survey Research : where Amazon Mechanical Turks deviates from conventional survey methods (2015)

Schaarschmidt, Mario ; Ivens, Stefan ; Homscheid, Dirk ; Bilo, Pascal

Information systems research has recently started to use crowdsourcing platforms such as Amazon Mechanical Turk (MTurk) for scientific research. In particular, MTurk provides a scalable, cheap workforce that can also be used as a pool of potential respondents for online survey research. In light of the increasing use of crowdsourcing platforms for survey research, the authors aim to contribute to the understanding of its appropriate usage. To this end, they assess whether samples drawn from MTurk deviate from those drawn via conventional online surveys (COS) in terms of answers relating to relevant e-commerce variables, and test the data in a nomological network to assess differences in effects. The authors compare responses from 138 MTurk workers with those of 150 German shoppers recruited via COS. The findings indicate, inter alia, that MTurk workers tend to exhibit more positive word-of-mouth, perceived risk, customer orientation and commitment to the focal company. The authors discuss the study results, point to limitations, and provide avenues for further research.

Research and Teaching Report 2013/2014, Department 4: Computer Science, University of Koblenz-Landau (2015)

Department 4 (Computer Science) consists of twenty-five research groups, each led by a professor, which collaborate on research and teaching across six institutes. In each annual report, the research groups present themselves following a uniform template: their staff composition, the projects that fall within the reporting period, and the scientific achievements made. The following chapters list individual parameters that describe the department in quantitative terms, namely third-party funding acquired, teaching coverage, graduates, and publications.

Categorising Social Media Business Risks (2014)

Hausmann, Verena ; Williams, Susan P.

The aim of this paper is to identify and understand the risks and issues companies are experiencing from the business use of social media and to develop a framework for describing and categorising those social media risks. The goal is to contribute to the evolving theorisation of social media risk and to provide a foundation for the further development of social media risk management strategies and processes. The study findings identify thirty risk types organised into five categories (technical, human, content, compliance and reputational). A risk-chain is used to illustrate the complex, interrelated, multi-stakeholder nature of these risks, and directions for future work are identified.

Micro Modelling of User Perception and Generation Processes for Macro Level Predictions in Online Communities (2014)

Schwagereit, Felix ; Gottron, Thomas ; Staab, Steffen

The way information is presented to users in online community platforms influences the way the users create new information. This is the case, for instance, in question-answering fora, crowdsourcing platforms or other social computation settings. To better understand the effects of presentation policies on user activity, we introduce a generative model of user behaviour in this paper. Running simulations based on this user behaviour, we demonstrate the ability of the model to evoke macro phenomena comparable to the ones observed in real-world data.

273 search hits