Institut für Computervisualistik
In recent years, head-mounted displays (HMDs) and their ability to create virtual realities comparable to the real world have moved further into the focus of press coverage and consumers. The reason for this lies in constant improvements in available computing power, the miniaturisation of components, and steadily shrinking power consumption. These trends originate in the general technical progress driven by advancements in the smartphone sector. This gives more people than ever access to the components required to create these virtual realities. At the same time, however, there is only limited research that uses the current generation of HMDs, especially research that compares the virtual and the real world with each other. The approach of this thesis is to look into the process of navigating both real and virtual spaces while using modern hardware and software. One of the key areas is spatial and peripheral perception, without which it would be difficult to navigate a given space. The influence of prior real and virtual experiences on these will be another key aspect. The final area of focus is the influence on the emotional state and how it compares to the real world. To research these influences, an experiment using the Oculus Rift DK2 HMD will be held in which subjects are guided through a real space as well as a virtual model of it. Data will be gathered quantitatively by means of surveys. Finally, the findings will be discussed based on a statistical evaluation. During these tests, the perception of distances and room size will be compared, along with how it changes depending on the current reality. Furthermore, the influence of prior spatial activities in both the real and the virtual world will be examined. Lastly, it will be checked how real these virtual worlds are and whether they are sufficiently sophisticated to trigger the same emotional responses as the real world.
This work describes a novel software tool for visualizing anatomical segmentations of medical images. It was developed as part of a bachelor's thesis project, with a view to supporting research into automatic anatomical brain image segmentation. The tool builds on a widely-used visualization approach for 3D image volumes, where sections in orthogonal directions are rendered on screen as 2D images. It implements novel display modes that solve common problems with conventional viewer programs. In particular, it features a double-contour display mode to aid the user's spatial orientation in the image, as well as modes for comparing two competing segmentation labels pertaining to one and the same anatomical region. The tool was developed as an extension to an existing open-source software suite for medical image processing. The visualization modes are, however, suitable for implementation in the context of other viewer programs that follow a similar rendering approach.
The modified code can be found here: soundray.org/mm-segmentation-visualization.tar.gz.
On the recognition of human activities and the evaluation of its imitation by robotic systems
(2023)
This thesis addresses the problem of action recognition through the analysis of human motion and the benchmarking of its imitation by robotic systems.
For our action recognition related approaches, we focus on presenting approaches that generalize well across different sensor modalities. We transform multivariate signal streams from various sensors to a common image representation. The action recognition problem on sequential multivariate signal streams can then be reduced to an image classification task for which we utilize recent advances in machine learning. We demonstrate the broad applicability of our approaches formulated as a supervised classification task for action recognition, a semi-supervised classification task for one-shot action recognition, modality fusion and temporal action segmentation.
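As a rough illustration of what a common image representation for signal streams could look like, the following minimal sketch maps each sensor channel to one image row and each time step to one column, with values min-max scaled to 8-bit intensities. The function name `signals_to_image`, the scaling scheme, and the sample accelerometer values are assumptions made for this example; the abstract does not specify the encoding actually used in the thesis.

```python
def signals_to_image(signals):
    """Map a multivariate signal (list of channels, each a list of floats)
    to an 8-bit grayscale image: one row per channel, one column per time step.

    The min-max scaling over all channels is an assumption for illustration.
    """
    flat = [value for channel in signals for value in channel]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant signals
    return [[round(255 * (value - lo) / span) for value in channel]
            for channel in signals]


# Hypothetical 3-axis accelerometer stream with four time steps.
accel = [
    [0.0, 0.5, 1.0, 0.5],
    [-1.0, 0.0, 1.0, 0.0],
    [0.2, 0.2, 0.2, 0.2],
]
image = signals_to_image(accel)
```

Once every modality is encoded on the same pixel grid, a single image classifier can in principle be applied to all of them, which is the reduction the paragraph above describes.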
For action classification, we use an EfficientNet Convolutional Neural Network (CNN) model to classify the image representations of various data modalities. Further, we present approaches for filtering and the fusion of various modalities on a representation level. We extend the approach to be applicable for semi-supervised classification and train a metric-learning model that encodes action similarity. During training, the encoder optimizes the distances in embedding space for self-, positive- and negative-pair similarities. The resulting encoder allows estimating action similarity by calculating distances in embedding space. At training time, no action classes from the test set are used.
Graph Convolutional Networks (GCNs) generalize the concept of CNNs to non-Euclidean data structures and have shown great success in action recognition when operating directly on spatio-temporal sequences such as skeleton sequences. GCNs have recently shown state-of-the-art performance for skeleton-based action recognition but are currently widely neglected as the foundation for the fusion of various sensor modalities. We propose incorporating additional modalities, such as inertial measurements or RGB features, into a skeleton graph, with fusion on two different dimensionality levels. On the channel dimension, modalities are fused by introducing additional node attributes. On the spatial dimension, additional nodes are incorporated into the skeleton graph.
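The two fusion levels can be pictured on a toy graph. The sketch below is a plain data-structure illustration under stated assumptions, not the thesis implementation: the joint names, the feature values, the zero-padding of joints without a reading, and both function names are hypothetical.

```python
# Hypothetical toy skeleton: three joints with 3D coordinates as node features.
nodes = {
    "shoulder": [0.0, 1.4, 0.0],
    "elbow": [0.3, 1.1, 0.0],
    "wrist": [0.5, 0.9, 0.1],
}
edges = [("shoulder", "elbow"), ("elbow", "wrist")]


def fuse_on_channel_dimension(nodes, extra):
    """Append additional attributes (e.g. an IMU reading) to existing nodes.

    Nodes without a reading are zero-padded (an assumption) so that all
    feature vectors keep the same length.
    """
    width = len(next(iter(extra.values())))
    return {name: feats + extra.get(name, [0.0] * width)
            for name, feats in nodes.items()}


def fuse_on_spatial_dimension(nodes, edges, new_name, new_feats, attach_to):
    """Add the extra modality as a new node connected to an existing joint."""
    fused_nodes = dict(nodes)
    fused_nodes[new_name] = new_feats
    return fused_nodes, edges + [(attach_to, new_name)]


imu_reading = [0.1, -9.8, 0.2]  # hypothetical inertial measurement at the wrist
channel_fused = fuse_on_channel_dimension(nodes, {"wrist": imu_reading})
spatial_nodes, spatial_edges = fuse_on_spatial_dimension(
    nodes, edges, "imu", imu_reading, "wrist")
```

Channel-level fusion keeps the graph topology and widens each feature vector, whereas spatial-level fusion keeps the feature width and enlarges the graph.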
Transformer models showed excellent performance in the analysis of sequential data. We formulate the temporal action segmentation task as an object detection task and use a detection transformer model on our proposed motion image representations. Experiments for our action recognition related approaches are executed on large-scale publicly available datasets. Our approaches for action recognition for various modalities, action recognition by fusion of various modalities, and one-shot action recognition demonstrate state-of-the-art results on some datasets.
Finally, we present a hybrid imitation learning benchmark. The benchmark consists of a dataset, metrics, and a simulator integration. The dataset contains RGB-D image sequences of humans performing movements and executing manipulation tasks, as well as the corresponding ground truth. The RGB-D camera is calibrated against a motion-capturing system, and the resulting sequences serve as input for imitation learning approaches. The resulting policy is then executed in the simulated environment on different robots. We propose two metrics to assess the quality of the imitation. The trajectory metric gives insights into how close the execution was to the demonstration. The effect metric describes how closely the final state reached by the execution matches that of the demonstration. The Simitate benchmark can improve the comparability of imitation learning approaches.
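To make the distinction between the two metrics concrete, here is a minimal sketch under strong assumptions: the trajectory measure is taken to be the mean Euclidean distance between time-aligned 3D points, and the effect measure the Euclidean distance between final object positions. The abstract does not give the actual definitions used in Simitate, so both functions and the sample trajectories are purely illustrative.

```python
import math


def trajectory_distance(demonstration, execution):
    """Mean Euclidean distance between time-aligned points (assumed measure)."""
    if len(demonstration) != len(execution):
        raise ValueError("trajectories must be time-aligned and equally long")
    total = sum(math.dist(p, q) for p, q in zip(demonstration, execution))
    return total / len(demonstration)


def effect_distance(demo_final_state, exec_final_state):
    """Distance between the final states, here reduced to object positions."""
    return math.dist(demo_final_state, exec_final_state)


# Hypothetical end-effector trajectories in metres.
demo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
robot = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

path_error = trajectory_distance(demo, robot)
goal_error = effect_distance((1.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

In this example the robot follows a path offset by one metre yet still moves the object to the demonstrated goal, so a trajectory-style measure is non-zero while an effect-style measure is zero.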
This paper describes the robot Lisa used by team homer@UniKoblenz of the University of Koblenz-Landau, Germany, for its participation in RoboCup@Home 2017 in Nagoya, Japan. A special focus is put on novel system components and the open-source contributions of our team. We have released packages for object recognition, a robot face including speech synthesis, mapping and navigation, a speech recognition interface via Android, and a GUI. The packages are available (and new packages will be released) on http://wiki.ros.org/agas-ros-pkg.
This paper describes the robot Lisa used by team homer@UniKoblenz of the University of Koblenz-Landau, Germany, for its participation in RoboCup@Home 2016 in Leipzig, Germany. A special focus is put on novel system components and the open-source contributions of our team. We have released packages for object recognition, a robot face including speech synthesis, mapping and navigation, a speech recognition interface via Android, and a GUI. The packages are available (and new packages will be released) on http://wiki.ros.org/agas-ros-pkg.
In this thesis we present methods for estimating the motion of an RGB-D camera in six degrees of freedom and for building 3D maps. First, the RGB and depth data are registered and synchronized. After preprocessing, we extract FAST features in two consecutive images. From these, a set of correspondences is built and outliers are filtered out. We then project the correspondences into 3D in order to determine the motion from 3D-3D correspondences by least squares. We further present methods for building 3D maps from motion estimates and RGB-D data. For this we use the OctoMap framework and optionally also build incremental maps from point clouds. Finally, we evaluate the system on the widely used RGB-D benchmark.
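The step of lifting 2D feature correspondences into 3D can be sketched with the standard pinhole camera model, in which a pixel with a measured depth is back-projected along its viewing ray. The intrinsic parameters below are placeholder values chosen for illustration, not calibration results from the thesis, and the function name `back_project` is hypothetical.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera coordinates
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Placeholder intrinsics (assumed, not taken from the thesis).
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

# A feature at the principal point lies on the optical axis.
center_point = back_project(319.5, 239.5, 2.0, FX, FY, CX, CY)

# A feature 525 pixels to the right at 1 m depth lies 1 m to the right.
offset_point = back_project(844.5, 239.5, 1.0, FX, FY, CX, CY)
```

Applying this to both images of a matched feature pair yields the 3D-3D correspondences from which a rigid motion can then be estimated by least squares.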
Object recognition is a well-studied area of image-based computer vision, and a large number of methods have been developed. Recently, approaches based on the Implicit Shape Model concept have become widespread. In these approaches, objects are first decomposed into basic visual parts, which are augmented with spatial information. The object model generated in this way is then used in object recognition to recognize unknown objects. Since the advent of affordable depth cameras such as the Microsoft Kinect, however, object recognition based on 3D point clouds has become increasingly important. In the context of indoor robot vision, a method is developed that builds on existing approaches and thereby extends Implicit Shape Model based object recognition to the processing of 3D point clouds.
The mitral valve is one of the four human heart valves and is located in the left heart chamber. Its function is to regulate the blood flow from the left atrium to the left ventricle. Pathologies can restrict the functionality of the valve, so that blood can flow back into the atrium. Patients affected by such a malfunction may suffer from fatigue and chest pain. The functionality can be restored surgically, which is usually a long and demanding procedure. Thorough planning is therefore necessary to ensure a safe and effective operation. This can be supported by pre-operative segmentations of the mitral valve. A post-operative analysis can determine the success of a procedure. This thesis combines existing and new ideas into a new approach that can serve for the (semi-)automatic creation of such mitral models. The manual part guarantees a high-quality model, while the automatic part helps to save valuable working time.
The main contributions of the automatic algorithm are an approximate semantic separation of the two mitral leaflets and an optimization process that is able to find a coaptation line and coaptation surface between the leaflets. The method can perform a fully automatic segmentation of the mitral leaflets if the annulus ring is already given. The intermediate steps of this procedure are integrated into a manual segmentation method so that a user can influence the overall process. The quality of the generated mitral models is measured by comparing them with models created entirely by hand. This will show that common methods for assessing the quality of a segmentation are too general and are not sufficient to reflect the true quality of a model. Consequently, this thesis introduces measurements that are able to evaluate a segmentation of the mitral valve in detail and with regard to anatomical landmarks. Besides the intra-operative support of a surgeon, a segmented mitral valve offers further advantages. The ability to capture the anatomy of a valve in a patient-specific manner and to evaluate it objectively could serve as a basis for future medical research in this area. Automation allows large amounts of data to be processed with reduced dependence on experts. Furthermore, simulation methods that take a segmented model as input could predict the outcome of an operation.
Bio-medical data comes in various shapes and with different representations.
Domain experts use such data for analysis or diagnosis,
during research or clinical applications. As the opportunities to obtain
or to simulate bio-medical data become more complex and productive,
the experts face the problem of data overflow. Providing a
reduced, uncluttered representation of data that maintains the data’s
features of interest falls into the area of Data Abstraction. Via abstraction,
undesired features are filtered out to give space - concerning the
cognitive and visual load of the viewer - to more interesting features,
which are therefore accentuated. To address this challenge, the dissertation
at hand will investigate methods that deal with Data Abstraction
in the fields of liver vasculature, molecular and cardiac visualization.
Advanced visualization techniques will be applied for this purpose.
This usually requires some pre-processing of the data, which will also
be covered by this work. Data Abstraction itself can be implemented
in various ways. The morphology of a surface may be maintained,
while abstracting its visual cues. Alternatively, the morphology may
be changed to a more comprehensive and tangible representation.
Further, spatial or temporal dimensions of a complex data set may
be projected to a lower space in order to facilitate processing of the
data. This thesis will tackle these challenges and therefore provide an
overview of Data Abstraction in the bio-medical field, and associated
challenges, opportunities and solutions.
Molecular dynamics (MD) as a field of molecular modelling has great potential to revolutionize our knowledge and understanding of complex macromolecular structures. Its field of application is huge, ranging from computational chemistry and biology through materials science to computer-aided drug design. On the one hand, this thesis provides insights into the underlying physical concepts of molecular dynamics simulations and how they are applied in the MD algorithm, and briefly illustrates different approaches, such as the molecular mechanics and molecular quantum mechanics approaches.
On the other hand, a custom all-atom MD algorithm is implemented, based on a simplified version of the molecular-mechanics-based AMBER force field published by Cornell et al. (1995). This simulation algorithm is then used to show, using oxytocin as an example, how the individual energy terms of a force field function. It was observed that applying the bond stretch forces alone caused the molecule to be compacted, first in certain regions and then as a whole, and that adding more energy terms allowed the molecule to move with increasing flexibility.
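The bond stretch term mentioned above has a simple closed form in AMBER-style force fields, a harmonic potential E = K_r (r - r_eq)^2 per bond. The sketch below evaluates this energy and the resulting forces for a single bond. The force constant and equilibrium length are hypothetical values chosen for easy arithmetic, not AMBER parameters, and the function is an illustration rather than the algorithm implemented in the thesis.

```python
import math


def bond_stretch(pos_a, pos_b, k_r, r_eq):
    """Return (energy, force_on_a, force_on_b) for E = k_r * (r - r_eq)**2."""
    delta = [b - a for a, b in zip(pos_a, pos_b)]  # vector from atom a to atom b
    r = math.sqrt(sum(d * d for d in delta))
    energy = k_r * (r - r_eq) ** 2
    # dE/dr = 2 * k_r * (r - r_eq); a stretched bond pulls atom a towards atom b.
    magnitude = 2.0 * k_r * (r - r_eq)
    force_a = [magnitude * d / r for d in delta]
    force_b = [-f for f in force_a]  # equal and opposite (Newton's third law)
    return energy, force_a, force_b


# Hypothetical parameters: a bond stretched from 1.5 to 2.0 length units.
energy, force_a, force_b = bond_stretch(
    (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), k_r=100.0, r_eq=1.5)
```

With only this term active, every bond is pulled back towards its equilibrium length and nothing else in the model pushes atoms apart, which is in keeping with the compaction reported when bond stretch forces are applied alone.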