004 Data processing; Computer science
In this thesis, I study the spectral characteristics of large dynamic networks and formulate the spectral evolution model. The spectral evolution model applies to networks that evolve over time, and describes their spectral decompositions such as the eigenvalue and singular value decomposition. The spectral evolution model states that over time, the eigenvalues of a network change while its eigenvectors stay approximately constant.
I validate the spectral evolution model empirically on over a hundred network datasets, and theoretically by showing that it generalizes a certain number of known link prediction functions, including graph kernels, path counting methods, rank reduction and triangle closing. The collection of datasets I use contains 118 distinct network datasets. One dataset, the signed social network of the Slashdot Zoo, was specifically extracted during work on this thesis. I also show that the spectral evolution model can be understood as a generalization of the preferential attachment model, if we consider growth in latent dimensions of a network individually. As applications of the spectral evolution model, I introduce two new link prediction algorithms that can be used for recommender systems, search engines, collaborative filtering, rating prediction, link sign prediction and more.
The first link prediction algorithm reduces to a one-dimensional curve fitting problem from which a spectral transformation is learned. The second method uses extrapolation of eigenvalues to predict future eigenvalues. As special cases, I show that the spectral evolution model applies to directed, undirected, weighted, unweighted, signed and bipartite networks. For signed graphs, I introduce new applications of the Laplacian matrix to graph drawing and spectral clustering, and describe new Laplacian graph kernels. I also define the algebraic conflict, a measure of the conflict present in a signed graph based on the signed graph Laplacian. I describe the problem of link sign prediction spectrally, and introduce the signed resistance distance. For bipartite and directed graphs, I introduce the hyperbolic sine and odd Neumann kernels, which generalize the exponential and Neumann kernels for undirected unipartite graphs. I show that the problems of directed and bipartite link prediction are related by the fact that both can be solved by considering spectral evolution in the singular value decomposition.
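The central claim of the abstract above can be written compactly. The notation below (adjacency matrix $A$ of an undirected network, eigenvector matrix $U$, diagonal eigenvalue matrix $\Lambda$, parameter $\alpha$) is assumed here for illustration and is not taken from the abstract itself.

```latex
% Eigenvalue decomposition of the symmetric adjacency matrix at time t,
% and the spectral evolution assumption for a later time t' > t:
% the eigenvalues change while the eigenvectors stay approximately constant.
A_t = U \Lambda_t U^{\mathsf{T}},
\qquad
A_{t'} \approx U \Lambda_{t'} U^{\mathsf{T}}

% Two standard graph kernels act only on the eigenvalues,
% which is the sense in which such a model can generalize them
% (the Neumann kernel requires \alpha < 1 / \rho(A)):
\exp(\alpha A) = U \exp(\alpha \Lambda)\, U^{\mathsf{T}},
\qquad
(I - \alpha A)^{-1} = U (I - \alpha \Lambda)^{-1} U^{\mathsf{T}}
```

In this form, learning a link prediction function amounts to choosing a real function applied to each eigenvalue, which is consistent with the one-dimensional curve fitting problem mentioned in the abstract.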
In this thesis we exercise a wide variety of libraries, frameworks and other technologies that are available for the Haskell programming language. We show various applications of Haskell in real-world scenarios and contribute implementations and taxonomy entities to the 101companies system. That is, we cover a broad range of the 101companies feature model and define related terms and technologies. The implementations illustrate how different language concepts of Haskell, such as a very strong typing system, polymorphism, higher-order functions and monads, can be effectively used in the development of information systems. In this context we demonstrate both advantages and limitations of different Haskell technologies.
Distance vector routing protocols are interior gateway protocols in which every router sets up a routing table with the help of the information it receives from its neighboring routers. The routing table contains the next hops and associated distances on the shortest paths to every other router in the network. The security mechanisms implemented in distance vector routing protocols are insufficient; instead, the environment is assumed to be trustworthy. However, routers can be malicious for several reasons and manipulate routing by injecting false routing updates. The authenticity and integrity of transmitted routing updates have to be guaranteed, while the performance cost and the security benefits remain well balanced.
In this paper, several approaches that aim at meeting the above-mentioned conditions are examined, and their advantages and disadvantages are compared.
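The table-building step described above can be sketched as follows. This is a minimal illustration of the generic distance vector (Bellman-Ford style) update, not one of the approaches examined in the paper; the function name `distance_vector` and the example topology are hypothetical. Note that each router accepts its neighbors' advertised distances without any check, which is exactly the trust assumption the abstract criticizes.

```python
INF = float("inf")

def distance_vector(links):
    """Compute routing tables from undirected link costs.

    links maps a pair (u, v) to the cost of the link between u and v.
    Returns table[router][destination] = (distance, next_hop).
    """
    nodes = sorted({n for edge in links for n in edge})
    neighbors = {n: {} for n in nodes}
    for (u, v), cost in links.items():
        neighbors[u][v] = cost
        neighbors[v][u] = cost

    # Initially every router only knows the route to itself.
    table = {r: {r: (0, r)} for r in nodes}

    changed = True
    while changed:
        changed = False
        # Snapshot of the distance vectors advertised in this round.
        adverts = {r: {d: dist for d, (dist, _) in table[r].items()}
                   for r in nodes}
        for r in nodes:
            for nb, cost in neighbors[r].items():
                # The advertised distances are trusted unconditionally.
                for dest, dist in adverts[nb].items():
                    candidate = cost + dist
                    if candidate < table[r].get(dest, (INF, None))[0]:
                        table[r][dest] = (candidate, nb)
                        changed = True
    return table

# Hypothetical three-router topology.
example_links = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 5}
example_table = distance_vector(example_links)
```

In the example topology, router A reaches C through B, because the two-hop path costs less than the direct link.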
Cloud computing is a topic that has gained momentum in recent years. Current studies show that an increasing number of companies are evaluating the promised advantages and considering making use of cloud services. In this paper we investigate the phenomenon of cloud computing and its importance for the operation of ERP systems. We argue that the phenomenon of cloud computing could lead to a decisive change in the way business software is deployed in companies. Our reference framework contains three levels (IaaS, PaaS, SaaS) and clarifies the meaning of public, private and hybrid clouds. The three levels of cloud computing and their impact on ERP systems operation are discussed. From the literature we identify areas for future research and propose a research agenda.
This paper describes results of the simulation of social objects, the dependence of schoolchildren's professional abilities on their personal characteristics. The simulation tool is the artificial neural network (ANN) technology. Results of a comparison of the time expense for training the ANN and for calculating the weight coefficients with serial and parallel algorithms, respectively, are presented.
An estimation of the number of multiplication and addition operations for training artificial neural networks by means of consecutive and parallel algorithms on a computer cluster is carried out. The evaluation of the efficiency of these algorithms is developed. The multilayer perceptron, the Volterra network and the cascade-correlation network are used as structures of artificial neural networks. Different methods of non-linear programming such as gradient and non-gradient methods are used for the calculation of the weight coefficients.
Identifying reusable legacy code able to implement SOA services is still an open research issue. This master's thesis presents an approach to identify legacy code for service implementation based on dynamic analysis and the application of data mining techniques.
As part of the SOAMIG project, code execution traces were mapped to business processes. Due to the large number of traces generated by dynamic analyses, the traces must be post-processed in order to provide useful information.
For this master's thesis, two data mining techniques, cluster analysis and link analysis, were applied to the traces. First tests on a Java/Swing legacy system provided good results compared to an expert allocation of legacy code.
MapReduce with Deltas
(2011)
The MapReduce programming model is extended slightly in order to use deltas. Because many MapReduce jobs are re-executed over slightly changing input, processing only those changes promises significant improvements. Reduced execution time allows for more frequent execution of tasks, yielding more up-to-date results in practical applications. In the context of compound MapReduce jobs, the benefits even add up over the individual jobs, as each job gains from processing less input data. The individual steps necessary in working with deltas are analyzed and examined for efficiency. Several use cases have been implemented and tested on top of Hadoop. The correctness of the extended programming model relies on a simple correctness criterion.
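The idea of processing only the changes can be illustrated with a word count. The sketch below is plain Python, not Hadoop code and not the implementation from the work above; the function names and the representation of a delta as lists of added and removed input lines are assumptions made for illustration. The check at the end, that the incremental result equals a full recomputation, is one natural correctness condition and is assumed here rather than taken from the abstract.

```python
from collections import Counter

def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    return [(word, 1) for word in line.split()]

def full_count(lines):
    """Baseline job: map and reduce over the complete input."""
    counts = Counter()
    for line in lines:
        for word, n in map_words(line):
            counts[word] += n
    return counts

def apply_delta(previous, added, removed):
    """Incremental job: update a previous result using only the delta.

    previous is the result of an earlier run; added and removed are
    the input lines that appeared or disappeared since that run.
    """
    counts = Counter(previous)
    for line in added:
        for word, n in map_words(line):
            counts[word] += n
    for line in removed:
        for word, n in map_words(line):
            counts[word] -= n
    # Drop words whose count fell to zero.
    return Counter({w: c for w, c in counts.items() if c != 0})

# Hypothetical input before and after a small change.
old_input = ["a b", "b c"]
new_input = ["a b", "c d"]

incremental = apply_delta(full_count(old_input),
                          added=["c d"], removed=["b c"])
recomputed = full_count(new_input)
```

Here the incremental job touches two lines instead of rereading the whole input, and `incremental` matches `recomputed`. Word count works this way because its reduce step is a sum, so removed input can be subtracted; reducers without such an inverse would need a different treatment.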