Data as of 13 June 2019
ISBN 978-3-8439-0012-6, series: Informatik (Computer Science)
Data Mining on Chemical Graphs Using Kernel Algorithms
171 pages, dissertation, Eberhard-Karls-Universität Tübingen (2011), softcover, A5
This thesis describes novel algorithmic approaches and experiments for kernel-based data mining on chemical graphs. Data mining is the process of extracting knowledge from given data by applying pattern recognition algorithms. Assessing chemical compounds in silico with dedicated data mining approaches is essential for reducing the number of expensive real-world experiments. First, the theoretical foundations needed to understand the encodings and data mining approaches are introduced. An important point here is how chemical information can be compared and how a special class of functions, the so-called Mercer kernels, can be applied to chemical graphs. Then, a new open-source toolkit for chemical fingerprints is introduced, which has been used in several publications. Next, the results of a study are presented in which a modified large-scale linear support vector machine library was used to tackle large and unbalanced classification problems related to chemical graphs. Large-scale machine learning tasks are becoming increasingly important because the amount of available data is growing steadily. Most approaches, however, are not suited to large-scale learning tasks because of their computational complexity. Subsequently, new approaches are introduced for computing fingerprint encodings that can be compared efficiently. Finally, a novel graph kernel framework is presented that uses the geometrical and topological distance information between the vertices. The results obtained in combination with a support vector machine were excellent.
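To make the idea of comparing chemical fingerprints with a kernel function more concrete, here is a minimal sketch in Python. It is not taken from the thesis: it assumes fingerprints are represented as sets of integer feature IDs and uses the Tanimoto (Jaccard) coefficient, a similarity measure commonly used as a kernel on such sets. The molecule names and feature IDs are hypothetical, and the kernels actually developed in the thesis may differ.

```python
def tanimoto_kernel(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint feature sets."""
    if not a and not b:
        return 1.0  # assumed convention: two empty fingerprints count as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def gram_matrix(fps: dict) -> list:
    """Pairwise kernel values, rows and columns ordered by sorted molecule name."""
    names = sorted(fps)
    return [[tanimoto_kernel(fps[r], fps[c]) for c in names] for r in names]


# Hypothetical fingerprints: each set holds IDs of substructure features.
fingerprints = {
    "mol_a": frozenset({1, 4, 7, 9}),
    "mol_b": frozenset({1, 4, 8}),
    "mol_c": frozenset({2, 3}),
}

gram = gram_matrix(fingerprints)
for row in gram:
    print(row)
```

A matrix of this kind could be passed to a support vector machine that accepts precomputed kernels. For large data sets, computing all pairwise values becomes expensive, which is one reason the abstract highlights linear large-scale methods and fingerprint encodings that can be compared efficiently.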