bufferkdtree
Shark
nnratio
SVM model selection
SVM training
SVM online learning
KTA optimization
MO-CMA
Rprop

Fabian Gieseke, Cosmin Oancea, and Christian Igel. bufferkdtree: A Python Library for Massive Nearest Neighbor Queries on Multi-Many-Core Devices. *Knowledge-Based Systems* 120, pp. 1-3, 2017

Fabian Gieseke, Cosmin Eugen Oancea, Ashish Mahabal, Christian Igel, and Tom Heskes. Bigger Buffer k-d Trees on Multi-Many-Core Systems. *Big Data & Deep Learning in High Performance Computing*, pp. 172-180, Springer-Verlag, 2016

Fabian Gieseke, Justin Heinermann, Cosmin Oancea, and Christian Igel. Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs. *JMLR W&CP* 32 (ICML), pp. 172-180, 2014

The Shark library provides methods for regression, classification, and density estimation, including various kinds of neural networks and kernel methods. It also provides general algorithms for nonlinear optimization, in particular single- and multi-objective evolutionary algorithms and gradient-based methods.

The most recent version of Shark can be downloaded from here.

Christian Igel, Verena Heidrich-Meisner, and Tobias Glasmachers. Shark. *Journal of Machine Learning Research* 9, pp. 993-996, 2008

Jan Kremer, Fabian Gieseke, Kim Steenstrup Pedersen, and Christian Igel. Nearest Neighbor Density Ratio Estimation for Large-Scale Applications in Astronomy. *Astronomy and Computing* 12, pp. 67-72, 2015

Tobias Glasmachers and Christian Igel. Maximum Likelihood Model Selection for 1-Norm Soft Margin SVMs with Multiple Parameters. *IEEE Transactions on Pattern Analysis and Machine Intelligence* 32 (8), pp. 1522-1528, 2010

Tobias Glasmachers. On Related Violating Pairs for Working Set Selection in SMO Algorithms. In M. Verleysen, ed.: *16th European Symposium on Artificial Neural Networks (ESANN 2008)*. Evere, Belgium: d-side publications, 2008

Tobias Glasmachers and Christian Igel. Maximum-Gain Working Set Selection for SVMs. *Journal of Machine Learning Research* 7, pp. 1437-1466, 2006

Antoine Bordes, Seyda Ertekin, Jason Weston, and Léon Bottou. Fast Kernel Classifiers with Online and Active Learning. *Journal of Machine Learning Research* 6, pp. 1579-1619, 2005

Tobias Glasmachers and Christian Igel. Second Order SMO Improves SVM Online and Active Learning. *Neural Computation* 20 (2), pp. 374-382, 2008

Extracting protein-encoding sequences from nucleotide sequences is an important task in bioinformatics. It requires detecting the locations at which coding regions start. These locations are called translation initiation sites (TIS).

The TIS2007 data set is a reliable benchmark designed to evaluate machine learning algorithms for automatic TIS detection. It is based on E. coli genes from the EcoGene database. Only entries with a biochemically verified N-terminus were considered. The neighboring nucleotides were looked up in the GenBank file U00096.gbk. Negative examples were derived from the 732 positive examples. A sequence centered around a potential start codon was accepted as a negative example if the codon is in-frame with one of the real start sites used as a positive case, its distance from a real TIS is less than 80 nucleotides, and no in-frame stop codon occurs in between. This selection produces a difficult benchmark because the negative TISs in the data set are both in-frame with and in the neighborhood of a real TIS. In total, 1248 negative examples were obtained. Each sequence is 50 nucleotides long, with 32 located upstream and 18 downstream including the start codon.
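The three acceptance criteria for negative examples can be sketched as a small predicate. This is only an illustration, not the script used to build TIS2007: the function name, the 0-based forward-strand indexing, and the codon sets (`ATG`/`GTG`/`TTG` as start codons, `TAA`/`TAG`/`TGA` as stop codons) are assumptions.

```python
# Illustrative sketch only. The codon sets and the 0-based, forward-strand
# indexing are assumptions, not taken from the TIS2007 construction.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_DISTANCE = 80  # a candidate must be fewer than 80 nt from the real TIS


def is_negative_candidate(genome: str, candidate: int, true_tis: int) -> bool:
    """Check the three criteria for a potential start codon at `candidate`."""
    if candidate == true_tis:
        return False  # the real site is a positive example
    if genome[candidate:candidate + 3] not in START_CODONS:
        return False  # not a potential start codon
    distance = abs(candidate - true_tis)
    # Criterion 1: in-frame with the real start site.
    if distance % 3 != 0:
        return False
    # Criterion 2: less than 80 nucleotides away from the real TIS.
    if distance >= MAX_DISTANCE:
        return False
    # Criterion 3: no in-frame stop codon between the two sites.
    lo, hi = sorted((candidate, true_tis))
    for pos in range(lo, hi, 3):
        if genome[pos:pos + 3] in STOP_CODONS:
            return False
    return True
```

For example, with the toy string `"ATGAAAGTG"` and a real TIS at position 0, the `GTG` at position 6 is accepted, whereas in `"ATGTAAGTG"` it is rejected because of the intervening `TAA`.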

To minimize sampling effects, 50 different partitionings of the data into training and test sets were generated. Each training set contains 400 sequences plus the associated negatives, and the corresponding test set contains the remaining 332 sequences plus the associated negatives. Each line in a data file starts with the label, 1 or -1 for positive and negative examples, respectively, followed by the nucleotide sequence as an ASCII string.
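A minimal sketch of reading this line format follows. It assumes that the label and the sequence are separated by whitespace, which the description does not state explicitly; the function names are hypothetical. The start codon offset follows from the layout given above (32 upstream nucleotides, so the codon occupies 0-based positions 32 to 34).

```python
from typing import Iterable, List, Tuple

UPSTREAM = 32  # nucleotides before the start codon, per the data description


def parse_tis_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Parse lines of the form '<label> <sequence>' into (label, sequence)."""
    examples = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: whitespace separates the label from the sequence.
        label_text, sequence = line.split(None, 1)
        label = int(label_text)
        if label not in (1, -1):
            raise ValueError(f"unexpected label: {label_text!r}")
        examples.append((label, sequence.strip()))
    return examples


def start_codon(sequence: str) -> str:
    """Return the codon at the candidate TIS position of a 50-nt sequence."""
    return sequence[UPSTREAM:UPSTREAM + 3]
```

The same function works on an open file object, for example `parse_tis_lines(open(path))`, where `path` points to one of the 50 training or test partitions.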

Christian Igel, Tobias Glasmachers, Britta Mersch, Nico Pfeifer, and Peter Meinicke. Gradient-based Optimization of Kernel-Target Alignment for Sequence Kernels Applied to Bacterial Gene Start Detection. *IEEE/ACM Transactions on Computational Biology and Bioinformatics* 4 (2), pp. 216-226, 2007

Britta Mersch, Tobias Glasmachers, Peter Meinicke, and Christian Igel. Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts. *International Journal of Neural Systems* 17 (5), selected paper of ICANN 2006, pp. 369-381, 2007

The covariance matrix adaptation evolution strategy (CMA-ES) is one of the most powerful evolutionary algorithms for real-valued optimization. We propose an elitist version of this algorithm. For step size adaptation, the algorithm uses an improved 1/5-th success rule, which replaces the cumulative path length control of the standard CMA-ES.
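To illustrate success-rule step size control in isolation, here is a minimal sketch of a (1+1)-ES with an isotropic mutation distribution, that is, without covariance matrix adaptation. It uses a smoothed success rate that drives the step size up when successes are frequent and down when they are rare. The constants (`damping`, `p_target`, `c_p`) are assumed defaults and should be checked against the papers listed below; the function name is hypothetical.

```python
import math
import random


def one_plus_one_es(f, x0, sigma0=1.0, iterations=2000, seed=0):
    """Minimize f with an elitist (1+1)-ES and success-rule step size control."""
    rng = random.Random(seed)
    n = len(x0)
    # Assumed constants, not taken from the text above.
    damping = 1.0 + n / 2.0
    p_target = 2.0 / 11.0   # target success probability
    c_p = 1.0 / 12.0        # smoothing rate of the success estimate

    parent, f_parent = list(x0), f(x0)
    sigma, p_succ = sigma0, p_target
    for _ in range(iterations):
        child = [xi + sigma * rng.gauss(0.0, 1.0) for xi in parent]
        f_child = f(child)
        success = f_child <= f_parent
        # Smoothed success rate, then multiplicative step size update.
        p_succ = (1.0 - c_p) * p_succ + c_p * (1.0 if success else 0.0)
        sigma *= math.exp((p_succ - p_target) / (damping * (1.0 - p_target)))
        if success:  # elitist selection: keep the child only if not worse
            parent, f_parent = child, f_child
    return parent, f_parent, sigma
```

A typical call is `one_plus_one_es(lambda x: sum(v * v for v in x), [3.0, 3.0, 3.0])`. Because selection is elitist, the returned objective value can never exceed that of the starting point.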

We developed an incremental Cholesky update for the covariance matrix that replaces the computationally demanding and numerically involved decomposition of the covariance matrix. This rank-one update can replace the decomposition only for the update without evolution path, and it reduces the computational effort by a factor of $n$, where $n$ is the problem dimension. The resulting $(1+1)$-Cholesky-CMA-ES is an elegant algorithm and perhaps the simplest evolution strategy with covariance matrix and step size adaptation.
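The idea of updating a factor of the covariance matrix directly can be sketched as follows. Suppose $C = A A^T$ and the covariance update has the rank-one form $C' = \alpha C + \beta v v^T$ with $v = A z$. Then a factor $A'$ with $A' A'^T = C'$ can be computed in $O(n^2)$ operations, without decomposing $C'$. The code below is a plain-Python sketch of this identity; the function name and the list-of-lists matrix representation are choices made for this example, and the formula should be checked against the papers listed below before use.

```python
import math


def cholesky_rank_one_update(A, z, alpha, beta):
    """Return A' such that A' A'^T = alpha * A A^T + beta * (A z)(A z)^T.

    A is an n x n list of lists, z a list of length n, and alpha > 0.
    """
    n = len(z)
    z_norm_sq = sum(zi * zi for zi in z)
    root_alpha = math.sqrt(alpha)
    if z_norm_sq == 0.0:
        # v = A z is zero, so only the scaling by alpha remains.
        return [[root_alpha * a for a in row] for row in A]
    v = [sum(A[i][k] * z[k] for k in range(n)) for i in range(n)]  # v = A z
    coeff = (root_alpha / z_norm_sq) * (
        math.sqrt(1.0 + (beta / alpha) * z_norm_sq) - 1.0
    )
    # A' = sqrt(alpha) * A + coeff * v z^T, an O(n^2) operation.
    return [
        [root_alpha * A[i][j] + coeff * v[i] * z[j] for j in range(n)]
        for i in range(n)
    ]
```

Both the matrix-vector product and the outer-product update cost $O(n^2)$, compared with $O(n^3)$ for a fresh decomposition, which is consistent with the factor of $n$ mentioned above. Note that the updated factor is in general not triangular.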

Christian Igel, Thorsten Suttorp, and Nikolaus Hansen. A Computational Efficient Covariance Matrix Update and a (1+1)-CMA for Evolution Strategies. *Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006)*, pp. 453-460, ACM Press

Thorsten Suttorp, Nikolaus Hansen, and Christian Igel. Efficient Covariance Matrix Update for Variable Metric Evolution Strategies. *Machine Learning* 75, pp. 167-197, 2009

The covariance matrix adaptation evolution strategy (CMA-ES) is one of the most powerful evolutionary algorithms for real-valued single-objective optimization. We developed a variant of the CMA-ES for multi-objective optimization.

The new multi-objective CMA-ES (MO-CMA-ES) maintains a population of individuals that adapt their search strategy as in the elitist CMA-ES. These individuals are subject to multi-objective selection. The selection is based on non-dominated sorting, using either the crowding distance or the contributing hypervolume as the second sorting criterion. The MO-CMA-ES inherits important invariance properties, in particular invariance under rotation of the search space, from the original CMA-ES.
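The selection scheme can be illustrated with a short sketch of non-dominated sorting with the crowding distance as the second criterion. This is a simple quadratic-time version written for readability, not the implementation used in the MO-CMA-ES code. It assumes that all objectives are minimized, and the function names are hypothetical. The contributing hypervolume variant is not shown.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fronts(points):
    """Split point indices into successive non-dominated fronts."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in one front (larger is better)."""
    distance = {i: 0.0 for i in front}
    for k in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][k])
        lo, hi = points[ordered[0]][k], points[ordered[-1]][k]
        # Boundary points are always preferred.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (points[nxt][k] - points[prev][k]) / (hi - lo)
    return distance
```

Individuals are then ranked first by the index of their front and, within a front, by decreasing crowding distance.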

Christian Igel, Nikolaus Hansen, and Stefan Roth. Covariance Matrix Adaptation for Multi-objective Optimization. *Evolutionary Computation* 15 (1), pp. 1-28, 2007

Thorsten Suttorp, Nikolaus Hansen, and Christian Igel. Efficient Covariance Matrix Update for Variable Metric Evolution Strategies. *Machine Learning* 75, pp. 167-197, 2009

Christian Igel and Michael Hüsken. Empirical Evaluation of the Improved Rprop Learning Algorithm. *Neurocomputing* 50 (C), pp. 105-123, 2003

Christian Igel and Michael Hüsken. Improving the Rprop Learning Algorithm. In H. Bothe and R. Rojas, eds.: *Second International Symposium on Neural Computation (NC 2000)*, pp. 115-121, ICSC Academic Press, 2000

Martin Riedmiller. Advanced supervised learning in multilayer perceptrons - from backpropagation to adaptive learning techniques. *International Journal of Computer Standards and Interfaces* 16 (3), pp. 265-278, 1994

Martin Riedmiller and Heinrich Braun. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: *Proceedings of the International Conference on Neural Networks*, pp. 586-591, IEEE Press, 1993