About Publications Downloads AI Centre ML Section


Multi-Planar UNet

Download Multi-Planar UNet, a simple and thoroughly evaluated deep learning Python framework for segmentation of arbitrary medical image volumes. The system requires no task-specific information, no human interaction and is based on a fixed model topology and a fixed hyperparameter set, eliminating the process of model selection and its inherent tendency to cause method-level over-fitting. The system is available in open source and does not require deep learning expertise to use. Without task-specific modifications, the system performed better than or similar to highly specialized deep learning methods across 3 separate segmentation tasks. In addition, it ranked 5-th and 6-th in the first and second round of the 2018 Medical Segmentation Decathlon comprising another 10 tasks. The system relies on multi-planar data augmentation which facilitates the application of a single 2D architecture based on the familiar U-Net. Multi-planar training combines the parameter efficiency of a 2D fully convolutional neural network with a systematic train- and test-time augmentation scheme, which allows the 2D model to learn a representation of the 3D image volume that fosters generalization.

Paper describing the package

Mathias Perslev, Erik Dam, Akshay Pai, and Christian Igel. One Network To Segment Them All: A General, Lightweight System for Accurate 3D Medical Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI), LNCS 11765, pp. 30-38, Springer, 2019


U-Time is a fully feed-forward deep learning approach to physiological time series segmentation developed for the analysis of sleep data. U-Time is a temporal fully convolutional network based on the U-Net architecture.It maps sequential inputs of arbitrary length to sequences of class labels on a freely chosen temporal scale. This is done by implicitly classifying every individual time-point of the input signal and aggregating these classifications over fixed intervals to form the final predictions.

Download U-Time here.

Paper describing the package

Mathias Perslev, Michael Hejselbak Jensen, Sune Darkner, Poul Jørgen Jennum, and Christian Igel. U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging. Advances in Neural Information Processing Systems (NeurIPS 2019), 2019


Download woody, a Python library for constructing very large random forests.

Paper describing the package

Fabian Gieseke and Christian Igel. Training Big Random Forests with Little Resources. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 1445-1454, ACM Press, 2018


[bufferkdtree] Download bufferkdtree for blistering fast nearest neighbor seach on GPUs. Comes with a Python frontend.

Paper describing the package

Fabian Gieseke, Cosmin Oancea, and Christian Igel. bufferkdtree: A Python Library for Massive Nearest Neighbor Queries on Multi-Many-Core Devices. Knowledge-Based Systems 120, pp. 1-3, 2017
Fabian Gieseke, Cosmin Eugen Oancea, Ashish Mahabal, Christian Igel, and Tom Heskes. Bigger Buffer k-d Trees on Multi-Many-Core Systems. Big Data & Deep Learning in High Performance Computing, pp. 172–180. Springer-Verlag, 2016.
Fabian Gieseke, Justin Heinermann, Cosmin Oancea, and Christian Igel. Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs. JMLR W&CP 32 (ICML), pp. 172-180, 2014


The Shark machine learning library is a modular C++ library for the design and optimization of adaptive systems.

[Shark] The library provides methods for regression, classification, and density estimation, including various kinds of neural networks and kernel methods, as well as general algorithms for nonlinear optimization, in particular single- and multi-objective evolutionary algorithms and gradient-based methods.

The most recent version of Shark can be downloaded from here.

Video lecture

Tobias Glasmachers' 2008 NIPS open source software workshop video lecture on Shark.

Paper describing the library

Christian Igel, Verena Heidrich-Meisner, and Tobias Glasmachers. Shark. Journal of Machine Learning Research 9, pp. 993-996, 2008


Go to nnratio for a nearest neighbor density ratio estimator implemented in Python.

Paper describing the package

Jan Kremer, Fabian Gieseke, Kim Steenstrup Pedersen, and Christian Igel. Nearest Neighbor Density Ratio Estimation for Large-Scale Applications in Astronomy. Astronomy and Computing 12, pp. 67-72, 2015

Maximum Likelihood Model Selection for SVMs

Adapting the hyperparameters of support vector machines (SVMs) is a challenging model selection problem, especially when flexible kernels are to be adapted and data are scarce. We present a coherent framework for regularized model selection of 1-norm soft margin SVMs for binary classification. We propose to use gradient-ascent on a likelihood function of the hyperparameters. The likelihood function is based on logistic regression for robustly estimating the class conditional probabilities and can be computed efficiently. Overfitting is an important issue in SVM model selection and can be addressed in our framework by incorporating suitable prior distributions over the hyperparameters.

Source code

Download example code for Shark version 2.

Paper describing the algorithm

Tobias Glasmachers and Christian Igel. Maximum Likelihood Model Selection for 1-Norm Soft Margin SVMs with Multiple Parameters. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(8), pp. 1522-1528, 2010

Training of Large-scale SVMs

[feasible region of optimization problem] Support vector machines are trained by solving constrained quadratic optimization problems. This is usually done with an iterative decomposition algorithm operating on a small working set of variables in every iteration. The training time strongly depends on the selection of these variables. We propose the maximum-gain working set selection algorithm for large scale quadratic programming. It is based on the idea to greedily maximize the progress in each single iteration. The algorithm takes second order information from cached kernel matrix entries into account. We proved the convergence to an optimal solution of a variant termed hybrid maximum-gain working set selection. This method has been empirically compared to the prominent most violating pair selection and the latest algorithm using second order information. For large training sets our new selection scheme is significantly faster.

Source code

Hybrid Maximum Gain working set selection algorithm implemented into LIBSVM 2.71. It is also implemented in the Shark machine learning library.

Supplementary material

This program checks all 2,834,352 cases considered for the proof of Lemma 8. However, in the meantime Tobias Glasmachers found a much more elegant proof:

Tobias Glasmachers. On Related Violating Pairs for Working Set Selection in SMO Algorithms. In M. Verleysen, ed.: 16th European Symposium on Artificial Neural Networks (ESANN 2008). Evere, Belgien: d-side publications, 2008

Paper describing the algorithm

Tobias Glasmachers and Christian Igel. Maximum-Gain Working Set Selection for SVMs. Journal of Machine Learning Research 7, pp. 1437-1466, 2006

Online and Active Learning for SVMs

Iterative learning algorithms that approximate the solution of support vector machines (SVMs) have two potential advantages. First, they allow for online and active learning. Second, for large datasets computing the exact SVM solution may be too time consuming and an efficient approximation can be preferable. The powerful LASVM proposed by A. Bordes, S. Ertekin, J. Weston, and L. Bottou iteratively approaches the exact SVM solution using sequential minimal optimization (SMO). It allows for efficient online and active learning. This algorithm can be considerably improved in speed and accuracy by replacing the working set selection in the SMO steps. We incorporated a second order working set selection strategy, which greedily aims at maximizing the progress in each single step.

Supplementary material

The official LASVM website can found here. The modified LASVM source code implementing our working set selection strategy can be downloaded here.


Antoine Bordes, Seyda Ertekin, Jason Weston, Léon Bottou. Fast Kernel Classifiers with Online and Active Learning. Journal of Machine Learning Research 5, pp. 1579-1619, 2005
Tobias Glasmachers and Christian Igel. Second Order SMO Improves SVM Online and Active Learning. Neural Computation 20(2), pp. 374–382, 2008

Gradient-based Optimization of Kernel-Target Alignment for Sequence Kernels Applied to Bacterial Gene Start Detection: Sequence Data

Extracting protein-encoding sequences from nucleotide sequences is an important task in bioinformatics. This requires to detect locations at which coding regions start. These locations are called translation initiation sites (TIS).

The TIS2007 data is a reliable data set designed to evaluate machine learning algorithms for automatic TIS detection. It is based on E. coli genes from the EcoGene database. Only entries with biochemically verified N-terminus were considered. The neighboring nucleotides were looked up in the GenBank file U00096.gbk . From the 732 positive examples associated negative examples were created. For the negative examples, sequences centered around a potential start codon were considered and accepted them if the codon is in-frame with one of the real start sites used as a positive case, its distance from a real TIS is less than 80 nucleotides, and no in-frame stop codon occurs in between. This data selection generates a difficult benchmark because the negative TISs in the data set are both in-frame with and in the neighborhood of the real TIS. Finally a set of 1248 negative examples was obtained. The length of each sequence is 50 nucleotides, with 32 located upstream and 18 downstream including the start codon.

To minimize sampling effects, 50 different partitionings of the data into training and test sets were generated. Each training set contains 400 sequences plus the associated negatives, the corresponding test set 332 sequences plus the associated negatives. Each line in a data file starts with the label, 1 or -1 for positive and negative examples, respectively, followed by the nucleotide sequence as ASCII string.

Source code

Gradient-based optimization of the kernel-target alignment is available as a standard SVM model selection method in the Shark library. For an implementation of various oligo kernel functions for the Shark library click here.

Supplementary material

All 50 data partitionings in training and test set.


Christian Igel, Tobias Glasmachers, Britta Mersch, Nico Pfeifer, and Peter Meinicke. Gradient-based Optimization of Kernel-Target Alignment for Sequence Kernels Applied to Bacterial Gene Start Detection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 4(2), pp. 216-226, 2007
Britta Mersch, Tobias Glasmachers, Peter Meinicke, and Christian Igel. Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts. International Journal of Neural Systems 17(5), selected paper of ICANN 2006, pp. 369-381, 2007

The Elitist Covariance Matrix Adaptation Evolution Strategy


The covariance matrix adaptation evolution strategy (CMA-ES) is one of the most powerful evolutionary algorithms for real-valued optimization. We propose the an elitist version of this algorithm. For step size adaption, the algorithms used an improved 1/5-th success rule, which replaces the cumulative path length control in the standard CMA-ES.

We developed an incremental Cholesky update for the covariance matrix replacing the computational demanding and numerically involved decomposition of the covariance matrix. This rank-one update can replace the decomposition only for the update without evolution path and reduces the computational effort by a factor of n, where n is the problem dimension. The resulting (1+1)-Cholesky-CMA-ES is an elegant algorithm and the perhaps simplest evolution strategy with covariance matrix and step size adaptation.

Source code

There are an example program for the (1+1)-CMA-ES in the Shark library.

Papers describing the algorithm

Christian Igel, Thorsten Suttorp, and Nikolaus Hansen. A Computational Efficient Covariance Matrix Update and a (1+1)-CMA for Evolution Strategies. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006), pp. 453-460, ACM Press
Thorsten Suttorp, Nikolaus Hansen, and Christian Igel. Efficient Covariance Matrix Update for Variable Metric Evolution Strategies. Machine Learning 75, pp. 167-197, 2009

Covariance Matrix Adaptation for Multi-objective Optimization


The covariance matrix adaptation evolution strategy (CMA-ES) is one of the most powerful evolutionary algorithms for real-valued single-objective optimization. We developed a variant of the CMA-ES for multi-objective optimization.

In the new multi-objective CMA-ES (MO-CMA-ES) a population of individuals that adapt their search strategy as in the elitist CMA-ES is maintained. These are subject to multi-objective selection. The selection is based on non-dominated sorting using either the crowding-distance or the contributing hypervolume as second sorting criterion. The MO-CMA-ES inherits important invariance properties, in particular invariance under rotation of the search space, from the original CMA-ES.

Source code

The MO-CMA-ES is part of the Shark library. There is a tutorial and an example program that can serve as starting points for using the MO-CMA-ES.

Papers describing the algorithm

Christian Igel, Nikolaus Hansen, and Stefan Roth. Covariance Matrix Adaptation for Multi-objective Optimization. Evolutionary Computation 15(1), pp. 1-28, 2007
Thorsten Suttorp, Nikolaus Hansen, and Christian Igel. Efficient Covariance Matrix Update for Variable Metric Evolution Strategies. Machine Learning, 2009

Rprop with Improved Weight-Backtracking

[Rprop Scheme] The Rprop (Resilient Backpropagation) algorithm proposed by Riedmiller and Braun is one of the best performing first-order learning methods. We introduce slight modifications of the algorithm that further improve its robustness and learning speed.

We suggest to use weight-backtracking for highly non-linear optimization problems. Weight-backtracking means partially retracting “unfavorable” previous update steps. Whether a parameter change was “unfavorable” or not is decided by a heuristic. We propose an improved weight-backtracking heuristic considering both the evolution of the partial derivatives and the value of the objective function.

Source code

The Rprop algorithms are implemented in the Shark library, which comes with several examples using the Rprop for optimization.


Christian Igel and Michael Hüsken. Empirical Evaluation of the Improved Rprop Learning Algorithm. Neurocomputing 50(C), pp. 105-123, 2003
Christian Igel and Michael Hüsken. Improving the Rprop Learning Algorithm. In H. Bothe and R. Rojas, eds.: Second International Symposium on Neural Computation (NC 2000), pp. 115-121, ICSC Academic Press, 2000)
Martin Riedmiller. Advanced supervised learning in multilayer perceptrons-from backpropagation to adaptive learning techniques. International Journal of Computer Standards and Interfaces 16(3), pp. 265-278, 1994.
Martin Riedmiller, Heinreich Braun. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: Proceedings of the International Conference on Neural Networks, pp. 586-591, IEEE Press, 1993

German Traffic Sign Recognition Benchmark

The German Traffic Sign Recognition Benchmark (GTSRB) is a multi-class image classification benchmark in the domain of advanced driver assistance systems and autonomous driving. It was first published at IJCNN 2011.

Official data archive

The offical archive can be found here.

Data in different format

Because the data is a bit difficult to read in and process, I have put an archive online with a simple directory structure and the final training and test images in PNG format, which can be easily processed by popular deep learning frameworks. Let me know if there is something wrong with this archive. Click here.


Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. Man vs. Computer: Benchmarking Machine Learning Algorithms for Traffic Sign Recognition. Neural Networks 32, pp. 323-332, 2012
Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. The German Traffic Sign Recognition Benchmark: A Multi-class Classification Competition. International Joint Conference on Neural Networks (IJCNN 2011), pp. 1453-1460, IEEE Press, 2011

German Traffic Sign Detection Benchmark

Data from the German Traffic Sign Detection Benchmark (GTSDB). This archive contains the data used in the IJCNN 2013 competition.

Official data archive

The offical archive for the detection benchmark be found here.


Sebastian Houben, Johannes Stallkamp, Jan Salmen, Marc Schlipsing, and Christian Igel. Detection of Traffic Signs in Real-World Images: The German Traffic Sign Detection Benchmark. International Joint Conference on Neural Networks (IJCNN 2013), pp. 715-722, IEEE Press, 2013