Pattern Recognition course

After completing this course, students will have a good understanding of basic pattern recognition and machine learning techniques and will be able to recognize which method is most applicable to the data analysis problems they encounter in bioinformatics applications.

Date: September 25-29, 2017

Target audience: The course is aimed at PhD students with a background in bioinformatics, computer science or a related field such as the life sciences. Participants from the private sector are also welcome. A working knowledge of basic statistics and linear algebra is assumed. Preparation material on statistics and linear algebra will be distributed before the course, to be studied by participants who lack the required background.


Many problems in bioinformatics require classification: predicting the class to which a certain object (e.g. a gene, protein, cell, patient, …) belongs. This calls for algorithms that can assign the most likely label (a discrete output) to an object, given one or more measurements on that object. For most interesting problems, the underlying physics is too complex to design such an algorithm explicitly. In such cases, a machine learning approach is often taken: an algorithm is constructed with parameters that are tuned on an available dataset of training examples. The algorithm should predict the labels of these examples as well as possible, yet still generalize, i.e. perform well on objects not seen before. Some examples of classification problems in bioinformatics are gene finding (sequence in, gene presence out), diagnostics (gene expression data in, diagnosis out) and data integration (measurements in, probability of interaction out).
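
As a rough illustration of the idea of tuning parameters on training examples and then checking generalization, the sketch below trains a nearest-mean classifier on synthetic two-class data and compares its accuracy on the training set with its accuracy on held-out objects. It is not course material: the lab sessions use Matlab, and the data, function names (make_data, train_nearest_mean, predict, accuracy) and the choice of classifier are assumptions made purely for this example.

```python
import random


def make_data(n_per_class, rng):
    """Draw synthetic 2-D measurements for two classes (made-up data)."""
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            x = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((x, label))
    rng.shuffle(data)
    return data


def train_nearest_mean(examples):
    """Tune the parameters (one mean vector per class) on training examples."""
    sums, counts = {}, {}
    for x, label in examples:
        s = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            s[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}


def predict(means, x):
    """Assign the label whose class mean is closest to the measurement x."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(x, mean))
    return min(means, key=lambda label: sq_dist(means[label]))


def accuracy(means, examples):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(means, x) == y for x, y in examples) / len(examples)


rng = random.Random(0)               # fixed seed so the run is repeatable
data = make_data(200, rng)           # 400 labelled objects in total
train, test = data[:300], data[300:] # hold out 100 objects not seen in training

means = train_nearest_mean(train)
train_acc = accuracy(means, train)
test_acc = accuracy(means, test)
print(f"training accuracy: {train_acc:.2f}")
print(f"test accuracy:     {test_acc:.2f}")
```

The test accuracy is the quantity of interest here: it estimates how well the tuned classifier performs on objects it has not seen before, which the training accuracy alone cannot show.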

In this course, we will introduce basic techniques from the fields of pattern recognition and machine learning to solve such problems. We will introduce the pattern recognition pipeline: measuring, feature extraction and selection, classification and evaluation. The course is a mixture of theory sessions and lab sessions. Matlab will be used during the lab sessions, and a brief introduction to Matlab will be provided. To complete the course, participants must afterwards submit a 5-10 page report describing the analysis of a biological dataset using some of the methods taught in the course.

Course website: