- Machine Learning (Fundamental), Amsterdam – 26-30 September 2022
- Algorithms for Genomics (Advanced), Delft – 10-14 October 2022
- Computational Metagenomics, Wageningen – 10-12 October 2022
- Single cell Analysis, Leiden – 10-14 October 2022
- Integrated modeling and optimization (Fundamental), December 2022
Fundamental and advanced courses
BioSB offers fundamental courses and advanced courses. The current list of courses we offer can be found below.
- Fundamental courses: fundamental courses are organised once per year. In 5 days, a range of experts from the Dutch bioinformatics & systems biology community will provide you with a solid foundation in core technologies in bioinformatics and systems biology, laced with examples of applications. The advanced courses build on this foundation.
- Advanced courses: advanced courses are organised once every 2 years. Each course deals with a specific topic in current bioinformatics and systems biology. Each 5-day course is organised by experts and exposes students to both fundamental approaches and recent developments. Most build on methods introduced in the fundamental courses, though depending on prior education it may not be necessary to follow these first.
- Other courses: we occasionally co-organise courses with other research schools or announce other courses via our website. These courses may have a different format and may be offered only occasionally. Other courses are announced via the upcoming courses list.
You can pre-register for a course via this form. We will send you an e-mail as soon as the course you are interested in is announced and again when registration opens.
General enrollment information for our courses can be found here.
Perry Moerland, Amsterdam UMC (location AMC)
1.5 ECTS for following the course, 3 ECTS when successfully completing a final assignment
Modern biology is a data-rich science, driven by our ability to measure the detailed molecular characteristics of cells, organs, and individuals at many different levels. Interpretation of these large-scale biological data requires the detection of statistical dependencies and patterns in order to establish useful models of complex biological systems. Techniques from machine learning are key in this endeavour. Typical examples are the visualization of single-cell RNA-seq data using dimensionality reduction methods, base calling for nanopore sequencing data using hidden Markov models and (recurrent) neural networks, and classification of high-throughput microscopy image data using convolutional neural networks. In this one-week course, the foundations of machine learning will be laid out and commonly used methods for unsupervised (clustering, dimensionality reduction, visualization) and supervised (mainly classification) learning will be explained in detail. Methods will be illustrated using recent examples from the fields of systems biology and bioinformatics. Methods discussed in the morning lectures will be put into practice during the afternoon computer lab sessions.
- Density estimation, including histograms, nearest neighbour, Parzen
- Evaluation, including ROC, cross-validation
- Parametric and non-parametric classifiers, including linear discriminant analysis, k-nearest neighbours, logistic regression, decision trees and random forests
- Feature selection, including search algorithms (forward, backward, branch & bound) and sparse classifiers (ridge, lasso, elastic net)
- Dimensionality reduction, including principal component analysis, multi-dimensional scaling, t-SNE
- Clustering, including hierarchical clustering, k-means, Gaussian mixture models
- Hidden Markov models
- (Deep) neural networks
- Kernel-based methods, including support vector machines
After having followed this course, the student has a good understanding of a wide range of machine learning techniques and is able to recognize which method is most applicable to the data analysis problems they encounter in bioinformatics and systems biology applications.
The course is aimed at PhD students with a background in bioinformatics, systems biology, computer science, life sciences or a related field. Participants from the private sector are also welcome. A working knowledge of basic statistics and linear algebra is assumed. Preparation material on statistics and linear algebra will be distributed before the course, to be studied by students missing the required background.
Natal van Riel, Eindhoven University of Technology
1.5 ECTS for following the course, 3 ECTS when successfully completing a final assignment.
Living organisms are characterized by an amazing degree of hierarchical complexity. Although our ability to collect measurements at different spatial levels and time-scales has grown dramatically, it has become clear that measurements alone cannot unravel biological complexity. This is because the dynamical behavior of complex systems cannot be reduced to the linear sum of the functions of their parts. Hence, computational modelling is essential for understanding the mechanisms underlying patterns observed in experimental data, in particular when studying dynamic phenomena. Mathematical models offer a relatively cheap way to generate and test hypotheses about these mechanisms. However, given the huge complexity and peculiar features of biological systems, it is necessary to carefully understand the specific modelling requirements they pose, in order to define what a good model should look like. In this sense modelling is a craft that can only be learned through intensive practice and ‘learning by doing’. In this course we offer the participants the opportunity to learn and practise the modeling process.
In validating models one always needs to fit models to data: the parameters present in any realistic model have to be chosen by comparing model predictions with data. In this matching process optimization techniques are indispensable. That’s why a considerable part of this course is spent on getting you acquainted with the optimization techniques that are available and widely used today. Numerical optimization is also the basis for so-called flux balance analysis (FBA), commonly used to study large metabolic networks. This type of model, and its analysis and simulation, is also introduced in the course.
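As a rough illustration of fitting a model parameter to data by optimization, the sketch below estimates the rate constant of a one-parameter decay model by minimizing the sum of squared errors. Everything in it is an assumption made for illustration: the model (dy/dt = -k*y), the synthetic noise-free data, the function names, and the choice of golden-section search as the optimizer. It is written in Python rather than the Matlab used in the course practicals, and it is not taken from the course material.

```python
import math


def simulate(k, times, y0=1.0):
    """Analytic solution of dy/dt = -k*y with y(0) = y0 (hypothetical toy model)."""
    return [y0 * math.exp(-k * t) for t in times]


def sum_squared_error(k, times, observed):
    """Least-squares cost: squared mismatch between model predictions and data."""
    predicted = simulate(k, times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


# Synthetic, noise-free "measurements" generated with a known rate constant.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
true_k = 0.5
observed = simulate(true_k, times)

# Estimate k by minimizing the cost over an assumed search interval [0, 5].
k_hat = golden_section_minimize(
    lambda k: sum_squared_error(k, times, observed), 0.0, 5.0
)
print(round(k_hat, 4))
```

A one-dimensional search is enough here because the toy model has a single parameter and the cost has a single minimum on the chosen interval. Realistic models with many parameters call for the multi-dimensional local and global methods covered in the course.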
The course is a mixture of theory sessions and computer practicals. The course is completed with an assignment to be finalized afterwards, for those who want to acquire 3 ECTS.
The students will be provided with a theoretical basis, a variety of methods, and hands-on computational experience to set up systems biology models and handle numerical optimization.
In the course the students will learn:
- To understand the common ground and the differences for applications of dynamic modeling in metabolic, regulatory, signaling, and multi-scale biological processes
- How to set up a dynamic model to represent biological networks using different interaction mechanisms
- To implement, simulate and analyze dynamic network models
- To understand the wide variety of problems in modelling that can be solved with optimization
- To apply different types of numerical optimization methods
- To combine dynamic modeling and optimization in order to integrate experimental data in modelling, estimate model parameters and design experiments
- To understand how numerical optimization (linear programming) works in flux balance analysis to simulate metabolic network models
- Nonlinear differential equations, numerical simulation, parameter sensitivity analysis.
- Parameter estimation, identifiability, uncertainty quantification, experimental design, regularization.
- Global and local search methods: steepest descent, Levenberg-Marquardt, genetic algorithms, linear programming.
The course is aimed at PhD students with a background in bioinformatics, systems biology, computer science, life sciences or a related field. Participants from the private sector are also welcome. A working knowledge of mathematics, especially differential equations, is recommended, but we will distribute preparation material to be studied by students missing the required background. Furthermore, at the start we offer a math refresher to help those participants who are not (yet) involved in modelling on a daily basis.
Examples and computer practicals use Matlab. A computer with a working version of Matlab is needed, and some programming experience and knowledge of Matlab are required to take the course. A short introductory training in Matlab will be made available (online) for those without Matlab skills.
- prof. dr. ir. Dick de Ridder, Wageningen University
- dr. Aalt-Jan van Dijk, Wageningen University
1.5 ECTS for following the course, 3 ECTS when successfully completing a final assignment
Molecular biology is concerned with the study of the presence of and interactions between molecules, at the cellular and sub-cellular level. In bioinformatics and systems biology, algorithms and tools are developed to model these interactions, with various goals: predicting as yet unobserved interactions; assigning functions to as yet unknown molecules through their relations with known molecules; predicting certain phenotypes such as diseases; or just building up biological knowledge in a structured way.
Such interactions are often best modelled as networks or graphs, which opens up the possibility of using a large number of readily available algorithms for inferring networks, performing simulations of biology, optimising paths or flows through networks, graph-based data integration and graph mining. Many of these algorithms can be applied (sometimes with slight alterations) to solve a particular biological problem, such as modeling transcriptional regulation or predicting protein interaction/complex formation, but also to derive systems behaviour by breaking down networks into modules or motifs with certain characteristics.
In this course, we will first give a brief overview of molecular biology, the advent of high-throughput measurement techniques and large databases containing biological knowledge, and the importance of networks to model all this. We will highlight a number of peculiar features of biological networks. Next, a number of basic network models (linear, Boolean, Bayesian) will be discussed, as well as methods of inferring these from observed measurement data. A number of alternative network models more suited for high-level simulation of cellular behaviour will also be introduced. Building on the network inference methods, a number of ways of integrating various data sources and databases to refine biological networks will be discussed, with specific attention to the use of sequence information to refine interaction and transcription regulation networks. Finally, we will give some examples of algorithms exploiting the networks found to learn about biology, specifically for inspecting protein interaction networks.
The course is aimed at PhD students with a background in bioinformatics, computer science or a related field; a working knowledge of basic statistics and linear algebra is assumed. The BioSB fundamental course “Machine Learning for Bioinformatics & Systems Biology” discusses many of the tools used in this course, but having followed it is not required. Prior knowledge of molecular biology is a bonus, but also not strictly required.
- dr. Thomas Abeel (TU Delft)
- dr. Sandra Smit (Wageningen University)
- dr. Jasmijn Baaijens (TU Delft)
Comparative genomics aims to compare large sets of genomes in order to understand and explain differences in traits of an organism. Contemporary methods are powered by fundamental algorithms and data structures, which are efficient and scale to large data sets. A thorough understanding of these algorithms and data structures is necessary for advanced users and developers in this area. In addition, understanding how comparative genomics is developing is important to shape your own research.
In this course, we will cover genome analysis, variant analysis, and pangenomics. Core concepts, applications, and future trends will be discussed, with a focus on the algorithms and data structures underlying state-of-the-art methods. The course offers an engaging mix of lectures, paper discussions, hands-on tutorials, and a do-it-yourself project.
The course is aimed at PhD students with a background in bioinformatics, computer science or a related field. Participants are expected to have experience in command-line usage (Unix shell) and programming (Python), and have basic knowledge of genomics.
After having followed this course, the student has a good understanding of algorithms and data structures in comparative genomics, is able to implement algorithms in Python, is able to read and understand method papers in bioinformatics in detail, and is able to work with state-of-the-art command-line tools for genomics.
This introductory course gives an overview of many statistical tools to analyse omics data. The course can be followed by researchers with a minimum or elementary background in quantitative data analysis. Find more information about pre-requisites here.
Participants will learn and practice commonly used tools including:
- Tools to explore datasets including clustering, principal components and network analysis
- Models to answer basic statistical questions: differential behaviour (e.g. mRNA expression) and multiple testing, also using Bayesian models
- Models for classification and prediction, including penalised regression
- Models for emerging technologies: radiomics and single-cell sequencing data
Methods will be applied to experimental data in practical hands-on sessions using the statistical software R. Insight into how methods work is given intuitively wherever possible, which, combined with some formalisation and the practical work, makes theory accessible and helps cement concepts. Slides and instructions for the practical sessions will be made available electronically to participants.
The course is tailored for PhD students and researchers (such as pathologists, psychological biologists, human geneticists, oncologists, neuro-geneticists) whose research involves experiments that generate omics data. It can also suit researchers with a quantitative background looking for a short introductory course.
Maria Suarez Diez (Wageningen University), Marian Breuer (Maastricht University)
1.5 + 1.5 ECTS (attendance -/+ assessment)
Constraint-based modeling is a powerful framework for modeling reaction fluxes in genome-scale metabolic network models, with uses in fundamental and applied questions relevant to biotechnology, microbiology and biomedicine. These genome-scale metabolic models encompass all metabolic reactions encoded in an organism, possibly contextualized to a particular cell type and condition. Constraint-based modeling provides a scalable framework to analyze these large-scale models, whether they describe microbes, human cells in health and disease or multi-cellular systems like microbiota and the whole body. In this course, you will be introduced to:
- the principles of constraint-based reconstruction and analysis (COBRA)
- the underlying mathematical foundations of constraint-based modeling
- content, structure and reconstruction of genome-scale metabolic models
- basic and advanced methods for interrogating models and interpreting results
- methods for integration of omics data with genome-scale metabolic models
- multi-scale applications, in particular dynamic Flux Balance Analysis
- constraint-based modeling of microbial communities
- open source COBRA software, with a focus on the Python package COBRApy
- standards for reconstruction and model sharing
- example applications to biomedicine and biotechnology
The course is structured into both lectures and practical sessions so that theory can be illustrated with biologically motivated computational examples.
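For orientation, flux balance analysis, one of the most widely used constraint-based methods, is commonly written as the linear program below. The notation is introduced here for illustration and is not taken from the course material: S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c holds the objective weights (for example, selecting a biomass reaction), and v^min and v^max are lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

The equality constraint expresses the steady-state assumption that every internal metabolite is produced and consumed at the same rate, while the bounds encode reaction reversibility and uptake limits.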
Constraint-based modeling is a rapidly growing field that is being used in both fundamental and applied research and biotechnology. It sits at the intersection between quantitative modeling, bioinformatics and cellular physiology, and as such, is an example of systems biology at work.
While participants would benefit from some knowledge of quantitative modeling, bioinformatics or metabolism, this is not required. The target audience includes, for example, scientists working in a laboratory who want to learn more about building or using genome-scale metabolic models, as well as those who have used some basic COBRA methodologies but want to know how else these can be applied to their research.
- Introduction to COBRA
- Proficiency with the COBRApy package in Python
- Familiarity with typical applications of constraint-based modeling in biomedicine and biotechnology
Marco Roos (LUMC) and Katy Wolstencroft (Leiden University)
Credits and grading
The total study load of the course is 1.5 EC.
With endorsements for FAIR* data stewardship ranging from Nature Genetics to the G7, and increasing pressure from funders for much stricter data management, FAIR data stewardship skills will be among the most wanted for the next decade. By following this course, you add these skills to your CV and learn cutting-edge semantic techniques to search and integrate health and life science data for efficient, reproducible data science.
* FAIR: Findable, Accessible, Interoperable and Reusable for humans and computers
The Life Science data available in the public domain are a vast and growing resource for bioinformatics research. There are over 20 million papers in PubMed and over 1600 biological databases. In many cases finding and applying the information from these resources is far from trivial. This course will show you techniques for working with these distributed resources, including the web of Linked Data and scientific workflows. It will also focus on methods for using or linking your own data into this large distributed Semantic Web of resources, in order to ensure that your data is FAIR (Findable, Accessible, Interoperable and Reusable).
This course is for bioinformaticians who would like to learn about leading-edge data and knowledge integration solutions. You will learn (1) powerful and flexible approaches to data and information management for your bioinformatics application (Semantic Web and Linked Data), (2) how to work with data across remote locations, for instance by applying Web Services and workflows, and (3) how to publish your own data to make it available and reusable for the rest of the community. We assume a basic understanding of bioinformatics programming for the hands-on sessions. It would suit previous user participants of BYOD meetings who would like more hands-on experience of data integration. It would also suit data providers who would like to explore new ways of serving their data or integrating it with other resources.
This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. After following this course students should be able to start creating their first applications based on these technologies or make more informed design decisions for their current application.
In this course you will learn about:
- Linked Data and the Semantic Web technologies that underpin it
- How you can use Linked Data for data and knowledge integration in the Life Sciences
- Available Linked Data resources in the public domain and large-scale projects that use these resources
- How you can integrate your own data with Linked Data resources
- How you can combine data integration and analysis over distributed resources, using Web Services and workflows
- Andrew Stubbs, Erasmus MC
- Miaomiao Zhou, Avans University of Applied Sciences
This course covers the basic concepts and methods required for RNA-seq analysis. Particular attention is given to the data analysis pipelines for differential transcript expression and variant calling. Presentations are followed by hands-on computer sessions using Linux command line tools, Galaxy and R to directly apply the analysis methods and gain more insight into them. One afternoon is dedicated to the analysis of a new data set, allowing the students to refresh and extend their analysis skills. Students can also bring their own RNA-seq data to practise on. We will also explore the potential of long-read based RNA-seq in a Friday afternoon literature club. After the course, the presentations, practicals and test data will remain available for future reference. Software packages used are freeware.
Participants for the RNA-seq course should preferably have participated in a general next-generation sequencing (NGS) course or otherwise have some hands-on experience with NGS data analysis. The course is aimed at PhD students and post-docs, but scientific programmers and data analysts with a background in biology and bioinformatics may also attend.
After having followed this course:
- The participant has insight into the issues involved in good experimental design of RNA-seq experiments.
- The participant knows and can perform the analysis steps in reference-based and de novo RNA-seq data analysis, and can visually present and judge the results for:
  - quality control and preprocessing,
  - finding differentially expressed genes,
  - variant calling,
  - cluster analysis,
  - classification analysis,
  - pathway testing.
- The participant has insight into various RNA-seq platforms, their specificity in solving certain biological questions, and the bottlenecks in these applications.
- The participant has insight into the different algorithms and options available to perform an analysis, and can make an informed choice.
- The participant knows the pitfalls of existing analyses and is able to critically judge the statistical analysis of RNA-seq data performed by others.
- Natal van Riel, Eindhoven University of Technology
The human body can be thought of as a system of interconnected networks organized from individual cells through to tissues and organs. In disease states, a disturbance in one sub-network can have knock-on effects at multiple levels of the system, giving rise to often complex disease phenotypes. Advances in high-throughput omics platforms and wearable devices have provided us with unparalleled insights into the drivers, mechanisms, and markers of disease states in the human body. However, handling the sheer volume of data being generated has also presented researchers with many new challenges, necessitating the development and application of computational approaches to integrate, analyse, and interpret these data.
The course will provide an overview of the current state of the art in computational and systems biology approaches being employed in biomedical research. These range from so-called bottom-up models, which incorporate existing knowledge to form a scaffold upon which emergent behaviours of a system can be studied, to top-down data-driven approaches that identify novel predictors of disease or treatment responsiveness. Topics covered in this course will include:
- Algorithms for clinical genomics
- Dynamic modelling for personalized medicine and Digital Twins
- Genome scale modelling and network analysis
- Agent based modelling
- Machine learning on clinical data and decision support systems
The course will be structured with each day focusing on a specific topic. Morning sessions will cover the background and theory underlying approaches commonly used in the discipline; hands-on practicals may also be provided. Afternoon sessions will include talks and question-and-answer sessions with leading researchers from universities and academic medical centres in the Netherlands, who will share how and where the introduced approaches are currently being applied in biomedical research.
This course is aimed at PhD students and postdocs with a background in systems biology, bioinformatics, biomedical engineering or a similar field. The topics covered in this course build on or touch upon approaches introduced in the fundamental courses in Machine Learning and Integrated Modelling and Optimization. While it is not required to have followed these courses, prior familiarity with the implementation of knowledge-driven or machine learning models is recommended.