The course is aimed at PhD students with a background in bioinformatics, computer science or a related field; a working knowledge of basic statistics and linear algebra is assumed. The BioSB course “Pattern recognition” and the ASCI course “Advanced pattern recognition” (a1) discuss many of the tools used in this course, but having followed them is not required. Prior knowledge of molecular biology is a bonus, but is likewise not strictly required.
Preparation material on probability theory, linear algebra and molecular biology can be found on the course website (TBA) and should be read by all students before the course starts.
Molecular biology is concerned with the presence of, and interactions between, molecules at the cellular and sub-cellular level. In bioinformatics and systems biology, algorithms and tools are developed to model these interactions, with various goals: predicting as yet unobserved interactions; assigning functions to as yet unknown molecules through their relations with known molecules; predicting certain phenotypes, such as diseases; or simply building up biological knowledge in a structured way.
Such interactions are often best modelled as networks or graphs, which opens up the possibility of using a large number of readily available algorithms for inferring networks, simulating biological systems, optimising paths or flows through networks, graph-based data integration and graph mining. Many of these algorithms can be applied (sometimes with slight alterations) to solve a particular biological problem, such as modelling transcriptional regulation or predicting protein interaction and complex formation, but also to derive systems behaviour by breaking down networks into modules or motifs with certain characteristics.
In this course, we will first give a brief overview of molecular biology, the advent of high-throughput measurement techniques and large databases containing biological knowledge, and the importance of networks for modelling all this. We will highlight a number of peculiar features of biological networks. Next, several basic network models (linear, Boolean, Bayesian) will be discussed, as well as methods for inferring them from measurement data. Alternative network models better suited to high-level simulation of cellular behaviour will also be introduced. Building on the network inference methods, we will discuss ways of integrating various data sources and databases to refine biological networks, with specific attention to the use of sequence information to refine interaction and transcriptional regulation networks. Finally, we will give some examples of algorithms that exploit the inferred networks to learn about biology, specifically for inspecting protein interaction networks.