The Swammerdam Institute for Life Sciences (SILS) has a vacancy for a Postdoctoral researcher in Network-based data integration.
In a growing number of cases in systems biology, multiple connected data sets are collected. These can be multiple types of genomics data measured on the same system (e.g. on the same subjects/patients) or the same set of biochemical entities measured on related systems (e.g. gene expression in cancer cell lines). There is growing awareness that such data should be analysed and modelled simultaneously in order to arrive at a global understanding of the whole system. A fruitful approach to this problem is to build empirical models of all related data sets simultaneously by using data fusion or data integration methods. In essence, all these methods are based on correlations and are generalizations of component or factor analysis. Hence, they are purely data-driven.
Several extensions of the existing data fusion methods can be envisioned. Increasingly, pathway information is available for a system. The data can then be broken down by pathway, and the analysis can be focussed on studying the behaviour of a pathway measured under different conditions. In that case, the different conditions have to be encoded and brought into the data fusion method as a priori information. To take this one step further, it will be of interest to study the relationships between pathways (initially metabolic pathways, but signal transduction pathways will also be considered), perhaps also as a function of conditions. In that case, the grouping of the different pathways has to be imposed on the data fusion methods. Finally, part of the biological network may be known, and imposing this known structure (e.g. the topology of the known part) focusses the data analysis and makes the results of the global model more interpretable.
A new set of data fusion methods has to be developed that combines biological information (pathways, networks or known biological relationships) with empirical data fusion methods. Vehicles for accomplishing this are penalized methods and Bayesian methods. These methods will be developed, tested and validated in close collaboration with the medical biology partners.
Candidates should have a PhD in the Life Sciences (bioinformatics, systems biology, data analysis or chemometrics).