The ELIXIR Implementation Study ‘MolData2’ has recently been completed. Friederike Ehrhart of Maastricht University: “We investigated the current status of genotype-phenotype databases for rare diseases. In addition, we developed tools to tackle some of the most urgent problems.”
Ehrhart: “This study was conducted by the Dutch and Spanish ELIXIR nodes, Maastricht University, Leiden University Medical Centre, and the Barcelona Supercomputing Centre. We focused on genotype-phenotype databases such as ClinVar. These databases store genetic variants along with information about their pathogenicity. Researchers and clinicians need genotype-phenotype information to identify disease-causing mutations for diagnosis, clinical management of the disease, and counselling of parents.”
Rett syndrome
As a test case, the researchers studied MECP2 variations, which may cause Rett syndrome. Ehrhart: “We found that at present, about 13 databases worldwide store information on MECP2 and Rett. However, in many of these databases the data was not properly annotated or stored, or was not technically available for future re-use. We recently published this part of the MolData2 study in the journal Human Mutation. In the article, we try to raise awareness in the rare disease community that collecting data in itself is not enough: findability, accessibility, interoperability, and reusability are essential. The paper has been honoured as the editor’s choice article.”
Data integration
The researchers also developed a workflow to integrate genotype-phenotype data from multiple sources and to make it available in RDF format. “The next step is to integrate the available information from multiple databases in order to get a more detailed image of what the different Rett syndrome-causing mutations have in common compared to benign MECP2 mutations. In contrast to smaller national studies, we can use information from more than 10,000 individual database entries if we integrate data from multiple databases. This approach is highly interesting for the rare disease community because the number of rare disease patients is small by definition,” says Ehrhart.
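To make the idea concrete, the following is a minimal Python sketch of what such an integration step could look like. It is not the MolData2 workflow itself: the two sources, the record fields, the variant identifiers, and the example.org namespace are all hypothetical and chosen only for illustration. Records that share a variant identifier are merged, and each merged record is written out as RDF in N-Triples syntax.

```python
from collections import defaultdict

# Hypothetical records from two sources; identifiers and values are illustrative only.
SOURCE_A = [
    {"variant": "var-001", "gene": "MECP2", "classification": "pathogenic"},
    {"variant": "var-002", "gene": "MECP2", "classification": "benign"},
]
SOURCE_B = [
    {"variant": "var-001", "gene": "MECP2", "phenotype": "Rett syndrome"},
]

BASE = "http://example.org/"  # placeholder namespace, not a real vocabulary


def merge_records(*sources):
    """Combine records that share the same variant identifier."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            merged[record["variant"]].update(record)
    return dict(merged)


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(merged):
    """Serialise merged records as one N-Triples line per field."""
    lines = []
    for variant_id in sorted(merged):
        record = merged[variant_id]
        subject = f"<{BASE}variant/{variant_id}>"
        for key in sorted(record):
            if key == "variant":
                continue  # the identifier is already part of the subject IRI
            predicate = f"<{BASE}prop/{key}>"
            lines.append(f'{subject} {predicate} "{escape_literal(record[key])}" .')
    return lines


merged = merge_records(SOURCE_A, SOURCE_B)
ntriples = to_ntriples(merged)
print("\n".join(ntriples))
```

In this sketch, a later source overwrites a field that an earlier source already supplied. A real integration would need to keep provenance and handle conflicting classifications explicitly.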
The team also improved resources for working with molecular rare disease data. They updated the open-source CyTargetLinker app for Cytoscape, a popular network analysis tool. CyTargetLinker provides a simple interface to extend networks with links to relevant data and/or knowledge extracted from so-called linksets. “In addition, we developed a tool to link rare diseases using OMIM identifiers and provenance; this is still in preparation. And we visualised two new molecular pathways, for rare prion diseases and Prader-Willi/Angelman syndrome. Taken together, this has been a very productive Implementation Study,” concludes Ehrhart.
About ELIXIR Implementation Studies
Implementation Studies are short technical projects to inform future service development, drive standards adoption, and connect ELIXIR’s 21 Nodes. The projects are funded through the budget of the ELIXIR Hub and form part of ELIXIR’s ongoing activities in a particular Platform or Community.