DTL Focus meeting: Metadata for data reusability: eNotebook standards

Organised by DTL/ELIXIR-NL in collaboration with COST CHARME

In this meeting, we will explore our experiences with data reusability and evaluate how we can make sure that the R in FAIR data really means reusability for purposes other than the original research questions. Which existing metadata standards can be used? What kinds of extensions of standards and tools do we need? Because the eNotebook and biological study database communities are largely separate, the question arises whether we should connect them better and, if so, how best to do that.

Many of the minimal metadata standards currently used to describe study data are meant to help overcome the problem that published scientific studies often cannot be reproduced. Solving the reproducibility problem is important. However, in FAIR data the R stands for data reuse, and that goes beyond reproducibility. When reusing data, we want to use the same data to answer questions different from the ones originally answered. That typically calls for more detailed data descriptions, richer metadata, and the publication of data that was not deemed relevant for the original study. For example, to reuse data collected in an epidemiological study, much more detail is needed about the composition of a cohort than the statement that it is “similar in composition” to a reference cohort.

Of course, asking for more data causes new problems. Study capture databases like Molgenis, the Phenotype database, and the ISA tools need to be able to capture the richer data, and high-level study repositories like BioStudies and BioSamples need to be able to store the information collected in such study databases or provided directly to them. The other problem, which is probably even bigger, is that researchers are not likely to provide information that they do not deem relevant for their own research questions. Improved data citation, and the credit it brings, may make this more rewarding. But we also need mechanisms that make it much easier to provide the richer data. eNotebooks might be especially relevant in this respect. Since they are increasingly used to record study designs, protocols, raw data and study results, they form an important source of information. To really benefit from this, we need to look at eNotebook data standards and data export formats. Such standards are now being developed, and we are glad that Klemen Zupancic will join this meeting as an expert on eNotebook standards. We will also need to evaluate how we can use such standards to connect eNotebook data to study capture databases and data repositories.

Meeting details:

  • Where? Holland Heart House, Utrecht (close to central railway station), route description
  • Remote participation possible (please register, and mail us separately)
  • When? 31 October 2019, 10:30-16:30 including lunch
  • Registration: Free of charge, but required via the registration form 

Preliminary programme

10:30 Coffee
10:50 Welcome and purpose (Chris Evelo, Maastricht University)
10:55 Agenda for the day (Chris Evelo)
11:00 Setting the scene Chris Evelo: “From reproducibility to reusability: capturing metadata at the source”
11:30 Sören Hohmann: Title will follow
12:00 Keynote Klemen Zupancic (Biosystemika, Slovenia): Title will follow

12:45 Lunch

13:30 Evangelia Papoutsoglou (Wageningen University and Research): “Beyond reproducibility: improving the reusability of plant phenotyping data with MIAPPE”
14:00 Jildau Bouwman (TNO): “Data reuse solutions in the nutritional Phenotype database”
14:30 Morris Swertz (UMC Groningen): Title will follow
15:00 Tea break
15:15 Discussion
16:15 Action items, follow up, towards a white paper
16:30 End

The meeting is organised in collaboration with CHARME, the COST Action for harmonisation of standards in life sciences. After the meeting, we will capture the results in a white paper.
