About FAIR Data

Good data stewardship is rapidly becoming an essential part of modern science. To facilitate good data stewardship and to promote open science, a broad community of international stakeholders have developed the FAIR Data principles. FAIR stands for ‘Findable, Accessible, Interoperable, and Reusable’. DTL actively promotes FAIR data stewardship of life science information. 


FAIR Data

The paper ‘The FAIR Guiding Principles for scientific data management and stewardship’ is the first formal publication of the FAIR principles. In short, the FAIR Data Principles propose that all scholarly output should be:

  • Findable: easy to find for both humans and computers, with metadata that facilitate searching for specific datasets,
  • Accessible: stored for the long term so that they can easily be accessed and/or downloaded under well-defined license and access conditions (open access when possible), whether at the level of metadata or at the level of the actual data,
  • Interoperable: ready to be combined with other datasets by humans or computers,
  • Reusable: ready to be used for future research and to be further processed using computational methods.
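
As a rough illustration of the four properties above, the sketch below checks a dataset's metadata record for gaps. The field names, the grouping of fields per property, and the example record are all hypothetical; the FAIR principles do not prescribe this schema, and a real assessment would be considerably richer.

```python
# Hypothetical field names grouped by FAIR property (an assumption for
# illustration only; the FAIR principles do not define this schema).
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url", "license", "access_conditions"],
    "interoperable": ["format", "vocabulary"],
    "reusable": ["provenance", "description"],
}


def missing_fields(record):
    """Return, per FAIR property, the fields that are absent or empty."""
    gaps = {}
    for prop, fields in REQUIRED_FIELDS.items():
        absent = [name for name in fields if not record.get(name)]
        if absent:
            gaps[prop] = absent
    return gaps


# Toy metadata record with made-up values (example.org is a placeholder).
record = {
    "identifier": "https://example.org/dataset/42",
    "title": "Example dataset",
    "keywords": ["example"],
    "access_url": "https://example.org/dataset/42/download",
    "license": "CC-BY-4.0",
    "format": "text/csv",
    "description": "Toy record for illustration.",
}

gaps = missing_fields(record)
for prop, absent in gaps.items():
    print(f"{prop}: missing {', '.join(absent)}")
```

For this toy record the check flags the missing access conditions, vocabulary, and provenance, while the 'findable' fields are all present.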

Please consult ‘The FAIR Data principles explained’ for a more detailed description.

FAIR Data: the rationale

Good data stewardship is the key to knowledge discovery and innovation. To generate value for a research community beyond the initial researchers, funding agencies are increasingly setting requirements for proper data stewardship of research data. Beyond proper collection, annotation, and archival, data stewardship includes the ‘long-term care’ of research data, with the goal that they can be found and re-used in downstream studies. To facilitate good data stewardship, a broad community of international stakeholders have developed the FAIR Data principles. The FAIR principles have been embraced by both the European Commission and the G20. The first formal publication of the FAIR Principles further describes the rationale behind them.

Machine-actionable

An important step in the FAIR Data approach is to publish existing and new datasets in a semantically interoperable format that can be understood by computers. (In this context, semantics is the meaning or intent of a digital object.) By semantically annotating data items and metadata, we can use computer systems to (semi-)automatically combine different data sources, resulting in greater knowledge discovery.
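
The idea of combining semantically annotated sources can be sketched in a few lines. In this minimal example, two datasets use different local column names but annotate them with shared term identifiers, which lets a program join them without human help. The vocabulary IRIs, dataset layout, function names, and values are all hypothetical and made up for illustration; real FAIR publishing would typically rely on established ontologies and formats such as RDF.

```python
# Placeholder term IRIs (example.org is not a real ontology).
GENE = "https://example.org/vocab/gene"
CONDITION = "https://example.org/vocab/condition"
EXPRESSION = "https://example.org/vocab/expression_level"

# Two toy datasets with different local column names. The "annotations"
# map each local column to the shared term it means.
dataset_a = {
    "annotations": {"gene_symbol": GENE, "condition": CONDITION},
    "rows": [{"gene_symbol": "geneA", "condition": "conditionX"}],
}
dataset_b = {
    "annotations": {"gene": GENE, "expr": EXPRESSION},
    "rows": [{"gene": "geneA", "expr": 7.2}, {"gene": "geneB", "expr": 3.1}],
}


def to_semantic(dataset):
    """Re-key every row by the term IRI attached to each local column."""
    mapping = dataset["annotations"]
    return [
        {mapping[column]: value for column, value in row.items()}
        for row in dataset["rows"]
    ]


def combine(left, right, key):
    """Join two lists of semantically keyed rows on a shared term IRI."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    merged = []
    for row in left:
        for match in index.get(row[key], []):
            merged.append({**row, **match})
    return merged


combined = combine(to_semantic(dataset_a), to_semantic(dataset_b), GENE)
print(combined)
```

Because both datasets point at the same term for their gene column, the join works even though one calls it `gene_symbol` and the other `gene`. Without the annotations, a person would have to work out that the two columns mean the same thing.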

FAIR Data and DTL

DTL-associated scientists have played a key role in developing the FAIR principles, and the DTL network actively promotes FAIR data stewardship. FAIR Data will increasingly be mandated by funding agencies and national governments, so you will increasingly need to publish your datasets in a FAIR way. In addition, FAIR Data publishing will need to be a service provided by many certified entities across Europe. DTL helps its stakeholders prepare for this. This means that we:

  • drive development of technology and infrastructure for FAIR data stewardship, including acquisition of software for FAIR Data analytics,
  • involve (international) organisations that provide access to essential core legacy life science datasets and scientific literature,
  • assist groups in translating their datasets into a FAIR format,
  • actively engage with international initiatives that promote data sharing and citation, and assist in driving standard-setting (as the Dutch node in ELIXIR, DTL assembles Dutch and international groups involved in FAIR data stewardship),
  • actively involve science funders in the Netherlands, Europe, and globally to work towards adoption of FAIR data standards.

Please contact Luiz Olavo Bonino (expert group leader FAIR Data) if you want to get involved.

FAIR Data projects

DTL is actively involved in several FAIR Data projects and initiatives, including:

  • GO FAIR initiative: a proposal for the practical implementation of the European Open Science Cloud,
  • Personal Health Train: aiming to increase the utilisation of existing biomedical data for research into personalised health & medicine, preventive medicine, and value-based healthcare,
  • Farm Data Train: aiming to connect agricultural data to make them more usable,
  • FAIRdICT: aiming to address scientific and technical hurdles on the path towards optimal use of scientific data for personalised health, disease prevention, and care.

More information