Good data stewardship is rapidly becoming an essential part of modern science. To facilitate good data stewardship and to promote open science, a broad community of international stakeholders have developed the FAIR Data principles. FAIR stands for ‘Findable, Accessible, Interoperable, and Reusable’. DTL actively promotes FAIR data stewardship of life science information.
The paper ‘The FAIR Guiding Principles for scientific data management and stewardship’ is the first formal publication of the FAIR principles. In short, the FAIR Data Principles propose that all scholarly output should be:
- Findable: easy to identify and find for both humans and computers, with metadata that facilitate searching for specific datasets,
- Accessible: stored for long term so that they can easily be accessed and/or downloaded with well-defined access conditions, whether at the level of metadata, or at the level of the actual data,
- Interoperable: ready to be combined with other datasets by humans or computers, without ambiguities in the meanings of terms and values,
- Reusable: ready to be used for future research and to be further processed using computational methods. This requires adequate information about how the data were obtained and processed (provenance) and an appropriate license.
Please consult ‘The FAIR Data principles explained’ for a more detailed description.
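As a rough illustration of the four properties listed above, the sketch below checks a dataset description for the kinds of metadata each property relies on. It is a minimal sketch, not an official FAIR checklist: the field names, the grouping in `REQUIRED_FIELDS`, the `missing_fields` helper, and the example.org URLs are all assumptions made for this example.

```python
# Hypothetical field names, grouped by the FAIR property they loosely support.
# This grouping is an assumption for illustration, not part of the principles.
REQUIRED_FIELDS = {
    "Findable": ["identifier", "title", "keywords"],
    "Accessible": ["access_url", "access_conditions"],
    "Interoperable": ["format", "vocabulary"],
    "Reusable": ["provenance", "license"],
}


def missing_fields(record):
    """Return {property: [field names]} for fields that are absent or empty."""
    gaps = {}
    for prop, fields in REQUIRED_FIELDS.items():
        absent = [name for name in fields if not record.get(name)]
        if absent:
            gaps[prop] = absent
    return gaps


# Toy dataset description with placeholder values (example.org is not a real repository).
record = {
    "identifier": "https://example.org/dataset/42",
    "title": "Example expression measurements",
    "keywords": ["gene expression", "example"],
    "access_url": "https://example.org/dataset/42/download",
    "access_conditions": "open after registration",
    "format": "text/csv",
    "vocabulary": "https://example.org/terms/",
    "provenance": "",   # how the data were obtained: not filled in yet
    "license": None,    # no license chosen yet
}

gaps = missing_fields(record)
print(gaps)
```

In this toy record only the fields supporting reuse are incomplete, which mirrors the point above that reusability depends on provenance information and an appropriate license.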
There is no “O” for “Open” in FAIR. Proponents of FAIR data often also stress that data should be as open as possible, access only being restricted where necessary.
FAIR Data: the rationale
Good data stewardship is the key to knowledge discovery and innovation. To generate value for a research community beyond the initial researchers, funding agencies are increasingly setting requirements for proper data stewardship of research data. Beyond proper collection, annotation, and archival, data stewardship includes the ‘long-term care’ of research data, with the goal that they can be found and re-used in downstream studies. To facilitate good data stewardship, a broad community of international stakeholders have developed the FAIR Data principles. The FAIR principles have been embraced by both the European Commission and the G20. The first formal publication of the FAIR Principles further describes the rationale behind them.
An important step in the FAIR Data approach is to publish existing and new datasets in a semantically interoperable format that can be understood by computers. (In this context, semantics is the meaning or intent of a digital object.) By semantically annotating data items and metadata, we can use computer systems to (semi-)automatically combine different data sources, resulting in greater knowledge discovery.
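The idea of combining sources through semantic annotation can be sketched in a few lines. In this minimal sketch, two datasets use different local column names, but each column is annotated with a shared term identifier, so a program can merge them without knowing the local names. The term URIs, the dataset layout, and the helper functions are assumptions made for illustration, and the rows are toy values; real projects would use terms from a published ontology and typically an RDF toolkit.

```python
# Hypothetical shared term identifiers (placeholders, not a real ontology).
GENE = "http://example.org/terms/gene"
EXPRESSION = "http://example.org/terms/expression_level"
DISEASE = "http://example.org/terms/associated_disease"

# Two independent sources with different local column names.
# "annotations" maps each local column name to a shared term.
lab_a = {
    "annotations": {"gene_symbol": GENE, "expr": EXPRESSION},
    "rows": [
        {"gene_symbol": "BRCA1", "expr": 7.2},
        {"gene_symbol": "TP53", "expr": 3.1},
    ],
}
lab_b = {
    "annotations": {"locus": GENE, "condition": DISEASE},
    "rows": [{"locus": "BRCA1", "condition": "breast cancer"}],
}


def to_shared_terms(dataset):
    """Rewrite each row so that its keys are the shared term identifiers."""
    mapping = dataset["annotations"]
    return [{mapping[key]: value for key, value in row.items()}
            for row in dataset["rows"]]


def combine(datasets, key_term):
    """Merge rows from all datasets that share the same value for key_term."""
    merged = {}
    for dataset in datasets:
        for row in to_shared_terms(dataset):
            merged.setdefault(row[key_term], {}).update(row)
    return merged


combined = combine([lab_a, lab_b], GENE)
for gene, facts in combined.items():
    print(gene, facts)
```

Because the merge works on the annotations rather than on the column names, a third source with yet another naming scheme could be added without changing `combine`.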
FAIR Data and DTL
DTL-associated scientists have played a key role in developing the FAIR principles and the DTL network actively promotes FAIR data stewardship. Funding agencies and national governments will increasingly mandate FAIR Data, so researchers will need to publish their datasets in a FAIR way. In addition, FAIR Data publishing will need to be a service provided by many certified entities across Europe. DTL helps its stakeholders prepare for this. This means that we
- drive development of technology and infrastructure for FAIR data stewardship, including acquisition of software for FAIR Data analytics,
- involve (international) organisations that provide access to essential core legacy life science datasets and scientific literature,
- assist groups in translating their datasets into a FAIR format,
- actively engage with international initiatives that promote data sharing and citation, and assist in driving standard-setting (as the Dutch node in ELIXIR, DTL assembles Dutch and international groups involved in FAIR data stewardship),
- actively involve science funders in the Netherlands, Europe, and globally to work towards adoption of FAIR data standards.
Please contact Rob Hooft if you want to get involved.
FAIR Data projects
DTL is actively involved in several FAIR Data projects and initiatives, including
- GO FAIR initiative: a proposal for the practical implementation of the European Open Science Cloud,
- Personal Health Train: aiming to increase the utilisation of existing biomedical data for research into personalised health & medicine, preventive medicine, and value-based healthcare,
- Farm Data Train: aiming to connect agricultural data to make them more usable,
- FAIRdICT: aiming to address scientific and technical hurdles on the path towards optimal use of scientific data for personalised health, disease prevention, and care.
- Paper ‘The FAIR Guiding Principles for scientific data management and stewardship’ (i.e., the FAIR Data paper)
- Paper ‘Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud’
- Webinar on FAIR data stewardship and BYODs
- Interview with Barend Mons about FAIR data stewardship