Towards an international research data infrastructure for biodiversity and ecology studies

For centuries, scientists all over the world have established enormous collections of animals, plants, microbes, fossils, minerals, and rocks. The data associated with these specimens can help researchers understand life on earth, counter biodiversity loss, and contribute to a sustainable future. However, to arrive at new scientific insights and solutions for environmental problems, researchers need to integrate data from various sources. “This is a formidable challenge because the data are scattered over the world and collections are often not digitised,” says Wouter Addink of Naturalis.

Naturalis Biodiversity Center is home to some 37 million biological and geological specimens. Wouter Addink is Coordinator Research Data and e-Infrastructure at Naturalis. He explains: “The Naturalis laboratories harbour facilities to study these specimens in detail, including their morphology (e.g., SEM, TEM, CT), molecular characteristics (e.g., standard, Ion-Torrent, ancient DNA, DNA-barcoding), and geological features (e.g., spectrometers, X-Ray, nanoSims, FEG-EMP). In addition, we have computational facilities such as GIS, 3D, and HPC. We have catalogued all specimens and digitised images and data of more than nine million specimens. These are publicly accessible via the Naturalis Bioportal, the Netherlands Biodiversity API, and the Global Biodiversity Information Facility, ensuring their accessibility and broad use.”

Researcher at Naturalis Biodiversity Center

Connecting data
Unfortunately, not all organisations in the world have the means to make data about their biological collections accessible. Addink: “Many collections are not digitised yet, making it very hard for researchers to obtain access to the data. In addition, data are spread over museums, botanical gardens, and scientific institutes around the world, each using their own data structure, taxonomic view, and management system. As a result, it is difficult to integrate the datasets into one catalogue and to connect them with other biodiversity data sources that use different data standards and vocabularies (e.g., ecological data). And with 1,500 million objects in Europe alone, we are talking about a lot of data.”

It is exactly this combination of data from various sources that holds great promise for answering pressing challenges such as coastal defence, sustainable agriculture and fisheries, liveable cities, climate change, and human wellbeing. Understanding how complete ecosystems function is a prerequisite for addressing these global issues, and to this end, researchers need to integrate data from various sources. These may include biodiversity data; ecological research data on the location, movements, and activities of individual organisms, collected with sensors; field observations published online; or information on changes in land use and ecosystem characteristics, collected with remote sensing techniques.

Edwin van Huis is the General Director of Naturalis. He explains: “The challenge is to generate knowledge about the occurrence and interactions of species in relation to ecosystem functioning, and in the end, how we might manage the natural environment and use it sensibly. This knowledge will contribute to nature-based solutions for a variety of global ecological and economic challenges. To make this all possible, we should make all data accessible and we should strive for complete and high quality data. This is very expensive, so we need to do this efficiently, for instance by globally using the same biodiversity data infrastructures, taxonomic backbones, and molecular data sets. Therefore, Naturalis is actively involved in several national, European, and global research infrastructure initiatives.”

Open Biodiversity Information
One of the initiatives in which Naturalis is involved is the Global Biodiversity Information Facility (GBIF) and its Dutch node, NLBIF. GBIF aims to make biodiversity data freely and easily accessible through the internet, and acts as a global catalogue of biodiversity data. The Dutch node was founded in 2002 by the Ministry of Education, Culture and Science (OCW), the Netherlands Organisation for Scientific Research (NWO), and the University of Amsterdam (UvA). NLBIF promotes and demonstrates the virtues of open biodiversity data, and supports Dutch organisations that want to publish their biodiversity data in an internationally standardised way.

National infrastructure
Where GBIF and NLBIF merely aim to make data open and freely accessible, a recent collaborative effort of Naturalis, UvA, and the Fungal Biodiversity Centre of CBS-KNAW has resulted in a national research infrastructure with a much broader scope. Van Huis: “Building on the digitised collections of Naturalis and the Fungal Biodiversity Centre, the Netherlands Infrastructure for Ecosystem and Biodiversity Analysis (NIEBA) provides researchers with easy and remote access to an abundance of validated data that can be used to understand ecosystems by combining them with abiotic data and traits. This combines the ecological knowledge in the Netherlands with the species occurrences. Moreover, the NIEBA infrastructure provides a virtual research space to integrate, analyse, model, and evaluate the data.” NIEBA involves many other partners, for instance Utrecht University, the Netherlands Institute of Ecology (NIOO-KNAW), and Groningen University. NIEBA obtained a position on the NWO National Roadmap for Large-Scale Scientific Infrastructure in 2016.

European infrastructure
Reaching out to colleagues outside the Netherlands, Naturalis is leading an initiative for a European research infrastructure to provide unified access to scientific collections and related expertise in Europe: the Distributed System of Scientific Collections (DiSSCo). Van Huis: “Biodiversity is a global topic by definition. Our vision for DiSSCo is to unify European natural science collections, effectively transforming a dispersed and fragmented access model to an integrated data‐driven pan‐European research infrastructure. We want to digitise collections at a large scale. In addition, we will establish a joint policy, for instance regarding the priorities on what should be digitised first and on how to improve data quality. DiSSCo will provide key reference data for research to provide answers to economic and societal questions.”

With a staggering 111 partners from 21 countries, DiSSCo is the largest consortium of its kind. The partners range from natural history museums to botanical gardens, research institutes, and more. Addink: “Each country has a national consortium with one representative in DiSSCo. Naturalis is the Dutch representative and leads the European consortium. NLBIF is also one of the partners in the national consortium, so these initiatives are intertwined.”

DiSSCo cartoon

The DiSSCo initiative also greatly benefits from the existence of other European research infrastructures, such as ELIXIR and LifeWatch. Professor Peter van Tienderen is the dean of the Faculty of Science of the University of Amsterdam (UvA). He adds: “LifeWatch was established earlier this year as the first ERIC in Europe for the integration of biodiversity data of all kinds at a European scale. An ERIC is a European Research Infrastructure Consortium, a legal entity established by the EC under European law. The aim of LifeWatch is to provide its users with an optimal virtual research environment with transparent access to data from museum collections, but also from observatories, remote and earth-based sensors, and experiments. Supported by excellent ICT core facilities and user services provided by the LifeWatch Service Center, researchers will be able to efficiently implement their workflows from their own desktop.”

Connecting them all
Initiatives like ELIXIR, LifeWatch, and DiSSCo have encouraged the participating countries to align their national infrastructure initiatives with those in other countries. Van Huis: “This is a giant leap forward. So, all these initiatives are now starting to come together. The time is obviously right for biologists to jointly tackle challenges by means of building an interconnected research infrastructure. We are closely following DTL’s efforts to promote open science and FAIR data stewardship. This was the reason for Naturalis to join the DTL partnership. I have noticed that people from abroad are impressed with the Dutch collaborative spirit, of which DTL is an example. People worldwide are starting to see this as the way to go.”
