Bring Your Own Data workshops

At a Bring Your Own Data (BYOD) workshop, experts in semantic data and FAIR help researchers improve the FAIRness of their research data. The main goal of these events is for a larger group of people to learn how to FAIRify datasets using linked data technology, and to be able to combine those datasets with other FAIR datasets to answer scientific questions that required a lot of manual work before the advent of FAIR.

At a BYOD as we organise them, data owners, domain experts (usually biologists or chemists), and FAIR data experts jointly work on specific data sets. At the start, data owners present the data they wish to make FAIR. The data experts have extensive knowledge of FAIR data formats and principles, and support the data owners in choosing the optimal data model. In addition, they make sure that FAIR linked data is produced in the end. Domain experts can assist the data owners and data experts in solving intellectually challenging data modelling issues and in demonstrating the added value of FAIR data in answering specific research questions.

Past BYODs
Since 2014, we have organised multiple BYODs, supporting organisations and companies in producing FAIR data (e.g., Human Protein Atlas, Enza Zaden, Rijk Zwaan, the rare disease community). Since 2016, we have also organised BYODs for projects such as ELIXIR EXCELERATE and Odex4all.

A tailored programme is set up for each BYOD because it involves a unique combination of data owners and data sets. In principle, all BYODs contain the following elements:

Preparation
Prior to the BYOD, one or two webinars are organised to introduce the principles to the attendees, to point them to preparatory materials, and to provide the starting points for a BYOD: a list of questions from the data owners and a list of existing linked data sets from the data experts. Having specific research questions or workflows at hand that cannot be answered with the data in its original format also helps to demonstrate the added value of the FAIR linked data approach. Good preparation is very important, since a lot of time can be saved during the BYOD if the data models and, for instance, the most relevant vocabularies and ontologies have been defined beforehand.

Execution
The first two days of the three-day BYOD are dedicated to transforming the data owners’ data into FAIR data using the FAIR Data Unit’s technology and guidelines, while providing hands-on training for both the data owners and the future FAIR linked data experts.
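To give an impression of what such a transformation involves, the following is a minimal sketch in Python that turns rows of a small table into linked data triples in N-Triples syntax. It is an illustration only and not the FAIR Data Unit's actual tooling: the namespace `http://example.org/byod/`, the column names, and the sample values are all hypothetical, and every value is emitted as a plain string literal. In a real BYOD, the predicates and identifiers would come from the vocabularies and ontologies chosen during preparation.

```python
import csv
import io

# Hypothetical namespace, used for illustration only.
BASE = "http://example.org/byod/"


def literal(value):
    """Format a string as an N-Triples literal, escaping special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def row_to_triples(row, id_column, base=BASE):
    """Turn one table row into triples: one subject, one triple per filled cell."""
    subject = f"<{base}record/{row[id_column]}>"
    triples = []
    for column, value in row.items():
        if column == id_column or value == "":
            continue  # the identifier becomes the subject; empty cells are skipped
        predicate = f"<{base}vocab/{column}>"
        triples.append(f"{subject} {predicate} {literal(value)} .")
    return triples


def csv_to_ntriples(text, id_column):
    """Convert CSV text into a list of N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(row_to_triples(row, id_column))
    return lines


# Hypothetical sample data: two records, the second with an empty cell.
SAMPLE = "sample_id,gene,tissue\nS1,geneA,liver\nS2,geneB,\n"

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE, "sample_id"):
        print(line)
```

The sketch assumes that identifiers and column names contain only characters that are safe inside a URI; real data usually needs an additional cleaning or mapping step.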

Together, data owners and data experts work out a showcase to demonstrate the added value of FAIR data in answering specific research questions of the data owner using multiple integrated data resources. At the end of every BYOD, data owners will also get an overview of the potential of the interlinked data. In our experience, this demonstration will trigger the imagination and will bring about novel questions that can be answered using unimagined combinations of data.
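The idea behind such a showcase can be sketched in a few lines of Python. The example below assumes two tiny, entirely hypothetical sets of triples: one from a data owner and one from a reference resource. Because both use the same identifier for a gene, merging them allows a question to be answered that neither set can answer alone. The prefixes, predicates, and values are invented for illustration; an actual BYOD would typically use a triple store and a query language such as SPARQL rather than hand-written loops.

```python
# Triples are (subject, predicate, object) tuples; all names are hypothetical.
OWNER_DATA = [
    ("ex:sample/S1", "ex:measuresGene", "ex:gene/geneA"),
    ("ex:sample/S2", "ex:measuresGene", "ex:gene/geneB"),
]

REFERENCE_DATA = [
    ("ex:gene/geneA", "ex:associatedWith", "ex:disease/diseaseX"),
    ("ex:gene/geneA", "ex:associatedWith", "ex:disease/diseaseY"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def diseases_for_sample(sample, graph):
    """Follow sample -> gene -> disease links across the merged graph."""
    found = set()
    for _, _, gene in match(graph, s=sample, p="ex:measuresGene"):
        for _, _, disease in match(graph, s=gene, p="ex:associatedWith"):
            found.add(disease)
    return sorted(found)


if __name__ == "__main__":
    merged = OWNER_DATA + REFERENCE_DATA  # integration is a simple union of triples
    print(diseases_for_sample("ex:sample/S1", merged))
```

Queried against the owner's data alone, the same function returns an empty list, which is the point of the showcase: the answer only appears once the datasets are interlinked through shared identifiers.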

The newly transformed FAIR data is then deposited in either an open or closed (in case of proprietary data) repository with the original data remaining at the data owner’s location. The FAIR Data Unit will host this repository.

The last day of the BYOD is dedicated to exploration of the data using analytics tools.

Follow up
After a BYOD, data owners are familiar enough with the basic principles of making data FAIR to apply the FAIR data approach themselves. For further support, two follow-up teleconferences are planned, after 2-3 weeks and after 6-8 weeks, to tackle any problems the data owners might encounter.

One of the biggest challenges of data-intensive science is to facilitate knowledge discovery. Life scientists, in both the public and the private domain, produce large amounts of data that are both complex and heterogeneous. They also make use of multiple ‘core’ data resources such as UniProt or ChEMBL. Researchers spend many hours in projects coupling these data sources, struggling to decipher the data and to transform them into actionable knowledge. Connecting and functionally interlinking datasets is therefore essential for knowledge discovery.

The FAIR Data Unit (FDU) at DTL offers a helping hand in linking data. The FDU organises Bring Your Own Data (BYOD) workshops in which data modelling experts and content experts support data owners in making their data FAIR using linked data technology. FAIR data are Findable, Accessible, Interoperable and Reusable, by both humans and computers.

To generate value for a research community beyond the initial researchers, funding agencies are increasingly setting requirements for proper stewardship of research data. Since FAIR data is vital to enable appropriate data stewardship and will be mandated by funding agencies and national governments alike, there will be a definite need to publish both new and legacy data sets as FAIR data. FAIR data publishing will need to be a service provided by many certified entities across Europe.

We have developed a methodology for making data FAIR via BYOD workshops. A BYOD is a low-barrier approach to acquaint data owners with the possibilities opened up by ‘functionally interlinking’ their data with other relevant datasets, and to demonstrate the added value of FAIR data for knowledge discovery. It is a lightweight, very effective and also fun way to collaborate across disciplinary and political borders, often yielding eye-opening results. It is typically a three-day event in which FAIR data are produced and analysed, combined with hands-on training modules.

Since 2014, we have organised multiple BYODs, supporting organisations and companies (among others Human Protein Atlas, Enza Zaden and Rijk Zwaan) as well as, for instance, the rare disease community in producing FAIR data. Over time, we have developed scripts, working and planning documents, a budget, and training materials for life science researchers and future data experts. Furthermore, we have a pool of FAIR data experts standing by, eager to share their knowledge. From 2016 onwards, BYODs are scheduled for the ELIXIR EXCELERATE project, the Odex4all project and many other projects.