FAIR Data in Medical Research: Incorporating the FAIR Principles in the Research Data Life Cycle

On April 22nd, 2022, Martijn Kersloot defended his PhD thesis in Amsterdam. Kersloot wrote his thesis, entitled ‘FAIR Data in Medical Research: Incorporating the FAIR Principles in the Research Data Life Cycle’, under the supervision of promotors Professor Ameen Abu-Hanna and Dr. Ronald Cornet and co-promotor Dr. Derk Arts. With his thesis, Kersloot aimed to incorporate the FAIRification steps into the Research Data Life Cycle, so that data are FAIRified continuously rather than post hoc. ‘If it were up to me, I would want FAIRification to happen automatically, without too much effort for individual researchers.’

FAIR implementation is a core ambition of organisations such as DTL and Health-RI, in line with Open Science principles to stimulate reuse of data. ‘It is estimated that 80% of the data generated by research cannot be reused. This costs society billions of Euros. Ever since the FAIR principles (FAIR stands for Findability, Accessibility, Interoperability, Reusability) were first published in 2016, researchers increasingly choose and are asked to FAIRify their data by funding bodies and research institutes. But there is a lack of tools and guidance about how to accomplish that. That is why my thesis is about how we can help researchers with the FAIRification of their data,’ says Kersloot.

Summary
His thesis consists of three parts. In part one, Kersloot and his colleagues assess the awareness and attitudes of clinical researchers and support staff regarding data FAIRification. They also examine the researchers’ understanding of and experience with the FAIR principles. The results show, among other things, that most researchers and support staff are making an effort to achieve at least one of the FAIR principles. At the same time, most researchers do not know that the FAIR principles emphasise both human and machine readability.

In part two, Kersloot and his colleagues explore the current methods used for developing and evaluating algorithms that transform free-text notes, often used by physicians in Electronic Health Records, into machine-interpretable data. Because they found that these algorithms are evaluated in widely varying ways, they drew up a list of sixteen recommendations for future studies. They also developed their own algorithm for transforming free text into machine-interpretable data.

In part three, Kersloot and his colleagues establish a workflow for de-novo FAIRification, in which the FAIRification steps are incorporated in the process of setting up a registry or research project and collecting its data. They also describe how this workflow was implemented and evaluated in an Electronic Data Capture system.

‘One of our conclusions is that we should endeavour to FAIRify right from the start. It is very costly, both in terms of money and time, to do it afterwards. Plans for FAIRification should be incorporated in all research proposals: how can we ensure that the data we will collect is reusable for other purposes than our own?’

M. Kersloot

‘If it were up to me, I would want FAIRification to happen automatically, without too much effort for individual researchers. Researchers should know about the FAIR principles, and what’s necessary to adhere to them. But there are a lot of steps in the FAIRification process that require a lot of expertise. In the future, I hope that FAIRification is maintained semi-automatically, or with the help of data stewards, who have the expertise.’

Collaboration
In his thesis, Kersloot assembled a list of the professionals and organisations that are needed to FAIRify data right from the start. ‘We conclude that funding bodies, research institutes and software developers should collaborate. During my research, I saw that all these parties are on their own island, so to speak, there is a lot of disconnect and fragmentation. Many stakeholders believe that the FAIR principles should be incorporated in research, but interprofessional discussions are mainly lacking. There often is no budget to maintain the FAIR structure in the long run. This process should become more sustainable, and less of a project-based responsibility for individual researchers.’ Kersloot points to organisations such as Health-RI and DTL to help realise a FAIR-based Dutch research ecosystem.

‘Lots of groups can develop FAIRification workflows for their own domains, but an organisation like Health-RI has a more overall perspective. It will be able to help harmonise efforts and thus ensure that individual researchers and groups won’t have to reinvent the wheel. In the Netherlands, Health-RI can play an important role in connecting all stakeholders across the health domain, and in providing researchers with tools, standards and guidelines to FAIRify their data.’

This connects with the mission of Health-RI, which aims to realise a national health-data infrastructure. ‘Via a hub-and-spoke model with, among others, the university medical centers and top clinical hospitals, we are working towards a shared infrastructure based on the FAIR principles to make data findable and accessible for reuse by healthcare professionals, researchers and companies,’ says Mijke Jetten, programme manager FAIR data at Health-RI.

‘In realising that ambition, FAIR awareness and peer-to-peer learning are vital, just like building a trust framework for FAIR standards and workflows.’

M. Jetten

In the coming years, Health-RI, together with DTL and other organisations, will help to advance the network of trained data stewards and other FAIR experts. One example is the June 29 hackathon, facilitated by Kersloot and Jetten, to create lesson plans on FAIR in the life sciences.

In the meantime, the data sharing revolution is well underway, says Kersloot. ‘I have seen academia’s opinion on data sharing change. There are many young PhD-candidate colleagues who really see its value. There will be a culture-shift. I think that if we can show the benefits of FAIR in a strong use case – that it allows for data combination and makes data collection easier – it will stimulate people to FAIRify their data.’

CV
Martijn Kersloot was born in Amsterdam in 1996. In 2014, he started studying Medical Information Sciences at the University of Amsterdam, from which he graduated cum laude in 2019. Currently, Kersloot is a postdoctoral researcher at the Department of Medical Informatics in the Amsterdam UMC and a Product Owner at health-tech company Castor, one of the DTL partner organisations. Kersloot’s PhD thesis can be read via the digital repository of the University of Amsterdam.
