
Cost-Benefit analysis for FAIR research data – Study

Cost of not having FAIR research data

FAIR research data is research data that is created, stored and published so that it is findable, accessible, interoperable and reusable. To be FAIR, published research data should meet the criteria described by the FAIR principles. FAIR emerged in response to the current patchy data management practices in the EU, which are not yet optimal. Several local and global initiatives are moving towards an infrastructure that supports the FAIR principles in order to get the most out of research data. This report aims to estimate the cost of not having FAIR research data for the EU economy based on a series of measurable indicators, which were defined based on existing studies and interviews with subject matter experts.

 

Executive Summary

Technological advancements have made research and science more data-intensive and interconnected, with researchers producing and sharing increasing volumes of data. In their effort to produce high-quality data, researchers have to follow good data management and data stewardship practices such as the FAIR principles.

However, there has been no thorough analysis to determine the value of not having FAIR research data, within and across scientific disciplines, in both economic and non-economic terms, and to contrast it against the current situation, in which a majority of research data does not adhere to the FAIR principles. This report aims to fill these gaps by estimating the cost of not having FAIR research data for the EU data market and EU data economy.

Our analysis relied on available studies that have focused on the quantitative value of research data that is findable, accessible, interoperable and reusable.

By looking at the impact of FAIR on research activities, collaboration and innovation, we identified, defined and then quantified a set of indicators. Seven indicators were defined to estimate the cost of not having FAIR research data: time spent, cost of storage, licence costs, research retraction, double funding, interdisciplinarity and potential economic growth.

To estimate the first five indicators, we first assessed the inefficiencies arising in research activities due to the absence of FAIR data. From the different levels of inefficiency, we computed the time wasted due to not having FAIR data and the associated costs. Secondly, we estimated the cost of extra licences that researchers have to pay to access data that would otherwise be open under the FAIR principles. Thirdly, we looked at the additional storage costs linked to the absence of FAIR data. Inaccessible data leads to the creation of additional copies of the data (e.g. by journals or partner universities), which would not be required if the FAIR principles were in place. Lacking sufficient data to estimate the last two indicators, we provided mostly qualitative considerations and findings instead.

Following this approach, we found that not having FAIR research data costs the European economy at least €10.2bn every year. In addition, we listed a number of consequences of not having FAIR data that could not be reliably estimated, such as the impact on research quality, economic turnover, or machine readability of research data. By drawing a rough parallel with the European open data economy, we concluded that these unquantified elements could account for another €16bn annually on top of what we estimated. These results relied on a combination of desk research, interviews with subject matter experts and our most conservative assumptions.
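As a rough illustration of how the two headline figures relate, the sketch below simply adds them, on the assumption that "on top of" means they are additive. The constant and function names are hypothetical and not taken from the study. Because the first figure is a lower bound and the second a rough parallel, the sum is indicative only.

```python
# Figures quoted in the summary, in billions of euros per year.
QUANTIFIED_MIN_BN = 10.2    # "at least EUR 10.2bn" from the quantified indicators
UNQUANTIFIED_EST_BN = 16.0  # rough parallel with the European open data economy


def combined_annual_cost_bn(quantified: float, unquantified: float) -> float:
    """Add the two figures, assuming they are additive ("on top of")."""
    # Round to one decimal place to match the precision of the quoted figures.
    return round(quantified + unquantified, 1)


total_bn = combined_annual_cost_bn(QUANTIFIED_MIN_BN, UNQUANTIFIED_EST_BN)
print(f"Indicative combined estimate: about EUR {total_bn}bn per year")
```

Under that assumption the indicative combined figure is about €26.2bn per year.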

Moreover, while building on other available studies and relying heavily on existing material, we came to realise ourselves how important it is to have FAIR research data. Not only could the time invested in this study have been reduced significantly, but the content could also have been enhanced if more material had been accessible and reusable.

Finally, by estimating the qualitative and quantitative costs of not having FAIR data, this report will enable decision makers to make evidence-based decisions about efficient ways to support the real-life implementation of the FAIR data principles. Researchers and research institutions will now be able to weigh the cost of not having FAIR data against the cost of implementing the FAIR principles.

 

© European Union, 2018