Monitoring Open Access at a national level: French case study
Eric Jeangirard (Ministère de l’enseignement supérieur et de la recherche)
May 2019
After the launch of multiple plans for Open Science, there is now a need for an accurate method or tool to monitor Open Science trends, and Open Access (OA) trends in particular. We address this need with a methodology that we developed and tested for France, but that could be extended to other countries. We use only open data and information available on the Web, relying as much as possible on large-scale systems such as Unpaywall, HAL (the main open repository in France, part of the CNRS), ORCID and IDRef (the reference database for French Higher Education and Research). We used rule-based and machine learning techniques to enrich the metadata of the publications. We estimate that the overall OA rate for French-affiliated publications ranges from 39% to 42% between 2013 and 2017. The trend is slightly upward, except for the last year, but we present evidence that this exception is a consequence of the fact that OA status changes over time. These figures should therefore be seen as a snapshot rather than as definitive values. For the last observed year (2017), we show that the OA rate varies with the publication type, the publisher and the discipline (more than 60% in Mathematics, versus about 30% in Medical research, which accounts for the largest share of publications). We describe the main challenges of our method (detecting publications with a French affiliation, enriching metadata with machine learning, determining open access status) and evaluate the errors of each step. Most of the method is not country-specific and could be applied to another scope.
INTRODUCTION
METHOD
First step in the method: Identify the publications with a French affiliation
Second step in the method: Enrich the metadata of the identified publications
Third step in the method: Open Access detection
RESULTS
How many publications with a French affiliation are identified?
What is the precision of the identification method?
What are the main types of publications for the identified DOIs?
What are the main publishers for the identified DOIs?
What is the split of the identified DOIs per discipline?
What is the precision of Open Access detection?
What is the overall level of Open Access over the period 2013-2017?
What is the Open Access level per type of publication (year 2017)?
What is the Open Access level per discipline (year 2017)?
What is the Open Access level per publisher (top-10 publishers for 2017)?
DISCUSSION AND CONCLUSION