Interview with Nina Weisweiler and Heinz Pampel – Helmholtz Open Science Office
The Registry of Research Data Repositories (re3data) was established ten years ago. Today, the platform is the most comprehensive source of information regarding research data – global and cross-disciplinary in scope – and is used by researchers, research organisations, and publishers around the world. In the present interview, Nina Weisweiler and Heinz Pampel from the Helmholtz Open Science Office report on its genesis and plans for the service’s future.
What were the most important milestones in ten years of re3data?
Heinz Pampel: I first introduced the idea of developing a directory of research data repositories in 2010 in the Electronic Publishing working group of the German Initiative for Networked Information (DINI). A consortium of institutions soon formed and, in April 2011, submitted a proposal to the German Research Foundation (DFG) to develop the “re3data – Registry of Research Data Repositories”. The initiating institutions were the Karlsruhe Institute of Technology (KIT), the Humboldt-Universität zu Berlin, and the Helmholtz Open Science Office at the GFZ German Research Centre for Geosciences. The proposal was approved in September 2011. We started developing the registry in the same year. As a first step, a metadata schema to describe digital repositories for research data was created. In spring 2012, we came into contact with a similar initiative at Purdue University in the USA, known as “Databib”.
Fig. 1. Number of research data repositories indexed per year in re3data. [CC BY 4.0]
The idea of combining both projects soon developed, in dialogue with Databib. After the conception and implementation phase, this cooperation and internationalisation was decisive for re3data. Many stakeholders on an international level supported it. After Databib and re3data had merged, the service was continued as a partner of DataCite. To this day, various third-party-funded projects have supported the continuous development of the service. One current example is “re3data COREF”, a project Nina Weisweiler manages here at the Helmholtz Open Science Office.
What makes re3data so unique for you?
Nina Weisweiler: re3data is the largest directory for research data repositories and is used and recommended by researchers, funding organisations, publishers, scientific institutions as well as other infrastructures around the world. It is not limited to individual research fields or regions; it aims to map the repository landscape for research data as a whole.
With re3data, we are actively supporting a culture of sharing and the transparent handling of research data, thereby encouraging the realisation of Open Science at an international level. re3data ensures that the sharing of data and the infrastructural work in the field of research data management receive more visibility and recognition.
In terms of Open Science, why is re3data so important?
Heinz Pampel: The core idea of re3data was always to support scientists in their handling of research data. re3data helps researchers to search for and to identify suitable infrastructures for storage and for making digital research data accessible. For this reason, many academic institutions and funding organisations, but also publishers and scholarly journals, have firmly anchored re3data in their policies. Furthermore, diverse stakeholders reuse data from re3data for their community services, for example regarding the European Open Science Cloud (EOSC) and the National Research Data Infrastructure (NFDI). The data retrieved from re3data are also increasingly used to monitor the landscape of digital information structures. Particularly in information science, researchers use re3data for analyses relating to the development of Open Science.
In your birthday post on the DataCite blog, you write that inclusivity is one of your aims. How do you want to achieve it? How do you manage, for example, to record repositories in other regions of the world? Isn’t the language barrier a problem?
Nina Weisweiler: Yes, the language barrier is a challenge of course. We responded to this challenge early on by establishing an international editorial board. There are experts on this board who check the entries in re3data, and who kindly support the service and promote it in their respective region. Furthermore, re3data collaborates with numerous stakeholders to improve the indexing of repositories outside Europe and the United States.
We are active members of the internationally focussed Research Data Alliance (RDA) and regularly exchange information with national initiatives as well as other services and stakeholders with whom we develop and intensify partnerships. For example, we are currently working with the Digital Research Alliance of Canada, in order to improve the quality of the entries of Canadian repositories.
Are you planning to offer re3data in other languages apart from English?
Nina Weisweiler: In the comprehensive metadata schema that re3data uses to describe research data repositories, names and descriptions can be added in any language. The team discusses the topic of multilingualism a great deal. We try to design the service as openly and as internationally as possible. To guarantee the quality of the datasets, however, we depend on the languages our editors speak. Thanks to our international team, we were able to incorporate many infrastructures that are being operated in China or India, for example.
How can the success of re3data be measured?
Nina Weisweiler: We consider the numerous recommendations and the wide reuse of our service to be the main measures of re3data’s success. Important funding organisations such as the European Commission, the National Science Foundation (NSF) or the Deutsche Forschungsgemeinschaft (German Research Foundation, DFG) recommend that researchers use the service to implement these organisations’ Open Science requirements. re3data also provides information to the Open Science Monitor of the European Commission as well as to OpenAIRE’s Open Science Observatory. The European Research Council (ERC) also refers to re3data in its recommendations for Open Science.
Furthermore, on the re3data website, we also document references that mention or recommend the service. Based on this collection, our colleague Dorothea Strecker from the Humboldt-Universität zu Berlin has made an exciting analysis that we have published in the re3data COREF project blog.
Do you know if there are also companies like publishers that use re3data as a basis for chargeable services?
Heinz Pampel: Yes. We decided on an Open Data policy when starting the service. re3data metadata are available for reuse as public domain, via CC0. Any interested party can use it via the API. Various publishers and companies in the field of scholarly information are already using re3data metadata for their services. Without this open availability of re3data metadata, several commercial services would certainly be less advanced in this field. We are sure that the advantage of Open Data ultimately outweighs the disadvantages.
re3data has many filters and functions. Which of them is your personal favourite?
Nina Weisweiler: I like the diverse browsing options, particularly the map view, which visualises the countries where institutions that are involved in the operation of the repositories are located. We have published a blog post on this topic that is well worth reading.
I am also enthusiastic about the faceted filter search, which allows for targeted searches across the almost 3,000 repository entries. At first glance, this search mode appears to be very detailed and perhaps somewhat challenging, but because the filter facets mirror our comprehensive metadata schema exactly, users can search for and find a suitable repository according to their individual criteria and needs.
For technically savvy users who would like to reuse our data to prepare their own analyses, we have developed a special “treat” in the context of COREF. The colleagues at Humboldt-Universität zu Berlin and KIT have designed inspiring examples for the use of the re3data API, which are published in our GitHub repository as Jupyter Notebooks. If anyone has any queries about these examples, we would be delighted to help!
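Editorial note: as a rough illustration of what such a reuse of the API might look like, here is a minimal Python sketch. It is not taken from the COREF notebooks. The endpoint URL, the XML element names (`list`, `repository`, `id`, `name`) and the sample records are assumptions made for illustration only, so please check the re3data API documentation before relying on any of them.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint for the repository list; verify against the API documentation.
API_URL = "https://www.re3data.org/api/v1/repositories"

# Hypothetical sample data in the assumed response layout (not real entries).
SAMPLE_XML = """<list>
  <repository>
    <id>r3d100000001</id>
    <name>Example Repository A</name>
  </repository>
  <repository>
    <id>r3d100000002</id>
    <name>Example Repository B</name>
  </repository>
</list>"""


def parse_repository_list(xml_text):
    """Return (id, name) tuples from a repository list in the assumed layout."""
    root = ET.fromstring(xml_text)
    return [
        (repo.findtext("id", default=""), repo.findtext("name", default=""))
        for repo in root.iter("repository")
    ]


def fetch_repository_list(url=API_URL, timeout=30):
    """Download the list from the assumed endpoint and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_repository_list(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Parse the embedded sample so the sketch runs without network access.
    for repo_id, name in parse_repository_list(SAMPLE_XML):
        print(repo_id, name)
```

Calling `fetch_repository_list()` instead of parsing the sample would query the live service, provided the assumed endpoint and layout hold. Keeping the parsing separate from the download makes it easy to test an analysis offline.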
What’s more, re3data can also display metrics as charts, which provide a clear overview of the current landscape of research data repositories.
In a perfect world, where will re3data be in the year 2032?
Nina Weisweiler: I have the following vision: re3data is a high-quality and complete global directory for research data repositories from all academic disciplines. The composition of our team and our partners reflects this internationalism. We are thereby able to continue to increase coverage in regions from which not many infrastructures have yet been recorded.
Researchers, funders, publishers, and scientific institutions use the directory to reliably find the most suitable repositories and portals for their individual requirements. re3data is closely networked with further infrastructures for research data. In this way it supports an interconnected worldwide system of FAIR research data. Scientific communities use re3data actively and contribute to ensuring that the entries are current and complete.
Through greater awareness of the importance of Open Research Data and a corresponding remuneration of activities in the field of research data management, more scientists are motivated to research and publish in line with Open Science principles.
What’s more: In re3data, datasets can be very easily updated via the link “Submit a change request” in a repository entry. We are also always delighted to receive information about new repositories. Simply fill out the “Suggest” form on our website.
This text has been translated from German.
The post Anniversary of re3data: 10 Years of Active Campaigning for the Opening of Research Data and a Culture of Sharing first appeared on ZBW MediaTalk.