“Last year’s White House Office of Science and Technology Policy (OSTP) Nelson Memo is just one recent example of a national funding organization that is paying attention to PIDs. It directs US agencies to instruct their funded researchers “to obtain a digital persistent identifier … include it in published research outputs when available, and provide federal agencies with the metadata associated with all published research outputs they produce”. Other examples include UK Research and Innovation’s (UKRI) recently updated open access policy, which states that “Persistent Identifiers (PIDs) for articles must be implemented according to international recognised standards”; and Plan S’s requirement for the “Use of persistent identifiers (PIDs) for scholarly publications (with versioning, for example, in case of revisions), such as DOI”, which has been adopted by multiple countries.
It’s not just the national funders who are getting in on the act; there’s also been a surge in interest at the national government level. A number of countries in the Americas, Asia Pacific, and Europe are at various stages of developing and implementing national PID strategies. They include Australia, Brazil, Canada, Finland, the Netherlands, Peru, South Korea, and the UK, all of which are participating in a Research Data Alliance (RDA) National PID Strategies Working Group, set up following a Birds of a Feather session at the RDA Virtual Plenary 17 last year. There are a number of similarities between these countries’ approaches, as the RDA WG has found. Its aim is “to map common activities across national agencies/efforts and produce a guide on the specific PIDs adopted in the context of national or regional PID strategies [in order to] help others, irrespective of geographical region, follow a blueprint to define their national PID approach. The intention is that it can be adopted or adapted by other countries looking to develop their own PID strategies. By following the recommendations it will encourage standardisation internationally.” One element of this work is to identify the most commonly used PIDs across all countries, which I’m sure is music to the ears of my former NISO colleague Todd Carpenter, who pointed out in his recent post that, “It is past time that we all agree on a core set of identifiers and basic metadata elements and begin to encourage researchers to use them at scale when communicating their results.” Common PIDs (not all of which are open) that have already been identified in the RDA WG’s work include: ORCID or ISNI for researchers; ROR or ISNI for research organizations; Crossref DOIs for research articles; DataCite DOIs or Handles for research data; Crossref DOIs for grants; RAiD for projects; and DOIs, IGSN and RRID for samples and specimens….”
“The scholarly publishing community talks a LOT about metadata and the need for high-quality, interoperable, and machine-readable descriptors of the content we disseminate. However, as we’ve reflected on previously in the Kitchen, despite well-established information standards (e.g., persistent identifiers), our industry lacks a shared framework to measure the value and impact of the metadata we produce.
In 2021, we embarked on a Crossref-sponsored study designed to measure how metadata impacts end-user experiences and contributes to the successful discovery of academic and research literature via the mainstream web. Specifically, we set out to learn if scholarly books with DOIs (and associated metadata) were more easily found in Google Scholar than those without DOIs.
Initial results indicated that DOIs have an indirect influence on the discoverability of scholarly books in Google Scholar — however, we found no direct linkage between book DOIs and the quality of Google Scholar indexing or users’ ability to access the full text via search-result links. Although Google Scholar claims to not use DOI metadata in its search index, the results of our mixed-methods study of 100+ books (from 20 publishers) demonstrate that books with DOIs are generally more discoverable than those without DOIs….”
“DataCite is pleased to announce that The Wellcome Trust has awarded funds to build the Open Global Data Citation Corpus to dramatically transform the data citation landscape. The corpus will store asserted data citations from a diverse set of sources and can be used by any community stakeholder.
The Make Data Count (MDC) initiative was established in 2014 to develop an infrastructure for open data metrics. A key learning from the initiative is that the community needs a clear understanding of data reuse to monitor impact, inform future funding, and improve the dissemination of research. The development of a trusted central aggregate of all references to research data across articles, preprints, government documents, and other outputs will help achieve this goal….”
“We’re excited to introduce DOCI, the OpenCitations Index of Datacite open DOI-to-DOI citations, a new tool containing citations derived from publications bearing DataCite DOIs to other DOI-identified publications, harvested from DataCite. The citations available in DOCI are treated as first-class data entities, with accompanying properties including the citations timespan, modelled according to the OpenCitations Data Model.
Currently, DOCI’s December 2022 release contains 169,822,752 citations from 1,753,860 bibliographic resources, and is based on the last dump of DataCite dated 22 October 2021 provided by the Internet Archive. …”
“As I posted a while ago, from January 2023 I will be working at Crossref while retaining my university Professorship. I wanted, here, to outline a few of the projects that I hope to work on once I get started there. I should say upfront: I am afraid there is no time estimate on these and we can’t guarantee to prioritise any particular project. But if there is one that stands out to you, do let me know, as this serves as a useful community gauge….”
“Metadata is at the heart of DOIs and open scholarly infrastructure. At DataCite, our metadata schema defines what metadata properties can be included through DOI registration. The schema currently includes just six required properties—identifier (the DOI), creator, title, publication year, publisher, and resource type—along with 14 recommended and optional properties.
On the one hand, requiring only six metadata properties keeps the schema flexible and makes it easy to get started with DOI registration. At the same time, we want to encourage all DataCite Metadata Schema users to go beyond the mandatory properties and to share rich metadata that includes all available information about a given resource. This is especially important for metadata properties that are essential for discoverability—such as description and subject—and building connections between PIDs—including identifiers for related resources, people, and organizations. Keeping metadata up-to-date is also critical to ensure that the “persistent” part of persistent identifiers lives up to its full potential….”
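As a rough illustration of the distinction drawn above between required properties and richer metadata, the following minimal sketch checks a record for the six required properties and for the two discoverability properties the excerpt mentions. The field names, the example record, and the DOI `10.1234/example` are hypothetical and are not taken from the DataCite schema itself.

```python
# Hypothetical field names, loosely mirroring the six required properties
# listed in the excerpt; the real schema's element names may differ.
REQUIRED = (
    "identifier",
    "creator",
    "title",
    "publication_year",
    "publisher",
    "resource_type",
)

# Properties the excerpt singles out as important for discoverability.
DISCOVERY = ("description", "subject")


def missing_properties(record, names):
    """Return the names whose value is absent or empty in the record."""
    return [name for name in names if not record.get(name)]


# A made-up record that carries only the required properties.
record = {
    "identifier": "10.1234/example",
    "creator": "Example, A.",
    "title": "Example dataset",
    "publication_year": 2022,
    "publisher": "Example Repository",
    "resource_type": "Dataset",
}

print("Missing required:", missing_properties(record, REQUIRED))
print("Missing for discoverability:", missing_properties(record, DISCOVERY))
```

A record like this one would pass a required-properties check while still lacking the description and subject that the excerpt encourages users to supply.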
“Earlier this year, DataCite consortium lead and partner organization, the Australian Research Data Commons (ARDC), together with Australian ORCID consortium lead organization, the Australian Access Federation (AAF), commissioned the MoreBrains Cooperative to undertake a cost benefit analysis of the incentives for adoption of persistent identifiers (PIDs) by the Australian research sector. The resulting report, Incentives to invest in identifiers: A cost-benefit analysis of persistent identifiers in Australian research systems, published in September, found that 80% adoption of five priority PIDs would lead to savings of 38,000 researcher days per year. The direct financial cost of this wasted effort is close to AUD24 million per year (around 15M USD/EUR); accounting for the opportunity cost associated with technology transfer and innovation-led growth, the savings increase to a staggering AUD84 million per year!
The PIDs in question are ORCID iDs for people, ROR IDs for institutions, ARDC’s own RAiDs for projects, Crossref and DataCite DOIs for research outputs, and Crossref DOIs for grants. In addition, as part of a longer-term strategy, the report recommends that work should continue on developing PIDs for instruments, expanding the uses of IGSN IDs for samples, and potentially other IDs, in collaboration with other research communities. Other recommendations include: …”
“The ConfIDent project focuses on the development of a service platform for scientific events. ConfIDent aims to help researchers find relevant conferences in their field and to share information about conferences. The project is led by TIB – German National Library of Science and Technology and the Department of Information Systems & Databases at RWTH Aachen University (Chair of Computer Science 5).
The goal of the ConfIDent project is “to make the descriptive metadata on conferences and other formats of scientific events permanently accessible in a high quality through automated processes and scientific data curation” (https://projects.tib.eu/en/confident/). Pilot communities for the project included computer science and transport and mobility research. ConfIDent is a sustainable service for researchers who search for and publish information on scientific events, as well as universities, information infrastructure institutions, specialized societies, publishers and funding agencies. The project is supported by the German Research Foundation (DFG) through November 2022. To date, over 12,700 events have been added to the ConfIDent platform.
The idea of registering DOIs for scientific events was initiated by a working group on PIDs for Conferences initiated by Crossref and DataCite, which published a first draft metadata schema for comment in 2018. ConfIDent has taken up this preliminary work and developed a metadata schema to describe scientific events in a sustainable way….”
“The DataCite Metadata Working Group has been working on the next version of the metadata schema—and we need your feedback!
Over the past year and a half, the Metadata Working Group has been working on changes to support the evolving use cases for DataCite DOIs. These proposed updates are in response to requests from DataCite community members and also in alignment with pillar 3 of DataCite’s strategic plan—that is, to “identify and connect all resource types held by research organizations globally.”
We want to make sure these changes work—that they solve the problems that they are intended to solve—and we want to hear from you! For the first time, we are sharing a draft proposal before releasing the next metadata schema version….”
“The opening up of citation data is welcome. It means greater transparency and accountability for research studies designed to inform academics, funders and governments in their decisions about areas of research they should focus energy and money on.
But more is needed. Not all publishers index papers on Crossref, and not all indexed papers have citation data associated with them. One study published in July found that about one-third of papers indexed in 2021 are lacking such data (N. J. van Eck and L. Waltman. Preprint at https://doi.org/10.31222/osf.io/smxe5; 2022). Some of these articles — particularly editorials, letters, corrections and book reviews — might not have any references, but this by no means applies to all of them. Uploading citation data should not be seen as optional….”
“DOIs and URLs themselves don’t really tell readers much. People with visual impairments rely on screen readers to read out loud the contents of a page. We’re asking for the title of each DOI to be added, in an ARIA (Accessible Rich Internet Applications) attribute, so these users understand what these links are for.
Accessible text, as this kind of description is known, should be included for all links, but at this time, we’re specifically recommending it for landing pages of newly registered records.
It’s not required, yet. We’re proposing a 2 year recommendation period and we want your feedback on the particulars, including timing and how we can help. Please take a short survey and/or get in touch and share your thoughts.
We’ll finalize these recommendations after assessing the feedback. Please check back for updates….”
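To make the recommendation above more concrete, here is a minimal sketch of a DOI link that carries the record's title as accessible text. The excerpt does not say which ARIA attribute is intended, so the use of `aria-label` is an assumption; the helper name `doi_link`, the DOI, and the title are hypothetical.

```python
from html import escape


def doi_link(doi, title):
    """Build an HTML link to a DOI whose accessible name is the record title.

    Assumption: aria-label is used to carry the title; the recommendation
    quoted above only says "an ARIA attribute".
    """
    url = f"https://doi.org/{doi}"
    return (
        f'<a href="{escape(url, quote=True)}" '
        f'aria-label="{escape(title, quote=True)}">{escape(url)}</a>'
    )


# Hypothetical DOI and title, for illustration only.
print(doi_link("10.1234/example", 'Soil "core" samples & notes'))
```

With markup along these lines, a screen reader could announce the title rather than reading the DOI string character by character. Escaping the title matters because titles often contain quotation marks and ampersands.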
Abstract: The adoption of journal policies requiring authors to include a Data Availability Statement has helped to increase the availability of research data associated with research articles. However, having a Data Availability Statement is not a guarantee that readers will be able to locate the data; even if provided with an identifier like a uniform resource locator (URL) or a digital object identifier (DOI), the data may become unavailable due to link rot and content drift. To explore the long-term availability of resources including data, code, and other digital research objects associated with papers, this study extracted 8,503 URLs and DOIs from a corpus of nearly 50,000 Data Availability Statements from papers published in PLOS ONE between 2014 and 2016. These URLs and DOIs were used to attempt to retrieve the data through both automated and manual means. Overall, 80% of the resources could be retrieved automatically, compared to much lower retrieval rates of 10–40% found in previous papers that relied on contacting authors to locate data. Because a URL or DOI might be valid but still not point to the resource, a subset of 350 URLs and 350 DOIs were manually tested, with 78% and 98% of resources, respectively, successfully retrieved. Having a DOI and being shared in a repository were both positively associated with availability. Although resources associated with older papers were slightly less likely to be available, this difference was not statistically significant, suggesting that URLs and DOIs may be an effective means for accessing data over time. These findings point to the value of including URLs and DOIs in Data Availability Statements to ensure access to data on a long-term basis.
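The extraction step described in the abstract can be sketched as follows. This is not the study's code: the regular expressions, the function names, and the sample statement (including its DOI and URL) are assumptions made for illustration, and the sketch covers only pulling identifiers out of a statement, not the automated or manual retrieval tests.

```python
import re

# Simplified, assumed patterns; real statements may need more careful parsing.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
URL_PATTERN = re.compile(r"https?://[^\s\"<>]+")

# Punctuation that commonly trails an identifier at the end of a clause.
TRAILING = ".,;)"


def extract_identifiers(statement):
    """Split a Data Availability Statement into DOIs and plain URLs."""
    dois = [m.rstrip(TRAILING) for m in DOI_PATTERN.findall(statement)]
    # URLs that embed a DOI (e.g. doi.org links) are counted as DOIs only.
    urls = [
        u.rstrip(TRAILING)
        for u in URL_PATTERN.findall(statement)
        if not DOI_PATTERN.search(u)
    ]
    return {"dois": dois, "urls": urls}


def resolver_url(doi):
    """Return the doi.org address at which a DOI would be resolved."""
    return f"https://doi.org/{doi}"


# Hypothetical statement, not drawn from the PLOS ONE corpus.
statement = (
    "Data are available at https://example.org/data/set1 "
    "and from a repository (doi:10.1234/abcd.5678)."
)
found = extract_identifiers(statement)
print(found)
print([resolver_url(d) for d in found["dois"]])
```

A retrieval test would then request each URL and each resolver address and record whether the resource is still reachable, which is where link rot and content drift would show up.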