Update on Access to Coronavirus-related Articles in PubMed Central (PMC) COVID-19 Collection After End of Public Health Emergency

“Early in the COVID-19 pandemic, the National Library of Medicine (NLM) collaborated with publishers and scholarly societies to expand access to coronavirus-related journal articles in PubMed Central (PMC), a digital archive of peer-reviewed biomedical and life sciences literature. Through this collaboration, more than 50 publishers made more than 350,000 coronavirus-related articles accessible under various article-level license terms through the PMC COVID-19 Collection (previously the PMC COVID-19 Public Health Emergency Initiative). This collaboration made a significant collection of coronavirus-related information immediately accessible to researchers to accelerate discoveries about COVID-19.

As COVID-19 emergency declarations expired in the United States and around the globe, so too did article-level license terms for use of some of these articles. Most of the articles deposited in the PMC COVID-19 Collection will remain available in PMC and available for bulk distribution and reuse, and all citations will remain searchable in PubMed; however, some publishers retained the right to remove their content and have requested to do so.

To assist PMC users in understanding these changes, NLM is making available, in downloadable format, lists of PMCIDs (PMC unique reference numbers) for any impacted articles.

NLM remains committed to providing perpetual public access to all articles deposited in the PMC COVID-19 Collection for which the copyright holder provides such permission. More information is available from the PMC COVID-19 Collection and PMC COVID-19 Collection FAQ webpages.”

[2308.07333] Computational reproducibility of Jupyter notebooks from biomedical publications

Abstract:  Jupyter notebooks facilitate the bundling of executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows. The reproducibility of computational aspects of research is a key component of scientific reproducibility but has not yet been assessed at scale for Jupyter notebooks associated with biomedical publications. We address computational reproducibility at two levels: First, using fully automated workflows, we analyzed the computational reproducibility of Jupyter notebooks related to publications indexed in PubMed Central. We identified such notebooks by mining the articles' full text, locating them on GitHub and re-running them in an environment as close to the original as possible. We documented reproduction success and exceptions and explored relationships between notebook reproducibility and variables related to the notebooks or publications. Second, this study represents a reproducibility attempt in and of itself, using essentially the same methodology twice on PubMed Central over two years. Out of 27271 notebooks from 2660 GitHub repositories associated with 3467 articles, 22578 notebooks were written in Python, including 15817 that had their dependencies declared in standard requirement files and that we attempted to re-run automatically. For 10388 of these, all declared dependencies could be installed successfully, and we re-ran them to assess reproducibility. Of these, 1203 notebooks ran through without any errors, including 879 that produced results identical to those reported in the original notebook and 324 for which our results differed from the originally reported ones. Running the other notebooks resulted in exceptions. We zoom in on common problems, highlight trends and discuss potential improvements to Jupyter-related workflows associated with biomedical publications.
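The counts in this abstract form a funnel, and the shares at each stage are easier to see when computed. The following is a minimal sketch that uses only the numbers quoted above; the stage labels and the helper names `share` and `summarize` are illustrative assumptions, not part of the study's own tooling.

```python
# Counts quoted from the abstract above; labels are paraphrased for display.
FUNNEL = [
    ("notebooks found", 27271),
    ("written in Python", 22578),
    ("dependencies declared in requirement files", 15817),
    ("all declared dependencies installed", 10388),
    ("ran without errors", 1203),
    ("results identical to the original", 879),
]
DIFFERING_RESULTS = 324  # ran without errors, but results differed


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)


def summarize(stages):
    """Pair each stage with its share of the first stage and of the previous stage."""
    total = stages[0][1]
    rows = []
    previous = total
    for label, count in stages:
        rows.append((label, count, share(count, total), share(count, previous)))
        previous = count
    return rows


for label, count, of_total, of_previous in summarize(FUNNEL):
    print(f"{label:45s} {count:6d}  {of_total:5.1f}% of all  {of_previous:5.1f}% of previous stage")
```

Read this way, the two outcome groups (879 identical, 324 differing) account for all 1203 notebooks that ran without errors, and the largest drop in the funnel is between installing dependencies and completing a run.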

Increasing agility and visibility in scientific publishing – Archives of Endocrinology and Metabolism

“Since its inception, AE&M aimed to establish itself as a leading source of high-quality scientific information in the areas of endocrinology and metabolism (1). In that sense, maintaining open access to our articles was paramount to amplify the reach of such information in a globalized, albeit inequitable, world (2). Aiming to continue to serve the community of readers, authors, and reviewers in the best possible way, two new implementations are underway: AE&M has joined PubMed Central (PMC), and from May 2023, AE&M will adopt the continuous publication model.


AE&M’s incorporation into PMC reflects its growth and scientific relevance in the field. PMC is a free full-text repository of biomedical and life sciences journal literature at the U.S. National Institutes of Health’s National Library of Medicine (NIH/NLM), directly linked to its preeminent search engine. Created in 2000, it houses more than 7.6 million records and, like SciELO, PMC-indexed journals make their issues and articles available in full format. Its global reach will certainly bring even more visibility and prominence to the research findings published in AE&M….”

OpCitance: Citation contexts identified from the PubMed Central open access articles | Scientific Data

Abstract:  OpCitance contains all the sentences from 2 million PubMed Central open-access (PMCOA) articles, with 137 million inline citations annotated (i.e., the “citation contexts”). Parsing out the references and citation contexts from the PMCOA XML files was non-trivial due to the diversity of referencing style. Only 0.5% citation contexts remain unidentified due to technical or human issues, e.g., references unmentioned by the authors in the text or improper XML nesting, which is more common among older articles (pre-2000). PubMed IDs (PMIDs) linked to inline citations in the XML files compared to citations harvested using the NCBI E-Utilities differed for 70.96% of the articles. Using an in-house citation matcher, called Patci, 6.84% of the referenced PMIDs were supplemented and corrected. OpCitance includes fewer total number of articles than the Semantic Scholar Open Research Corpus, but OpCitance has 160 thousand unique articles, a higher inline citation identification rate, and a more accurate reference mapping to PMIDs. We hope that OpCitance will facilitate citation context studies in particular and benefit text-mining research more broadly.



PLOS Global Public Health and PLOS Digital Health now indexed in PubMed Central | STM Publishing News

“The Public Library of Science (PLOS) is pleased to announce that PLOS Global Public Health and PLOS Digital Health are now fully indexed in PubMed Central (PMC), expanding our reach and furthering our mission of ensuring research content is accessible and discoverable as widely as possible.

Both journals have an explicit mandate to promote equity in research that can tackle the most urgent priorities for the field, such as access to healthcare, or addressing bias in AI and developing machine learning tools for underserved communities. PLOS is proud to feature perspectives from all over the world, and we make sure that research is peer reviewed by experts with significant, context-appropriate expertise….

Work published in PLOS Digital Health and PLOS Global Public Health will now be accessible to an even wider audience, meeting researchers where it is convenient for them to access knowledge. With the vast majority of article views coming from PMC or Google Scholar searches, it is imperative that research in both journals be highly visible on these platforms.

Critically, the inclusion of PLOS Digital Health and PLOS Global Public Health in PMC is an endorsement of the rigor and reliability of the work published within and is the principal reason that researchers prefer to browse research on the platform. Journals indexed in PMC have undergone both technical and scientific benchmarking checks, allowing researchers to trust the findings, methods, and datasets shared. Of particular importance to the mission of both journals, this means local perspectives and expertise reported in rigorously reviewed published research will receive the attention and visibility that it deserves….”

Discover drug targets with Europe PMC Machine Learning Dataset and Open Targets – YouTube

“This webinar introduces the Europe PMC human-annotated full-text corpus for Gene/Proteins, Diseases and Organisms and highlights how it has been used to train machine learning models for systematic identification and prioritisation of potential therapeutic drugs by the Open Targets Platform. This webinar will explain how to access and reuse the annotated corpus as an open community resource and provide an overview of the Open Targets Platform.

Europe PMC is an open access life science database of journal articles and preprints, containing over 41 million abstracts and 8.7 million full-text articles. Open Targets Platform is an innovative public-private partnership that uses human genetics and genomics data for systematic drug target identification and prioritisation….”

NIH Preprint Pilot Accelerates and Expands Discovery of Research Results

“The NIH Preprint Pilot has accelerated and expanded broad discovery of NIH-funded research results relating to the SARS-CoV-2 virus and COVID-19. This finding comes from a new preprint authored by staff at the National Library of Medicine (NLM) and made available in bioRxiv. A project of NLM, the NIH Preprint Pilot was launched to explore new approaches to increase the discoverability of NIH-supported research results and gain a better understanding of perceptions and practices regarding preprints.

Preprints are complete and public drafts of scientific articles that have not yet been peer reviewed. Their use in communicating the results of biomedical research surged during the COVID-19 pandemic. The NIH Preprint Pilot builds on the role of PubMed Central (PMC) as a repository for peer-reviewed articles supported by NIH under the NIH Public Access Policy as well as NIH’s encouragement of investigators to use interim products of research, including preprints, to speed the dissemination of research and enhance the rigor of their work.

As part of NLM’s response to the COVID-19 public health emergency, Phase 1 of the NIH Preprint Pilot added to PMC more than 3,300 preprint records reporting on the results of NIH-funded research on the SARS-CoV-2 virus and COVID-19 and made citations discoverable in PubMed. These preprints have been viewed 4 million times and 3 million times in PMC and PubMed, respectively. The records are clearly labeled as preprints and can be included or excluded from search results in both resources.

Two years after the launch of the pilot, NLM analyzed the results of Phase 1 and found that inclusion of preprints facilitates discovery of NIH-supported research by making content available in full-text searchable formats, accelerating discoverability in NLM literature databases, and expanding the NIH research results made searchable. The findings suggest that the availability of preprints did not decrease users’ trust of NLM and its literature resources, with some reporting increased trust due to greater transparency offered into the research process.

The success of the pilot has encouraged NLM to extend the pilot in a second phase to launch in early 2023 that will encompass all preprints reporting on NIH-funded research. For preprints that are authored by NIH-funded researchers and voluntarily posted to eligible preprint servers on or after January 1, 2023, NLM will automatically include the full text of the preprint (as license terms allow) and associated citation information available in PMC and PubMed, respectively….”

Phase 1 of the NIH Preprint Pilot: Testing the viability of making preprints discoverable in PubMed Central and PubMed | bioRxiv

Abstract:  Introduction The National Library of Medicine (NLM) launched a pilot in June 2020 to 1) explore the feasibility and utility of adding preprints to PubMed Central (PMC) and making them discoverable in PubMed and 2) to support accelerated discoverability of NIH-supported research without compromising user trust in NLM’s widely used literature services.

Methods The first phase of the Pilot focused on archiving preprints reporting NIH-supported SARS-CoV-2 virus and COVID-19 research. To launch Phase 1, NLM identified eligible preprint servers and developed processes for identifying NIH-supported preprints within scope in these servers. Processes were also developed for the ingest and conversion of preprints in PMC and to send corresponding records to PubMed. User interfaces were modified for display of preprint records. NLM collected data on the preprints ingested and discovery of preprint records in PMC and PubMed and engaged users through focus groups and a survey to obtain direct feedback on the Pilot and perceptions of preprints.

Results Between June 2020 and June 2022, NLM added more than 3,300 preprint records to PMC and PubMed, which were viewed 4 million times and 3 million times, respectively. Nearly a quarter of preprints in the Pilot were not associated with a peer-reviewed published journal article. User feedback revealed that the inclusion of preprints did not have a notable impact on trust in PMC or PubMed.

Discussion NIH-supported preprints can be identified and added to PMC and PubMed without disrupting existing operations processes. Additionally, inclusion of preprints in PMC and PubMed accelerates discovery of NIH research without reducing trust in NLM literature services. Phase 1 of the Pilot provided a useful testbed for studying NIH investigator preprint posting practices, as well as knowledge gaps among user groups, during the COVID-19 public health emergency, an unusual time with heightened interest in immediate access to research results.

JMIRx Med first overlay journal accepted for PubMed and PubMed Central

“JMIR Publications is proud to announce that our first-of-its-kind overlay journal, JMIRx Med, has been accepted for indexing in PubMed Central (PMC) and PubMed.

As the first overlay journal in PMC and PubMed, JMIRx Med becomes the standard-bearer of this important innovation in scholarly publishing. Editors of overlay journals select content already posted on preprint servers such as medRxiv and bioRxiv. They then select manuscripts that match the scope and quality parameters of their publications and offer authors a rapid peer review and possible publication of their preprints, coupled with all the traditional elements of a journal publication. JMIRx Med enters the ranks of PubMed-ranked scientific publications following the US National Library of Medicine’s (NLM’s) rigorous evaluation criteria. Papers published in JMIRx Med will be in PubMed by mid-summer 2022, after legacy files are prepared and deposited….”

Public Access in PMC Update

In 2021, PubMed Central (PMC) continued to grow and evolve in its role as a repository for research supported by the National Institutes of Health (NIH) and other partner funding agencies. Around 1.3 million articles have been made publicly accessible in PMC under the NIH Public Access Policy, and the volume of NIH-supported articles added to PMC with associated data content continues to increase annually (59% of articles in 2020 included supplementary material and/or a data availability statement vs. 27% in 2009).

Blog – Europe PMC: Europe PMC adopts the Principles of Open Scholarly Infrastructure

“As a long-standing service and infrastructure provider in the open science ecosystem, Europe PMC supports the Principles of Open Scholarly Infrastructure (POSI). We welcome the momentum gathering behind this initiative to promote the need to support and sustain the open infrastructure.

Europe PMC has been a part of the public and open infrastructure for over 15 years and is run and managed by EMBL-EBI (which is part of the pan-European organisation of EMBL). It is funded by 34 international funders and is community-driven, open infrastructure, set in the context of key global open data resources such as the European Nucleotide Archive (INSDC), the wwPDB and the European Genome-Phenome Archive. All of these resources exist for the public good, led by scientific need and international collaborations, and have open governance structures and a commitment to long-term sustainability. Together with PMC USA, Europe PMC is a part of the PubMed Central International archive network, which plays an integral part in fulfilling shared goals to enable international open science. Europe PMC has been selected as an ELIXIR Core Data Resource, which means that it is of fundamental importance to the wider life-science community and the long-term preservation of biological data….”

Updated PMC Launching Soon!

In the coming weeks, we will be launching an updated PMC website with a modern design. You can try the updated version on PMC Labs now, and it will become the default design of the PMC website following launch. Be sure to check the banner at the top of the PMC website for updates on an exact cutover date.

Funders – About – Europe PMC

“Europe PMC has 33 research funders, across Europe. The Europe PMC funders expect:

Research outputs arising from research that we fund to be made freely and readily available;
Electronic copies of any biomedical research papers that have been accepted for publication in a peer-reviewed journal, and are supported in whole or in part by funding from any of the Europe PMC Funders, to be made available through PubMed Central (PMC) and Europe PMC, as soon as possible and in any event within six months of the journal publisher’s official date of final publication;
Authors and publishers, if an open access fee has been paid, to license research papers such that they may be freely copied and re-used for purposes such as text and data mining, provided that such uses are fully attributed. This is also encouraged where no fee has been paid….”