bioRxiv and medRxiv response to the OSTP memo – an open letter to US funding agencies

“Agencies can enable free public access to research results simply by mandating that reports of federally funded research are made available as “preprints” on servers such as arXiv, bioRxiv, medRxiv, and chemRxiv, before being submitted for journal publication. This will ensure that the findings are freely accessible to anyone anywhere in the world. An important additional benefit is the immediate availability of the information, avoiding the long delays associated with evaluation by traditional scientific journals (typically around one year). Scientific inquiry then progresses faster, as has been particularly evident for COVID research during the pandemic.

Prior access mandates in the US and elsewhere have focused on articles published by academic journals. This complicated the issue by making it a question of how to adapt journal revenue streams and led to the emergence of new models based on article-processing charges (APCs). But APCs simply move the access barrier to authors: they are a significant financial obstacle for researchers in fields and communities that lack the funding to pay them. A preprint mandate would achieve universal access for both authors and readers upstream, ensuring the focus remains on providing access to research findings, rather than on how they are selected and filtered.

Mandating public access to preprints rather than articles in academic journals would also future-proof agencies’ access policies. The distinction between peer-reviewed and non-peer-reviewed material is blurring as new approaches make peer review an ongoing process rather than a judgment made at a single point in time. Peer review can be conducted independently of journals through initiatives like Review Commons. And traditional journal-based peer review is changing: for example, eLife, supported by several large funders, peer reviews submitted papers but no longer distinguishes accepted from rejected articles. The author’s “accepted” manuscript that is the focus of so-called Green Open Access policies may therefore no longer exist. Because of such ongoing change, mandating the free availability of preprints would be a straightforward and strategically astute policy for US funding agencies.

A preprint mandate would underscore the fundamental, often overlooked, point that it is the results of research to which the public should have access. The evaluation of that research by journals is part of an ongoing process of assessment that can take place after the results have been made openly available. Preprint mandates from the funders of research would also widen the possibilities for evolution within the system and avoid channeling it towards expensive APC-based publishing models. Furthermore, since articles on preprint servers can be accompanied by supplementary data deposits on the servers themselves or linked to data deposited elsewhere, preprint mandates would also provide mechanisms to accomplish the other important OSTP goal: availability of research data.”

preLights talks to Richard Sever – preLights

“Richard Sever is Assistant Director of Cold Spring Harbor Laboratory Press (CSHL Press) and co-founder of bioRxiv and medRxiv. Prior to moving to CSHL Press in 2008, he worked as an editor for several journals including Current Opinion in Cell Biology, Trends in Biochemical Sciences, and Journal of Cell Science. Here, we discuss Richard’s transition into the academic publishing industry, the journey that led him to co-found the preprint servers bioRxiv and medRxiv with John Inglis, and his take on preprint peer review and the value it can hold for early-career researchers….”

Motivations, concerns and selection biases when posting preprints: A survey of bioRxiv authors | PLOS ONE

Abstract: Since 2013, the usage of preprints as a means of sharing research in biology has rapidly grown, in particular via the preprint server bioRxiv. Recent studies have found that journal articles that were previously posted to bioRxiv received a higher number of citations or mentions/shares on other online platforms compared to articles in the same journals that were not posted. However, the exact causal mechanism for this effect has not been established, and may in part be related to authors’ biases in the selection of articles that are chosen to be posted as preprints. We aimed to investigate this mechanism by conducting a mixed-methods survey of 1,444 authors of bioRxiv preprints, to investigate the reasons that they post or do not post certain articles as preprints, and to make comparisons between articles they choose to post and not post as preprints. We find that authors are most strongly motivated to post preprints to increase awareness of their work and increase the speed of its dissemination; conversely, the strongest reasons for not posting preprints centre around a lack of awareness of preprints and reluctance to publicly post work that has not undergone a peer review process. We additionally find evidence that authors do not consider quality, novelty or significance when posting or not posting research as preprints; however, authors retain an expectation that articles they post as preprints will receive more citations or be shared more widely online than articles not posted.

New policy: Review Commons makes preprint review fully transparent – ASAPbio

“In a major step toward promoting preprint peer review as a means of increasing transparency and efficiency in scientific publishing, Review Commons is updating its policy: as of 1 June 2022, peer reviews and the authors’ response will be posted by Review Commons to bioRxiv or medRxiv when authors transfer their refereed preprint to the first affiliate journal….”

Making Science More Open Is Good for Research—but Bad for Security

But a new paper in the journal PLoS Biology argues that, while the swell of the open science movement is on the whole a good thing, it isn’t without risks.

Though the speed of open-access publishing means important research gets out more quickly, it also means the checks required to ensure that risky science isn’t being tossed online are less meticulous. In particular, the field of synthetic biology—which involves the engineering of new organisms or the reengineering of existing organisms to have new abilities—faces what is called a dual-use dilemma: that while quickly released research may be used for the good of society, it could also be co-opted by bad actors to conduct biowarfare or bioterrorism. It also could increase the potential for an accidental release of a dangerous pathogen if, for example, someone inexperienced were able to easily get their hands on a how-to guide for designing a virus. “There is a risk that bad things are going to be shared,” says James Smith, a coauthor on the paper and a researcher at the University of Oxford. “And there’s not really processes in place at the moment to address it.”

Open data and data sharing in articles about COVID-19 published in preprint servers medRxiv and bioRxiv

This study aimed to analyze the content of data availability statements (DAS) and the actual sharing of raw data in preprint articles about COVID-19. The study combined a bibliometric analysis and a cross-sectional survey. We analyzed preprint articles on COVID-19 published on medRxiv and bioRxiv from January 1, 2020 to March 30, 2020. We extracted data sharing statements, tried to locate raw data when authors indicated they were available, and surveyed authors. The authors were surveyed in 2020–2021. We surveyed authors whose articles did not include a DAS, who indicated that data were available on request, or whose manuscripts reported that raw data were available in the manuscript although the raw data could not be found there. Raw data collected in this study are published on Open Science Framework (https://osf.io/6ztec/). We analyzed 897 preprint articles. There were 699 (78%) articles with a Data/Code field present on the website of a preprint server. In 234 (26%) preprints, a data/code sharing statement was reported within the manuscript. For 283 preprints that reported that data were accessible, we found raw data/code for 133 (47%) of those 283 preprints (15% of all analyzed preprint articles). Most commonly, authors indicated that data were available on GitHub or another clearly specified web location, on (reasonable) request, or in the manuscript or its supplementary files. In conclusion, preprint servers should require authors to provide data sharing statements that will be included both on the website and in the manuscript. Education of researchers about the meaning of data sharing is needed.
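The percentages reported in this abstract can be rechecked from the counts it gives. The short sketch below does only that arithmetic; the variable and function names are illustrative and do not come from the study.

```python
# Counts as stated in the abstract above.
total_preprints = 897          # all analyzed preprints
with_site_field = 699          # Data/Code field present on the server website
with_ms_statement = 234        # sharing statement inside the manuscript
claimed_accessible = 283       # preprints reporting that data were accessible
raw_data_found = 133           # preprints for which raw data/code were located


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


shares = {
    "site field / all": pct(with_site_field, total_preprints),
    "manuscript statement / all": pct(with_ms_statement, total_preprints),
    "raw data found / claimed accessible": pct(raw_data_found, claimed_accessible),
    "raw data found / all": pct(raw_data_found, total_preprints),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

The four values agree with the 78%, 26%, 47% and 15% quoted in the abstract.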

Examining Linguistic Shifts Between Preprints and Publications

Preprints allow researchers to make their findings available to the scientific community before they have undergone peer review. Studies on preprints within bioRxiv have been largely focused on article metadata and how often these preprints are downloaded, cited, published, and discussed online. A missing element that has yet to be examined is the language contained within the bioRxiv preprint repository. We sought to compare and contrast linguistic features within bioRxiv preprints to published biomedical text as a whole as this is an excellent opportunity to examine how peer review changes these documents. The most prevalent features that changed appear to be associated with typesetting and mentions of supporting information sections or additional files. In addition to text comparison, we created document embeddings derived from a preprint-trained word2vec model. We found that these embeddings are able to parse out different scientific approaches and concepts, link unannotated preprint–peer-reviewed article pairs, and identify journals that publish linguistically similar papers to a given preprint. We also used these embeddings to examine factors associated with the time elapsed between the posting of a first preprint and the appearance of a peer-reviewed publication. We found that preprints with more versions posted and more textual changes took longer to publish. Lastly, we constructed a web application (https://greenelab.github.io/preprint-similarity-search/) that allows users to identify which journals and articles are most linguistically similar to a bioRxiv or medRxiv preprint as well as observe where the preprint would be positioned within a published article landscape.
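To make the idea of matching a preprint to linguistically similar journals concrete, here is a minimal sketch. It rests on assumptions that the abstract does not confirm: document embeddings are formed by averaging word vectors, and similarity is measured with cosine similarity. The three-dimensional vectors, the journal names and the function names are all hypothetical toy values, not the authors' model or data.

```python
import math

# Hypothetical toy word vectors; a real word2vec model would supply these.
WORD_VECTORS = {
    "gene": [0.9, 0.1, 0.0],
    "expression": [0.8, 0.2, 0.1],
    "neuron": [0.1, 0.9, 0.0],
    "synapse": [0.0, 0.8, 0.2],
}


def embed(tokens):
    """Average the vectors of known tokens (assumed pooling strategy)."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        raise ValueError("no tokens with a known word vector")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_tokens, candidates):
    """Return the candidate name whose embedding is closest to the query."""
    query = embed(query_tokens)
    return max(candidates, key=lambda name: cosine(query, embed(candidates[name])))


# Hypothetical journals, each represented by a few typical tokens.
journals = {
    "Journal A": ["gene"],
    "Journal B": ["neuron", "synapse"],
}
print(most_similar(["gene", "expression"], journals))
```

With these toy vectors the query about gene expression lands closest to "Journal A". A real system would use vectors learned from the preprint corpus and far larger candidate sets.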

Opinion: In Defense of Preprints | The Scientist Magazine®

“There are two important lessons here. First, the universal availability of the internet and social networks mean that this type of information can be easily disseminated independently of preprints. Second, peer-reviewed journals may not effectively function as gatekeepers: Raoult’s paper was published after alleged peer review despite its flaws and, as of today, still has not been retracted. Preprints provide an opportunity for the scientific community to discuss new work, and indeed many researchers pointed out the flaws in the Raoult manuscript in medRxiv’s comment section and elsewhere. Additionally, the “more-sober analysis” Mullins refers to showing “HCQ has no proven role” was itself a preprint posted to medRxiv in July 2020….

We and the other cofounders of medRxiv are experienced biomedical editors and thus well aware of the challenges presented by biomedical preprints. We recognize the need to balance their undoubted advantages (which have been particularly evident during the pandemic, when they have allowed researchers to quickly share information about promising research avenues and treatments) with the potential drawbacks. medRxiv papers go through extensive screening for dangerous material, and we have previously detailed the reasons for declining certain manuscripts out of an abundance of caution. Meanwhile, as the growth of preprints on bioRxiv and medRxiv demonstrates, the scientific community is becoming acclimatized to a new norm in which research is available for discussion and comment prior to formal review….”

Preprint servers in lipidology: current status and future role

Abstract:  Purpose of review 

Preprinting, or the sharing of non-peer reviewed, unpublished scholarly manuscripts, has exploded in all fields of science and medicine over the past 5 years. We searched the literature and evaluated the posting and uptake of preprint publications in the field of lipidology in bioRxiv and medRxiv servers. We also contacted the editorial offices of 20 journals that publish original research in lipidology to gauge their policies on preprints.

Findings 

All 20 journals contacted indicated that they accepted preprints. As of 31 May 2021, 473 and 231 preprints in lipidology had been submitted to bioRxiv and medRxiv, respectively. About half of all lipidology preprints were related to cardiovascular, cardiometabolic, and/or metabolic diseases (CVMD) and their risk factors, but at least 12 other disease categories were also represented. 16.9% and 1.08% of medRxiv and bioRxiv preprints, respectively, were related to coronavirus disease 2019 (COVID-19).

Summary 

All identified journals accept lipidology-themed preprints for submission, removing any barriers authors may have had regarding preprinting. Based on growing experience with preprinting, this trend should encourage increased community feedback and facilitate higher quality lipidology research in the future.

bioRxiv & medRxiv; Communicating at the Speed of Science

“Preprint servers bioRxiv & medRxiv have experienced unprecedented growth and attention during these past 18 months as they have contributed to the scientific community’s collaborative response to the present international health crisis. The frequent reports in mass-media outlets alone, after January 2020, demonstrate that bioRxiv and medRxiv are becoming recognized Open Science digital repositories that are at the center of rapidly disseminating scientific research freely throughout the world.

Please join us on Oct 26th at 11am for our inaugural session during Open Access Week 2021 as the Harvard Library welcomes Richard Sever, Assistant Director Cold Spring Harbor Laboratory Press & Co-founder of the preprint servers bioRxiv and medRxiv. Dr. Sever will share his observations and reflections on the exponential growth and impact that preprints have had on advancing scientific communication during this unprecedented time.”

Motivations, concerns and selection biases when posting preprints: a survey of bioRxiv authors | bioRxiv

Abstract: Since 2013, the usage of preprints as a means of sharing research in biology has rapidly grown, in particular via the preprint server bioRxiv. Recent studies have found that journal articles that were previously posted to bioRxiv received a higher number of citations or mentions/shares on other online platforms compared to articles in the same journals that were not posted. However, the exact causal mechanism for this effect has not been established, and may in part be related to authors’ biases in the selection of articles that are chosen to be posted as preprints. We aimed to investigate this mechanism by conducting a mixed-methods survey of 1,444 authors of bioRxiv preprints, to investigate the reasons that they post or do not post certain articles as preprints, and to make comparisons between articles they choose to post and not post as preprints. We find that authors are most strongly motivated to post preprints to increase awareness of their work and increase the speed of its dissemination; conversely, the strongest reasons for not posting preprints centre around a lack of awareness of preprints and reluctance to publicly post work that has not undergone a peer review process. We additionally find weak evidence that authors preferentially select their highest quality, most novel or most significant research to post as preprints; however, authors retain an expectation that articles they post as preprints will receive more citations or be shared more widely online than articles not posted.

Research data communication strategy at the time of pandemics: a retrospective analysis of the Italian experience | Monaldi Archives for Chest Disease

Abstract: The coronavirus pandemic has radically changed the scientific world. During these difficult times, standard peer-review processes could be too long for the continuously evolving knowledge about this disease. We wanted to assess whether the use of other types of network could be a faster way to disseminate the knowledge about coronavirus disease. We retrospectively analyzed the data flow among three distinct groups of networks during the first three months of the pandemic: PubMed, preprint repositories (bioRxiv and arXiv) and social media in Italy (Facebook and Twitter). The results show a significant difference in the number of original research articles published by PubMed and preprint repositories. On social media, we observed an incredible number of physicians participating in the discussion, both on three distinct Italian-speaking Facebook groups and on Twitter. The standard scientific process of publishing articles (i.e., the peer-review process) remains the best way to get access to high-quality research. Nonetheless, this process may be too long during an emergency like a pandemic. The thoughtful use of other types of network, such as preprint repositories and social media, could be taken into consideration in order to improve the clinical management of COVID-19 patients.

Following Preprints

“An important part of our mission at bioRxiv is to alert readers when new preprints that might interest them are posted. You can already sign up for personalized alerts on the bioRxiv Alerts/RSS page (see figure below) to get automatic notifications when papers that satisfy your search criteria are posted. We also provide dedicated RSS feeds and twitter accounts for certain subject categories (Cell Biology, Neuroscience, Genetics, etc.). 

But preprints can be revised, people comment on them, and ultimately most end up being published in journals. Since these are all events readers might also want to know about, we have now added an exciting new feature that allows you to Follow a preprint so that you get notified when someone comments on it, the authors post a new version, or the paper is published as a version of record in a journal.

To follow a paper, simply click on ‘Follow this preprint’ above the title, enter your email address, and choose which events you’d like to be notified about. We’ll then send you an email when the events occur – summary emails are sent once a day so you are not bombarded! …”

Making Strides in Research Reporting – The Official PLOS Blog

“PLOS keeps a watchful and enthusiastic eye on emerging research, and we update our policies as needed to address new challenges and opportunities that surface. In doing so, we work to advance our core mission and values aimed at transforming research communication and promoting Open Science. 

Here, I summarize a few key updates we made between 2016-2021….”

B2X – a new pipeline for author services

“For several years, bioRxiv has made life easier for authors by enabling them to send their papers directly from bioRxiv to journals. This B2J (bioRxiv-to-journal) technology saves people time by automatically transferring their PDF, metadata and any source files to journal submission systems so they don’t have to upload these again at the journal website and re-enter all the information. Around 200 journals now participate in B2J, and portable peer review services such as Review Commons also participate.

We are now introducing a new delivery pipeline – B2X – that will enable authors to send their manuscripts to a variety of third-party services. These services are completely independent of bioRxiv and may include groups that assess particular aspects of manuscripts, help authors improve them, or check for compliance with specific funder requirements. The first organization to join B2X is DataSeer, a service that helps researchers navigate open data policies.

DataSeer scans articles for datasets collected and provides recommendations for how these should be shared. Authors receive a brief report on the data that should be shared and advice on metadata, file formats, and appropriate repositories. They can also obtain an Open Data certificate documenting data deposited in public repositories….”