“In a major step toward promoting preprint peer review as a means of increasing transparency and efficiency in scientific publishing, Review Commons is updating its policy: as of 1 June 2022, peer reviews and the authors’ response will be posted by Review Commons to bioRxiv or medRxiv when authors transfer their refereed preprint to the first affiliate journal….”
But a new paper in the journal PLoS Biology argues that, while the swell of the open science movement is on the whole a good thing, it isn’t without risks.
Though the speed of open-access publishing means important research gets out more quickly, it also means the checks required to ensure that risky science isn’t being tossed online are less meticulous. In particular, the field of synthetic biology—which involves the engineering of new organisms or the reengineering of existing organisms to have new abilities—faces what is called a dual-use dilemma: that while quickly released research may be used for the good of society, it could also be co-opted by bad actors to conduct biowarfare or bioterrorism. It also could increase the potential for an accidental release of a dangerous pathogen if, for example, someone inexperienced were able to easily get their hands on a how-to guide for designing a virus. “There is a risk that bad things are going to be shared,” says James Smith, a coauthor on the paper and a researcher at the University of Oxford. “And there’s not really processes in place at the moment to address it.”
This study aimed to analyze the content of data availability statements (DAS) and the actual sharing of raw data in preprint articles about COVID-19. The study combined a bibliometric analysis and a cross-sectional survey. We analyzed preprint articles on COVID-19 published on medRxiv and bioRxiv from January 1, 2020 to March 30, 2020. We extracted data sharing statements, tried to locate raw data when authors indicated they were available, and surveyed authors. The authors were surveyed in 2020–2021. We surveyed authors whose articles did not include a DAS, who indicated that data were available on request, or whose manuscript reported that raw data were available in the manuscript although no raw data were found. Raw data collected in this study are published on Open Science Framework (https://osf.io/6ztec/). We analyzed 897 preprint articles. There were 699 (78%) articles with a Data/Code field present on the website of a preprint server. In 234 (26%) preprints, a data/code sharing statement was reported within the manuscript. For 283 preprints that reported that data were accessible, we found raw data/code for 133 (47%) of those 283 preprints (15% of all analyzed preprint articles). Most commonly, authors indicated that data were available on GitHub or another clearly specified web location, on (reasonable) request, or in the manuscript or its supplementary files. In conclusion, preprint servers should require authors to provide data sharing statements that will be included both on the website and in the manuscript. Education of researchers about the meaning of data sharing is needed.
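The percentages reported in this abstract can be checked from the counts it gives. The short Python sketch below recomputes them; the helper name `pct` and the variable names are illustrative assumptions, not part of the study, and rounding to whole percentages is assumed to match how the abstract reports them.

```python
def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts taken directly from the abstract.
total_preprints = 897        # all analyzed preprint articles
with_website_field = 699     # Data/Code field present on the server website
with_statement_in_ms = 234   # sharing statement reported within the manuscript
claimed_accessible = 283     # preprints reporting that data were accessible
raw_data_found = 133         # preprints for which raw data/code were located

shares = {
    "website field": pct(with_website_field, total_preprints),
    "statement in manuscript": pct(with_statement_in_ms, total_preprints),
    "found among claimed": pct(raw_data_found, claimed_accessible),
    "found among all": pct(raw_data_found, total_preprints),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

The two figures for located raw data differ only in the denominator: 133 of the 283 preprints that claimed accessible data, versus 133 of all 897 preprints analyzed.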
Preprints allow researchers to make their findings available to the scientific community before they have undergone peer review. Studies on preprints within bioRxiv have been largely focused on article metadata and how often these preprints are downloaded, cited, published, and discussed online. A missing element that has yet to be examined is the language contained within the bioRxiv preprint repository. We sought to compare linguistic features within bioRxiv preprints with those of published biomedical text as a whole, as this is an excellent opportunity to examine how peer review changes these documents. The most prevalent features that changed appear to be associated with typesetting and mentions of supporting information sections or additional files. In addition to text comparison, we created document embeddings derived from a preprint-trained word2vec model. We found that these embeddings are able to parse out different scientific approaches and concepts, link unannotated preprint–peer-reviewed article pairs, and identify journals that publish linguistically similar papers to a given preprint. We also used these embeddings to examine factors associated with the time elapsed between the posting of a first preprint and the appearance of a peer-reviewed publication. We found that preprints with more versions posted and more textual changes took longer to publish. Lastly, we constructed a web application (https://greenelab.github.io/preprint-similarity-search/) that allows users to identify which journals and articles are most linguistically similar to a bioRxiv or medRxiv preprint as well as observe where the preprint would be positioned within a published article landscape.
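To illustrate the general idea of finding linguistically similar documents from word-vector-based document embeddings, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how document embeddings were derived from the word2vec model, so averaging word vectors is an assumption, and the toy vectors, document names, and function names are all hypothetical.

```python
import math

# Hypothetical toy word vectors. A real pipeline would take these from a
# trained word2vec model rather than hard-coding them.
WORD_VECTORS = {
    "gene": [0.9, 0.1, 0.0],
    "expression": [0.8, 0.2, 0.1],
    "protein": [0.7, 0.3, 0.0],
    "neuron": [0.1, 0.9, 0.2],
    "synapse": [0.0, 0.8, 0.3],
}


def embed(tokens):
    """Average the vectors of known tokens (assumed embedding scheme)."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        raise ValueError("no tokens with known vectors")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_tokens, corpus):
    """Return the name of the corpus document closest to the query."""
    query_vec = embed(query_tokens)
    return max(corpus, key=lambda name: cosine(query_vec, embed(corpus[name])))


# Hypothetical tokenized documents.
published = {
    "genetics_paper": ["protein", "gene"],
    "neuro_paper": ["neuron", "synapse"],
}
preprint = ["gene", "expression"]
best_match = most_similar(preprint, published)
print(best_match)
```

Ranking journals instead of single articles would follow the same pattern if each journal were represented by, for example, the average embedding of its articles; that choice is likewise an assumption here.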
“There are two important lessons here. First, the universal availability of the internet and social networks means that this type of information can be easily disseminated independently of preprints. Second, peer-reviewed journals may not effectively function as gatekeepers: Raoult’s paper was published after alleged peer review despite its flaws and, as of today, still has not been retracted. Preprints provide an opportunity for the scientific community to discuss new work, and indeed many researchers pointed out the flaws in the Raoult manuscript in medRxiv’s comment section and elsewhere. Additionally, the “more-sober analysis” Mullins refers to showing “HCQ has no proven role” was itself a preprint posted to medRxiv in July 2020….
We and the other cofounders of medRxiv are experienced biomedical editors and thus well aware of the challenges presented by biomedical preprints. We recognize the need to balance their undoubted advantages (which have been particularly evident during the pandemic, when they have allowed researchers to quickly share information about promising research avenues and treatments) with the potential drawbacks. medRxiv papers go through extensive screening for dangerous material, and we have previously detailed the reasons for declining certain manuscripts out of an abundance of caution. Meanwhile, as the growth of preprints on bioRxiv and medRxiv demonstrates, the scientific community is becoming acclimatized to a new norm in which research is available for discussion and comment prior to formal review….”
Abstract: Purpose of review
Preprinting, or the sharing of non-peer reviewed, unpublished scholarly manuscripts, has exploded in all fields of science and medicine over the past 5 years. We searched the literature and evaluated the posting and uptake of preprint publications in the field of lipidology in bioRxiv and medRxiv servers. We also contacted the editorial offices of 20 journals that publish original research in lipidology to gauge their policies on preprints.
All 20 journals contacted indicated that they accepted preprints. As of 31 May 2021, 473 and 231 preprints in lipidology had been submitted to bioRxiv and medRxiv, respectively. About half of all lipidology preprints were related to cardiovascular, cardiometabolic, and/or metabolic diseases (CVMD) and their risk factors, but at least 12 other disease categories were also represented. 16.9% and 1.08% of medRxiv and bioRxiv preprints, respectively, were related to coronavirus disease 2019 (COVID-19).
All identified journals accept lipidology themed preprints for submission, removing any barriers authors may have had regarding preprinting. Based on growing experience with preprinting, this trend should encourage increased community feedback and facilitate higher quality lipidology research in the future.
“Preprint servers bioRxiv & medRxiv have experienced unprecedented growth and attention during these past 18 months as they have contributed to the scientific community’s collaborative response to the present international health crisis. The frequent reports in mass-media outlets alone, after January 2020, demonstrate that bioRxiv and medRxiv are becoming recognized Open Science digital repositories that are at the center of rapidly disseminating scientific research freely throughout the world.
Please join us on Oct 26th at 11am for our inaugural session during Open Access Week 2021 as the Harvard Library welcomes Richard Sever, Assistant Director Cold Spring Harbor Laboratory Press & Co-founder of the preprint servers bioRxiv and medRxiv. Dr. Sever will share his observations and reflections on the exponential growth and impact that preprints have had on advancing scientific communication during this unprecedented time.”
Abstract: Since 2013, the usage of preprints as a means of sharing research in biology has rapidly grown, in particular via the preprint server bioRxiv. Recent studies have found that journal articles that were previously posted to bioRxiv received a higher number of citations or mentions/shares on other online platforms compared to articles in the same journals that were not posted. However, the exact causal mechanism for this effect has not been established, and may in part be related to authors’ biases in the selection of articles that are chosen to be posted as preprints. We aimed to investigate this mechanism by conducting a mixed-methods survey of 1,444 authors of bioRxiv preprints, to investigate the reasons that they post or do not post certain articles as preprints, and to make comparisons between articles they choose to post and not post as preprints. We find that authors are most strongly motivated to post preprints to increase awareness of their work and increase the speed of its dissemination; conversely, the strongest reasons for not posting preprints centre around a lack of awareness of preprints and reluctance to publicly post work that has not undergone a peer review process. We additionally find weak evidence that authors preferentially select their highest quality, most novel or most significant research to post as preprints; however, authors retain an expectation that articles they post as preprints will receive more citations or be shared more widely online than articles not posted.
Abstract: The coronavirus pandemic has radically changed the scientific world. During these difficult times, standard peer-review processes could be too long for the continuously evolving knowledge about this disease. We wanted to assess whether the use of other types of network could be a faster way to disseminate the knowledge about Coronavirus disease. We retrospectively analyzed the data flow among three distinct groups of networks during the first three months of the pandemic: PubMed, preprint repositories (bioRxiv and arXiv) and social media in Italy (Facebook and Twitter). The results show a significant difference in the number of original research articles published by PubMed and preprint repositories. On social media, we observed an incredible number of physicians participating in the discussion, both in three distinct Italian-speaking Facebook groups and on Twitter. The standard scientific process of publishing articles (i.e., the peer-review process) remains the best way to get access to high-quality research. Nonetheless, this process may be too long during an emergency like a pandemic. The thoughtful use of other types of network, such as preprint repositories and social media, could be taken into consideration in order to improve the clinical management of COVID-19 patients.
“An important part of our mission at bioRxiv is to alert readers when new preprints that might interest them are posted. You can already sign up for personalized alerts on the bioRxiv Alerts/RSS page to get automatic notifications when papers that satisfy your search criteria are posted. We also provide dedicated RSS feeds and Twitter accounts for certain subject categories (Cell Biology, Neuroscience, Genetics, etc.).
But preprints can be revised, people comment on them, and ultimately most end up being published in journals. Since these are all events readers might also want to know about, we have now added an exciting new feature that allows you to Follow a preprint so that you get notified when someone comments on it, the authors post a new version, or the paper is published as a version of record in a journal.
To follow a paper, simply click on ‘Follow this preprint’ above the title, enter your email address, and choose which events you’d like to be notified about. We’ll then send you an email when the events occur – summary emails are sent once a day so you are not bombarded! …”
“PLOS keeps a watchful and enthusiastic eye on emerging research, and we update our policies as needed to address new challenges and opportunities that surface. In doing so, we work to advance our core mission and values aimed at transforming research communication and promoting Open Science.
Here, I summarize a few key updates we made between 2016-2021….”
“For several years, bioRxiv has made life easier for authors by enabling them to send their papers directly from bioRxiv to journals. This B2J (bioRxiv-to-journal) technology saves people time by automatically transferring their PDF, metadata and any source files to journal submission systems so they don’t have to upload these again at the journal website and re-enter all the information. Around 200 journals now participate in B2J, and portable peer review services such as Review Commons also participate.
We are now introducing a new delivery pipeline – B2X – that will enable authors to send their manuscripts to a variety of third-party services. These services are completely independent of bioRxiv and may include groups that assess particular aspects of manuscripts, help authors improve them, or check for compliance with specific funder requirements. The first organization to join B2X is DataSeer, a service that helps researchers navigate open data policies.
DataSeer scans articles for datasets collected and provides recommendations for how these should be shared. Authors receive a brief report on the data that should be shared and advice on metadata, file formats, and appropriate repositories. They can also obtain an Open Data certificate documenting data deposited in public repositories….”
“Because of the increasing number of articles submitted to BJP over the past year that cite preprint material, the Editor-In-Chief and Senior Editors with the full Editorial Board of BJP have undertaken a review of the issues and our discipline-relevant data to set policy on the issue of preprint citation for the Journal….
The discussion so far has highlighted the negative aspects of preprints, but it is important to be balanced in our considerations and to note that, during the COVID-19 pandemic, the availability of preprints has been viewed as a key factor in the break-neck speed with which the biomedical research community has shared research on insights regarding the biology and clinical features of the infection, resulting in the rapid and timely delivery of much needed therapeutic options (Else, 2020)….
An excellent example is the Randomised Evaluation of COVID-19 Therapy (RECOVERY) trial which showed the benefit of the simple and low-cost utility of dexamethasone that has saved many lives globally. The RECOVERY trial was published as a preprint on 22 June 2020 (Horby et al., 2020) and as a peer-reviewed article published as an epub in the New England Journal of Medicine on July 17th 2020 (RECOVERY collaborative group, 2021). Whilst it is highly likely that the preprint publication and sharing of the results saved lives during the short time between preprint posting and full publication, the data were made available to regulatory authorities and clinicians prior to full publication….
CONCLUSION: THE BJP WILL NOT ALLOW THE FORMAL CITATION OF PREPRINTS
The Editorial Board of the BJP support the principles of preprinting. However, given the potential risks associated with allowing the citation of preprints, it is our collective view, supported by feedback received from the journal’s international Editorial Board, that BJP should take all reasonable steps to avoid perpetuating these risks….
We are aware that the issue of preprint citation is under discussion at COPE and that the British Pharmacological Society is establishing a working group to review this issue more broadly across its publications. Thus, the stated editorial position will be reviewed, and if solutions to the problems highlighted above emerge, we will revisit our policy….”
Abstract: This study investigates citation patterns between 2017 and 2020 for preprints published in three preprint servers, one specializing in biology (bioRxiv), one in chemistry (ChemRxiv), and another hosting preprints in all disciplines (Research Square). Showing evidence that preprints are now regularly cited in peer reviewed journal articles, books, and conference papers, the outcomes of this investigation further substantiate the value of open science also in relation to citation-based metrics on which the evaluation of scholarship continues to rely. This analysis will be useful to inform new research-based education in today’s scholarly communication.
“Part of our mission at bioRxiv is to alert readers to reviews and discussion of preprints and support the different ways readers provide feedback to authors on their work. These include tweets, comments on preprints and community- or journal-organized peer reviews. bioRxiv improves discoverability of such efforts by linking to peer reviews, community discussions and mentions of the preprint in social and traditional media. By aggregating this information in a new dashboard, we are now making these even easier for readers to find and access.
A series of new icons now appears in the dashboard launch bar, above each Abstract, representing different sources of preprint discussion or evaluation; the numbers of each evaluation or interaction are shown, and clicking on one of the icons opens a dashboard with details of the entries in that section….”