Attitudes, behaviours and experiences of authors of COVID-19 preprints

Abstract:  The COVID-19 pandemic caused a rise in preprinting, apparently triggered by the need for open and rapid dissemination of research outputs. We surveyed authors of COVID-19 preprints to learn about their experience of preprinting as well as publishing in a peer-reviewed journal. A key aim was to consider preprints in terms of their effectiveness for authors to receive feedback on their work. We also aimed to compare the impact of feedback on preprints with the impact of comments of editors and reviewers on papers submitted to journals. We observed a high rate of new adopters of preprinting who reported positive intentions regarding preprinting their future work. This allows us to posit that the boost in preprinting may have a structural effect that will last after the pandemic. We also saw a high rate of feedback on preprints but mainly through “closed” channels – directly to the authors. This means that preprinting was a useful way to receive feedback on research, but the value of feedback could be increased further by facilitating and promoting “open” channels for preprint feedback. At the same time, almost a quarter of the preprints that received feedback received comments resembling journal peer review. This shows the potential of preprint feedback to provide valuable detailed comments on research. However, journal peer review resulted in a higher rate of major changes in the papers surveyed, suggesting that the journal peer review process has significant added value compared to preprint feedback.


The National Health and Medical Research Council (NHMRC) of Australia joins cOAlition S | Plan S

Australia’s National Health and Medical Research Council (NHMRC) is the first Australian organisation to join cOAlition S and the country’s first funding agency to introduce the requirement that scholarly publications arising from the research it funds must be made freely available and accessible.

Patient outcomes, open access: Ginny Barbour sets MJA agenda | InSight+

“There’s no doubt for me that we are moving along a trajectory where open access is absolutely going to be the outcome. The question is just how we get there and how quickly we get there.

Just a couple of weeks ago, the Office of Science and Technology Policy from the United States White House put out an edict that all federally funded research in the US must be made open access by 2026. In Australia already, we have a number of moves that are going in that direction.

We know that our Chief Scientist Dr Cathy Foley is looking at that closely, and the [National Health and Medical Research Council] and the [Australian Research Council] have open access policies.

I think it’s fair to say that this is a topic of great interest and Australia probably needs to move a little bit quicker.

“For the MJA [Medical Journal of Australia], there’s no question that we want open access. We want that research to be read; it needs to be used and reused, not just by practitioners but by patients. Open access can only be a good thing for the Journal.”

OpenPBTA: An Open Pediatric Brain Tumor Atlas | bioRxiv

Abstract:  Pediatric brain and spinal cancers are the leading disease-related cause of death in children; thus, we urgently need curative therapeutic strategies for these tumors. To accelerate such discoveries, the Children’s Brain Tumor Network and Pacific Pediatric Neuro-Oncology Consortium created a systematic process for tumor biobanking, model generation, and sequencing with immediate access to harmonized data. We leverage these data to create OpenPBTA, an open collaborative project which establishes over 40 scalable analysis modules to genomically characterize 1,074 pediatric brain tumors. Transcriptomic classification reveals that TP53 loss is a significant marker for poor overall survival in ependymomas and H3 K28-altered diffuse midline gliomas and further identifies universal TP53 dysregulation in mismatch repair-deficient hypermutant high-grade gliomas. OpenPBTA is a foundational analysis platform actively being applied to other pediatric cancers and to inform molecular tumor board decision-making, making it an invaluable resource to the pediatric oncology community.


XCIST – an open access x-ray/CT simulation toolkit – IOPscience

Abstract:  Objective: X-ray-based imaging modalities including mammography and computed tomography (CT) are widely used in cancer screening, diagnosis, staging, treatment planning, and therapy response monitoring. Over the past few decades, improvements to these modalities have resulted in substantially improved efficacy and efficiency, and substantially reduced radiation dose and cost. However, such improvements have evolved more slowly than would be ideal because lengthy preclinical and clinical evaluation is required. In many cases, new ideas cannot be evaluated due to the high cost of fabricating and testing prototypes. Wider availability of computer simulation tools could accelerate development of new imaging technologies. This paper introduces the development of a new open-access simulation environment for X-ray-based imaging. Approach: The X-ray-based Cancer Imaging Simulation Toolkit (XCIST) is developed in the context of cancer imaging, but can more broadly be applied. XCIST is physics-based, written in Python and C/C++, and currently consists of three major subsets: digital phantoms, the simulator itself (CatSim), and image reconstruction algorithms; planned future features include a fast dose-estimation tool and rigorous validation. To enable broad usage and to model and evaluate new technologies, XCIST is easily extendable by other researchers. To demonstrate XCIST’s ability to produce realistic images and to show the benefits of using XCIST for insight into the impact of separate physics effects on image quality, we present exemplary simulations by varying contributing factors such as noise and sampling. Main Results: The capabilities and flexibility of XCIST are demonstrated, showing easy applicability to specific simulation problems. Geometric and X-ray attenuation accuracy are shown, as well as XCIST’s ability to model multiple scanner and protocol parameters, and to attribute fundamental image quality characteristics to specific parameters. Significance: This work represents an important first step toward the goal of creating an open-access platform for simulating existing and emerging X-ray-based imaging systems.

LSHTM Press launches with a mission of equity in publishing in global health | LSHTM

“A new publishing platform for open access biomedical research has launched. LSHTM Press will provide an open access platform to publish peer-reviewed research and high-quality educational resources, in accordance with the LSHTM mission to improve health and health equity in the UK and worldwide.

The Press is a new initiative, developed in response to the increasing costs of publishing open access, and the many mandates and policies from funders and governments around the world. It will facilitate innovative and experimental publishing methods while striving towards equity in academic publishing in global health.

It launches with and will continue to develop a focus on equity, diversity and inclusion (EDI), and is in alignment with central LSHTM vision and values. Two dedicated EDI leads sit on the LSHTM Press Steering Committee, and the whole team is committed to promoting inclusivity and reducing barriers….”

Fifteen Questions: Marc Lipsitch on Covid Modeling, Open-Access Science, and Latte Art

“I’ve been arguing for open-access and favoring open-access journals for my own work — not exclusively, but a lot. I’ve been working as an editor and reviewer on open-access journals. That’s the wave of the future. It’s outrageous that publicly funded research is paywalled. Most journals add almost no value to the papers they publish. “Epidemiology,” one of the major journals in our field, really edits it carefully and improves the paper beyond the peer review. But there are other ways to pay for that. It’s long overdue….”

Remarks by President Biden on the Cancer Moonshot Initiative – The White House

“When I led the Cancer Moonshot as Vice President, one of the biggest issues I talked about was how federally funded cancer researchers were not sharing their results with their peers or the public because they wanted to have the answer. You all know it.

As I mentioned earlier, we made federally funded cancer research more available to any patient, to any doctor anywhere for free.

And today, as President, we’re making sure that transparency applies to all federally funded science beyond just cancer….”

Health Science Policy Analyst

“Duties: As a Health Science Policy Analyst, some of your duties and responsibilities will include, but are not limited to the following: Conduct the development and implementation of the trans-NIH GDS resource by utilizing expertise and resources in genomics, bioinformatics, biomedical data analysis, policy implementation, and large-scale coordination and collaboration. Participate in the development of comprehensive information, education, and training resources in the areas of GDS compliance. Respond to inquiries from NIH staff, NIH data access committees and outside investigators related to implementation of the GDS policy and provide support to NIH data access committees, data repository staff and other key stakeholders. Perform analysis and evaluation of significant problems or questions pertaining to NIH GDS activities and policies. Plan, conduct, coordinate, and evaluate extensive long-range studies and develop analyses regarding GDS policy….”

Open science and data sharing in trauma research: Developing a trauma-informed protocol for archiving sensitive qualitative data. – PsycNET

Abstract:  Objective: The open science movement seeks to make research more transparent, and to that end, researchers are increasingly expected or required to archive their data in national repositories. In qualitative trauma research, data sharing could compromise participants’ safety, privacy, and confidentiality because narrative data can be more difficult to de-identify fully. There is little guidance in the traumatology literature regarding how to discuss data-sharing requirements with participants during the informed consent process. Within a larger research project in which we interviewed assault survivors, we developed and evaluated a protocol for informed consent for qualitative data sharing and engaging participants in data de-identification. Method: We conducted qualitative interviews with N = 32 adult sexual assault survivors regarding (a) how to conduct informed consent for data sharing, (b) whether participants should have input on sharing their data, and (c) whether they wanted to redact information from their transcripts prior to archiving. Results: No potential participants declined participation after learning about the archiving mandate. Survivors indicated that they wanted input on archiving because the interview is their story of trauma and abuse and it would be disempowering not to have control over how this information was shared and disseminated. Survivors also wanted input on this process to help guard their privacy, confidentiality, and safety. None of the participants elected to redact substantive data prior to archiving. Conclusions: Engaging participants in the archiving process is a feasible practice that is important and empowering for trauma survivors. (PsycInfo Database Record (c) 2022 APA, all rights reserved)

Is DIA proteomics data FAIR? Current data sharing practices, available bioinformatics infrastructure and recommendations for the future – Jones – PROTEOMICS – Wiley Online Library

Abstract:  Data independent acquisition (DIA) proteomics techniques have matured enormously in recent years, thanks to multiple technical developments in e.g. instrumentation and data analysis approaches. However, there are many improvements that are still possible for DIA data in the area of the FAIR (Findability, Accessibility, Interoperability and Reusability) data principles. These include more tailored data sharing practices and open data standards, since public databases and data standards for proteomics were mostly designed with DDA data in mind. Here we first describe the current state of the art in the context of FAIR data for proteomics in general, and for DIA approaches in particular. For improving the current situation for DIA data, we make the following recommendations for the future: (i) development of an open data standard for spectral libraries; (ii) make mandatory the availability of the spectral libraries used in DIA experiments in ProteomeXchange resources; (iii) improve the support for DIA data in the data standards developed by the Proteomics Standards Initiative; and (iv) improve the support for DIA datasets in ProteomeXchange resources, including more tailored metadata requirements.


Data-sharing and re-analysis for main studies assessed by the European Medicines Agency—a cross-sectional study on European Public Assessment Reports | BMC Medicine | Full Text

Abstract:  Background

Transparency and reproducibility are expected to be normative practices in clinical trials used for decision-making on marketing authorisations for new medicines. This registered report introduces a cross-sectional study aiming to assess inferential reproducibility for main trials assessed by the European Medicines Agency.


Methods

Two researchers independently identified all studies on new medicines, biosimilars and orphan medicines given approval by the European Commission between January 2017 and December 2019, categorised as ‘main studies’ in the European Public Assessment Reports (EPARs). Sixty-two of these studies were randomly sampled. One researcher retrieved the individual patient data (IPD) for these studies and prepared a dossier for each study, containing the IPD, the protocol and information on the conduct of the study. A second researcher who had no access to study reports used the dossier to run an independent re-analysis of each trial. All results of these re-analyses were reported in terms of each study’s conclusions, p-values, effect sizes and changes from the initial protocol. A team of two researchers not involved in the re-analysis compared results of the re-analyses with published results of the trial.

Results

Two hundred ninety-two main studies in 173 EPARs were identified. Among the 62 studies randomly sampled, we received IPD for 10 trials. The median number of days between data request and data receipt was 253 [interquartile range 182–469]. For these ten trials, we identified 23 distinct primary outcomes for which the conclusions were reproduced in all re-analyses. Therefore, 10/62 trials (16% [95% confidence interval 8% to 28%]) were reproduced, as the 52 studies without available data were considered non-reproducible. There was no change from the original study protocol regarding the primary outcome in any of these ten studies. Spin was observed in the report of one study.

Conclusions

Despite their results supporting decisions that affect millions of people’s health across the European Union, most main studies used in EPARs lack transparency and their results are not reproducible for external researchers. Re-analyses of the few trials with available data showed very good inferential reproducibility.
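
The headline figure in the abstract above (10 of 62 sampled trials reproduced, reported as 16% with a 95% confidence interval of 8% to 28%) is a binomial proportion. The abstract does not say which interval method was used, so the following is only a minimal sketch that assumes an exact (Clopper-Pearson) interval; the function names are illustrative and are not taken from the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05, tol=1e-10):
    """Exact two-sided interval for k successes in n trials.

    Assumption: the abstract does not name its interval method;
    Clopper-Pearson is used here purely for illustration.
    """

    def solve(f, target):
        # f is decreasing in p, so bisect for the p where f(p) == target.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


reproduced, sampled = 10, 62  # counts taken from the abstract
point_estimate = reproduced / sampled
ci_low, ci_high = clopper_pearson(reproduced, sampled)
print(f"{point_estimate:.1%} (95% CI {ci_low:.1%} to {ci_high:.1%})")
```

Other interval methods (for example Wilson or a normal approximation) give slightly different bounds for a sample this small, so the printed values are best read as a plausibility check against the reported interval, not as a reconstruction of the authors' analysis.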

Comprehensive review of publicly available colonoscopic imaging databases for artificial intelligence research: availability, accessibility and usability

Abstract:  Background and aims: Publicly available databases containing colonoscopic imaging data are valuable resources for artificial intelligence (AI) research. Currently, little is known regarding the available number and content of these databases. This review aimed to describe the availability, accessibility and usability of publicly available colonoscopic imaging databases, focusing on polyp detection, polyp characterization and quality of colonoscopy. Methods: A systematic literature search was performed in MEDLINE and Embase to identify AI-studies describing publicly available colonoscopic imaging datasets published after 2010. Second, a targeted search using Google’s Dataset Search, Google Search, GitHub and Figshare was done to identify datasets directly. Datasets were included if they contained data about polyp detection, polyp characterization or quality of colonoscopy. To assess accessibility of datasets the following categories were defined: open access, open access with barriers and regulated access. To assess the potential usability of the included datasets, essential details of each dataset were extracted using a checklist derived from the CLAIM-checklist. Results: We identified 22 datasets with open access, 3 datasets open access with barriers and 15 datasets with regulated access. The 22 open access databases contained 19,463 images and 952 videos. Nineteen of these databases focused on polyp detection, localization and/or segmentation, six on polyp characterization and three on quality of colonoscopy. Only half of these databases have been used by other researchers to develop, train or benchmark their AI-systems. Although technical details were in general well-reported, important details such as polyp and patient demographics and the annotation process were underreported in almost all databases. Conclusion: This review provides greater insight on public availability of colonoscopic imaging databases for AI-research. Incomplete reporting of important details limits the ability of researchers to assess the usability of the current databases.