Evaluating the (in)accessibility of data behind papers in astronomy

Abstract:  This paper presents results of a survey of authors of journal articles published over several decades in astronomy. The study focuses on determining the characteristics and accessibility of data behind papers, referring to the spectrum of raw and derived data that would be needed to validate the results of a particular published article as a capsule of scientific knowledge. Curating the data behind papers can arguably lead to new discoveries through reuse. However, as shown through related research and confirmed by the results of the present study, a fully accessible portrait of the data behind papers is often unavailable. These findings have implications for reusability efforts and are presented alongside a discussion of open science.

Access to unpublished protocols and statistical analysis plans of randomised trials | Trials

Abstract:  Background

Access to protocols and statistical analysis plans (SAPs) increases the transparency of randomised trials by allowing readers to identify and interpret unplanned changes to study methods; however, they are often not made publicly available. We sought to determine how often study investigators would share unavailable documents upon request.


We used trials from two previously identified cohorts (cohort 1: 101 trials published in high impact factor journals between January and April of 2018; cohort 2: 100 trials published in June 2018 in journals indexed in PubMed) to determine whether study investigators would share unavailable protocols/SAPs upon request. We emailed corresponding authors of trials with no publicly available protocol or SAP up to four times.


Overall, 96 of 201 trials (48%) across the two cohorts had no publicly available protocol or SAP (11/101 high-impact cohort, 85/100 PubMed cohort). In total, 8/96 authors (8%) shared some trial documentation (protocol only [n = 5]; protocol and SAP [n = 1]; excerpt from protocol [n = 1]; research ethics application form [n = 1]). We received protocols for 6/96 trials (6%), and a SAP for 1/96 trials (1%). Seventy-three authors (76%) did not respond, 7 authors (7%) responded but declined to share a protocol or SAP, and eight email addresses (8%) were invalid. A total of 329 emails were sent (an average of 41 emails for every trial that sent documentation). After emailing authors, the total number of trials with an available protocol increased by only 3%, from 52% to 55%.


Most study investigators did not share their unpublished protocols or SAPs upon direct request. Alternative strategies are needed to increase transparency of randomised trials and ensure access to protocols and SAPs.

ARC bans preprints, again

“The information pack for Excellence in Research for Australia 2023 advises that the Australian Research Council consulted on including preprints, and feedback “was overwhelmingly supportive of not including preprints as an eligible research output type.”

And lest anyone miss the point, “as a consequence, preprints will not be eligible for ERA submission.”

This follows last year’s fiasco when the ARC enforced a rule that many Research Offices had missed, banning pre-prints from grant applications, generating outrage on behalf of excluded applicants who did not know about it (umpteen CMM stories last year but September 23 covers it)….”

Evaluation of publication bias for 12 clinical trials of molnupiravir to treat SARS-CoV-2 infection in 13,694 patients | Research Square

Abstract:  Introduction:

During the COVID-19 pandemic, Merck Sharp and Dohme (MSD) acquired the global licensing rights for molnupiravir. MSD allowed Indian manufacturers to produce the drug under voluntary license. Indian companies conducted local clinical trials to evaluate the efficacy and safety of molnupiravir.


Searches of the Clinical Trials Registry-India (CTRI) were conducted to find registered trials of molnupiravir in India. Subsequent investigations were performed to assess which clinical trials had been presented or published.


According to the CTRI, 12 randomised trials of molnupiravir were conducted in India, in 13,694 patients, starting in late May 2021. By July 2022, none of the 12 trials had been published, one was presented at a medical conference, and two were announced in press releases suggesting failure of treatment. Results from three trials were shared with the World Health Organisation. One of these three trials had many unexplained results, with effects of treatment significantly different from the MSD MOVE-OUT trial in a similar population.


The lack of results runs counter to established practices and leaves a situation where approximately 90% of the global data on molnupiravir has not been published in any form. Access to patient-level databases is required to investigate risks of bias or medical fraud.

EU Clinical Trials Register – Update

“Following the issuing of the Joint Letter by the European Commission, EMA and HMA, National Competent Authorities and European Medicines Agency have sent reminders to sponsors who were not compliant with the European Commission guideline on results posting. Thanks to these reminders, the percentage of posted results substantially increased. However, for some trials the reminders were not successful: detailed lists of these trials can be found here. …”

Clinical Trial Registry Errors Undermine Transparency | The Scientist Magazine®

“Confusion about terminology on the world’s largest clinical trials registry may be delaying the release of drug trial results and undermining rules designed to promote transparency, an investigation by The Scientist has found. 

Key study dates and other information are entered into the ClinicalTrials.gov database by trial researchers or sponsors, and are used by US science and regulatory agencies to determine legal deadlines by which results must be reported. The rules are supposed to ensure timely public access to findings about a potential therapy’s harms and benefits, as well as provide the scientific community with an up-to-date picture of the status of clinical research.

But neither the agencies nor staff overseeing the database routinely monitor individual trial records for veracity, instead relying on the person in charge of a given record to correctly declare information such as when a study ends and how many people were enrolled. …”

Information Retention in the Multi-platform Sharing of Science

Abstract:  The public interest in accurate scientific communication, underscored by recent public health crises, highlights how content often loses critical pieces of information as it spreads online. However, multi-platform analyses of this phenomenon remain limited due to challenges in data collection. Collecting mentions of research tracked by Altmetric LLC, we examine information retention in the over 4 million online posts referencing 9,765 of the most-mentioned scientific articles across blog sites, Facebook, news sites, Twitter, and Wikipedia. To do so, we present a burst-based framework for examining online discussions about science over time and across different platforms. To measure information retention we develop a keyword-based computational measure comparing an online post to the scientific article’s abstract. We evaluate our measure using ground-truth data labeled by within-field experts. We highlight three main findings: first, we find a strong tendency towards low levels of information retention, following a distinct trajectory of loss except when bursts of attention begin in social media. Second, platforms show significant differences in information retention. Third, sequences involving more platforms tend to be associated with higher information retention. These findings highlight a strong tendency towards information loss over time – posing a critical concern for researchers, policymakers, and citizens alike – but suggest that multi-platform discussions may improve information retention overall.
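The abstract does not spell out the authors’ actual keyword measure; as a rough illustration only, a minimal retention score of this kind can be sketched as the fraction of an abstract’s keywords that survive in an online post (the tokenization and stopword list below are assumptions, not the paper’s method):

```python
import re

# Illustrative stopword list (an assumption; the paper's preprocessing is not described).
STOPWORDS = {"the", "a", "an", "of", "and", "or", "in", "to", "is", "are",
             "we", "for", "on", "with", "that", "this", "as", "by", "how"}

def keywords(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords and short words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}

def retention(abstract, post):
    """Fraction of the abstract's keywords that also appear in the online post."""
    kw = keywords(abstract)
    if not kw:
        return 0.0
    return len(kw & keywords(post)) / len(kw)

abstract = "We measure information retention of scientific articles across platforms."
post = "New study measures how scientific articles lose information online."
score = retention(abstract, post)
```

A score near 1 would mean the post preserves most of the abstract’s key terms; scores near 0 correspond to the information loss the study reports.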


Another extraordinary year for citation impact | Research Information

“Responding to this year’s Journal Citation Reports, Nandita Quaderi explains how Covid-19 continues to affect the citation network, and introduces a new kind of citation distortion…

We have found a new trend in citation distortion, which we have defined as ‘self-stacking’. This is where the journal contains one or more documents with citations that are highly concentrated to the JIF numerator of the title itself, for example a review or retrospective which predominantly includes citations that would contribute to the journal’s JIF. This is the first year we have formally defined the criteria for self-stacking suppression, and as such we have made the decision to issue a warning to six journals rather than suppress the journal’s JIF. Going forward, continued journal self-stacking will result in suppression of JIF.  …”

Reproducibility of COVID-19 pre-prints | SpringerLink

Abstract:  To examine the reproducibility of COVID-19 research, we create a dataset of pre-prints posted to arXiv, bioRxiv, and medRxiv between 28 January 2020 and 30 June 2021 that are related to COVID-19. We extract the text from these pre-prints and parse them looking for keyword markers signaling the availability of the data and code underpinning the pre-print. For the pre-prints that are in our sample, we are unable to find markers of either open data or open code for 75% of those on arXiv, 67% of those on bioRxiv, and 79% of those on medRxiv.
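The abstract describes parsing pre-print text for keyword markers of data and code availability but does not list the markers used. A minimal sketch of that kind of scan, with hypothetical marker phrases standing in for the study’s actual keyword list:

```python
import re

# Hypothetical marker phrases; the study's real keyword list is not given in the abstract.
DATA_MARKERS = [r"data (?:are|is) (?:publicly )?available", r"zenodo", r"osf\.io", r"figshare"]
CODE_MARKERS = [r"code (?:are|is) (?:publicly )?available", r"github\.com", r"source code"]

def has_markers(text, patterns):
    """Return True if any marker pattern occurs in the (lowercased) text."""
    text = text.lower()
    return any(re.search(p, text) for p in patterns)

def classify(preprint_text):
    """Flag a pre-print as showing open-data and/or open-code markers."""
    return {
        "open_data": has_markers(preprint_text, DATA_MARKERS),
        "open_code": has_markers(preprint_text, CODE_MARKERS),
    }
```

Pre-prints for which neither flag is set would fall into the 67–79% the authors report as lacking markers of open data or open code; note that keyword scans of this sort can miss availability statements phrased differently, which the true rates would need to account for.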

A meta-research study of randomized controlled trials found infrequent and delayed availability of protocols – ScienceDirect

Abstract:  Objectives

Availability of randomized controlled trial (RCT) protocols is essential for the interpretation of trial results and research transparency.

Study Design and Setting

In this study, we determined the availability of RCT protocols approved in Switzerland, Canada, Germany, and the United Kingdom in 2012. For these RCTs, we searched PubMed, Google Scholar, Scopus, and trial registries for publicly available protocols and corresponding full-text publications of results. We determined the proportion of RCTs with (1) publicly available protocols, (2) publications citing the protocol, and (3) registries providing a link to the protocol. A multivariable logistic regression model explored factors associated with protocol availability.


Three hundred twenty-six RCTs were included, of which 118 (36.2%) made their protocol publicly available; 56 (47.5%, 56 of 118) were provided as a peer-reviewed publication and 48 (40.7%, 48 of 118) as supplementary material. A total of 90.9% (100 of 110) of the protocols were cited in the main publication, and 55.9% (66 of 118) were linked in the clinical trial registry. Larger sample size (>500; odds ratio [OR] = 5.90, 95% confidence interval [CI], 2.75–13.31) and investigator sponsorship (OR = 1.99, 95% CI, 1.11–3.59) were associated with increased protocol availability. Most protocols were made available shortly before the publication of the main results.


RCT protocols should be made available at an early stage of the trial.

Many researchers say they’ll share data — but don’t

“Most biomedical and health researchers who declare their willingness to share the data behind journal articles do not respond to access requests or hand over the data when asked, a study reports [1]. …

But of the 1,792 manuscripts for which the authors stated they were willing to share their data, more than 90% of corresponding authors either declined or did not respond to requests for raw data (see ‘Data-sharing behaviour’). Only 14%, or 254, of the contacted authors responded to e-mail requests for data, and a mere 6.7%, or 120 authors, actually handed over the data in a usable format. The study was published in the Journal of Clinical Epidemiology on 29 May….

Puljak’s results square with those of a study that Danchev led, which found low rates of data sharing by authors of papers in leading medical journals that stipulate all clinical trials must share data [2]. …

Past research suggests that some fields, such as ecology, embrace data sharing more than others. But multiple analyses of COVID-19 clinical trials — including some from Li [4,5] and Tan [6] — have reported that anywhere from around half to 80% of investigators are unwilling or not planning to share data freely….

To encourage researchers to prepare their data, Li says, journals could make data-sharing statements more prescriptive. They could require authors to detail where they will share raw data, who will be able to access it, and when and how.


Funders could also raise the bar for data sharing. The US National Institutes of Health, in an effort to curb wasteful, irreproducible research, will soon mandate that grant applicants include a data-management and sharing plan in their applications. Eventually, they will be required to share data publicly….”

No Open Access Today, Anthropology: On the latest AAA-Wiley Announcement | anthro{dendum}

“After years of back and forth, it seemed that the AAA was finally going to make the shift to Open Access. But, the cheering didn’t last long. According to the recent announcement from the AAA, the move to open access is going to wait a bit longer (again). Why? Because the association has, once again, decided to continue its partnership with Wiley-Blackwell….

So they took a year, got input from many sources, including the Publishing Futures Committee and the Executive Board, drafted an RFP for potential publishers, and then evaluated those proposals. The result? According to AAA Executive Director Ed Liebow, “Wiley best aligned with the core values of the AAA’s publishing program – quality, breadth, accessibility, equity, and sustainability.”

It is completely unclear how that decision was actually made. …”


Judge strikes down Maryland law requiring publishers to make ebooks available to libraries | WJLA

“In a legal case closely watched by libraries and the publishing industry, a federal judge in Maryland struck down a state law requiring publishers to make e-books available on “reasonable terms” to libraries if they were also being offered to the general public.

The Association of American Publishers, the industry’s trade organization, had contended that the bill violated the United States Copyright Act by allowing states to regulate publishing transactions. The Maryland law was passed with overwhelming support a year ago, and included provisions for fines of $10,000 and higher.

Maryland U.S. District Judge Deborah L. Boardman issued her decision Monday, four months after she had enjoined the Maryland Act, writing at the time that the law’s “practical impact” would force publishers “to offer their products to libraries — whether they want to or not — lest they face a civil enforcement action or criminal prosecution.” …”