To determine whether medRxiv data availability statements describe open or closed data—that is, whether the data used in the study is openly available without restriction—and to examine if this changes on publication based on journal data-sharing policy. Additionally, to examine whether data availability statements are sufficient to capture code availability declarations.
Observational study, following a pre-registered protocol, of preprints posted on the medRxiv repository between 25th June 2019 and 1st May 2020 and their published counterparts.
Main outcome measures
Distribution of preprinted data availability statements across nine categories, determined by a prespecified classification system. Change in the percentage of data availability statements describing open data between the preprinted and published versions of the same record, stratified by journal sharing policy. Number of code availability declarations reported in the full-text preprint which were not captured in the corresponding data availability statement.
3938 medRxiv preprints with an applicable data availability statement were included in our sample, of which 911 (23.1%) were categorized as describing open data. 379 (9.6%) preprints were subsequently published, and of these published articles, only 155 contained an applicable data availability statement. Similar to the preprint stage, a minority (59 (38.1%)) of these published data availability statements described open data. Of the 151 records eligible for the comparison between preprinted and published stages, 57 (37.7%) were published in journals which mandated open data sharing. Data availability statements more frequently described open data on publication when the journal mandated data sharing (open at preprint: 33.3%, open at publication: 61.4%) compared to when the journal did not mandate data sharing (open at preprint: 20.2%, open at publication: 22.3%).
Requiring that authors submit a data availability statement is a good first step, but is insufficient to ensure data availability. Strict editorial policies that mandate data sharing (where appropriate) as a condition of publication appear to be effective in making research data available. We would strongly encourage all journal editors to examine whether their data availability policies are sufficiently stringent and consistently enforced.
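The percentages in the results above can be re-derived from the reported counts. The following is a minimal sketch for checking that arithmetic, not the authors' analysis code. The helper `pct` and all variable names are illustrative assumptions. The percentage-point changes at the end are computed from the reported percentages and are not stated in the abstract itself.

```python
# Counts and percentages are copied from the results paragraph above.
# The helper and variable names are hypothetical, chosen for illustration.

def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

preprints_with_statement = 3938   # preprints with an applicable statement
open_at_preprint = 911            # of those, describing open data
published = 379                   # preprints subsequently published
published_with_statement = 155    # published articles with an applicable statement
open_at_publication = 59          # of those, describing open data
eligible_for_comparison = 151     # records compared across both stages
in_mandating_journals = 57        # of those, in journals mandating open data

shares = {
    "open data at preprint": pct(open_at_preprint, preprints_with_statement),
    "subsequently published": pct(published, preprints_with_statement),
    "open data at publication": pct(open_at_publication, published_with_statement),
    "published in mandating journals": pct(in_mandating_journals, eligible_for_comparison),
}

# Derived here (not reported in the abstract): change in the share of
# open-data statements between preprint and publication, in percentage points.
change_mandated = round(61.4 - 33.3, 1)
change_not_mandated = round(22.3 - 20.2, 1)

for label, value in shares.items():
    print(f"{label}: {value}%")
print(f"change with mandate: {change_mandated} points")
print(f"change without mandate: {change_not_mandated} points")
```

The recomputed shares agree with the figures quoted in the abstract. The derived changes make the contrast between mandating and non-mandating journals explicit.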
“Authors who adopt transparent practices for an article in Conservation Biology are now able to select from 3 open science badges: open data, open materials, and preregistration. Badges appear on published articles as visible recognition and highlight these efforts to the research community. There is an emerging body of literature regarding the influences of badges, for example, an increased number of articles with open data (Kidwell et al 2016) and increased rate of data sharing (Rowhani-Farid et al. 2018). However, in another study, Rowhani-Farid et al. (2020) found that badges did not “noticeably motivate” researchers to share data. Badges, as far as we know, are the only data-sharing incentive that has been tested empirically (Rowhani-Farid et al. 2017).
Rates of data and code sharing are typically low (Herold 2015; Roche et al 2015; Archmiller et al 2020; Culina et al 2020). Since 2016, we have asked authors of contributed papers, reviews, method papers, practice and policy papers, and research notes to tell us whether they “provided complete machine and human-readable data and computer code in Supporting Information or on a public archive.” Authors of 31% of these articles published in Conservation Biology said they shared their data or code, and all authors provide human-survey instruments in Supporting Information or via a citation or online link (i.e., shared materials)….”
Abstract: This communication refers to the retractions of two high-profile COVID-19 papers from top medical journals after the data analytics company declined to share the raw data behind the papers. In this commentary, we emphasize that journals should require authors to share their primary data. This will ensure data integrity and transparency of the research findings, and help prevent publication fraud.
“Changes are afoot in the way the scientific community is approaching the practice and reporting of research. Spurred by concerns about the fundamental reliability (i.e., replicability), or rather lack thereof, of contemporary psychological science (e.g., Open Science Collaboration, 2015), as well as how we go about our business (e.g., Gelman & Loken, 2014), several recommendations have been furthered for increasing the rigor of the published research through openness and transparency. The Journal has long prized and published the type of research with features, like large sample sizes (Fraley & Vazire, 2014), that has fared well by replicability standards (Soto, 2019). The type of work traditionally published here, often relying on longitudinal samples, large public datasets (e.g., Midlife in the United States Study), or complex data collection designs (e.g., ambulatory assessment and behavioral coding) did not seem to fit neatly into the template of the emerging transparency practices. However, as thinking in the open science movement has progressed and matured, we have decided to full-throatedly endorse these practices and join the growing chorus of voices that are encouraging and rewarding more transparent work in psychological science. We believe this can be achieved while maintaining the “big tent” spirit of personality research at the Journal with a broad scope in content, methods, and analytical tools that has made it so special and successful all of these years. Moving forward, we will be rigorously implementing a number of procedures for openness and transparency consistent with the Transparency and Open Science Promotion (TOP) Guidelines.
The TOP Guidelines are organized into eight standards, each of which can be implemented at three levels of stringency (Nosek et al., 2015). In what follows, we outline the initial TOP Standards Levels adopted by the Journal and the associated rationale. Generally, we have adopted Level 2 standards, as we believe these strike a desirable balance between compelling a high degree of openness and transparency while not being overly onerous and a deterrent for authors interested in the Journal as an outlet for their work….”
Abstract: PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy established in 2014. Despite this policy, and more data sharing policies being introduced by other organizations, best practices for data sharing are adopted by a minority of researchers in their publications. Problems with effective research data sharing persist and these problems have been quantified by previous research as a lack of time, resources, incentives, and/or skills to share data.
In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data. In May-June 2020 we surveyed researchers from Europe and North America to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 completed and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts. Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important and respondents were reasonably well satisfied in their ability to accomplish them. Notably, this included tasks associated with best data sharing practice, such as use of data repositories. However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered to be best practice. We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied in their ability to accomplish, even if many do not attempt this task. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by working to seamlessly integrate existing solutions that reduce the effort or behaviour change involved in some tasks, and focusing on advocacy and education around the benefits of sharing data.
There may however be opportunities – unmet researcher needs – in relation to better supporting data reuse, which could be met in part by strengthening data sharing policies of journals and publishers, and improving the discoverability of data associated with published articles.
“Researchers are satisfied with their ability to share their own research data but may struggle with accessing other researchers’ data – according to PLOS research released as a preprint this week. Therefore, to increase data sharing in a findable and accessible way, PLOS will focus on better integrating existing data repositories and promoting their benefits rather than creating new solutions. We also call on the scholarly publishing industry to improve journal data sharing policies to better support researchers’ needs….”
“Our BABCP journals have for some time been supportive of open science in its various forms. We are now taking the next steps towards this in terms of our policies and practices. For some things we are transitioning to the changes (but would encourage our contributors to embrace these as early as possible), and in others we are implementing things straight away. This is part of the global shift to open practices in science, and has many benefits and few, if any, drawbacks. See for example http://www.unesco.or/e//ommunication-and-informatio/ortals-and-platform/oa/pen-science-movement/
One of the main drivers for open science has been the recent ‘reproducibility crisis’, which crystallised long-standing concerns about a range of biases within and across research publication. Open science and research transparency will provide the means to reduce the impact of such biases, and can reasonably be considered to be a paradigm change. There are benefits beyond dealing with problems, however.
McKiernan et al. (2016) for example suggest that ‘open research is associated with increases in citations, media attention, potential collaborators, job opportunities and funding opportunities’. This is, of course, from a researcher-focused perspective. The BABCP and the Journal Editors take the view that open and transparent research practices will have the greatest long-term impact on service users both directly and indirectly through more accurate reporting and interpretation of research and its applications by CBT practitioners. So what are the practical changes we are implementing in partnership with our publisher, Cambridge University Press?…”
“Question What are the rates of declared and actual sharing of clinical trial data after the medical journals’ implementation of the International Committee of Medical Journal Editors data sharing statement requirement?
Findings In this cross-sectional study of 487 clinical trials published in JAMA, Lancet, and New England Journal of Medicine, 334 articles (68.6%) declared data sharing. Only 2 (0.6%) individual-participant data sets were actually deidentified and publicly available on a journal website, and among the 89 articles declaring that individual-participant data would be stored in secure repositories, data from only 17 articles were found in the respective repositories as of April 10, 2020.
Meaning These findings suggest that there is a wide gap between declared and actual sharing of clinical trial data.”
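The gap between declared and actual sharing in the findings above can be put in numbers using the reported counts. This is a rough sketch under stated assumptions, not part of the cited study. The helper `pct` and the variable names are illustrative, and the repository fulfilment rate is derived here from the reported counts (17 of 89) rather than quoted from the article.

```python
# Counts are copied from the "Findings" paragraph above.
# The helper and variable names are hypothetical, chosen for illustration.

def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

trials = 487                 # clinical trials examined
declared_sharing = 334       # articles declaring data sharing
public_on_journal_site = 2   # deidentified data sets publicly available
declared_repository = 89     # articles declaring storage in a secure repository
found_in_repository = 17     # of those, data actually found (as of April 10, 2020)

declared_rate = pct(declared_sharing, trials)
public_rate = pct(public_on_journal_site, declared_sharing)
# Derived, not reported in the article: share of repository declarations
# for which data could actually be located.
repository_fulfilment = pct(found_in_repository, declared_repository)

print(f"declared sharing: {declared_rate}%")
print(f"publicly available among declaring articles: {public_rate}%")
print(f"repository declarations fulfilled: {repository_fulfilment}%")
```

The first two values match the reported 68.6% and 0.6%. The 0.6% figure only reproduces with the 334 declaring articles as the denominator, not all 487 trials.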
“ESA has adopted a society-wide open research policy for its publications to further support scientific exploration and preservation, allow a full assessment of published research, and streamline policies across our family of journals. An open research policy provides full transparency for scientific data and code, facilitates replication and synthesis, and aligns ESA journals with current standards. As of Feb. 1, 2021, all new manuscript submissions to ESA journals must abide by the following policy:
As a condition for publication in ESA journals, all underlying data and statistical code pertinent to the results presented in the publication must be made available in a permanent, publicly accessible data archive or repository, with rare exceptions (see “Details” for more information). Archived data and statistical code should be sufficiently complete to allow replication of tables, graphs, and statistical analyses reported in the original publication, and perform new or meta-analyses. As such, the desire of authors to control additional research with these data and/or code shall not be grounds for withholding material. …”