A decade of surveys on attitudes to data sharing highlights three factors for achieving open science | Impact of Social Sciences

“Over a 10 year period Carol Tenopir of DataONE and her team conducted a global survey of scientists, managers and government workers involved in broad environmental science activities about their willingness to share data and their opinion of the resources available to do so (Tenopir et al., 2011, 2015, 2018, 2020). Comparing the responses over that time shows a general increase in the willingness to share data (and thus engage in Open Science)….

The most surprising result was that a higher willingness to share data corresponded with a decrease in satisfaction with data sharing resources across nations (e.g., skills, tools, training) (Fig.1). That is, researchers who did not want to share data were satisfied with the available resources, and those that did want to share data were dissatisfied. Researchers appear to only discover that the tools are insufficient when they begin the hard work of engaging in open science practices. This indicates that a cultural shift in the attitudes of researchers needs to precede the development of support and tools for data management….

Mandated requirements to share data really do work. This effect was shown in the surveys, as government researchers were consistently far more willing to share data than those in academia or corporations, and this willingness to share increased substantially from 2011 to 2019….

Researchers working in academia were less willing to share than those in government, but did show significant increases in willingness to share from 2011 to 2015. Researchers in the commercial sector were, unsurprisingly, the least willing to share their data….

government involvement and funding play an important role in improving the attitudes researchers have towards open science practices. The organisational influence of government funding and mandates shifts individual incentives. Researchers then realize that they lack the knowledge, tools, and training they need to properly share data, which can push the social change needed to drastically change the way that science is done for the better.”

Aligning data-sharing policies: Meeting the moment | Commentary and opinion | Features | PND

“To make data sharing easier and to establish a clear baseline for what well-considered data-sharing policies should encompass, we recommend that funders:

1. Clearly specify which data grantees are required to share. Do you want grantees to share only data underlying published studies or all data generated during the funded project? Do you want raw or pre-processed data? If qualitative (not just quantitative) data are also covered by your policy, do you provide guidance for grantees on good practices for sharing qualitative data?

2. Consider incorporating code- and software-sharing requirements as a necessary extension of their data-sharing policies. To be able to reproduce results accurately and build upon shared data, researchers must not only have access to the files but also the code and software used to open and analyze data. Only then are data truly findable, accessible, interoperable, and reusable. The ORFG and the Higher Education Leadership Initiative for Open Scholarship (HELIOS) have prepared a more detailed brief.

3. Clearly specify the required timing of data sharing. The timing will vary based on what data are to be shared and what constitutes the event that triggers the sharing requirement. If data underlie a published study, complying or aligning with new federal policies will require data to be shared immediately at the time of publication. If, however, the policy requires sharing of all data, then the timing may be tied to the award period (as the NIH requires).

4. Require grantees to deposit data in trusted public repositories that assign a persistent identifier (e.g., DOI), provide the necessary infrastructure to host and export quality metadata, implement strategies for long-term preservation, and otherwise meet the National Science and Technology Council’s Desirable Characteristics of Data Repositories. To make compliance easier for grantees, funders should provide a list of approved data repositories that meet these characteristics and are appropriate for the disciplines they fund.

5. Require grantees to share data under licenses that facilitate reuse. The recommended free culture license for data is the Creative Commons Public Domain Dedication (CC0). The reasoning behind this is two-fold: first, data do not always incur copyright and, therefore, reserving certain rights under other licenses may be inappropriate, and second, we should avoid attribution or license stacking that may occur as datasets are remixed and reused. Other options include the Creative Commons Attribution (CC BY) or ShareAlike (CC BY-SA) licenses.

6. Strongly encourage grantees to share data according to established best practices. These include, but are not limited to: a) the FAIR Principles, which outline how to share data so they are Findable, Accessible, Interoperable, and Reusable; b) the CARE Principles for Indigenous Data Governance, which emphasize the importance of Collective Benefit, Authority to Control, Responsibility, and Ethics in the context of Indigenous data, but could also inform the responsible management and sharing of data for other populations; and c) privacy rules, such as those provided under HIPAA. Funders should communicate that it is the responsibility of grantees to get the appropriate consent and ethical approval (e.g., from their institutional review board) that will allow them to collect and subsequently openly share de-identified data.

7. Allow grantees to include data sharing costs in their grant budgets. This could include costs associated with data management, curation, hosting, and long-term preservation. For many projects, data hosting costs will likely be minimal—several public repositories allow researchers to store significant amounts of data for free. For projects that will generate larger amounts of data, additional hosting costs can be budgeted. The most important cost may be the personnel time and expertise required to properly prepare data for sharing and reuse. Funders should consider increasing the allowable personnel costs to secure extra curation time for team

ORFG Shares Guidance on Open Data Policies — Open Research Funders Group

“The Open Research Funders Group is pleased to share their evidence-informed perspective on how funders and philanthropies can optimize open data policies. The piece, which appears in Philanthropy News Digest, highlights eight key steps organizations can take to ensure the data generated by grant-funded projects improve research replicability, reproducibility, and transparency.”

Results from the COGR Survey on the Cost of Complying with the New NIH DMS Policy | Council on Governmental Relations

“For mid-size to large research institutions, the annual projected cost impact is expected to exceed $500,000 at the central administrative level, while also exceeding $500,000 at the academic level, a total impact that exceeds $1 million per institution. Cost impact is measured both by new expenditures and reallocation of effort away from an individual’s current responsibilities. In the case of Researchers and Investigators, this results in a shift away from conducting science in the lab toward tasks that might be considered more administrative in nature. For smaller and emerging research institutions, the cost impact also is expected to be significant, and for these institutions, the disproportionate negative impact may discourage their participation in the federal research ecosystem.”

Prevalence and predictors of data and code sharing in the medical and health sciences: systematic review with meta-analysis of individual participant data | The BMJ

Abstract:  Objectives To synthesise research investigating data and code sharing in medicine and health to establish an accurate representation of the prevalence of sharing, how this frequency has changed over time, and what factors influence availability.

Design Systematic review with meta-analysis of individual participant data.

Data sources Ovid Medline, Ovid Embase, and the preprint servers medRxiv, bioRxiv, and MetaArXiv were searched from inception to 1 July 2021. Forward citation searches were also performed on 30 August 2022.

Review methods Meta-research studies that investigated data or code sharing across a sample of scientific articles presenting original medical and health research were identified. Two authors screened records, assessed the risk of bias, and extracted summary data from study reports when individual participant data could not be retrieved. Key outcomes of interest were the prevalence of statements that declared that data or code were publicly or privately available (declared availability) and the success rates of retrieving these products (actual availability). The associations between data and code availability and several factors (eg, journal policy, type of data, trial design, and human participants) were also examined. A two stage approach to meta-analysis of individual participant data was performed, with proportions and risk ratios pooled with the Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis.

Results The review included 105 meta-research studies examining 2 121 580 articles across 31 specialties. Eligible studies examined a median of 195 primary articles (interquartile range 113-475), with a median publication year of 2015 (interquartile range 2012-2018). Only eight studies (8%) were classified as having a low risk of bias. Meta-analyses showed a prevalence of declared and actual public data availability of 8% (95% confidence interval 5% to 11%) and 2% (1% to 3%), respectively, between 2016 and 2021. For public code sharing, both the prevalence of declared and actual availability were estimated to be <0.5% since 2016. Meta-regressions indicated that only declared public data sharing prevalence estimates have increased over time. Compliance with mandatory data sharing policies ranged from 0% to 100% across journals and varied by type of data. In contrast, success in privately obtaining data and code from authors historically ranged between 0% and 37% and 0% and 23%, respectively.

Conclusions The review found that public code sharing was persistently low across medical research. Declarations of data sharing were also low, increasing over time, but did not always correspond to actual sharing of data. The effectiveness of mandatory data sharing policies varied substantially by journal and type of data, a finding that might be informative for policy makers when designing policies and allocating resources to audit compliance.
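
The pooling step named in the review methods above (Hartung-Knapp-Sidik-Jonkman random effects meta-analysis of proportions) can be illustrated with a minimal sketch. It covers only the second stage of a two stage analysis, where per-study counts are already in hand. The abstract does not say which transformation or between-study variance estimator the authors used, so the logit transform and the DerSimonian-Laird estimator below are assumptions, as are the function name `hksj_pool` and the caller-supplied critical value `t_crit` (the 97.5% quantile of a t distribution with k-1 degrees of freedom). The counts in the usage note are invented for illustration and are not taken from the review.

```python
import math


def hksj_pool(events, totals, t_crit):
    """Pool study proportions with Hartung-Knapp-adjusted random effects.

    events, totals: per-study counts (each study needs 0 < events < total).
    t_crit: t quantile for k-1 degrees of freedom, supplied by the caller.
    Returns (pooled proportion, CI lower, CI upper, tau squared).
    """
    k = len(events)
    if k < 2:
        raise ValueError("need at least two studies")

    # Logit-transform each proportion (assumed transformation).
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    # Approximate within-study variance of a logit proportion.
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # DerSimonian-Laird estimate of between-study variance (assumed estimator).
    w = [1 / vi for vi in v]
    sw = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects pooled estimate on the logit scale.
    ws = [1 / (vi + tau2) for vi in v]
    sws = sum(ws)
    mu = sum(wi * yi for wi, yi in zip(ws, y)) / sws

    # Hartung-Knapp variance: weighted residual spread over (k-1) * sum of weights.
    var_hk = sum(wi * (yi - mu) ** 2 for wi, yi in zip(ws, y)) / ((k - 1) * sws)
    se = math.sqrt(var_hk)

    # Back-transform the estimate and t-based interval to proportions.
    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - t_crit * se), inv_logit(mu + t_crit * se), tau2
```

For three hypothetical studies, a call might look like `hksj_pool([10, 20, 5], [100, 150, 80], t_crit=4.303)`, where 4.303 is the two-sided 95% critical value for 2 degrees of freedom. The t-based interval is what distinguishes this adjustment from a standard normal-based random effects interval, and it is typically wider when few studies are pooled.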

Making data sharing the norm in medical research | The BMJ

“The benefits to patients, science, and society are undeniable

Reuse of medical research data—which is conditional on access to individual participant data—is expected to maximise the value of medical research. It enables alternative hypothesis testing, validation of claims, exploration of controversies, restoration of unpublished trials, avoidance of duplicated efforts, and production of new knowledge from existing datasets. Given these benefits, politicians,[1-3] funders,[4] and publishers[5] now support and implement data sharing policies. However, converging evidence indicates that current policies are unlikely to reach their goal of achieving data sharing. In a linked paper at The BMJ, Hamilton and colleagues (doi:10.1136/bmj-2023-075767) synthesised 105 meta-research studies examining 2 121 580 articles across 31 medical specialties and found that, despite some heterogeneity, data sharing rates are consistently low across medical research.[6] Intention-to-share data have increased with time but are not associated with any increase in actual data sharing….”

New policy outlines research data stewardship expectations | The University Record

“The University of Michigan has created a new policy setting expectations and guidance for research data stewardship, focusing on issues of ownership, sharing and retention….

The policy also outlines expectations for U-M researchers to make research data publicly available when possible, taking into account any existing agreements, contracts, sensitivities or protections….”

a figshare webinar – The Library’s role in readying researchers for Open Data

“Libraries need robust and trusted infrastructure to support their researchers, especially in light of the rising tide of funder policies and mandates surrounding research sharing.  

During this webinar, we will discuss the ways in which Libraries can support their researchers in the context of these policies as well as the growing momentum of the global shift towards Open Science and Open Data.

We’ll discuss:

Policy changes and regional requirements for research and research data sharing
The influence of funders and their policies
Changing attitudes towards Research Data Management (RDM)

Join this 30 minute session to inform your planning for support structures and ensure that you’re well positioned to ready your researchers for Open Data….”

Dryad in the community: New data sharing mandates and the role of academic libraries | Dryad news

“Available to watch now: “New data sharing mandates and the role of academic libraries” presented at the 2023 Library Publishing Forum. 

Whether libraries become data publishers themselves or provide the types of support services that make data publishing possible–such as training, planning, and consultation–they have a critical role to play in advancing open science practices.

In this presentation, Dryad’s Head of Community Engagement, Sarah Lippincott, is joined by fellow presenters Michael Casp, Head of Production Division at J&J Editorial, Emma Molls, Director of Open Research & Publishing at University of Minnesota Libraries, and Alberto Pepe, Director of Strategy and Innovation at Wiley and Co-founder of Authorea. Sarah reviews some pertinent highlights from the Nelson memo and NIH policies, two of the major developments that will impact data sharing over the next few years, and concludes with a discussion on how libraries can help researchers move from data sharing to data publishing.

Watch now to hear from the diverse perspectives of three data sharing and open data advocates: a funder, librarian, and publisher.”

We need a plan D | Nature Methods

“Ensuring data are archived and open thus seems a no-brainer. Several funders and journals now require authors to make their data public, and a recent White House mandate that data from federally funded research must be made available immediately on publication is a welcome stimulus. Various data repositories exist to support these requirements, and journals and preprint servers also provide storage options. Consequently, publications now often include various accession numbers, stand-alone data citations and/or supplementary files.

But as the director of the National Library of Medicine, Patti Brennan, once noted, “data are like pictures of children: the people who created them think they’re beautiful, but they’re not always useful”. So, although the above trends are to be applauded, we should think carefully about that word ‘useful’ and ask what exactly we mean by ‘the data’, how and where they should be archived, and whether some data should be kept at all….

Researchers, institutions and funders should collaborate to develop an overarching strategy for data preservation — a plan D. There will doubtless be calls for a ‘PubMed Central for data’. But what we really need is a federated system of repositories with functionality tailored to the information that they archive. This will require domain experts to agree standards for different types of data from different fields: what should be archived and when, which format, where, and for how long. We can learn from the genomics, structural biology and astronomy communities, and funding agencies should cooperate to define subdisciplines and establish surveys of them to ensure comprehensive coverage of the data landscape, from astronomy to zoology….”

Data sharing is the future | Nature Methods

“In late 2022, the US government mandated open-access publication of scholarly research and free and immediate sharing of data underlying those publications for federally funded research beginning no later than 2025. For some fields the necessary standards and infrastructure are largely in place to support these policies. For others, however, many questions remain as to how these mandates can best be met.

In this issue, we feature a Correspondence from Richard Sever that was inspired by the government mandate and the increasing demand for open science. In it, he raises important topics, including deciding which data must be shared, standardizing file formats and developing community guidelines. He also calls for a “federated system of repositories with functionality tailored to the information that they archive,” to meet the needs of many distinct fields….”

What constitutes equitable data sharing in global health research? A scoping review of the literature on low-income and middle-income country stakeholders’ perspectives | BMJ Global Health

Abstract:  Introduction Despite growing consensus on the need for equitable data sharing, there has been very limited discussion about what this should entail in practice. As a matter of procedural fairness and epistemic justice, the perspectives of low-income and middle-income country (LMIC) stakeholders must inform concepts of equitable health research data sharing. This paper investigates published perspectives in relation to how equitable data sharing in global health research should be understood.

Methods We undertook a scoping review (2015 onwards) of the literature on LMIC stakeholders’ experiences and perspectives of data sharing in global health research and thematically analysed the 26 articles included in the review.

Results We report LMIC stakeholders’ published views on how current data sharing mandates may exacerbate inequities, what structural changes are required in order to create an environment conducive to equitable data sharing and what should comprise equitable data sharing in global health research.

Conclusions In light of our findings, we conclude that data sharing under existing mandates to share data (with minimal restrictions) risks perpetuating a neocolonial dynamic. To achieve equitable data sharing, adopting best practices in data sharing is necessary but insufficient. Structural inequalities in global health research must also be addressed. It is thus imperative that the structural changes needed to ensure equitable data sharing are incorporated into the broader dialogue on global health research.

Publishers, funders and institutions: who is supporting UKRI-funded researchers to share data? – Insights

Abstract:  Researchers are increasingly being asked by funders, publishers and their institutions to share research data alongside written publications, and to include data availability statements to support their readers in finding this data. In the UK, UKRI (UK Research and Innovation) is one of the largest funding bodies and has had data-sharing policies for several years. This article investigates the reasons why a researcher may or may not share their data and assesses whether funders, publishers and institutions are supporting data-sharing behaviour through their policies and actions. A survey with 166 responses gave an indicative assessment of researcher opinions around data sharing, and a corpus of 3,277 journal articles retrieved from four UK institutions was analysed using multivariate logistic regression models to provide empirical evidence as to researcher behaviour around data sharing. The regression models provide insight into how this is affected by the funder, institution and publisher of the research. This study identifies that those publishers and funders who give clear guidance in their policies as to which data should be shared, and where this data should be shared, are most likely to encourage good practice in researchers.

Are the Humanities Ready for Data Sharing? – Ithaka S+R

“The Nelson memo is not the first federal policy to address data sharing and open access, but it is the first to apply to not only large funders such as the NSF and NIH, but to smaller ones such as the NEH. While the NEH funds only a tiny percentage of research and publications in the humanities, its inclusion in the Nelson memo and in the “year of open science” is clear evidence that humanists—who have largely existed on the margins of major trends towards mandatory data sharing that are transforming research practices and scholarly communication in other fields—must now consider their place in this policy landscape.[2]

It is not yet clear how the NEH will define data for the purposes of compliance with the Nelson memo, but the requirement that they do so should stimulate conversation about data sharing in the humanities. When should the evidence humanists collect be considered data? How might humanists adopt STEM-oriented norms around data sharing, and what might humanists bring to the table that would help other fields improve their data sharing practices?…”