We need a plan D | Nature Methods

“Ensuring data are archived and open thus seems a no-brainer. Several funders and journals now require authors to make their data public, and a recent White House mandate that data from federally funded research must be made available immediately on publication is a welcome stimulus. Various data repositories exist to support these requirements, and journals and preprint servers also provide storage options. Consequently, publications now often include various accession numbers, stand-alone data citations and/or supplementary files.

But as the director of the National Library of Medicine, Patti Brennan, once noted, “data are like pictures of children: the people who created them think they’re beautiful, but they’re not always useful”. So, although the above trends are to be applauded, we should think carefully about that word ‘useful’ and ask what exactly we mean by ‘the data’, how and where they should be archived, and whether some data should be kept at all….

Researchers, institutions and funders should collaborate to develop an overarching strategy for data preservation — a plan D. There will doubtless be calls for a ‘PubMed Central for data’. But what we really need is a federated system of repositories with functionality tailored to the information that they archive. This will require domain experts to agree standards for different types of data from different fields: what should be archived and when, which format, where, and for how long. We can learn from the genomics, structural biology and astronomy communities, and funding agencies should cooperate to define subdisciplines and establish surveys of them to ensure comprehensive coverage of the data landscape, from astronomy to zoology….”

Data sharing is the future | Nature Methods

“In late 2022, the US government mandated open-access publication of scholarly research and free and immediate sharing of data underlying those publications for federally funded research beginning no later than 2025. For some fields the necessary standards and infrastructure are largely in place to support these policies. For others, however, many questions remain as to how these mandates can best be met.

In this issue, we feature a Correspondence from Richard Sever that was inspired by the government mandate and the increasing demand for open science. In it, he raises important topics, including deciding which data must be shared, standardizing file formats and developing community guidelines. He also calls for a “federated system of repositories with functionality tailored to the information that they archive,” to meet the needs of many distinct fields….”

What constitutes equitable data sharing in global health research? A scoping review of the literature on low-income and middle-income country stakeholders’ perspectives | BMJ Global Health

Abstract: Introduction: Despite growing consensus on the need for equitable data sharing, there has been very limited discussion about what this should entail in practice. As a matter of procedural fairness and epistemic justice, the perspectives of low-income and middle-income country (LMIC) stakeholders must inform concepts of equitable health research data sharing. This paper investigates published perspectives in relation to how equitable data sharing in global health research should be understood.

Methods: We undertook a scoping review (2015 onwards) of the literature on LMIC stakeholders’ experiences and perspectives of data sharing in global health research and thematically analysed the 26 articles included in the review.

Results: We report LMIC stakeholders’ published views on how current data sharing mandates may exacerbate inequities, what structural changes are required in order to create an environment conducive to equitable data sharing and what should comprise equitable data sharing in global health research.

Conclusions: In light of our findings, we conclude that data sharing under existing mandates to share data (with minimal restrictions) risks perpetuating a neocolonial dynamic. To achieve equitable data sharing, adopting best practices in data sharing is necessary but insufficient. Structural inequalities in global health research must also be addressed. It is thus imperative that the structural changes needed to ensure equitable data sharing are incorporated into the broader dialogue on global health research.

Publishers, funders and institutions: who is supporting UKRI-funded researchers to share data? – Insights

Abstract:  Researchers are increasingly being asked by funders, publishers and their institutions to share research data alongside written publications, and to include data availability statements to support their readers in finding this data. In the UK, UKRI (UK Research and Innovation) is one of the largest funding bodies and has had data-sharing policies for several years. This article investigates the reasons why a researcher may or may not share their data and assesses whether funders, publishers and institutions are supporting data-sharing behaviour through their policies and actions. A survey with 166 responses gave an indicative assessment of researcher opinions around data sharing, and a corpus of 3,277 journal articles retrieved from four UK institutions was analysed using multivariate logistic regression models to provide empirical evidence as to researcher behaviour around data sharing. The regression models provide insight into how this is affected by the funder, institution and publisher of the research. This study identifies that those publishers and funders who give clear guidance in their policies as to which data should be shared, and where this data should be shared, are most likely to encourage good practice in researchers.

Are the Humanities Ready for Data Sharing? – Ithaka S+R

“The Nelson memo is not the first federal policy to address data sharing and open access, but it is the first to apply to not only large funders such as the NSF and NIH, but to smaller ones such as the NEH. While the NEH funds only a tiny percentage of research and publications in the humanities, its inclusion in the Nelson memo and in the “year of open science” is clear evidence that humanists—who have largely existed on the margins of major trends towards mandatory data sharing that are transforming research practices and scholarly communication in other fields—must now consider their place in this policy landscape.[2]

It is not yet clear how the NEH will define data for the purposes of compliance with the Nelson memo, but the requirement that they do so should stimulate conversation about data sharing in the humanities. When should the evidence humanists collect be considered data? How might humanists adopt STEM-oriented norms around data sharing, and what might humanists bring to the table that would help other fields improve their data sharing practices?…”

Llebot | Are Institutional Research Data Policies in the US Supporting the FAIR Principles? A Content Analysis | Journal of eScience Librarianship

Abstract:  Objective: The FAIR principles were created with the goal of enhancing the reusability of research data and to give guidance on how to make data Findable, Accessible, Interoperable and Reusable. In this article we explore the role of institutional research data policies in enabling and encouraging researchers at their institutions to generate FAIR data.

Methods: We identified the research data policies in place for “very high research activity” institutions (as defined by Carnegie classification) in the United States. We created a list of 31 criteria, based on previous work by Davidson et al. (2019) and Briney et al. (2015), and evaluated the 40 policies using a content analysis methodology. 

Results: The guiding principles and the definitions for research data in the policies support the idea that institutional policies are a potential tool for the implementation of the FAIR principles. However, our analysis indicates that they are not generally used for that purpose. Only one policy mentions FAIR. Data sharing is mentioned in half of the policies, but 11 of these only note this concept in the context of funder requirements. Access and retention sections are mostly written without considering publicly available data. Twenty-nine policies do not mention data documentation. 

Conclusions: We discuss ways in which these institutional policies represent a missed opportunity to implement the FAIR principles and suggest ways policies could be modified to encourage researchers to follow them. We also discuss future research opportunities to examine how policy implementation may affect what institutional support researchers receive.

Grynoch | Show me the data! Data sharing practices demonstrated in published research at the University of Massachusetts Chan Medical School | Journal of eScience Librarianship

Abstract:  Objective: In the interest of making data findable, accessible, interoperable, and reusable (FAIR), the National Institutes of Health (NIH) will institute a new Data Management and Sharing Policy in January 2023. This policy will require researchers applying for NIH funding to submit a Data Management and Sharing Plan. As 63% of grant dollars received by University of Massachusetts Chan Medical School (UMass Chan) researchers comes from the NIH, we explored whether UMass Chan researchers are currently sharing data associated with their published research and how they shared their data. 

Methods: PubMed was searched for articles published in 2019 with a UMass Chan researcher as either the first or last author. These articles were examined for evidence of original or reused data, the type of data, whether the article stated that data was available, and where and how to find that data. 

Results: Of the 361 articles with original data, 26% had a data availability statement. However, most articles (71%) did not mention where data could be accessed. The data storage location of the estimated 1551 original datasets was similarly not mentioned for 74% of the datasets, with the next largest category being available upon request (8.6%). Genomic data repositories such as the Gene Expression Omnibus were among the top repositories used by authors. Similar areas for improvement were noted for permanent identifier use (46% had a permanent identifier), using non-proprietary file formats (most popular format was Excel), and citing reused data. Authors who published open access were more likely to share their data. 

Conclusions: While some researchers at UMass Chan have embraced data sharing, particularly genomic data sharing, we expect there will be more data shared in the coming years with the implementation of the new NIH Data Management and Sharing Policy.

Harvard Library Responds to the NIH Data Management and Sharing Policy | STAFF PORTAL

“Beginning with the first funding deadlines in January, all NIH grant proposals will be required to include a formal, two-page Data Management and Sharing Plan (DMSP), which must include the following elements….

Crucially, in addition to adding a required DMSP, the data management strategies stated in the plan will be audited and monitored externally, and compliance with stated plans may affect the funding status of grants.

Fortunately, here at Harvard, affiliates have access to a variety of computing infrastructure and systems to effectively manage and steward a wide range of research outputs associated with modern, data-driven, computational research.

Harvard’s libraries, Harvard University Information Technology (HUIT), Research Computing, and Sponsored Programs offices have all been adding services and building capacity to support researchers complying with this new policy next year.

In the resources section below, we’ve included links to an executive summary of the policy and a collection of FAQs that we created specifically for Harvard users. We’ve also included resources from the NIH designed to support researchers writing and implementing a DMSP for the 2023 funding cycles.

Along with the requirement to make research data publicly available, in its new policy the NIH strongly encourages the use of established data repositories. When selecting an appropriate repository, researchers should plan to utilize subject- or domain-specific repositories for their data types if possible. When a disciplinary repository does not exist, researchers should use generalist repositories that accept all data types. We’ve included information on Harvard Dataverse and other generalist repositories in the resources section below….”

Browse Data Sharing Requirements by Federal Agency

“This is a community resource for tracking, comparing, and understanding current U.S. federal funder research data sharing policies. Originally completed by SPARC & Johns Hopkins University Libraries in 2016, the content of this resource was updated by RDAP and SPARC in 2021….”

Data sharing: what do we know and where can we go?

“OASPA is pleased to announce our next webinar which will focus on the what about and the why of data sharing.

The recent OSTP “Nelson memo” served as a re-focus on data as a first class research output. But maybe that’s a misrepresentation for those of us who think ‘hold on, we’ve been focused on data this whole time!?’ Well here’s a chance to learn from and with a group of experts who are thinking carefully about data sharing: what that means from different perspectives, tangible steps to take and policies to make around data, and what we can do next in our communities of practice. Attendees are more than welcome to bring their own perspectives! The webinar will be chaired by Rachael Lammey. We welcome our panelists: Sarah Lippincott will give a repository perspective with insights into where data is going post Nelson Memo and NIH Policy. Aravind Venkatesan will share the thinking, data science and workflows employed at EuropePMC to support data linking. Shelley Stall will talk about how AGU are leading the line with their data policies, and Kathleen Gregory will conclude by considering researchers’ perspectives regarding sharing and reusing data.”

Reminder: NIH Policy for Data Management and Sharing effective on January 25, 2023.

“The purpose of this notice is to remind the community of the effective date of the NIH Policy for Data Management and Sharing (DMS Policy) and summarize available key resources.

As noted in the Final NIH Policy for Data Management and Sharing (NOT-OD-21-013), the effective date of the DMS Policy is January 25, 2023 for competing grant applications submitted to NIH for the January 25, 2023 and subsequent receipt dates; proposals for contracts submitted to NIH on or after January 25, 2023; NIH Intramural Research Projects conducted on or after January 25, 2023; and other funding agreements (e.g., Other Transactions) executed on or after January 25, 2023, unless otherwise stipulated by NIH.

The DMS Policy applies to all NIH research, funded or conducted in whole or in part by NIH, that results in the generation of scientific data. Note that the DMS Policy does not apply to research and other activities that do not generate scientific data, for example: research training, fellowships, infrastructure development, and non-research activities. See Research Covered Under the Data Management & Sharing Policy for more details.

The DMS Policy has two basic requirements:

Submission of a Data Management and Sharing (DMS) Plan outlining how scientific data and any accompanying metadata will be managed and shared, considering any potential restrictions or limitations. 
Compliance with the Plan approved by the funding NIH Institute, Center, or Office.

DMS Plans should describe how data will be managed and appropriately shared. See Writing a Data Management & Sharing Plan for details, sample Plans, and an optional format page which includes six elements recommended to be included in a Data Management and Sharing Plan. Guidance on planning and budgeting and selecting a data repository are available on the NIH Scientific Data Sharing website. Application Guide instructions have been updated to provide instructions for DMS policy implementation.

Ultimately, the new DMS Policy promotes transparency and accountability in research by setting a minimum set of expectations for data management and sharing. This means that while other NIH policies or NIH Institutes, Centers, Offices, or programs may build upon these expectations, for instance, by specifying scientific data to share, relevant standards, repository timelines, and/or shorter data sharing timelines for meeting programmatic needs, the DMS Policy sets a consistent baseline across NIH.

In preparing for DMS Policy implementation, NIH has developed a number of helpful resources that we encourage investigators and institutions to review:

DMS Policy Overview
DMS Policy FAQs
Learning Resources including 2-part webinar series on DMS Policy
Statements and Guide Notices …”

Canadian policy: Data management requirement takes effect in March

“Canadian institutions are preparing for a research data management policy developed by three major federal granting agencies to go into effect this March. The policy of the Tri-Agency Council, comprising the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Social Sciences and Humanities Research Council of Canada (SSHRC), asserts that “research data collected through the use of public funds should be responsibly and securely managed and be, where ethical, legal and commercial obligations allow, available for reuse by others.” Dryad would be pleased to assist any Canadian institution seeking a solution to help support their affiliated researchers with this policy….”

NIH’s new data sharing policy is coming, and it’s a ‘big cultural shift’ | News | Chemistry World

“Biochemists and other researchers who apply for funding from the US National Institutes of Health (NIH) will have to include comprehensive data management and sharing plans in grants from 25 January. These will be formal strategies for managing, preserving and sharing scientific data, as well as the accompanying metadata.

The new rule, which is generating some concern within the research community, replaces the NIH’s existing data sharing policy that has been around since 2003, and applies to only those seeking at least $500,000 (£419,200) in direct costs from the agency in any given year. The original regulation required researchers to submit a plan that describes how they will share the underlying data, or if they cannot share it then why not.

By contrast, the latest policy affects all NIH grants, regardless of specific budget. It will apply to competing grant applications, proposals for contracts and other funding agreements submitted to the NIH on or after 25 January.

The agency will now mandate that researchers describe their strategy to share scientific data needed to ‘validate and replicate’ their research findings, whether or not the data is used to support scholarly publications….”