Dryad in the community: Responding to the Nelson Memo: repository re-curation for open science | Dryad news

“Available to watch now: “Responding to the Nelson Memo: repository re-curation for open science”.

This talk introduces the concept of re-curation with examples from three different types of repositories and research organisations; generalist, institutional, and field stations. Re-curation is the care and feeding of digital content over time, ensuring it remains discoverable, interoperable, and reusable and aligned with the latest standards.

Learn from Dryad partner Ted Habermann of Metadata Gamechangers about the importance of continually improving metadata to support discovery and reuse as standards emerge and evolve.”

We need a plan D | Nature Methods

“Ensuring data are archived and open thus seems a no-brainer. Several funders and journals now require authors to make their data public, and a recent White House mandate that data from federally funded research must be made available immediately on publication is a welcome stimulus. Various data repositories exist to support these requirements, and journals and preprint servers also provide storage options. Consequently, publications now often include various accession numbers, stand-alone data citations and/or supplementary files.

But as the director of the National Library of Medicine, Patti Brennan, once noted, “data are like pictures of children: the people who created them think they’re beautiful, but they’re not always useful”. So, although the above trends are to be applauded, we should think carefully about that word ‘useful’ and ask what exactly we mean by ‘the data’, how and where they should be archived, and whether some data should be kept at all….

Researchers, institutions and funders should collaborate to develop an overarching strategy for data preservation — a plan D. There will doubtless be calls for a ‘PubMed Central for data’. But what we really need is a federated system of repositories with functionality tailored to the information that they archive. This will require domain experts to agree standards for different types of data from different fields: what should be archived and when, which format, where, and for how long. We can learn from the genomics, structural biology and astronomy communities, and funding agencies should cooperate to define subdisciplines and establish surveys of them to ensure comprehensive coverage of the data landscape, from astronomy to zoology….”

Data sharing is the future | Nature Methods

“In late 2022, the US government mandated open-access publication of scholarly research and free and immediate sharing of data underlying those publications for federally funded research beginning no later than 2025. For some fields the necessary standards and infrastructure are largely in place to support these policies. For others, however, many questions remain as to how these mandates can best be met.

In this issue, we feature a Correspondence from Richard Sever that was inspired by the government mandate and the increasing demand for open science. In it, he raises important topics, including deciding which data must be shared, standardizing file formats and developing community guidelines. He also calls for a “federated system of repositories with functionality tailored to the information that they archive,” to meet the needs of many distinct fields….”

Open science and data sharing in cognitive neuroscience with MouseBytes and MouseBytes+ | Scientific Data

Abstract: Open access to rodent cognitive data has lagged behind the rapid generation of large open-access datasets in other areas of neuroscience, such as neuroimaging and genomics. One contributing factor has been the absence of uniform standardization in experiments and data output, an issue that has particularly plagued studies in animal models. Touchscreen-automated cognitive testing of animal models allows standardized outputs that are compatible with open-access sharing. Touchscreen datasets can be combined with different neuro-technologies such as fiber photometry, miniscopes, optogenetics, and MRI to evaluate the relationship between neural activity and behavior. Here we describe a platform that allows deposition of these data into an open-access repository. This platform, called MouseBytes, is a web-based repository that enables researchers to store, share, visualize, and analyze cognitive data. Here we present the architecture, structure, and the essential infrastructure behind MouseBytes. In addition, we describe MouseBytes+, a database that allows data from complementary neuro-technologies such as imaging and photometry to be easily integrated with behavioral data in MouseBytes to support multi-modal behavioral analysis.


Interoperable infrastructure for software and data publishing

“Research data and software rely heavily on the technical and social infrastructure to disseminate, cultivate, and coordinate projects, priorities, and activities. The groups that have stepped forward to support these activities are often segmented by aspects of their identity – facets like discipline, for-profit versus academic orientation, and others. Siloes across the data and software publishing communities are even more splintered into those that are driven by altruism and collective advancement versus those motivated by ego and personal/project success. Roadblocks to progress are not limited to commercial interests, but rather defined by those who refuse to build on past achievements, the collective good, and opportunities for collaboration, insisting on reinventing the wheel and reinforcing siloes.

In the open infrastructure space, several community-led repositories have joined forces to collaborate on single integrations or grant projects (e.g. integrations with Frictionless Data, compliance with Make Data Count best practices, and common approaches to API development). While it is important to openly collaborate to fight against siloed tendencies, many of our systems are still not as interoperable as they could and should be. As a result, our aspirational goals for the community and open science are not being met with the pacing that modern research requires….”

Well-maintained digital repositories can bolster research

“Universities in Africa should establish digital research data repositories to archive important information gathered over time for posterity purposes, an important tool that can serve as an alternative to and complement open-access publishing.

Digital repositories would store critical data gathered by different researchers who are doing different research work over time, enriching archives already maintained by universities but, even more importantly, boosting the visibility of academics and their institutions….”

Ten lessons for data sharing with a data commons | Scientific Data

“A data commons is a cloud-based data platform with a governance structure that allows a community to manage, analyze and share its data. Data commons provide a research community with the ability to manage and analyze large datasets using the elastic scalability provided by cloud computing and to share data securely and compliantly, and, in this way, accelerate the pace of research. Over the past decade, a number of data commons have been developed and we discuss some of the lessons learned from this effort.”

We are going free and open source! – 4TU.ResearchData

“We are very pleased to announce that 4TU.ResearchData is taking the strategic choice to go free and open source! We are planning to go live with our in-house developed open source software repository in March this year….

Almost three years ago, we procured figshare as the repository software to run 4TU.ResearchData. We were pleased with the functionalities which figshare offered as well as with the quality of the support available.

However, in the past few years 4TU.ResearchData has been significantly investing in building an active community of researchers and support staff around its data repository. Our community is increasingly tech-savvy and started coming up with strong wishes to make improvements to the software operating 4TU.ResearchData, or even proposing co-development of new solutions. Unfortunately, the use of proprietary software made it impossible for us to embrace the wish of the community to shape the technical development of 4TU.ResearchData.

Furthermore, we came to realise that only by facilitating community-driven development, we can work towards sustainable infrastructures, which are agile and able to quickly respond to changing community needs. In other words, by co-developing and partnering with the research community, we invest in solutions which are valued and needed….”

New partnership to promote open data awareness and participation in Africa – Digital Science

“Figshare – a world leader in digital infrastructure that supports open research, and part of Digital Science – has formed a new partnership with the African Library and Information Associations and Institutions (AfLIA), which is committed to open data and information sharing across Africa.

The partnership is aimed at promoting open data awareness and participation in Africa, to improve access to and use of open data across the continent….”

Harvard Library Responds to the NIH Data Management and Sharing Policy | STAFF PORTAL

“Beginning with the first funding deadlines in January, all NIH grant proposals will be required to include a formal, two-page Data Management and Sharing Plan (DMSP), which must include the following elements….

Crucially, in addition to adding a required DMSP, the data management strategies stated in the plan will be audited and monitored externally, and compliance with stated plans may affect the funding status of grants.


Fortunately, here at Harvard, affiliates have access to a variety of computing infrastructure and systems to effectively manage and steward a wide range of research outputs associated with modern, data-driven, computational research.

Harvard’s libraries, Harvard University Information Technology (HUIT), Research Computing, and Sponsored Programs offices have all been adding services and building capacity to support researchers complying with this new policy next year.

In the resources section below, we’ve included links to an executive summary of the policy and a collection of FAQs that we created specifically for Harvard users. We’ve also included resources from the NIH designed to support researchers writing and implementing a DMSP for the 2023 funding cycles.

Along with the requirement to make research data publicly available, in its new policy the NIH strongly encourages the use of established data repositories. When selecting an appropriate repository, researchers should plan to utilize subject- or domain-specific repositories for their data types if possible. When a disciplinary repository does not exist, researchers should use generalist repositories that accept all data types. We’ve included information on Harvard Dataverse and other generalist repositories in the resources section below….”

ARIADNE PLUS – Ariadne infrastructure

“The ARIADNEplus project is the extension of the previous ARIADNE Integrating Activity, which successfully integrated archaeological data infrastructures in Europe, indexing in its registry about 2.000.000 datasets (ARIADNE portal). ARIADNEplus will build on the ARIADNE results, extending and supporting the research community that the previous project created and further developing the relationships with key stakeholders such as the most important European archaeological associations, researchers, heritage professionals, national heritage agencies and so on. The new enlarged partnership of ARIADNEplus covers all of Europe. It now includes leaders in different archaeological domains like palaeoanthropology, bioarchaeology and environmental archaeology as well as other sectors of archaeological sciences, including all periods of human presence from the appearance of hominids to present times. Transnational Activities together with the planned training will further reinforce the presence of ARIADNEplus as a key actor.

The ARIADNEplus data infrastructure will be embedded in a cloud that will offer the availability of Virtual Research Environments where data-based archaeological research may be carried out. The project will furthermore develop a Linked Data approach to data discovery, making available to users innovative services, such as visualization, annotation, text mining and geo-temporal data management. Innovative pilots will be developed to test and demonstrate the innovation potential of the ARIADNEplus approach.

ARIADNEplus is funded by the European Commission under the H2020 Programme, contract no. H2020-INFRAIA-2018-1-823914….”