Project of the Month: Improving the collection and management of data in citizen science | News | CORDIS | European Commission

“A challenge for citizen science is getting people to provide quality data. Another is to help ensure the sustainability of platforms used to collect this data. Technological services built by and for users should overcome these two major challenges.

There is a need to make it simpler for citizen science platforms, also known as citizen observatories, to share data. This will help to enhance citizen science observatories’ interoperability, networking, data quality and secure data management. Both the scientific community and the public stand to benefit.

To achieve this, the EU-funded COS4CLOUD project is working with nine citizen biodiversity observatories, four of which are the largest in Europe: Artportalen, iSpot, Natusfera and Pl@ntNet. The services will be tested on five environmental quality monitoring platforms….”

Meet the GREI Generalist Repositories

“Join us for a panel discussion with the 6 generalist repositories participating in the NIH Generalist Repository Ecosystem Initiative (GREI). Learn about common features and capabilities across repositories as well as repositories that support specific use cases. Discover how these repositories are working together to support NIH-funded researchers and participate in an audience Q&A.”

Open Science

“For a growing number of scientists, though, the process looks like this:

The data that the scientist collects is stored in an open access repository like figshare or Zenodo, possibly as soon as it’s collected, and given its own Digital Object Identifier (DOI). Or the data was already published and is stored in Dryad.
The scientist creates a new repository on GitHub to hold her work.
As she does her analysis, she pushes changes to her scripts (and possibly some output files) to that repository. She also uses the repository for her paper; that repository is then the hub for collaboration with her colleagues.
When she’s happy with the state of her paper, she posts a version to arXiv or some other preprint server to invite feedback from peers.
Based on that feedback, she may post several revisions before finally submitting her paper to a journal.
The published paper includes links to her preprint and to her code and data repositories, which makes it much easier for other scientists to use her work as starting point for their own research.

This open model accelerates discovery: the more open work is, the more widely it is cited and re-used. However, people who want to work this way need to make some decisions about what exactly “open” means and how to do it. You can find more on the different aspects of Open Science in this book.

This is one of the (many) reasons we teach version control. …”

PLOS partners with DataSeer to develop Open Science Indicators – The Official PLOS Blog

“To provide richer and more transparent information on how PLOS journals support best practice in Open Science, we’re going to begin publishing data on ‘Open Science Indicators’ observed in PLOS articles. These Open Science Indicators will initially include (i) sharing of research data in repositories, (ii) public sharing of code and, (iii) preprint posting, for all PLOS articles from 2019 to present. These indicators – conceptualized by PLOS and developed with DataSeer, using an artificial intelligence-driven approach – are increasingly important to PLOS achieving its mission. We plan to share the results openly to support Open Science initiatives by the wider community.”

Planet Research Data Commons Consultation Roundtables Tickets, Multiple Dates | Eventbrite

“The ARDC would like to invite environmental researchers and decision makers to a consultation roundtable for the Planet Research Data Commons.

The Planet Research Data Commons will deliver shared, accessible data and digital research tools that will help researchers and decision makers tackle the big challenges facing our environment, which include adapting to climate change, saving threatened species, and reversing ecosystem deterioration.

We invite environmental researchers and decision makers to get involved in the consultations for the Planet Research Data Commons to help guide the development of the new digital research infrastructure.

The Planet Research Data Commons is the second of 2 pilot Thematic Research Data Commons launching in the 2022-23 financial year with an initial budget of $15.8m. The first pilot, the People Research Data Commons, is focused on digital research infrastructure for health research. The Planet Research Data Commons will explore the digital research infrastructure needs for research challenges set out in the 2021 National Research Infrastructure Roadmap, including environment and climate resilience.

The Planet Research Data Commons will support environmental researchers to develop cross-sector and multi-disciplinary data collaborations on a national scale. It will integrate underpinning compute, storage infrastructure and services with analysis platforms and tools that are supported by expertise, standards and best practices. And it will bring together data from a range of sources to tackle the big questions….”

Towards a European network of FAIR-enabling Trustworthy Digital Repositories (TDRs) – A Working Paper | Zenodo

Philipp Conzett, Ingrid Dillo, Francoise Genova, Natalie Harrower, Vasso Kalaitzi, Mari Kleemola, Amela Kurta, Pedro Principe, Olivier Rouchon, Hannes Thiemann, & Maaike Verburg. (2022). Towards a European network of FAIR-enabling Trustworthy Digital Repositories (TDRs) – A Working Paper (v2.0). Zenodo.

Abstract: This working paper is a bottom-up initiative of a group of stakeholders from the European repository community. Its purpose is to outline an aspirational vision of a European Network of FAIR-enabling Trustworthy Digital Repositories (TDRs). This initiative originates from the workshop entitled “Towards exploring the idea of establishing the Network”. The paper was created in close connection with the wider community, as its core was built on community feedback and the first draft of the paper was shared for community-wide consultation. This paper will serve as input for the EOSC Task Force on Long Term Digital Preservation. One of the core activities mentioned in the charter of this Task Force is to produce recommendations on the creation of such a network.

The working paper puts together a vision of how a European network of FAIR-enabling TDRs could be based on the community’s needs and its most important functions: Networking and knowledge exchange, stakeholder advocacy and engagement, and coordination and development. The specific activities hosted under these umbrella functions could address the wide range of topics that are important to TDRs. Beyond these functions and the challenges they address, the paper presents a framework to highlight aspects of the Network to further explore in the next steps of its development.

Frontiers | Rethinking the A in FAIR Data: Issues of Data Access and Accessibility in Research

“The FAIR data principles are rapidly becoming a standard through which to assess responsible and reproducible research. In contrast to the requirements associated with the Interoperability principle, the requirements associated with the Accessibility principle are often assumed to be relatively straightforward to implement. Indeed, a variety of different tools assessing FAIR rely on the data being deposited in a trustworthy digital repository. In this paper we note that there is an implicit assumption that access to a repository is independent of where the user is geographically located. Using a virtual personal network (VPN) service we find that access to a set of web sites that underpin Open Science is variable from a set of 14 countries; either through connectivity issues (i.e., connections to download HTML being dropped) or through direct blocking (i.e., web servers sending 403 error codes). Many of the countries included in this study are already marginalized from Open Science discussions due to political issues or infrastructural challenges. This study clearly indicates that access to FAIR data resources is influenced by a range of geo-political factors. Given the volatile nature of politics and the slow pace of infrastructural investment, this is likely to continue to be an issue and indeed may grow. We propose that it is essential for discussions and implementations of FAIR to include awareness of these issues of accessibility. Without this awareness, the expansion of FAIR data may unintentionally reinforce current access inequities and research inequalities around the globe.”

How Figshare meets the NIH ‘Desirable Characteristics for Data Repositories’ – a help article for using figshare

“The new NIH Policy for Data Management and Sharing (effective January 25, 2023) includes supplemental information on Selecting a Data Repository (NOT-OD-21-016), which outlines the data repositories characteristics that researchers should seek out to share their NIH-funded research data and materials. Figshare is an appropriate and well-established generalist repository for researchers to permanently store the datasets and other materials produced from their NIH-funded research and to include in their NIH Data Management and Sharing Plans. Figshare+ uses the same repository infrastructure to offer support for sharing large datasets including transparent costs that can be included in funding proposal budgets. Note that Figshare may also be included in Data Management and Sharing Plans in combination with discipline-specific repositories for sharing any types of research outputs that may not be accepted in more specific repositories. Figshare is currently working with NIH as part of their Generalist Repository Ecosystem Initiative to continue enhancing our support for NIH-funded researcher needs.

Figshare repositories offer established repository infrastructure including adherence to community best practices and standards for persistence, provenance, and discoverability with the flexibility to share any file type and any type of research material and documentation. Figshare makes it easy to share your data in a way that is citable and reusable and to get credit for all of your work. 

Figshare is listed as a recommended data sharing resource in the following: 

NIH Scientific Data Sharing: Generalist Repositories
NIH National Library of Medicine (NLM): Generalist Repositories
NIH HEAL Initiative Recommended Repositories
Nature’s Data Repository Guidance …”

Long-term availability of data associated with articles in PLOS ONE | PLOS ONE

Abstract: The adoption of journal policies requiring authors to include a Data Availability Statement has helped to increase the availability of research data associated with research articles. However, having a Data Availability Statement is not a guarantee that readers will be able to locate the data; even if provided with an identifier like a uniform resource locator (URL) or a digital object identifier (DOI), the data may become unavailable due to link rot and content drift. To explore the long-term availability of resources including data, code, and other digital research objects associated with papers, this study extracted 8,503 URLs and DOIs from a corpus of nearly 50,000 Data Availability Statements from papers published in PLOS ONE between 2014 and 2016. These URLs and DOIs were used to attempt to retrieve the data through both automated and manual means. Overall, 80% of the resources could be retrieved automatically, compared to much lower retrieval rates of 10–40% found in previous papers that relied on contacting authors to locate data. Because a URL or DOI might be valid but still not point to the resource, a subset of 350 URLs and 350 DOIs were manually tested, with 78% and 98% of resources, respectively, successfully retrieved. Having a DOI and being shared in a repository were both positively associated with availability. Although resources associated with older papers were slightly less likely to be available, this difference was not statistically significant, suggesting that URLs and DOIs may be an effective means for accessing data over time. These findings point to the value of including URLs and DOIs in Data Availability Statements to ensure access to data on a long-term basis.

Facts and Figures for open research data

“Figures and case studies related to accessing and reusing the data produced in the course of scientific production.”

Leveraging Data Communities to Advance Open Science – Ithaka S+R

“Several recent studies have indicated that large numbers of researchers in many STEM fields now accept the value of openly sharing research data. Yet, the actual practice of sharing data—especially in forms that comply with FAIR principles—remains a challenge for many researchers to integrate into their workflows and prioritize among the demands on their time.[1] In many disciplines and subfields, data sharing is still mostly an ideal, honored more in the breach than in practice.[2]

The barriers to open data sharing are numerous.[3] However, sustained funding from federal agencies in the United States including the NSF and NIH and important initiatives in other countries such as Canada’s Tri-Agency Research Data Management Policy and the European Union’s OpenAire, is creating a growing infrastructure for open sharing of research data, albeit one that highlights the tension between scientific research practices that are now regularly multi-national in scope yet exist within funding and regulatory structures determined largely by national entities.[4] In the US context, the most visible fruits of these efforts are the decentralized network of repositories that have become available to researchers in many fields and are now a vital infrastructure for data sharing across many fields. As incentive structures have slowly shifted, the number of researchers taking advantage of these resources has also grown.

The existence of these repositories are necessary enabling conditions for data sharing, but their ability to transform researcher’s practices around data depositing and sharing absent changes to incentive structures and the culture of research communities will remain uneven. Furthering the goals of open science requires convincing more researchers of the value of data sharing to themselves and to the community of researchers with whom they most tangibly identify. Creating and encouraging community norms that reward sharing is necessary because data sharing, especially FAIR (findable, accessible, interoperable, and reusable) compliant sharing, is hard work. Absent strong incentive and reward structures, researchers are often reluctant to take on this “extra” labor. Successful data sharing ultimately depends on cultural and social infrastructures as much as on technical infrastructures….”

Wolters Kluwer expands commitment to open science | Research Information

“To support the evolution of medical publishing toward higher velocity exchange of scientific findings, Wolters Kluwer, Health announced two key additions to the Lippincott® portfolio. The Lippincott Preprints, powered by Figshare, serves as a forum for sharing pre-review medical findings with the global medical community and the Lippincott Data Repository enables researchers to share data from their clinical experiments for greater transparency and deeper validation of findings. …”

A new open-access platform to bring greater oversight of deforestation risks

“ZSL [Zoological Society of London], as a sub-grantee alongside Global Canopy, will be launching a revolutionary platform in 2022 bringing together the best data available on corporate exposure to, and reporting on, deforestation and other related environmental, social and governance (ESG) issues.

The project aims to provide market-leading data to help financial institutions identify risks and find opportunities for sustainable investments to meet the growing demand for responsible financial products in light of the biodiversity and climate crises.

The database will be underpinned by the data collected through ZSL’s SPOTT assessments, Global Canopy’s Forest 500 assessments and the Stockholm Environment Institute, Global Canopy and Neural Alpha’s Trase Supply Chains and Trase Finance data, and will be aligned with the Accountability Framework Initiative and its guidance.

Supported by a five-year grant from the Norwegian government, the resulting data and metrics will provide a more comprehensive view of company performance on deforestation, conversion and associated human rights risks. The dataset will also provide broader coverage of the most exposed forest risk supply chains (in particular: palm oil, soy, timber, pulp, rubber and cattle products) and geographies where corporate performance data on these topics is currently missing. By mapping and integrating data from aligned initiatives and external datasets, more complete and in-depth coverage of corporate performance data will be available….”

