Repository service for SSH (Dataverse) | SSHOPENCLOUD

The repository service for SSH is built upon the community-driven open source Dataverse software. 

Its modular design facilitates integration with other data services, such as DataCite, rOpenSci, and CLARIN’s Language Resource Switchboard, and supports the development of additional functionality and services.

Two types of services are being developed: 

1) a central (ERIC-level) cloud service, adapted to the needs of the relevant European SSH community, that gives small institutes a research data repository for their designated community.
2) an ‘Archive in a box’ software installation package: a version adapted to the needs of the European SSH community, with documentation, that institutes can download and run in their own environment.

Global presence of open-source research data management platform for libraries: the Dataverse project | Emerald Insight

Abstract:  Purpose

This paper aims to provide statistical information on the worldwide spread of the open-source research data management application, the Dataverse Project, to librarians, data managers and information managers who are considering using the application at their own institution.

Design/methodology/approach

To produce a list of dataverse repositories, the official Dataverse website was evaluated, and JSON data were downloaded and parsed. Data standardisation was performed to assess the state of installations in various nations and continents across the world.

Findings

Globally, the Dataverse repositories have seen a rise in overall installations. The year 2020 alone saw a 23.21% rise. In a country-by-country comparison, the USA (13) has the most dataverse installations, while among continents Europe (25) has the highest number of installations.

Originality/value

This research will be useful to librarians, data managers and information managers, among others, who want to learn more about Dataverse repositories throughout the world before deploying at their local level.
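
The method described in the abstract above (download JSON, parse it, standardise the values, then count installations per country and continent) can be sketched in a few lines of Python. This is only an illustration: the sample records, the field names ("name", "country", "continent") and the normalise helper are assumptions made for this sketch, not the schema or code used in the paper, and the counts it produces are not the paper's findings.

```python
import json
from collections import Counter

# Hypothetical sample data. The field names are assumptions for this
# sketch and the records are invented; they are not the paper's data.
SAMPLE_JSON = """
{"installations": [
  {"name": "Repo A", "country": "USA",     "continent": "North America"},
  {"name": "Repo B", "country": "usa ",    "continent": "North America"},
  {"name": "Repo C", "country": "Germany", "continent": "Europe"},
  {"name": "Repo D", "country": "Norway",  "continent": "Europe"},
  {"name": "Repo E", "country": "France",  "continent": "europe"}
]}
"""


def normalise(value):
    """Standardise a label: collapse whitespace and ignore letter case."""
    return " ".join(value.split()).casefold()


def count_by(records, field):
    """Count records per standardised value of the given field."""
    return Counter(normalise(record[field]) for record in records)


installations = json.loads(SAMPLE_JSON)["installations"]
by_country = count_by(installations, "country")
by_continent = count_by(installations, "continent")

print(by_country.most_common(1))
print(by_continent.most_common(1))
```

The sample also shows why the two rankings can differ: one country can lead the per-country count while a continent made up of many smaller contributors leads the per-continent count.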

Developing an updated plugin for Dataverse integration with OPS/OJS on Vimeo

“In this activity we present the current status of development of a plugin to integrate Dataverse with Open Preprint Servers (OPS) and Open Journal Systems (OJS) in their most recent versions (3.3.x series).

Presentation held on 11/19/21 at Open Publishing Fest 2021:
openpublishingfest.org/calendar.html#event-90/ …”

“Optional Data Curation Feature Use by Harvard Dataverse Repository Users” by Ceilyn Boyd

Abstract:  Objective: Investigate how different groups of depositors vary in their use of optional data curation features that provide support for FAIR research data in the Harvard Dataverse repository.

Methods: A numerical score based upon the presence or absence of characteristics associated with the use of optional features was assigned to each of the 29,295 datasets deposited in Harvard Dataverse between 2007 and 2019. Statistical analyses were performed to investigate patterns of optional feature use amongst different groups of depositors and their relationship to other dataset characteristics.

Results: Members of groups make greater use of Harvard Dataverse’s optional features than individual researchers. Datasets that undergo a data curation review before submission to Harvard Dataverse, are associated with a publication, or contain restricted files also make greater use of optional features.

Conclusions: Individual researchers might benefit from increased outreach and improved documentation about the benefits and use of optional features to improve their datasets’ level of curation beyond the FAIR-informed support that the Harvard Dataverse repository provides by default. Platform designers, developers, and managers may also use the numerical scoring approach to explore how different user groups use optional application features.
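
The scoring approach summarised above (one point per optional feature that is present, then a comparison across depositor groups) might look roughly like the following Python sketch. The feature names, the equal weighting, the "depositor_type" field and the toy records are all hypothetical; the abstract does not list the features or weights the author actually used.

```python
# Hypothetical optional features; the study's real feature list is not
# given in the abstract, so these names are placeholders.
OPTIONAL_FEATURES = (
    "keywords",
    "related_publication",
    "file_descriptions",
    "file_tags",
)


def curation_score(dataset):
    """Return the number of optional features present in a dataset record."""
    return sum(1 for feature in OPTIONAL_FEATURES if dataset.get(feature))


def mean_score_by_group(datasets, group_field="depositor_type"):
    """Average the score of each dataset within its depositor group."""
    scores = {}
    for dataset in datasets:
        scores.setdefault(dataset[group_field], []).append(curation_score(dataset))
    return {group: sum(values) / len(values) for group, values in scores.items()}


# Toy records invented for illustration only; they are not study data.
toy_datasets = [
    {"depositor_type": "group", "keywords": True, "file_tags": True},
    {"depositor_type": "group", "keywords": True, "related_publication": True,
     "file_descriptions": True, "file_tags": True},
    {"depositor_type": "individual", "keywords": True},
    {"depositor_type": "individual"},
]

print(mean_score_by_group(toy_datasets))
```

A presence/absence score like this is easy to compute across tens of thousands of records, which is presumably part of its appeal; it says nothing about the quality of the metadata that is present.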

Opening Your Scholarship: Why should I DASH and Dataverse?

“Learn practices and platforms to achieve your open access goals!

Highlights on Harvard DASH and Dataverse.

Panelists:

– Sonia Barbosa, Manager of Data Curation, Harvard Dataverse, Manager of the Murray Research Archive

– Julie Goldman, Research Data Services Librarian

– Colin Lukens, Senior Repository Manager, Harvard Library Office for Scholarly Communication

– Katie Mika, Data Services Librarian …”

Dataverse and OpenDP: Tools for Privacy-Protective Analysis in the Cloud | Mercè Crosas

“When big data intersects with highly sensitive data, both opportunity to society and risks abound. Traditional approaches for sharing sensitive data are known to be ineffective in protecting privacy. Differential Privacy, deriving from roots in cryptography, is a strong mathematical criterion for privacy preservation that also allows for rich statistical analysis of sensitive data. Differentially private algorithms are constructed by carefully introducing “random noise” into statistical analyses so as to obscure the effect of each individual data subject. OpenDP is an open-source project for the differential privacy community to develop general-purpose, vetted, usable, and scalable tools for differential privacy, which users can simply, robustly and confidently deploy.

Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others’ work more easily. Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility.  A Dataverse repository is the software installation, which then hosts multiple virtual archives called Dataverses. Each dataverse contains datasets, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data).

This session examines ongoing efforts to realize a combined use case for these projects that will offer academic researchers privacy-preserving access to sensitive data. This would allow both novel secondary reuse and replication access to data that otherwise is commonly locked away in archives. The session will also explore the potential impact of this work outside the academic world.”
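
To make the “random noise” idea in the session description concrete, here is a minimal Python sketch of one textbook construction, the Laplace mechanism applied to a counting query. It is not the OpenDP API and not how Dataverse integrates with OpenDP; the function names, the sample ages and the epsilon value are assumptions for illustration. Naive floating-point noise like this is also known to be unsuitable for production use, which is one reason vetted libraries exist.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of the values that satisfy predicate.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for value in values if predicate(value))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Invented example data: ages of seven hypothetical data subjects.
ages = [23, 35, 41, 52, 67, 29, 48]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller epsilon means a larger noise scale and therefore stronger privacy but a less accurate answer; the noise averages out to zero, so aggregate statistics remain usable while any single individual's contribution is obscured.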

Dataverse Community Meeting 2020

“The annual Dataverse Community Meeting is an opportunity to build, grow, and enrich the global community. Like the open-source Dataverse product itself, the activities of the Dataverse Community Meetings are community-driven. Over three days of presentations, workshops, and working group meetings we aim to promote and learn about behavioral and technical solutions and standards for curating, sharing, and preserving data that can be discovered and reused across disciplines to reproduce and advance research.

The Dataverse Community Meeting is hosted by Harvard’s Institute for Quantitative Social Science. Learn more about The Dataverse Project at our dataverse.org site….”

Advancing computational reproducibility in the Dataverse data repository platform

Abstract:  Recent reproducibility case studies have raised concerns showing that much of the deposited research has not been reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate reproducibility due to the absence of a runtime environment needed for the code execution. New specialized reproducibility tools provide cloud-based computational environments for code encapsulation, thus enabling research portability and reproducibility. However, they do not often enable research discoverability, standardized data citation, or long-term archival like data repositories do. This paper addresses the shortcomings of data repositories and reproducibility tools and how they could be overcome to improve the current lack of computational reproducibility in published and archived research outputs.

COVID-19 Data Collection

“This is a general collection of COVID-19 data deposited in the Harvard Dataverse repository. The list in this collection is maintained by the Harvard Dataverse data curation team (IQSS and Harvard Library). Researchers who deposit their related data into Harvard Dataverse will have their data linked to this collection, to increase discoverability of their data. Please use the contact link if you have any questions about this collection.”
