The connection of open science practices and the methodological approach of researchers | SpringerLink

Abstract: The Open Science movement is gaining tremendous popularity and tries to initiate changes in science, for example the sharing and reuse of data. The new requirements that come with Open Science pose several challenges for researchers. While most of these challenges have already been addressed in several studies, little attention has been paid so far to the underlying Open Science practices (OSP). An exploratory study was conducted focusing on the OSP relating to sharing and using data. Thirteen researchers from the Weizenbaum Institute were interviewed. The Weizenbaum Institute is an interdisciplinary research institute in Germany that was founded in 2017. To reconstruct OSP, a grounded theory methodology (Strauss in Qualitative Analysis for Social Scientists, Cambridge University Press, Cambridge, 1987) was used, and OSP were classified into open production, open distribution and open consumption (Smith in Openness as social praxis. First Monday, 2017). The research shows that apart from the disciplinary background and research environment, the methodological approach and the type of research data play a major role in the context of OSP. The interviewees’ self-attributions related to the types of data they work with: qualitative, quantitative, social media and source code. With regard to the methodological approach and type of data, it was uncovered that uncertainties and missing knowledge, data protection, competitive disadvantages, vulnerability and costs are the main reasons for the lack of openness. The analyses further revealed that knowledge and established data infrastructures as well as competitive advantages act as drivers for openness. Because of the link between research data and OSP, the authors of this paper argue that in order to promote OSP, the methodological approach and the type of research data must also be considered.


Evaluating the (in)accessibility of data behind papers in astronomy

Abstract:  This paper presents results of a survey of authors of journal articles published over several decades in astronomy. The study focuses on determining the characteristics and accessibility of data behind papers, referring to the spectrum of raw and derived data that would be needed to validate the results of a particular published article as a capsule of scientific knowledge. Curating the data behind papers can arguably lead to new discoveries through reuse. However, as shown through related research and confirmed by the results of the present study, a fully accessible portrait of the data behind papers is often unavailable. These findings have implications for reusability efforts and are presented alongside a discussion of open science.

Hello World, From Wikimedia Enterprise | 21 Jun 2022

“We launched Wikimedia Enterprise last year with a goal of making it easy to programmatically access data from across the Wikimedia Foundation projects. Since then, we have been busy building a product that can serve the needs of commercial users of any size. Today, we are thrilled to share some of the first customers using this product, in addition to new features that make it easy for anyone to start using Wikimedia Enterprise.

Today, we are excited to announce that:

- Google has become the very first customer of Wikimedia Enterprise.
- The Internet Archive will receive full access to Enterprise’s feature set, at no cost, for use in furthering their mission of archiving the Web.
- Self-service trial accounts are available to anyone to try out Wikimedia Enterprise for their own use. Trial accounts include unlimited free access to a monthly snapshot of the entire Wikimedia Enterprise project archive and 10,000 free requests from our On-Demand API.
- New product and pricing details are now available, including a pricing calculator to estimate usage cost after a trial, as well as comprehensive product documentation, and a customer service portal with detailed FAQs.

We have also added a news page (you are reading it!) to better communicate updates and announcements to current and potential customers….”

A survey of researchers’ code sharing and code reuse practices, and assessment of interactive notebook prototypes [PeerJ]

Abstract: This research aimed to understand the needs and habits of researchers in relation to code sharing and reuse; gather feedback on prototype code notebooks created by NeuroLibre; and help determine strategies that publishers could use to increase code sharing. We surveyed 188 researchers in computational biology. Respondents were asked about how often and why they look at code, which methods of accessing code they find useful and why, what aspects of code sharing are important to them, and how satisfied they are with their ability to complete these tasks. Respondents were asked to look at a prototype code notebook and give feedback on its features. Respondents were also asked how much time they spent preparing code and if they would be willing to increase this to use a code sharing tool, such as a notebook. For readers of research articles, the most common reason (70%) for looking at code was to gain a better understanding of the article. The most commonly encountered method for code sharing, linking articles to a code repository, was also the most useful method of accessing code from the reader’s perspective. As authors, the respondents were largely satisfied with their ability to carry out tasks related to code sharing. The most important of these tasks were ensuring that the code was running in the correct environment, and sharing code with good documentation. The average researcher, according to our results, is unwilling to incur additional costs (in time, effort or expenditure) that are currently needed to use code sharing tools alongside a publication. We infer this means we need different models for funding and producing interactive or executable research outputs if they are to reach a large number of researchers. For the purpose of increasing the amount of code shared by authors, PLOS Computational Biology is, as a result, focusing on policy rather than tools.


Study on EU copyright and related rights and access to and reuse of data – Publications Office of the EU

European Commission, Directorate-General for Research and Innovation, Senftleben, M., Study on EU copyright and related rights and access to and reuse of data, Publications Office of the European Union, 2022.

EU legislation in the field of copyright, related rights and sui generis database rights can have a deep impact on access to data resources for scientific research and the availability of data resulting from publicly funded research. To establish a copyright and related rights framework that offers appropriate data access and reuse opportunities for scientific research, it is necessary to identify potential barriers and challenges that may arise from EU copyright and related rights legislation and corresponding rights management. This study analyses the interaction between copyright and related rights law and data access and reuse for scientific research purposes. It proposes legislative and non-legislative measures to improve the current EU regulatory framework.

Adema & Kiesewetter (2022) Re-use and/as Re-writing | Community-led Open Publication Infrastructures for Monographs (COPIM)

Depending on the type of open licence, open access publications allow for the re-use of already published content. In addition to this, collaborative editing and writing tools enable further engagement with and around published works by (communities of) authors. The interactive and collaborative potential of open books can add further value and new avenues and formats that go beyond the more obvious benefits of open access, such as, for example, enhancing the discovery and online consultation (Snijder, 2019) of scholarly publications. 

Re-use can take different forms, being highly context-specific. Imagine, for example, a collage text entirely composed of text snippets, or a remix in which two existing texts are woven together in the fashion of a parallel montage. Re-use mobilises combinatorial creativity, or the process of combining existing ideas to produce something new, that can be perceived as a critique of the idea of the original genius, or, in the context of academia, of the single liberal humanist author (Popova, 2011). Re-use might also involve creating new communities and conversations around already existing books and texts, for example by means of gathering together comments and annotations, and adding hyperlinks. It can additionally foster experimentation with more social and open forms of performing humanities scholarship and scholarly interaction with and around books: for example, through open peer review and networked books. Other forms of re-use can be directed towards the updating, translating, modifying, reviewing, versioning, and forking of existing books. Combinatorial Books will experiment with such possibilities in theory and practice in order to stimulate, explore, and practice the full range of social book interactions made possible by open access. As such, it aims to promote the reuse of open access books as part of a workflow that enables the creation of new publications out of existing ones. Engaging with re-use in this way implies the adaptation of existing workflows, systems, practices, and licensing. However, these can be, as we hope to show in this series of blogposts, relatively simple, low-key adaptations that do not have to be labour- and cost-intensive and do not necessarily require advanced technological expertise.



Toward a definition of digital object reuse | Emerald Insight

Abstract: Purpose

The purpose of this paper is to present conceptual definitions for digital object use and reuse. Typically, assessment of digital repository content struggles to go beyond traditional usage metrics such as clicks, views or downloads. This is problematic for galleries, libraries, archives, museums and repositories (GLAMR) practitioners because use assessment does not tell a nuanced story of how users engage with digital content and objects.

Design/methodology/approach

This paper reviews prior research and literature aimed at defining use and reuse of digital content in GLAMR contexts and builds off of this group’s previous research to devise a new model for defining use and reuse called the use-reuse matrix.

Findings

This paper presents the use-reuse matrix, which visually represents eight categories and numerous examples of use and reuse. Additionally, the paper explores the concept of “permeability” and its bearing on the matrix. It concludes with the next steps for future research and application in the development of the Digital Content Reuse Assessment Framework Toolkit (D-CRAFT).

Practical implications

The authors developed this model and definitions to inform D-CRAFT, an Institute of Museum and Library Services National Leadership Grant project. This toolkit is being developed to help practitioners assess reuse at their own institutions.

Originality/value

To the best of the authors’ knowledge, this paper is one of the first to propose distinct definitions that describe and differentiate between digital object use and reuse in the context of assessing digital collections and data.

Slow improvement to the archiving quality of open datasets shared by researchers in ecology and evolution | Proceedings of the Royal Society B: Biological Sciences

Abstract:  Many leading journals in ecology and evolution now mandate open data upon publication. Yet, there is very little oversight to ensure the completeness and reusability of archived datasets, and we currently have a poor understanding of the factors associated with high-quality data sharing. We assessed 362 open datasets linked to first- or senior-authored papers published by 100 principal investigators (PIs) in the fields of ecology and evolution over a period of 7 years to identify predictors of data completeness and reusability (data archiving quality). Datasets scored low on these metrics: 56.4% were complete and 45.9% were reusable. Data reusability, but not completeness, was slightly higher for more recently archived datasets and PIs with less seniority. Journal open data policy, PI gender and PI corresponding author status were unrelated to data archiving quality. However, PI identity explained a large proportion of the variance in data completeness (27.8%) and reusability (22.0%), indicating consistent inter-individual differences in data sharing practices by PIs across time and contexts. Several PIs consistently shared data of either high or low archiving quality, but most PIs were inconsistent in how well they shared. One explanation for the high intra-individual variation we observed is that PIs often conduct research through students and postdoctoral researchers, who may be responsible for the data collection, curation and archiving. Levels of data literacy vary among trainees and PIs may not regularly perform quality control over archived files. Our findings suggest that research data management training and culture within a PI’s group are likely to be more important determinants of data archiving quality than other factors such as a journal’s open data policy. Greater incentives and training for individual researchers at all career stages could improve data sharing practices and enhance data transparency and reusability.


CHORUS Forum: Making FAIR’s Interoperability and Reusability Data Goals Possible – CHORUS

“Since their publication in 2016, the FAIR Data Guiding Principles (Findable, Accessible, Interoperable, Reusable) have become clear and enabling goals to work towards in any work or policies around data.  As metrics have been developed around them, FAIR has become a set of tangible targets for funders, publishers, institutions and researchers to aim at and to be measured against.  Unsurprisingly, considerable work has gone into practically enabling all to achieve these.  For many (not all) it has become reasonable to expect data to be published in a manner that results in it being both Findable and Accessible. 

However, making these data furthermore Interoperable and Reusable remains a significant challenge.  This AGU / CHORUS Forum will have government, funder, academic institution, and scientific society stakeholders outline why I & R are proving so challenging, and will showcase some of the partial solutions being put into practice.  Finally, the audience will be invited to comment on at least one new promising solution. …”

Global Community Guidelines for Documenting, Sharing, and Reusing Quality Information of Individual Digital Datasets

Open-source science builds on open and free resources that include data, metadata, software, and workflows. Informed decisions on whether and how to (re)use digital datasets are dependent on an understanding about the quality of the underpinning data and relevant information. However, quality information, being difficult to curate and often context specific, is currently not readily available for sharing within and across disciplines. To help address this challenge and promote the creation and (re)use of freely and openly shared information about the quality of individual datasets, members of several groups around the world have undertaken an effort to develop international community guidelines with practical recommendations for the Earth science community, collaborating with international domain experts. The guidelines were inspired by the guiding principles of being findable, accessible, interoperable, and reusable (FAIR). Use of the FAIR dataset quality information guidelines is intended to help stakeholders, such as scientific data centers, digital data repositories, and producers, publishers, stewards and managers of data, to: i) capture, describe, and represent quality information of their datasets in a manner that is consistent with the FAIR Guiding Principles; ii) allow for the maximum discovery, trust, sharing, and reuse of their datasets; and iii) enable international access to and integration of dataset quality information. This article describes the processes that developed the guidelines that are aligned with the FAIR principles, presents a generic quality assessment workflow, describes the guidelines for preparing and disseminating dataset quality information, and outlines a path forward to improve their disciplinary diversity.

An open science argument against closed metrics

“In the Open Scientist Handbook, I argue that open science supports anti-rivalrous science collaborations where most metrics are of little, or of negative value. I would like to share some of these arguments here….

Institutional prestige is a profound drag on the potential for networked science. If your administration has a plan to “win” the college ratings game, this plan will only make doing science harder. It makes being a scientist less rewarding. Playing finite games of chasing arbitrary metrics or ‘prestige’ drags scientists away from the infinite play of actually doing science….

As Cameron Neylon said at the metrics breakout of the ‘Beyond the PDF’ conference some years ago, “reuse is THE metric.” Reuse reveals and confirms the advantage that open sharing has over current, market-based, practices. Reuse validates the work of the scientist who contributed to the research ecosystem. Reuse captures more of the inherent value of the original discovery and accelerates knowledge growth….”

How to reuse & share your knowledge as you wish through Rights Retention – YouTube

“In 2020 cOAlition S released its Rights Retention Strategy (RRS) with the dual purpose of enabling authors to retain rights that automatically belong to the author, and to enable compliance with their funders’ Open Access policy via dissemination in a repository.

This video explains briefly the steps a researcher has to follow to retain their intellectual property rights….”

The doors of precision: Reenergizing psychiatric drug development with psychedelics and open access computational tools

“In a truly remarkable way, the study was performed at essentially no additional cost. Ballentine et al. (3) made use of existent, openly available resources: the Erowid psychedelic “experience vault,” the pharmacokinetic profiles of each psychedelic, the Allen Human Brain gene transcription profiles, and the Schaefer-Yeo brain atlas that mapped gene transcript to brain structure. The computational tools, primarily Python toolboxes, that Ballentine et al. deployed were also available at no cost. So in the same way that the psychedelics industry is repurposing old drugs, Ballentine et al. repurposed old data and tools to define a new framework….”