Open-Science Guidance for Qualitative Research: An Empirically Validated Approach for De-Identifying Sensitive Narrative Data – Rebecca Campbell, McKenzie Javorka, Jasmine Engleton, Kathryn Fishwick, Katie Gregory, Rachael Goodman-Williams, 2023

Abstract:  The open-science movement seeks to make research more transparent and accessible. To that end, researchers are increasingly expected to share de-identified data with other scholars for review, reanalysis, and reuse. In psychology, open-science practices have been explored primarily within the context of quantitative data, but demands to share qualitative data are becoming more prevalent. Narrative data are far more challenging to de-identify fully, and because qualitative methods are often used in studies with marginalized, minoritized, and/or traumatized populations, data sharing may pose substantial risks for participants if their information can be later reidentified. To date, there has been little guidance in the literature on how to de-identify qualitative data. To address this gap, we developed a methodological framework for remediating sensitive narrative data. This multiphase process is modeled on common qualitative-coding strategies. The first phase includes consultations with diverse stakeholders and sources to understand reidentifiability risks and data-sharing concerns. The second phase outlines an iterative process for recognizing potentially identifiable information and constructing individualized remediation strategies through group review and consensus. The third phase includes multiple strategies for assessing the validity of the de-identification analyses (i.e., whether the remediated transcripts adequately protect participants’ privacy). We applied this framework to a set of 32 qualitative interviews with sexual-assault survivors. We provide case examples of how blurring and redaction techniques can be used to protect names, dates, locations, trauma histories, help-seeking experiences, and other information about dyadic interactions.


Navigating Risk in Vendor Data Privacy Practices – SPARC

“Produced in collaboration with Becky Yoose of LDH Consulting Services, Navigating Risk in Vendor Data Privacy Practices: An Analysis of Elsevier’s ScienceDirect documents a variety of data privacy practices that directly conflict with library privacy standards. The report raises important questions regarding the potential for personal data collected from academic products to be used in the data brokering and surveillance products of RELX’s LexisNexis subsidiary…”

Navigating Risk in Vendor Data Privacy Practices: An Analysis of Elsevier’s ScienceDirect

“Navigating Risk in Vendor Data Privacy Practices: An Analysis of Elsevier’s ScienceDirect documents a variety of data privacy practices that directly conflict with library privacy standards, and raises important questions regarding the potential for personal data collected from academic products to be used in the data brokering and surveillance products of RELX’s LexisNexis subsidiary.

By analyzing the privacy practices of the world’s largest publisher, the report describes how user tracking that would be unthinkable in a physical library setting now happens routinely through publisher platforms. The analysis underlines the concerns this tracking should raise, particularly when the same company is involved in surveillance and data brokering activities. Elsevier is a subsidiary of RELX, a leading data broker and provider of “risk” products that offer expansive databases of personal information to corporations, governments, and law enforcement agencies. 

As much of the research lifecycle shifts to online platforms owned by a small number of companies, the report highlights why users and institutions should actively evaluate and address the potential privacy risks as this transition occurs rather than after it is complete.”

SPARC Report Urges Action to Address Concerns with ScienceDirect Data Privacy Practices | SPARC

Today, SPARC released Navigating Risk in Vendor Data Privacy Practices: An Analysis of Elsevier’s ScienceDirect. Produced in collaboration with Becky Yoose of LDH Consulting Services, the report documents a variety of data privacy practices that directly conflict with library privacy standards, and raises important questions regarding the potential for personal data collected from academic products to be used in the data brokering and surveillance products of RELX’s LexisNexis subsidiary.



A social networking site is not an open access repository – Office of Scholarly Communication

“The simple answer is: ResearchGate and do not permit their users to take their own data and reuse it elsewhere, nor do their terms of service permit the library to extract that data on the authors’ behalf.

ResearchGate: “Users must not misuse the Service. Misuse of the Service includes, without limitation: … automated or massive manual retrieval of other Users’ profile data (‘data harvesting’).” “You agree not to do any of the following: … Attempt to access or search the Site, … through the use of any engine, software, tool, agent, device or mechanism (including spiders, robots, crawlers, data mining tools or the like).”…”

Healthcare research data sharing and academic journal: A cha… : Indian Journal of Anaesthesia

“To conclude, healthcare data are vast. Even though healthcare research data of published articles are minuscules, sharing such data can be helpful to develop evidence-based medicine early and efficiently. Academic journals can contribute to such a noble cause. Nevertheless, data sharing has a diverse pathway to travel and overcome geopolitical and ethical barriers beyond the technicality and intention of the researchers.”

OBIA: An Open Biomedical Imaging Archive – ScienceDirect

Abstract:  With the development of artificial intelligence (AI) technologies, biomedical imaging data play an important role in scientific research and clinical application, but the available resources are limited. Here we present Open Biomedical Imaging Archive (OBIA), a repository for archiving biomedical imaging and related clinical data. OBIA adopts five data objects (Collection, Individual, Study, Series, and Image) for data organization, accepts the submission of biomedical images of multiple modalities, organs, and diseases. In order to protect personal privacy, OBIA has formulated a unified de-identification and quality control process. In addition, OBIA provides friendly and intuitive web interface for data submission, browsing and retrieval, as well as image retrieval. As of September 2023, OBIA has housed data for a total of 937 individuals, 4136 studies, 24,701 series, and 1,938,309 images covering 9 modalities and 30 anatomical sites. Collectively, OBIA provides a reliable platform for biomedical imaging data management and offers free open access to all publicly available data to support research activities throughout the world. OBIA can be accessed at

Exchanging words: Engaging the challenges of sharing qualitative research data | PNAS

Abstract:  In January 2023, a new NIH policy on data sharing went into effect. The policy applies to both quantitative and qualitative research (QR) data such as data from interviews or focus groups. QR data are often sensitive and difficult to deidentify, and thus have rarely been shared in the United States. Over the past 5 y, our research team has engaged stakeholders on QR data sharing, developed software to support data deidentification, produced guidance, and collaborated with the ICPSR data repository to pilot the deposit of 30 QR datasets. In this perspective article, we share important lessons learned by addressing eight clusters of questions on issues such as where, when, and what to share; how to deidentify data and support high-quality secondary use; budgeting for data sharing; and the permissions needed to share data. We also offer a brief assessment of the state of preparedness of data repositories, QR journals, and QR textbooks to support data sharing. While QR data sharing could yield important benefits to the research community, we quickly need to develop enforceable standards, expertise, and resources to support responsible QR data sharing. Absent these resources, we risk violating participant confidentiality and wasting a significant amount of time and funding on data that are not useful for either secondary use or data transparency and verification.

Responsible data sharing: Identifying and remedying possible re-identification of human participants

Abstract:  Open data collected from humans creates a tension between scholarly values of transparency and sharing on the one hand, and privacy and security on the other. A common solution is to make datasets anonymous by removing personally identifying information before sharing. However, ostensibly anonymized datasets may be at risk of re-identification if they include demographic information. In the present article, we (a) review current privacy standards; (b) describe computer science data protection frameworks and their adaptability to the social sciences; (c) provide practical guidance for assessing and addressing re-identification risk; (d) introduce two open-source algorithms – MinBlur and MinBlurLite – to increase privacy while maintaining the integrity of open data; and (e) highlight aspects of ethical data sharing that require further attention. Technical innovations can support competing values so that science can be as open as possible to promote transparency and sharing, and as closed as necessary to maintain privacy and security.
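
To make the re-identification concern in this abstract concrete, here is a minimal sketch of one common way to assess risk: count how many records share each combination of demographic values (a k-anonymity style check), then coarsen a value and count again. This is not the MinBlur or MinBlurLite algorithm from the article. The function names, the field names, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical demographic fields treated as quasi-identifiers.
QUASI_IDENTIFIERS = ("age", "gender", "zip")


def group_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return k, the size of the smallest group of indistinguishable records."""
    return min(group_sizes(records, quasi_identifiers).values())


def at_risk_records(records, k=2, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return records whose quasi-identifier combination occurs fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records
            if sizes[tuple(r[q] for q in quasi_identifiers)] < k]


def blur_age(record, width=10):
    """Replace an exact age with a band such as '20-29' (a simple form of blurring)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Made-up sample data: every participant has a unique exact age.
records = [
    {"age": 23, "gender": "F", "zip": "48823"},
    {"age": 27, "gender": "F", "zip": "48823"},
    {"age": 24, "gender": "M", "zip": "48823"},
    {"age": 29, "gender": "M", "zip": "48823"},
    {"age": 61, "gender": "F", "zip": "48823"},
]

blurred = [blur_age(r) for r in records]

print("k before blurring:", smallest_group_size(records))
print("records still unique after blurring:", at_risk_records(blurred, k=2))
```

In this made-up sample, banding ages into decades groups four of the five records into pairs, but the single participant in the 60-69 band remains unique. A record like that would need further coarsening or suppression before sharing, at some cost to the usefulness of the data.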


Data sharing implementation in top 10 ophthalmology journals in 2021 | BMJ Open Ophthalmology

Abstract:  Background/Aims Deidentified individual participant data (IPD) sharing has been implemented in the International Committee of Medical Journal Editors journals since 2017. However, there were some published clinical trials that did not follow the new implemented policy. This study examines the number of clinical trials that endorsed IPD sharing policy among top ophthalmology journals.

Method All published original articles in 2021 in 10 highest-ranking ophthalmology journals according to the 2020 journal impact factor were included. Clinical trials were determined by the WHO definition of clinical trials. Each article was then thoroughly searched for the IPD sharing statement either in the manuscript or in the clinical trial registry. We collected the number of published clinical trials that implemented IPD sharing policy as our primary outcome.

Results 1852 published articles in top 10 ophthalmology journals were identified, and 9.45% were clinical trials. Of these clinical trials, 44% had clinical trial registrations and 49.14% declared IPD sharing statements. Only 42 (48.83%) clinical trials were willing to share IPD, and 5 (10.21%) of these share IPD via an online repository platform. In terms of sharing period, 37 clinical trials were willing to share right after the publication and only 2 showed the ending of sharing period.

Conclusion This report shows that the number of clinical trials in top ophthalmology journals that endorsed the IPD sharing policy and the number of registrations is lower than half even though the policy has been implemented for several years. Future updates are necessary as policy evolves.

Habits and perceptions regarding open science by researchers from Spanish institutions | PLOS ONE

Abstract:  The article describes the results of the online survey on open science (OS) carried out on researchers affiliated with universities and Spanish research centres and focused on open access to scientific publications, the publication process, the management of research data and the review of open articles. The main objective was to identify the perception and habits of researchers with regard to practices closely linked to open science and the scientific value added is that it offers an in-depth picture of researchers as one of the main actors to whom this transformation and implementation of open science will fall. It focuses on the different aspects of OS: open access, open data, publication process and open review in order to identify habits and perceptions. This is to make possible an implementation of the OS movement. The survey was carried out among researchers who had published in the years 2020–2021, according to data obtained from WoS. It was emailed to a total of 8,188 researchers and obtained a total of 666 responses, of which 554 were complete, the rest being forms with some questions unanswered. The main results showed that open access still requires the diffusion of practices and services provided by the institution, as well as training (library or equivalent service) and institutional support from the competent authorities (vice rectors or equivalent) in specific aspects such as data management. In the case of data, around 50% of respondents stated they had stored data in a repository, and of all the options, the most frequently given was that of an institutional repository, followed by a discipline repository. Among the main reasons for doing this, we found transparency, visibility of data and the ability to validate results. For those who stated they had never stored data, the most frequent reasons for not having done so were privacy and confidentiality, the lack of a mandated data policy or a lack of knowledge of how to do it. 
In terms of open peer review, participants mentioned a certain reticence to the opening of evaluations due to potential conflicts of interest that may arise or because lower-quality content might be accepted in order to avoid conflicts. In addition, the hierarchical structure of senior researcher versus junior researcher might affect reviews. The main conclusions indicate a need for persuasion of OA to take place; APCs are an economic barrier rather than the main criterion for journal selection; OPR practices may seem innovative and emerging; scientific and evaluation policies seem to have a clear effect on the behaviour of researchers; researchers state that they share research data more for reasons of persuasion than out of obligation. Researchers do question the pathways or difficulties that may arise on a day-to-day basis and seem aware that we are undergoing change, where academic evaluation or policies related to open science, its implementation and habits among researchers may change. In this sense, more and better support is needed on the part of institutions and faculty support services.


The Platformisation of Scholarly Information and How to Fight It | LIBER Quarterly: The Journal of the Association of European Research Libraries

Abstract:  The commercial control of academic publishing and research infrastructure by a few oligopolistic companies has crippled the development of open access movement and interfered with the ethical principles of information access and privacy. In recent years, vertical integration of publishers and other service providers throughout the research cycle has led to platformisation, characterized by datafication and commodification similar to practices on social media platforms. Scholarly publications are treated as user-generated contents for data tracking and surveillance, resulting in profitable data products and services for research assessment, benchmarking and reporting. Meanwhile, the bibliodiversity and equal open access are denied by the dominant gold open access model and the privacy of researchers is being compromised by spyware embedded in research infrastructure. This article proposes four actions to fight the platformisation of scholarly information after a brief overview of the market of academic journals and research assessments and their implications for bibliodiversity, information access, and privacy: (1) Educate researchers about commercial publishers and APCs; (2) Allocate library budget to support scholar-led and library publishing; (3) Engage in the development of public research infrastructures and copyright reform; and (4) Advocate for research assessment reforms.


Frontiers | Toward an open access genomics database of South Africans: ethical considerations

Abstract:  Genomics research holds the potential to improve healthcare. Yet, a very low percentage of the genomic data used in genomics research internationally relates to persons of African origin. Establishing a large-scale, open access genomics database of South Africans may contribute to solving this problem. However, this raises various ethics concerns, including privacy expectations and informed consent. The concept of open consent offers a potential solution to these concerns by (a) being explicit about the research participant’s data being in the public domain and the associated privacy risks, and (b) setting a higher-than-usual benchmark for informed consent by making use of the objective assessment of prospective research participants’ understanding. Furthermore, in the South African context—where local culture is infused with Ubuntu and its relational view of personhood—community engagement is vital for establishing and maintaining an open access genomics database of South Africans. The South African National Health Research Ethics Council is called upon to provide guidelines for genomics researchers—based on open consent and community engagement—on how to plan and implement open access genomics projects.


Principles of Diamond Open Access Publishing: a draft proposal | Plan S

“The Action Plan for Diamond Open Access outlines a set of priorities to develop sustainable, community-driven, academic-led and -owned scholarly communication. Its goal is to create a global federation of Diamond Open Access (Diamond OA) journals and platforms around shared principles, guidelines, and quality standards while respecting their cultural, multilingual and disciplinary diversity. It proposes a definition of Diamond OA as a scholarly publication model in which journals and platforms do not charge fees to either authors or readers. Diamond OA is community-driven, academic-led and -owned, and serves a wide variety of generally small-scale, multilingual, and multicultural scholarly communities. 

Still, Diamond OA is often seen as a mere business model for scholarly publishing: no fees for authors or readers. However, Diamond OA can be better characterized by a shared set of values and principles that go well beyond the business aspect. These distinguish Diamond OA communities from other approaches to scholarly publishing. It is therefore worthwhile to spell out these values and principles, so they may serve as elements of identification for Diamond OA communities. 

The principles formulated below are intended as a first draft. They are not cast in stone, and meant to inspire discussion and evolve as a living document that will crystallize over the coming months. Many of these principles are not exclusive to Diamond OA communities. Some are borrowed or adapted from the more general 2019 Good Practice Principles for scholarly communication services defined by Sparc and COAR, or go back to the 2016 Vienna Principles. Others have been carefully worked out in more detail by the FOREST Framework for Values-Driven Scholarly Communication in a self-assessment format for scholarly communities. Additional references can be added in the discussion.

The formulation of these principles has benefited from many conversations over the years with various members of the Diamond community now working together in the Action Plan for Diamond Open Access, cOAlition S, the CRAFT-OA and DIAMAS projects, the Fair Open Access Alliance (FOAA), Linguistics in Open Access (LingOA), the Open Library of Humanities, OPERAS, SciELO, Science Europe, and Redalyc-Amelica. This document attempts to embed these valuable contributions into principles defining the ethos of Diamond OA publishing….”