New Resources Available on Protecting Participant Privacy When Sharing Scientific Data – NIH Extramural Nexus

“NIH’s scientific data sharing site now offers information and resources on the following topics:

Principles and Best Practices for Protecting Participant Privacy
Designating Scientific Data for Controlled Access
Considerations for Obtaining Informed Consent
Considerations for Researchers Working with American Indian/Alaska Native (AI/AN) Communities…”

Researchers Forget to Report How to Share Data From Studies Published in Spanish Medical Journals – ScienceDirect

“Some time ago, Archivos de Bronconeumología reported on a radical turnabout by the ICMJE: after announcing in 2016 that they would require clinical trial researchers to share individual-level anonymized participant data with third parties, in 2017 they decided that such transfer would be voluntary.4 The news had a precedent in the Recommendations published a few years earlier, to the effect that some journal editors “ask authors to say whether the study data are available to third parties to view and/or use/reanalyze, while still others encourage or require authors to share their data with others for review or reanalysis”.1 It would be interesting to know which Spanish journals have included this requirement in their ‘instructions for authors’ and whether they comply with it.

To answer this question, we reviewed the portals of 24 Spanish journals with an impact factor greater than 1, on the understanding that they have greater influence than those with an impact factor ≤1 and those with no impact factor. Of these 24, 14 are included in the list of ICMJE Recommendations (Supplementary material A). Of these, only 5 (Archivos de Bronconeumología, Atención Primaria, Enfermedades Infecciosas y Microbiología Clínica, Gaceta Sanitaria, and Medicina Intensiva) include a specific section, which we shall call “link to data repository”, that recommends, supports and encourages authors to share raw data from their studies with other researchers, and gives instructions on how to go about it. A sixth journal, the Revista de Neurología, recommends this procedure only for clinical trials (Supplementary material B). To determine the frequency with which authors report how data can be accessed compared to other requirements requested by the same journals, 2 control requirements were selected: reporting on conflicts of interest and study funding, which were included in the Recommendations much earlier. It is also of interest to determine whether supplementary material may be included online, as this is sometimes a way of including raw study data….

Sharing data from quantitative studies is much easier than from qualitative studies. Researchers performing qualitative studies frequently cite the lack of authorization of the participants, the sensitive nature of the data, and loss of confidentiality as reasons for not sharing data.6 However, qualitative studies are the exception among Spanish medical publications. By 2011, most researchers were already sharing their data, although this was challenging for more than a third of them; in the case of clinical trials, it has recently been reported that access7 to data is difficult despite authors’ commitment to share.8 Ideally, Spanish medical journals should require authors to share them in all the articles they publish, and if data sharing is impossible, to explain why.”

Data sharing and management in 21st century cancer research – Ryan – Molecular Oncology – Wiley Online Library

Abstract:  A central facet of scientific endeavour is that we must share our discoveries. Even in the ‘Ivory Tower’ science of previous centuries where scientists would often work alone, they would still disseminate their findings by attending events at learned societies and by publishing their work in scientific journals; otherwise, one could argue, if the discovery was not recorded, how do we know it happened? To this day, things are much the same, but also very different. We still need to meet and discuss our science at conferences and other events, so that we can share our ideas and early data to facilitate the progress of discovery. This also enables us to communicate with those whom we do not on a daily basis, e.g. through events that bring together basic scientists with those who undertake more translational research or are involved in clinical trials. In addition, we still need to report our findings in peer-reviewed journals. The pathways to share data and information are though more varied with people choosing to share their findings via social media platforms, webinars and other forms of digital media. The digital era has also changed the way we can ask scientific questions, with the ability to generate and analyse very large sets of data that have enormous power to make discoveries that were previously not possible. While the conclusions from these new large data studies can be easily communicated, the ability to manage and share the data, and more so, the metadata behind these studies are currently a burgeoning problem for many areas of research, including cancer research. This is not only a technical problem in handling the huge amounts of data involved, but in some cases also a legal problem when factors such as General Data Protection Regulations (GDPR) need to be considered.




EU Parliamentary Committee advising against US data pact | The Register

Lawmakers in the European Parliament have urged the European Commission not to issue the “adequacy decision” needed for the EU-US Data Privacy Framework (DPF) to officially become the pipeline for data to freely flow from the EU to the States.

It almost goes without saying that the current operation of the technology sector in Europe would not work without US tech companies’ services – so data transfers to these American corporations cannot practicably be avoided. However, European rules around privacy, data collection, and data subjects’ rights are considerably stronger than those in America, hence the need for rules of engagement that make US companies’ treatment of EU data as good as what they’d get at home.

The DPF was announced in March last year and is meant to address concerns raised by the EU’s Court of Justice in Schrems II, a 2020 case that struck down the so-called Privacy Shield data protection arrangements between the political bloc and the US.

EU president Ursula von der Leyen and US president Joe Biden said they’d reached an agreement in principle on the framework for transatlantic data flows at the time, with Biden signing an executive order (EO) on the matter in October last year.

But the European Parliament’s Committee on Civil Liberties, Justice and Home Affairs (LIBE) is still not happy with what it sees, and has put out a nonbinding draft opinion on how adequate it thinks the protection given by the proposed cross-border data rules is. In short: it ain’t.




Next steps for preprint review infrastructure | Naomi Penfold, Feb 7, 2023 | Invest in Open Infrastructure

“…This meeting brought together preprint review initiative leads with funders, publishers, and researchers to discuss policies and practices that could encourage the adoption and development of preprint review in biology. There are different views on the future for preprint review: as a replacement for journals, a complement to the existing system, and/or a training exercise to grow and diversify the reviewer pool. Overall, this meeting highlighted the opportunity to use preprints to build a more collegial and constructive culture of peer review (than that typically experienced at journals). While the focus of the meeting was on policies and practices to encourage the adoption of preprint review, including how to incentivize researchers to contribute reviews, we noted some specific needs and gaps to consider in relation to investing in open infrastructure:

Efforts to encourage adoption need funding. As well as investing in the technical infrastructure enabling preprint review, we heard the call for funders to support initiatives that encourage scholars to try out preprint review, and that nurture the envisaged culture of collegiality. This support could be provided directly by funders through programmes for the scholars they fund and indirectly through investment in adoption-focussed projects by initiatives. In particular, in this nascent phase of preprint review, now is an opportune moment to fund initiatives focussed on improving diversity and inclusion in the scholarly communications process.
Preprint servers will need to evolve alongside preprint review initiatives to support a seamless experience for scholars. If preprint review is to be seen as a trusted and valuable contribution, and something worthwhile for researchers to read and use, it will be important to communicate its value clearly from the points at which researchers interact with preprints. Major points of interaction today are through two of the largest preprint servers for the life sciences, bioRxiv and medRxiv. We heard several users report how preprint reviews are not easy to find on the current site design, and that the banner on each preprint stating it has not been reviewed can be misleading. The banner text for preprints that have received reviews has recently been updated to read “This is a preprint. It has not been certified by a journal but peer reviews are available”. We also heard the rationale behind current design decisions at bioRxiv. We think it will be important for preprint server(s) and review services to continue to improve their user experience and design to meet the evolving needs of users. Several technology requirements for preprints as a whole, including review services, have already been noted. Drawing upon the ethos of open source development here, it may be helpful for preprint infrastructure funders to nurture an ecosystem that centers the needs of a diverse research community in design processes.
It’s too early for preprint review initiatives to have a plan for financial sustainability….”

Love Data Week 2023 at Harvard Library | Harvard Library Research Data Services

“Policy change, environmental change, social change… we can move mountains with the right data guiding our decisions. This year, we are focused on helping new and seasoned data users find data training and other resources that can help move the needle on the issues they care about. Data: Agent of Change.

If you haven’t participated before, International Love Data Week is the celebration of data. Love Data Week is dedicated to spreading awareness of the importance of research data management, sharing, preservation, and reuse. Research data are the foundation of the scholarly record and crucial for changing the world around us.

Join the Harvard Library community for a week of events focused on how we can share and use data to bring about changes that matter.”

Open science is in the interest of all professionals

“Open science is in the interest of all professionals working in epilepsy care and patients. At the same time, we do have some challenges with open science within our field. For example, it clashes with patient-related data that cannot be shared due to privacy laws, and sometimes also with the interests of entrepreneurs who supply institutions with equipment/software.

We chose because it offers full professional support in making both new issues and the Epilepsie archive open access. We are proud that, as the Nederlandse Liga tegen Epilepsie, we are now using the most widely used open source publishing platform for scientific journals.”

Finding The ‘Rights’ Balance: 7 ideas to harmonize debates… | by Open Data Charter | opendatacharter | Jan, 2023 | Medium

“The importance of the right to access to information and the right to personal data protection, both of which are fundamental human rights, raises the need to strike a balance to be able to exercise both in a complementary manner, that ensures that one of them does not jeopardize the guarantees enshrined by the other. There are different proposals and techniques that allow us to continue working on open data for a more democratic and just society, respecting the right to privacy. We explore some ideas below….”

An iterative and interdisciplinary categorisation process towards FAIRer digital resources for sensitive life-sciences data | Scientific Reports

Abstract:  For life science infrastructures, sensitive data generate an additional layer of complexity. Cross-domain categorisation and discovery of digital resources related to sensitive data presents major interoperability challenges. To support this FAIRification process, a toolbox demonstrator aiming at support for discovery of digital objects related to sensitive data (e.g., regulations, guidelines, best practice, tools) has been developed. The toolbox is based upon a categorisation system developed and harmonised across a cluster of 6 life science research infrastructures. Three different versions were built, tested by subsequent pilot studies, finally leading to a system with 7 main categories (sensitive data type, resource type, research field, data type, stage in data sharing life cycle, geographical scope, specific topics). 109 resources attached with the tags in pilot study 3 were used as the initial content for the toolbox demonstrator, a software tool allowing searching of digital objects linked to sensitive data with filtering based upon the categorisation system. Important next steps are a broad evaluation of the usability and user-friendliness of the toolbox, extension to more resources, broader adoption by different life-science communities, and a long-term vision for maintenance and sustainability.


Opinion: Why we’re becoming a Digital Public Good — and why we aren’t | Devex

“A few months ago, Medtronic LABS made the decision to open source our digital health platform SPICE, and pursue certification as a Digital Public Good. DPGs are defined by the Digital Public Good Alliance as: “Open-source software, open data, open AI models, open standards, and open content that adhere to privacy and other applicable laws and best practices, do no harm by design, and help attain the Sustainable Development Goals.” The growing momentum around DPGs in global health is relatively new, coinciding with the launch of the U.N. Secretary General’s Roadmap for Digital Cooperation in 2020. The movement aims to put governments in the driver’s seat, promote better collaboration among development partners, and reduce barriers to the digitization of health systems.”

Investigating the dimensions of students’ privacy concern in the collection, use, and sharing of data for learning analytics

Abstract:  The datafication of learning has created vast amounts of digital data which may contribute to enhancing teaching and learning. While researchers have successfully used learning analytics, for instance, to improve student retention and learning design, the topic of privacy in learning analytics from students’ perspectives requires further investigation. Specifically, there are mixed results in the literature as to whether students are concerned about privacy in learning analytics. Understanding students’ privacy concern, or lack of privacy concern, can contribute to successful implementation of learning analytics applications in higher education institutions. This paper reports on a study carried out to understand whether students are concerned about the collection, use and sharing of their data for learning analytics, and what contributes to their perspectives. Students in a laboratory session (n = 111) were shown vignettes describing data use in a university and an e-commerce company. The aim was to determine students’ concern about their data being collected, used and shared with third parties, and whether their concern differed between the two contexts. Students’ general privacy concerns and behaviours were also examined and compared to their privacy concern specific to learning analytics. We found that students in the study were more comfortable with the collection, use and sharing of their data in the university context than in the e-commerce context. Furthermore, these students were more concerned about their data being shared with third parties in the e-commerce context than in the university context. Thus, the study findings contribute to deepening our understanding about what raises students’ privacy concern in the collection, use and sharing of their data for learning analytics. We discuss the implications of these findings for research on and the practice of ethical learning analytics.

Data Sharing for Qualitative Research

“Researchers in the U.S. and elsewhere are increasingly expected to have data management and sharing plans. Recently, the White House released a memo outlining upcoming changes to data sharing policies at the federal level, and other organizations like the National Institutes of Health (NIH) have already planned changes for early 2023. Qualitative and mixed methods researchers have expressed some hesitation about these guidelines, which are not always clearly or equally applicable to qualitative research. Their concerns include what counts as data (transcriptions, field notes, lab meeting notes), how to properly explain data sharing to participants, and how to minimize the risk of reidentification.

Join us for a discussion on concerns around data sharing for qualitative and mixed-methods research. Ask any questions you might have as you prepare your own data management and sharing plans in alignment with federal guidelines, professional ethical guidelines in your field, and your own ethical codes.”

The evolving role of research ethics committees in the era of open data | South African Journal of Bioethics and Law

Abstract:  While open science gains prominence in South Africa with the encouragement of open data sharing for research purposes, there are stricter laws and regulations around privacy – and specifically the use, management and transfer of personal information – to consider. The Protection of Personal Information Act No. 4 of 2013 (POPIA), which came into effect in 2021, established stringent requirements for the processing of personal information and has changed the regulatory landscape for the transfer of personal information across South African borders. At the same time, draft national policies on open science encourage wide accessibility to data and open data sharing in line with international best practice. As a result, the operation of research ethics committees (RECs) in South Africa is affected by the conflicting demands of the shift towards open science on the one hand, and the stricter laws protecting participants’ personal information and the transfer thereof, on the other. This article explores the continuing evolving role of RECs in the era of open data and recommends the development of a data transfer agreement (DTA) for the ethical management of personal health information, considering the challenges that RECs encounter, which centres predominantly on privacy, data sharing and access concerns following advances in genetic and genomic research and biobanking.


Data for Good Can’t be a Casualty of Tech Restructuring  • CrisisReady

“Technology companies like Meta, Twitter and Amazon are laying off thousands of employees as part of corporate restructuring in an uncertain global economy. In addition to jobs, many internal programs deemed unnecessary or financially infeasible may be lost. Programs that fall under the rubric of “corporate social responsibility” (CSR) are generally the first casualties of restructuring. CSR efforts include “data for good” programs designed to translate anonymized corporate data into social good and may be seen in the current climate as a way that companies cater to employee values or enable friendlier regulatory environments; in other words, nice-to-haves rather than need-to-haves for the bottom line.  

We believe the platforms built to safely and ethically share corporate data to support public policy are not a luxury that companies should jettison or monetize. The data we produce in our daily lives has become integral to how public decisions are made while planning for public health or disaster response. Our 21st century public data ecosystem is increasingly reliant on novel private data streams that corporations own and currently share only conditionally and increasingly, for profit….

We contend that the rapid sharing of aggregated and anonymized location data with disaster response and public health agencies should be automatic and free — though conditional on strict privacy protocols and time-limited — during acute emergencies….

While the challenges to realizing the full value of private data for public good are many, there is precedent for a path forward. Two decades ago, the International Space Charter was negotiated to facilitate access to satellite data from companies and governments for the sake of responding to major disasters. A similar approach guaranteeing access rights to privately held data for good during emergencies is more important now….”

A Python library to check the level of anonymity of a dataset | Scientific Data

Abstract:  Openly sharing data with sensitive attributes and privacy restrictions is a challenging task. In this document we present the implementation of pyCANON, a Python library and command line interface (CLI) to check and assess the level of anonymity of a dataset through some of the most common anonymization techniques: k-anonymity, (α,k)-anonymity, ℓ-diversity, entropy ℓ-diversity, recursive (c,ℓ)-diversity, t-closeness, basic β-likeness, enhanced β-likeness and δ-disclosure privacy. For the case of more than one sensitive attribute, two approaches are proposed for evaluating these techniques. The main strength of this library is to obtain a full report of the parameters that are fulfilled for each of the techniques mentioned above, with the unique requirement of the set of quasi-identifiers and sensitive attributes. The methods implemented are presented together with the attacks they prevent, the description of the library, examples of the different functions’ usage, as well as the impact and the possible applications that can be developed. Finally, some possible aspects to be incorporated in future updates are proposed.
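
To make the first two measures in the abstract concrete, here is a minimal sketch of how k-anonymity and distinct ℓ-diversity can be computed from a set of quasi-identifiers and one sensitive attribute. It uses only the Python standard library and is not pyCANON's API; the function names, column names, and sample records are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict

def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same combination of quasi-identifier values."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(classes.values())

def l_diversity(records, quasi_identifiers, sensitive):
    """Return l: the smallest number of distinct sensitive values found
    in any group of records sharing the same quasi-identifier values."""
    groups = defaultdict(set)
    for r in records:
        groups[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return min(len(values) for values in groups.values())

# Hypothetical, already-generalised sample data (illustration only).
records = [
    {"age": "30-39", "zip": "281**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "281**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "280**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "280**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "280**", "diagnosis": "flu"},
]

k = k_anonymity(records, ["age", "zip"])
l = l_diversity(records, ["age", "zip"], "diagnosis")
print(f"k = {k}, l = {l}")
```

With this sample, the two groups have sizes 2 and 3 and each contains two distinct diagnoses, so the dataset is 2-anonymous and 2-diverse for the chosen quasi-identifiers. Both functions assume a non-empty list of records.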