The health care data sharing rule and its roots at Boston Children’s Hospital – Discoveries

“Are you sick of health care systems not communicating with each other? Do you wish you could access more of your medical information — or your patients’ information — online? Do you ever wonder whether a health pattern you see is part of a larger trend? Two key developments have advanced the vision of seamless, secure exchange of electronic health records (EHRs) among health care institutions and patients.

That vision includes being able to learn from our data at a population scale. Through federal regulations issued this year, it will finally become reality in 2022. And the vision began at Boston Children’s Hospital more than a decade ago….”

OpenNotes – Patients and clinicians on the same page

“OpenNotes is the international movement promoting and studying transparent communication in healthcare. We help patients and clinicians share meaningful notes in medical records. We call these open notes….

OpenNotes is not software or a product. It’s a call to action.”

 

Federal Rules Mandating Open Notes

“Taking effect in April 2021, rules implementing the bipartisan federal Cures Act specify that clinical notes are among electronic information that must not be blocked and must be made available free of charge to patients. To meet the interests of some patients, the rules allow specified exceptions….”

45 million medical scans from hospitals all over the world left exposed online for anyone to view – some servers were laced with malware • The Register

“Two thousand servers containing 45 million images of X-rays and other medical scans were left online during the course of the past twelve months, freely accessible by anyone, with no security protections at all.

Or so says research by CybelAngel, which sells a Digital Risk Protection Platform. Not only was the sensitive personal information unsecured, but malicious folk had also accessed those servers and poisoned them with apparent malware, the company added….”

The Principles of Open Scholarly Infrastructure

“Open source – All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.
Open data (within constraints of privacy laws) – For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.
Available data (within constraints of privacy laws) – It is not enough that the data be made “open” if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.
Patent non-assertion – The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure….”

The ethics of data sharing and biobanking in health research

Abstract:  The importance of data sharing and biobanking is increasingly being recognised in global health research. Such practices are perceived to have the potential to promote science by maximising the utility of data and samples. However, they also raise ethical challenges which can be exacerbated by existing disparities in power, infrastructure and capacity. The Global Forum on Bioethics in Research (GFBR) convened in Stellenbosch, South Africa in November 2018, to explore the ethics of data sharing and biobanking in health research. Ninety-five participants from 35 countries drew on case studies and their experiences with sharing in their discussion of issues relating to respecting research participants and communities, promoting equitable sharing, and international and national approaches to governing data sharing and biobanking. In this editorial we will briefly review insights relating to each of these three themes.

 

Open Access Helper

“There are more than 25 million Open Access versions of otherwise “paywalled” scientific articles; however, they are often not easy to find.

Open Access Helper for iOS & macOS is designed to help you get easy access to these documents, with a lot of help from some amazing APIs….

Open Access Helper is designed to make finding the best Open Access location easy. Whenever my app comes across a DOI, it will query the APIs of unpaywall.org & core.ac.uk to see if an Open Access copy is available elsewhere.

The App is free and Open Source and I have no intention to change that….”
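The DOI lookup described above can be sketched against the public Unpaywall v2 API. This is an illustrative reconstruction, not code from Open Access Helper itself: the endpoint shape (`https://api.unpaywall.org/v2/{doi}?email=...`) and the `best_oa_location` response field follow Unpaywall’s published documentation, while the helper function names are hypothetical.

```python
# Illustrative sketch of the DOI lookup Open Access Helper describes,
# against the public Unpaywall v2 API. The endpoint shape and the
# "best_oa_location" response field follow Unpaywall's published docs;
# the helper function names here are hypothetical, not from the app.
from typing import Optional
from urllib.parse import quote

UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"

def unpaywall_url(doi: str, email: str) -> str:
    """Build the Unpaywall lookup URL for a DOI (an email is required by the API)."""
    return f"{UNPAYWALL_BASE}{quote(doi)}?email={quote(email)}"

def best_oa_url(response: dict) -> Optional[str]:
    """Pull the best Open Access link out of a decoded Unpaywall JSON response."""
    loc = response.get("best_oa_location")
    if not loc:
        return None  # no Open Access copy known for this DOI
    return loc.get("url_for_pdf") or loc.get("url")
```

A caller would fetch `unpaywall_url(...)` with any HTTP client and hand the decoded JSON to `best_oa_url`; a `None` result means no Open Access copy was found, at which point an app like this would presumably fall back to core.ac.uk.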

‘Balancing’ privacy and open science in the context of COVID-19: a response to Ifenthaler & Schumacher (2016) | SpringerLink

Abstract:  Privacy and confidentiality are core considerations in education, while at the same time, using and sharing data—and, more broadly, open science—is increasingly valued by editors, funding agencies, and the public. This manuscript responds to an empirical investigation of students’ perceptions of the use of their data in learning analytics systems by Ifenthaler and Schumacher (Educational Technology Research and Development, 64: 923-938, 2016). We summarize their work in the context of the COVID-19 pandemic and the resulting shift to digital modes of teaching and learning by many teachers, using the tension between privacy and open science to frame our response. We offer informed recommendations for educational technology researchers in light of Ifenthaler and Schumacher’s findings as well as strategies for navigating the tension between these important values. We conclude with a call for educational technology scholars to meet the challenge of studying learning (and disruptions to learning) in light of COVID-19 while protecting the privacy of students in ways that go beyond what Institutional Review Boards consider to be within their purview.

 

Ten principles for data sharing and commercialization | Journal of the American Medical Informatics Association | Oxford Academic

Abstract:  Digital medical records have enabled us to employ clinical data in many new and innovative ways. However, these advances have brought with them a complex set of demands for healthcare institutions regarding data sharing with topics such as data ownership, the loss of privacy, and the protection of intellectual property. The lack of clear guidance from government entities often creates conflicting messages about data policy, leaving institutions to develop guidelines themselves. Through discussions with multiple stakeholders at various institutions, we have generated a set of guidelines with 10 key principles to guide the responsible and appropriate use and sharing of clinical data for the purposes of care and discovery. Industry, universities, and healthcare institutions can build upon these guidelines toward creating a responsible, ethical, and practical response to data sharing.

 

COVID-19 and the boundaries of open science and innovation: Lessons of traceability from genomic data sharing and biosecurity: EMBO reports: Vol 0, No 0

“While conventional policies and systems for data sharing and scholarly publishing are being challenged and new Open Science policies are being developed, traceability should be a key function for guaranteeing socially responsible and robust policies. Full access to the available data and the ability to trace it back to its origins assure data quality and processing legitimacy. Moreover, traceability would be important for other agencies and organisations – funding agencies, database managers, institutional review boards and so on – for undertaking systematic reviews, data curation or process oversights. Thus, the term “openness” means much more than just open access to published data but must include all aspects of data generation, analysis and dissemination along with other organisations and agencies than just research groups and publishers. The COVID-19 crisis has highlighted the challenges and shortfalls of the current notions of openness and it should serve as an impetus to further advance towards real Open Science.”

 

Improving access and delivery of academic content – a survey of current & emerging trends | Musings about librarianship

“While allowing users to gain access to paywalled academic content (aka delivery services) is often seen as less sexy than discovery, it is still an important part of the researcher workflow that is worth looking at. In particular, I will argue that in the past few years we have seen a renewed interest in this part of the workflow and may potentially start to see some big changes in the way we provide access to academic content in the near future.

Note: The OA discovery and delivery front has changed a lot since 2017, with Unpaywall being a big part of the story, but for this blog post I will focus on delivery aspects of paywalled content.

1.0 Access and delivery – an age-old problem
1.1 RA21, Seamless Access and getFTR
1.2 Campus Activated Subscriber Access (CASA)
1.3 Browser extensions / “Access Brokers”
1.4 Content syndication partnership between Springer Nature and ResearchGate (new)
1.5 Is the sun slowly setting on library link resolvers?
1.6 The Sci-hub effect?
1.7 Privacy implications …”

A pseudonymisation protocol with implicit and explicit consent routes for health records in federated ledgers – IEEE Journals & Magazine

Abstract:  Healthcare data for primary use (diagnosis) may be encrypted for confidentiality purposes; however, secondary uses such as feeding machine learning algorithms require open access. Full anonymity has no traceable identifiers to report diagnosis results. Moreover, implicit and explicit consent routes are of practical importance under recent data protection regulations (GDPR), translating directly into break-the-glass requirements. Pseudonymisation is an acceptable compromise when dealing with such orthogonal requirements and is an advisable measure to protect data. Our work presents a pseudonymisation protocol that is compliant with implicit and explicit consent routes. The protocol is constructed on a (t,n)-threshold secret sharing scheme and public key cryptography. The pseudonym is safely derived from a fragment of public information without requiring any data-subject’s secret. The method is proven secure under reasonable cryptographic assumptions and shown to be scalable in the experimental results.
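The (t,n)-threshold secret sharing the abstract builds on is Shamir’s classic scheme. A minimal sketch, for illustration only; this is the generic building block, not the paper’s pseudonymisation protocol, which layers consent routes and public-key cryptography on top of it:

```python
# Minimal (t,n)-threshold secret sharing (Shamir's scheme) over a prime
# field. Generic building block only -- NOT the paper's protocol.
import random

PRIME = 2**127 - 1  # Mersenne prime; the field must exceed any shared secret

def split(secret: int, t: int, n: int, prime: int = PRIME):
    """Split `secret` into n shares such that any t of them reconstruct it."""
    # Random polynomial of degree t-1 with the secret as constant term.
    coeffs = [secret] + [random.randrange(prime) for _ in range(t - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod prime
            acc = (acc * x + c) % prime
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares, prime: int = PRIME) -> int:
    """Lagrange interpolation at x=0 recovers the secret from any t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

Any t shareholders can jointly recover the value, while t-1 or fewer learn nothing; this is what lets a pseudonym-resolution capability be distributed so that no single party can re-identify a record alone.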

 

Dataverse and OpenDP: Tools for Privacy-Protective Analysis in the Cloud | Mercè Crosas

“When big data intersects with highly sensitive data, both opportunities for society and risks abound. Traditional approaches for sharing sensitive data are known to be ineffective in protecting privacy. Differential Privacy, deriving from roots in cryptography, is a strong mathematical criterion for privacy preservation that also allows for rich statistical analysis of sensitive data. Differentially private algorithms are constructed by carefully introducing “random noise” into statistical analyses so as to obscure the effect of each individual data subject.

OpenDP is an open-source project for the differential privacy community to develop general-purpose, vetted, usable, and scalable tools for differential privacy, which users can simply, robustly and confidently deploy.

Dataverse is an open source web application to share, preserve, cite, explore, and analyze research data. It facilitates making data available to others, and allows you to replicate others’ work more easily. Researchers, journals, data authors, publishers, data distributors, and affiliated institutions all receive academic credit and web visibility.  A Dataverse repository is the software installation, which then hosts multiple virtual archives called Dataverses. Each dataverse contains datasets, and each dataset contains descriptive metadata and data files (including documentation and code that accompany the data).

This session examines ongoing efforts to realize a combined use case for these projects that will offer academic researchers privacy-preserving access to sensitive data. This would allow both novel secondary reuse and replication access to data that otherwise is commonly locked away in archives.  The session will also explore the potential impact of this work outside the academic world.”
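The “random noise” idea described above is typically realized with the textbook Laplace mechanism: a counting query changes by at most 1 when one person’s data changes (sensitivity 1), so adding Laplace noise with scale 1/ε makes the released count ε-differentially private. A minimal sketch of that general mechanism, not the OpenDP library’s API:

```python
# Textbook Laplace mechanism sketch: for a counting query (sensitivity 1),
# adding Laplace(1/epsilon) noise yields epsilon-differential privacy.
# Illustrates the general technique only -- not the OpenDP library's API.
import random

def laplace(scale: float) -> float:
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_count(values, predicate, epsilon: float) -> float:
    """Noisy count of items matching `predicate`. A count has sensitivity 1,
    so Laplace noise with scale 1/epsilon suffices for epsilon-DP."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace(1.0 / epsilon)
```

Each noisy release consumes ε from an overall privacy budget; a project like OpenDP exists largely because getting such primitives right (sensitivity analysis, budget accounting, floating-point subtleties) is hard, hence the emphasis on vetted implementations.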

OpenDP Hiring Scientific Staff | OpenDP

“The OpenDP project seeks to hire 1-2 scientists to work with faculty directors Gary King and Salil Vadhan and the OpenDP Community to formulate and advance the scientific goals of OpenDP and solve research problems that are needed for its success. Candidates should have a graduate-level degree (preferably a PhD), familiarity with differential privacy, and one or both of the following: 

Experience with implementing software for data science, privacy, and/or security, and an interest in working with software engineers to develop the OpenDP codebase.
Experience with applied statistics, and an interest in working with domain scientists to apply OpenDP software to data-sharing problems in their field. In particular, we are looking for a researcher to engage on an immediate project on Covid-19 and mobility and epidemiology.  See HDSI Fellow for more details….”

Computational social science: Obstacles and opportunities | Science

“An alternative has been to use proprietary data collected for market research (e.g., Comscore, Nielsen), with methods that are sometimes opaque and a pricing structure that is prohibitive to most researchers.

We believe that this approach is no longer acceptable as the mainstay of CSS, as pragmatic as it might seem in light of the apparent abundance of such data and limited resources available to a research community in its infancy. We have two broad concerns about data availability and access.

First, many companies have been steadily cutting back data that can be pulled from their platforms (5). This is sometimes for good reasons—regulatory mandates (e.g., the European Union General Data Protection Regulation), corporate scandal (Cambridge Analytica and Facebook)—however, a side effect is often to shut down avenues of potentially valuable research. The susceptibility of data availability to arbitrary and unpredictable changes by private actors, whose cooperation with scientists is strictly voluntary, renders this system intrinsically unreliable and potentially biased in the science it produces.

Second, data generated by consumer products and platforms are imperfectly suited for research purposes (6). Users of online platforms and services may be unrepresentative of the general population, and their behavior may be biased in unknown ways. Because the platforms were never designed to answer research questions, the data of greatest relevance may not have been collected (e.g., researchers interested in information diffusion count retweets because that is what is recorded), or may be collected in a way that is confounded by other elements of the system (e.g., inferences about user preferences are confounded by the influence of the company’s ranking and recommendation algorithms). The design, features, data recording, and data access strategy of platforms may change at any time because platform owners are not incentivized to maintain instrumentation consistency for the benefit of research.

For these reasons, research derived from such “found” data is inevitably subject to concerns about its internal and external validity, and platform-based data, in particular, may suffer from rapid depreciation as those platforms change (7). Moreover, the raw data are often unavailable to the research community owing to privacy and intellectual property concerns, or may become unavailable in the future, thereby impeding the reproducibility and replication of results….

Despite the limitations noted above, data collected by private companies are too important, too expensive to collect by any other means, and too pervasive to remain inaccessible to the public and unavailable for publicly funded research (8). Rather than eschewing collaboration with industry, the research community should develop enforceable guidelines around research ethics, transparency, researcher autonomy, and replicability. We anticipate that many approaches will emerge in coming years that will be incentive compatible for involved stakeholders….

Privacy-preserving, shared data infrastructures, designed to support scientific research on societally important challenges, could collect scientifically motivated digital traces from diverse populations in their natural environments, as well as enroll massive panels of individuals to participate in designed experiments in large-scale virtual labs. These infrastructures could be driven by citizen contributions of their data and/or their time to support the public good, or in exchange for explicit compensation. These infrastructures should use state-of-the-art security, with an escalation checklist of security measures depending on the sensitivity of the data. These efforts need to occur at both the university and cross-university levels. Finally, these infrastructures should capture and document the metadata that describe the data collection process and incorporate sound ethical principles for data collection and use….”