The Genomics Research and Innovation Network: creating an interoperable, federated, genomics learning system | Genetics in Medicine

Abstract:  Purpose:

Clinicians and researchers must contextualize a patient’s genetic variants against population-based references with detailed phenotyping. We sought to establish globally scalable technology, policy, and procedures for sharing biosamples and associated genomic and phenotypic data on broadly consented cohorts, across sites of care.

Methods

Three of the nation’s leading children’s hospitals launched the Genomic Research and Innovation Network (GRIN), with federated information technology infrastructure, harmonized biobanking protocols, and material transfer agreements. Pilot studies in epilepsy and short stature were completed to design and test the collaboration model.

Results

Harmonized, broadly consented institutional review board (IRB) protocols were approved and used for biobank enrollment, creating ever-expanding, compatible biobanks. An open source federated query infrastructure was established over genotype–phenotype databases at the three hospitals. Investigators securely access the GRIN platform for prep-to-research queries, receiving aggregate counts of patients with particular phenotypes or genotypes in each biobank. With proper approvals, de-identified data is exported to a shared analytic workspace. Investigators at all sites enthusiastically collaborated on the pilot studies, resulting in multiple publications. Investigators have also begun to successfully utilize the infrastructure for grant applications.

Conclusions

The GRIN collaboration establishes the technology, policy, and procedures for a scalable genomic research network.
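
The query pattern described in the Results (each biobank answers with an aggregate count, and patient-level rows stay at the site) can be illustrated with a minimal sketch. This is not GRIN's actual software: the `Patient`, `Site`, `federated_count`, and `has_phenotype` names, the record shape, and the sample data are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Patient:
    """One de-identified biobank record (hypothetical shape)."""
    phenotypes: FrozenSet[str]
    genotypes: FrozenSet[str]


@dataclass
class Site:
    """A hospital biobank; patient rows stay local to this object."""
    name: str
    patients: List[Patient]

    def count(self, predicate: Callable[[Patient], bool]) -> int:
        # Only the aggregate count is returned, never the matching rows.
        return sum(1 for p in self.patients if predicate(p))


def federated_count(sites: List[Site],
                    predicate: Callable[[Patient], bool]) -> Dict[str, int]:
    """Send the same query to every site and collect per-site counts."""
    return {site.name: site.count(predicate) for site in sites}


def has_phenotype(term: str) -> Callable[[Patient], bool]:
    """Build a query predicate matching patients with a given phenotype."""
    return lambda patient: term in patient.phenotypes


# Made-up sample data for illustration only.
SITES = [
    Site("Hospital A", [
        Patient(frozenset({"epilepsy"}), frozenset({"variant-1"})),
        Patient(frozenset({"short stature"}), frozenset()),
    ]),
    Site("Hospital B", [
        Patient(frozenset({"epilepsy", "short stature"}), frozenset()),
    ]),
    Site("Hospital C", []),
]

if __name__ == "__main__":
    print(federated_count(SITES, has_phenotype("epilepsy")))
```

In this sketch the caller only ever sees a mapping from site name to count. Exporting de-identified records to a shared workspace, which the abstract says requires proper approvals, would be a separate step.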

The advantages of UK Biobank’s open access strategy for health research – Conroy – Journal of Internal Medicine – Wiley Online Library

Abstract:  Ready access to health research studies is becoming more important as researchers, and their funders, seek to maximise the opportunities for scientific innovation and health improvements. Large-scale population-based prospective studies are particularly useful for multidisciplinary research into the causes, treatment and prevention of many different diseases. UK Biobank has been established as an open-access resource for public health research, with the intention of making the data as widely available as possible in an equitable and transparent manner. Access to UK Biobank’s unique breadth of phenotypic and genetic data has attracted researchers worldwide from across academia and industry. As a consequence, it has enabled scientists to perform world-leading collaborative research. Moreover, open access to an already deeply characterized cohort has encouraged both public and private sector investment in further enhancements to make UK Biobank an unparalleled resource for public health research and an exemplar for the development of open access approaches for other studies.

Graphene as an open-source material | TechCrunch

“Graphene is fundamentally different from software in that it is a physical resource. Since the material’s discovery, quantity has been a serious issue, preventing the material from seeing widespread use. Natural reserves of graphene are few and far between, and while scientists have discovered ways of producing graphene, the methods have proved unscalable.

In addition, graphene would need a way to be experimented with by the average user. For those who don’t have the same equipment researchers do, how can they go about tinkering with graphene? In order for graphene to become an open-source material, a solution for these two problems must be found….

The solutions may be closer at hand than you might think….”

Depositing and reporting of reagents: Accelerating open and reproducible science. | The Official PLOS Blog

“Centralized depositing of materials advances science in so many ways. It saves authors the time and burden of shipping requested materials. Researchers who request from repositories save time by not having to recreate reagents or wait months or years to receive samples. Many scientists have been on the receiving end of a request that was filled by an incorrect or degraded sample, which further delays research. Repositories like the ones recommended by PLOS handle the logistics of material requests, letting the scientists focus on what’s important: doing research….

By encouraging authors to deposit materials at the time of publication, journals will help accelerate research through timely distribution and accurate identification of reagents. Biological repositories exist to serve the scientific community. Take Addgene’s involvement in the explosive advancement of CRISPR research. Since 2012, over 8,400 CRISPR plasmids have been deposited and Addgene has distributed over 144,000 CRISPR plasmids worldwide, enabling researchers to share, modify, and improve this game-changing molecular tool. It is a prime example of the positive impact that biological repositories are making on research….”

Governance of a global genetic resource commons for non-commercial research: A case-study of the DNA barcode commons

Abstract:  Life sciences research that uses genetic resources is increasingly collaborative and global, yet collective action remains a significant barrier to the creation and management of shared research resources. These resources include sequence data and associated metadata, and biological samples, and can be understood as a type of knowledge commons. Collective action by stakeholders to create and use knowledge commons for research has potential benefits for all involved, including minimizing costs and sharing risks, but there are gaps in our understanding of how institutional arrangements may promote such collective action in the context of global genetic resources. We address this research gap by examining the attributes of an exemplar global knowledge commons: The DNA barcode commons. DNA barcodes are short, standardized gene regions that can be used to inexpensively identify unknown specimens, and proponents have led international efforts to make DNA barcodes a standard species identification tool. Our research examined if and how attributes of the DNA barcode commons, including governance of DNA barcode resources and management of infrastructure, facilitate global participation in DNA barcoding efforts. Our data sources included key informant interviews, organizational documents, scientific outputs of the DNA barcoding community, and DNA barcode record submissions. Our research suggested that the goal of creating a globally inclusive DNA barcode commons is partially impeded by the assumption that scientific norms and expectations held by researchers in high income countries are universal. We found scientific norms are informed by a complex history of resource misappropriation and mistrust between stakeholders. 
DNA barcode organizations can mitigate the challenges caused by their global membership through creating more inclusive governance structures, developing norms for the community that are specific to the context of DNA barcoding, and through increasing awareness and knowledge of pertinent legal frameworks.

ZooArchNet: Connecting zooarchaeological specimens to the biodiversity and archaeology data networks

Abstract:  Interdisciplinary collaborations and data sharing are essential to addressing the long history of human-environmental interactions underlying the modern biodiversity crisis. Such collaborations are increasingly facilitated by, and dependent upon, sharing open access data from a variety of disciplinary communities and data sources, including those within biology, paleontology, and archaeology. Significant advances in biodiversity open data sharing have focused on neontological and paleontological specimen records, making available over a billion records through the Global Biodiversity Information Facility. But to date, less effort has been placed on the integration of important archaeological sources of biodiversity, such as zooarchaeological specimens. Zooarchaeological specimens are rich with both biological and cultural heritage data documenting nearly all phases of human interaction with animals and the surrounding environment through time, filling a critical gap between paleontological and neontological sources of data within biodiversity networks. Here we describe technical advances for mobilizing zooarchaeological specimen-specific biological and cultural data. In particular, we demonstrate adaptations in the workflow used by biodiversity publisher VertNet to mobilize Darwin Core formatted zooarchaeological data to the GBIF network. We also show how a linked open data approach can be used to connect existing biodiversity publishing mechanisms with archaeoinformatics publishing mechanisms through collaboration with the Open Context platform. Examples of ZooArchNet published datasets are used to show the efficacy of creating this critically needed bridge between biological and archaeological sources of open access data. 
These technical advances and efforts to support data publication are placed in the larger context of ZooArchNet, a new project meant to build community around new approaches to interconnect zooarchaeological data and knowledge across disciplines.

Extending U.S. Biodiversity Collections to Promote Research and Education

“Our national heritage of approximately one billion biodiversity specimens, once digitized, can be linked to emerging digital data sources to form an information-rich network for exploring earth’s biota across taxonomic, temporal and spatial scales. A workshop held 30 October – 1 November 2018 at Oak Spring Garden in Upperville, VA under the leadership of the Biodiversity Collections Network (BCoN) developed a strategy for the next decade to maximize the value of our collections resource for research and education. In their deliberations, participants drew heavily on recent literature as well as surveys, and meetings and workshops held over the past year with the primary stakeholder community of collections professionals, researchers, and educators.

Arising from these deliberations is a vision to focus future biodiversity infrastructure and digital resources on building a network of extended specimen data that encompasses the depth and breadth of biodiversity specimens and data held in U.S. collections institutions. The extended specimen network (ESN) includes the physical voucher specimen curated and housed in a collection and its associated genetic, phenotypic and environmental data (both physical and digital). These core data types, selected because they are key to answering driving research questions, include physical preparations such as tissue samples and their derivative products such as gene sequences or metagenomes, digitized media and annotations, and taxon- or locality-specific data such as occurrence observations, phylogenies and species distributions. Existing voucher specimens will be extended both manually and through new automated methods, and data will be linked through unique identifiers, taxon name and location across collections, across disciplines and to outside sources of data. As we continue our documentation of earth’s biota, new collections will be enhanced from the outset, i.e., accessioned with a full suite of data. We envision the ESN proposed here will be the gold standard for the structured cloud of integrated data associated with all vouchered specimens. These permanent specimen vouchers, in which genotypes and phenotypes link to a particular environment in time and space, comprise an irreplaceable resource for the millennia….”

Taking knowledge preservation to the next level: new partnership between protocols.io, Addgene, PLOS

“Digital information carries a significant risk of disappearing, as one of the “fathers of the Internet” Vint Cerf has been

. This is particularly problematic for research communication as vanishing records undermine the reproducibility and integrity of science. We have taken this concern seriously at protocols.io from day one, constantly aiming for better ways to ensure stability, preservation, and visibility of the methods and knowledge shared on our platform. Digital archiving solutions have been the center of our focus; however, today we are excited to share with you our new physical preservation initiative, guaranteeing zero loss, long into the future. We are thrilled to be joined by the Addgene plasmid repository and the Public Library of Science (PLOS) in this initiative.

 
Of course, for many years we at protocols.io have had public APIs and PDF export of all protocols. In 2016, we became a

of CLOCKSS (the digital preservation archive for scholarly content, started by Stanford librarians in 1999), sending a PDF copy of every new protocol to them, the second it is made public. More recently, we introduced integration with Dropbox and GoogleDrive, to facilitate individual backups.

 
While all of our efforts are reasonable and ensure preservation and accessibility for decades, they are not infallible solutions in the long run. This is because preservation and accessibility are not the same thing. How many people today can open a file from 1997 WordPerfect or 1999 PowerPoint, particularly if it has been saved on a floppy disk? How confident are we that PDFs of protocols will be accessible and readable in seventy years by the scientists of the future?
 
With the above concerns in mind, we have been exploring over the last year more reliable solutions that take advantage of modern technology. And so, we are excited to announce a partnership with Addgene and PLOS for low-cost physical preservation of protocols, using laser cutters. The PLOS editorial team will be in charge of selecting protocols that warrant physical preservation and Addgene, with their expertise in physical storage, will be handling the long-term archiving in their freezers….”

Centuries-Old Plant Collection Now Online — A Treasure Trove For Researchers : NPR

“Funded by the National Science Foundation, the Mid-Atlantic Megalopolis Project will put about 800,000 records from about a dozen herbaria online via high-resolution photos of plant specimens that span the urbanized corridor from New York City to Washington, D.C….”

Sharing Publication-Related Data and Materials: Responsibilities of Authorship in the Life Sciences | The National Academies Press

“Biologists communicate to the research community and document their scientific accomplishments by publishing in scholarly journals. This report explores the responsibilities of authors to share data, software, and materials related to their publications. In addition to describing the principles that support community standards for sharing different kinds of data and materials, the report makes recommendations for ways to facilitate sharing in the future.”

Green digitization: Botanical collections data answer real-world questions | EurekAlert! Science News

“Special issue of Applications in Plant Sciences explores new developments and applications of digital plant data

Even as botany has moved firmly into the era of “big data,” some of the most valuable botanical information remains inaccessible for computational analysis, locked in physical form in the orderly stacks of herbaria and museums. Herbarium specimens are plant samples collected from the field that are dried and stored with labels describing species, date and location of collection, along with various other information including habitat descriptions. The detailed historical record these specimens keep of species occurrence, morphology, and even DNA provides an unparalleled data source to address a variety of morphological, ecological, phenological, and taxonomic questions. Now efforts are underway to digitize these data, and make them easily accessible for analysis. Two symposia were convened to discuss the possibilities and promise of digitizing these data–at the Botanical Society of America’s 2017 annual meeting in Fort Worth, Texas, and again at the XIX International Botanical Congress in Shenzhen, China. The proceedings of those symposia have been published as a special issue of Applications in Plant Sciences; the articles discuss a range of methods and remaining challenges for extracting data from botanical collections, as well as applications for collections data once digitized. Many of the authors contributing to the issue are involved in iDigBio (Integrated Digitized Biocollections), a new “national coordinating center for the facilitation and mobilization of biodiversity specimen data,” as described by Dr. Gil Nelson, a botanist at Florida State University and coeditor of this issue….”

Science Journals: editorial policies | Science | AAAS

“The Science Journals support the Transparency and Openness Promotion (TOP) guidelines to raise the quality of research published in Science and to increase transparency regarding the evidence on which conclusions are based….All data used in the analysis must be available to any researcher for purposes of reproducing or extending the analysis. Data must be available in the paper, deposited in a community special-purpose repository, accessible via a general-purpose repository such as Dryad, or otherwise openly available….”

Monticello archaeologists get boost in study of slavery | Local | dailyprogress.com

“In 2000, archaeologists at Monticello established the Digital Archaeological Archive of Comparative Slavery, or DAACS. It is a collaborative, online database where archaeologists can upload and share data about artifacts found during excavations of slavery sites at Monticello and other places in the Chesapeake region, according to Fraser Neiman, director of archaeology at Monticello….”

The Natural History Museum is going high tech to save its archive | WIRED UK

“London’s Natural History Museum is digitising its specimens – all 80 million of them. “We need to record them to create data in aggregate,” says Vince Smith, the museum’s head of informatics. With the collection including everything from a blue whale skeleton to Martian meteorites, progress is understandably slow: since the project started in 2014, the museum has only digitised 4.5 per cent of the collection. Undeterred, the 11-person digital collections team has set its sights on recording 20 million specimens in the next few years with specially developed kit.”

Addgene Depositors Get More Citations

“Professor Feng Zhang’s original 2013 gene editing paper on CRISPR/Cas amassed nearly 2,400 citations in its first four years (1). In addition to publishing in Science, Professor Zhang deposited the associated plasmids with Addgene. Since then, Addgene has filled over 6,500 requests for these plasmids. While clearly an outlier, this story had us wondering: is there a larger trend here? Do papers associated with Addgene deposits accumulate more citations than those without Addgene deposits? Even more interestingly, could we tell if depositing a plasmid with Addgene causes a paper to get cited more? …So what do we find [from Web of Science]? Lots more citations for the papers with plasmids deposited at Addgene – typically about four times as many as papers without plasmids deposited with Addgene….”