“The Biodiversity Literature Repository (BLR) has grown from a community on Zenodo into a service dedicated to liberating the data hidden in the hundreds of millions of pages of scholarly publications and making them open access and FAIR (findable, accessible, interoperable and reusable).
It is built on top of Zenodo, a digital repository hosted at CERN that provides a sustainable and robust infrastructure for long-tail research data, often small datasets that would otherwise be lost.
Originally a collaboration between Zenodo, Plazi and Pensoft, BLR began as a repository for taxonomic publications that lacked Digital Object Identifiers (DOIs) and thus were effectively orphaned from the network of online citations. As it grew, its scope expanded to include illustrations and taxonomic treatments contained in publications, and it morphed into a highly interlinked repository in which all these content types are linked to one another and enhanced with rich metadata.
The source data for BLR are scholarly publications, most often in PDF or HTML format but sometimes in XML formats, whose structured data facilitate automated data extraction.
The largest data users are the Global Biodiversity Information Facility (GBIF) and the United States’ National Center for Biotechnology Information (NCBI).
Support for BLR comes from the Arcadia Fund and the three partner institutions Zenodo, Plazi and Pensoft.”
“This document describes the cooperation and collaboration of BHL and Plazi on common goals. It outlines those goals and areas of common interest, and clarifies key areas of responsibility. The digital arena allows building a large corpus of literature and, from that, a “graph” of knowledge, or knowledge graph, through identification, extraction and linking of data. It provides an emerging access platform to the knowledge beyond the traditional human-reader-focused access. It allows new modes of access, including text and data mining, search, visualization and the discovery of new findings based on the accessibility of data. This knowledge graph does not replace existing media, but rather complements them. In the case of biodiversity sciences, it is based on both the estimated 500 million pages of biodiversity literature and on publications that are increasingly born digital. In biodiversity, the very rich, data-centric publications with their highly sophisticated implicit citation networks are a perfect base on which to build such a knowledge graph. In order to build the knowledge graph, the data in the publications must be liberated and made open and findable, accessible, interoperable and reusable (FAIR) for machine use. This is the necessary additional step after the digitization of existing literature….”
“Pensoft’s flagship journal ZooKeys invites free-to-publish research on key biological traits of potential hosts and vectors of SARS-like viruses; Plazi harvests all relevant data from legacy literature and brings them together in a reliable FAIR-data repository.”