An Open-Publishing Response to the COVID-19 Infodemic

Abstract: The COVID-19 pandemic catalyzed the rapid dissemination of papers and preprints investigating the disease and its associated virus, SARS-CoV-2. The multifaceted nature of COVID-19 demands a multidisciplinary approach, but the urgency of the crisis combined with the need for social distancing measures presents unique challenges to collaborative science. We applied a massive online open publishing approach to this problem using Manubot. Through GitHub, collaborators summarized and critiqued COVID-19 literature, creating a review manuscript. Manubot automatically compiled citation information for referenced preprints, journal publications, websites, and clinical trials. Continuous integration workflows retrieved up-to-date data from online sources nightly, regenerating some of the manuscript’s figures and statistics. Manubot rendered the manuscript into PDF, HTML, LaTeX, and DOCX outputs, immediately updating the version available online upon the integration of new content. Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1,500 sources and developed seven literature reviews. While many efforts from the computational community have focused on mining COVID-19 literature, our project illustrates the power of open publishing to organize both technical and non-technical scientists to aggregate and disseminate information in response to an evolving crisis.

The Invisible Citation Commons · Business of Knowing

“In recent years, there has been a push to openly license citation metadata to better enable large-scale analyses and discoverability of scholarly work. The “Initiative for Open Citations” (I4OC), launched in 2017, has led the way in helping publishers share citations to their works under a public domain CC0 license. As of early 2021, over a billion citations from one scholarly article to another are collected in public domain databases, a major shift from just a few years earlier. These open databases provide the backbone for new discovery tools, and are used by academics training artificial intelligence tools. Open corpora like the Microsoft Academic Graph are themselves widely cited. However, Microsoft Academic Graph will be shuttered in 2021; despite their importance, new citation projects are reliant on continued funding and support by their host, and longevity is not always guaranteed….

Wikidata is a freely licensed and editable online database of linked data, with 94 million items as of June 2021. Like its sister project Wikipedia, it has a vibrant multilingual volunteer community that develops and maintains it, and is supported by the non-profit Wikimedia Foundation. Wikidata also includes bibliographic metadata: as of June 2021, nearly 40 million items on Wikidata represented publications, accounting for 43% of all items. These are a combination of semi-automated uploads of citations from other open databases, items about notable publications that have their own Wikipedia articles, and items added manually by editors. Wikidata is also attractive for libraries, archives, and cultural institutions that want to make their metadata more openly available and reusable, and there are several ongoing projects to incorporate Wikidata into library and archival cataloging processes and connect Wikidata to new open knowledgebases….”

Can Twitter data help in spotting problems early with publications? What retracted COVID-19 papers can teach us about science in the public sphere | Impact of Social Sciences

“Publications that are based on wrong data, methodological mistakes, or contain other types of severe errors can spoil the scientific record if they are not retracted. Retraction of publications is one of the effective ways to correct the scientific record. However, before a problematic publication can be retracted, the problem has to be found and brought to the attention of the people involved (the authors of the publication and editors of the journal). The earlier a problem with a published paper is detected, the earlier the publication can be retracted and the less wasted effort goes into new research that is based on disinformation within the scientific record. Therefore, it would be advantageous to have an early warning system that spots potential problems with published papers, or maybe even before based on a preprint version….”

Speculative Annotation Invites Public to Interact with Digitized Collections at the Library of Congress | Library of Congress

“Students, educators and learners of all ages are invited to interact with select items in the Library’s collections with the launch of Speculative Annotation, the latest experiment from LC Labs.

Created by artist and 2021 Innovator in Residence Courtney McClellan, Speculative Annotation is an open-source dynamic web application and public art project. The app presents a unique mini collection of free-to-use items from the Library for students, teachers and learners to annotate through captions, drawings and other types of mark-making. As a special feature for Speculative Annotation users, the app includes a collection of informative, engaging annotations from Library experts and resources on the Library’s website….”

White Paper · Quartz OA

“We are excited to share with you our vision for a more fair and sustainable future for independent open access publishing. In our white paper, we describe our learnings about the challenges of Open Access publishing and propose a new, cooperative, route to OA: Quartz Open Access….

We did our research and found the answers to our questions in many discussions and research pieces produced by our fellow academics as well as journalists. As we researched our way through the intricacies of the scholarly communication ecosystem, we became avid supporters of the open science movement and open access publishing. We also found that open access is not the same experience for everyone and some of the questions we asked above are more relevant for early-career researchers, those in the humanities and social sciences and those coming from less well-funded institutions as well as low- and lower-middle-income countries. We became increasingly aware of the existence of unintended consequences of the various OA policies resulting in increasing inequalities or perpetuating the same systems that have led to creating these inequalities in the first place. Independently, we came up with similar ideas to address these issues and then came together as a team to try and develop a solution to some of the barriers hampering the transition towards just, fair and sustainable open access publishing.

As newcomers, we looked into the different successful – and less so – initiatives, we explored the values associated with scholarly communication and academic research, we dug into the related publishing fields and found inspiration in some of the business models now applied in journalism and creative industries. We explored new technologies such as peer-to-peer networks and blockchain to see how these can help solve some of the problems in the transition towards open access academic publishing. We also drew inspiration from the proposed solutions to the crisis of accountability in big tech and the responsible innovation and value-sensitive design approaches to developing technological systems.

Our proposal to face these challenges is powered by three key components: 1) a platform cooperative allowing exchanges within the OA ecosystem, 2) a browser extension allowing readers to support open access content and communities, and 3) a crowdfunding infrastructure for OA….”

Open Future

“Numerous organisations and initiatives have been launched with a belief in openness and free knowledge. Their proponents placed their bets on the combined power of networked information services and new governance models for the production and sharing of content and data. We – as members of this broad movement – were among those who believed it possible to leverage this combination of power and opportunity to build a more democratic society, unleashing the power of the internet to create universal access to knowledge and culture. For us, such openness meant not only freedom, but also presented a path to justice and equality….

The open revolution that we imagined did not, however, happen. At least not on the scale that we and many other proponents of free culture expected.

Nevertheless, the growing Open movement demonstrated the viability of our ideas. As proof we have Wikipedia, Open Government data initiatives, the ascent of Open Access publishing, the role of free software in powering the infrastructure of the internet and the gradual opening of the collections of many cultural heritage institutions….

Over time, we have observed the significant evolution of our movement’s normative basis – away from a justification based on the voluntary exercise of rights by individual creators and towards a justification based on the production of social goods….

Over the last decade, we have witnessed a wholesale transformation of the networked information ecosystem. The web moved away from the ideals and the open design of the early internet and turned into an environment that is dominated by a small number of platforms….

The concentration of power in the hands of a small number of information intermediaries negates one of the core assumptions of the Open movement….”

A Study of the Quality of Wikidata | DeepAI

Abstract: Wikidata has been increasingly adopted by many communities for a wide variety of applications, which demand high-quality knowledge to deliver successful results. In this paper, we develop a framework to detect and analyze low-quality statements in Wikidata by shedding light on the current practices exercised by the community. We explore three indicators of data quality in Wikidata, based on: 1) community consensus on the currently recorded knowledge, assuming that statements that have been removed and not added back are implicitly agreed to be of low quality; 2) statements that have been deprecated; and 3) constraint violations in the data. We combine these indicators to detect low-quality statements, revealing challenges with duplicate entities, missing triples, violated type rules, and taxonomic distinctions. Our findings complement ongoing efforts by the Wikidata community to improve data quality, aiming to make it easier for users and editors to find and correct mistakes.

Google AI Blog: A Step Toward More Inclusive People Annotations in the Open Images Extended Dataset

“In 2016, we introduced Open Images, a collaborative release of ~9 million images annotated with image labels spanning thousands of object categories and bounding box annotations for 600 classes. Since then, we have made several updates, including the release of crowdsourced data to the Open Images Extended collection to improve diversity of object annotations. While the labels provided with these datasets were expansive, they did not focus on sensitive attributes for people, which are critically important for many machine learning (ML) fairness tasks, such as fairness evaluations and bias mitigation. In fact, finding datasets that include thorough labeling of such sensitive attributes is difficult, particularly in the domain of computer vision.

Today, we introduce the More Inclusive Annotations for People (MIAP) dataset in the Open Images Extended collection. The collection contains more complete bounding box annotations for the person class hierarchy in 100k images containing people. Each annotation is also labeled with fairness-related attributes, including perceived gender presentation and perceived age range. With the increasing focus on reducing unfair bias as part of responsible AI research, we hope these annotations will encourage researchers already leveraging Open Images to incorporate fairness analysis in their research….”

Public feedback on preprints can unlock their full potential to accelerate science.

“Public preprint review can help authors improve their paper, find new collaborators, and gain visibility. It also helps readers find interesting and relevant papers and contextualize them with the reactions of experts in the field. Never has this been more apparent than in COVID-19, where rapid communication and expert commentary have both been in high demand. Yet, most feedback on preprints is currently exchanged privately.

Join ASAPbio in partnership with DORA, HHMI, and the Chan Zuckerberg Initiative to discuss how to create a culture of constructive public review and feedback on preprints….”

Wikipedia: The Most Reliable Source on the Internet? | PCMag

“[Q] Which brings us to Wikipedia. Many of us consult it, slightly wary of its bias, depth, and accuracy. But, as you’ll be sharing in your speech at Intellisys, the content actually ends up being surprisingly reliable. How does that happen?

[A] The answer to “should you believe Wikipedia?” isn’t simple. In my book I argue that the content of a popular Wikipedia page is actually the most reliable form of information ever created. Think about it—a peer-reviewed journal article is reviewed by three experts (who may or may not actually check every detail), and then is set in stone. The contents of a popular Wikipedia page might be reviewed by thousands of people. If something changes, it is updated. Those people have varying levels of expertise, but if they support their work with reliable citations, the results are solid. On the other hand, a less popular Wikipedia page might not be reliable at all….”

Glossary Organizing document – instructions for contributors (original doc) – Google Docs

“We invite all interested to: write definitions, comment on existing definitions, add alternative definitions where applicable, and suggest relevant references. If you feel that key terms are missing, please add them – you can let us know, or contact us with suggestions in the FORRT slack or email sam.parsons@psy.ox.ac.uk (please CC flavio.azevedo@uni-jena.de during the period Feb 12 to March 1st). The full list of terms will form part of a larger glossary to be hosted on https://FORRT.org. Once all terms have been added, the lead writing team (Parsons, Azevedo, & Elsherif) will develop an abridged version to submit as a manuscript. We outline the kinds of contributions and their correspondence to authorship in more detail in the next section. Don’t forget to add your name and details to the contributions spreadsheet….”

Standard eBooks builds upon Project Gutenberg to offer a better reading experience – Good e-Reader

“Project Gutenberg has always been a commendable literary initiative that ensured the classic titles of yore lived on in the digital age. While that is great, the eBooks lack consistent typography. The cover art leaves something to be desired, and the many typos can mar the reading experience considerably.

It is here that Standard eBooks comes into the picture. As the name suggests, Standard eBooks refers to a set of guidelines that each of its eBooks is required to comply with. What that means is that each of the books taken from Project Gutenberg is subjected to a laid-down procedure for publishing.

That includes formatting and typesetting with the help of a ‘professional-grade style manual.’ Also, each book is proofread with corrections made wherever necessary. It is only after this that a new digital edition of the book is created using the latest e-reader and browser technologies. This ensures each of the Standard ebooks thus created is compatible with almost all known e-reader devices currently in vogue….”

AcaWiki

“AcaWiki enables you to easily post summaries and literature reviews of peer-reviewed research. Many summaries on AcaWiki come up high on Google results. Please read our posting guidelines before proceeding. If you want to find summaries or literature reviews of peer-reviewed research, you can either browse summaries or search.”