Content at Scale – The Third Wave – The Scholarly Kitchen

“Third Wave – 2020s – AI and Open Content

This decade will see the tipping point reached for open research content between the [top down] expansion of OA initiatives from commercial publishers and the [bottom up] support for Open Science efforts from within the academy. Having more content freely available and more content on the same platforms enables large scale analyses. The economic models are shifting from the value of the content at the unit level to the deployment of tools to uncover intelligence in a large body of content….”

Academics edge closer to dream of research on cloud platforms | Financial Times

“In the race to harness the power of cloud computing, and further develop artificial intelligence, academics have a new concern: falling behind a fast-moving tech industry. In the US, 22 higher education institutions, including Stanford and Carnegie Mellon, have signed up to a National Research Cloud initiative seeking access to the computational power they need to keep up. It is one of several cloud projects being called for by academics globally, and is being explored by the US Congress, given the potential of the technology to deliver breakthroughs in healthcare and climate change….”


ScholarPhi: A Novel Interface for Reading Scientific Papers | UC Berkeley School of Information

“To help scientists deal with the increasing volume of published scientific literature, a research team at the I School is designing ScholarPhi, an augmented reading interface that makes scientific papers more understandable and contextually rich.

The project is led by UC Berkeley School of Information Professor Marti Hearst, and includes UC Berkeley postdoctoral fellows Andrew Head and Dongyeop Kang, and collaborators Raymond Folk, Kyle Lo, Sam Sjonsberg, and Dan Weld from the Allen Institute for AI (AI2) and the University of Washington. It is funded in part by the Alfred P. Sloan Foundation and by AI2. 

ScholarPhi broadens access to scientific literature by developing a new document reader user interface and natural language analysis algorithms for context-relevant explanations of technical terms and notation….”

Semantic Scholar | Semantic Reader

“Semantic Reader Beta is an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual.

Observations of scientists reading technical papers showed that readers frequently page back and forth looking for the definitions of terms and mathematical symbols as well as for the details of cited papers. This need to jump around through the paper breaks the flow of paper comprehension.

Semantic Reader provides this information directly in context by dimming unrelated text and providing details in tooltips, and soon will also provide corresponding term definitions. It uses artificial intelligence to understand a document’s structure. Usability studies show readers answered questions requiring deep understanding of paper concepts significantly more quickly with ScholarPhi than with a baseline PDF reader; furthermore, they viewed much less of the paper.

Based on the ScholarPhi research from the Semantic Scholar team at AI2, UC Berkeley and the University of Washington, and supported in part by the Alfred P. Sloan Foundation, the Semantic Reader is now available in beta for a select group of arXiv papers, with plans to add additional features and expand coverage soon….”

Developing an objective, decentralised scholarly communication and evaluation system – YouTube

“This is our proposal for how we might create a radically new scholarly publishing system with the potential to disrupt the scholarly publishing industry. The proposed model is: (a) open, (b) objective, (c) crowd sourced and community-controlled, (d) decentralised, and (e) capable of generating prestige. Submitted articles are openly rated by researchers on multiple dimensions of interest (e.g., novelty, reliability, transparency) and ‘impact prediction algorithms’ are trained on these data to classify articles into journal ‘tiers’.

In time, with growing adoption, the highest impact tiers within such a system could develop sufficient prestige to rival even the most established of legacy journals (e.g., Nature). In return for their support, researchers would be rewarded with prestige, nuanced metrics, reduced fees, faster publication rates, and increased control over their outputs….”
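The proposal's “impact prediction algorithms” are not specified beyond being trained on multi-dimensional ratings, but the core idea can be illustrated with a minimal nearest-centroid classifier. Everything below is invented for illustration — the rating dimensions, tiers, and numbers are hypothetical, not the proposal's actual model:

```python
# Hypothetical sketch: classify articles into journal "tiers" from
# crowd-sourced ratings on dimensions such as novelty, reliability, and
# transparency. A nearest-centroid classifier stands in for the proposal's
# unspecified "impact prediction algorithms"; all data here are invented.

from statistics import mean

def train_centroids(rated_articles):
    """rated_articles: list of (ratings_vector, tier) pairs."""
    by_tier = {}
    for ratings, tier in rated_articles:
        by_tier.setdefault(tier, []).append(ratings)
    # Centroid = dimension-wise mean of all rating vectors in the tier.
    return {tier: tuple(mean(dim) for dim in zip(*vecs))
            for tier, vecs in by_tier.items()}

def predict_tier(centroids, ratings):
    """Assign the tier whose centroid is closest (squared Euclidean)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(ratings, centroid))
    return min(centroids, key=lambda tier: dist(centroids[tier]))

# Ratings: (novelty, reliability, transparency), each on a 1-5 scale.
training = [
    ((4.8, 4.5, 4.2), "tier-1"),
    ((4.6, 4.7, 4.4), "tier-1"),
    ((3.1, 3.4, 3.0), "tier-2"),
    ((2.9, 3.2, 3.3), "tier-2"),
]
centroids = train_centroids(training)
```

A highly rated new submission, e.g. `predict_tier(centroids, (4.7, 4.4, 4.1))`, lands in the top tier; the interesting open question in the proposal is whether such tiers could accumulate prestige comparable to legacy journals.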

Crowdsourcing Scholarly Discourse Annotations | 26th International Conference on Intelligent User Interfaces

Abstract: The number of scholarly publications grows steadily every year, and it becomes harder to find, assess and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated in paper submission processes, for which we define three main task requirements. We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were able to perform the task successfully. Furthermore, we evaluated the interface’s usability and the participants’ attitudes towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.
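The abstract does not describe the key-sentence-highlighting algorithms in detail, but one simple way such assistance could work is to score sentences by the frequency of the content words they contain. This is an illustrative sketch only, not the paper's actual method; the stopword list and scoring rule are assumptions:

```python
# Illustrative sketch: suggest "key sentences" for annotation by scoring each
# sentence on the average corpus frequency of its content words. This is one
# plausible heuristic, not the algorithm used by the paper's interface.

import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was", "we", "it"}

def suggest_key_sentences(text, top_n=1):
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        content = [t for t in tokens if t not in STOPWORDS]
        # Average frequency of content words; 0 for all-stopword sentences.
        return sum(freq[t] for t in content) / max(len(content), 1)

    return sorted(sentences, key=score, reverse=True)[:top_n]
```

On a toy abstract, the sentence densest in recurring terms (e.g. “knowledge”, “graphs”) would be surfaced first, which matches the interaction the paper describes: the machine suggests, the author selects and annotates.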


Coleridge Initiative – Show US the Data | Kaggle

“This competition challenges data scientists to show how publicly funded data are used to serve science and society. Evidence through data is critical if government is to address the many threats facing society, including pandemics, climate change, Alzheimer’s disease, child hunger, food production, biodiversity, and many other challenges. Yet much of the information about data necessary to inform evidence and science is locked inside publications.

Can natural language processing find the hidden-in-plain-sight data citations? Can machine learning find the link between the words used in research articles and the data referenced in the article?

Now is the time for data scientists to help restore trust in data and evidence. In the United States, federal agencies are now mandated to show how their data are being used. The new Foundations of Evidence-based Policymaking Act requires agencies to modernize their data management. New Presidential Executive Orders are pushing government agencies to make evidence-based decisions based on the best available data and science. And the government is working to respond in an open and transparent way.

This competition will build just such an open and transparent approach. …”
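A natural starting point for the competition's task — finding “hidden in plain sight” references to datasets — is simple string matching against known dataset names, which stronger NLP entries would then generalize beyond. The sketch below is a hypothetical baseline; the dataset names are invented examples, not the competition's actual labels:

```python
# Hypothetical baseline for dataset-mention detection: case-insensitive
# matching of a curated list of dataset names in article text. Real
# competition entries use NLP/ML to generalize to unseen dataset names.

import re

KNOWN_DATASETS = [
    "National Education Longitudinal Study",
    "Sea Surface Temperature Dataset",
]

def find_dataset_mentions(text):
    mentions = []
    for name in KNOWN_DATASETS:
        # \b word boundaries avoid matching inside longer phrases.
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE):
            mentions.append(name)
    return mentions
```

The limitation of this baseline is exactly what the competition asks machine learning to solve: it can only find datasets it already knows about, while papers cite data under abbreviations, partial names, and paraphrases.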

It’s The End Of Citation As We Know It & I Feel Fine | Techdirt

“ScholarSift is kind of like Turnitin in reverse. It compares the text of a law review article to a huge database of law review articles and tells you which ones are similar. Unsurprisingly, it turns out that machine learning is really good at identifying relevant scholarship. And ScholarSift seems to do a better job at identifying relevant scholarship than pricey legacy platforms like Westlaw and Lexis.

One of the many cool things about ScholarSift is its potential to make legal scholarship more equitable. In legal scholarship, as everywhere, fame begets fame. All too often, fame means the usual suspects get all the attention, and it’s a struggle for marginalized scholars to get the attention they deserve. Unlike other kinds of machine learning programs, which seem almost designed to reinforce unfortunate prejudices, ScholarSift seems to do the opposite, highlighting authors who might otherwise be overlooked. That’s important and valuable. I think Anderson and Wenzel are on to something, and I agree that ScholarSift could improve citation practices in legal scholarship….

Anderson and Wenzel argue that ScholarSift can tell authors which articles to cite. I wonder if it couldn’t also make citations pointless. After all, readers can use ScholarSift, just as well as authors….”
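ScholarSift's actual model is not public, but the core idea described here — comparing a draft against a corpus and ranking by textual similarity — is the textbook bag-of-words approach. The following minimal sketch, with invented titles and texts, shows the mechanics:

```python
# Sketch of the generic technique behind a "similar scholarship" finder:
# represent each document as a bag-of-words vector, then rank a corpus by
# cosine similarity to a draft. This is the textbook approach, not
# ScholarSift's actual (unpublished) model.

import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two Counter word-count vectors."""
    common = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_similar(draft, corpus):
    """corpus: dict of title -> text. Returns titles, most similar first."""
    draft_vec = Counter(draft.lower().split())
    scored = {title: cosine_similarity(draft_vec, Counter(text.lower().split()))
              for title, text in corpus.items()}
    return sorted(scored, key=scored.get, reverse=True)
```

A draft about fair use would rank a copyright article above a contracts article, which is all the author-facing recommendation needs; the article's closing point is that the same query works just as well for readers, potentially making curated citation lists redundant.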