“This decade will see the tipping point reached for open research content between the [top down] expansion of OA initiatives from commercial publishers and the [bottom up] support for Open Science efforts from within the academy. Having more content freely available and more content on the same platforms enables large scale analyses. The economic models are shifting from the value of the content at the unit level to the deployment of tools to uncover intelligence in a large body of content….”
“In the race to harness the power of cloud computing, and further develop artificial intelligence, academics have a new concern: falling behind a fast-moving tech industry. In the US, 22 higher education institutions, including Stanford and Carnegie Mellon, have signed up to a National Research Cloud initiative seeking access to the computational power they need to keep up. It is one of several cloud projects being called for by academics globally, and is being explored by the US Congress, given the potential of the technology to deliver breakthroughs in healthcare and climate change….”
“To help scientists deal with the increasing volume of published scientific literature, a research team at the I School is designing ScholarPhi, an augmented reading interface that makes scientific papers more understandable and contextually rich.
The project is led by UC Berkeley School of Information Professor Marti Hearst, and includes UC Berkeley postdoctoral fellows Andrew Head and Dongyeop Kang, and collaborators Raymond Folk, Kyle Lo, Sam Sjonsberg, and Dan Weld from the Allen Institute for AI (AI2) and the University of Washington. It is funded in part by the Alfred P. Sloan Foundation and by AI2.
ScholarPhi broadens access to scientific literature by developing a new document reader user interface and natural language analysis algorithms for context-relevant explanations of technical terms and notation….”
“Semantic Reader Beta is an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual.
Observations of scientists reading technical papers showed that readers frequently page back and forth looking for the definitions of terms and mathematical symbols as well as for the details of cited papers. This need to jump around through the paper breaks the flow of paper comprehension.
Semantic Reader provides this information directly in context by dimming unrelated text and providing details in tooltips, and soon will also provide corresponding term definitions. It uses artificial intelligence to understand a document’s structure. Usability studies show readers answered questions requiring deep understanding of paper concepts significantly more quickly with ScholarPhi than with a baseline PDF reader; furthermore, they viewed much less of the paper.
Based on the ScholarPhi research from the Semantic Scholar team at AI2, UC Berkeley and the University of Washington, and supported in part by the Alfred P. Sloan Foundation, the Semantic Reader is now available in beta for a select group of arXiv papers on semanticscholar.org with plans to add additional features and expand coverage soon….”
“This is our proposal for how we might create a radically new scholarly publishing system with the potential to disrupt the scholarly publishing industry. The proposed model is: (a) open, (b) objective, (c) crowd sourced and community-controlled, (d) decentralised, and (e) capable of generating prestige. Submitted articles are openly rated by researchers on multiple dimensions of interest (e.g., novelty, reliability, transparency) and ‘impact prediction algorithms’ are trained on these data to classify articles into journal ‘tiers’.
In time, with growing adoption, the highest impact tiers within such a system could develop sufficient prestige to rival even the most established of legacy journals (e.g., Nature). In return for their support, researchers would be rewarded with prestige, nuanced metrics, reduced fees, faster publication rates, and increased control over their outputs….”
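The "impact prediction algorithms" in the proposal above are not specified; a minimal sketch of the idea is a classifier that maps crowdsourced rating dimensions (e.g., novelty, reliability, transparency) to a journal-style tier. Everything below — the rating values, dimension order, and tier labels — is invented for illustration; the sketch uses a simple nearest-centroid rule, not the authors' actual method.

```python
# Hypothetical sketch: predict an article's "tier" from community ratings
# on (novelty, reliability, transparency). All data below is invented.

from statistics import mean

# Crowdsourced rating vectors per tier, used as training data.
RATED = {
    "tier-1": [(9, 8, 9), (8, 9, 8), (9, 9, 7)],
    "tier-2": [(6, 7, 6), (5, 6, 7), (7, 5, 6)],
    "tier-3": [(3, 2, 4), (2, 3, 3), (4, 3, 2)],
}

def centroid(points):
    """Mean rating vector for one tier."""
    return tuple(mean(dim) for dim in zip(*points))

CENTROIDS = {tier: centroid(pts) for tier, pts in RATED.items()}

def predict_tier(ratings):
    """Assign the tier whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(ratings, c))
    return min(CENTROIDS, key=lambda t: dist(CENTROIDS[t]))

print(predict_tier((8, 8, 9)))  # -> tier-1
```

A production system would use a learned model over many more dimensions and raters, but the pipeline shape — ratings in, tier out — is the same.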
Abstract: The number of scholarly publications grows steadily every year, and it becomes harder to find, assess and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated in paper submission processes for which we define three main task requirements: The task has to be […]. We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were successfully able to perform the task. Furthermore, we evaluated the interface’s usability and the participants’ attitude towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.
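The "key sentence highlighting" assistance described in the abstract can be sketched as ranking sentences by contribution-signalling cue phrases. The cue list and example sentences below are invented; the paper's actual system uses trained machine learning models, not this heuristic.

```python
# Toy sketch of "key sentence highlighting": rank sentences by how many
# contribution cue phrases they contain. Cue phrases are invented.

CUES = ("we propose", "we present", "results show", "we evaluate")

def highlight(sentences, top_k=2):
    """Return the top_k sentences containing the most cue phrases."""
    def score(s):
        low = s.lower()
        return sum(cue in low for cue in CUES)
    return sorted(sentences, key=score, reverse=True)[:top_k]

abstract = [
    "Scholarly knowledge graphs can help.",
    "We propose a crowdsourcing method.",
    "Results show the task is feasible.",
]
print(highlight(abstract))
```

In the envisioned interface, such candidates would be shown to the author as suggestions to accept or correct, not applied automatically.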
“The archive’s catalog currently holds more than 120 million digital records, as well as “archival metadata and other types of records, including electronic databases.” However, the system has “an unsophisticated search” function, according to a request for information.
While NARA employees add metadata tags to digital records, “There is a delta between what NARA has been able to describe and the specific information that users want from our records,” the RFI states, asking, “Can AI fill the gap?”
During an informational day held in early April, NARA executives outlined some of the challenges, including a single search returning a flood of results from the same source—making it difficult to sift through to find multiple sources—and difficulty distinguishing between records with similar names, such as a search for “Truman” the president versus “Truman” the aircraft carrier.
The current search function also is not able to return accurate results if the search term input is not exactly the same as it exists in the metadata.
The RFI is seeking feedback on automated solutions that can analyze how users search the digital archives and associate those search terms with the appropriate record….”
“The [National Archives] Catalog currently has a large data set (over 100 million digital pages of records, plus archival metadata and other types of records, including electronic databases) and an unsophisticated search. The archival hierarchy of the records is intended to assist the user in discovery, but in the digital realm, users find it difficult to use. The metadata that we have entered manually cannot provide the granular information for users to get the search results they want and it has taken NARA decades to produce. There is a delta between what NARA has been able to describe, and the specific information that users want from our records. Can AI fill the gap?…”
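The exact-match limitation described in the RFI is a classic fuzzy-search problem. A minimal sketch with the standard library's `difflib` is below; the record titles are invented examples (echoing the "Truman" ambiguity NARA described), and real solutions would use a proper search engine with stemming, ranking, and entity disambiguation.

```python
# Hedged sketch: rank archival record titles by string similarity to a
# query instead of requiring an exact metadata match. Titles are invented.

import difflib

RECORDS = [
    "Harry S. Truman Presidential Papers",
    "USS Harry S. Truman (CVN-75) Deployment Logs",
    "Treaty of Versailles Ratification File",
]

def fuzzy_search(query, records, cutoff=0.3):
    """Return records scored above cutoff, best match first."""
    scored = [
        (difflib.SequenceMatcher(None, query.lower(), r.lower()).ratio(), r)
        for r in records
    ]
    return [r for score, r in sorted(scored, reverse=True) if score >= cutoff]

print(fuzzy_search("Truman papers", RECORDS)[0])
```

Even this crude similarity ranking surfaces the presidential papers ahead of the aircraft carrier for a "Truman papers" query; it does not, however, solve the entity-disambiguation problem the RFI raises.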
“This competition challenges data scientists to show how publicly funded data are used to serve science and society. Evidence through data is critical if government is to address the many threats facing society, including pandemics, climate change, Alzheimer’s disease, child hunger, increasing food production, maintaining biodiversity, and many other challenges. Yet much of the information about data necessary to inform evidence and science is locked inside publications.
Can natural language processing find the hidden-in-plain-sight data citations? Can machine learning find the link between the words used in research articles and the data referenced in the article?
Now is the time for data scientists to help restore trust in data and evidence. In the United States, federal agencies are now mandated to show how their data are being used. The new Foundations of Evidence-based Policymaking Act requires agencies to modernize their data management. New Presidential Executive Orders are pushing government agencies to make evidence-based decisions based on the best available data and science. And the government is working to respond in an open and transparent way.
This competition will build just such an open and transparent approach. …”
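The "hidden-in-plain-sight data citations" the competition targets can be approximated, at the simplest level, by flagging sentences that contain dataset cue words. The cue patterns and example text below are invented; competition entries used far richer NLP (named-entity recognition, learned matchers) than this regex sketch.

```python
# Hedged sketch: flag sentences that likely reference a dataset.
# Cue words and the sample passage are invented for illustration.

import re

DATASET_CUES = re.compile(
    r"\b(data\s*set|survey|census|corpus|database)\b", re.IGNORECASE)

def candidate_data_mentions(text):
    """Return sentences that contain a dataset cue word."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if DATASET_CUES.search(s)]

passage = ("We analyze outcomes. Estimates come from the "
           "2010 Census of Agriculture. Methods follow prior work.")
print(candidate_data_mentions(passage))
```

A machine-learning approach would then link each flagged mention to a canonical dataset identifier, which is the harder half of the competition's task.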
“ScholarSift is kind of like Turnitin in reverse. It compares the text of a law review article to a huge database of law review articles and tells you which ones are similar. Unsurprisingly, it turns out that machine learning is really good at identifying relevant scholarship. And ScholarSift seems to do a better job at identifying relevant scholarship than pricey legacy platforms like Westlaw and Lexis.
One of the many cool things about ScholarSift is its potential to make legal scholarship more equitable. In legal scholarship, as everywhere, fame begets fame. All too often, fame means the usual suspects get all the attention, and it’s a struggle for marginalized scholars to get the attention they deserve. Unlike other kinds of machine learning programs, which seem almost designed to reinforce unfortunate prejudices, ScholarSift seems to do the opposite, highlighting authors who might otherwise be overlooked. That’s important and valuable. I think Anderson and Wenzel are on to something, and I agree that ScholarSift could improve citation practices in legal scholarship….
Anderson and Wenzel argue that ScholarSift can tell authors which articles to cite. I wonder if it couldn’t also make citations pointless. After all, readers can use ScholarSift, just as well as authors….”
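The core operation attributed to ScholarSift above — rank a corpus by textual similarity to a draft — can be sketched with bag-of-words cosine similarity. The mini "corpus" below is invented, and ScholarSift's actual model is not public; this is only the general technique, not their implementation.

```python
# Hedged sketch: rank a (tiny, invented) corpus of articles by cosine
# similarity of word counts against a draft's text.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

CORPUS = {
    "Fair Use Doctrine": "copyright fair use transformative works criticism",
    "Patent Claim Construction": "patent claim construction litigation standards",
}

def most_similar(draft, corpus):
    """Return corpus titles ranked by similarity to the draft text."""
    dv = Counter(draft.lower().split())
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine(dv, Counter(kv[1].lower().split())),
                    reverse=True)
    return [title for title, _ in ranked]

print(most_similar("fair use and copyright in transformative works", CORPUS)[0])
```

Note that a ranking like this favors textual overlap rather than author prominence, which is the equity property the post highlights.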