[2302.07302] CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context

Abstract:  When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work. However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews. This paper introduces CiteSee, a paper reading tool that leverages a user’s publishing, reading, and saving activities to provide personalized visual augmentations and context around citations. First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened. Second, CiteSee helps users prioritize their exploration by highlighting relevant but unknown citations based on saving and reading history. We conducted a lab study that suggests CiteSee is significantly more effective for paper discovery than three baselines. A field deployment study shows CiteSee helps participants keep track of their explorations and leads to better situational awareness and increased paper discovery via inline citation when conducting real-world literature reviews.


Bill & Melinda Gates Foundation Joins Open Knowledge Maps as a Supporting Member

We are delighted to announce that the Bill & Melinda Gates Foundation has joined Open Knowledge Maps as a supporting member. The Bill & Melinda Gates Foundation is the second funding agency to join Open Knowledge Maps and the first to do so with a Visionary Membership.


Surprise machines | John Benjamins

“Although “the humanities so far has focused on literary texts, historical text records, and spatial data,” as stated by Lev Manovich in Cultural Analytics (Manovich, 2020, p. 10), the recent advancements in artificial intelligence are driving more attention to other media. For example, disciplines such as digital humanities now embrace more diverse types of corpora (Champion, 2016). Yet this shift of attention is also visible in museums, which recently took a step forward by establishing the field of experimental museology (Kenderdine et al., 2021).

This article illustrates the visualization of an extensive image collection through digital means. Following a growing interest in the digital mapping of images – proved by the various scientific articles published on the subject (Bludau et al., 2021; Crockett, 2019; Seguin, 2018), Ph.D. theses (Kräutli, 2016; Vane, 2019), software (American Museum of Natural History, 2020/2022; Diagne et al., 2018; Pietsch, 2018/2022), and presentations (Benedetti, 2022; Klinke, 2021) – this text describes an interdisciplinary experiment at the intersection of information design, experimental museology, and cultural analytics.

Surprise Machines is a data visualization that maps more than 200,000 digital images of the Harvard Art Museums (HAM) and a digital installation for museum visitors to understand the collection’s vastness. Part of a temporary exhibition organized by metaLAB (at) Harvard and entitled Curatorial A(i)gents, Surprise Machines is enriched by a choreographic interface that allows visitors to interact with the visualization through a camera capturing body gestures. The project is unique for its interdisciplinarity, looking at the prestigious collection of Harvard University through cutting-edge techniques of AI….”

FAIR and Interactive Data Graphics from a Scientific Knowledge Graph | Scientific Data

Abstract:  Graph databases capture richly linked domain knowledge by integrating heterogeneous data and metadata into a unified representation. Here, we present the use of bespoke, interactive data graphics (bar charts, scatter plots, etc.) for visual exploration of a knowledge graph. By modeling a chart as a set of metadata that describes semantic context (SPARQL query) separately from visual context (Vega-Lite specification), we leverage the high-level, declarative nature of the SPARQL and Vega-Lite grammars to concisely specify web-based, interactive data graphics synchronized to a knowledge graph. Resources with dereferenceable URIs (uniform resource identifiers) can employ the hyperlink encoding channel or image marks in Vega-Lite to amplify the information content of a given data graphic, and published charts populate a browsable gallery of the database. We discuss design considerations that arise in relation to portability, persistence, and performance. Altogether, this pairing of SPARQL and Vega-Lite—demonstrated here in the domain of polymer nanocomposite materials science—offers an extensible approach to FAIR (findable, accessible, interoperable, reusable) scientific data visualization within a knowledge graph framework.
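The chart-as-metadata model described in this abstract (a SPARQL query carrying the semantic context, a Vega-Lite specification carrying the visual context) can be sketched in a few lines. This is a minimal illustration, not the paper's actual schema: the `ex:` namespace, the query, and the field names are hypothetical placeholders.

```python
def make_chart_metadata(sparql_query, vega_lite_spec):
    """Model a chart as metadata: semantic context (a SPARQL query)
    kept separate from visual context (a Vega-Lite specification)."""
    return {
        "semantic_context": {"query_language": "SPARQL", "query": sparql_query},
        "visual_context": vega_lite_spec,
    }

# Hypothetical query over a nanocomposite knowledge graph
query = """
PREFIX ex: <http://example.org/ns#>
SELECT ?sample ?modulus WHERE { ?sample ex:tensileModulus ?modulus . }
"""

# Minimal Vega-Lite spec; the href channel lets a click dereference
# each sample's URI, as the abstract describes
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "mark": "point",
    "encoding": {
        "x": {"field": "sample", "type": "nominal"},
        "y": {"field": "modulus", "type": "quantitative"},
        "href": {"field": "sample"},
    },
}

chart = make_chart_metadata(query, spec)
print(chart["visual_context"]["mark"])  # → point
```

Keeping the two contexts as separate, declarative documents is what makes the charts portable: either half can be re-executed or re-rendered independently of the other.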


Visual citation navigation of open education resources using Litmaps | Emerald Insight

Abstract:  Purpose

The purpose of this study is to visualize the key literature on the topic “Open Educational Resources” using the research discovery tool “Litmaps”.


Design/methodology/approach

Litmaps, a visual citation navigation and science discovery platform, is used for the present study. It provides an interface for discovering scientific literature, exploring the research landscape, and finding articles that are highly connected within maps. Litmaps provides quick-start options to import articles from a reference manager, keyword search, ORCID ID, DOI or a seed article. In this paper, the “keyword search” option and the search strategy “Open Educational Resources” or “OER” are put to use.


Findings

The findings of the study revealed that Litmaps visually displays citations between articles over time. The map generated is dynamic, as it can be adjusted according to the researcher’s needs.

Research limitations/implications

Litmaps helps researchers conduct literature reviews in a brief and systematic way. It is helpful in finding related or relevant studies through seed-paper or keyword search.


Originality/value

The study makes a useful contribution to the literature on this topic, as one can independently find research topics and also compare topic overlap. The study provides insights that help researchers build citation maps and see connections between articles over time. The originality of the present paper lies in highlighting the importance of the research discovery tool Litmaps for researchers, as so far, to the best of the authors’ knowledge, no research has taken place on its use.

Graphical integrity issues in open access publications: Detection and patterns of proportional ink violations

Abstract:  Academic graphs are essential for communicating complex scientific ideas and results. To ensure that these graphs truthfully reflect underlying data and relationships, visualization researchers have proposed several principles to guide the graph creation process. However, the extent of violations of these principles in academic publications is unknown. In this work, we develop a deep learning-based method to accurately measure violations of the proportional ink principle (AUC = 0.917), which states that the size of shaded areas in graphs should be consistent with their corresponding quantities. We apply our method to analyze a large sample of bar charts contained in 300K figures from open access publications. Our results estimate that 5% of bar charts contain proportional ink violations. Further analysis reveals that these graphical integrity issues are significantly more prevalent in some research fields, such as psychology and computer science, and some regions of the globe. Additionally, we find no temporal and seniority trends in violations. Finally, apart from openly releasing our large annotated dataset and method, we discuss how computational research integrity could be part of peer-review and the publication processes.
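The proportional ink principle itself can be illustrated with a simple rule-based check, quite unlike the paper's deep learning detector: on a bar chart with a truncated axis, the inked bar heights stop being proportional to the values they encode. The function name and tolerance below are illustrative assumptions.

```python
def proportional_ink_violation(values, axis_min, tolerance=0.05):
    """Rule-based check: on a bar chart whose axis starts at axis_min,
    the inked bar height is value - axis_min. If the ratio of inked
    heights drifts from the ratio of the underlying values by more
    than the tolerance, the chart violates the proportional ink principle."""
    inked = [v - axis_min for v in values]
    for v1, i1 in zip(values, inked):
        for v2, i2 in zip(values, inked):
            if v2 == 0 or i2 == 0:
                continue
            if abs(i1 / i2 - v1 / v2) > tolerance * (v1 / v2):
                return True
    return False

# Axis starting at zero: ink ratios match value ratios exactly
print(proportional_ink_violation([10, 20, 40], axis_min=0))  # → False
# Axis truncated at 8: the 10-unit bar keeps only 2 units of ink
print(proportional_ink_violation([10, 20, 40], axis_min=8))  # → True
```

The paper's contribution is detecting such distortions from rendered figure images at scale, where the underlying values and axis baseline must be recovered visually rather than read from data.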


Dynamic visualisation of million-tip trees: The OneZoom project – Wong – Methods in Ecology and Evolution – Wiley Online Library



“The complete tree of life is now available, but methods to visualise it are still needed to meet needs in research, teaching and science communication. Dynamic visualisation of million-tip trees requires many challenges in data synthesis, data handling and computer graphics to be overcome.
Our approach is to automate data processing, synthesise data from a wide range of available sources, then to feed these data to a client-side visualisation engine in parts. We develop a way to store the whole tree topology locally in a highly compressed form, then dynamically populate metadata such as text and images as the user explores.
The result is a seamless and smooth way to explore the complete tree of life, including images and metadata, even on relatively old mobile devices.
The underlying methods developed have applications that transcend tree of life visualisation. For the whole complete tree, we describe automated ID mappings between well known resources without resorting to taxonomic name resolution, automated methods to collate sets of public domain representative images for higher taxa, and an index to measure public interest of individual species.
The visualisation layout and the client user interface are both abstracted components of the codebase enabling other zoomable tree layouts to be swapped in, and supporting multiple applications including exhibition kiosks and digital art.
After 10 years of work, our tree of life explorer is now broadly complete, it has attracted nearly 1.5 million online users, and is backed by a novel long-term sustainability plan. We conclude our description of the OneZoom project by suggesting the next challenges that need to be solved in this field: extinct species and guided tours around the tree.”
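The abstract's idea of storing the whole tree topology locally in a highly compressed form can be sketched with a classic compact encoding, balanced parentheses: one bit on entering a node and one on leaving it, so n nodes compress to 2n bits before labels and metadata, which are fetched separately on demand. This is an illustration of the general technique, not OneZoom's actual format.

```python
def encode_topology(node):
    """Balanced-parentheses encoding of a tree's shape (labels excluded):
    emit '1' when entering a node and '0' when leaving it, so a tree
    with n nodes becomes a 2n-bit string."""
    bits = "1"
    for child in node.get("children", []):
        bits += encode_topology(child)
    return bits + "0"

# A tiny 4-node tree: root -> (childA -> grandchild, childB)
tree = {"children": [{"children": [{"children": []}]},
                     {"children": []}]}

print(encode_topology(tree))  # → 11100100 (8 bits for 4 nodes)
```

Shipping only the shape up front, and populating names and images lazily as the user zooms, is what keeps the initial payload small enough for old mobile devices.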

“The Google Earth of Biology” – Visually Stunning Tree of All Known Life Unveiled Online


“The OneZoom explorer – available at onezoom.org – maps the connections between 2.2 million living species, the closest thing yet to a single view of all species known to science. The interactive tree of life allows users to zoom in to any species and explore its relationships with others, in a seamless visualisation on a single web page. The explorer also includes images of over 85,000 species, plus, where known, their vulnerability to extinction.

OneZoom was developed by Imperial College London biodiversity researcher Dr. James Rosindell and University of Oxford evolutionary biologist Dr. Yan Wong. In a paper published today in Methods in Ecology and Evolution, Drs Wong and Rosindell present the result of over ten years of work, gradually creating what they regard as “the Google Earth of biology.” …”

Dr. Wong added: “It’s extraordinary how much research there is still to be done. Building the OneZoom tree of life was only possible through sophisticated methods to gather and combine existing data – it would have been impossible to curate all this by hand.”

Japan LIVE Dashboard” for COVID-19: A Scalable Solution to Monitor Real-Time and Regional-Level Epidemic Case Data

Abstract:  Under pandemic conditions, it is important to communicate local infection risks to better enable the general population to adjust their behaviors accordingly. In Japan, our team operates a popular non-government and not-for-profit dashboard project – “Japan LIVE Dashboard” – which allows the public to easily grasp the evolution of the pandemic on the internet. We presented the Dashboard design concept with a generic framework integrating socio-technical theories, disease epidemiology and related contexts, and evidence-based approaches. Through synthesizing multiple types of reliable and real-time local data sources from all prefectures across the country, the Dashboard allows the public access to user-friendly and intuitive disease visualization in real time and has gained an extensive online followership. To date, it has attracted c.30 million visits (98% domestic access), testifying to the reputation it has acquired as a user-friendly portal for understanding the progression of the pandemic. Designed as an open-source solution, the Dashboard can also be adopted by other countries as well as made applicable for other emerging outbreaks in the future. Furthermore, the conceptual design framework may prove applicable to other eHealth solutions scaled for global pandemics.


20+ years of open access in Australia

“There have been open research initiatives in Australia since the very beginning of global discussions on open access to research publications in the early 2000s. The initiatives in Australia have come from a range of actors, including the federal government, funders, institutions, and peak and advocacy bodies. This arrow illustrates some of the key initiatives over the past 20 years. In 2020, the Council of Australian University Librarians (CAUL) and the Australasian Open Access Strategy Group (AOASG, now Open Access Australasia) facilitated a national discussion on open research. In 2021, there is increased momentum towards open access to research publications driven by work from the Office of the Chief Scientist, Dr Cathy Foley.”

Preclinical Western Blot in the Era of Digital Transformation and Reproducible Research, an Eastern Perspective | SpringerLink

Abstract:  The current research is an interdisciplinary endeavor to develop a necessary tool in preclinical protein studies of diseases or disorders through western blotting. In the era of digital transformation and open access principles, an interactive cloud-based database called East–West Blot (https://rancs-lab.shinyapps.io/WesternBlots) is designed and developed. The online interactive subject-specific database built on the R shiny platform facilitates a systematic literature search on the specific subject matter, here set to western blot studies of protein regulation in the preclinical model of TBI. The tool summarizes the existing publicly available knowledge through a data visualization technique and easy access to the critical data elements and links to the study itself. The application compiled a relational database of PubMed-indexed western blot studies labeled under HHS public access, reporting downstream protein regulations presented by fluid percussion injury model of traumatic brain injury. The promises of the developed tool include progressing toward implementing the principles of 3Rs (replacement, reduction, and refinement) for humane experiments, cultivating the prerequisites of reproducible research in terms of reporting characteristics, paving the ways for a more collaborative experimental design in basic science, and rendering an up-to-date and summarized perspective of current publicly available knowledge.


Harnessing Scholarly Literature as Data to Curate, Explore, and Evaluate Scientific Research

Abstract:  There currently exist hundreds of millions of scientific publications, with more being created at an ever-increasing rate. This is leading to information overload: the scale and complexity of this body of knowledge is increasing well beyond the capacity of any individual to make sense of it all, overwhelming traditional, manual methods of curation and synthesis. At the same time, the availability of this literature and surrounding metadata in structured, digital form, along with the proliferation of computing power and techniques to take advantage of large-scale and complex data, represents an opportunity to develop new tools and techniques to help people make connections, synthesize, and pose new hypotheses. This dissertation consists of several contributions of data, methods, and tools aimed at addressing information overload in science. My central contribution to this space is Autoreview, a framework for building and evaluating systems to automatically select relevant publications for literature reviews, starting from small sets of seed papers. These automated methods have the potential to help researchers save time and effort when keeping up with relevant literature, as well as surfacing papers that more manual methods may miss. I show that this approach can work to recommend relevant literature, and can also be used to systematically compare different features used in the recommendations. I also present the design, implementation, and evaluation of several visualization tools. One of these is an animated network visualization showing the influence of a scholar over time. Another is SciSight, an interactive system for recommending new authors and research by finding similarities along different dimensions. Additionally, I discuss the current state of available scholarly data sets; my work curating, linking, and building upon these data sets; and methods I developed to scale graph clustering techniques to very large networks.
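A seed-based recommender in the spirit of Autoreview can be illustrated, under strong simplifying assumptions, by ranking candidate papers by how much their reference lists overlap with the references of the seed set. The Jaccard scoring and toy paper IDs below are hypothetical sketches, not the dissertation's actual features or classifier.

```python
def rank_candidates(seed_refs, candidates):
    """Rank candidate papers by how much their reference lists overlap
    (Jaccard similarity) with the references of a small seed set."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0
    return sorted(candidates,
                  key=lambda c: jaccard(seed_refs, candidates[c]),
                  reverse=True)

# Toy paper IDs: references cited by the seed papers, and three candidates
seeds = {"p1", "p2", "p3"}
cands = {
    "candA": {"p1", "p2", "p9"},   # partial overlap
    "candB": {"p8", "p9"},         # no overlap
    "candC": {"p1", "p2", "p3"},   # full overlap
}

print(rank_candidates(seeds, cands))  # → ['candC', 'candA', 'candB']
```

The framework described in the abstract goes further by training on held-out portions of published review bibliographies, which also allows different candidate features to be compared systematically.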