Abstract: Wikidata has been widely used in Digital Humanities (DH) projects. However, a focused discussion of the current status, potential, and challenges of its application in the field is still lacking. A systematic review was conducted to identify and evaluate how DH projects perceive and utilize Wikidata, as well as the potential and challenges demonstrated through its use. This research concludes that: (1) Wikidata is understood in DH projects as a content provider, a platform, and a technology stack; (2) it is commonly implemented for annotation and enrichment, metadata curation, knowledge modelling, and Named Entity Recognition (NER); (3) most projects tend to consume data from Wikidata, whereas there is more potential to utilize it as a platform and a technology stack to publish data on Wikidata or to create an ecosystem of data exchange; and (4) projects face two types of challenges: technical issues in the implementations and concerns about Wikidata’s data quality. Based on these findings, the discussion addresses three issues related to coping with the challenges in the specific context of the DH field: the relevance and authority of other available domain sources; domain communities and their practices; and workflow design that coordinates technical and labour resources from projects and Wikidata.
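To make the "consuming data from Wikidata" pattern the review identifies concrete, here is a minimal sketch (our own, not from the article) of the kind of lookup a DH enrichment pipeline might perform against the public Wikidata Query Service; Q254 (Wolfgang Amadeus Mozart) is only a placeholder entity.

```python
# Minimal sketch: enriching a local record with data consumed from Wikidata.
# Uses the public Wikidata Query Service; Q254 (Mozart) is a placeholder.
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

query = """
SELECT ?birth ?death ?placeLabel WHERE {
  wd:Q254 wdt:P569 ?birth ;   # date of birth
          wdt:P570 ?death ;   # date of death
          wdt:P19  ?place .   # place of birth
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

response = requests.get(
    SPARQL_ENDPOINT,
    params={"query": query, "format": "json"},
    headers={"User-Agent": "dh-enrichment-demo/0.1 (example)"},
)
response.raise_for_status()

for row in response.json()["results"]["bindings"]:
    print(row["birth"]["value"], row["death"]["value"], row["placeLabel"]["value"])
```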
“Keen to try something with Wikidata! Got a crazy idea? Or a provocation? Or an idea that needs investigating?
Wikimedia Australia and Wikimedia Aotearoa New Zealand are offering two creative fellowship grants of $1000 (AUD) and one of $1000 (NZD) to curate a data set, develop a prototype or undertake an investigation using Wikidata. You will be matched with a Wikimedian who will mentor you throughout your project offering resources, feedback and support.
We are open to applicants from all backgrounds and skill levels, and support proposals that involve investigations. We are looking for proposals that are enthusiastic and innovative as opposed to requiring pre-existing technical skills.”
Abstract: In this article, we focus on the importance of open research information as the foundation for transparent and responsible research assessment and discovery of research outputs. We introduce work in which we support the open research information commons by enabling, in particular, independent and small Open Access journals to provide metadata to several open data hubs (OpenCitations, Wikidata, Open Research Knowledge Graph). In this context, we present The OPTIMETA Way, a means to integrate metadata collection, enrichment, and distribution in an effective and quality-ensured way that enables uptake even amongst small scholar-led publication venues. We have designed an implementation strategy for this approach in the form of two plugins for the most widely used journal publishing software, Open Journal Systems (OJS). These plugins collect, enrich, and automatically deliver citation metadata and spatio-temporal metadata for articles. Our contribution to research assessment and discovery with linked open bibliographic data is threefold. First, we enlarge the open research information data pool by advocating for the collection of enriched, user-validated metadata at the time of publication through open APIs. Second, we integrate data platforms and journals currently not included in the standard scientometric practices because of their language or lack of support from big publishing houses. Third, we allow new use cases based on location and temporal metadata that go beyond commonly used discovery features, specifically, the assessment of research activities using spatial coverage and new transdisciplinary connections between research outputs.
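As an illustration of the kind of open citation metadata the OJS plugins collect and pass on (this is our own sketch, not the OPTIMETA plugin code), the snippet below pulls one article's reference list from the public Crossref REST API; the DOI is a placeholder.

```python
# Minimal sketch: fetching open citation metadata for one article from the
# Crossref REST API. The DOI below is a placeholder.
import requests

doi = "10.5555/12345678"  # placeholder DOI
r = requests.get(
    f"https://api.crossref.org/works/{doi}",
    headers={"User-Agent": "optimeta-demo/0.1 (mailto:you@example.org)"},
)
r.raise_for_status()
work = r.json()["message"]

print(work.get("title", ["(no title)"])[0])
for ref in work.get("reference", []):  # references, when deposited openly
    print("cites:", ref.get("DOI") or ref.get("unstructured", "?"))
```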
Abstract: In recent years, several scientific digital libraries (DLs) in the digital humanities (DH) field have been developed following Open Science principles. These DLs aim at sharing research outcomes, in several cases as FAIR data, and at creating linked information spaces, often by means of Semantic Web technologies and Linked Data. This paper presents how current scientific DLs in the DH field can support the creation of linked information spaces and navigational services that allow users to explore them, using Semantic Web technologies to formally represent, search, and browse knowledge. To support the argument, we present our experience in developing a scientific DL that supports scholars in creating, evolving, and consulting a knowledge base related to Medieval and Renaissance geographical works within the three-year (2020–2023) Italian national research project IMAGO—Index Medii Aevi Geographiae Operum. In the presented case study, a linked information space was created to allow users to discover and navigate knowledge across multiple repositories, thanks to the extensive use of ontologies. In particular, the linked information spaces created within the IMAGO project make use of five different datasets: Wikidata, the MIRABILE digital archive, the Nuovo Soggettario thesaurus, the Mapping Manuscript Migration knowledge base, and the Pleiades gazetteer. Linking these datasets considerably enriches the knowledge collected in the IMAGO KB.
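One simple example of the cross-dataset links such a knowledge base can exploit (our own sketch, not IMAGO code): Wikidata places that carry a Pleiades gazetteer identifier, assuming P1584 is the "Pleiades ID" property.

```python
# Minimal sketch: bridging Wikidata and the Pleiades gazetteer via the
# Pleiades ID property (assumed to be P1584).
import requests

query = """
SELECT ?place ?placeLabel ?pleiades WHERE {
  ?place wdt:P1584 ?pleiades .   # Pleiades ID
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "imago-demo/0.1 (example)"},
)
r.raise_for_status()
for b in r.json()["results"]["bindings"]:
    print(b["placeLabel"]["value"], "->",
          f"https://pleiades.stoa.org/places/{b['pleiades']['value']}")
```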
“Wikidata went live on 29 October 2012; in 2022, we are celebrating 10 years of Wikidata together! Let’s organize celebration events all around the world. We are hoping to create a huge network of decentralized, local and community-led events, that could take place onsite or online, around October 2022. The goal of these birthday celebrations is to celebrate the achievements of the community, to bring people together, and also to talk about Wikidata to the rest of the world in order to get more people onboard. In various areas of the world, people get together to organize plenty of different birthday events: meetups, workshops, discussions, live streams, editing campaigns… You can have a look at the events calendar below to find events in your area….”
Wiki Education is hosting webinars all of October to celebrate Wikidata’s 10th birthday. Below is a summary of our first event. Watch Tuesday’s webinar in full on our YouTube. Sign up for our next three events here.
Never before has the world had a tool like Wikidata. The semantic database behind Wikipedia and virtual assistants like Siri and Alexa is only ten years old this month, and yet, with almost 100 million items, it’s the biggest open database ever. Wiki Education’s “Wikidata Will” Kent gathered key players in the Wikidataverse to reflect on the last ten years and set our sights on the next ten. Kelly Doyle, the Open Knowledge Coordinator for the Smithsonian Institution; Andrew Lih, Wikimedian at Large with the Smithsonian Institution and Wikimedia strategist with the Metropolitan Museum of Art; and Lane Rasberry, Wikimedian in Residence at the University of Virginia’s Data Science Institute discussed the “little database that could” (not so little anymore!).
Abstract: Scholia for Software is a project to add software profiling features to Scholia, a scholarly profiling service from the Wikimedia ecosystem that is integrated with Wikipedia and Wikidata. This document is an adaptation of the funded grant proposal. We are sharing it for several reasons: for research transparency; to encourage the sharing of research proposals for reuse and remixing in general; to assist others specifically in making proposals that would complement our activities; and because sharing this proposal helps us tell the story of the project to community stakeholders.
A “scholarly profiling service” is a tool which assists the user in accessing data on some aspect of scholarship, usually in relation to research. Typical features of such services include returning the bibliography of academic publications for any given researcher, or providing a list of publications by topic. Scholia already exists as a Wikimedia platform tool built upon Wikidata and capable of serving these functions. This project will additionally add software-related data to Wikidata, develop Scholia’s own code, and address some ethical issues of diversity and representation around these activities. The end result will be that Scholia can report what software a given researcher has described using in their publications, what software is most used among authors publishing on a given topic or in a given journal, what papers describe projects that use a given software, and what software is most often co-used in projects that use a given software.
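A rough sketch of the kind of query such features could run (an assumption on our part about the modelling, using Wikidata's "uses" property P2283; Q28865, Python, is just a placeholder software item):

```python
# Minimal sketch: scholarly articles that declare they use a given software,
# assuming such links are modelled with the "uses" property (P2283).
import requests

query = """
SELECT ?paper ?paperLabel WHERE {
  ?paper wdt:P31 wd:Q13442814 ;   # instance of: scholarly article
         wdt:P2283 wd:Q28865 .    # uses: Python (placeholder software)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""
r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "scholia-software-demo/0.1 (example)"},
)
r.raise_for_status()
for b in r.json()["results"]["bindings"]:
    print(b["paperLabel"]["value"])
```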
“In October 2022, we will celebrate the 10th anniversary of Wikidata together! For this special occasion, we are creating a collaborative video that will show people from all around the world celebrating Wikidata’s birthday, sharing wishes and appreciation for the Wikidata community, and explaining why they like Wikidata. We would love to invite you to participate in this video! You will find below more information about how to participate. In short: you can film one or several videos and send them through this form before September 18th. Please make sure that your videos have a maximum size of 1GB and are filmed in 30 or 60 fps. If you need help with filming the video, feel free to contact us. You can also join one of our workshops….”
“OpenAlex is a free and open Scientific Knowledge Graph (SKG). It contains information describing approximately 230M scholarly works, drawn from both structured (e.g. Crossref) and unstructured (e.g. institutional repositories, publisher websites) sources, clustered/merged into distinct records, and linked by citations. By parsing work metadata and enriching it with external PID sources (ROR, ORCID, ISSN Network, PubMed, Wikidata, etc.), OpenAlex describes and links (approximately) 200M author clusters, 100k institutions, and 100k venues (journals and repositories). Using a neural-net classifier, we assign one or more of 50k Wikidata concepts to each work. All source code and ML models are available openly, and data is freely available via a high-performance API, a complete database dump, and a search-engine-style web interface. This talk will describe the construction of OpenAlex, compare it to other SKGs (e.g. Scopus, MAG), and discuss plans for the future.”
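The API mentioned in the talk is open and needs no key; here is a minimal sketch fetching a few works and the Wikidata-derived concepts assigned to them.

```python
# Minimal sketch: querying the public OpenAlex API for works and the
# Wikidata concepts the classifier assigned to them.
import requests

r = requests.get(
    "https://api.openalex.org/works",
    params={"search": "knowledge graph", "per-page": 3},
)
r.raise_for_status()

for work in r.json()["results"]:
    print(work["display_name"], work.get("publication_year"))
    for concept in work.get("concepts", [])[:3]:
        # each assigned concept carries a link to its Wikidata item
        print("  concept:", concept["display_name"], concept.get("wikidata"))
```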
“Wikidata for Scholarly Communication Librarianship was developed for anyone working in an academic library (or interested in working in an academic library) who may have a small or large role in supporting scholarly communication related services. The first two chapters, however, could serve as a basic introduction to Wikidata for anyone in academic librarianship. The remaining three chapters focus on a few topics that may be of more interest to those who work on open metadata, research metrics, and researcher profile projects….”
Abstract: In this article, the authors share the different methods and tools utilized for supporting the Scholarly Profiles as Service (SPaS) model at Indiana University–Purdue University Indianapolis (IUPUI). Leveraging Wikidata to build a scholarly profile service aligns with interests in supporting open knowledge and provides opportunities to address information inequities. The article accounts for the authors’ decision to focus first on profiles for women scholars at the university and provides a detailed case study of how these profiles are created. By describing the processes of delivering the service, the authors hope to inspire other academic libraries to work toward establishing stronger open data connections between academic institutions, their scholars, and their scholars’ publications.
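To give a flavour of how such profiles are typically created in bulk (this is a generic sketch, not the IUPUI workflow), the snippet below generates tab-separated QuickStatements V1 commands for a basic scholar item; the name, description, and employer QID are placeholders.

```python
# Minimal sketch: generating QuickStatements (V1, tab-separated) commands to
# create a basic scholar item on Wikidata. All values below are placeholders.
def scholar_commands(name: str, description: str, employer_qid: str) -> str:
    rows = [
        ["CREATE"],
        ["LAST", "Len", f'"{name}"'],         # English label
        ["LAST", "Den", f'"{description}"'],  # English description
        ["LAST", "P31", "Q5"],                # instance of: human
        ["LAST", "P108", employer_qid],       # employer (placeholder QID)
    ]
    return "\n".join("\t".join(row) for row in rows)

# Placeholder values; replace the QID with the institution's actual item.
print(scholar_commands("Jane Doe", "researcher", "Q3918"))
```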
Abstract: Large public knowledge graphs, like Wikidata, contain billions of statements about tens of millions of entities, thus inspiring various use cases to exploit such knowledge graphs. However, practice shows that much of the relevant information that fits users’ needs is still missing in Wikidata, while current linked open data (LOD) tools are not suitable for enriching large graphs like Wikidata. In this paper, we investigate the potential of enriching Wikidata with structured data sources from the LOD cloud. We present a novel workflow that includes gap detection, source selection, schema alignment, and semantic validation. We evaluate our enrichment method with two complementary LOD sources: a noisy source with broad coverage, DBpedia, and a manually curated source with a narrow focus on the art domain, Getty. Our experiments show that our workflow can enrich Wikidata with millions of high-quality novel statements from external LOD sources. Property alignment and data quality are key challenges, whereas entity alignment and source selection are well-supported by existing Wikidata mechanisms. We make our code and data available to support future work.
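As a toy illustration of the "gap detection" step (our own sketch under simplified assumptions, not the paper's code): finding Wikidata paintings with no creator statement, i.e., candidate gaps a curated source such as Getty might fill.

```python
# Minimal sketch of gap detection: paintings (Q3305213) lacking a creator
# (P170) statement are candidate gaps for enrichment from external sources.
import requests

query = """
SELECT ?painting WHERE {
  ?painting wdt:P31 wd:Q3305213 .                       # instance of: painting
  FILTER NOT EXISTS { ?painting wdt:P170 ?creator . }   # no creator recorded
}
LIMIT 10
"""
r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "gap-detection-demo/0.1 (example)"},
)
r.raise_for_status()
gaps = [b["painting"]["value"] for b in r.json()["results"]["bindings"]]
print(f"{len(gaps)} candidate gaps, e.g.:", gaps[:3])
```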
Abstract: Biological taxonomy rests on a long tail of publications spanning nearly three centuries. Not only is this literature vital to resolving disputes about taxonomy and nomenclature, for many species it represents a key source—indeed sometimes the only source—of information about that species. Unlike other disciplines such as biomedicine, the taxonomic community lacks a centralised, curated literature database (the “bibliography of life”). This article argues that Wikidata can be that database as it has flexible and sophisticated models of bibliographic information, and an active community of people and programs (“bots”) adding, editing, and curating that information.
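One way to see Wikidata's bibliographic model in action for taxonomy (a minimal sketch of our own, not from the article): listing scholarly articles whose main subject (P921) is a given taxon; Q140 (lion) is only a placeholder.

```python
# Minimal sketch: Wikidata as a "bibliography of life" — articles whose
# main subject (P921) is a placeholder taxon (Q140, lion).
import requests

query = """
SELECT ?article ?articleLabel ?date WHERE {
  ?article wdt:P31 wd:Q13442814 ;           # instance of: scholarly article
           wdt:P921 wd:Q140 .               # main subject: lion (placeholder)
  OPTIONAL { ?article wdt:P577 ?date . }    # publication date
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""
r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "bibliography-of-life-demo/0.1 (example)"},
)
r.raise_for_status()
for b in r.json()["results"]["bindings"]:
    print(b.get("date", {}).get("value", "----"), b["articleLabel"]["value"])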
Abstract: Contemporary bioinformatic and chemoinformatic capabilities hold promise to reshape knowledge management, analysis and interpretation of data in natural products research. Currently, reliance on a disparate set of non-standardized, insular, and specialized databases presents a series of challenges for data access, both within the discipline and for integration and interoperability between related fields. The fundamental elements of exchange are referenced structure-organism pairs that establish relationships between distinct molecular structures and the living organisms from which they were identified. Consolidating and sharing such information via an open platform has strong transformative potential for natural products research and beyond. This is the ultimate goal of the newly established LOTUS initiative, which has now completed the first steps toward the harmonization, curation, validation and open dissemination of 750,000+ referenced structure-organism pairs. LOTUS data is hosted on Wikidata and regularly mirrored on https://lotus.naturalproducts.net. Data sharing within the Wikidata framework broadens data access and interoperability, opening new possibilities for community curation and evolving publication models. Furthermore, embedding LOTUS data into the vast Wikidata knowledge graph will facilitate new biological and chemical insights. The LOTUS initiative represents an important advancement in the design and deployment of a comprehensive and collaborative natural products knowledge base.
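Because LOTUS data lives on Wikidata, the structure-organism pairs can be queried directly; here is a minimal sketch (ours, not LOTUS code) listing taxa in which a given compound has been reported via the "found in taxon" property (P703); Q60235 (caffeine) is just a placeholder compound.

```python
# Minimal sketch: taxa in which a placeholder compound (Q60235, caffeine)
# was reported, via the "found in taxon" property (P703) used by LOTUS.
import requests

query = """
SELECT ?taxon ?taxonLabel WHERE {
  wd:Q60235 wdt:P703 ?taxon .   # caffeine found in taxon ...
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""
r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "lotus-demo/0.1 (example)"},
)
r.raise_for_status()
for b in r.json()["results"]["bindings"]:
    print(b["taxonLabel"]["value"])
```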