TIB at WikidataCon: Part 2

This is the second installment of a 2-part blog post covering the latest edition of WikidataCon, held October 29–31, 2021. For more about the conference and its general themes, as well as recent updates to the vision and strategy of Linked Open Data development within the Wikimedia ecosystem, see the first part of this post.

Focus on TIB’s conference contribution

Members of TIB’s Open Science Lab (OSL) participated in 3 presentations on Sunday, October 31st, in the context of the Wikibase and Education and Science tracks. Learn more about each presentation below:

Wikibase as RDM infrastructure within NFDI4Culture

[Wikibase track]

In the first half of this session, OSL’s Ina Blümel and Lucia Sohmen discussed the new minimum viable product (MVP) toolchain that we are developing in the context of NFDI4Culture’s Task Area 1 “Data capture and enrichment”. The MVP architecture relies on Wikibase to store and structure contextual metadata and user-contributed annotations for 3D models and reconstructions of cultural assets.

Simplified diagram representing the MVP architecture. Credit: Lozana Rossenova. CC-BY-SA 4.0

With a view towards sustainability, the MVP development aligns with the overall strategy of the Wikimedia Movement to support a decentralized ecosystem of federated Wikibase instances wherein data from Wikidata (and other data sources) is re-used and re-contextualized for specialist domains (e.g. cultural heritage). It also addresses needs identified by various communities for additional, domain-specific extensions, tools, and user interfaces around the Wikibase software. In Phase 1 of development, we designed an accessible pipeline that streamlines the metadata upload process (via the open source software OpenRefine, see below). We also developed a custom branch of the open source software Kompakkt to serve as an extended frontend to the Wikibase repository. With Kompakkt, users can upload, view and annotate 2D, 3D, and audio-visual media in a range of file formats through a modern, user-friendly, web-based interface. The next development phase will make it possible to use the Wikibase API and SPARQL endpoint for bulk annotations as well. In the future, the MVP will be open to any project that wants to store, visualize and annotate complex visual data. [See presentation slides here.]
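To give a rough idea of what working against a Wikibase SPARQL endpoint can look like, here is a minimal Python sketch. It is not the MVP's code: the endpoint URL and the sample response are hypothetical, and the `format=json` parameter is an assumption that holds for Blazegraph-based query services such as the one bundled with Wikibase, not a part of the SPARQL protocol itself. The sketch only builds the request URL and flattens a response in the standard SPARQL 1.1 JSON results format, so it runs without network access.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode

# Hypothetical endpoint of a self-hosted Wikibase; replace with a real one.
SPARQL_ENDPOINT = "https://wikibase.example.org/query/sparql"

# A generic query: items with an English label (not specific to the MVP data model).
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?label WHERE {
  ?item rdfs:label ?label .
  FILTER(LANG(?label) = "en")
} LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Build a GET URL that passes the query in the `query` parameter."""
    # `format=json` is an assumption (accepted by Blazegraph-based services).
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


def extract_bindings(response_text: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL 1.1 JSON results document into plain dicts."""
    data = json.loads(response_text)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


# Hypothetical response, shaped like the SPARQL 1.1 JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["item", "label"]},
    "results": {"bindings": [{
        "item": {"type": "uri", "value": "https://wikibase.example.org/entity/Q1"},
        "label": {"type": "literal", "xml:lang": "en", "value": "Example model"},
    }]},
})

url = build_query_url(SPARQL_ENDPOINT, QUERY)
rows = extract_bindings(SAMPLE_RESPONSE)
```

In practice the URL would be fetched with an HTTP client and the response body passed to `extract_bindings`; the resulting rows could then feed a bulk workflow.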

The second part of the session focused on the potential benefits and challenges of using Wikibase in the context of “The 4Culture Knowledge Graph” (part of Task Area 5, to which TIB also contributes staff and infrastructure resources). The presentation was delivered by Harald Sack from FIZ Karlsruhe, who explained why formal semantics need to be an integral part of the 4Culture Knowledge Graph, and not just an add-on. Wikibase and its MediaWiki GUI make LOD more accessible to users and offer opportunities for collaboration and community engagement, which are important incentives for broader adoption within the NFDI consortium. At the same time, the lack of native semantics and W3C standard vocabularies (RDF, RDFS, OWL) in Wikibase negatively impacts interoperability, data reuse and federation outside the ‘bubble’ of the Wikidata/Wikibase ecosystem. The presentation offered several mitigation strategies for the issue of formal semantics that are currently being tested and evaluated at FIZ Karlsruhe: declarative semantic mapping, data import/export (via the triple store), and development of a dedicated semantic extension. The results of evaluating these workarounds will be published as Guidelines and Best Practices to enable the NFDI4Culture community to share their data resources within a federated Knowledge Graph via Wikibase instances. [Download presentation slides here.]

Using OpenRefine with arbitrary Wikibase instances

[Wikibase track]

Building on the presentation of the 3D annotation MVP toolchain, Lozana Rossenova and Lucia Sohmen delivered a lightning talk which expanded on the data pipeline developed for the MVP. The talk focused on the role of OpenRefine in the data pipeline. OpenRefine allows users to clean data, transform it and reconcile it against other open data sources, like Wikidata. It also makes it possible to upload data directly to Wikidata, as well as to pull data from it. Recently, this functionality was extended to make it possible to connect to any Wikibase instance. However, this requires additional server-side and frontend configuration. Much of this is not yet fully documented, so with this presentation we aimed to provide a succinct overview of all necessary steps in the process. We also presented a service box, developed at TIB, that automates the server-side setup.

We demoed the steps users need to perform in the frontend and tested uploading sample data to a Wikibase instance. Given that the version of OpenRefine that allows Wikibase connection is still a beta pre-release, we did encounter some bugs during the live demonstration. Fortunately, OpenRefine’s lead developer Antonin Delpeuch was also in the audience and took note of them. We plan to work closely with the OpenRefine team to help with their documentation and bug testing efforts since OpenRefine is an essential part of our data upload pipeline. And in the spirit of “it takes a village to raise a tool” (see Part 1 of this blog post), we want to support a tool that plays a vital role across many community projects within the Wikimedia Movement at large, as well as the 4Culture community more narrowly. [See presentation slides here.]
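As an illustration of the frontend configuration mentioned above: OpenRefine is pointed at a Wikibase instance through a JSON manifest. The Python sketch below generates such a manifest. The field names reflect our understanding of the manifest format (version 1.0) and may differ between OpenRefine releases, so they should be checked against the OpenRefine documentation; the URL layout (`/wiki/`, `/w/api.php`, `/entity/`), the reconciliation path and the property IDs are hypothetical placeholders. This is not the TIB service box.

```python
import json


def build_manifest(name: str, base_url: str,
                   instance_of: str, subclass_of: str) -> dict:
    """Assemble a manifest describing a Wikibase instance for OpenRefine.

    The URL layout below is an assumption; adjust it to the actual
    installation.
    """
    root = base_url.rstrip("/")
    return {
        "version": "1.0",
        "mediawiki": {
            "name": name,
            "root": f"{root}/wiki/",
            "main_page": f"{root}/wiki/Main_Page",
            "api": f"{root}/w/api.php",
        },
        "wikibase": {
            "site_iri": f"{root}/entity/",
            "maximum_edits_per_minute": 60,
            # Local property IDs differ from Wikidata's P31 / P279.
            "properties": {
                "instance_of": instance_of,
                "subclass_of": subclass_of,
            },
        },
        "reconciliation": {
            # "${lang}" is a literal placeholder kept in the output;
            # the path itself is hypothetical.
            "endpoint": f"{root}/reconcile/${{lang}}/api",
        },
    }


manifest = build_manifest("Demo Wikibase", "https://wikibase.example.org/", "P1", "P2")
manifest_json = json.dumps(manifest, indent=2)
```

The resulting JSON text can be pasted into OpenRefine when adding a new Wikibase instance; the reconciliation endpoint only works once the server-side reconciliation service is actually running.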

Integration of Wikidata 4OpenGLAM into data and information science curricula

[Education and Science track]

Wikidata and OpenRefine have long been used in academia, as they are good tools for teaching data science skills. There are many examples of this and a lot of material that can be used in teaching and for self-study. In this presentation, Ina Blümel showcased several new online resources developed last semester for and with students of information science, as part of a project on linking and visualising cultural heritage data using Wikidata and of two data science courses at Hannover University of Applied Sciences and Arts.

Exemplary student work from a SPARQL and visualisation task. Source: Slide deck (see link below).

We focussed on how to integrate student work with the projects of the Open Science Lab (9 projects in total, of which 6 are in OpenGLAM and 4 use Wikidata and/or Wikibase) and on how to motivate students to engage with more advanced tasks in the field of cultural heritage. Lucia Sohmen presented tasks she designed for one of the courses to teach students different ways of interacting with open data. These included downloading data via an API (OAI-PMH) and by scraping IIIF manifests with a Python library; cleaning and transforming data and then uploading it to Wikidata, all through OpenRefine; and querying and visualizing their data using Wikidata’s SPARQL interface. [See presentation slides here.]
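The first of these tasks can be sketched in a few lines of Python. This is not the course material: it uses only the standard library and plain dictionaries rather than the Python library used in class, the repository and image URLs are hypothetical, and the manifest parsing assumes the IIIF Presentation API 2.x layout (`sequences` → `canvases` → `images` → `resource`).

```python
from typing import List
from urllib.parse import urlencode


def oai_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def image_urls_from_manifest(manifest: dict) -> List[str]:
    """Collect image URLs from a IIIF Presentation API 2.x manifest."""
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                if "@id" in resource:
                    urls.append(resource["@id"])
    return urls


# Hypothetical manifest, trimmed to the fields the function reads.
sample_manifest = {
    "@type": "sc:Manifest",
    "sequences": [{
        "canvases": [
            {"images": [{"resource": {
                "@id": "https://iiif.example.org/img/1/full/full/0/default.jpg"}}]},
            {"images": [{"resource": {
                "@id": "https://iiif.example.org/img/2/full/full/0/default.jpg"}}]},
        ]
    }],
}

request_url = oai_list_records_url("https://repository.example.org/oai")
image_urls = image_urls_from_manifest(sample_manifest)
```

A real harvest would fetch `request_url`, parse the XML response and follow resumption tokens for long record lists; those steps are left out here to keep the sketch short.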

Wikidata & Education: A Global Panel

[Education and Science track]

During this panel, Houcemeddine Turki, Research Assistant at the Data Engineering and Semantics Research Unit based at the University of Sfax, Tunisia, showcased a joint research proposal of the DES Unit and OSL, which involves the use of Wikidata in OSL’s Book Sprints. This proposal was developed with Christian Hauschke, Lambert Heller and Simon Worthington from OSL, in collaboration with researchers from several other institutions.

Outlook

The next event where many of these topics will be presented is the Culture Community Plenary. If you want to stay up to date, you can follow Open Science Lab on Twitter and sign up for the NFDI4Culture newsletter.

The post TIB at WikidataCon: Part 2 first appeared on TIB-Blog.

The post TIB at WikidataCon: Part 2 first appeared on Leibniz Research Alliance Open Science.

TIB at WikidataCon: Part 1

Reflecting on questions of sustainability, growing the ecosystem of decentralized data repositories and ensuring knowledge equity

Introduction

This year WikidataCon marked the 9th birthday of Wikidata: “a free, collaborative, multilingual knowledge base with a focus on verifiability” [1]. The biennial conference took place online across all timezones from October 29 to 31, opening up participation to a global audience. The conference included 142 sessions and roughly 80 hours of programming, and over 700 unique visitors checked into the event platform Venueless [2]. Beyond the numbers, this conference marks the growth of Wikidata into a mature product – part of the family of applications developed and maintained by the Wikimedia Movement – as well as the growth of a dedicated community of “project shapers”, “gardeners”, and “re-users” [3].

Shortly before the opening of the conference, Wikimedia Germany (the primary maintainers of Wikidata) and the Wikimedia Foundation published updated documents for their 2021 strategy for Linked Open Data (LOD) within the Wikimedia movement. The documents set out the vision for the development of Wikidata, their flagship LOD platform, as well as Wikibase, the underlying software that can enable a decentralized ecosystem of LOD repositories to grow. They focus on several key areas that were also reflected in the conference programme. Below we provide a short overview of these.

Diagramme showing an ecosystem of decentralized Wikibase knowledge bases.
A view of the Wikimedia Linked Open Data web. Credit: Dan Shick (WMDE) / CC-BY-SA 4.0

Focus on services

A strong thread throughout the strategy documents and the conference programme is the scalable and sustainable provision of knowledge services. This includes the acknowledgement that making data in Wikidata easy to find and re-use with a high degree of trust in its quality relies on a range of additional tools and interfaces that need to connect easily with Wikidata via new and improved APIs. Several conference sessions focused on this topic.

Another key aspect of the focus on services is the scalability of the current query service that Wikidata provides (WDQS), which has been under significant strain as the knowledge graph has grown over the past years. In the spirit of openness, members of the technical teams of Wikidata and the Search Platform at Wikimedia used two dedicated conference sessions to give an overview of current issues and an outlook on how they plan to manage the risks of rapid scaling and system overload. Besides short-term solutions, one of the key strategies discussed for longer-term scalability was decentralization and federation across multiple data stores.

Last but not least, reliable service provision requires sustainable tool ecosystem management – a particular challenge for large open source software movements relying on a high degree of self-initiative and volunteer labour. A dedicated panel session brought together the perspectives of tool developers, maintainers, volunteers and WMF officials around the same (virtual) table at the conference to discuss this issue. A day before the session, a member of the tool development community published a related blog post analysing the current challenges facing WMF and its tool environment, and proposed relevant mitigation tactics, including a focus on collaboration and on harnessing the contributions of non-technical volunteers:

It takes a village to raise a tool – and various specialties ranging from product ownership, design, development, operations, testing, QA, security, documentation… – yet more often than not, a single person is behind a tool. ~ Jean-Frédéric [4]

2×2 matrix for prioritizing tool support needs, drafted by Andrew Lih and shared during the sustainable tool ecosystem management panel session.

Focus on equity 

Sustainability was indeed the main theme of the conference, but it was also discussed in the context of a parallel initiative: Reimagining Wikidata from the margins [5]. This year, besides a focus on the technical, the new strategy documents and the conference as a whole had an explicitly social focus, too – acknowledging the various inequities endemic to all open movements that rely on contributions from volunteers with access to technical skills, digital literacy, financial means and leisure time, among other forms of social privilege. In practice, this meant that the conference was co-organized in partnership with Wiki Movimento Brasil and that many sessions aimed explicitly at representing a diversity of national, ethnic and linguistic backgrounds.

These sessions aimed to amplify a plurality of voices traditionally marginalized by the domination of organisations and communities from (primarily) North America and Western Europe in the decision-making and data (re)use policies and practices around Wikidata and the Wikimedia movement in general. Crucially, the conference engaged with the question of equity beyond simply the issue of representation. The opening keynote ‘Decolonizing Wikidata: why does knowledge justice matter for structured data’ was delivered by Anasuya Sengupta, an Indian feminist activist, scholar, and long-time Wikimedian. Throughout the keynote and in subsequent sessions, Sengupta provided a nuanced analysis of the state of the Wikimedia movement, the call to decolonize, and the need to move away from universalizing ideas around what a global knowledge base should look like. A clear message throughout these thought-provoking sessions was the need to focus on decentralization, and to allow for an interlinked – but also non-universalizing – ecosystem of plural community knowledge bases and plural ontologies to be sustained.

Three ideas were highlighted throughout the second and third day of the conference: 1) decentralization, 2) sustainability through broad community engagement, and 3) recognition of the importance of bringing diverse perspectives to the movement as a whole, and to the development of software tools like Wikidata and Wikibase in particular. On those days, the community tracks spanned 10 different topics, including Sustainability, GLAM, Education and Science, and more [6].

Focus on Wikibase track

Of particular significance to our work at the Open Science Lab at TIB were the GLAM and Education and Science tracks, as well as the track dedicated to Wikibase. OSL’s researcher Lozana Rossenova, serving as Wikibase community manager for NFDI4Culture, was invited by Wikimedia Germany to co-curate and help facilitate the programme for the Wikibase track. The programme for this track provided an opportunity to learn more about the latest research-led and institutional projects featuring Wikibase, get inspiration from diverse use-cases, and learn more about the latest developments in the tool ecosystem around Wikibase. The track featured an introduction to the Wikibase Stakeholder Group, a new cross-institutional effort – including TIB – which was established to secure further development and long-term sustainability of Wikibase and related extensions. Furthermore, a presentation by Adam Shorland (Tech Lead for Wikidata and Wikibase at Wikimedia Germany) and Sam Alipio (Product Manager for Wikibase Ecosystem at Wikimedia Germany) announced a new service launching in 2022 – wikibase.cloud, which aims to meet the need of independent Wikibase users to easily deploy and manage cloud-based instances. At TIB, we will be working closely with the team at Wikimedia Germany to evaluate how wikibase.cloud can help meet the needs of our research partners in ongoing programs at OSL and NFDI4Culture.

OSL team members participated in 3 presentations on the final day of the conference – Sunday, October 31st, in the context of the Wikibase and Education and Science tracks. Learn more about the presentations in the second part of this blog post.

 

Endnotes

[1] Source: https://meta.wikimedia.org/wiki/LinkedOpenData/Strategy2021/Wikidata

[2] Stats provided by Léa Lacroix, Community Engagement Coordinator at Wikimedia Germany.

[3] Source: https://meta.wikimedia.org/wiki/LinkedOpenData/Strategy2021/Wikidata

[4] Berthelot, Jean-Frédéric. 2021. “Where is the technical volunteer support in the Wikiverse?” Available from: https://commonists.wordpress.com/2021/10/29/where-is-the-technical-volunteer-support-in-the-wikiverse/

[5] Source: https://www.wikidata.org/wiki/Wikidata:Reimagining_Wikidata_from_the_margins

[6] Source: https://www.wikidata.org/wiki/Wikidata:WikidataCon_2021/Program/Day_2_and_3_-_Community_tracks

The post TIB at WikidataCon: Part 1 first appeared on TIB-Blog.

The post TIB at WikidataCon: Part 1 first appeared on Leibniz Research Alliance Open Science.

Coding da Vinci Niedersachsen 2020 ends with an award ceremony for outstanding projects

At the closing event of Coding da Vinci Niedersachsen 2020 on January 29, 2021, a total of 10 projects presented the great results of the preceding 14 hackathon weeks to an audience of about 250. Around 40 cultural institutions from Lower Saxony had made their datasets available at the kick-off on October 24 and 25, 2020, inspiring the participants to turn their creative ideas into new projects.

The award ceremony took place online and was streamed live in full on YouTube. The stream is available here: https://youtu.be/1NrFbMcUBZs

The winners

With their impressive results, the project teams once again demonstrated the potential of open cultural data. Choosing the winning teams was a considerable challenge for the five-member jury, made up of experts from different areas of the open, culture and tech scenes.

The prizes in the different categories were awarded to the following projects:

  • Appsolutly Old (from 03:30:31 in the stream) in the category “Funniest Hack”, presented by Wolf-Tilo Balke
  • Maschinenlerner (from 03:36:04 in the stream) in the category “Most Useful”, presented by Ina Blümel
  • FabSeal (from 03:41:14 in the stream) in the category “Best Design”, presented by Mareike König

The fourth winner was decided by the audience in an online vote in which more than 220 people took part:

  • Herzog VR August (from 03:46:27 in the stream) in the category “Everybody’s Darling”, presented by Tabea Golgath

Image Appsolutly Old: CC BY-SA 4.0 Kira Lorberg, Lukas Sontheimer // Image Herzog VR August: CC BY-SA 3.0 Herzog VR August Team // Image Maschinenlerner: CC BY-SA 3.0 DE Pit Noack / Team Maschinenlerner // Image FabSeal: CC BY-SA 4.0 Joana Bergsiek

Opportunities, challenges and requirements of open cultural data – keynote by Ellen Euler

In her keynote “Gemeinsam den digitalen Kulturraum der Zukunft gestalten” (“Shaping the digital cultural space of the future together”, from 02:59:57 in the stream), Prof. Dr. jur. Ellen Euler LL.M. (FH Potsdam) spoke about the importance of open cultural data from an academic perspective. She highlighted the opportunities and challenges that come with the availability of open cultural data. She also focused on the conditions, including legal ones, that must be met so that the data can be distributed as widely and used as much as possible.

The slides for the keynote are available here.

Thanks, farewell and a reunion in Schleswig-Holstein

We thank all participants for their great commitment, their ideas, their implementations and their successful presentations!

Our thanks go to everyone involved in Coding da Vinci Niedersachsen 2020:

The jury, consisting of

  • Dr. Tabea Golgath (Stiftung Niedersachsen)
  • Antje Theise (Universitätsbibliothek Rostock)
  • Wolf-Tilo Balke (TU Braunschweig)
  • Dr. Mareike König (Deutsches Historisches Institut Paris)
  • Prof. Dr. Ina Blümel (Hochschule Hannover and Open Science Lab of TIB Hannover)

and a big thank-you in particular to

  • the cultural institutions in Lower Saxony that prepared their datasets for the hackathon and made them available, and
  • all project teams, who developed these very exciting projects tirelessly and with great passion over a period of 14 weeks!

After the hackathon is before the hackathon – things continue in April with the start of Coding da Vinci Schleswig-Holstein 2021!

The post Coding da Vinci Niedersachsen 2020 ends with an award ceremony for outstanding projects first appeared on TIB-Blog.

The post Coding da Vinci Niedersachsen 2020 ends with an award ceremony for outstanding projects first appeared on Leibniz Research Alliance Open Science.

20 years of Wikipedia

a post by Matti Stöhr and Michael Hohlfeld

Twenty years ago, Jimmy Wales, together with Larry Sanger, launched Wikipedia. The free online encyclopedia and largest digital collection of knowledge in the world celebrated this milestone birthday extensively on and around January 15, 2021, and was (rightly) celebrated in turn.

The 20th birthday also received very broad media coverage in Germany. Radio and television in particular devoted many pieces to the background of Wikipedia’s creation and to the topic of free knowledge, and showed how the platform, which is very successful but by no means uncontroversial, works. We would like to recommend the WDR documentary “Das Wikipedia Versprechen” in the ARD Mediathek, which ARTE also carries in a somewhat longer version. Critical voices get ample space in it. Further coverage is linked, for example, on the official birthday page of Wikimedia Deutschland. That page offers not only media reports but also, for example, a visualised journey through time, personal stories and plenty of information on how to contribute to Wikipedia.

The anniversary has also been, and still is, widely discussed on social networks under the hashtag #Wikipedia20. On Twitter and Instagram we were very happy to join the well-wishers last Friday, since as a library and research institution we have many ties to Wikipedia. Lambert Heller, head of our Open Science Lab, paid tribute to Wikipedia from TIB’s perspective in a short congratulatory video. He names examples of activities and projects at TIB that make use of, or are carried out in cooperation with, Wikipedia, Wikimedia Deutschland and various sister projects. Among other things, he mentions the involvement of mentors, including Ina Blümel, in the Fellow-Programm Freies Wissen.


One very current connection is the culture hackathon Coding da Vinci Niedersachsen 2020, which ends very soon, on January 29, 2021, with an online award ceremony. From 4 p.m., the individual projects will be presented to the public and honoured in various categories. (Register here for the award ceremony free of charge.)

Many other libraries also use Wikipedia professionally through a variety of activities and/or are connected with closely related platforms such as Wikimedia Commons or Wikidata. The WikiLibrary Manifesto, initiated by the Deutsche Nationalbibliothek (DNB) and Wikimedia Deutschland, underlines this connection and the shared goal of an international knowledge network. Naturally, we as TIB have signed the manifesto as well.

In addition: to mark Wikipedia’s birthday, we have put together a small list of thematically related videos in the TIB AV-Portal. You are warmly invited to have a look at our Wikiversum watchlist.

Screenshot of the public watchlist “Wikiversum” in the TIB AV-Portal

#1Lib1Ref

Fittingly for the birthday, the editing campaign #1Lib1Ref started again on January 15. It is a concrete, active way of joining the celebration and helping to shape Wikipedia. #1Lib1Ref aims to keep improving the encyclopedia by adding at least one citation or reference from reliable sources to Wikipedia articles, and it builds on librarians’ expertise. The current campaign runs until February 5 (and again from May 15 to June 5), and the effort required of each individual is modest. Further details and help can be found, of course, in the Wikipedia article on #1Lib1Ref. By the way: (scholarly) videos can of course also be added to Wikipedia articles as sources or references. There are even handy templates for embedding videos from the TIB AV-Portal:

  1. for embedding explicit video citations – template 1: TIBAV, and
  2. for embedding video searches that take certain parameters into account – template 2: TIBAV-Suche.

To conclude, once more: happy birthday, Wikipedia!

We look forward to continuing our collaboration and celebrate free knowledge and openness in general, fully in line with TIB’s strategic goals and activities, which were recognised, among other things, by the Open Library Badge 2020 we received.

The post 20 years of Wikipedia first appeared on TIB-Blog.

The post 20 years of Wikipedia first appeared on Leibniz Research Alliance Open Science.