‘All your data are belong to us’: the weaponisation of library usage data and what we can do about it | UKSG

By Caroline Ball – Academic Librarian, University of Derby, #ebookSOS campaigner
Twitter: @heroicendeavour, Mastodon: @heroicendeavour@mas.to

and Anthony Sinnott, Access and Procurement Development Manager, University of York; Twitter: @librarianth

What do 850 football players and their performance data have in common with academic libraries and online resources? More than you’d think! The connecting factor is data: how it is collected, how it is used, and for what purposes.

‘Project Red Card’ is demanding compensation for the use of footballers’ performance data by betting companies, video game manufacturers, scouts and others, arguing that players should have more control over how their personal data is collected and, in particular, how it is monetised and commercialised.

Similarly, libraries’ online resources, whether a single ebook or vast databases, are producing enormous amounts of data, utilised by librarians to assist us in our vital functions: assessing usage and value, determining demand and relevance.

But are we the only ones using this data generated by our users? What other uses is this data being put to? We know for certain that vendors have access to more data than they provide to us via COUNTER statistics etc, but we have no way of knowing how much, what types, or what is done with it.

Witness the recent controversy generated by Wiley’s removal of 1,379 e-books from Academic Complete. Publishers like Wiley determine high use by accessing statistics generated by our end-users via the various e-book platforms through which they access the content. This in itself indicates that our end-user/library data is being provided to third parties without our knowledge or consent, which is particularly concerning given that our licences are with vendors and not publishers. We are also not privy to what data-sharing agreements exist between vendors and publishers. Should we allow library usage data to be weaponised against us in this fashion? What recourse do we have to push back against this practice of ‘data extractivism’, to either withhold this data from publishers and vendors or prohibit them from using it for their own commercial gain?

Video: Open Access Usage Data: Present Knowledge, Future Developments | Open Access Book Network @ YouTube

Christina Drummond (Executive Director of the OA eBook Usage Data Trust) and Lucy Montgomery (Professor of Knowledge Innovation at Curtin University and co-lead of the Curtin Open Knowledge Initiative) discuss the OAeBU Usage Data Trust project and the directions its work will take over the coming years.

Lucy Montgomery’s slides are available here: https://zenodo.org/record/7309149

Utilization of Open Access Journals by Library and Information Science Undergraduates in Delta State University, Abraka, Nigeria

Abstract: The study examined the utilization of open access journals by Library and Information Science (LIS) undergraduates at Delta State University, Abraka. Two research questions and one hypothesis guided the study. A descriptive survey design was used by the researchers. The population of the study comprised 477 LIS undergraduates, and a simple random sampling technique was used to determine the sample size, which is 217 students, representing 45% of the total population. The questionnaire was the instrument used for data collection. The questionnaire was validated by two experts, and Cronbach’s alpha was used to establish the reliability of the instrument, which yielded 0.75. Data were analysed with frequency counts and simple percentages; Statistical Product and Service Solutions (SPSS) version 23 was used to generate the mean and standard deviation, while Pearson’s product-moment correlation coefficient was used to test the hypothesis at the 0.05 significance level. The findings revealed that the students had a high level of awareness and a high level of usage of open access journals. From the test of the hypothesis, the study discovered that there is a significant relationship between the level of awareness and the use of open access journals. Hence, the students’ level of awareness positively influenced the use of open access journals. Based on the findings, the researchers recommended that the library management and lecturers should continue to promote the use of open access journals generally among the students to sustain their use.
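As an illustration of the statistics named in the abstract above, the following is a minimal sketch of the sample fraction (217 of 477) and of Pearson’s product-moment correlation coefficient. The function names and the example scores are hypothetical and are not taken from the study, which used SPSS rather than this code.

```python
from math import sqrt


def sample_percentage(sample_size: int, population: int) -> float:
    """Return the sample as a percentage of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * sample_size / population


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson's product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Figures reported in the abstract: 217 students sampled from 477.
    print(f"sample: {sample_percentage(217, 477):.1f}% of population")

    # Hypothetical awareness and usage scores, for illustration only.
    awareness = [3.0, 4.0, 2.0, 5.0, 4.0]
    usage = [2.5, 4.0, 2.0, 4.5, 3.5]
    print(f"r = {pearson_r(awareness, usage):.3f}")
```

A coefficient near +1 would correspond to the positive relationship the abstract describes. Whether it is significant at the 0.05 level depends on the sample size and needs a separate test.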

Tracking Open Access Usage – ChronosHub

“Open Access usage is a complex topic. In this webinar, we’ll look at what metrics can be collected, and whether we should look at the data globally, or at an institutional level, possibly to evaluate affiliated institutions’ APC payments or open access agreements.

We will discuss the topic both from a publisher and a library perspective, with panelists sharing their experiences and opinion on the feasibility of conducting a usage-based analysis of open access articles to determine their value to institutions and libraries….”

Open Science Observatory – OpenAIRE Blog

“The Open Science Observatory (https://osobservatory.openaire.eu) is an OpenAIRE platform showcasing a collection of indicators and visualisations that help policy makers and research administrators better understand the Open Science landscape in Europe, across and within countries.  

The broader context: As the number of Open Science mandates has been increasing across countries and scientific fields, so has the need to track Open Science practices and uptake in a timely and granular manner. The Open Science Observatory assists the monitoring, and consequently the enhancing, of open science policy uptake across different dimensions of interest, revealing weak spots and hidden potential. Its release comes in a timely fashion, in order to support UNESCO’s global initiative for Open Science and the European Open Science Cloud (the current development and enhancement is co-funded by the EOSC Future H2020 project and will appear in the EOSC Portal).  …

How does it work: Based on the OpenAIRE Research Graph, following open science principles and an evidence-based approach, the Open Science Observatory provides simple metrics and more advanced composite indicators which cover various aspects of open science uptake such as

different openness metrics
FAIR principles
Plan S compatibility & transformative agreements
APCs

as well as measures related to the outcomes of Open Access research output as they relate to

network & collaborations
usage statistics and citations
Sustainable Development Goals

across and within European countries.”

Taking Open Access book usage from reports to operational strategy | Digital Science

By Christina Drummond

While the term “usage data” most often refers to webpage views and downloads associated with a given book or book chapter, scholarly communications stakeholders have identified a near future where linked open access (OA) scholarship usage data analytics could directly inform publishing, discovery, and collections development in addition to impact reporting.

In the 2020-2022 Exploring Open Access Ebook Usage research project supported by the Mellon Foundation, publisher and library representatives expressed their interests in using OA eBook Usage (OAeBU) data analytics to inform overall OA program investment, strategy and fundraising. A report summarizing a year of virtual focus groups noted multiple operational use cases for OA book usage analytics, spanning book marketing, sales, and editorial strategy; collections development and hosting; institutional OA program strategy, reporting, and investment; and OA impact reporting for institutions and authors to support reporting to their funding agencies, donors, and policy-makers.

Public use and public funding of science | Nature Human Behaviour

Abstract:  Knowledge of how science is consumed in public domains is essential for understanding the role of science in human society. Here we examine public use and public funding of science by linking tens of millions of scientific publications from all scientific fields to their upstream funding support and downstream public uses across three public domains—government documents, news media and marketplace invention. We find that different public domains draw from various scientific fields in specialized ways, showing diverse patterns of use. Yet, amidst these differences, we find two important forms of alignment. First, we find universal alignment between what the public consumes and what is highly impactful within science. Second, a field’s public funding is strikingly aligned with the field’s collective public use. Overall, public uses of science present a rich landscape of specialized consumption, yet, collectively, science and society interface with remarkable alignment between scientific use, public use and funding.

Toward a definition of digital object reuse | Emerald Insight

Abstract:  Purpose

The purpose of this paper is to present conceptual definitions for digital object use and reuse. Typically, assessment of digital repository content struggles to go beyond traditional usage metrics such as clicks, views or downloads. This is problematic for galleries, libraries, archives, museums and repositories (GLAMR) practitioners because use assessment does not tell a nuanced story of how users engage with digital content and objects.

Design/methodology/approach

This paper reviews prior research and literature aimed at defining use and reuse of digital content in GLAMR contexts and builds off of this group’s previous research to devise a new model for defining use and reuse called the use-reuse matrix.

Findings

This paper presents the use-reuse matrix, which visually represents eight categories and numerous examples of use and reuse. Additionally, the paper explores the concept of “permeability” and its bearing on the matrix. It concludes with the next steps for future research and application in the development of the Digital Content Reuse Assessment Framework Toolkit (D-CRAFT).

Practical implications

The authors developed this model and definitions to inform D-CRAFT, an Institute of Museum and Library Services National Leadership Grant project. This toolkit is being developed to help practitioners assess reuse at their own institutions.

Originality/value

To the best of the authors’ knowledge, this paper is one of the first to propose distinct definitions that describe and differentiate between digital object use and reuse in the context of assessing digital collections and data.

Full article: Unsub in Real Life: Using Unsub as Part of Serials Decisions and Negotiations

Abstract:  This presentation introduced attendees to the benefits and limitations of Unsub, a data analysis tool designed by OurResearch. In this presentation, OurResearch co-founder, Heather Piwowar, demonstrated the use of Unsub for analyzing usage and cost data on a library’s “Big Deal.” The other two presenters, Jessica Harris of the University of Chicago, and Eric Schares of Iowa State University, discussed how they used the tool at their libraries to make collection development decisions for their libraries’ journal subscriptions.

Hybrid Open Access Dashboard | SUB Göttingen

Overview: Many academic publishers offer hybrid open access (hybrid OA) journals, where some articles in an otherwise subscription-based publication are made openly available. Recently, some funders have pushed for a transformation towards such a hybrid OA business model, where publishing houses are paid for open access publication. To draft, monitor and evaluate such transformative agreements, libraries and their consortia need data on the uptake, costs and impact of hybrid OA.

{HOAD} is a data product to meet this need. The dashboard is packaged as an extension to the R Project for Statistical Computing (an R package), released under an open source license and developed in the open at http://github.com/subugoe/hoad. The package has several components:

APIs to expose data from public bibliometric sources relevant to hybrid OA.
ETL pipelines (extraction, transformation, loading) and accompanying visualisations to answer hybrid OA business questions.
A web application to explore hybrid OA data, including customisation for individual journal portfolios.

The project is based on data gathered by the Crossref DOI registration agency and the OpenAPC initiative. The package is developed at the Göttingen State and University Library as part of the DFG-funded eponymous Hybrid Open Access Dashboard project.

An early prototype of the application, including the interactive web frontend is available at https://subugoe.github.io/hoad/.

How does the growth of a particular publisher’s open access content factor into the relative value of a Big Deal? Part 2: The Findings – Delta Think

“Some final thoughts: (1) Overall usage was a stronger influence on the change in value than the small changes in the proportion of hybrid OA article usage. (2) Despite the range of research activity levels across our institutions, there wasn’t much difference in the proportion of the open versus controlled usage across the site-licensed institutions for either publisher. (3) COVID likely affected these trends, but precisely how was unclear. Did lockdown increase the usage or limit it? Did it affect our two publishers differently? We have no ‘non-COVID’ control unfortunately. (4) If the impact of transformative agreements on the rate of hybrid OA article output influenced these trends, the impact was quite small. Still, with more libraries negotiating transformative agreements, growth in the proportion of OA articles should accelerate. As long as usage in publisher packages continues to grow, cost per controlled use will increase more quickly than cost per use. This new cost per controlled use metric should help libraries track the return on investment from their journal package subscription payments as a growing proportion of underlying articles are free to read.”
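The two metrics compared in the excerpt above can be illustrated with a short sketch. It assumes that “controlled” use means total use minus use of open access articles, which the excerpt does not spell out. The function names and all figures are hypothetical and are not drawn from the Delta Think analysis.

```python
def cost_per_use(package_cost: float, total_uses: int) -> float:
    """Package cost divided by all recorded uses, open and controlled."""
    if total_uses <= 0:
        raise ValueError("total_uses must be positive")
    return package_cost / total_uses


def cost_per_controlled_use(
    package_cost: float, total_uses: int, open_uses: int
) -> float:
    """Package cost divided by uses of paywalled (controlled) content only.

    Assumption: controlled uses = total uses - open access uses.
    """
    if not 0 <= open_uses <= total_uses:
        raise ValueError("open_uses must be between 0 and total_uses")
    controlled_uses = total_uses - open_uses
    if controlled_uses == 0:
        raise ValueError("no controlled uses; metric is undefined")
    return package_cost / controlled_uses


if __name__ == "__main__":
    # Hypothetical package: a fixed price, growing usage, growing OA share.
    years = [
        ("year 1", 100_000.0, 50_000, 5_000),
        ("year 2", 100_000.0, 55_000, 11_000),
    ]
    for label, cost, total, open_uses in years:
        cpu = cost_per_use(cost, total)
        cpcu = cost_per_controlled_use(cost, total, open_uses)
        print(f"{label}: cost per use = {cpu:.2f}, "
              f"cost per controlled use = {cpcu:.2f}")
```

In this invented example, cost per use falls as total usage grows, while cost per controlled use rises because a larger share of that usage is free to read. This is the kind of divergence the new metric is meant to surface.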