Live Webinar: Open systems and library analytics – 1501970

“Open source software and interoperable services for library management and analytics provide libraries with more choice in how to deploy, support and develop mission-critical applications. Join this webinar to learn more about EBSCO’s support for FOLIO, the open source library services platform, and Panorama, an interoperable application for library analytics.”

Next generation Open Access analytics: A case study – IOS Press

Abstract:  A critical component in the development of sustainable funding models for Open Access (OA) is the ability to communicate impact in ways that are meaningful to a diverse range of internal and external stakeholders, including institutional partners, funders, and authors. While traditional paywall publishers can take advantage of industry standard COUNTER reports to communicate usage to subscribing libraries, no similar standard exists for OA content. Instead, many organizations are stuck with proxy metrics like sessions and page views that struggle to discriminate between robotic access and genuine engagement.

This paper presents the results of an innovative project that builds on existing COUNTER metrics to develop more flexible reporting. Reporting goals include surfacing third party engagement with OA content, the use of graphical report formats to improve accessibility, the ability to assemble custom data dashboards, and configurations that support the variant needs of diverse stakeholders. We’ll be sharing our understanding of who the stakeholders are, their differing needs for analytics, feedback on the reports shared, lessons learned, and areas for future research in this evolving area.

Standards, Inputs, and Outputs: Strategies for improving data-sharing and consortia-based epidemiologic research | American Journal of Epidemiology | Oxford Academic

Abstract:  Data sharing improves epidemiology research, but sharing data frustrates epidemiologic researchers. The inefficiencies of current methods and options for data-sharing are increasingly documented and easily understood by any study that has shared its data and any researcher who has received shared data. Temprosa and Moore et al. (Am J Epidemiol. XXXX;XXX(XX):XXXX–XXXX) describe how the COnsortium of METabolomics Studies (COMETS) developed and deployed a flexible analytic platform to eliminate key pain points in large-scale metabolomics research. COMETS Analytics includes an online tool, but its cloud computing and technology are supporting, rather than the lead, actors in this script. The COMETS team identified the need to standardize diverse and inconsistent metabolomics and covariate data and models across its many participating cohort studies, and then they developed a flexible tool that gave its member studies choices about how they wanted to meet the consortium’s analytic requirements. Different specialties will have different specific research needs and will likely continue to use and develop an array of diverse analytic and technical solutions for their projects. COMETS Analytics shows how important and enabling the upstream attention to data standards and data consistency are to producing high-quality metabolomics, consortium-based, and large-scale epidemiology research.


Dryad Data — Repository Analytics and Metrics Portal (RAMP) 2020 data

“The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates use and performance use data of institutional repositories. The data are a subset of data from RAMP, the Repository Analytics and Metrics Portal, consisting of data from all participating repositories for the calendar year 2020. For a description of the data collection, processing, and output methods, please see the “methods” section below….”

Repository Analytics and Metrics Portal – Web analytics for institutional repositories

“The Repository Analytics and Metrics Portal (RAMP) tracks repository items that have surfaced in search engine results pages (SERP) from any Google property. RAMP does this by aggregating Google Search Console (GSC) data from all registered repositories.

RAMP data are collected from GSC in two separate sets: page-click data and country-device data. The page-click data include the handle (aka URL) of every item that appeared in SERP. This dataset creates significant possibilities for additional research if the metadata of those items were mined. RAMP data are as free of robot traffic as possible and they contain no personally identifiable information.

RAMP data include the following metrics:

Impressions – number of times an item appears in SERP
Position – location of the item in SERP
Clicks – number of times an item URL is clicked
Click-Through Ratios – number of clicks divided by the number of impressions
Date – date of the search
Device – device used for the search
Country – country from which the search originated….”
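The click-through ratio defined in the list above is simple enough to sketch in a few lines. The snippet below is only an illustration, not RAMP's own code: the function name `click_through_ratio` and the sample numbers are hypothetical, and returning 0.0 for an item with no impressions is an assumption made here to avoid division by zero.

```python
def click_through_ratio(impressions: int, clicks: int) -> float:
    """Return clicks divided by impressions for one item.

    Assumption (not from RAMP): an item with zero impressions
    gets a ratio of 0.0 rather than raising an error.
    """
    if impressions < 0 or clicks < 0:
        raise ValueError("impressions and clicks must be non-negative")
    if impressions == 0:
        return 0.0
    return clicks / impressions


if __name__ == "__main__":
    # Hypothetical item: shown 200 times in SERP, clicked 10 times.
    ratio = click_through_ratio(impressions=200, clicks=10)
    print(f"Click-through ratio: {ratio:.2%}")
```

A ratio computed this way is only as reliable as the underlying counts, so it is best read alongside the raw impressions and clicks.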

Guest Post – One Publisher to Rule Them All? Consolidation Trends in the Scholarly Communications and Research Sectors – The Scholarly Kitchen

“The story of mergers and acquisitions in scholarly communications is one dominated in the last 10 to 15 years by a series of eye-catching vertical acquisitions by publishers, content aggregators, and database providers which have expanded their services. These mergers have blurred traditional roles and reflect a strategy of traditional players moving to become broader providers of analytics and workflow.

The successful integration of early stage companies and managed transition by established commercial entities is one of the major reasons scholarly communications has not seen the level of disruption anticipated and desired by many who seek to change the status quo….

Access to bigger archives will become a key determinant in preserving subscription pricing models as the volume of new publications available via open access increases. As such, we can expect this to drive further mergers and monetization of valuable backlists….

Publishing open access now offers a less plausible ‘Exit’ strategy for researchers wishing to express dissatisfaction with the market status quo. It is harder to move away from larger, commercial publishers when they are also the largest open access publishers….

Overall, the industry remains very much in a growth phase with high potential for further acquisitions and mergers, played out against a backdrop of Plan S and COVID-19 with an ongoing battle for researchers’ loyalty. There is a widespread belief that eventually researchers’ desire for robust, fast, rigorous publishing with rapid dissemination and access for all will become more important than prestige of the publishing vehicle. When and if this happens, it remains to be seen whether this race will be won by organic growth, mergers, acquisitions or large scale disruption from outside the industry.”


Being transparent & privacy aware: ditching third-party trackers in Strathprints | Open Access @ Strathclyde


George Macgregor
Scholarly Publications & Research Data, University of Strathclyde

Over the years, and like a lot of websites, Strathprints has historically made use of third-party integrations. Some of these integrations have provided us, and Strathprints users, with useful functionality over the years. But because these integrations involve the implementation of tracking code within Strathprints, they have also entailed third-party cookies being attached to our users. This is most notable in our use of Google Analytics and AddThis, the former providing analytics on web traffic and the latter providing convenient social sharing buttons and web analytics. In fact, the Google Analytics Tracking Code (GATC) also entails the DoubleClick cookie used to enable remarketing for products like Google Ads, while AddThis engages in browser fingerprinting.

Given the tracking that is increasingly occurring within the scholarly publishing industry generally, and the sometimes-nefarious purposes to which the collected data are being put, we feel it is inappropriate for an open repository like Strathprints to continue to use additional and unnecessary forms of tracking. We have therefore recently removed Google Analytics from Strathprints altogether and have implemented alternative social sharing options to replace AddThis. An additional benefit of removing these tools is that Strathprints is serving less JavaScript, which helps to promote quicker page loading – so the benefits go beyond superior privacy to include a better user experience!


“Introducing Reproducibility to Citation Analysis” by Samantha Teplitzky, Wynn Tranfield et al.

Abstract:  Methods: Replicated methods of a prior citation study provide an updated transparent, reproducible citation analysis protocol that can be replicated with Jupyter Notebooks.

Results: This study replicated the prior citation study’s conclusions, and also adapted the author’s methods to analyze the citation practices of Earth Scientists at four institutions. We found that 80% of the citations could be accounted for by only 7.88% of journals, a key metric to help identify a core collection of titles in this discipline. We then demonstrated programmatically that 36% of these cited references were available as open access.

Conclusions: Jupyter Notebooks are a viable platform for disseminating replicable processes for citation analysis. A completely open methodology is emerging and we consider this a step forward. Adherence to the 80/20 rule aligned with institutional research output, but citation preferences are evident. Reproducible citation analysis methods may be used to analyze open access uptake; however, results are inconclusive. It is difficult to determine whether an article was open access at the time of citation, or became open access after an embargo.
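The "80% of citations from 7.88% of journals" style of result can be reproduced in outline with a short notebook cell. The sketch below is not the authors' notebook: the function name `core_journal_share`, its `coverage` parameter, and the toy citation list are all hypothetical, and it assumes the input is simply one journal title per cited reference.

```python
from collections import Counter


def core_journal_share(cited_journals, coverage=0.80):
    """Find the smallest set of top-cited journals reaching `coverage`.

    Returns (number of core journals, share of all distinct journals).
    Assumes `cited_journals` holds one journal title per citation.
    """
    counts = Counter(cited_journals)
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0

    running = 0
    core = 0
    # Walk journals from most to least cited until coverage is reached.
    for _title, n in counts.most_common():
        running += n
        core += 1
        if running / total >= coverage:
            break
    return core, core / len(counts)


if __name__ == "__main__":
    # Toy data: 10 citations spread over 3 journals.
    citations = ["Journal A"] * 6 + ["Journal B"] * 3 + ["Journal C"]
    core, share = core_journal_share(citations)
    print(f"{core} journals ({share:.1%} of titles) cover 80% of citations")
```

Ties between equally cited journals are broken by first appearance in the input, so the exact membership of the core set can vary when counts are equal.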

Public draft: OA eBook Usage Data Analytics and Reporting Use-cases by Stakeholder. Feedback invited through July 10, 2021

Publishers, libraries, and a diverse array of scholarly communications platforms and services generate information about how OA books are accessed online. Since its launch in 2015, the OA eBook Usage Data Trust (@OAEBU_project) effort has brought together these thought leaders to document the barriers facing OA eBook usage analytics. To start addressing these challenges and to understand the role of a usage data trust, the effort has spent the last year studying and documenting the usage data ecosystem. Interview-based research led to the documentation of the OA book data supply chain, which maps related metadata and usage data standards and workflows. Dozens worldwide have engaged in human-centered design workshops and communities of practice that went virtual during 2020. Together these communities revealed how OA book publishers, platforms, and libraries are looking beyond their need to provide usage and impact reports. Workshop findings are now documented within use-cases that list the queries and activities where usage data analytics can help scholars and organizations to be more effective and strategic. Public comment is invited for the OA eBook Usage Data Analytics and Reporting Use Cases Report through July 10, 2021.

Repeat It or Take It Back

“Outside of eLife and, to an extent, PLoS, no one of scale and weight in the commercial publishing sector has really climbed aboard the Open Science movement with a recognition of the sort of data and communication control that Open Science will require.

So what is that requirement? In two words – Replicability and Retraction. …”

Gelenkte Wissenschaft: Die DFG warnt vor Einfluss des Plattformkapitalismus (“Guiding” science: DFG warns against influence of platform capitalism) | Frankfurter Allgemeine

German Research Foundation warns against the growing influence of major publishers on research. Scientific freedom is under threat from two sides.




Unsub Extender · Streamlit

This description from Twitter: 

“Very excited to announce my latest project: Unsub Extender!

Run an @unsub_org export .csv file through Unsub Extender to automatically make interactive plots and visualizations with filters. Written in python w/@streamlit and Altair @jakevdp @ellisonbg…”

Data tracking in research: aggregation and use or sale of usage data by academic publishers

“This briefing paper issued by the Committee on Scientific Library Services and Information Systems (AWBI) of the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) on the subject of data tracking in digital research resources describes options for the digital tracking of research activities. It outlines how academic publishers are becoming data analytics specialists, indicates the consequences for research and its institutions, and identifies the types of data mining that are being used. As such, it primarily serves to present contemporary practices with a view to stimulating discussion so that positions can be adopted regarding the consequences of these practices for the academic community. It is aimed at all stakeholders in the research landscape….

Potentially, research tracking of this kind can fundamentally contradict academic freedom and informational self-determination. It can endanger scientists and hinder the freedom of competition in the field of information provision. For this reason, scholars and academic institutions must become aware of the problem and clarify the legal, technical and ethical framework conditions of their information supply – not least so as to avoid involuntarily violating applicable law, but also to ensure that academics are appropriately informed and protected. AWBI’s aim in issuing this briefing paper is to encourage a broad debate within the academic community – at the level of academic decision-makers, among academics, and within information infrastructure institutions – so as to reflect on the practice of tracking, its legality, the measures required for compliance with data protection and the consequences of the aggregation of usage data, thereby enabling such measures to be adopted. The collection of data on research and research activity can be useful as long as it follows clear-cut, transparent guidelines, minimises risks to individual researchers and ensures that academic organisations are able to use such data if not have control over it.”