Experimenting with repository workflows for archiving: Automated ingest | Community-led Open Publication Infrastructures for Monographs (COPIM)

by Ross Higman

In a recent post, my colleague Miranda Barnes outlined the challenges of archiving and preservation for small and scholar-led open access publishers, and described the process of manually uploading books to the Loughborough University institutional repository. The conclusion of this manual ingest experiment was that while university repositories offer a potential route for open access archiving of publisher output, the manual workflow is prohibitively time- and resource-intensive, particularly for small and scholar-led presses who are often stretched in these respects.

Fortunately, many institutional repositories provide routes for uploading files and metadata which allow for the process to be automated, as an alternative to the standard web browser user interface. Different repositories offer different routes, but a large proportion of them are based on the same technologies. By experimenting with a handful of repositories, we were therefore able to investigate workflows which should also be applicable to a much broader spread of institutions.
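
As a rough illustration of the kind of automated route described above, the following Python sketch builds, without sending, an HTTP request that would create a repository record from a book's metadata. The endpoint path `/records`, the bearer-token header, the metadata field names and the function name `build_deposit_request` are all assumptions made for illustration. Each repository platform defines its own API, so its documentation would need to be checked before adapting this.

```python
import json
import urllib.request


def build_deposit_request(base_url, token, book):
    """Build (but do not send) a POST request that would create a record.

    All names here are hypothetical: the "/records" path, the bearer-token
    scheme and the payload fields stand in for whatever a given repository
    platform actually requires.
    """
    payload = {
        "title": book["title"],
        "authors": [{"name": name} for name in book["authors"]],
        "description": book.get("description", ""),
    }
    return urllib.request.Request(
        url=f"{base_url}/records",  # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example with placeholder values; nothing is sent over the network.
request = build_deposit_request(
    "https://repository.example.org/api",
    "PLACEHOLDER-TOKEN",
    {"title": "An Example Monograph", "authors": ["A. N. Author"]},
)
```

Keeping request construction separate from sending makes it possible to inspect or log each deposit before it is made, and to loop over a publisher's whole catalogue with the same function. Actually sending the request (for example with `urllib.request.urlopen`) and uploading the book files would be further steps that depend on the platform.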



Call for projects – DARIAH Theme 2022: Workflows | DARIAH

Arts and humanities researchers tend to be multitasking heroes and versatility buffs. This is probably not a matter of choice. Whether we work on digital editions of literary works, analyze historical events by creating and exploiting corpora of digitized newspapers, or model archaeological sites in 3D, our research processes are often quite complex: they involve multiple steps, different tools and a combination of methods. We are no strangers to heterogeneous datasets, modular system architectures, metadata crosswalks and software pipelines. And we are increasingly aware of the importance of data sharing and the notion of reproducible research in the age of Open Science. A scholarly process may start with identifying and collecting data and end with the publication of some research outputs, but the very beginning and the very end never tell the full story of the research data lifecycle.  

In this year’s DARIAH Theme Call, we are looking for proposals and projects that will explore, assess, analyze and embody the challenges of designing, implementing, documenting and sharing digitally-enabled workflows in the context of arts and humanities research from a technical, methodological, infrastructural and conceptual point of view.  

What is the state of the art in research workflows in the digital arts and humanities? What are we doing well, and what should we do better? How can we evaluate the appropriateness of a workflow or assess its efficiency? What makes a workflow innovative? What does it mean for a workflow to be truly reproducible? Are there modeling or standardization frameworks that make this job easier? What kind of documentation is necessary and at what level of granularity? What are the hidden costs of our workflows? What should DARIAH do – in addition to treating workflows as a particular content type on the SSH Open Marketplace – to help researchers develop, deploy and disseminate better workflows?


Wrapping up the Library Publishing Workflows Project | Educopia

Over the past three years, Library Publishing Workflows—an IMLS-funded (LG-36-19-0133-19) project of Educopia Institute, the Library Publishing Coalition, and twelve partner libraries—has been fostering conversation about the workflows library publishers use to publish journals, how libraries have developed their journal publishing services, and the major challenges they face in their day-to-day work. We have also released a wide range of materials—from workflows to documentation tools to reflections—to support library publishers in their work. As the project winds down, we wanted to provide a round-up of all of the major project outputs.

Experimenting with repository workflows for archiving: Manual ingest | Community-led Open Publication Infrastructures for Monographs (COPIM)

Barnes, M. (2022). Experimenting with repository workflows for archiving: Manual ingest. Community-Led Open Publication Infrastructures for Monographs (COPIM). https://doi.org/10.21428/785a6451.85c38501

Over the course of the last year (2021-2022), colleagues in COPIM’s archiving and preservation team have been considering ways to solve the issues surrounding the archiving and preservation of open access scholarly monographs. Most large publishers and many university presses have existing relationships with digital preservation archives, but small and scholar-led publishers lag behind due to a lack of resources.

One of the potential solutions we have been considering is the university repository as an open access archive for some of these presses. COPIM includes a number of scholar-led presses, such as Mattering Press, meson press, Open Humanities Press, Open Book Publishers and punctum books. Partners on the project also include UCSB Library and Loughborough University Library. In cooperation with Loughborough University Library, we began to run some preliminary repository workflow experiments to see what might be possible, using books from one of the partner publishers.

Loughborough University employs Figshare as its primary institutional repository, so we began with this as a test bed for our experiments.



Revisiting: When is a Publisher not a Publisher? Cobbling Together the Pieces to Build a Workflow Business – The Scholarly Kitchen

“Ultimately, Elsevier’s user acquisition and monetization strategy here is as sophisticated as anything we have seen in scholarly publishing to date. Open access advocates might be concerned about some of these directions, but my sense is that many of these scientists and librarians remain largely focused on trying to compete with, or at least influence, scientific publishing. Building businesses that support, and potentially monetize, researcher workflow is a very different animal. While the Center for Open Science and the SHARE initiative are trying to offer up counterweights, there is little evidence that the open access community as a whole is engaged with Elsevier’s transformation. Springer Nature’s sibling Digital Science is probably Elsevier’s foremost competitor in this space, albeit with a different investment and integration model….”

Managing open access publication workflows and compliance | Jisc

“Higher education institutions must manage open access funds, track research outputs across the publication lifecycle, as well as meeting funders’ open research policies. These resource intensive activities pose challenges across the sector. Our new product tackles this head on….

The product will include a publication database, reporting suite, transitional agreement log, analytics dashboard, and more. It will provide a platform that centralises major workflow components and streamlines open access management….”

New ESAC Resources on Transformative Agreements

The open access transition underway in scholarly journal publishing is transforming library services, workflows, financial streams and, naturally, library relationships with publishers. With the growth rate of open access publishing far outpacing that of the underlying scholarly journal market, there is increasing awareness that libraries cannot afford not to have an open access transition strategy.

Whether assessing a publisher “read and publish” offer for the first time or developing a strategic plan to navigate the open access transition, adapting to the evolution of scholarly publishing is a challenge that librarians everywhere are facing.  Some first movers have already worked through the transition locally and are looking at what comes after their transformative agreement phase, while many others are just starting out on their transformation pathway.

To support the global library and library consortium community in this process, the ESAC Initiative is excited to introduce three incredibly rich and authoritative resources:

The ESAC Reference Guide to Transformative Agreements
Threading together and contextualizing the many local guidelines, recommendations, toolkits, templates and data openly available, the reference guide serves as an authoritative and essential orientation on preparing, negotiating and implementing transformative agreements for librarians and consortium staff just starting out or looking to update their strategies based on the latest benchmarks.

How Transformative Is It
This spectrum illustrates the array of transformation drivers that characterize transformative agreements (TAs), to help institutions evaluate publisher proposals during the negotiation process, assess the progress of their current TAs, and define their next negotiation objectives, mapping out how successive transformative agreement iterations depart from the limitations of the subscription paradigm and lead, progressively and concretely, to an open and diverse scholarly communication environment.

2021 Enhancement to the ESAC Workflow Recommendations
Based on the critical insights and experience accumulated in the most recent wave of transformative agreements, the 2021 Enhancement to the ESAC Workflow Recommendations (2017) comprises an updated perspective on the responsibilities of the contractual partners and the metadata necessary to optimize workflows around open access publishing.

Austrian Transition to Open Access: a collaborative approach

This article presents a collaborative project, the ‘Austrian Transition to Open Access’ (AT2OA), initially running from 2017 to 2020, which had the overarching goal of enabling the large-scale transformation of publishing outputs from closed to open access (OA) in Austria. The initiative, which has recently secured funding for a second four-year cycle from the Austrian Federal Ministry of Education, Science and Research, brings together all key players: universities, research institutes, the national library consortium and a cOAlition S funding member, the Austrian Science Fund. The project outcomes include a transition feasibility study that builds on the methodology of the 2015 Schimmer et al. article, the seeds of a national OA monitoring data hub and transformative agreements with major publishers. In addition, the project helped launch institutional OA Publishing Funds across the country and explored alternative publishing models. Furthermore, it saw the emergence of a nationwide network of OA experts. The authors also share their thoughts on lessons learned.

“Implementing FAIR Workflows: A Proof of Concept Study in the Field of Consciousness” | Templeton World Charity Foundation, Inc.

“Although formally published research papers remain the most important means of communicating science today, they do not provide a sufficient amount of information to fully evaluate scientific work. There is typically no mechanism to easily link to the experimental design, research data or analytical tools that were used, preventing researchers from being able to fully understand the results of the research, replicate the results, or decisively evaluate and reuse existing research.

Led by project director Helena Cousijn, DataCite and its partners aim to address this problem by developing an exemplar workflow and ecosystem that will assist teams in adhering to FAIR principles for making all research outputs available. By providing a workflow that is easy to implement, the team ultimately aims to start a culture change, where it becomes a standard part of the research culture to make outputs FAIR upon inception.   

The workflow will be developed in collaboration with, and applied to, a research study in the field of consciousness. This field is a fitting proving ground for such a project, as a lack of infrastructure for meaningfully aggregating data in consciousness research has contributed to a lack of agreement about what anatomical structures and physiological processes in the human brain give rise to consciousness despite almost three decades of focused research. Developing FAIR workflows will address that need, unleashing the possibility to better understand the neural foundations of consciousness.

Through this project, DataCite and its partners will develop a proof-of-concept product in the field of consciousness that will accelerate open science. The team’s end goal is to provide researchers in all disciplines with a method for engaging in FAIR research practices that is easy to implement and follow.”

Library Publishing Workflows Project Releases Journal Workflow Documentation | Educopia Institute

“There is no single correct way for a library to publish journals; it’s a process that often grows organically in response to local needs. However, having models to draw from when creating or updating a journal publishing workflow can result in better processes and stronger partnerships. 

To enable library publishers to build on each other’s work in this area, the Library Publishing Workflows project (IMLS 2019-2022) is excited to release a complete set of journal publishing workflow documentation for each of our twelve partner libraries.


The programs behind these workflows are large and small, high-touch and light-touch, and staffed and focused in a variety of ways. Individually, they offer models for similar programs. As a set, they highlight the diversity of practice in this vital area of librarianship. 

For each partner library, we have provided a program profile, one or more workflow diagrams, and accompanying detailed workflows. We are also releasing the workflow diagrams as a set, to enable quick review and comparison across all of the workflows. The documentation is the result of more than two years of interviews, revisions, group discussions, and peer reviews. Because publishing workflows are always evolving, however, this documentation represents a snapshot in time….”

Generalizing FAIR – Daniel S. Katz’s blog

“Most researchers and policymakers support the idea of making research, and specifically research outputs, findable, accessible, interoperable, and reusable (FAIR). The concept of FAIR has been well-developed for research data, but this is not the case for all research products. This blog post seeks to consider how the application of FAIR to a range of research products (beyond data) could result in the development of different sets of principles for applying FAIR to different research objects, and to ask about the implications of this….