Abstract: The Dutch Digital Heritage Network (DDHN) wants to improve the monitoring of file format obsolescence. The Preservation Watch group researched how institutions can monitor the life cycle of file formats in their repositories and how that monitoring could be implemented on a broader scale. Monitoring the file format life cycle implies that there must be a way to measure format obsolescence, or at least to help an institution identify when a file format is becoming obsolete. The applied research identified the information needed and used a known model that searches for trends and is applied in a wide range of areas. The model was compared with a naive method to evaluate the more complex method. The approach was tested in different types of repositories and on different file formats to examine its robustness. This paper investigates the possibilities and shortcomings of the method and the further research that is required.
“OAPEN and CLOCKSS have developed a strategic relationship for the long-term preservation of Open Access books! What better way to celebrate World Preservation Day?
It all began when we commissioned a project from the terrific Mikael Laakso, now published as Open access books through open data sources: Assessing prevalence, providers, and preservation. Mikael explains in his abstract that he utilized open bibliometric data sources to answer three questions:
how prevalent are OA books (data sources: Directory of Open Access Books, OpenAIRE, OpenAlex, Scielo Livros, The Lens, WorldCat),
what web domains are responsible for offering full-text access to these OA books, and
to what degree can OA books be verified to be archived in trusted preservation services (data sources: Cariniana, CLOCKSS, Global LOCKSS Network, Portico).
396,995 unique records were identified from the combined OA book data sources, of which only 19% were found to be included in at least one of the preservation services. Good practice is to include content in at least three archives. The results therefore suggest real reason for concern for the long tail of OA books distributed at thousands of different web domains as these include volatile cloud storage or sometimes no longer contained the files at all….”
“The Internet Archive and the Software Preservation Network (SPN) support proposed revisions to the US Copyright Office electronic deposit rules as an important bulwark against vanishing culture….”
“Borealis, the Canadian Dataverse Repository, is a bilingual, multidisciplinary, secure, Canadian research data repository, supported by academic libraries and research institutions across Canada. Borealis supports open discovery, management, sharing, and preservation of Canadian research data….”
by Dr Miranda Barnes
The Open Book Futures project, which began in May 2023 as an acceleration and advancement of the COPIM Project (11/2019-04/2023), continues to focus on the open access monograph, with an emphasis on Scaling Small. This principle “eschews standard approaches to organisational growth that tend to flatten community diversity through economies of scale” (Adema & Moore, 2021). Work Packages in both projects focus more broadly on infrastructure, governance, accessibility, financial models and revenue, metadata and dissemination, and experimental publishing, but also archiving and preservation. It is the combined approach and multifaceted, collaborative interaction of the work packages that leads to our best insights and outputs.
Both COPIM and Open Book Futures depend on the breadth of different groups and individuals involved, which includes scholars, librarians, publishers, infrastructure providers and advocacy groups, as well as colleagues in a variety of other roles at the universities and institutions involved in the projects. Without the perspectives, knowledge, and support of the many members of this “community of communities” (Adema, Hart, et al, 2022), any impact made by our work would be isolated and siloed. Our research into community-supported options for archiving and preservation is no different. We gain a great deal from knowing what challenges face the academic author, the scholar-led press, and the libraries wishing to support the open research agenda. Our understanding is buoyed and clarified by our conversations and collaborations with digital preservation archives, infrastructure providers, and platforms, allowing us to develop and advance tools and guidance to benefit those who need it most. And without organisations such as the DPC, Jisc, OAPEN, and DOAB, and the collective expertise and experience they share, certainly our community and efforts would be greatly diminished.
Another important community we have been engaging with is that of other projects examining similar issues from different perspectives, such as the Embedding Preservability Project and its predecessor, Enhancing Services to Preserve New Forms of Scholarship, both led by New York University Libraries. The Embedding Preservability Project specifically considers the challenges to long-term preservation of “increasingly complex publications that are not easily represented in print.” Another is the Software Preservation Network’s EaaSI (Emulation as a Service Infrastructure), which has built a platform with preservation potential for the most complicated published works, those that may depend on a virtual machine to render properly. Connecting and meeting with our colleagues within these projects has been immensely beneficial, particularly for project members working on experimental publishing and digital preservation. This is just one example of the collaborative, cooperative spirit pervasive within our “community of communities” that has come to define open research.
“The simple answer is: ResearchGate and Academia.edu do not permit their users to take their own data and reuse it elsewhere, nor do their terms of service permit the library to extract that data on the authors’ behalf.
ResearchGate: “Users must not misuse the Service. Misuse of the Service includes, without limitation: … automated or massive manual retrieval of other Users’ profile data (‘data harvesting’).”
Academia.edu: “You agree not to do any of the following: … Attempt to access or search the Site, … through the use of any engine, software, tool, agent, device or mechanism (including spiders, robots, crawlers, data mining tools or the like).”…”
“Climate change, human conflict, and natural disasters present risks to human lives and health, as well as to collections of cultural heritage materials. To future-proof these valuable collections in anticipation of loss through catastrophic events, as well as through normal deterioration, libraries and archives need to circumvent digital locks on works in their collections for the purpose of preserving them.
The US Copyright Office agrees: in its recent notice of proposed rulemaking, the office announced its intention to renew an exemption allowing eligible libraries, archives, and museums to break digital locks on DVDs and Blu-ray discs in their collections when creating preservation or replacement copies of motion pictures, including television shows and videos. The office granted this exemption for the first time in 2021; the current rulemaking cycle is the first time the exemption has been up for renewal….”
“Working collaboratively with colleagues in LCCOS and the wider Egyptology community has enabled us to make ‘Wepwawet: Research Papers in Egyptology’ available as open access through UCL Discovery, UCL’s open access repository.
Each year, thousands of academic journals publish innovative and exciting research. Some of these journals endure for decades; others rapidly become obsolete. They languish on library shelves, their contents forgotten. The journal ‘Wepwawet: Research Papers in Egyptology’ (volumes 1-3, 1985-1987), produced and edited by PhD students from the former UCL Department of Egyptology, was one of these publications….
Making a digital copy of the journal open access supports its preservation, makes it discoverable and ensures that scholars – including native Egyptian scholars seeking to interpret their own past – can access, read and cite this research. A Creative Commons licence (CC BY) makes it possible for others to share and build upon this work, while attributing the original creators.”
Joe Deville is Principal Investigator on Open Book Futures and is a Senior Lecturer at Lancaster University, based jointly in the Department of Sociology and the Department of Organisation, Work and Technology.
Our eNews co-editor Tom Morley sat down with Joe to find out more about the Open Book Futures project.
“If the description of Library Partnership (LP) Certification in our 2021 article intrigued you, you’ll be happy to know we’ve kept busy the past two years. Thanks to dedicated and thoughtful volunteers, LP Certification has grown and changed. This update tells you what we’re currently working on and provides a summary of the work done since fall of 2021.
First and foremost, LP Certification is now called Library Partnership (LP) Rating. The goals and purposes remain the same.
As a quick reminder, LP Rating has three goals.
Provide information about journal publishers’ alignment with select library values to improve librarians’ funding decisions.
Improve clarity in librarians’ discussions about openness and publisher practices.
Give librarians and publishers a way to communicate and collaborate around these values….
LP Rating uses the LP Rubric to evaluate a journal publisher’s practices. The rubric underwent extensive work with members of the 2022-2023 LP Advisory Council (LPAC). During June and July of 2023, a new group of librarians and publishers took another deep dive into the rubric and our associated files, seeing it all with fresh eyes. The feedback from this group of reviewers has been incorporated into the LP Rubric and related documentation. We are indebted to both LPAC members and the reviewers for their hard work. Because of their input, the LP Rubric Beta version is now available….
LP Rating Values
Community. We want to work with:
Organizations that are transparent, cooperative, and collaborative in their business practices
Organizations that are strong partners; or, organizations that, over time, adopt practices better aligned with library values
Access. We seek:
Immediate open access to articles
Equitable access for readers and authors through reduced barriers and burdens
Affordability for libraries, authors, funders, and others
Rights. We favor:
Author retention of rights/permissions to their own work
Explicit permissions to readers to reuse and build on the work
Authors being given a choice of standard open licenses, or a publisher applying these by default
Recognizing diverse needs across disciplines
Discoverability and Accessibility. We prefer:
Open and indexable full-text and metadata
Diligent compliance with relevant accessibility standards
Participation in initiatives focused on interoperability
Preservation. We want partners to:
Deposit content into established and open federal, disciplinary, or institutional repositories
Participate in standard industry preservation efforts…”
Abstract: Currently, there is limited research investigating the phenomenon of research data repositories being shut down, and the impact this has on the long-term availability of data. This paper takes an infrastructure perspective on the preservation of research data by using a registry to identify 191 research data repositories that have been closed and presenting information on the shutdown process. The results show that 6.2 % of research data repositories indexed in the registry were shut down. The risks resulting in repository shutdown are varied. The median age of a repository when shutting down is 12 years. Strategies to prevent data loss at the infrastructure level are pursued to varying extents. 44 % of the repositories in the sample migrated data to another repository, and 12 % maintain limited access to their data collection. However, neither strategy is a permanent solution. Finally, the general lack of information on repository shutdown events as well as the effect on the findability of data and the permanence of the scholarly record are discussed.
“As of July 2023, Transkribus is proud to be a text recognition engine on Wikisource, which is an online digital library of public domain and freely licensed source texts and historical documents, and a sister project of Wikipedia.
Preserving and sharing historical knowledge is more important than ever, but the task of transcribing and making historical manuscripts accessible is not without its challenges, which is why innovative organisations join forces towards a common goal.
The Wikimedia Foundation (the nonprofit that operates Wikipedia, Wikisource, and other free knowledge Wikimedia projects) and Transkribus have recently started an exciting collaboration that began with the Wikisource Loves Manuscripts project, which is inspired by the digitisation and transcription of historical Balinese manuscripts. In this article, we will explain how this partnership came about and look at how Transkribus can benefit the Wikisource community. Additionally, we will show you how to use Transkribus within the Wikisource platform for a seamless transcription process….”
“Anna’s Archive scraped WorldCat, the world’s largest library catalog, in an effort to help preserve digital copies of every book in the world. The meta search engine is well aware of the legal risks but believes that these are well worth taking to preserve the written legacy of humanity. In addition, the archive’s database has gained interest from AI developers and LLM teams too….”
“The Ivy Plus Libraries Confederation is pleased to announce the launch of the Woman, Life, Freedom Movement of Iran web archive, curated by librarians at the IPLC. This web archive preserves material on, about, and from the Woman, Life, Freedom movement of Iran, which emerged in the wake of the 2022 police killing of Mahsa Jîna Amini. Her arrest by the morality police, on alleged grounds of non-compliance with the compulsory Hijab Law, ignited a series of protests that began in Kurdistan, spread across all levels of Iranian society, and reached other marginalized regions like Sistan-Baluchistan. This movement garnered international solidarity, with the Iranian diaspora and global activists demanding accountability from the Iranian government. Despite the government’s attempts to violently suppress dissent, the movement persists into 2023. This archive curates a collection of videos, photographs, art, music, petitions, statements, and diverse forms of expression that have emerged from this movement, showcasing both government crackdowns and the resilience and determination of the Iranian people in their pursuit of meaningful change….
The Ivy Plus Libraries Confederation’s Web Collecting Program is a collaborative collection development effort to build curated, thematic collections of freely available, but at-risk, web content in order to support research….”
“If you haven’t heard, in 2024 Humanities Commons will be launching a completely reimagined open-access repository. It’s currently under heavy construction. So we’ve been asking ourselves: Why does the Commons have a repository in the first place? At our heart we are a social network, a hub for scholarly exchange. Most of us don’t think “repository” when we think about social networks like Mastodon or Instagram or Facebook. So what exactly is a repository? And why will the new repository be so vital to the life of the Commons?…
How will the new Commons repository broadcast researchers’ work? Reaching an audience is partly about open access. This is not just a matter of letting visitors view the works on the repository site free-of-charge. It is also about letting other open access services and sites “re-broadcast” works from the Commons collection. So we will offer free access to the Commons repository in the formats that other tools and aggregators can use: a REST API, OAI-PMH streams, and (later on) the COAR Notify protocol. And we will embed data about each work in its repository page so that it is catalogued by services like Google Scholar. This extends the audience for members’ work far beyond the circle of people who visit the Commons….”
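As a rough illustration of what "re-broadcasting" over OAI-PMH could look like from an aggregator's side, the Python sketch below builds a standard `ListRecords` request and reads titles and the paging token from a response. The base URL `https://repo.example.org/oai`, the helper names, and the sample record are all hypothetical; the excerpt does not say what endpoint or metadata formats the Commons repository will actually expose. Only the protocol elements (`verb`, `metadataPrefix`, `resumptionToken`, the `oai_dc` format) come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (base_url is a hypothetical endpoint)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it must not be combined with metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (titles, next_token) from a ListRecords response body."""
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iterfind(".//dc:title", NS)]
    token_el = root.find(".//oai:resumptionToken", NS)
    # An empty or missing token means there are no further pages.
    token = token_el.text if token_el is not None and token_el.text else None
    return titles, token


# Invented sample response, used here instead of a live network call.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example deposit</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repo.example.org/oai"))
    print(parse_list_records(SAMPLE))
```

A harvester would fetch the first URL, parse the response, and keep requesting with the returned token until none is given. The sample document stands in for the network response so that the sketch runs offline.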