Meet the Activist Archivists Saving the Internet From the Digital Dustbin | Discover Magazine

“Archive Team, a self-described “loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage,” is a volunteer organization that monitors fading or at-risk sites before they’ve vanished completely. When Google announced the end of failed social network Google+, the collective saved 1.56 petabytes of its data in under four weeks.

Much of what Archive Team saves is then stored within the Internet Archive, which anyone can use to digitize whatever they feel is important. But the Wayback Machine uses bots to crawl the web and take snapshots as they go, while the Archive Team is laser focused on preserving endangered sites. It’s the difference between slowly amassing a huge library and trying to save every book from a specific collection that’s about to catch fire.

To accomplish this, anyone can donate bandwidth and hard drive space to the “Warrior,” an archiving application that systematically downloads sites the group is worried about. Those downloads are then sent to the Archive Team’s servers before being moved to the safety of the Internet Archive. The Warrior’s current projects include the soon-to-shutter Freewebs, a hosting service that’s housed 55 million webpages since 2001, as well as certain subreddits that have been quarantined, often the first step discussion website Reddit takes before deleting an entire forum. The content of conversations within those communities might help researchers understand how, for example, extremist viewpoints spread online….”

Accelerating Standards for 3D Data to Improve Long-Term Usability – Association of Research Libraries

“3D data means different things to different people. Most are probably familiar with highly processed outputs, like the previous examples, which often lack documentation describing how the data has been created and processed. In fact, depending on the creation method, the creator may not even have access to the processing information due to the use of proprietary tools. However, even when 3D data is well documented through the best efforts of a creator, data steward, or repository, the data’s description is generally bespoke, and the terms used are ambiguous. This gives 3D data a steep slope to climb to achieve findability, accessibility, interoperability, and reusability (FAIR-ness).

The use of 3D technologies has grown exponentially in the last 10 years. As a result, research libraries have invested significant infrastructure, services, and people into supporting research, teaching principles, and modeling applications of 3D technologies and data. Research libraries have begun creating and capturing 3D data using a variety of methods and formats, establishing 3D immersion labs, opening 3D printing shops within their library spaces, and adding 3D data to their repositories. As use of these tools and services has become more widespread, appropriate stewardship of the digital data is critical for ongoing accessibility, but not yet widely established or agreed upon. Enter the Community Standards for 3D Data Preservation (CS3DP) initiative.

Organized by colleagues at Washington University in St. Louis, the University of Michigan, and Iowa State University, CS3DP aims to be an open, radically inclusive, and collaborative community invested in creating standards. Composed of working groups from national and international participants, the CS3DP community has increased awareness and accelerated the creation and adoption of best practices, metadata standards, and policies for the stewardship of 3D data….”

NFTs and AI Are Unsettling the Very Concept of History | WIRED

“But now the survival of archives as we know them is uncertain. Whether we know it or not, we all rely on a patchwork of chronically underfunded public and private institutions that hold the world’s histories and cultural heritages in trust for all of us and make them accessible….

It was only a matter of time before the market figured out a way to manufacture and sell digital scarcity, and the marketplace for cultural objects has moved well past the archival ecosystem. Artists, gamers, entertainers, athletes, and executives now sell NFTs, tokenized digital objects whose authenticity is said to be assured by the reverse traceability of blockchain transactions. The combination of Covid-19 isolation and cryptocurrency profits created a powerful incentive for digital-positive collectors to compete for these NFTs, and some creators are raking in Ethereum….

Nothing could be a greater cultural and ethical shock to archives than NFTs. Prevailing archival ethics generally dictate that all users are treated equally, and that archival materials aren’t exposed or sold only to high bidders. And once archives select materials for retention, they consider themselves in most cases ethically bound to do so permanently….

As poor a fit with archival DNA as tokenizing archive collections as NFTs may be, the possibility of leveraging digital scarcity by selling NFTs while retaining physical materials is a hefty temptation. The archival world is a world of inadequate budgets and financial constraint, filled with underpaid workers and massive, poorly resourced projects like digital preservation, and the challenging task of digitizing analog materials. Will archives be tempted by the potential upside of NFTs and tokenize digital representations of their crown jewels (or the rights to these assets)? This would worsen an already bad situation…

One working solution is for cultural and historical institutions like archives to run their own trusted registries of digital objects. But this is expensive, and it creates further incentives for archives to monetize their holdings and become less accessible to noncommercial users, like genealogists, the group that uses archives more than anyone else. …”

Joint Position Statement on “Data Repository Selection – Criteria That Matter” | Zenodo

Abstract: Over the past three years, “Data Repository Selection – Criteria That Matter” – “a set of criteria for the identification and selection of those data repositories that accept research data submissions” – were developed by a group of publishers facilitated by the FAIRsharing initiative. Throughout this time, a large number of organizations and individuals have formulated responses and expressed concern about the criteria and the process through which the criteria were developed. Collectively, our organizations consider that the “Data Repository Selection – Criteria That Matter” recommendations – as currently conceived – will act as an impediment to achieving these aims. As such, we are issuing this Joint Position Statement to highlight the community’s concerns and request that the authors of these criteria respond with specific actions.


UCLA researchers digitize massive collection of folk medicine | UCLA

“A project more than 40 years in the making, the Archive of Healing is one of the largest databases of medicinal folklore from around the world. UCLA Professor David Shorter has launched an interactive, searchable website featuring hundreds of thousands of entries that span more than 200 years and draw from seven continents, six university archives, 3,200 published sources, and both first- and second-hand information from folkloric field notes.

The entries address a broad range of health-related topics including everything from midwifery and menopause to common colds and flus. The site aims to preserve Indigenous knowledge about healing practices, while preventing that data from being exploited for profit….”

A collaborative approach to preserving at-risk open access journals | Zenodo

Abstract: In the September 2020 preprint “Open is not forever”, Laakso et al. discuss the high number of Open Access journals that disappear from the web. It is a known problem in the digital preservation world that long-tail journals are especially at risk of disappearing. Five leading parties are now collaborating to address this problem: the Directory of Open Access Journals (DOAJ), CLOCKSS, Internet Archive, the Public Knowledge Project Preservation Network (PKP PN), and International ISSN / Keepers Registry. Building from the existing DOAJ infrastructure, we are establishing a central hub where preservation agencies can harvest consistent metadata and access full text. Each of the preservation partners offers somewhat different solutions for publishers to preserve their content. The project will offer free and low-cost options for preservation and access. In the first phase, the target is diamond OA journals (those with no author processing charges), because these are the journals that are least likely to participate in a preservation service and hence are most at risk of disappearing. The project is currently coordinating technical designs, service development, infrastructure, and sustainability planning.


Research repository plus | Jisc

“Research repository plus is an ‘end-to-end’ service that provides the most comprehensive and interoperable long-term management approach for your digital research outputs.

It gives you central oversight of all your research outputs, joins up your various digital research management platforms and automates the workflows that support sharing, storage and long-term digital preservation.

There are three distinct but interconnected components to research repository plus:

Research repository – multi-content repository for research articles, datasets, theses and other digital outputs
Preservation – full and active preservation to ensure your research outputs continue to be usable throughout the 25+ years research funders often demand
Research systems connect – integrates our research repository and preservation solutions, facilitating automatic preservation of digital objects and metadata submitted into the research repository…”

Exploring Perpetual Access | The Serials Librarian

Abstract: When libraries transitioned their collection development from primarily print to greater reliance on e-resources, acquisition methods also shifted from a sales contract to a licensing business model. This shift affected the long-held perception that academic libraries support education and research through the preservation and provision of the scholarly record in perpetuity. Libraries can encourage copyright holders to participate in digital preservation initiatives, but to date few initiatives have seen a large uptake. Open Access publishing further amplifies this vulnerable situation. At risk is the assurance that digital scholarly content in all formats remains available to future users. This review of the digital preservation landscape examines a variety of case studies that shed light on the impact e-resource licensing strategies have on safeguarding perpetual access; the use of the unique rights libraries have under copyright law to preserve intellectual property; and the technological access complexities of digital preservation. Recognizing that practical, economic, and culturally responsive initiatives are limited by a library’s local capacity, the need to preserve e-resources has energized an increasing number of collaborative solutions. Using the Institute of Museum and Library Services’ concept that local efforts help build a National Digital Platform, this scan of diverse initiatives explores the evolving framework emerging in support of ensuring future access to digital scholarship.


Opening the record of science: making scholarly publishing work for science in the digital era: ISC Report February 2021

“As a basis for analysing the extent to which contemporary scientific and scholarly publishing serves the above purposes, a number of fundamental principles are advocated in the belief that they are likely to be durable in the long term. They follow, in abbreviated form:

I. There should be universal open access to the record of science, both for authors and readers.
II. Scientific publications should carry open licences that allow reuse and text and data mining.
III. Rigorous and ongoing peer review is essential to the integrity of the record of science.
IV. The data/observations underlying a published truth claim should be concurrently published.
V. The record of science should be maintained to ensure open access by future generations.
VI. Publication traditions of different disciplines should be respected.
VII. Systems should adapt to new opportunities rather than embedding inflexible infrastructures.

These principles have received strong support from the international scientific community as represented by the membership of the International Science Council (ISC)….”