“For two decades the Jewish Museum in Prague, or JMP, has undertaken a global search for lost publications from the city’s Jewish Community Library, which was looted and shuttered by Nazi occupiers during World War II. With the recent emphasis on digitization of collections by academic libraries, including UCLA’s, the museum’s work has become a lot easier and more fruitful. The JMP’s efforts to repatriate these stolen items have increased in intensity as anyone capable of using an online search tool can access these vast online repositories.
UCLA Library is one of the earliest and largest contributors to one such repository, the HathiTrust — a collaborative of academic research libraries that have thus far digitized 17 million volumes and made them full-text-searchable….”
“HathiTrust Community Week is back! Members asked and we listened, so we’ve reserved the week of July 11-15 for members to go deep on all things HathiTrust — from what their library is doing with the services and collection to what users are finding that enables their teaching, learning, and scholarship. HathiTrust Community Week is dedicated to giving space for members of our wider community to share projects, research, and workshops (and anything else related to HathiTrust) with other members of the community.
This year, we invite participants to step outside the webinar box and consider other ways to bring people into the HathiTrust world — whether it’s building a joint collection in Collection Builder in real time, teaching a research 101 course using HathiTrust, or inviting in students and faculty to help illustrate the role of HathiTrust in teaching and learning. …”
“Since 2018, preservation services supervisor Anne Conway has spent six hours each week researching the copyright status of online books. She has now completed an outstanding 50,000 assessments as a volunteer for HathiTrust’s Copyright Review Program.
HathiTrust is a not-for-profit collaborative of academic and research libraries—including the University Libraries at UNC-Chapel Hill—that preserves digital copies of more than 17 million books and other materials. When those texts are in the public domain, meaning they are free of copyright restrictions, then HathiTrust makes them accessible online for anyone to read….
It is meaningful work, but it can be complex. While all books first published in the United States before 1928 are in the public domain, reviewers like Conway must apply a rigorous review process to determine whether other texts can be made freely accessible.
That multi-step process includes assessing whether the book matches the project’s legal scope; determining whether its copyright has been renewed; and determining whether the book contains credits, permissions or acknowledgements indicating that the digital file might contain other copyrighted content.
This requires nuance and attention to detail. All copyright reviewers go through an extensive training program before they start evaluating texts, according to the HathiTrust website. Even then, each file has to be assessed by two independent reviewers who must agree on its status before it is made public….”
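As a rough illustration of the review flow described in the excerpt above, the following Python sketch models two independent reviews that must agree before a status is assigned. It is not HathiTrust's actual system: the `Review` class, the `resolve` function, and the status labels (including `needs_adjudication`, since the excerpt does not say what happens when reviewers disagree) are hypothetical names, and the rule that a renewal or third-party content makes a volume restricted is a deliberate simplification.

```python
from dataclasses import dataclass

# Per the excerpt: books first published in the United States before 1928
# are in the public domain.
PUBLIC_DOMAIN_CUTOFF = 1928


@dataclass(frozen=True)
class Review:
    """One reviewer's answers to the three checks named in the excerpt (hypothetical model)."""
    reviewer: str
    in_scope: bool             # does the book match the project's legal scope?
    renewed: bool              # was its copyright renewed?
    third_party_content: bool  # credits/permissions hinting at other copyrighted content?

    def verdict(self) -> str:
        if not self.in_scope:
            return "out_of_scope"
        if self.renewed or self.third_party_content:
            return "restricted"
        return "public_domain"


def resolve(us_publication_year: int, first: Review, second: Review) -> str:
    """Return a status only when two different reviewers reach the same verdict."""
    if us_publication_year < PUBLIC_DOMAIN_CUTOFF:
        return "public_domain"
    if first.reviewer == second.reviewer:
        raise ValueError("reviews must come from two different reviewers")
    if first.verdict() == second.verdict():
        return first.verdict()
    # Assumption: disagreement is escalated rather than published.
    return "needs_adjudication"
```

For example, `resolve(1950, Review("a", True, False, False), Review("b", True, True, False))` returns `"needs_adjudication"` because the two reviewers disagree about renewal.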
Abstract: We present a new dataset built on prior work consisting of 1,671,370 randomly sampled pages of English-language prose roughly divided between modes of fictional and non-fictional writing and published between the years 1800 and 2000. In addition to focusing on the “page” as the basic bibliographic unit, our work employs a single predictive model for the historical period under consideration in contrast to prior work. Besides publication metadata, we also provide an enriched feature set of 107 features including part-of-speech tags, sentiment scores, word supersenses and more. Our data is designed to give researchers in the digital humanities large yet portable random samples of historical writing across two foundational modes of English prose writing. We present initial insights into transformations of linguistic patterns across this historical period using our enriched features as possible pointers to future work. The data can be accessed at https://doi.org/10.7910/DVN/HAKKUA.
“In 2020, CDL joined in collaboration with the Center for Research Libraries and HathiTrust (the CCH Collaboration) to play a facilitative leadership role in advancing shared print’s transition to a new phase of integration and interoperability (read more here). In its first year, the Collaboration released a freely available shared print comparison tool for serials and journals. On December 1st & 2nd of 2021, the Collaboration hosted a summit bringing together the shared print community, library technologists, and service providers to map a path forward for embedding shared print in the collections lifecycle. …”
“Ohio University Libraries is excited to announce that it is one of the newest members of HathiTrust, a global collaborative of libraries that seeks to preserve the cultural record long into the future. HathiTrust is made up of over 200 member libraries that have created a repository of 17.4 million volumes in their digital library.”
“Scholars studying the shifting landscape of work can now dig deep into more than a half-century’s worth of knowledge from the ILR [Industrial and Labor Relations] School’s digitized publications available on HathiTrust Digital Library, a vast collection of digitized content from libraries around the world….”
“…Access to knowledge and open inquiry are necessary for just societies and for the creation of new scholarship and research. We use our position to promote the broadest possible access to the scholarly and cultural record today and into the future….”
“It is my great pleasure to share that HathiTrust membership ratified the Statement of Values in the voting process that ended June 1. Response was strong with 124 of 190 voting members participating and all weighted votes cast in support. We appreciate the high level of engagement by the membership, especially during these trying times….”
“Formed in January 2020, the California Digital Library (CDL), the Center for Research Libraries (CRL), & HathiTrust Collaboration is building on a decade of collaboration, innovation, and expertise to define a new phase of shared print built on open and interconnected infrastructure. The collection comparison tool brings together the shared print retention commitments registered in CRL’s Print Archives Preservation Registry (PAPR), HathiTrust digital collection metadata, and the local library serials data submitted by users. Users know instantly which titles in their collection have been retained by shared print collections and which have not. This tool utilizes technology included in WEST’s decision-support system, AGUA, built and supported by CDL….”
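The kind of comparison described in the excerpt above can be pictured as a set-membership check. The Python sketch below is not the actual tool or the AGUA system: it assumes titles are matched by ISSN (the excerpt does not name the matching key), and `normalize_issn`, `compare_holdings`, and the sample ISSNs are made up for illustration.

```python
def normalize_issn(raw: str) -> str:
    """Strip hyphens and whitespace and uppercase, so '1111-111x' matches '1111111X'."""
    return raw.replace("-", "").strip().upper()


def compare_holdings(local, retained, digitized):
    """Flag each local serial as retained in shared print and/or present digitally.

    local     -- ISSNs from the library's own serials list
    retained  -- ISSNs with shared print retention commitments (e.g. from a registry)
    digitized -- ISSNs of serials available in a digital repository
    """
    retained_keys = {normalize_issn(i) for i in retained}
    digitized_keys = {normalize_issn(i) for i in digitized}
    report = {}
    for raw in local:
        key = normalize_issn(raw)
        report[raw] = {
            "retained": key in retained_keys,
            "digitized": key in digitized_keys,
        }
    return report


# Made-up ISSNs, used only to show the shape of the result.
report = compare_holdings(
    local=["1111-1111", "2222-222x", "3333-3333"],
    retained=["11111111"],
    digitized=["2222-222X", "1111-1111"],
)
```

Here `report["3333-3333"]` shows `False` for both flags, which corresponds to a title that neither source has retained.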
“In the beginning of 2020, CDL joined with the Center for Research Libraries (CRL) and HathiTrust to form a Collaboration for Shared Print Infrastructure. Working together, the Collaboration seeks to build on a decade of community innovation and expertise to define a new phase of shared print built on open and interconnected infrastructure.
As a first step toward that larger vision, CDL is proud to announce the launch of a new collection comparison tool realized in partnership with the Center for Research Libraries and HathiTrust. This new tool is a completely open means of comparing local serial and journal holdings against shared print commitments across North America and select digital repositories, including hundreds of thousands of HathiTrust digital serial and journal titles.
The tool can be accessed at papr.crl.edu/tools/compare. …”
“HathiTrust Research Center (HTRC) has selected four projects to participate in its special round of Advanced Collaborative Support (ACS), funded by the Andrew W. Mellon Foundation through the Scholar-Curated Worksets for Analysis, Reuse & Dissemination (SCWAReD) project.
The projects will seek to build HTRC worksets drawn from materials related to historically under-resourced and marginalized textual communities, and in doing so, to identify gaps in the HathiTrust collection where such communities are not represented in the digital library. The worksets will be analyzed using text and data mining techniques. The worksets, derived data outputs, and associated documentation will be shared at the end of the projects as illustrative research models of the text and data mining process. The four research models will join a flagship model that is being developed concurrently in collaboration with co-PI Maryemma Graham and her History of Black Writing project at the University of Kansas.”
“Earlier this year in July, we shared news with the WEST membership that California Digital Library (CDL, administrative host for WEST), the Center for Research Libraries (CRL), and HathiTrust released a set of framing documents that would guide their collaborative work to realize open infrastructure to support shared print as an integral part of contemporary collection management and development.
Now we are happy to share that the three organizations have embarked on their first collaborative project: a freely available, web-accessible tool that will enable any library to compare lists of serial holdings with serial retention commitments in PAPR in order to distinguish, in the local list, what has been retained and what has not.
This project embodies the intention of the CDL, CRL & HT collaboration not to rebuild, but to build upon and connect the resources and capabilities that already abound in our community. This project is possible because of the cumulative efforts of many: the original work undertaken by CRL and CDL, funded by the Mellon Foundation, to conceive and develop PAPR; subsequent contributions by WEST in developing the AGUA graphic interface and on-the-fly reporting capability; and, as an added bonus, the collaboration aims to open up a new dataset for comparison: HathiTrust’s digital serials….”
“The HathiTrust Research Center (HTRC) requests proposals for a special funded round of its Advanced Collaborative Support (ACS) program, with support from the Andrew W. Mellon Foundation for HTRC’s “Scholar-Curated Worksets for Analysis, Reuse & Dissemination (SCWAReD)” project.
ACS is a scholarly service offering collaboration between researchers and HTRC staff to solve challenging problems related to computational analysis of the HathiTrust corpus. In this special cycle of ACS, we seek to collaborate with scholars to recover volumes in HathiTrust that tell the story of historically under-resourced and marginalized textual communities, and to identify gaps in the HathiTrust collection where such communities are not represented in the digital library. …”