Google Books 2020 Update | Communications

“What would you do if Google came to you and said: You have 1 million items that we would like to scan for you and make available to the world?

Over the past two years, a team from Access Services, Stacks Management, Library Technology Services, Information and Technical Services, Harvard Depository, and ReCAP has been attempting to do just that as part of a Harvard Library Digital Strategies and Innovation (DSI) initiative. This project began nearly a decade after our first partnership with Google Books, and it has been an opportunity to approach this work differently: to identify the challenges that we face at each step of the workflow and to look for creative, iterative ways to meet them….

Between 2004 and 2009, Google scanned 891,164 volumes from Harvard. Google has begun reprocessing those materials, enhancing and correcting the raw images and running them through updated OCR to create better, more searchable, machine-readable text.

As part of this relationship, we are involved in the Google Library Partners group, an active community of our colleagues from peer institutions who also share their materials with Google. As a group we have been able to advocate for and contribute to reviews for handling of materials, quality assurance in scanning, and expanded treatments for items with foldouts or materials of non-traditional size. We have also led a review of how our peers provide access to materials and are actively partnering with HathiTrust to conduct more research into how users find and utilize these materials….”