FOSS wins again: Free and Open Source Communities comes through on 19th Century Newspapers (and Books and Periodicals…) – Internet Archive Blogs

“I have never been more encouraged by and thankful to Free and Open Source communities. Three months ago I posted a request for help with OCR’ing and processing 19th Century Newspapers and we got soooo many offers to help. Thank you, that was heartwarming and concretely helpful: based on these suggestions, we are already changing over our OCR and PDF software completely to FOSS, making big improvements, and building partnerships with FOSS developers in companies, universities, and as individuals that will propel the Internet Archive to have much better digitized texts. I am so grateful, thank you. So encouraging.

I posted a plea for help on the Internet Archive blog, ‘Can You Help us Make the 19th Century Searchable?’, and we got many social media offers and over 50 comments on the post, maybe a record response rate.

We are already changing over our OCR to Tesseract/OCRopus and leveraging many PDF libraries to create compressed, accessible, and archival PDFs….”