I’ve been thinking at length about how best to present scanned books online. I know a lot of people think PDFs are best, but they’re blasted difficult to maintain, and the quality of searchable text depends on OCR… and OCR sucks in Adobe Acrobat Pro (at least).
But I may as well give it a shot, as a baseline against which I can compare further efforts in other directions. So I’ve arbitrarily chosen the book I scanned last night and put it together into a PDF of page images (no text yet). If nothing else, it will make a good test bed for software I might try developing.
It’s a single issue of a magazine, relatively short (112 scans) but full of very, very small type. It’s been downsampled quite a bit by Adobe, so it may not be entirely readable. Again, a good baseline from which to plan improvements.
Read and enjoy. It’s currently being proofread at Distributed Proofreaders, and sometime in a couple of years (!) it will be released into Project Gutenberg. Or maybe something better will happen to it in the meantime…
