In the last few years, an increasing share of the books added to
Project Runeberg (runeberg.org, the Scandinavian e-text archive)
has been digitally photographed rather than scanned. We always
capture images of book pages before we run OCR, as an aid for
proofreading, and images can be captured either way. This is a
natural development, as digital cameras are getting better,
cheaper and more common.
So far we have improvised and tried various camera mounts. But we
don't yet have anything like the "Scribe" stations that the
Internet Archive uses, with dual cameras and a V-shaped glass to
press down over the open book. Some webpages say these are
developed by IA together with Kirtas Technologies or together with
various universities. I don't see anything like this on Kirtas'
website. Other webpages say they look very similar to the
BookDrive DIY from Atiz. I've seen mentions of the bare Atiz
BookDrive DIY being sold at $3500, which sounds a bit high.
Is this something that we can buy in (Northern) Europe? Or should
we try to build one ourselves? They don't look too complicated.
If this is an open source design, where can one find the blueprints?
Are any patents involved?
What is the best working interface between camera and computer?
USB 2.0 for direct transfer? Or memory card? Or Firewire? Does
the Internet Archive develop its own software for this? The time
to get the images into the computer could decide how many pages
one can capture per hour.
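To make that concrete for myself, here is a rough Python sketch of how
the transfer step feeds into pages per hour. The timings are made-up
placeholders, not measurements from any real station, and the function
name is my own. The only point is that transfer time adds to every
capture cycle unless it can overlap with turning the page.

```python
def pages_per_hour(turn_s, shoot_s, transfer_s, pages_per_cycle=2):
    """Estimate throughput for a station that captures `pages_per_cycle`
    pages (e.g. one per camera) in each page-turn cycle.

    All timings are seconds per cycle and are hypothetical inputs.
    """
    cycle_s = turn_s + shoot_s + transfer_s
    if cycle_s <= 0:
        raise ValueError("cycle time must be positive")
    return 3600.0 * pages_per_cycle / cycle_s


# Placeholder timings, only to show the effect of the transfer step.
slow = pages_per_hour(turn_s=4, shoot_s=1, transfer_s=5)  # operator waits for transfer
fast = pages_per_hour(turn_s=4, shoot_s=1, transfer_s=0)  # transfer overlaps the page turn
print(f"{slow:.0f} vs {fast:.0f} pages/hour")
```

With these invented numbers, a transfer that blocks the operator for as
long as the rest of the cycle halves the throughput.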
Or is "Scribe" the name of the whole system (hardware+software),
where the Atiz BookDrive DIY is just the camera mount?
I know 600 dpi was the old standard for bitonal scanned images and
that 300 dpi works fine with color scans. I've been looking for
numbers on what the IA uses, and found some in
http://www.openlibrary.org/details/openlibrary
where it says the Scribe can capture 300 dpi over 16 inches.
That's an impressive 17 megapixels, suggesting that Canon EOS-1Ds
Mark II cameras are used, which currently sell for US$7000.
However, very few books are 16 inches (or 400 millimetres) tall,
so if the IA has a handful of such scanning stations it would be a
waste to install the top-end cameras in more than one station. For
smaller books one could get away with a pair of 10 megapixel
cameras, such as the Canon EOS-400D, at around one-tenth of the
price. And it probably gets cheaper next year.
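For anyone who wants to check the arithmetic, here is a small Python
sketch relating dpi, page height and sensor megapixels. It assumes a
3:2 sensor whose long side is aligned with the page height and filled
completely by it, which is my simplification; the function names are
also my own. Under that assumption, 300 dpi over 16 inches needs about
15 megapixels, the same ballpark as the 17 megapixel camera, and a
10 megapixel sensor covers a page roughly 13 inches (about 330 mm)
tall at 300 dpi.

```python
import math


def megapixels_needed(dpi, inches, aspect=3 / 2):
    """Sensor megapixels if the long side covers `inches` at `dpi`.

    `aspect` is long side / short side; 3:2 is an assumption.
    """
    long_px = dpi * inches
    short_px = long_px / aspect
    return long_px * short_px / 1e6


def max_height_inches(megapixels, dpi, aspect=3 / 2):
    """Tallest page a sensor of `megapixels` can cover at `dpi`."""
    long_px = math.sqrt(megapixels * 1e6 * aspect)
    return long_px / dpi


print(round(megapixels_needed(300, 16), 1))   # 16 inch page at 300 dpi
print(round(max_height_inches(10, 300), 1))   # 10 megapixel sensor at 300 dpi
```

In practice some margin around the page is needed, so the real figures
would be a little less favourable than these.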
--
Lars Aronsson (lars@...)
Project Runeberg - free Nordic literature -
http://runeberg.org/