Re: Xy3 - reference manuals
- Subject: Re: Xy3 - reference manuals
- From: Nathan Sivin nsivin@xxxxxxxx
- Date: Wed, 15 Jul 2009 09:52:42 -0400
Getting editable files depends mainly on the Optical Character
Recognition (OCR) software, which transforms an image of words into
text. OCR software is bundled with good-quality printers, but the
bundled versions are not as capable as the better standalone OCR
programs. I use Readiris, since it is highly accurate and can handle
most foreign languages.
If you make a PDF file using Acrobat or some similar program, you can
specify that it be searchable. That option uses Acrobat's built-in OCR
program, which is not top of the line but is not bad, to hide a layer
of text behind the image. It is then possible, using Readiris, to
extract that text and save it as a file in any common format. In other
words, the person who scans the documents doesn't have to be the one who
makes the text files from the PDFs.
This is a bigger job than it appears. No OCR program is perfectly
accurate, so someone has to proofread everything. And any document that
is not clearly printed on clean paper is likely to contain a lot of
mistakes. I will be glad to pitch in on the conversion part of the job,
but it will be way too much for one or two people.
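One way to narrow the proofreading, sketched here under assumptions that
are not part of the workflow described above: run the same page through
two OCR programs (say, Acrobat's built-in one and Readiris) and let a
short script list the words on which the two outputs disagree. The
function name flag_disagreements and the sample strings are
hypothetical; the sketch uses only Python's standard difflib module.

```python
import difflib


def flag_disagreements(ocr_a: str, ocr_b: str) -> list[tuple[int, str, str]]:
    """Compare two OCR outputs of the same page, word by word.

    Returns one (word_index_in_a, words_from_a, words_from_b) tuple for
    each stretch where the two outputs differ. Hypothetical helper, not
    part of any OCR product.
    """
    words_a = ocr_a.split()
    words_b = ocr_b.split()
    # autojunk=False keeps difflib from ignoring very common words
    # when the page is long.
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    flagged = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        if tag != "equal":
            flagged.append(
                (a0, " ".join(words_a[a0:a1]), " ".join(words_b[b0:b1]))
            )
    return flagged


if __name__ == "__main__":
    # Made-up sample strings standing in for two OCR runs of one line.
    first_run = "The quick brown fox jumps"
    second_run = "The qu1ck brown fox jumps"
    for index, from_a, from_b in flag_disagreements(first_run, second_run):
        print(f"word {index}: {from_a!r} vs {from_b!r}")
```

Errors that both programs make in the same way will not be flagged, so
this only points a proofreader at the likeliest trouble spots; it does
not replace reading the text.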
Cheers,
Nathan
--
Nathan Sivin
History and Sociology of Science
University of Pennsylvania
Philadelphia PA 19104-6304
(215) 242-1596
nsivin@xxxxxxxx
http://ccat.sas.upenn.edu/~nsivin/