
scanning for editable file



I am using a Fujitsu S510 ScanSnap.  It is inexpensive, reasonably fast, and will scan to Word or Excel, as well as to PDF.  Much as I detest Word, scanning straight to it with the S510 cuts out a cumbersome step in what Nathan calls a "bigger job than it appears."

Fred



Nathan Sivin wrote:
Getting editable files depends mainly on the Optical Character Recognition (OCR) software, which transforms an image of words into text. OCR software is included with good-quality printers, but the bundled versions are not as capable as the better standalone OCR programs. I use Readiris, since it is highly accurate and can handle most foreign languages.

If you make a PDF file using Acrobat or some similar program, you can specify that it be searchable. Acrobat then uses its built-in OCR program, which is not top of the line but is not bad, to hide a layer of text behind the page image. It is then possible, using Readiris, to extract that text and save it as a file in any common format. In other words, the person who scans the documents doesn't have to be the one who makes the text files from the PDFs.
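
For anyone who would rather script the extraction step than use Readiris, here is a rough Python sketch of pulling the hidden text out of a searchable PDF. It assumes the third-party pypdf package is installed; the function names (join_pages, extract_text_layer, pdf_to_txt) are made up for illustration and are not part of Acrobat or Readiris. It only reads text that the OCR step has already stored in the file and does no recognition of its own.

```python
from pathlib import Path


def join_pages(page_texts):
    """Join per-page strings into one document, marking page boundaries."""
    parts = []
    for number, text in enumerate(page_texts, start=1):
        # A page with no text layer may yield None or an empty string.
        parts.append(f"--- page {number} ---\n{(text or '').strip()}")
    return "\n\n".join(parts)


def extract_text_layer(pdf_path):
    """Return the hidden text layer of a searchable PDF as one string."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    return join_pages(page.extract_text() for page in reader.pages)


def pdf_to_txt(pdf_path):
    """Write the text layer next to the PDF, e.g. scan.pdf -> scan.txt."""
    pdf_path = Path(pdf_path)
    out_path = pdf_path.with_suffix(".txt")
    out_path.write_text(extract_text_layer(pdf_path), encoding="utf-8")
    return out_path
```

Calling pdf_to_txt("scan.pdf") (a hypothetical file name) should leave a scan.txt beside it. The page markers are there so a proofreader can find the matching page image.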

This is a bigger job than it appears. No OCR program is perfectly accurate, so someone has to proofread everything. And any document that is not clearly printed on clean paper is likely to contain a lot of mistakes. I will be glad to pitch in on the conversion part of the job, but it will be way too much for one or two people.
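
One way to lighten the proofreading load might be to have a script point out the likeliest OCR slips first. The Python sketch below is only a heuristic under my own assumptions: it flags words that mix letters with digits or contain stray symbols, which are common signs of a misread character. The patterns and function names are illustrative and do not come from any OCR product, and a clean report does not mean a clean text, so everything still needs a human read.

```python
import re

# Illustrative heuristics (assumptions, not from any OCR product):
# a letter next to a digit, or a symbol that rarely appears in prose.
DIGIT_IN_WORD = re.compile(r"[A-Za-z]\d|\d[A-Za-z]")
ODD_SYMBOL = re.compile(r"[|~^\\{}<>]")


def suspicious_tokens(line):
    """Return tokens in one line that a proofreader should check first."""
    flagged = []
    for token in line.split():
        # Ignore ordinary punctuation attached to the word.
        core = token.strip(".,;:!?\"'()")
        if not core:
            continue
        if DIGIT_IN_WORD.search(core) or ODD_SYMBOL.search(core):
            flagged.append(core)
    return flagged


def proofreading_report(text):
    """Map 1-based line numbers to flagged tokens, skipping clean lines."""
    report = {}
    for number, line in enumerate(text.splitlines(), start=1):
        hits = suspicious_tokens(line)
        if hits:
            report[number] = hits
    return report


sample = "The c1ean paper was scanned.\nNo problems here.\nA sma|l smudge in 1987."
print(proofreading_report(sample))
```

Legitimate strings such as model numbers ("S510") will be flagged too, so this is a way to order the work, not a replacement for reading the whole document.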

Cheers,

Nathan