Re: Xy3 - reference manuals
- Subject: Re: Xy3 - reference manuals
- From: David Auerbach auerbach@xxxxxxxx
- Date: Wed, 15 Jul 2009 18:45:00 -0400
I habitually download PDFs from journals and archive sites. These days
they mostly are already textified (as in that now golden oldie, "I
just want to textify") but some aren't, particularly older ones (like
Russell's 1905 "On Denoting"; Mind just did a centenary issue on it.)
My version of Acrobat Pro happily scans them (wholesale). As far as I
can tell it *displays* the image, but the text is now searchable. With
printed matter like that I've simply never caught it in an error
(which doesn't mean it doesn't make any).
(It isn't brilliant with some mathematical notation, but that
doesn't alter the appearance, just the searchability.)
It will also look for URLs and linkify them. I imagine that other
linking will need doing by hand via its bookmark feature.
David Auerbach
Department of Philosophy & Religion
Box 8103
NCSU
Raleigh, NC 27695-8103
On Jul 15, at 2:29 PM, flash wrote:
Myron,
≪Scan-to-pdf programs start with an image file and then run an OCR
routine on the image to produce a PDF that contains searchable text.
Acrobat does this. If you don't run the OCR part of the process, you end
up with pure image files.≫
I have scanned from Acrobat 5.0 full version (not Reader) and all I get
is an image file, noneditable and nonsearchable. The OCR routine
appears, as you described, in version 6.0, but is a multi-mouse-click
process for each page. It is extremely tedious (without even
proofreading); I don't want to do this for several hundred pages of
text.