Open post to K. Re. Indexing
- Subject: Open post to K. Re. Indexing
- From: jenterli@xxxxxxxx (James Enterline)
- Date: Wed, 20 Nov 1996 16:27:15 -0500 (EST)
While we have the attention of K. from TTG concerning bug complaints, I
would like to address a positive subject that he let pass by earlier. In my
posting of 10/16/96 (12:42 PM) I showed how to make the Index function of
XyWrite much more useful. Auxiliary to that, a call was put out for noise
word files that TTG could supply for general use. No response whatsoever
has been forthcoming from TTG.
Here is a list of some postings followed by excerpts.
10/16/96 12:42 PM James Enterline:
Nobody mentioned an easy way to get started with a list of all indexable
words. I think an
initial list of all unique words in the document would be a good start. It
seems to me such a list could then be pared down leaving only the words one
wants to subject to indexing (and add sub-headings to). Maybe the answer
was too obvious for anyone to mention, but I finally realized how and will
mention it: Do a SPELL command on the entire document while all dictionaries
have been unloaded. What you get will be an alphabetical list of all unique
words. It took my 40 MHz 386 only 7 1/2 minutes to do a 90,000 word book
ms. (ca. 6,000 unique words).
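
For readers without XyWrite at hand, the same kind of list (every distinct word in the document, alphabetized) can be approximated outside the editor. The following is only a rough sketch in Python, not a description of how SPELL works internally; the function name `unique_words` and the rule for what counts as a word (letters, with internal apostrophes or hyphens) are assumptions made for illustration.

```python
import re

def unique_words(text):
    """Return an alphabetical list of the distinct words in `text`.

    Assumption: a "word" is a run of ASCII letters, optionally joined by
    internal apostrophes or hyphens. Case is ignored.
    """
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    return sorted({w.lower() for w in words})

sample = "The index lists the words. The words are sorted."
print(unique_words(sample))
# prints ['are', 'index', 'lists', 'sorted', 'the', 'words']
```

The resulting list is the raw material to pare down by hand, as described above.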
Even better would be to SPELL check with a dictionary containing all common,
non-interesting words, leaving much less paring down to be done. Anybody
know of such a list in adaptable electronic form?
10/16/96 at 18:47 Jerry Bernstein:
I have used ZyIndex (ZyLabs, Chicago) to index docs. Versions exist for
both DOS and Windows 3.n. The program has a list of "noise words" that it
filters out of the doc. before extracting the index. You could copy that
list, or edit it for your own use.
10/18/96 Harry Binswanger:
And I know those lists exist, but how to get one?
10/19/96 Dorothy Day:
Owners of Orbis will have a basic list of words considered too common to
index in your textbase (OMIT.LST), which you could substitute for the
spelling dictionary.
10/19/96 James Enterline:
Might I suggest that The Technology Group purchase rights to such a list,
edit it to XyWrite format, and distribute it as an accessory to XyWrite?
This would take very little and low-level manpower.
10/20/96 Dorothy Day:
A cheaper alternative would be to buy a copy of the now deceased
Magellan, which did much the same thing as ZyIndex (and much more
besides), and has a multilingual file called MAGELLAN.SKP, which serves
the same purpose as OMIT.LST.
10/21/96 James Enterline:
Nathan Sivin wrote:
>If anyone wants to compile a list of common words for the indexing
>function to ignore, using any of the approaches already discussed, I
>will be glad to post it for downloading.
Are there any attorneys on the list who can tell me if this would be fair
use of MAGELLAN.SKP? Copyright only protects expression, not data, at least
in printed matter. I haven't read up on the highly specialized matter of
software copyright.
I posted that message primarily for TTG to pay attention to. And there
things have stood for a month. Are they looking into it? Will anything
happen? Stay tuned. Jim.
James Enterline 144 West 95th Street 1 (212) 865-9648 Voice/Fax .
............... New York NY 10025 jenterli@xxxxxxxx