Re Word Frequency Analyzer
- Subject: Re Word Frequency Analyzer
- From: Patricia M Godfrey pmgodfrey@xxxxxxxx
- Date: Wed, 23 Apr 2003 14:36:40 -0400
This is too, too rich for words. I had originally posted a message to the
list on this subject, and it came back to me with the following notice
from the listproc:
"Your recent message to the XYWRITE list has been rejected for the
following reason:
The following string matched one of ListProc's command words:
[the "following string," which I mustn't use or this message too will be
bounced, was a synonym for "nuances."]
Hence, your message looks like [sic] it was intended for the ListProc
server, rather than for a particular list."
If any proof of the inadequacies of algorithms in dealing with the
English language were needed...
Herewith my original post, edited to avoid confusing the ListProc's
algorithms.
Not to sound like a wet blanket, much less a Luddite, but having made the
nuances of the English language my life's work, I have to wonder whether
any algorithm could be of much use in this area. After all, homographs are
NOT the same words: `lead' (the present tense of the English equivalent
of `ducere') is not the same word as `lead' (Latin `plumbum', formerly
thought to be the heaviest metal); `arms' (the forelimbs of an erect
biped) is not the same word as `arms' (as in "Arma virumque cano"). And
even when the same word (etymologically) is involved, the semantic
variety can be enormous: `tree' can be the botanical specimen, a family
tree, a decision tree (in MBA speak), or even (salva reverentia) the tree
of life, or of knowledge, or the "arbor una nobilis."
Yes, people do overuse pet words. I know I do (`ineluctably',
`exceedingly'--probably from translating Latin valde--and
`disquisition' come off my pen or keyboard far more frequently than
anywhere else in English writing, and a good friend once declared this
"the year of the balderdash" after I had so characterized too many
opinions). But if authors aren't aware of their own pets, their editors
soon become so. After all, one of the things an editor needs is a good
memory, not just for overused words but for things like "Wait a minute.
Wasn't Othgar Siegmond's uncle a few pages back? And now he's his
grandfather?" Algorithms and search features are great for helping one
CONFIRM or disprove that a word is being overused. But on their own, I
doubt they do much better than the so-called grammar checkers (which are
a bad, bad joke).
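
For what it is worth, a minimal sketch of the sort of counting such a
tool does, assuming Python; the function name word_frequencies and the
sample sentence are hypothetical, made up purely for illustration. It can
confirm that a word recurs, but it sees only spelling, so the two
`lead's are lumped together as one word.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count case-folded word tokens (hypothetical helper).

    The count is purely orthographic: homographs such as `lead' (to
    guide) and `lead' (the metal) end up under the same key.
    """
    # Runs of letters, optionally followed by an apostrophe and more letters.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens)


# Made-up sample text, not taken from any real manuscript.
sample = (
    "They lead the way. The pipe was made of lead. "
    "Balderdash, he said, and balderdash again."
)

counts = word_frequencies(sample)

# Show the most frequent tokens so a human can judge whether they are overused.
for word, n in counts.most_common(3):
    print(word, n)
```

A count like this is a starting point for the editor's judgment, not a
substitute for it: deciding whether two hits are even the same word
still takes a reader.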
Patricia