
Batch Spell Check Enhancer (U2)



There was some discussion on the list last week about spell checking. I've
attached an XY4 U2 module called SPELL_U2, which I wrote several years ago to
automate batch spell checking and in which a few members have expressed an
interest.

First some background. There are two ways to check spelling in XyWrite:
Interactive, by working through the document to be checked and dealing with the
errors one by one in the dialog box as they are encountered; or Batch, where the
document is checked in one quick pass and all exception words are written to a
file for later examination.

Batch mode is very fast. You can check a 100-page document in a few seconds,
whereas going through that document interactively, especially if it has a
number of proper names etc. that are not in the dictionary, is a tedious
process.

Batch mode writes a word to the exception file only once, no matter how many
times it occurs in the document being checked. If your document uses
Xakasdyufgh on every page, it will appear in the exception file only once. And
if you misspell it Xacasdyufgh somewhere, both spellings will appear in the
exception file, where the discrepancy is really apparent.
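
To make the "once only" behavior concrete, here is a rough sketch in Python
(not XPL, and not how XyWrite itself does it). The function name
exception_words and the toy dictionary are made up for illustration; the
only point is that each unknown spelling is listed a single time.

```python
import re

def exception_words(text, dictionary):
    """Return each word not in `dictionary` once, in order of first appearance."""
    seen = set()
    exceptions = []
    for word in re.findall(r"[A-Za-z']+", text):
        # Dictionary lookup ignores case; the exception list keeps the spelling as typed.
        if word.lower() not in dictionary and word not in seen:
            seen.add(word)
            exceptions.append(word)
    return exceptions

# Hypothetical sample text and dictionary, for illustration only.
text = "Xakasdyufgh met us. Later Xakasdyufgh left, and Xacasdyufgh stayed."
dictionary = {"met", "us", "later", "left", "and", "stayed"}
print(exception_words(text, dictionary))  # ['Xakasdyufgh', 'Xacasdyufgh']
```

The correct name occurs twice in the sample but is listed once, and the
misspelling shows up right next to it.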

But Batch mode has some drawbacks. Out of the box, it's a little cumbersome to
use. And for some reason it trips over words that are adjacent to special
characters like true quotes ([264]'s and [265]'s) and em dashes ([260]'s) even
though Interactive mode has no problem with them.


The attached U2 module works thus:

Upon being activated by SPELL or SP, the document in the current window is
saved to $SPELL.___

$SPELL.___ is then called and true quotes and em dashes are replaced so they
won't interfere.

Next, batch spell check is run on $SPELL.___ and the exception words are
written to BADWORDS.___

BADWORDS.___ is then sorted and displayed in a new window, four columns wide
and 20 lines deep.

At this point, you need to look through BADWORDS.___ and identify the errors,
then switch to the document itself to find and fix them.
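
For anyone who wants to see the shape of those steps outside XyWrite, here is
a minimal Python sketch of two of them: replacing the troublesome characters,
and laying the sorted exception words out four columns wide and 20 lines
deep. It is an illustration only, not a translation of SPELL_U2. The Unicode
curly quotes and em dash stand in for XyWrite's [264], [265] and [260], the
function names normalize and columns are invented, and filling down each
column before moving across is my assumption about the layout.

```python
def normalize(text):
    """Swap typographic quotes and em dashes for plain ASCII stand-ins."""
    replacements = {
        "\u201c": '"',  # left curly quote (stand-in for a true quote)
        "\u201d": '"',  # right curly quote (stand-in for a true quote)
        "\u2014": " ",  # em dash becomes a space so neighboring words separate
    }
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text

def columns(words, cols=4, rows=20, width=18):
    """Sort words and lay them out column by column, `rows` deep, `cols` wide."""
    words = sorted(words, key=str.lower)
    lines = []
    # Each screenful holds cols * rows words; longer lists continue below it.
    for start in range(0, len(words), cols * rows):
        page = words[start:start + cols * rows]
        for r in range(rows):
            cells = page[r::rows]  # every rows-th word lands on the same line
            if cells:
                lines.append("".join(w.ljust(width) for w in cells).rstrip())
    return lines

print(normalize("\u201cHello\u201d\u2014world"))
for line in columns(["delta", "alpha", "charlie", "bravo"], cols=2, rows=2, width=10):
    print(line)
```

With the defaults, up to 80 exception words fit in one screenful.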


Some enhancements that I've never got around to doing:

1. Instead of changing only a few things like true quotes that have proved
troublesome, it would probably be cleaner to just remove all special characters
(how to do that?) and indeed all formatting, leaving just the raw text to be
checked. $SPELL.___ is discarded once BADWORDS.___ is created, so it wouldn't
matter that the formatting was trashed.

2. It would be nice to have a routine which could be activated by a keystroke
while the cursor was on a misspelled word in BADWORDS.___, which would switch to
the document itself and find all instances of the error, stopping after each to
allow correction (maybe invoking the interactive spell checker at that point)
and then returning to BADWORDS.___ to pick up the next misspelled word.
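
The search half of enhancement 2 can be sketched in a few lines of Python.
This is only a sketch of the idea, not XPL and not something SPELL_U2 does
today; find_instances and the sample text are hypothetical. Given one word
from BADWORDS.___, it reports where each whole-word occurrence sits in the
document, which is where a routine would stop to allow correction.

```python
import re

def find_instances(document, word):
    """Return (line_number, column) for each whole-word match, both 1-based."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    hits = []
    for line_no, line in enumerate(document.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_no, match.start() + 1))
    return hits

# Hypothetical document text, for illustration only.
doc = "Xacasdyufgh arrived.\nWe saw Xacasdyufgh and Xakasdyufgh."
print(find_instances(doc, "Xacasdyufgh"))  # [(1, 1), (2, 8)]
```

Matching on word boundaries keeps the search from stopping on longer words
that merely contain the misspelling.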

Enjoy!


Tom Hawley
tjh@xxxxxxxx

Attachment: SPELL_U2
Description: Binary data