
Re: email-style italics



Paul:
I think this is all handleable with a few preliminary steps, only the first of which requires conscious judgment:
1. Do a Search for two adjacent underscores. Normally none will be
found. When one is, you make a judgment: is it an error or is it
something that should stay? If it is an error, fix it manually (you could
incorporate this option into a macro that would ask you whether to delete
the extra underline or not, but it probably isn't worth semi-automating
this step).
2. Any remaining cases of 2 or more adjacent underlines are now supposed to
stay as underlines, so do a CI to change them into some other unique string
(a string that will later be turned back into underlines)--say ^%^ which is
really unlikely to be in an email. E.g.,

CI /__/^%^/
Where there is an odd number of underlines in a row, that will leave a single underline after the ^%^, so you need to allow for that by substituting a different unique string--say ,*& for that leftover underline:

CI /^%^_/^%^,*&/
3. Now you are ready to test for balance. The simple way is to see if there's an even or odd (unbalanced) total:

CI /_/_/
When you replace a string with itself, as above, the prompt line still shows the number of replacements. If it's an odd number, there's an imbalance somewhere (and I don't know how to find it except by testing the first half of the file, then . . .)
4. Now do the previously suggested CIs to change each single underline to the italics-on or italics-off code.
5. Finally, undo the unique strings:

CI /^%^/__/  <=== put back all the even-number of underlines

CI /,*&/_/   <=== put back any odd remainder underlines
I haven't tested these CIs but even if I've got an error in there, you get the idea.
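For anyone who wants to try the logic outside the editor, here is a rough Python sketch of steps 2 through 5. It is not XPL and it is not tested against real mail. The function names are made up for illustration, the <i> and </i> markers are only stand-ins for whatever italics-on and italics-off codes you actually use, and the simple alternating on/off conversion is my substitute for the previously suggested CIs. The paragraph check assumes an italic span never crosses a blank line.

```python
EVEN_MARK = "^%^"  # placeholder for each pair of underlines (step 2)
ODD_MARK = ",*&"   # placeholder for the leftover underline in an odd-length run


def protect_runs(text):
    """Step 2: hide runs of two or more underlines behind placeholders."""
    text = text.replace("__", EVEN_MARK)
    return text.replace(EVEN_MARK + "_", EVEN_MARK + ODD_MARK)


def restore_runs(text):
    """Step 5: turn the placeholders back into underlines."""
    return text.replace(EVEN_MARK, "__").replace(ODD_MARK, "_")


def unbalanced_paragraphs(text):
    """Step 3, narrowed down: 1-based numbers of blank-line-separated
    paragraphs holding an odd count of single underlines."""
    paragraphs = protect_runs(text).split("\n\n")
    return [n for n, p in enumerate(paragraphs, start=1) if p.count("_") % 2]


def convert_italics(text, on="<i>", off="</i>"):
    """Step 4 (assumed form): alternate on/off markers for single underlines."""
    pieces = text.split("_")
    out = [pieces[0]]
    for n, piece in enumerate(pieces[1:]):
        out.append(on if n % 2 == 0 else off)
        out.append(piece)
    return "".join(out)


def email_italics(text):
    """Run steps 2 through 5, refusing to convert an unbalanced text."""
    protected = protect_runs(text)
    if protected.count("_") % 2:
        raise ValueError("odd number of single underlines; fix by hand first")
    return restore_runs(convert_italics(protected))
```

Like the CIs, this goes wrong if the text already contains ^%^ or ,*& somewhere, so check for those first. The paragraph-by-paragraph count is one way to avoid halving the file over and over when the total comes out odd.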
It would be child's play to put all these CIs into an XPL routine, but I
think the first step requires conscious attention and judgment.
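The judgment in step 1 can at least be made easier by listing every suspect spot with some context. A minimal sketch in Python, again with a made-up function name and an arbitrary default of 20 characters of context on each side:

```python
import re


def find_adjacent_underscores(text, context=20):
    """Return (line_number, run_length, snippet) for each run of 2+ underscores."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in re.finditer(r"_{2,}", line):
            start = max(0, m.start() - context)
            end = min(len(line), m.end() + context)
            hits.append((lineno, len(m.group()), line[start:end]))
    return hits
```

It only reports; deciding whether a given run is a typo or a deliberate blank is still up to the person reading the list.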
Yeah, most pairs probably would be balanced, but every now and then they wouldn't (or you might get two underscores in a row); you'd also need to watch out for underscore characters in email and web addresses.
So maybe a program that focused only on "[S]_[S]" strings first and asked
for input, then autoconverted underscores based on [S]_ and _[S], and then
checked for broken IT-ON IT-OFF pairs would mostly work.
Such a system would still mistake a string of underscores used to
represent a blank or something else -- Google offered up 147,000,000
examples of "__", so I'm sure it'd occur in an email.
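A rough Python sketch of that idea, with hypothetical names throughout: each single underscore is labeled by the whitespace around it, "[S]_[S]" cases are flagged for a human, underscores with no space on either side (as in addresses) are left alone, and the on/off labels are then checked for broken pairs. Start and end of text are treated as whitespace, which is an assumption of this sketch.

```python
import re


def classify_underscores(text):
    """Return (index, kind) for every single underscore in text."""
    kinds = []
    for m in re.finditer(r"(?<!_)_(?!_)", text):  # skips runs like "__"
        i = m.start()
        space_before = i == 0 or text[i - 1].isspace()
        space_after = i + 1 == len(text) or text[i + 1].isspace()
        if space_before and space_after:
            kind = "ask"        # "[S]_[S]": needs human input
        elif space_before:
            kind = "on"         # "[S]_": italics on
        elif space_after:
            kind = "off"        # "_[S]": italics off
        else:
            kind = "embedded"   # e.g. inside an email or web address
        kinds.append((i, kind))
    return kinds


def broken_pairs(kinds):
    """Return indexes of on/off underscores that lack a partner."""
    problems = []
    open_at = None
    for i, kind in kinds:
        if kind == "on":
            if open_at is not None:
                problems.append(open_at)  # earlier opener never closed
            open_at = i
        elif kind == "off":
            if open_at is None:
                problems.append(i)        # closer with no opener
            open_at = None
    if open_at is not None:
        problems.append(open_at)          # still open at end of text
    return problems
```

Taken literally, the whitespace test would miss a closing underscore followed by punctuation, as in "_this_," so a real version would probably need to treat punctuation as a boundary too.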

Paul Lagasse


Harry Binswanger
hb@xxxxxxxx