Re: email-style italics
- Subject: Re: email-style italics
- From: Harry Binswanger hb@xxxxxxxx
- Date: Sat, 09 Aug 2008 14:47:21 -0400
Paul:
I think this is all handleable with a few preliminary steps, only the first
of which requires conscious judgment:
1. Do a search for two adjacent underscores. Such will normally not be
found. When they are, you make a judgment: is it an error or is it
something that should stay? If it is an error, fix it manually (you could
incorporate this option into a macro that would ask you whether to delete
the extra underline or not, but it probably isn't worth semi-automating
this step).
2. Any remaining cases of 2 or more adjacent underlines are now supposed to
stay as underlines, so do a CI to change them into some other unique string
(a string that will later be turned back into underlines)--say ^%^ which is
really unlikely to be in an email. E.g.,
CI /__/^%^/
Where there is an odd number of underlines in a row, that will leave a
single underline after the ^%^, so you need to allow for that by
substituting a different unique string--say: ,*&
CI /^%^_/^%^,*&/
3. Now you are ready to test for balance. The simple way is to see if
there's an even or odd (unbalanced) total:
CI /_/_/
When you replace a string with itself, as above, the prompt line still
shows the number of replacements. If it's an odd number, there's an
imbalance somewhere (and I don't know how to find it except by testing the
first half of the file, then . . .)
4. Now do the previously suggested CIs to change each single underline to
the italic-on and italic-off codes.
5. Finally, undo the unique strings:
CI /^%^/__/ <=== put back all the even-number of underlines
CI /,*&/_/ <=== put back any odd remainder underlines
I haven't tested these CIs but even if I've got an error in there, you get
the idea.
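
For anyone who wants to try the logic outside XyWrite, here is a rough Python sketch of steps 2 through 5. It is not the XPL routine and has the same caveat as the CIs above. The placeholder strings are the ones suggested in the steps; the IT_ON and IT_OFF markers are hypothetical stand-ins for whatever italic codes you use, and step 4 is assumed to simply alternate on/off, which may differ from the previously suggested CIs.

```python
# Placeholders from the steps above. IT_ON / IT_OFF are hypothetical
# stand-ins for the real italic-on and italic-off codes.
PAIR = "^%^"   # stands for two adjacent underscores
ODD = ",*&"    # stands for the leftover underscore of an odd-length run
IT_ON, IT_OFF = "<i>", "</i>"


def protect_runs(text):
    """Step 2: hide runs of two or more underscores behind placeholders."""
    text = text.replace("__", PAIR)
    # After the first replace, an underscore right after PAIR can only be
    # the odd remainder of a run.
    return text.replace(PAIR + "_", PAIR + ODD)


def is_balanced(text):
    """Step 3: the remaining single underscores must come to an even total."""
    return text.count("_") % 2 == 0


def convert_singles(text):
    """Step 4 (assumed): alternate remaining underscores between on and off."""
    parts = text.split("_")
    out = [parts[0]]
    for i, part in enumerate(parts[1:]):
        out.append((IT_ON if i % 2 == 0 else IT_OFF) + part)
    return "".join(out)


def restore_runs(text):
    """Step 5: turn the placeholders back into underscores."""
    return text.replace(PAIR, "__").replace(ODD, "_")


def convert(text):
    text = protect_runs(text)
    if not is_balanced(text):
        raise ValueError("odd number of single underscores")
    return restore_runs(convert_singles(text))
```

Step 1 (looking at each double underscore and deciding whether it is an error) is left out on purpose, since it needs a person.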
It would be child's play to put all these CIs into an XPL routine, but I
think the first step requires conscious attention and judgment.
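
On finding where the imbalance in step 3 is: one alternative to halving the file is to count single underscores line by line. The Python sketch below does that. It assumes an italic phrase never wraps across a line break, which will not always hold in email, so treat the reported lines as places to look rather than as certain errors. The function name is made up for the example.

```python
import re

# A single underscore: one with no underscore immediately before or after it.
SINGLE = re.compile(r"(?<!_)_(?!_)")


def unbalanced_lines(text):
    """Return 1-based numbers of lines with an odd count of single underscores."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(SINGLE.findall(line)) % 2 == 1
    ]
```

Runs of two or more underscores are ignored by the pattern, so this can be run before or after step 2.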
Yeah, most pairs probably would be balanced, but every now and then they
wouldn't (or you might get two underscores in a row); you'd also need to
watch out for underscore characters in email and web addresses.
So maybe a program that focused only on "[S]_[S]" strings first and asked
for input, then autoconverted underscores based on [S]_ and _[S], and then
checked for broken IT-ON IT-OFF pairs would mostly work.
Such a system would still mistake a string of underscores used to
represent a blank or something else -- google offered up 147,000,000
examples of "__", so I'm sure it'd occur in an email.
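
A rough Python sketch of the classification idea above, reading [S] as whitespace: each underscore is labeled by what sits on either side of it. The labels ("on", "off", "ask", "inner", "run") and the function name are invented for the example, and nothing here is tested against real mail.

```python
def classify_underscores(text):
    """Return (index, kind) for every underscore in text.

    kind is one of:
      "run"   - part of two or more adjacent underscores
      "ask"   - whitespace on both sides, needs a human decision
      "on"    - whitespace before only, likely italics on
      "off"   - whitespace after only, likely italics off
      "inner" - no whitespace on either side, e.g. inside an address
    """
    results = []
    for i, ch in enumerate(text):
        if ch != "_":
            continue
        # Treat the start and end of the text as whitespace.
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        if before == "_" or after == "_":
            kind = "run"
        elif before.isspace() and after.isspace():
            kind = "ask"
        elif before.isspace():
            kind = "on"
        elif after.isspace():
            kind = "off"
        else:
            kind = "inner"
        results.append((i, kind))
    return results
```

A closing underscore followed by punctuation (as in "_really_,") comes out as "inner" under this strict reading, so a usable version would need a looser rule for what counts as [S].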
Paul Lagasse
Harry Binswanger
hb@xxxxxxxx