Re: XYWRITE digest 1043





XyWriters-
I probably spend more time playing with and dealing with diacriticals than anybody I know. I work as a consultant to the International Olympic Committee, databasing all the Olympic results, and the name spellings in the various languages contain literally hundreds of different accent marks. I work primarily in XyWrite because the original text files have been coded in that format since about 1985. But the IOC and many other places need the material in MS Word or similar formats (Excel especially) to place on their web sites. I've developed numerous ways to handle the problem, none of them simple.
First of all, it is not proper anymore to simply use "oe" etc. to replace
German umlauts. That is why we have computers and word processors: so that
words can now be spelled correctly across various languages. Hungarian has
many unusual accent marks, and Hungarians get livid if they are omitted,
because dropping them changes the meaning of many words, similar to the
Miami Herald story about dropping the tildes.
Secondly, the Western European accent marks are not a major problem - the
acutes, graves, umlauts, circumflexes, etc. The extended ASCII character
set includes these and they show up on the screen correctly. Converting to
MS Word via WordPort converts them correctly as well. If you choose to
convert manually as a text file, this does not work, but one can write a
simple XPL program (I have several) that converts the diacriticals so that
they show up correctly when loaded into HTML.
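For anyone without XPL handy, the same idea can be sketched in Python. This
is only an illustration, not Bill's XPL program: it assumes the text file
uses the IBM PC code page 437 (a XyWrite file may use a different character
set), and the function name dos_text_to_html is made up for this example.
It turns every accented letter into a numeric HTML character reference.

```python
import html

def dos_text_to_html(raw: bytes, codepage: str = "cp437") -> str:
    """Convert DOS-encoded text to ASCII-only HTML text.

    Assumption: the input bytes are in code page 437; pass another
    codec name if your files use a different character set.
    """
    # Decode the 8-bit DOS bytes into Unicode characters.
    text = raw.decode(codepage)
    # Escape &, < and > so the result is safe inside an HTML page.
    escaped = html.escape(text, quote=False)
    # Replace every non-ASCII character with a numeric reference (&#NNN;).
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

# In code page 437, byte 0x81 is a lowercase u with umlaut.
print(dos_text_to_html(b"M\x81nchen"))
```

Numeric references are used here because they survive any later change of
encoding on the web server; named entities would work just as well.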
A much bigger problem is Eastern European accents such as the hacek
(inverted v over a letter). They have no ASCII equivalent but in the late
80s I wrote programs so that XyWrite can recognize them and print them
correctly - but they are not WYSIWYG on the screen. I have learned to
recognize my arcane character codes for them.
Again these are handled in conversion by an XPL program to convert XyWrite
to MS Word diacriticals. It works somewhat like the old Adobe
Word-for-Word conversions. Each character first has to be converted to a
coded format; the codes all look like #s&v#. The pair of # signs marks the
sequence as a diacritical. The first letter, "s", is the base, and the
second, "v", is the diacritical mark. These get converted to MS Word or
Excel or FoxPro and
then I have programs in them to convert #s&v# into the proper symbol - an s
with a hacek. That is not easy either, because Microsucks has not provided
the ability to do a replace with these unusual accent marks, so I had to
develop a strange macro that calls them from a separate file. Fortunately
it works.
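The decoding step described above (turning #s&v# into an s with a hacek)
might look roughly like this in Python. This is a sketch under stated
assumptions, not the Word macro or the XPL program: only the "v" code for
the hacek comes from this message, the other two mark codes in the table
are hypothetical placeholders, and the names MARKS, CODE and decode_marks
are invented for the example.

```python
import re
import unicodedata

# Map from mark code to Unicode combining character.
MARKS = {
    "v": "\u030C",  # hacek (caron); the only code taken from the message
    "a": "\u0301",  # HYPOTHETICAL code for an acute accent
    "u": "\u0308",  # HYPOTHETICAL code for an umlaut
}

# One coded character: '#', base letter, '&', mark code, '#'.
CODE = re.compile(r"#(\w)&(.)#")

def decode_marks(text: str) -> str:
    """Replace each #base&mark# code with the accented Unicode letter."""
    def repl(match: re.Match) -> str:
        base, mark = match.group(1), match.group(2)
        combining = MARKS.get(mark)
        if combining is None:
            # Unknown mark code: leave the sequence untouched.
            return match.group(0)
        # NFC composes base + combining mark into one character if it exists.
        return unicodedata.normalize("NFC", base + combining)
    return CODE.sub(repl, text)

print(decode_marks("Dvo#r&v#ak and #s&v#"))
```

Leaving unknown codes untouched makes it easy to search the output for
leftover # sequences and extend the table one mark at a time.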
Another possible way to handle diacriticals is to use the Character Map in
MS Office Accessories and cut/paste them into your e-mail document. That
is actually what I do a lot when writing a simple e-mail message.
I would provide my XPL programs for the group but they tend to be very
specific to my set of coding. I use 7LJ3-7J.PRN for my printer file to
recognize these characters, and the character set it uses is slightly
different from a standard XyWrite character set. I chose it because it has
the capability to make more diacriticals than any other XyWrite character
set. If you're not using this *.PRN file, and you're likely not, my XPL
programs would not work for you.
Hopefully, someday much of this will be easier when Unicode becomes more
standard, as Nathan Sivin has lamented before. His concern is Chinese
character sets which get even more complicated. But the current problem
with Unicode is the cut and paste function in MS Word - it does not
properly recognize Unicode yet, although one can enter Unicode characters
directly into the document. That makes conversion between programs fairly
difficult.
Sorry to go on, but this problem occupies a large part of my life. Hope
this has helped somebody.


Bill