
Re: converting v4.017 diacritics



> No. Vaguely possible it's an 850 code page (rather than the standard US
> English 437) but I doubt it. The letters mentioned -- àèâôäë -- do all seem
> to come out OK in Windows screens. They're ASCII 133, 138, 131, 147, 132, 137
> respectively. 

> Nope. Not so. 

I disagree.

My reading of his remarks was that he must be looking at chars>255.
If they were 133, 138, 131 etc, he could see them. But each char is rendered as
"a distinct string of gibberish", which I take to mean 'more than one character
long', e.g. -- just a guess -- 3 characters long! Moreover he adverts *first* to
the "smiley faces" and "boxes" that appear in those strings; (almost) every
character over 511 has a smiley face in it, either Ascii-1 or Ascii-2, while every
character 256-511 has a box (Ascii-254) as part of its 3-byte string! Ergo he is
unquestionably staring at high-order chars -- created for use
with Speedos, to be specific. But he calls them "a grave, e grave, a circumflex"
etc! Now, the ONLY way anyone in their right mind is going to put a char like
a-circumflex as a 3-byte string (char 785) instead of as an ordinary 1-byte string
(char 131) is by using the ACcent table; and to load the XyWrite ;AC; ACcent table,
the CodePage MUST be 850! That's a fact, friends. You set the CodePage by a
command to the OS, e.g. (in OS/2's CONFIG.SYS) "CODEPAGE=850,437", or with the
undocumented VAriable LAnguage: "df la=850", or "d la=850" (which overrides the
global system setting). You poll the codepage with VA$CP (system setting, i.e.
at the OS level) or with VALA (local setting; note how it toggles between
437 in eXPanded mode, and 850 in formatted+ mode -- ever wondered why XyWrite's
display of chars>255 changes between these two modes?? there's your answer).
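
For anyone who wants to check the 1-byte side of this outside XyWrite, here is
a minimal sketch in Python (my choice of tool, nothing to do with XyWrite
itself; the function name decode_in is just illustrative). It decodes the six
values quoted at the top under both code pages, using Python's stock cp437 and
cp850 codecs. Both give the same six letters, which is why 1-byte chars would
be readable either way.

```python
# The six 1-byte values quoted earlier in this thread.
VALUES = [133, 138, 131, 147, 132, 137]

def decode_in(codepage, values):
    """Decode a list of byte values using the named Python codec."""
    return bytes(values).decode(codepage)

if __name__ == "__main__":
    # cp437 and cp850 are Python's names for code pages 437 and 850.
    for cp in ("cp437", "cp850"):
        print(cp, decode_in(cp, VALUES))
```

This says nothing about the 3-byte strings; it only covers the ordinary
chars below 256.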

What Roche has to do, IMO, is acquire and launch v4.017, load the DOC,
replace the high chars with whatever code his native WP requires to display those
same chars, then use W-4-W to get the thus-modified DOC into a format that his WP
can read. Finito. The only real rub for him is that he may not have ready access
to the specific Speedo that his author used during composition; the charsets are not
100% uniform from one font to another; he could find himself unable to read
all of the high-order chars accurately & therefore unable to translate
a few of them. But in the main he'll manage, by looking at the
context...

Another solution for Roche would be to tell us what Font the author used, i.e.
check the  statement, then make a list of all these 3-byte strings, and
anyone here can tell him what he's looking at.

Even EASIER would be to send him a copy of file CHARSET, as distributed with
Xy4, which lists the chars>255 (or, if you don't view CHARSET in XyWrite, the raw
strings) and their equivalent meanings in plain English. Let Roche dope the
strings out and do an S&R in his native WP. There's nothing mysterious about those
strings, they're simple and very orderly. I think that's what I'll do, in fact...
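
If the S&R has to be done outside a word processor, it could be sketched
roughly like this in Python. This is only an illustration under assumptions:
the function name is mine, and the table entry shown is a made-up placeholder,
NOT a real XyWrite string. The real raw strings and their meanings would have
to be copied out of CHARSET.

```python
import re

def replace_sequences(data, table):
    """Replace every raw byte string found in `table` with its mapped bytes.

    `table` maps raw byte strings (keys) to replacement bytes (values).
    All replacements happen in a single pass, so the output of one
    replacement is never re-scanned by another.
    """
    if not table:
        return data
    # Try longer keys first so a short key cannot match inside a longer one.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(b"|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], data)

# HYPOTHETICAL placeholder entry, for illustration only. The three bytes on
# the left are invented; fill the table from CHARSET before using it.
EXAMPLE_TABLE = {b"\xfeAB": b"\x83"}

if __name__ == "__main__":
    print(replace_sequences(b"x\xfeABy", EXAMPLE_TABLE))
```

The replacement values would be whatever code the target WP needs for each
char, so the right-hand side of the table is just as much an assumption as
the left.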


-----------
Robert Holmgren
holmgren@xxxxxxxx
-----------