
massive search and replace failures



I'm using XyWrite 4 for DOS to post-process data dumps from MS Access tables
into XML files. The biggest part of the job involves translating what were
ANSI accented characters in Windows (but viewed, of course, as ASCII in DOS)
to the corresponding ISO entities ("&eacute;" etc.). To compound matters,
these conversions are performed only within CDATA tags
(<![CDATA[ ... ]]>) and not on the rest of the data in the XML files (there,
the ANSI accents are stripped to plain lower-128 ASCII). So my routines
define blocks and run sequences of cha/s search and replace
operations.
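
For anyone who wants to cross-check the XyWrite output, here is a minimal
sketch of the same two-way conversion in Python rather than XPL. It is not
my actual routine: the function names are made up for illustration, and it
assumes the file has already been decoded to text (for Windows ANSI dumps
that would probably mean cp1252).

```python
import re
import unicodedata
from html.entities import codepoint2name

# Capturing group so re.split() keeps the CDATA sections in its result.
CDATA_RE = re.compile(r"(<!\[CDATA\[.*?\]\]>)", re.DOTALL)

def to_entities(text):
    """Replace each non-ASCII character that has a named entity."""
    return "".join(
        f"&{codepoint2name[ord(ch)]};"
        if ord(ch) > 127 and ord(ch) in codepoint2name
        else ch
        for ch in text
    )

def strip_accents(text):
    """Reduce accented characters to plain lower-128 ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

def convert(xml_text):
    """Entities inside CDATA sections, plain ASCII everywhere else."""
    parts = CDATA_RE.split(xml_text)
    # With one capturing group, odd-indexed parts are the CDATA sections.
    return "".join(
        to_entities(part) if i % 2 else strip_accents(part)
        for i, part in enumerate(parts)
    )
```

Calling convert() on a whole file's text handles every CDATA section in one
pass, so no character depends on an earlier replacement in the same block.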

They seem to be 99% effective, but every once in a while they skip a
character that should have been translated, usually after another character
has been translated within the same defined block. Has anyone else
experienced similar hiccups on massive search and replace sequences? Is
there a known cure?

I have a parser that detects the misses in the XML files, so I can fix them,
but the repairs are slowing me down. I'm dealing with big files, too, and
the error rate seems to increase toward the end: should I be making the
files smaller? (I remember XyWrite 3 used to have problems when files
exceeded a certain size, but I thought XyWrite 4 fixed that.) To speed
processing, I reduced df wa to 1; is that too low?
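
To show the kind of check I mean by detecting misses, here is a rough
Python sketch. It is not my actual parser, and find_misses is a
hypothetical name; it just reports any character above 127 that survived
the conversion, assuming the converted file is supposed to be pure ASCII.

```python
def find_misses(text):
    """Return (line, column, character) for every non-ASCII character left."""
    return [
        (lineno, col, ch)
        for lineno, line in enumerate(text.splitlines(), 1)
        for col, ch in enumerate(line, 1)
        if ord(ch) > 127
    ]
```

Line and column numbers are 1-based, so the results can be used to jump
straight to each skipped character.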

--Chuck Creesy
Princeton University Press