Re: XY search question



"Yo Intl. YK" wrote:

> >Because there is a danger in this process that you will lose the markers the
> >writer intended to establish paragraphs, this program first changes double
> >hard returns to the string "xxx" (one presumes the text does not have
> >"xxx" in it). It strips out all hard returns and later puts the two
> >hard returns back in place of the "xxx".
>
> Yes, that would work nicely for files that are written in a consistent style.
> However, I often get files where intended paragraphs are only one return, and
> there are multiple returns where none should be (try sending an Adobe file to
> pdf2text.com, and you will see).
> So, I wondered if I can simply tell XY to ignore the returns. Not possible?
>
> -- Rene von Rentzell, Tokyo (rrr @ twics.com)
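
For reference, the placeholder approach quoted above can be sketched outside
XyWrite. This is a minimal Python illustration, not the XPL program under
discussion. The function name unwrap_paragraphs is hypothetical, and replacing
each single return with a space (rather than deleting it outright) is an
assumption made so that words on adjacent lines do not run together.

```python
PLACEHOLDER = "xxx"  # assumed not to occur in the text, as the quoted post presumes


def unwrap_paragraphs(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Join hard-wrapped lines while keeping double-return paragraph breaks."""
    if placeholder in text:
        # The trick only works if the marker string is absent from the input.
        raise ValueError("placeholder already occurs in the text")
    # Normalize CR/LF line endings so a "hard return" is a single newline.
    text = text.replace("\r\n", "\n")
    # Step 1: protect paragraph breaks (double returns) with the placeholder.
    text = text.replace("\n\n", placeholder)
    # Step 2: remove the remaining single returns (assumption: use a space).
    text = text.replace("\n", " ")
    # Step 3: put the double returns back where the placeholder stands.
    return text.replace(placeholder, "\n\n")


sample = "line one\nline two\n\nnext para\ncontinues"
print(unwrap_paragraphs(sample))
```

As the reply above points out, this only helps when the file uses double
returns consistently for paragraphs; it does nothing for files where a single
return marks a paragraph or where stray returns appear mid-sentence.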

I could be mistaken here, but -- at least as far as the inconsistent Returns
go -- this might overlap with CLEANUP, a _test_ .U2 frame that Carl created
at my request a while back. In my experience it was of rather limited
use on the ragged email and text files I tried it on. Some
files showed improvement after passing through this "filter"; many others
did not. I still think that what this routine does could be further
refined. If I can come up with any concrete suggestions (and if Carl is willing
to revisit the XPL), that is something I would like to pursue. I think it
is a matter of better defining the problem -- with a number of appropriate
textual examples -- to clarify the rules of the routine; coding it is probably
not the major stumbling block.

You can find this frame in the archives, or I can forward it to you, so you
can see whether it does anything for your situation. Maybe not, as I don't
recall it dealing with indents. No one else really had much comment back when
it was posted here. Other suggestions would have been useful.

Jordan