Re: Cleaning up html
- Subject: Re: Cleaning up html
- From: Jay McNally jmcnally@xxxxxxxx
- Date: Tue, 04 Jun 2002 11:10:05 -0400
Thanks Nicholas.
Whenever I can I also copy and paste into XyWrite, and this works, as you
said, like a charm. But some web sites put protection of some kind on
the file that prevents copying and pasting. The Detroit News (detnews.com)
is particularly difficult in this regard.
The only way to get the text, from what I can figure out, is to open the
htm file in Xy and strip out the html tags.
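For anyone doing this outside XyWrite, here is a minimal sketch of the same idea (take the saved htm file and keep only the text) using Python's standard html.parser module. The names TextExtractor and html_to_text are made up for illustration, and skipping script and style blocks is my own assumption about what counts as junk; this is not anything XyWrite does itself.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of an HTML document, dropping the tags themselves."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep text only when it is not script or stylesheet content
        if not self._skip:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

To use it on a saved page, read the file into a string first, for example with open("article.htm", encoding="latin-1").read(), where both the file name and the encoding are placeholders, and pass the result to html_to_text.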
Some newspapers do not allow you to copy and paste.
At 10:59 AM 6/4/02 -0400, you wrote:
Your solution is far more ingenious than mine, but I do this all the time
with downloaded newspaper and journal articles. I simply call up the saved
htm file in my browser, select what I want, copy it to the Windows
clipboard with Control-C, open a file in XyWrite, hit Alt-Enter to show
XyWrite in that little box (windows box?) and then use the paste command
on the box. In goes all the text. Works like a charm, usually.
For some reason, though I save most of the stuff through my Netscape
browser, when I call it up to copy and paste it to XyWrite, I use Mr.
Gates's Internet Explorer, and that seems to work better.
Nicholas Clifford
clifford@xxxxxxxx
PS It would be nice if Windows programs like NBWin or Word for Windows
could translate html files. They say they can, but in fact they can't, and
all the html junk appears in the word processing file. At least it does for me.
Jay McNally wrote:
Can anyone offer me some advice for this problem?
I often need to take text from a web document that has been saved in html.
My somewhat tedious but simple process for some years has been to loop
an XPL routine that defines then deletes everything from the first "less
than" bracket to the next "greater than" bracket. I then manually clean
up the rest of the junk. It works. . . .
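The loop described above can be sketched roughly as follows. This is Python rather than XPL, the function name strip_tags is hypothetical, and it is not the actual routine from the post: it just finds each "less than" bracket, finds the next "greater than" bracket, and drops everything between them, repeating until no brackets are left.

```python
def strip_tags(text: str) -> str:
    """Delete every span running from a "<" through the next ">"."""
    kept = []
    pos = 0
    while True:
        start = text.find("<", pos)
        if start == -1:
            # No more opening brackets: keep the rest of the text
            kept.append(text[pos:])
            break
        end = text.find(">", start)
        if end == -1:
            # Opening bracket with no closing one: leave the remainder alone
            kept.append(text[pos:])
            break
        kept.append(text[pos:start])  # text before the tag
        pos = end + 1                 # resume just past the closing bracket
    return "".join(kept)


sample = "<p>Hello, <b>world</b>!</p>"
print(strip_tags(sample))  # Hello, world!
```

As with the original routine, this leaves entities such as &amp; and any script or stylesheet text in place, so some manual cleanup is still needed afterward.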