Duplicates Pattern Search



There was a post here within the last couple of weeks suggesting the
use of a .U2 frame called REP or Repeats. This sparked a query I
had in mind, but when I later went to find that message it was
gone, perhaps deleted here by accident.

I have a number of large Bookmark files -- each up to around 350K
in size. Unavoidably, there gets to be a certain amount of
unintended duplication, even within the same file. So, I was
wondering if there might not be some .U2 function that could
parse a large file like this and spit out a list of duplicated or
very-near-duplicated URLs? (Dates, descriptions, etc. around
the URL are probably irrelevant.) I guess this amounts to a
search where there is no search string that is known or specified
in advance -- maybe too tall an order for XPL?
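
To make the task concrete, here is a rough sketch of the kind of
scan I have in mind, written in Python rather than XPL. It is not
a .U2 frame. Everything in it is an assumption for illustration:
the names find_duplicates, normalize and report are made up, URLs
are assumed to appear as plain text beginning with http://,
https:// or ftp://, and "very-near-duplicate" is taken to mean
"the same once case, scheme, a leading www., any #fragment, and
trailing slashes or punctuation are ignored".

```python
import re
from collections import defaultdict

# Assumption: URLs appear as plain text starting with a known scheme.
URL_RE = re.compile(r"""(?:https?|ftp)://[^\s"'<>]+""", re.IGNORECASE)


def normalize(url):
    """Reduce a URL to a comparison key (hypothetical near-duplicate rules)."""
    key = url.strip().lower()               # ignore case
    key = re.sub(r"^[a-z]+://", "", key)    # ignore the scheme
    key = re.sub(r"^www\.", "", key)        # ignore a leading www.
    key = key.split("#", 1)[0]              # ignore any #fragment
    return key.rstrip("/.,;)")              # ignore trailing slash/punctuation


def find_duplicates(text):
    """Return {key: [(line_number, url), ...]} for keys seen more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        for url in URL_RE.findall(line):
            seen[normalize(url)].append((lineno, url))
    return {key: hits for key, hits in seen.items() if len(hits) > 1}


def report(text):
    """Format the duplicates as a plain-text list for later reference."""
    lines = []
    for key, hits in sorted(find_duplicates(text).items()):
        lines.append(key)
        for lineno, url in hits:
            lines.append(f"    line {lineno}: {url}")
    return "\n".join(lines)
```

Reading a Bookmark file into a string and passing it to report()
would give a list that could be saved to disk. No search string is
needed in advance because every URL found becomes its own lookup
key. Lowercasing the whole URL may lump together addresses that
differ only by case in the path, so the rules in normalize would
need tuning to taste.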

I'm pretty sure there must be some small utilities floating
around that do this. I still have occasion to use an old DOS
util., REPEATS.COM (there are others, like FINDDUPE, which may or
may not be DOS), that performs a vaguely similar function,
searching umpteen directories for duplicated files. PMSEEK, the
Finder built into OS/2, also does this, but I don't think it
generates a report file for later reference.

Jordan