
Re: XY search question





Wow! This will be very, very helpful.



At 09:05 PM 12/17/01 -0500, you wrote:
Reply to note from "Yo Intl. YK" Mon, 17 Dec 2001 12:49:57 +0900

> sometimes I have to search for text in messy files, ... I often
> know the text string is there, but I do not know where it might
> be interrupted by some stupid hard return.

So the question, as I understand it, is this: can you search for a phrase, say "Happy New Year", in one of these messy files? It might be sloppily typed as "Happy   New   Year" (with more than one space between words), or it might straddle two lines, like

   "Happy
   New Year"

(with a carriage return after "Happy"), or the end of the line might be padded out with spaces, like "Happy[space][space][Cr] New Year", or it might be written idiosyncratically as "Happy, New- Year", or ... what have you. In other words, can you SEarch for a phrase even if the words are separated by an indeterminate number of separators, not necessarily spaces?

The answer (surprise!) is Yes. You'll need to formulate your SEarch statement using, instead of literal spaces, the separator wildcard (which looks like a reverse-video uppercase S) preceded by an appropriate *numeric* wildcard (which looks like a reverse-video number). Thus, instead of commanding

   SE "Happy New Year"

you'd do, for example,

   SE "Happy[5][S]New[5][S]Year"

If you load XY4.KBD, [S] is the wildcard produced by Alt-Shift-S and [5] is the wildcard produced by Alt-Shift-5.

What this command says, in plain English, is to find the words "Happy New Year" separated by anywhere from 1 to 5 separators (including spaces, punctuation and carriage returns). It finds all of the variants described above. The "messier" the text, the higher the number should be.
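
For readers without XyWrite at hand, the same idea can be roughly sketched with a Python regular expression. This is only an analogy, not XyWrite's own matching engine: the helper name `separator_search` is hypothetical, and treating "separator" as "any character that is not an ASCII letter or digit" is an assumption made for the sketch.

```python
import re

def separator_search(words, max_separators=5):
    """Compile a regex that finds `words` in order, with between 1 and
    `max_separators` separator characters between consecutive words.

    Assumption: a "separator" is any character that is not an ASCII letter
    or digit (spaces, punctuation, line breaks). XyWrite's own definition
    of its separator wildcard may differ.
    """
    gap = r"[^A-Za-z0-9]{1,%d}" % max_separators
    return re.compile(gap.join(re.escape(w) for w in words))

# Rough counterpart of: SE "Happy[5][S]New[5][S]Year"
pattern = separator_search(["Happy", "New", "Year"], max_separators=5)

samples = [
    "Happy New Year",        # clean
    "Happy   New   Year",    # extra spaces
    "Happy\nNew Year",       # straddles two lines
    "Happy  \n New Year",    # padded line end, then a line break
    "Happy, New- Year",      # punctuation between words
]
for text in samples:
    print(repr(text), bool(pattern.search(text)))
```

Raising `max_separators` plays the same role as raising the numeric wildcard in the SE command.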
You can go as high as 999 (three [9] wildcards in a row), like this:

   SE "Happy[9][9][9][S]New[9][9][9][S]Year"

If you're dealing with someone who's not only a slob but a nincompoop -- someone who might write, say, "Ha3pppy Neuw Yaer" -- you can broaden the search by using [L] (any letter), [A] (any alphanumeric character) or [X] (any character) instead of, or in conjunction with, [S]. The ridiculous locution quoted above is flagged, for example, by:

   SE "h[1][0][X]n[1][0][X]y[3][A]r"

Keep your command as simple as possible. Don't put two numeric wildcard expressions in a row; separate them by at least one literal character. Thus, if you want to find "Ha3pppy Neuw", a command like

   SE "h[7][A][5][S]n"

will fail; whereas

   SE "h[7][A] [5][S]n"

(with an intermediate literal space) succeeds.

There's no need for any preliminary "cleanup" of messy files when you have such powerful search tools at your disposal.

-- 
Carl Distefano
cld@xxxxxxxx
http://users.datarealm.com/xywwweb/
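
As a footnote to the broader [L]/[A]/[X] search in the message above, here is a rough Python regex sketch of the same kind of pattern. It is an approximation under stated assumptions, not a description of how XyWrite works internally: the `WILDCARDS` table and the `counted` helper are hypothetical names, the character classes are guesses at what each wildcard covers, the "1 to n" range follows the wording of the message, and case-insensitive matching is assumed because the lowercase "h" in the SE command is said to find "Ha3pppy".

```python
import re

# Assumed regex stand-ins for the wildcards named in the message.
WILDCARDS = {
    "S": r"[^A-Za-z0-9]",  # separator (assumption: not a letter or digit)
    "L": r"[A-Za-z]",      # any letter
    "A": r"[A-Za-z0-9]",   # any alphanumeric character
    "X": r"[\s\S]",        # any character, line breaks included
}

def counted(kind, maximum):
    """Return a regex fragment for 1..maximum characters of the given kind."""
    return "%s{1,%d}" % (WILDCARDS[kind], maximum)

# Rough counterpart of: SE "h[1][0][X]n[1][0][X]y[3][A]r"
fuzzy = re.compile(
    "h" + counted("X", 10) + "n" + counted("X", 10) + "y" + counted("A", 3) + "r",
    re.IGNORECASE,  # assumption: the search ignores case
)

for text in ["Ha3pppy Neuw Yaer", "Happy New Year"]:
    print(repr(text), bool(fuzzy.search(text)))
```

One difference worth noting: a regex engine has no trouble with two counted expressions in a row, so the advice about separating numeric wildcards with a literal character applies to XyWrite, not to this sketch.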