Sometimes you need to search for something more complex than a string of characters. For example:
text that can have several variant spellings
text only when it occurs on a line WITHOUT other specified text
text only in the form of a complete word
text patterns in which some of the characters may vary
Extended Search Patterns allow you to make these and other types of complex searches.
Extended Search Pattern Characters
Extended Search Patterns assign special meanings to the characters shown below. These meanings only take effect if the search string begins with a backslash (\) character -- otherwise every character in a search string represents itself. (There is one exception to the preceding sentence for the entity name operand of commands that operate on groups of entities. See the FILE operand of the FileLibrary command for an example of this exception. The FileLibrary command accepts ? and * in the FILE operand and does not accept a preceding backslash in the entity name.)
In an Extended Search Pattern:
? (single character wildcard) matches any single character.
'\A?M' matches 'ABM' or 'ASM'.
* (many character wildcard) matches any sequence of zero or more characters.
'\I A*M' matches 'I AM' or 'I ARM' or 'I ANTIDISESTABLISHMENTARIANISM'.
@ (repeated character wildcard) matches zero or more instances of the character that follows it; also suppresses any special function of that character.
'\X@-Y' matches 'XY', 'X-Y', or 'X-----------Y'.
< (begin word) matches the start of the search area or any non-alphanumeric character.
'\<HAT>' finds a match in 'THE OLD FELT HAT', but does not find one in 'THAT MADMAN'.
> (end word) matches the end of the search area or any non-alphanumeric character.
'\<HAT>' finds a match in 'THE OLD FELT HAT', but does not find one in 'THE MAD HATTER'.
! (begin area) matches the start of the search area.
'\!XAS' will not match 'TEXAS'.
+ (AND) combines two sub-patterns BOTH of which must match.
'\SAINT+LOUIS' matches 'LOUIS, THE PATRON SAINT OF PROGRAMMERS' but does not match 'ST. LOUIS' or 'SAINT LOUIE'.
| (OR) combines two sub-patterns EITHER of which may match.
'\<SC>|<TN>' matches both 'GREENVILLE, SC' and 'KNOXVILLE, TN'.
~ (NOT) reverses the match failure/success of the sub-pattern which follows.
'\~JONES' will match any search area that does not contain 'JONES'.
\ (suppress) indicates the character which follows it is NOT to be treated as an Extended Search Pattern character.
'\X*\+ Y' matches 'X + Y' and 'XANADU, JOHNSON + YOLANDA'. Without the \, it would have matched any search area that contained both 'X' and ' Y', taking into account the special meaning of +.
% (position marker) indicates the position in the search area at which the match is considered to have taken place. For commands such as Locate, this is where the cursor will be positioned after a successful search. For the EditChange command, this is where replacement of text will begin. A second % may be used to indicate the end of the portion of the search pattern which is to be replaced.
The meaning of "search area" in the preceding descriptions varies according to context. In EditChange, ExcludePattern, IntegratePattern, Locate, LocateUp, Qualify, SelectPattern, and the STR operand of the FileScan command, "search area" means the ZONE column range of a session line. In Find, FindFirst, FindUp, NotFind, and NotFindUp, "search area" means the columns from COL to the end of a session line. In FileLibrary, DirLibrary, UserLibrary, and FileScan (other than the STR operand), "search area" means the value of the variable or attribute referenced.
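The character meanings listed above can be approximated with ordinary regular expressions. The following Python sketch is illustrative only and is not part of the product: the names to_regex and matches are hypothetical, and it assumes that < and > behave like look-behind and look-ahead checks for a non-alphanumeric character, which is an interpretation of the descriptions above. It covers ? * @ < > ! and \ and leaves out + | ~ and %.

```python
import re

ALNUM = "A-Za-z0-9"

def to_regex(pattern: str) -> str:
    """Translate a subset of an Extended Search Pattern into a Python regex."""
    if not pattern.startswith("\\"):
        # No leading backslash: every character represents itself.
        return re.escape(pattern)
    out = []
    i = 1  # skip the leading backslash
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            out.append(".")            # any single character
        elif ch == "*":
            out.append(".*")           # zero or more characters
        elif ch in "@\\" and i + 1 < len(pattern):
            i += 1                     # the next character loses any special meaning
            literal = re.escape(pattern[i])
            out.append(literal + "*" if ch == "@" else literal)
        elif ch == "<":
            out.append(f"(?<![{ALNUM}])")  # start of area or after a non-alphanumeric
        elif ch == ">":
            out.append(f"(?![{ALNUM}])")   # end of area or before a non-alphanumeric
        elif ch == "!":
            out.append(r"\A")          # start of the search area
        else:
            out.append(re.escape(ch))
        i += 1
    return "".join(out)

def matches(pattern: str, area: str) -> bool:
    """Report whether the pattern is found anywhere in the search area."""
    return re.search(to_regex(pattern), area) is not None

print(matches(r"\<HAT>", "THE OLD FELT HAT"))
print(matches(r"\<HAT>", "THE MAD HATTER"))
```

The plain regular expression * used for @ can give characters back during matching, so this sketch does not reproduce every detail of @.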
Using Extended Search Pattern to Search for "Words"
Sometimes, it is useful to find instances of a particular word but not find instances that are contained in other words. The < and > characters can be used in an Extended Search Pattern to represent a non-alphanumeric character which begins or ends a word, or the start or end of the search area.
For example, to position to the next instance of the word 'we', but not words that contain it like 'were' or 'tower', you could enter
=> locate \<we>
To display all instances of variable names in a program which start with 'N' but not any of the other variable names that contain 'N', you could enter
=> qualify \<N
Because they function by matching non-alphanumeric characters and then moving the search pointer forward or backward, < and > may appear to behave oddly if they are not at the beginning and end of the search string. For example, the search string '\ABC<DE>F' will never match anything and the search string '\>#<' will match a single # anywhere.
There are several ways to match multiple words, depending on what function you need. For example:
Two words on the same line, in any order:
=> locate '\<major>+<standards>'
Two consecutive words with one blank between them:
=> locate '\<major standards>'
Two words in order, any characters between:
=> locate '\<major>*<standards>'
Two consecutive words, any number of blanks between:
=> locate '\<major@ standards>'
Of course, for a multiple word match to succeed, the words must all appear on a single line, not be split across lines.
Using Extended Search Pattern "Wildcards"
Extended Search Patterns provide several ways to indicate unknown characters. ? represents one unknown character, * represents zero or more unknown characters, and @ indicates zero or more repeats of the character which follows it. These can be combined for various effects:
'\A?*B' 'A' separated from 'B' by at least one character.
'\A????*B' 'A' separated from 'B' by at least four characters.
'\A\!@!B' 'A' separated from 'B' by one or more ! characters. ('\A@!\!B' will never match anything because @! consumes all repeats of !).
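The remark that '\A@!\!B' never matches suggests that @ keeps every repeat it has consumed. The Python sketch below illustrates that behavior under this assumption; it is an analogy, not the product's implementation. Python's standard * quantifier gives characters back, so the sketch uses the look-ahead capture idiom (?=(!*))\1 to take the whole run of ! characters at once.

```python
import re

# '\A\!@!B': one literal '!', then zero or more '!', then 'B'.
one_or_more = re.compile(r"A!(?=(!*))\1B")

# '\A@!\!B': @! takes every '!' first and gives none back, so the
# literal '!' that follows has nothing left to match.
never_matches = re.compile(r"A(?=(!*))\1!B")

# For contrast, an ordinary greedy regex backtracks and does match.
greedy = re.compile(r"A!*!B")

for text in ("A!B", "A!!!B"):
    print(text,
          one_or_more.search(text) is not None,
          never_matches.search(text) is not None,
          greedy.search(text) is not None)
```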
Using the * at the beginning of a string will make a fixed column search command operate like a variable column one. For example, the following use of the Find command will find the next instance of the word "metamagical" between column 8 and the session maximum line width, operating essentially like a Locate command.
=> find \*metamagical col=8
Extended Search Pattern Combinations
Sometimes, it is useful to search for several different spellings of a name or find instances of a string which appear with (or do not appear with) another text string. Extended Search Patterns provide the + (AND), | (OR), and ~ (NOT) for these situations.
For example, to find the next instance of any of three alternate spellings for Kelsey, you could enter
=> locate '\<Kelsey>|<Kelsy>|<Kelsie>'
To display all lines in which Larson appears without Peterson, you could enter
=> qualify '\<Larson>+~<Peterson>'
+ and | function by dividing the search string into sub-patterns. Sub-patterns are matched in strictly left-to-right sequence until the search string is fully consumed or the outcome is judged to be determined. The outcome is judged to be determined when the character following a sub-pattern is | and the sub-pattern matches or when the character following a sub-pattern is + and the sub-pattern does not match.
~ functions by reversing the match results of the sub-pattern which follows it.
When you mix + and | in an Extended Search Pattern, you should give the most controlling sub-pattern first. For example, if you want to search for the next line where Flaws appears with Kingsbury or Maloney, you would enter
=> locate \<Flaws>+<Kingsbury>|<Maloney>
If you entered
=> locate \<Kingsbury>|<Maloney>+<Flaws>
the result would be the next line where Kingsbury appeared or the next line where Maloney appeared with Flaws.
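The left-to-right rule described above can be sketched in Python. This is a minimal model under stated assumptions, not the product's code: evaluate and sub_matches are hypothetical names, sub-patterns are limited to literal text, < and >, and a leading ~, and escaped + or | characters are not handled.

```python
import re

def sub_matches(sub: str, area: str) -> bool:
    """Match one sub-pattern; a leading ~ reverses the outcome."""
    negate = sub.startswith("~")
    if negate:
        sub = sub[1:]
    regex = ""
    for ch in sub:
        if ch == "<":
            regex += "(?<![A-Za-z0-9])"
        elif ch == ">":
            regex += "(?![A-Za-z0-9])"
        else:
            regex += re.escape(ch)
    found = re.search(regex, area) is not None
    return found != negate

def evaluate(pattern: str, area: str) -> bool:
    """Evaluate sub-patterns strictly left to right, stopping once the outcome is determined."""
    tokens = re.split(r"([+|])", pattern[1:])  # drop the leading backslash
    subs, ops = tokens[0::2], tokens[1::2] + [None]
    result = False
    for sub, op in zip(subs, ops):
        result = sub_matches(sub, area)
        if op == "|" and result:
            return True    # followed by | and matched: outcome determined
        if op == "+" and not result:
            return False   # followed by + and not matched: outcome determined
    return result

first = r"\<Flaws>+<Kingsbury>|<Maloney>"
second = r"\<Kingsbury>|<Maloney>+<Flaws>"
print(evaluate(first, "Kingsbury"), evaluate(second, "Kingsbury"))
```

Under this model the first pattern behaves as Flaws AND (Kingsbury OR Maloney), and the second as Kingsbury OR (Maloney AND Flaws), which agrees with the results described above.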
Not all Boolean conditions can be represented with these rules. For example, there is no way to search for "Larson without Peterson or Peterson without Larson".
The use of + combinations and ~ is not recommended with the EditChange command because the start and end positions for replacement may not be what you would expect.
Making Search Commands Work Like One Another
Extended Search Patterns also let one search command reproduce the behavior of another. For example, to find the next line in the current session that does NOT contain the string MALLARD at column 10, you could enter any of the following:
=> nfind MALLARD 10
=> find \~MALLARD 10
=> locate \~!MALLARD zone=10-*
Using Extended Search Patterns in this way may be confusing if you inherit STR values from command to command.
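The three equivalent commands above can be modeled in Python to show why they select the same lines. This is a rough sketch with hypothetical function names. It assumes that columns are numbered from 1, that Find compares text only at the start of its search area, and that ! pins a Locate match to the start of the zone.

```python
import re

def area_from(line: str, col: int) -> str:
    """Return the search area running from a 1-based column to the end of the line."""
    return line[col - 1:]

def nfind(line: str, text: str, col: int) -> bool:
    # nfind TEXT col: succeed when TEXT does not start at column col.
    return not area_from(line, col).startswith(text)

def find_with_not(line: str, text: str, col: int) -> bool:
    # find \~TEXT col: fixed-column comparison, then ~ reverses the outcome.
    matched = area_from(line, col)[:len(text)] == text
    return not matched

def locate_with_not_and_anchor(line: str, text: str, zone_start: int) -> bool:
    # locate \~!TEXT zone=col-*: ! ties the match to the start of the zone, ~ reverses it.
    matched = re.match(re.escape(text), area_from(line, zone_start)) is not None
    return not matched

lines = [" " * 9 + "MALLARD DUCK", "MALLARD AT COLUMN ONE", "SHORT"]
for line in lines:
    print(nfind(line, "MALLARD", 10),
          find_with_not(line, "MALLARD", 10),
          locate_with_not_and_anchor(line, "MALLARD", 10))
```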
Using the % Extended Search Pattern Position Marker
Sometimes it is useful if the column position at which matching is considered to take place is other than the first character of the search string. Extended Search Patterns provide % for this purpose. For search commands such as Locate that cause cursor positioning, % indicates the column at which the cursor will be placed after a successful search. For the EditChange command, % indicates the column at which the replacement string will be inserted. If the pattern does not contain a %, the match is considered to take place at the first non-pattern character which participated in the match.
For example, to position the cursor vertically to the next line which contains '// EXEC ' and horizontally to just after the '// EXEC ', you could enter
=> locate '\// EXEC %'
You can use % with + and | by following the rule that % should appear in ALL sub-patterns combined with | but only in ONE of the sub-patterns combined with +. For example, to position the cursor vertically to the next line which contains the string EXTENT and the string VOL123 and horizontally at the V of VOL123, you could enter
=> locate \EXTENT+%VOL123;cursor
% is meaningless when used with ~.
In the EditChange command, a second % may appear to mark the end of the characters to be removed. If there is no second %, the last non-pattern character participating in the match will be the last character replaced.
For example, to replace all instances of "ITM-PRICE + ITM-COMM" with "ITM-PRICE - ITM-COMM", you could enter
=> change '\ITM-PRICE %\+% ITM-COMM' '-' *
When the position indicated by % does not follow or immediately precede a non-pattern character, the position set by % may not be what you expect. For example, the pattern '\%?ADAM' will set the position to the beginning of the search area containing "ADAM", not to the character preceding "ADAM". This situation occurs when % is mixed with wildcard indicators at the beginning of an Extended Search Pattern.
When using the EditChange command with % and + or |, beginning and ending % should both be in the same sub-pattern. Otherwise the start and end positions for replacement may not be what you would expect.
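The effect of two % markers in the EditChange example with ITM-PRICE and ITM-COMM can be sketched in Python. This is an illustration under assumptions, not the product's implementation: the function name change is hypothetical, and the sketch handles only literal text, the \ suppress character, and exactly two % markers.

```python
import re

def change(pattern: str, replacement: str, line: str) -> str:
    """Replace the text between the two % markers in every match of the pattern."""
    pieces, current, i = [], "", 1  # start at 1 to skip the leading backslash
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            current += re.escape(pattern[i + 1])  # suppressed character is literal
            i += 2
            continue
        if ch == "%":
            pieces.append(current)  # a marker closes the current piece
            current = ""
        else:
            current += re.escape(ch)
        i += 1
    pieces.append(current)
    if len(pieces) != 3:
        raise ValueError("this sketch expects exactly two % markers")
    before, target, after = pieces
    regex = f"({before})({target})({after})"
    # Keep the text outside the markers and swap only the marked portion.
    return re.sub(regex, lambda m: m.group(1) + replacement + m.group(3), line)

print(change(r"\ITM-PRICE %\+% ITM-COMM", "-", "COMPUTE TOTAL = ITM-PRICE + ITM-COMM"))
```

Only the + between the two markers is replaced. The text on either side takes part in the match but is left unchanged.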