char *kre_comp(
char *pat)
none
^ If this is the first character of the regular expression, it matches the beginning of the line.
$ If this is the last character of the regular expression, it matches the end of the line.
[...] or [^..] Matches any one character contained within the brackets. If the first character after the '[' is a ']', then the ']' is included in the characters to match. If the first character after the '[' is a '^', then it matches any character NOT included in the []. A '-' indicates a range of characters. For example, [a-z] specifies all characters between and including the ASCII values 'a' and 'z'. If the '-' directly follows the '[' or directly precedes the ']', it is interpreted literally. Special symbols can be used as shorthand: \\w expands to '0-9a-z_A-Z', \\d expands to '0-9', and \\s expands to ' \\t\\n\\r\\f'.
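The shorthand expansion described above can be sketched as a small string rewrite. This is an illustration only, not the code in regex.c: the function name expand_class_shorthand and its buffer interface are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of kutils: copy the contents of a bracket
 * expression from src to dst, expanding the \w, \d and \s shorthand into
 * the explicit sets listed in the text. Returns 0 on success, or -1 if
 * dst is too small to hold the result and its terminating NUL. */
int expand_class_shorthand(const char *src, char *dst, size_t dstlen)
{
    size_t out = 0;

    if (dstlen == 0)
        return -1;

    while (*src != '\0') {
        char single[2] = { 0, 0 };
        const char *piece;
        size_t len;

        if (src[0] == '\\' && src[1] == 'w') {
            piece = "0-9a-z_A-Z";
            src += 2;
        } else if (src[0] == '\\' && src[1] == 'd') {
            piece = "0-9";
            src += 2;
        } else if (src[0] == '\\' && src[1] == 's') {
            piece = " \t\n\r\f";
            src += 2;
        } else {
            /* Any other character is copied through unchanged. */
            single[0] = *src++;
            piece = single;
        }

        len = strlen(piece);
        if (out + len + 1 > dstlen)
            return -1;
        memcpy(dst + out, piece, len);
        out += len;
    }
    dst[out] = '\0';
    return 0;
}
```

Under these assumptions, the bracket contents `a\d-` come out as `a0-9-`, where the trailing '-' stays literal because it sits right before the ']'.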
{n,m}, {n,}, {n} Match the DFA directly before this range syntax between n and m times. Thus, 'a{2,10}' will match a minimum of 2 a's and a maximum of 10. The {n,} syntax tells the parser to match n or more times, and the {n} syntax tells it to match exactly n times.
* Match the preceding character or range of characters 0 or more times. This is equivalent to the range syntax {0,}
+ Match the preceding character or range of characters 1 or more times. This is equivalent to the range syntax {1,}
? Match the preceding character or range of characters 0 or 1 times. This is equivalent to the range syntax {0,1}
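The equivalences between the one-character quantifiers and the range syntax can be written down as a lookup. The following is a minimal sketch; struct rep_range and both function names are hypothetical and are not taken from the library.

```c
#include <limits.h>

/* Hypothetical type for illustration: a repetition range {min,max}.
 * max == INT_MAX stands for "no upper bound", as in {n,}. */
struct rep_range {
    int min;
    int max;
};

/* Map a one-character quantifier to the range it abbreviates.
 * Returns 0 on success, -1 if op is not '*', '+' or '?'. */
int quantifier_to_range(char op, struct rep_range *r)
{
    switch (op) {
    case '*': r->min = 0; r->max = INT_MAX; return 0;  /* {0,}  */
    case '+': r->min = 1; r->max = INT_MAX; return 0;  /* {1,}  */
    case '?': r->min = 0; r->max = 1;       return 0;  /* {0,1} */
    default:  return -1;
    }
}

/* Does a run of `count` repetitions satisfy the range? */
int range_accepts(const struct rep_range *r, int count)
{
    return count >= r->min && count <= r->max;
}
```

Treating every quantifier as a range means a matcher only needs one repetition rule, which is presumably why the text defines '*', '+' and '?' in terms of {n,m}.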
| This symbol separates two sub regular expressions for a logical OR operation.
(..) Group boundaries. This pattern marks a memory-tagged region of the regular expression that can be matched again later in the regular expression via a reference, or used in kre_subs to bring in this part of the matched string. It can also be used to delimit the area to which the OR symbol '|' applies. Note that only 127 groups are allowed.
\\b Word boundary. This pattern will match the empty character before the start and after the end of a word. By default, a word character is one of 0-9a-z_A-Z. This can be modified by the kre_modw routine.
\\B Non-word boundary. This pattern will match the empty character between two characters in a word.
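The two boundary tests can be sketched as position checks on a string. This assumes the default word characters and ignores any change made through kre_modw; the function names are hypothetical, and the \\B test follows the wording above (a position between two word characters) rather than any other regex dialect.

```c
#include <stddef.h>

/* Default word characters per the text: 0-9, a-z, _, A-Z. */
int is_word_char(char c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           c == '_';
}

/* Test the empty position just before s[pos], where 0 <= pos <= len.
 * Returns 1 if exactly one neighbour is a word character, i.e. the
 * position is the start or the end of a word. */
int at_word_boundary(const char *s, size_t len, size_t pos)
{
    int before = pos > 0 && is_word_char(s[pos - 1]);
    int after = pos < len && is_word_char(s[pos]);
    return before != after;
}

/* Returns 1 if the position lies between two word characters. */
int at_non_word_boundary(const char *s, size_t len, size_t pos)
{
    int before = pos > 0 && is_word_char(s[pos - 1]);
    int after = pos < len && is_word_char(s[pos]);
    return before && after;
}
```

For the string "hi there", positions 0, 2, 3 and 8 are word boundaries under this sketch, while position 1 (between 'h' and 'i') is a non-word boundary.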
\\1-\\127 These symbols are used to reference the 1st through 127th () region.
\\h Stands for the ASCII character '\\b' (backspace).
\\A If it is the first character of the regular expression, it matches the empty character at the beginning of the string.
\\Z If it is the last character of the regular expression, it matches the empty character at the end of the string.
\\c@-\\cZ These symbols are translated into control-@ through control-Z. For any other value, the \\c part is ignored.
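The control-character translation amounts to subtracting the ASCII value of '@'. A minimal sketch, assuming ASCII and using a hypothetical function name; returning -1 is simply this example's way of telling the caller to ignore the \\c part.

```c
/* Hypothetical helper: translate the X in \cX to its control character.
 * '@' maps to 0 (NUL), 'A' to 1, and so on up to 'Z', which maps to 26.
 * Returns -1 for any X outside '@'..'Z'. Assumes the ASCII character set. */
int control_char(char x)
{
    if (x < '@' || x > 'Z')
        return -1;
    return x - '@';
}
```

With this mapping, control-I is the tab character and control-G is the bell character.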
\\d Same as [0-9].
\\D Same as [^0-9].
\\s Same as [ \\t\\n\\r\\f].
\\S Same as [^ \\t\\n\\r\\f].
\\w Same as [a-zA-Z_0-9].
\\W Same as [^a-zA-Z_0-9].
\\Q..\\E A section enclosed in these symbols is taken literally. Inside these sections, meta characters and special symbols have no meaning. If a \\E needs to appear in one of these sections, the \\ must be escaped with \\.
\\ This escapes the meaning of a special character.
none
none
Extended to support \\d\\s\\w\\Q\\E\\c?\\A\\Z. \\< and \\> were modified to be \\b and \\B. (SJ) 11/94.
$BOOTSTRAP/objects/library/kutils/src/regex.c