Single Character Regular Expressions
A single character regular expression matches exactly one character of the target string. There are several types of single character regular expressions:
-
Ordinary characters. Any character that is not used in other sorts of regular expression syntax is an ordinary character. To match the regular expression, the target string must have that same character in the same position as it occurs in the regular expression. For example, the single character regular expression "g" indicates that the target string must have a "g" in the appropriate position.
-
Escaped special characters. The characters left bracket ([), right bracket (]), backslash (\), asterisk (*), period (.), dollar sign ($), and caret (^) are special in that they are used to form other types of regular expression syntax. To construct a single character regular expression that matches any one of those characters literally, precede the character with a backslash (\). For example, the single character regular expression "\*" matches a single asterisk in the appropriate position. Recall that these rules apply only when the comparison operator is RELOP_RE, so these characters should not be escaped in strings used with other comparison operators.
-
A period. A period (.) forms a special single character regular expression that matches any single character except NEWLINE. This indicates that the target string can have any character except NEWLINE in the position specified by the period. For example, the strings "a", "W", and "7" all match the single character regular expression ".".
-
Sets of allowed characters. Brackets ([ and ]) enclosing a set of characters indicate that any one of the enclosed characters may occur in the appropriate position in the target string. The special characters period (.), asterisk (*), left-bracket ([), and backslash (\) are treated as normal characters when enclosed in brackets. Additionally, the right-bracket (]) is treated as a normal character if it is the first character in the set. For example, the single character regular expression "[]abc]" indicates that the target string can have either a right-bracket, an "a", a "b", or a "c" in the appropriate position.
-
Within brackets, a dash (-) between two other characters indicates a range of characters in ASCII sequence from the first character to the second. For example, the single character regular expression "[0-9]" indicates that the target string must have a digit in the appropriate position. The dash loses this meaning if it occurs as the first character in the set.
-
A caret (^) immediately following the left-bracket ([) excludes the remaining characters within the brackets from matching the target string. It indicates that the target string may have any character in the appropriate position except NEWLINE and the characters that follow the caret. For example, the single character regular expression "[^0-9]" indicates that the target string cannot have a digit in the appropriate position. The caret loses this special meaning if it is not the first character within the square brackets. In addition, the right-bracket (]) and dash (-) are treated as normal characters, and so are excluded from matching, if they immediately follow the caret rather than the initial left-bracket ([) as previously described. For example, the regular expression "[^]]" matches any single character except a right bracket.
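
The rules above can be tried out with a short sketch. It uses Python's re module as a stand-in, which is an assumption: Python's dialect is not the RELOP_RE matcher described here, and it differs in at least two ways (a negated set such as "[^0-9]" does match NEWLINE in Python, and a backslash inside brackets is still an escape character). The helper name matches_one and the EXAMPLES table are hypothetical and exist only for illustration; the patterns chosen behave the same way in both dialects.

```python
import re


def matches_one(pattern: str, char: str) -> bool:
    """Hypothetical helper: True if `pattern` matches exactly the one character `char`."""
    return re.fullmatch(pattern, char) is not None


# (pattern, candidate character, expected result), following the examples in the text.
# Assumption: Python's `re` dialect agrees with the described syntax for these cases.
EXAMPLES = [
    ("g", "g", True),         # ordinary character matches itself
    ("g", "h", False),
    (r"\*", "*", True),       # escaped special character matches literally
    (".", "a", True),         # period matches any single character ...
    (".", "W", True),
    (".", "7", True),
    (".", "\n", False),       # ... except NEWLINE
    ("[]abc]", "]", True),    # right-bracket placed first in the set is literal
    ("[]abc]", "b", True),
    ("[]abc]", "d", False),
    ("[0-9]", "5", True),     # dash between two characters forms a range
    ("[0-9]", "x", False),
    ("[-9]", "-", True),      # dash placed first in the set is literal
    ("[-9]", "5", False),
    ("[^0-9]", "x", True),    # caret placed first negates the set
    ("[^0-9]", "4", False),
    ("[^[]", "a", True),      # anything except a left bracket
    ("[^[]", "[", False),
    ("[^]]", "]", False),     # right-bracket directly after the caret is literal
]

for pattern, char, expected in EXAMPLES:
    assert matches_one(pattern, char) == expected, (pattern, char)
```

Using re.fullmatch keeps each check to exactly one character, which mirrors the "appropriate position" wording above without involving any multi-character syntax.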