Regular Expressions
Regular expressions are a formal language for describing text patterns. They offer a powerful way to match patterns, which includes parsing text and identifying substrings.
Position matching
Symbol
Function
^
Matches only at the beginning of a string.
"^A" matches the first "A" in "An A+ for Anita."
$
Matches only at the end of a string.
"t$" matches the last "t" in "A cat in the hat".
\b
Matches any word boundary
"ly\b" matches "ly" in "possibly tomorrow."
\B
Matches any non-word boundary
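The position symbols above can be tried out with a short sketch. It assumes Python's re module, which the original does not name; the variable names and the "cat concat" sample string are illustrative only.

```python
import re

# ^ anchors the match at the beginning of the string.
start_pos = re.search(r"^A", "An A+ for Anita.").start()

# $ anchors the match at the end of the string.
end_pos = re.search(r"t$", "A cat in the hat").start()

# \b requires a word boundary right after "ly".
ly = re.search(r"ly\b", "possibly tomorrow.")

# \B requires that there is NO word boundary before "cat",
# so only the "cat" inside "concat" qualifies.
inner = [m.start() for m in re.finditer(r"\Bcat", "cat concat")]

print(start_pos, end_pos, ly.span(), inner)  # 0 15 (6, 8) [7]
```

Anchors and boundaries match positions rather than characters, which is why they consume no text of their own.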
Literals
Symbol
Function
Alphanumeric
Matches alphabetical and numerical characters literally.
\n
Matches a new line
\f
Matches a form feed
\r
Matches carriage return
\t
Matches horizontal tab
\v
Matches vertical tab
\?
Matches ?
\*
Matches *
\+
Matches +
\.
Matches .
\|
Matches |
\{
Matches {
\}
Matches }
\\
Matches \
\[
Matches [
\]
Matches ]
\(
Matches (
\)
Matches )
\xxx
Matches the ASCII character whose code is the octal number xxx.
"\50" matches "(", that is chr(40), because octal 50 is decimal 40.
\xdd
Matches the ASCII character whose code is the hexadecimal number dd.
"\x28" matches "(", that is chr(40), because hexadecimal 28 is decimal 40.
\uxxxx
Matches the Unicode character whose code point is the hexadecimal number xxxx.
"\u00A3" matches "£".
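A minimal sketch of the escaped literals, assuming Python's re module (the original does not name an engine). One engine-specific detail: Python reads a short form such as \50 as a group reference, so the sketch writes the octal code with a leading zero (\050). The sample strings are illustrative.

```python
import re

# A bare "+" is a quantifier, so it is escaped to match a literal plus sign.
plus = re.search(r"A\+", "An A+ for Anita.").group()

# Octal 50, hexadecimal 28 and decimal 40 all name the same character "(".
same_code = (0o50 == 0x28 == 40) and chr(40) == "("
octal_pos = re.search(r"\050", "f(x)").start()
hex_pos = re.search(r"\x28", "f(x)").start()

# \uxxxx selects a character by its Unicode code point.
pound = re.search(r"\u00A3", "Price: £5").group()

# re.escape adds the backslashes for you when the literal text is not fixed.
literal = "1+1=2?"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None

print(plus, same_code, octal_pos, hex_pos, pound, escaped_ok)
```

Raw strings (the r"..." prefix) keep Python itself from interpreting the backslashes before the regular expression engine sees them.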
Character classes
Symbol
Function
[xyz]
Match any one character enclosed in the character set.
"[a-e]" matches "b" in "basketball".
[^xyz]
Match any one character not enclosed in the character set.
"[^a-e]" matches "s" in "basketball".
.
Match any character except \n.
\w
Match any word character. Equivalent to [a-zA-Z_0-9].
\W
Match any non-word character. Equivalent to [^a-zA-Z_0-9].
\d
Match any digit. Equivalent to [0-9].
\D
Match any non-digit. Equivalent to [^0-9].
\s
Match any space character. Equivalent to [ \t\r\n\v\f].
\S
Match any non-space character. Equivalent to [^ \t\r\n\v\f].
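The character classes above can be checked with the following sketch, again assuming Python's re module. In Python 3, \w, \d and \s also match non-ASCII characters in text patterns by default; the re.ASCII flag restricts them to the ranges listed in the table. The sample strings are illustrative.

```python
import re

word = "basketball"

# [a-e] finds the first character inside the set, [^a-e] the first one outside it.
in_set = re.search(r"[a-e]", word).group()
not_in_set = re.search(r"[^a-e]", word).group()

# \d matches single digits, \w+ matches runs of word characters.
digits = re.findall(r"\d", "Room 101, floor 3")
words = re.findall(r"\w+", "snake_case, CamelCase!")

# \s+ matches any run of spaces, tabs or newlines.
parts = re.split(r"\s+", "a \t b\nc")

# "." skips the newline and matches the next character.
dot_pos = re.search(r".", "\nx").start()

# With re.ASCII, \w is limited to [a-zA-Z_0-9], so "é" ends the word.
ascii_words = re.findall(r"\w+", "café", re.ASCII)

print(in_set, not_in_set, digits, words, parts, dot_pos, ascii_words)
```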
Repetition
Symbol
Function
{x}
Match exactly x occurrences of a regular expression.
"\d{5}" matches 5 digits.
{x,}
Match x or more occurrences of a regular expression.
"\s{2,}" matches at least 2 space characters.
{x,y}
Match at least x and at most y occurrences of a regular expression.
"\d{2,3}" matches at least 2 but no more than 3 digits.
?
Match zero or one occurrences. Equivalent to {0,1}.
"a\s?b" matches "ab" or "a b".
*
Match zero or more occurrences. Equivalent to {0,}.
+
Match one or more occurrences. Equivalent to {1,}.
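A short sketch of the repetition symbols, assuming Python's re module; the sample strings are illustrative. Note that quantifiers are greedy by default, so {2,3} takes three digits when three are available.

```python
import re

# {x}: exactly five digits, nothing more and nothing less.
five_ok = re.fullmatch(r"\d{5}", "90210") is not None
four_ok = re.fullmatch(r"\d{5}", "9021") is not None

# {x,}: collapse runs of two or more space characters into one space.
squeezed = re.sub(r"\s{2,}", " ", "too    many   spaces")

# {x,y}: greedy, so "12345" is split into "123" and "45".
chunks = re.findall(r"\d{2,3}", "7 42 123 12345")

# ?: the space between "a" and "b" is optional, but at most one is allowed.
optional = re.findall(r"a\s?b", "ab a b a  b")

# * allows zero "b" characters, + requires at least one.
star_ok = re.fullmatch(r"ab*", "a") is not None
plus_ok = re.fullmatch(r"ab+", "a") is not None

print(five_ok, four_ok, squeezed, chunks, optional, star_ok, plus_ok)
```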
 
Alternation & Grouping
Symbol
Function
()
Groups part of a regular expression into a single clause, so that repetition and alternation apply to the whole group. Groups may be nested.
"(ab)?(c)" matches "abc" or "c".
|
Alternation combines clauses into one regular expression and then matches any of the individual clauses.
"(ab)|(cd)|(ef)" matches "ab" or "cd" or "ef".
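Grouping and alternation can be sketched as follows, assuming Python's re module. The "grey" comparison is an added illustration, not from the original; it shows how parentheses limit the reach of the | symbol.

```python
import re

# The first group is optional; an unused group is reported as None.
optional = re.compile(r"(ab)?(c)")
with_ab = optional.fullmatch("abc").groups()
without_ab = optional.fullmatch("c").groups()

# Alternation matches any one of the listed clauses.
choice = re.compile(r"(ab)|(cd)|(ef)")
found = [m.group() for m in choice.finditer("ab cd ef gh")]

# Inside the group, | chooses between "a" and "e" only.
grouped = re.fullmatch(r"gr(a|e)y", "grey") is not None
# Without the group, | splits the whole pattern into "gra" or "ey".
ungrouped = re.fullmatch(r"gra|ey", "grey") is not None

print(with_ab, without_ab, found, grouped, ungrouped)
```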
Back references
Symbol
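Function
()\n
Matches again the text captured by group number n. Here n is a number (as in \1), not the newline escape. Groups are numbered by the order of their left parentheses.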
"(\w+)\s+\1" matches any word that occurs twice in a row, such as "hubba hubba."
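A minimal sketch of back references, assuming Python's re module. The variant with \b and the "this is fine" string are added illustrations: without word boundaries, the pattern can also match a repeated fragment that spans two different words.

```python
import re

# \1 must repeat exactly what group 1 captured.
doubled = re.compile(r"(\w+)\s+\1")
m = doubled.search("hubba hubba.")
whole, first = m.group(), m.group(1)

# Groups are numbered by their left parenthesis, outermost first.
nested = re.search(r"((a)(b))", "ab").groups()

# Pitfall: "is is" is found inside "this is".
fragment = doubled.search("this is fine").group()

# Word boundaries restrict the match to whole repeated words.
strict = re.search(r"\b(\w+)\s+\1\b", "this is fine")

print(whole, first, nested, fragment, strict)
```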
Replace functionality
In the search pattern
Symbol
Function
()
Stores the text matched by the enclosed clause in the next numbered buffer.
In the replace pattern
Symbol
Function
$1
Inserts the content of the first buffer. $2 inserts the second buffer, and so on.
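The search and replace buffers can be sketched as follows. This assumes Python's re module, where the replacement string refers to a buffer as \1 or \g<1> rather than $1; the $1 form is used by other tools. The sample names and date are illustrative.

```python
import re

# Two buffers are filled by the search pattern and swapped in the replacement.
# \2 and \1 play the role of $2 and $1 here.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Doe, Jane")

# \g<n> is the unambiguous form, useful when a digit follows the reference.
iso_date = re.sub(r"(\d{2})/(\d{2})/(\d{4})", r"\g<3>-\g<1>-\g<2>", "12/31/1999")

print(swapped)   # Jane Doe
print(iso_date)  # 1999-12-31
```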