zettelkasten/Eingang/Regex.md
Ralf Koop 7389e600f9 vault backup: 2023-11-12 11:19:55

Title : Regex.md

Common operators

To define patterns to match, you can use these common operators:

| Operator | Description | Example | Returns |
| --- | --- | --- | --- |
| `^` | Matches the beginning of a string | `^abc` | abc, abcdef..., abc123 |
| `$` | Matches the end of a string | `abc$` | my:abc, 123abc, theabc |
| `.` | Matches any single character as a wildcard | `a.c` | abc, asc, a1c |
| `\|` | An OR character | `hi\|hello` | hi or hello |
| `(...)` | Captures values in the parentheses | `(a)b(c)` | a and c |
| `[...]` | Matches any one character within the brackets | `[abc]` | a, b, or c |
| `[a-z]` | Matches lowercase characters between a and z | `[b-z]` | bc, mind, xyz |
| `[0-9]` | Matches any digit between 0 and 9 | `[0-3]` | 3201 |
| `{x}` | The exact number of times to match | `(abc){2}` | abcabc |
| `{x,}` | The minimum number of times to match | `(abc){2,}` | abcabc, abcabcabc |
| `*` | Matches the character before the `*` zero or more times, as a "greedy" match | `ab*c` | ac, abc, abbc |
| `+` | Matches the character before the `+` one or more times | `a+c` | ac, aac, aaac |
| `?` | Matches the character before the `?` zero or one times; placed after another quantifier, makes it "non-greedy" | `ab?c` | ac, abc |
| `\` | Escapes the character after the `\`, or creates an escape sequence | `a\sc` | a c, with the space matching the `\s` |
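
The note does not say which regex engine it was written for, so the following is only a sketch that assumes Python's built-in `re` module and the backslash as the escape operator. It illustrates three rows of the table that are easy to misread: capturing with parentheses, greedy versus non-greedy matching, and escaping. The sample strings are made up for illustration.

```python
import re

# (a)b(c): the parentheses capture "a" and "c"; the "b" is matched but not captured.
captured = re.search(r"(a)b(c)", "xabcx").groups()

# Greedy: .* takes as much as it can, so the match runs to the last "c".
greedy = re.search(r"a.*c", "abcabc").group()

# Non-greedy: adding ? after the quantifier stops at the first "c".
lazy = re.search(r"a.*?c", "abcabc").group()

# \s is an escape sequence for whitespace, so a\sc matches "a c".
space_match = re.search(r"a\sc", "a c") is not None

# A backslash before an operator makes it literal: \. matches only a period.
literal_dot = re.findall(r"a\.c", "a.c abc")

print(captured, greedy, lazy, space_match, literal_dot)
```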

To use an operator's literal character within a pattern, not as regex:

  • For a circumflex (^), period (.), open bracket ([), dollar sign ($), open or close parenthesis (( or )), pipe (|), asterisk (*), plus sign (+), question mark (?), open brace ({), or backslash (\), precede it with the escape operator (\).
  • For an end bracket (]) or end brace (}) inside brackets, make it the first character, with or without an opening ^.
  • For a dash (-) inside brackets, make it the first or last character, or the second endpoint of a range.

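Tip: Within brackets, operators such as *, +, ?, {, }, and . are taken literally, and not as regex operators. For example, [*+?{}.] matches any of the literal characters within the brackets. Only a leading ^, a dash between two characters, and the closing ] keep a special meaning there.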
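
A minimal sketch of literal matching, assuming Python's `re` module (other engines may differ in detail). `re.escape` is a standard-library helper that adds the escape operator wherever it is needed; the sample text is invented for the example.

```python
import re

# Hypothetical text that is full of regex operators.
literal = "1+1=2 (approx.)"
haystack = "note: 1+1=2 (approx.)"

# re.escape() puts a backslash in front of every character that could act as an operator.
escaped = re.escape(literal)

# The escaped pattern matches the text literally.
found_escaped = re.search(escaped, haystack) is not None

# Unescaped, "+" is a quantifier and "(" opens a group, so the same text no longer matches.
found_unescaped = re.search(literal, haystack) is not None

# Inside brackets these operators are plain characters.
bracket_hits = re.findall(r"[*+?{}.]", "a*b+c?")

print(found_escaped, found_unescaped, bracket_hits)
```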

Match start or end of string (^ and $)

To match patterns at the beginning or end of the string, use the operators ^ and $, respectively. For example:

| Example | Matches |
| --- | --- |
| `^The` | Any string that starts with The |
| `of despair$` | Any string that ends with of despair |
| `^abc$` | A string that starts and ends with abc, that is, an exact match |

Tip: If neither ^ nor $ is used, the pattern matches any string that contains the characters specified. For example, notice (with no ^ or $) returns any string that contains notice.
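
The anchoring behaviour can be checked with a short sketch, assuming Python's `re` module and its `re.search` function, which looks for the pattern anywhere in the string unless anchors say otherwise. The sample strings are made up.

```python
import re

samples = ["The end", "At The end", "depths of despair", "abc", "abcdef"]

# ^The: only strings that begin with "The".
starts_with_the = [s for s in samples if re.search(r"^The", s)]

# of despair$: only strings that finish with "of despair".
ends_with_despair = [s for s in samples if re.search(r"of despair$", s)]

# ^abc$: anchored at both ends, so only the exact string "abc".
exactly_abc = [s for s in samples if re.search(r"^abc$", s)]

# No anchors: any string that contains "abc".
contains_abc = [s for s in samples if re.search(r"abc", s)]

print(starts_with_the, ends_with_despair, exactly_abc, contains_abc)
```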

Match characters (*, +, and ?)

To match patterns based on a specific character, follow the character with the operator *, +, or ?. These operators indicate the number of times the character should occur for a match—zero or more, one or more, or one or zero, respectively. For example:

| Example | Matches |
| --- | --- |
| `ab*` | A string that contains a, followed by zero or more bs: ac, abc, or abbc |
| `ab+` | A string that contains a, followed by one or more bs: abc or abbc, but not ac |
| `ab?` | A string that contains a, followed by zero or one b: ac or abc; in abbc, only the first ab is matched |
| `a?b+$` | A string that ends with one or more bs, optionally preceded by a single a: ab, abb, b, or bb. Because the start is not anchored, aab and aabb still match on their tails; use `^a?b+$` to reject them |
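
To make the difference between the three operators visible, the sketch below adds a trailing c to each pattern and requires the whole string to match. It assumes Python's `re` module (`re.fullmatch` behaves like wrapping the pattern in ^ and $); the candidate strings are illustrative.

```python
import re

candidates = ["ac", "abc", "abbc"]

# * : zero or more b
star = [s for s in candidates if re.fullmatch(r"ab*c", s)]

# + : one or more b
plus = [s for s in candidates if re.fullmatch(r"ab+c", s)]

# ? : zero or one b
optional = [s for s in candidates if re.fullmatch(r"ab?c", s)]

# a?b+$ is anchored only at the end, so a search still succeeds on the tail of "aab".
tail = re.search(r"a?b+$", "aab").group()

# Requiring the whole string to match rejects "aab".
whole = re.fullmatch(r"a?b+", "aab")

print(star, plus, optional, tail, whole)
```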

Match characters' frequency ({...} or (...))

To match a pattern based on how often a single character occurs, follow it with the number or range of instances, wrapped in braces ({...}). For example:

| Example | Matches |
| --- | --- |
| `ab{2}` | A string that contains a, followed by exactly 2 bs: abb |
| `ab{2,}` | A string that contains a, followed by at least 2 bs: abb, abbbb, etc. |
| `ab{3,5}` | A string that contains a, followed by three to five bs: abbb, abbbb, or abbbbb |

Tip: Always specify the first number of a range: {0,2}, not {,2}. Instead of the ranges {0,}, {1,}, or {0,1}, you can use the operators *, +, or ?, respectively.

To match a pattern based on how often a sequence of characters occurs, wrap it in parentheses ((...)). For example, a(bc){1,5} matches a string that contains a, followed by one to five instances of bc.
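
A small sketch of counted repetition, assuming Python's `re` module and whole-string matching via `re.fullmatch`; the test strings are invented. Braces apply to the single preceding character unless parentheses group a longer sequence.

```python
import re

strings = ["ab", "abb", "abbb", "abbbbbb"]  # 1, 2, 3, and 6 b's

# {2}: exactly two b's.
exactly_two = [s for s in strings if re.fullmatch(r"ab{2}", s)]

# {2,}: two or more b's.
at_least_two = [s for s in strings if re.fullmatch(r"ab{2,}", s)]

# {3,5}: between three and five b's, so six is too many.
three_to_five = [s for s in strings if re.fullmatch(r"ab{3,5}", s)]

# Parentheses make the quantifier apply to the whole sequence "bc".
three_groups = re.fullmatch(r"a(bc){1,5}", "a" + "bc" * 3) is not None
six_groups = re.fullmatch(r"a(bc){1,5}", "a" + "bc" * 6) is not None

print(exactly_two, at_least_two, three_to_five, three_groups, six_groups)
```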

Match one of multiple patterns (|)

To match one of multiple patterns, such as this OR that, use the OR operator (|). For example:

| Example | Matches |
| --- | --- |
| `hi\|hello` | A string that contains either hi or hello |
| `(b\|cd)ef` | A string that contains either bef or cdef |
| `(a\|b)*c` | A string that contains a sequence of as and bs in any order, possibly empty, followed by c |
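
The alternation examples can be tried with the following sketch, which assumes Python's `re` module and made-up sample strings. It also shows why spaces around the OR operator matter: in Python's default mode they are literal characters that must appear in the text.

```python
import re

# hi|hello: either alternative anywhere in the string.
greetings = [s for s in ["hi there", "hello", "hey"] if re.search(r"hi|hello", s)]

# (b|cd)ef: the group limits the alternation to the prefix.
grouped = [s for s in ["bef", "cdef", "def"] if re.fullmatch(r"(b|cd)ef", s)]

# (a|b)*c: any mix of a's and b's, including none, followed by c.
mixed = [s for s in ["c", "abbac", "ab c"] if re.fullmatch(r"(a|b)*c", s)]

# With spaces inside the group, the alternatives become "a " and " b".
spaced = re.fullmatch(r"(a | b)*c", "abc")

print(greetings, grouped, mixed, spaced)
```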

Match any character (.)

To represent any character in a pattern to match, use the wildcard operator (.). For example:

| Example | Matches |
| --- | --- |
| `a.[0-9]` | A string that contains a, followed by any character and a digit |
| `^.{3}$` | Any string of exactly three characters |
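
A brief sketch of the wildcard, assuming Python's `re` module; the sample strings are illustrative. One engine-specific detail worth knowing: in Python the period does not match a newline unless the `re.DOTALL` flag is set.

```python
import re

# a.[0-9]: "a", then any one character, then a digit.
dot_digit = re.findall(r"a.[0-9]", "ab1 a-7 a12 ax")

# ^.{3}$: exactly three characters of any kind.
three_chars = [s for s in ["abc", "ab", "abcd", "a c"] if re.search(r"^.{3}$", s)]

# By default the wildcard stops at newlines in Python.
newline_default = re.search(r"a.c", "a\nc")
newline_dotall = re.search(r"a.c", "a\nc", re.DOTALL) is not None

print(dot_digit, three_chars, newline_default, newline_dotall)
```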

Match character position ([...])

To match any one of a set of characters at a given position, list the allowed characters in brackets ([...]). For example:

| Example | Matches |
| --- | --- |
| `[ab]` | A string that contains either a or b; equivalent to `a\|b` |
| `[a-d]` | A string that contains a lowercase a, b, c, or d; equivalent to `a\|b\|c\|d` or `[abcd]` |
| `^[a-zA-Z]` | A string that starts with any letter, regardless of case |
| `[0-9]%` | A string that contains any single digit followed by a percent sign |
| `,[a-zA-Z0-9]$` | A string that ends with a comma followed by a single letter or digit |
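
The bracket examples can be sketched as follows, again assuming Python's `re` module and invented sample strings. Each bracketed set matches exactly one character from the set.

```python
import re

# [a-d]: every single character in the range a to d.
in_range = re.findall(r"[a-d]", "badge")

# ^[a-zA-Z]: the first character must be a letter of either case.
starts_with_letter = [s for s in ["Zebra", "9lives", "_x"] if re.search(r"^[a-zA-Z]", s)]

# [0-9]%: one digit directly before a percent sign.
percentages = re.findall(r"[0-9]%", "up 5%, down 12%")

# ,[a-zA-Z0-9]$: the string ends with a comma plus one letter or digit.
comma_tail = [s for s in ["a,b", "a,;", "1,2"] if re.search(r",[a-zA-Z0-9]$", s)]

print(in_range, starts_with_letter, percentages, comma_tail)
```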

Match unwanted characters ([^...])

To match any character that is not in a set, wrap the set in brackets and start it with the ^ operator. For example, %[^a-zA-Z]% matches a string that contains a single non-letter character between two percent signs.
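
A minimal sketch of a negated set, assuming Python's `re` module and ASCII input; the sample strings are made up. It also shows why the upper bound of the range matters: a range written A-z covers the ASCII punctuation between Z and a, such as the underscore, so those characters would count as letters.

```python
import re

negated = re.compile(r"%[^a-zA-Z]%")

# One non-letter between the percent signs matches; a letter or nothing at all does not.
hits = [s for s in ["%1%", "%a%", "% %", "%%"] if negated.search(s)]

# The underscore is not a letter, so the negated set accepts it.
underscore_ok = negated.search("%_%") is not None

# With the range A-z the underscore falls inside the excluded set, so the match fails.
underscore_wide_range = re.search(r"%[^a-zA-z]%", "%_%")

print(hits, underscore_ok, underscore_wide_range)
```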

#nochzubearbeiten