Title : Regex.md
===
## Common operators
To define patterns to match, you can use these common operators:
|Operator|Description |Example |Returns |
|--------|--------------|------------|---------------|
|^ |Matches the beginning of a string |^abc |abc, abcdef..., abc123|
|$ |Matches the end of a string |abc$ |my:abc, 123abc, theabc|
|. |Matches any character as a wildcard |a.c |abc, asc, a123c|
|&#124; |An OR character |abc&#124;xyz |abc or xyz|
|(...) |Captures values in the parentheses |(a)b(c) |a and c|
|[...] |Matches anything within the brackets |[abc] |a, b, or c|
|[a-z] |Matches lowercase characters between a and z |[b-z] |bc, mind, xyz|
|[0-9] |Matches any number values between 0 and 9 |[0-3] |3201|
|{x} |The exact number of times to match |(abc){2} |abcabc|
|{x,} |The minimum number of times to match |(abc){2,} |abcabcabc|
|* |Matches the character before the * zero or more times; "greedy" by default, so it takes as many characters as it can |ab*c |ac, abc, abbc|
|+ |Matches the character before the + one or more times |a+c |ac, aac, aaac|
|? |Matches the character before the ? zero or one times; placed after another quantifier, makes that quantifier "non-greedy" |ab?c |ac, abc|
|\ |Escapes the character that follows it, or creates an escape sequence |a\sc |a c, with the space matching the \s|
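
The capture and greedy/non-greedy rows in the table are easier to see in action. The following is a minimal sketch that assumes Python's `re` module as the regex flavor (this note does not name one); the sample strings are made up for illustration.

```python
import re

# (a)b(c) captures the text matched inside each pair of parentheses.
captures = re.search(r"(a)b(c)", "xxabcxx").groups()

# A bare quantifier is greedy: it takes as many characters as it can.
greedy = re.search(r"<.+>", "<b>bold</b>").group()

# A ? placed after the quantifier makes it non-greedy: it stops as early as it can.
lazy = re.search(r"<.+?>", "<b>bold</b>").group()

print(captures)  # ('a', 'c')
print(greedy)    # <b>bold</b>
print(lazy)      # <b>
```
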
To use an operator's literal character within a pattern, not as regex:
- For a circumflex (^), period (.), open bracket ([), dollar sign ($), open or close parenthesis (( or )), pipe (|), asterisk (*), plus sign (+), question mark (?), open brace ({), or backslash (\), precede it with the escape operator (\).
- For an end bracket (]) or end brace (}) inside brackets, make it the first character, after the opening ^ if there is one.
- For a dash (-) inside brackets, make it the first or last character, or the second endpoint of a range.
_Tip: All characters within brackets are taken literally, and not as regex operators. For example, **[*\+?{}.]** matches any of the literal characters within the brackets._
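
The escaping rules above can be sketched as follows. This assumes Python's `re` module as the regex flavor, which this note does not specify; the sample strings are invented for illustration.

```python
import re

# Outside brackets, a backslash before an operator makes it literal.
price = re.search(r"\$[0-9]+\.[0-9]{2}", "Total: $19.99").group()

# Inside brackets, these operators are already literal.
operators = re.findall(r"[*+?{}.]", "a*b+c?")

# re.escape() adds the backslashes when the literal text comes from elsewhere.
literal = re.escape("1+1=2?")
found = re.search(literal, "Is 1+1=2? Yes.") is not None

print(price)      # $19.99
print(operators)  # ['*', '+', '?']
print(found)      # True
```

`re.escape()` is specific to Python; other flavors may offer a different helper or none at all.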
## Match start or end of string (^ and $)
To match patterns at the beginning or end of the string, use the operators **\^** and **\$** , respectively. For example:
|Example |Matches|
|-----------|-------|
|^The |Any string that starts with The|
|of despair$ |Any string that ends with of despair|
|^abc$ |A string that starts and ends with abc—an exact match|
_Tip: If neither **\^** nor **\$** is used, the pattern matches any string that contains the specified characters. For example, notice (with no **^** or **$**) returns any string that contains notice._
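
As a rough illustration of the anchors, the sketch below filters a few made-up strings. It assumes Python's `re` module, where `re.search` looks for the pattern anywhere in the string unless an anchor pins it down.

```python
import re

lines = ["The depths of despair", "Out of despair", "The end"]

# ^ pins the match to the start, $ pins it to the end.
starts = [s for s in lines if re.search(r"^The", s)]
ends = [s for s in lines if re.search(r"of despair$", s)]

# Both anchors together require an exact match.
exact = [s for s in ["abc", "abcabc", "xabc"] if re.search(r"^abc$", s)]

# With no anchors, any string containing the pattern matches.
contains = [s for s in ["notice", "unnoticed", "note"] if re.search(r"notice", s)]

print(starts)    # ['The depths of despair', 'The end']
print(ends)      # ['The depths of despair', 'Out of despair']
print(exact)     # ['abc']
print(contains)  # ['notice', 'unnoticed']
```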
## Match characters (*, +, and ?)
To match patterns based on a specific character, follow the character with the operator **\*, +, or ?**. These operators indicate the number of times the character should occur for a match—zero or more, one or more, or one or zero, respectively. For example:
|Example |Matches|
|-------|-------|
|ab* |A string that contains a, followed by zero or more bs—ac, abc, or abbc|
|ab+ |A string that contains a, followed by one or more bs—abc or abbc, but not ac|
|ab? |A string that contains a, followed by zero or one bs: ac or abc, but not abbc|
|a?b+$ |A string that ends with one or more bs, optionally preceded by a single a; for example, ab, abb, b, or bb. To rule out aab or aabb, anchor the start as well: ^a?b+$|
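
To compare the three quantifiers side by side, here is a small sketch assuming Python's `re` module. The helper `whole` is hypothetical and uses `re.fullmatch`, so the entire string has to fit the pattern rather than merely contain it.

```python
import re

def whole(pattern, text):
    """Return True when the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None

candidates = ["a", "ab", "abb"]

star = [s for s in candidates if whole(r"ab*", s)]      # zero or more bs
plus = [s for s in candidates if whole(r"ab+", s)]      # one or more bs
optional = [s for s in candidates if whole(r"ab?", s)]  # zero or one b

print(star)      # ['a', 'ab', 'abb']
print(plus)      # ['ab', 'abb']
print(optional)  # ['a', 'ab']
```
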
## Match characters' frequency ({...} or (...))
To match a pattern based on how often a single character occurs, follow it with the number or range of instances, wrapped in braces **({...})** . For example:
|Example |Matches|
|-----|------|
|ab{2} |A string that contains a, followed by exactly 2 bs—abb|
|ab{2,} |A string that contains a, followed by at least 2 bs—abb, abbbb, etc.|
|ab{3,5} |A string that contains a, followed by three to five bs—abbb, abbbb, or abbbbb|
_Tip: Always specify the first number of a range: {0,2}, not {,2}. Instead of the ranges {0,}, {1,}, or {0,1}, you can use the operators *, +, or ?, respectively._
To match a pattern based on how often a sequence of characters occurs, wrap it in parentheses **((...))**. For example, **a(bc){1,5}** matches a string that contains a, followed by one to five instances of bc.
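
The brace ranges can be checked with a short sketch. It assumes Python's `re` module; `whole` and `accepted_counts` are hypothetical helpers that report how many repetitions each pattern accepts when the whole string must match.

```python
import re

def whole(pattern, text):
    """Return True when the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None

def accepted_counts(pattern, unit, upto=7):
    """Try a followed by 0..upto copies of unit; return the accepted counts."""
    return [n for n in range(upto + 1) if whole(pattern, "a" + unit * n)]

exactly_two = accepted_counts(r"ab{2}", "b")
at_least_two = accepted_counts(r"ab{2,}", "b")
three_to_five = accepted_counts(r"ab{3,5}", "b")
bc_groups = accepted_counts(r"a(bc){1,5}", "bc")

print(exactly_two)    # [2]
print(at_least_two)   # [2, 3, 4, 5, 6, 7]
print(three_to_five)  # [3, 4, 5]
print(bc_groups)      # [1, 2, 3, 4, 5]
```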
## Match one of multiple patterns (|)
To match one of multiple patterns—such as this OR that—use the OR operator **|** . For example:
|Example |Matches|
|-----|-----|
|hi&#124;hello |A string that contains either hi or hello|
|(b&#124;cd)ef |A string that contains either bef or cdef|
|(a&#124;b)*c |A string that has any sequence of as and bs (possibly empty), ending with c|
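
A brief sketch of the OR operator, assuming Python's `re` module. The patterns are written without spaces around the pipe, because a space inside a pattern is matched literally; the test strings are made up.

```python
import re

# Either alternative may appear anywhere in the string.
greeting = [s for s in ["hi there", "hello", "hey"] if re.search(r"hi|hello", s)]

# Parentheses limit the alternation to part of the pattern.
grouped = [s for s in ["bef", "cdef", "def"] if re.fullmatch(r"(b|cd)ef", s)]

# A repeated group accepts any mix of as and bs, including none, before the c.
repeated = [s for s in ["c", "abbac", "abxc"] if re.fullmatch(r"(a|b)*c", s)]

print(greeting)  # ['hi there', 'hello']
print(grouped)   # ['bef', 'cdef']
print(repeated)  # ['c', 'abbac']
```
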
## Match any character (.)
To represent any character in a pattern to match, use the wildcard operator **.** . For example:
|Example |Matches|
|----|-----|
|a.[0-9] |A string that contains a, followed by any character and a digit|
|^.{3}$ |Any string of exactly three characters|
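
The wildcard examples can be tried out as below. This is a sketch assuming Python's `re` module, where `.` matches any character except a newline unless the `re.DOTALL` flag is set; other flavors may differ on that point.

```python
import re

# a, then any one character, then a digit.
wild = re.search(r"a.[0-9]", "xa-7y").group()

# Exactly three characters of any kind, spaces included.
three = [s for s in ["abc", "ab", "abcd", "a c"] if re.search(r"^.{3}$", s)]

# By default the wildcard does not cross a line break in Python.
skips_newline = re.search(r"a.c", "a\nc") is None

print(wild)           # a-7
print(three)          # ['abc', 'a c']
print(skips_newline)  # True
```
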
## Match character position ([...])
To match any single character from a set or range, list the allowed characters in brackets **([...])**. For example:
|Example |Matches|
|----|-----|
|[ab] |A string that contains either a or b; equivalent to a&#124;b|
|[a-d]|A string that contains a lowercase a, b, c, or d; equivalent to a&#124;b&#124;c&#124;d or [abcd]|
| ^[a-zA-Z] |A string that starts with any letter, regardless of case|
| [0-9]% |A string that contains any single digit followed by a percent sign|
| ,[a-zA-Z0-9]$ |A string that ends with a comma followed by any letter or digit|
_Note: All characters within brackets are taken literally, and not as regex operators. For example, **[*\+?{}.]** matches any of the literal characters within the brackets._
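
The bracket examples from the table can be sketched like this, assuming Python's `re` module; the input strings are invented for illustration.

```python
import re

# [ab] matches one character at a time: every a or b in the text.
either = re.findall(r"[ab]", "cabbage")

# A range combined with ^ checks only the first character.
letter_first = [s for s in ["Zebra", "9lives", "apple"] if re.search(r"^[a-zA-Z]", s)]

# One digit followed by a percent sign; only the last digit of 15 is included.
percents = re.findall(r"[0-9]%", "up 15% and 7%")

# A comma followed by a single letter or digit at the end of the string.
comma_end = [s for s in ["x,9", "x,!", "x,a"] if re.search(r",[a-zA-Z0-9]$", s)]

print(either)        # ['a', 'b', 'b', 'a']
print(letter_first)  # ['Zebra', 'apple']
print(percents)      # ['5%', '7%']
print(comma_end)     # ['x,9', 'x,a']
```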
## Match unwanted characters ([^...])
To match any character that is not in a set, start the set with the **^** operator inside the brackets. For example, **%[^a-zA-Z]%** matches a string with any non-letter character between two percent signs.
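
A negated set can be sketched as follows, assuming Python's `re` module. The last two lines illustrate a common slip: in ASCII the range A-z also covers the punctuation that sits between Z and a (such as the underscore), so a negated set written with A-z rejects more than letters.

```python
import re

pattern = r"%[^a-zA-Z]%"

# Only strings with a non-letter between the percent signs match.
hits = [s for s in ["%1%", "%a%", "% %", "%Z%"] if re.search(pattern, s)]

# With A-Z the underscore counts as a non-letter; with A-z it does not.
with_upper_range = re.search(r"%[^a-zA-Z]%", "%_%") is not None
with_typo_range = re.search(r"%[^a-zA-z]%", "%_%") is not None

print(hits)              # ['%1%', '% %']
print(with_upper_range)  # True
print(with_typo_range)   # False
```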