Perl | Regex Cheat Sheet
  • Last Updated : 31 Jul, 2019

Regex, or regular expressions (also called regexp), are an important part of Perl programming. A regular expression is a sequence of characters that forms a search pattern, which is then used to find matching text. After learning regular expressions, a user may still need a quick reminder of concepts that are not used often. This regex cheat sheet serves that purpose: it lists the character classes, metacharacters, modifiers, and other constructs used in regular expressions.

Character Classes

A character class matches a single character from a set of characters. Classes let the user match any character in a set or range without knowing in advance which one will appear.

Classes       Explanation
[abc.]        Matches exactly one of the listed characters: ‘a’, ‘b’, ‘c’, or ‘.’
[a-j]         Matches any one character in the range a to j.
[a-z]         Matches any one lowercase letter from a to z.
[^az]         Matches any one character except a and z.
\w            Matches a word character: [a-zA-Z0-9_] for ASCII text.
\d            Matches a digit: [0-9] for ASCII text.
[ab][^cde]    Matches a or b followed by any one character other than c, d, or e.
\s            Matches a whitespace character: space, form feed, tab, newline, or carriage return.
\W            Complement of \w
\D            Complement of \d
\S            Complement of \s

Example:




# Perl program to demonstrate
# a character class
use strict;
use warnings;

# Actual string
my $str = "45char";

# Prints "Match Found" if $str contains
# at least one word character (\w)
if ($str =~ /\w/)
{
    print "Match Found\n";
}

# Prints "Match Not Found" otherwise
else
{
    print "Match Not Found\n";
}

Output:

Match Found
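
The classes in the table above can be combined with the match and substitution operators. The following is a minimal sketch; the sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input
my $input = "Order #4521, qty 3";

# \d+ collects each run of digits.
my @numbers = $input =~ /\d+/g;                        # ("4521", "3")

# \D is the complement of \d: delete everything that is not a digit.
(my $digits_only = $input) =~ s/\D//g;                 # "45213"

# A negated class matches any single character that is not listed.
my ($first_non_vowel) = "aeiou-xyz" =~ /([^aeiou])/;   # "-"

# \w includes the underscore as well as letters and digits.
my $is_identifier = "user_id42" =~ /^\w+$/ ? 1 : 0;    # 1

print "numbers: @numbers\n";
print "digits only: $digits_only\n";
print "first non-vowel: $first_non_vowel\n";
print "identifier: $is_identifier\n";
```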

Anchors

Anchors do not match any character at all. Instead, they match a position: before, after, or between characters.



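Anchors      Explanation
^            Matches at the beginning of the string (or of each line with the m modifier).
$            Matches at the end of the string or before a final newline (or at the end of each line with the m modifier).
\b           Matches at a word boundary, i.e. between a \w character and a \W character.
\A           Matches only at the beginning of the string.
\Z           Matches at the end of the string or before a final newline.
\z           Matches only at the end of the string.
\G           Matches at the position where the previous g match ended, as reported by pos().
\p{…}        Unicode character class such as IsLower, IsAlpha etc.
\P{…}        Complement of a Unicode character class.
[:class:]    POSIX character class such as digit, lower, ascii etc. It is used inside a bracketed class, e.g. [[:digit:]].

The last three entries match characters rather than positions, so strictly speaking they are character classes, not anchors.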

Example:




# Perl program to demonstrate
# a POSIX character class in regex
use strict;
use warnings;

# Actual string
my $str = "55";

# Prints "Match Found" if $str contains
# an alphabetic character ([:alpha:])
if ($str =~ /[[:alpha:]]/)
{
    print "Match Found\n";
}

# Prints "Match Not Found" otherwise
else
{
    print "Match Not Found\n";
}

Output:

Match Not Found
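
To show the position anchors themselves, here is a small sketch that contrasts ^, \Z, \z, and \b. The two-line sample string is an assumption chosen so that the trailing newline makes the difference between \Z and \z visible.

```perl
use strict;
use warnings;

# Hypothetical sample string: two lines, ending in a newline.
my $text = "first line\nlast line\n";

# Without the m modifier, ^ matches only at the very start of the string.
my $starts_with_first = $text =~ /^first/ ? 1 : 0;       # 1

# With the m modifier, ^ also matches right after each embedded newline.
my $line_starts_with_last = $text =~ /^last/m ? 1 : 0;   # 1

# \Z matches at the end of the string or just before a final newline.
my $ends_before_newline = $text =~ /line\Z/ ? 1 : 0;     # 1

# \z matches only at the absolute end, so the final newline blocks it.
my $ends_at_absolute_end = $text =~ /line\z/ ? 1 : 0;    # 0

# \b marks each boundary between a word and a non-word character.
my @words = $text =~ /\b(\w+)\b/g;                       # 4 words

print "^first: $starts_with_first\n";
print "^last with m: $line_starts_with_last\n";
print "line\\Z: $ends_before_newline\n";
print "line\\z: $ends_at_absolute_end\n";
print "words: ", scalar(@words), "\n";
```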

Meta Characters

Metacharacters are characters that have a special meaning in Perl regular expressions. To match a metacharacter literally, it must be escaped with a backslash.

Characters    Explanation
^             Matches the beginning of the string.
$             Matches the end of the string.
.             Matches any character except newline.
*             Matches 0 or more times.
+             Matches 1 or more times.
?             Matches 0 or 1 time.
()            Used for grouping.
\             Escapes the next metacharacter or introduces a special sequence such as \d.
[]            Used for a set of characters.
{}            Used as a repetition quantifier.
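
As a short sketch of escaping, the snippet below compares an unescaped dot with an escaped one and then uses Perl's built-in quotemeta function to escape a whole string at once. The sample strings are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical sample text containing several metacharacters
my $price = 'Total: $3.50 (approx.)';

# Unescaped, '.' matches any character, so "3x50" also matches.
my $loose = '3x50' =~ /3.50/ ? 1 : 0;           # 1

# Escaped, '\.' matches only a literal dot.
my $exact = '3x50' =~ /3\.50/ ? 1 : 0;          # 0

# quotemeta escapes every metacharacter in a string, so the
# result can be interpolated into a pattern and matched literally.
my $literal = quotemeta('$3.50 (approx.)');
my $found   = $price =~ /$literal/ ? 1 : 0;     # 1

print "loose: $loose, exact: $exact, found: $found\n";
```

Writing /\Q$string\E/ inside a pattern has the same effect as calling quotemeta on the interpolated variable.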

Quantifiers

Quantifiers specify how many times the preceding character or group may occur. There are three basic quantifiers:

  • ‘?’ matches 0 or 1 occurrence of the character.
  • ‘+’ matches 1 or more occurrences of the character.
  • ‘*’ matches 0 or more occurrences of the character.

Using Quantifiers    Explanation
a?                   It checks if ‘a’ occurs 0 or 1 time.
a+                   It checks if ‘a’ occurs 1 or more times.
a*                   It checks if ‘a’ occurs 0 or more times.
a{2,6}               It checks if ‘a’ occurs 2 to 6 times.
a{2,}                It checks if ‘a’ occurs 2 or more times.
a{2}                 It checks if ‘a’ occurs exactly 2 times.

Write the counts without spaces inside the braces: before Perl 5.34, a{2, 6} is treated as literal text rather than as a quantifier.

Example:




# Perl program to demonstrate
# use of quantifiers in regex
use strict;
use warnings;

# Actual string
my $str = "color";

# Prints "Match Found" if $str contains
# "color" or "colour" (the ? makes 'u' optional)
if ($str =~ /colou?r/)
{
    print "Match Found\n";
}

# Prints "Match Not Found" otherwise
else
{
    print "Match Not Found\n";
}

Output:

Match Found

Modifiers

The single-letter modifiers follow the closing delimiter of the pattern (for example /pattern/gi); the (?…) constructs are written inside the pattern.

Modifiers        Explanation
g                Matches globally, i.e. finds or replaces all occurrences.
gc               With g, keeps the current position pos() after a match fails, so the search can continue from there.
s                Treats the string as a single line, so ‘.’ also matches newline.
i                Turns off case sensitivity.
x                Ignores whitespace in the pattern and allows comments.
(?#text)         Adds a comment inside the pattern.
(?:pattern)      Groups the pattern without capturing.
(?|pattern)      Branch reset: capture groups are numbered from the same starting point in each alternative.
(?=pattern)      Positive lookahead assertion.
(?!pattern)      Negative lookahead assertion.
(?<=pattern)     Positive lookbehind assertion.
(?<!pattern)     Negative lookbehind assertion.
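
The sketch below exercises several of these modifiers and assertions together. All sample strings and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample text
my $text = "Perl perl PERL";

# i ignores case; g in list context returns every match.
my @hits = $text =~ /perl/gi;                       # 3 matches

# x lets the pattern contain whitespace and comments.
my ($year, $month) = "2019-07-31" =~ /
    (\d{4})   # year
    -
    (\d{2})   # month
/x;

# s lets '.' match a newline as well.
my $spans_lines = "a\nb" =~ /a.b/s ? 1 : 0;         # 1

# Positive lookahead: digits only when followed by "px";
# the "px" itself is not part of the match.
my ($width) = "width: 120px" =~ /(\d+)(?=px)/;      # "120"

# Negative lookbehind: "bar" that is not preceded by "foo".
my $plain_bar = "xbar" =~ /(?<!foo)bar/ ? 1 : 0;    # 1

print "hits: ", scalar(@hits), "\n";
print "year: $year, month: $month\n";
print "width: $width\n";
```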

White Space Modifiers

Escapes    Explanation
\t         Matches a tab character.
\r         Matches a carriage return character.
\n         Matches a newline character.
\h         Matches a horizontal whitespace character.
\v         Matches a vertical whitespace character.
\L         Converts the following text to lowercase, up to \E.
\U         Converts the following text to uppercase, up to \E.
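
A minimal sketch of these escapes in use, assuming Perl 5.10 or later (where \h and \v are available). The sample line and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input: a tab, two spaces, and a CRLF line ending
my $line = "name\t=  value\r\n";

# \h+ matches runs of horizontal whitespace (tabs and spaces) only.
(my $squeezed = $line) =~ s/\h+/ /g;

# \v matches vertical whitespace such as \r and \n.
(my $trimmed = $squeezed) =~ s/\v+\z//;     # "name = value"

# \U and \L change case in double-quoted strings and in
# replacement text; \E ends the effect.
my $word  = "regex";
my $upper = "\U$word\E";                    # "REGEX"
my $mixed = "\LHELLO\E World";              # "hello World"

print "$trimmed\n$upper\n$mixed\n";
```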

Quantifiers – Modifiers

Maximal    Minimal    Explanation
?          ??         It can occur 0 or 1 time.
+          +?         It can occur 1 or more times.
*          *?         It can occur 0 or more times.
{3}        {3}?       Must match exactly 3 times.
{3,}       {3,}?      Must match at least 3 times.
{3,7}      {3,7}?     Must match at least 3 times but not more than 7 times.
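
Both columns accept the same number of repetitions; the difference is that a maximal (greedy) quantifier takes as many as it can, while a minimal (non-greedy) one takes as few as it can. A short sketch with made-up sample strings:

```perl
use strict;
use warnings;

# Hypothetical sample markup
my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .* grabs as much as possible, so the match runs to the last '>'.
my ($greedy) = $html =~ /<(.*)>/;         # "b>bold</b> and <i>italic</i"

# Non-greedy: .*? stops at the first '>' that lets the match succeed.
my ($lazy) = $html =~ /<(.*?)>/;          # "b"

# The same applies to counted quantifiers.
my ($max) = "aaaaaaaa" =~ /(a{3,7})/;     # 7 characters
my ($min) = "aaaaaaaa" =~ /(a{3,7}?)/;    # 3 characters

print "greedy: $greedy\n";
print "lazy: $lazy\n";
print "max: ", length($max), ", min: ", length($min), "\n";
```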

Grouping and Capturing

The grouping construct (…) creates capture groups, also known as capture buffers. Inside the regex, a group is referred to as ‘\1’; outside the regex, it is referred to as ‘$1’. Fetching the groups by assigning the match to variables in list context is known as capturing.

(…)           Used for grouping and capturing.
\1, \2, \3    Backreferences: inside the pattern, these refer to the text captured by groups 1, 2, and 3.
$1, $2, $3    Capture variables: after a successful match, these hold the text captured by groups 1, 2, and 3.
(?:…)         Used to group without capturing (sets neither $1 nor \1).
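
The following sketch shows each form from the table: list-context capture, the $1 variables, a \1 backreference, and a non-capturing group. The sample strings and variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical sample date
my $date = "31 Jul, 2019";

# In list context, the match returns the captured groups.
my ($day, $month, $year) = $date =~ /(\d+)\s+(\w+),\s+(\d+)/;

# After a successful match, the same groups are in $1, $2, $3.
my $from_vars = '';
if ($date =~ /(\d+)\s+(\w+),\s+(\d+)/) {
    $from_vars = "$3-$2-$1";                             # "2019-Jul-31"
}

# \1 refers back to group 1 inside the same pattern: find a doubled word.
my ($doubled) = "this is is a test" =~ /\b(\w+)\s+\1\b/;   # "is"

# (?:...) groups without capturing, so the digits are group 1.
my ($amount) = "size: 42kb" =~ /(?:size|len):\s*(\d+)/;    # "42"

print "$day / $month / $year\n";
print "$from_vars\n";
print "doubled: $doubled, amount: $amount\n";
```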
