Perl Regex Cheat Sheet


http://perldoc.perl.org/perlre.html



m
Treat string as multiple lines. That is, change "^" and "$" from matching the start or end of the string to matching the start or end of any line anywhere within the string.
s
Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.
i
Do case-insensitive pattern matching.
x
Extend your pattern's legibility by permitting whitespace and comments. 
p
Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} are available for use after matching.
g and c
Global matching, and keep the Current position after failed matching. Unlike i, m, s and x, these two flags affect the way the regex is used rather than the regex itself. See "Using regular expressions in Perl" in perlretut for further explanation of the g and c modifiers.
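a, d, l and u
These modifiers, all new in 5.14, affect which character-set rules (Unicode, etc.) are used: a restricts \d, \s, \w and the POSIX classes to ASCII, d uses the native pre-5.14 default rules, l uses the current locale, and u uses Unicode rules. See "Character set modifiers" in perlre.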
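
The modifiers above are easiest to compare side by side. The following is a minimal sketch, not taken from perlre; the sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "alpha\nBeta\ngamma\n";

# /m with /g: "^" matches at the start of every line.
my @line_starts = $text =~ /^(\w)/mg;

# "." stops at a newline unless /s is given.
my $dot_plain  = $text =~ /alpha.Beta/  ? 1 : 0;
my $dot_single = $text =~ /alpha.Beta/s ? 1 : 0;

# /i ignores case, so "beta" matches the line "Beta".
my $case_blind = $text =~ /^beta$/mi ? 1 : 0;

# /x allows layout and comments inside the pattern; /ms combine as described.
my ($first, $last) = $text =~ /
    \A (\w+)     # first word of the string
    .*           # anything, including newlines, because of s
    ^ (\w+) $    # a whole line, because of m
/xms;

# /g in scalar context walks the string; /c keeps pos() after the final failure.
my $scan = "aab";
1 while $scan =~ m{a}gc;
my $kept_pos = pos($scan);
```

Without c, the failed third match would reset pos($scan) to undef; with it, the position stays at 2, which is what lexer-style code built on \G relies on.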



    \        Quote the next metacharacter
    ^        Match the beginning of the line
    .        Match any character (except newline)
    $        Match the end of the line (or before newline at the end)
    |        Alternation
    ()       Grouping
    []       Bracketed Character class
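
A short sketch of the metacharacters listed above in use. The test strings and variable names are illustrative assumptions, not part of perlre.

```perl
use strict;
use warnings;

# "\" quotes the metacharacter ".", so only a literal dot matches.
my $literal_dot = "a.b" =~ /^a\.b$/ ? 1 : 0;
my $other_char  = "axb" =~ /^a\.b$/ ? 1 : 0;

# Unquoted, "." matches any character except a newline.
my $any_char = "axb"  =~ /^a.b$/ ? 1 : 0;
my $newline  = "a\nb" =~ /^a.b$/ ? 1 : 0;

# "|" alternates, "()" groups and captures, "[]" is a character class.
my ($animal, $digit) = "cat7" =~ /^(cat|dog)([0-9])$/;

# "$" also matches just before a newline that ends the string.
my $before_newline = "end\n" =~ /end$/ ? 1 : 0;
```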



    *           Match 0 or more times
    +           Match 1 or more times
    ?           Match 1 or 0 times
    {n}         Match exactly n times
    {n,}        Match at least n times
    {n,m}       Match at least n but not more than m times
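
To see how the quantifiers differ, the sketch below filters one list of candidate strings with each of them. The word list is made up for illustration.

```perl
use strict;
use warnings;

my @candidates = qw(ac abc abbc abbbc abbbbc);

my @star     = grep { /^ab*c$/ }     @candidates;  # zero or more "b"
my @plus     = grep { /^ab+c$/ }     @candidates;  # one or more "b"
my @optional = grep { /^ab?c$/ }     @candidates;  # zero or one "b"
my @exact    = grep { /^ab{2}c$/ }   @candidates;  # exactly two "b"
my @at_least = grep { /^ab{3,}c$/ }  @candidates;  # three or more "b"
my @range    = grep { /^ab{2,3}c$/ } @candidates;  # two or three "b"

print "range: @range\n";   # prints "range: abbc abbbc"
```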



Sequence   Note    Description
  [...]     [1]  Match a character according to the rules of the
                   bracketed character class defined by the "...".
                   Example: [a-z] matches "a" or "b" or "c" ... or "z"
  [[:...:]] [2]  Match a character according to the rules of the POSIX
                   character class "..." within the outer bracketed
                   character class.  Example: [[:upper:]] matches any
                   uppercase character.
  \w        [3]  Match a "word" character (alphanumeric plus "_", plus
                   other connector punctuation chars plus Unicode
                   marks)
  \W        [3]  Match a non-"word" character
  \s        [3]  Match a whitespace character
  \S        [3]  Match a non-whitespace character
  \d        [3]  Match a decimal digit character
  \D        [3]  Match a non-digit character
  \pP       [3]  Match P, named property.  Use \p{Prop} for longer names
  \PP       [3]  Match non-P
  \X        [4]  Match Unicode "eXtended grapheme cluster"
  \C             Match a single C-language char (octet) even if that is
                   part of a larger UTF-8 character.  Thus it breaks up
                   characters into their UTF-8 bytes, so you may end up
                   with malformed pieces of UTF-8.  Unsupported in
                   lookbehind.  Removed in Perl 5.24, where using it is
                   a compile-time error.
  \1        [5]  Backreference to a specific capture group or buffer.
                   '1' may actually be any positive integer.
  \g1       [5]  Backreference to a specific or previous group,
  \g{-1}    [5]  The number may be negative indicating a relative
                   previous group and may optionally be wrapped in
                   curly brackets for safer parsing.
  \g{name}  [5]  Named backreference
  \k<name>  [5]  Named backreference
  \K        [6]  Keep the stuff left of the \K, don't include it in $&
  \N        [7]  Any character but \n.  Not affected by /s modifier
  \v        [3]  Vertical whitespace
  \V        [3]  Not vertical whitespace
  \h        [3]  Horizontal whitespace
  \H        [3]  Not horizontal whitespace
  \R        [4]  Linebreak
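
A few of the sequences in the table are easier to understand from an example. This is a minimal sketch with invented sample strings. The named group syntax (?<q>...) does not appear in the table and is used here only because \k<name> needs a named group to refer to.

```perl
use strict;
use warnings;

# \d: collect every run of decimal digits.
my @numbers = "id 7, qty 42" =~ /(\d+)/g;

# \p{Prop}: match by Unicode property, here uppercase letters.
my @upper = "aBcD" =~ /(\p{Lu})/g;

# \2 and \g{-1} both refer to the inner group (\w) in these patterns.
my ($abs_pair) = "hello" =~ /((\w)\2)/;
my ($rel_pair) = "hello" =~ /((\w)\g{-1})/;

# \k<q> must match the same quote character that the group q captured.
my (undef, $quoted) = q{say "hi" now} =~ /(?<q>["'])(.*?)\k<q>/;

# \K: text to its left must match but is kept out of the replaced part.
(my $priced = "price: 42") =~ s/price:\s*\K\d+/100/;

# \R matches any linebreak, treating "\r\n" as one unit; \h is horizontal space.
my @lines = split /\R/,  "a\r\nb\nc";
my @cols  = split /\h+/, "x \t y";
```

The relative form \g{-1} keeps working if another group is later added at the front of the pattern, whereas \2 would then have to be renumbered.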


    \b  Match a word boundary
    \B  Match except at a word boundary
    \A  Match only at beginning of string
    \Z  Match only at end of string, or before newline at the end
    \z  Match only at end of string
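
The differences between these assertions mostly show up around a trailing newline and under the m modifier. A small sketch with made-up strings:

```perl
use strict;
use warnings;

my $text = "one\ntwo\n";

# \Z tolerates a newline at the very end of the string; \z does not.
my $big_z      = $text =~ /two\Z/   ? 1 : 0;
my $small_z    = $text =~ /two\z/   ? 1 : 0;
my $small_z_nl = $text =~ /two\n\z/ ? 1 : 0;

# \A is anchored to the start of the string even under /m; "^" is not.
my $string_start = $text =~ /\Atwo/m ? 1 : 0;
my $line_start   = $text =~ /^two/m  ? 1 : 0;

# \b needs a word boundary on that side; \B needs the absence of one.
my @whole_words = "cat concat cat." =~ /\bcat\b/g;
my $inside_word = "concat" =~ /\Bcat/ ? 1 : 0;
```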

