Character classes let you match any one of a set of characters without too much hassle.
One way to do this is to put the set of characters you want to match inside [].
For instance, [0123456789] matches any single digit. You can also negate a character class
by placing a caret at the front of it: [^0123456789] matches any single character that is not a digit.
Writing these sets out by hand can be kind of cumbersome, so you shouldn't be surprised that Perl makes your life much easier by
defining some character class abbreviations. Each one is a letter preceded by a
backslash. For example, a \d in your regular expression matches any digit.<BR><BR>
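To make this concrete, here is a minimal sketch comparing a bracketed class, its negated form, and the \d abbreviation. The sub names and sample strings are made up for illustration and are not part of any library.<BR>

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this example.

# True if the text contains at least one character from the set 0-9.
sub has_digit    { my ($text) = @_; return $text =~ /[0123456789]/  ? 1 : 0 }

# True if the text contains at least one character that is NOT in the set.
sub has_nondigit { my ($text) = @_; return $text =~ /[^0123456789]/ ? 1 : 0 }

# Same test as has_digit, written with the \d abbreviation.
sub has_d        { my ($text) = @_; return $text =~ /\d/            ? 1 : 0 }

# Sample strings chosen only for illustration.
for my $text ('room 42', 'no digits here', '7') {
    printf "%-16s digit=%d  nondigit=%d  \\d=%d\n",
        "'$text'", has_digit($text), has_nondigit($text), has_d($text);
}
```

Note that a negated class still has to match one character: '7' contains no non-digit, so [^0123456789] fails on it.<BR><BR>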
Now for a quick word about metacharacters. Metacharacters are characters that have special meaning within regular
expressions, so if you put them into a regular expression they won't match literally unless you precede the
metacharacter with a \. The metacharacters are \ | ( ) [ { ^ $ . ? * and +. Here is a quick word about the most common ones before
we return to character class abbreviations.
<BR>
<TABLE border=1 cellpadding=2>
<TR><TD>Metacharacter(s)</TD><TD>Meaning</TD></TR>
<TR><TD>.</TD><TD>Matches any character besides newline</TD></TR>
<TR><TD>()</TD><TD>Used for grouping characters</TD></TR>
<TR><TD>[]</TD><TD>Used for defining character classes</TD></TR>
<TR><TD>|</TD><TD>Alternation: matches either the pattern on its left or the pattern on its right</TD></TR>
<TR><TD>\</TD><TD>Begins a character class abbreviation, or makes the following metacharacter match literally</TD></TR>
<TR><TD>*</TD><TD>Quantifier: matches 0 or more of the previous character or group of characters</TD></TR>
<TR><TD>?</TD><TD>Quantifier: matches 0 or 1 of the previous character or group; placed after another quantifier, makes that quantifier nongreedy</TD></TR>
<TR><TD>^</TD><TD>Matches the beginning of a string (or line if /m is used)</TD></TR>
<TR><TD>$</TD><TD>Matches the end of a string (or line if /m is used)</TD></TR>
</TABLE>
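<BR>
The metacharacters in the table can be sketched in a few lines of Perl. The strings below are invented for illustration; treat this as a rough demonstration rather than a complete tour.<BR>

```perl
use strict;
use warnings;

# Sample text chosen only for illustration (single quotes, so $ is not interpolated).
my $price = 'Total: $3.50 (approx.)';

# Unescaped, . matches any character except newline, so it also matches the x here.
my $dot_any = '3x50' =~ /3.50/ ? 1 : 0;

# Preceded by a backslash, $ and . match a literal dollar sign and period.
my $literal = $price =~ /\$3\.50/ ? 1 : 0;

# | is alternation; () groups the alternatives and captures what matched in $1.
my ($animal) = 'I like dogs' =~ /(cat|dog)s/;

# * is greedy by default and grabs as much as it can;
# following it with ? makes it nongreedy, so it stops at the first >.
my $html = '<b>bold</b>';
my ($greedy) = $html =~ /(<.*>)/;
my ($lazy)   = $html =~ /(<.*?>)/;

# ^ and $ anchor the match to the start and end of the string.
my $all_digits = '12345' =~ /^[0123456789]*$/ ? 1 : 0;

# With /m they match at the start and end of each line instead.
my $line_match = "first\nsecond" =~ /^second$/m ? 1 : 0;

print "dot_any=$dot_any literal=$literal animal=$animal\n";
print "greedy=$greedy lazy=$lazy\n";
print "all_digits=$all_digits line_match=$line_match\n";
```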
<BR>
<BR>Now let's define some character class abbreviations:<BR><BR>
<TABLE border=1 cellpadding=2>
<TR><TD>Character Class</TD><TD>Meaning</TD></TR>
<TR><TD>\d</TD><TD>digit or [0123456789]</TD></TR>
<TR><TD>\D</TD><TD>nondigit or [^0123456789]</TD></TR>
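<TR><TD>\w</TD><TD>word character (alphanumeric or underscore) or [a-zA-Z_0-9]</TD></TR>
<TR><TD>\W</TD><TD>nonword character or [^a-zA-Z_0-9]</TD></TR>
<TR><TD>\b</TD><TD>word boundary (a zero-width position between a \w and a \W character; it matches no character itself)</TD></TR>
<TR><TD>\s</TD><TD>whitespace character or [ \t\r\n\f]</TD></TR>
<TR><TD>\S</TD><TD>non-whitespace character or [^ \t\r\n\f]</TD></TR>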
</TABLE>
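<BR>
Here is a short sketch that puts the abbreviations from the table to work. The log line and phone number are invented sample data, and the variable names are only for illustration.<BR>

```perl
use strict;
use warnings;

# Hypothetical log line: two spaces after the user name and a tab before "at".
my $line = "user_42  logged in\tat 09:15";

# \w matches letters, digits, and underscore; \w* grabs the leading run of them.
my ($user) = $line =~ /^(\w\w*)/;

# \d matches one digit: two digits, a colon, two digits.
my ($time) = $line =~ /(\d\d:\d\d)/;

# \s matches spaces and tabs alike, so one pattern splits on any run of whitespace.
my @fields = split /\s\s*/, $line;

# \b matches a position, not a character: "cat" must stand alone as a word.
my $whole_word = 'catalog cat' =~ /\bcat\b/ ? 1 : 0;
my $inside     = 'catalog'     =~ /\bcat\b/ ? 1 : 0;

# \D is the negation of \d: delete every nondigit to keep only the digits.
(my $digits_only = '(555) 123-4567') =~ s/\D//g;

print "user=$user time=$time fields=", scalar(@fields), "\n";
print "whole_word=$whole_word inside=$inside digits=$digits_only\n";
```

Only the * quantifier from the metacharacter table is used here, which is why \w\w* and \s\s* stand in for "one or more".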
<BR>
<BR>
That's a lot of information to get a handle on, so let's check out the [pattern-matching examples].