
Character class abbreviations let you match any one of a set of characters without much hassle. One way to define such a set is to list the characters you want to match inside []. For instance, [0123456789] matches any single digit. Spelling the set out like this can be cumbersome. You can also negate a character class by placing a caret at the front of it: [^0123456789] matches any character that is not a digit. You shouldn't be surprised that Perl makes your life much easier by defining some character class abbreviations. These are alphanumeric characters preceded by a backslash. For example, a \d in your regular expression matches any digit.
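As a small sketch of the three forms just described, the subs below test a string against the spelled-out class, its negation, and the \d abbreviation. The sub names are made up for illustration. For plain ASCII input the bracket form and \d behave the same; on Unicode strings \d can also match digits from other scripts.

```perl
use strict;
use warnings;

# Hypothetical helper names; each returns 1 on a match and 0 otherwise.

# Spelled-out character class: any one of the ten digits.
sub has_digit_class  { my ($s) = @_; return $s =~ /[0123456789]/  ? 1 : 0 }

# Negated class: any character that is NOT one of the ten digits.
sub has_non_digit    { my ($s) = @_; return $s =~ /[^0123456789]/ ? 1 : 0 }

# Abbreviation: \d stands in for the digit class.
sub has_digit_abbrev { my ($s) = @_; return $s =~ /\d/            ? 1 : 0 }

for my $s ('room 42', 'no digits', '2024') {
    printf "%-10s class=%d abbrev=%d nondigit=%d\n",
        $s, has_digit_class($s), has_digit_abbrev($s), has_non_digit($s);
}
```

Note that '2024' reports nondigit=0: a negated class still has to match some character, and that string has nothing but digits in it.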

Now for a quick word about metacharacters. Metacharacters are characters that have special meaning within regular expressions, so if you put them into a regular expression they won't match literally unless you precede them with a \. The metacharacters covered here are \ | ( ) [ ] $ ^ . ? * Here is a quick word about each of them before we return to character class abbreviations.

Metacharacter(s)	Meaning
.	Matches any character besides newline
()	Used for grouping characters
[]	Used for defining character classes
|	Used for alternation ("or") in a regular expression
\	Introduces a character class abbreviation, or makes the following metacharacter match literally
*	Quantifier matches 0 or more of the previous character or group of characters
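?	Quantifier that matches 0 or 1 of the previous character or group; placed after another quantifier, it makes that quantifier nongreedy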
^	Matches the beginning of a string (or line if /m is used)
$	Matches the end of a string (or line if /m is used)
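To make a few rows of this table concrete, here is a hedged sketch of escaping, greedy versus nongreedy matching, and alternation inside a group. The sub names and sample strings are invented for illustration and are not part of the original tutorial.

```perl
use strict;
use warnings;

# An unescaped . matches any character except newline ...
sub matches_any_dot     { my ($s) = @_; return $s =~ /^3.14$/  ? 1 : 0 }

# ... while \. matches only a literal dot.
sub matches_literal_dot { my ($s) = @_; return $s =~ /^3\.14$/ ? 1 : 0 }

# Greedy .* grabs as much as it can; .*? stops at the first chance.
sub greedy    { my ($s) = @_; return $s =~ /<(.*)>/  ? $1 : undef }
sub nongreedy { my ($s) = @_; return $s =~ /<(.*?)>/ ? $1 : undef }

# | offers alternatives; () keeps the alternation between the ^ and $ anchors.
sub is_pet { my ($s) = @_; return $s =~ /^(cat|dog)$/ ? 1 : 0 }

print "any dot, '3x14':     ", matches_any_dot('3x14'), "\n";
print "literal dot, '3x14': ", matches_literal_dot('3x14'), "\n";
print "greedy:    ", greedy('<a><b>'), "\n";
print "nongreedy: ", nongreedy('<a><b>'), "\n";
print "is_pet('dogs'): ", is_pet('dogs'), "\n";
```

Without the parentheses, /^cat|dog$/ would mean "starts with cat, or ends with dog", which is why the grouping matters when anchors and | are combined.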

Now let's define some character classes.

Character Class	Meaning
\d	digit or [0123456789]
\D	nondigit or [^0123456789]
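\w	word character (alphanumeric or underscore) or [a-zA-Z_0-9]
\W	nonword character or [^a-zA-Z_0-9]
\b	word boundary (matches a position between a \w and a \W character, not a character itself)
\s	whitespace character or [ \t\r\n\f]
\S	non-whitespace character or [^ \t\r\n\f]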
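The abbreviations in this table might be put to work along these lines. This is only a sketch: the sub names are hypothetical, and \Q...\E (which quotes any metacharacters in an interpolated variable) is an addition that the table does not cover.

```perl
use strict;
use warnings;

# Split a line into fields on runs of whitespace (\s+).
sub fields { my ($line) = @_; return split /\s+/, $line }

# Count whole-word occurrences; \b keeps "cat" from matching
# inside "concatenate". \Q...\E quotes metacharacters in $word.
sub count_word {
    my ($word, $text) = @_;
    my $count = () = $text =~ /\b\Q$word\E\b/g;
    return $count;
}

# Delete every nondigit (\D), leaving only the digits.
sub digits_only {
    my ($s) = @_;
    (my $digits = $s) =~ s/\D//g;
    return $digits;
}

# True if the string consists only of word characters (\w).
sub is_identifier { my ($s) = @_; return $s =~ /^\w+$/ ? 1 : 0 }

print join(',', fields("a  b\tc")), "\n";
print count_word('cat', 'cat concatenate cat.'), "\n";
print digits_only('(555) 123-4567'), "\n";
print is_identifier('foo_bar1'), is_identifier('foo bar'), "\n";
```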

That's a lot of information to get a handle on, so let's check out some pattern-matching examples.

In reply to Character Class Abbreviations by root


