http://www.perlmonks.org?node_id=854278


in reply to Re: Reading Reg Exp
in thread Reading Reg Exp

\s+                      whitespace (\n, \r, \t, \f, and " ")
That's actually incorrect. \s matches 25 different characters, although locale (and EBCDIC) can change the set of characters matched. Even in the LATIN-1 range, next line ("\x85") and no-break space ("\xA0") will be matched by \s if either the pattern or subject has the UTF-8 flag set.
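
For anyone who wants to see the UTF-8 flag effect, here is a minimal sketch. It assumes an ASCII platform and Perl's default regex semantics: no `use locale`, no `use feature 'unicode_strings'`, and no /a, /u or /l modifier. The `%matched` hash and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Assumption: ASCII platform, default ("/d") regex rules in effect.
my @chars = ( [ 'NEXT LINE', "\x85" ], [ 'NO-BREAK SPACE', "\xA0" ] );
my %matched;

for my $pair (@chars) {
    my ($name, $char) = @$pair;

    my $plain    = $char;        # stored as a single native byte
    my $upgraded = $char;
    utf8::upgrade($upgraded);    # same character, but the UTF-8 flag is now set

    $matched{$name} = {
        plain    => ( $plain    =~ /\s/ ? 1 : 0 ),
        upgraded => ( $upgraded =~ /\s/ ? 1 : 0 ),
    };

    printf "%-15s plain=%d upgraded=%d\n",
        $name, @{ $matched{$name} }{qw(plain upgraded)};
}
```

Under those assumptions I'd expect both characters to fail against \s in the plain string and match once the string is upgraded. Enabling `unicode_strings` or adding /u should make the plain case match too.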

Re^3: Reading Reg Exp
by kejohm (Hermit) on Aug 11, 2010 at 10:33 UTC

    YAPE::Regex::Explain is probably only set up for the most common uses; since it uses YAPE::Regex to parse the regex, it probably can't detect encoding or locale. Since it is only providing an explanation of the regex, in most cases it wouldn't really matter.

      But even ignoring locale or encoding, it's still not listing 80% of the characters the class can match. That's like saying [a-z] matches all the vowels.

        According to the perlrecharclass manpage:

        \s matches any single character that is considered whitespace. In the ASCII range, \s matches the horizontal tab (\t), the new line (\n), the form feed (\f), the carriage return (\r), and the space.

        It also says:

        Without a locale or EBCDIC code page, \s matches the five characters mentioned in the beginning of this paragraph.
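
One way to settle the count on your own perl is to enumerate what \s matches under each rule set. This is only a sketch: it assumes Perl 5.14 or later (for the /a and /u modifiers), and the totals depend on the Perl and Unicode versions in use, so I'm not quoting numbers. The array names are just for illustration.

```perl
use strict;
use warnings;

# Assumption: Perl 5.14+ so that /a (ASCII rules) and /u (Unicode rules) exist.
my (@ascii_rules, @unicode_rules);

# Stop below the surrogate range; every Unicode whitespace character
# defined so far sits below U+D800 (the highest is U+3000).
for my $cp (0 .. 0xD7FF) {
    my $char = chr $cp;
    push @ascii_rules,   $cp if $char =~ /\s/a;
    push @unicode_rules, $cp if $char =~ /\s/u;
}

printf "ASCII rules:   %d characters\n", scalar @ascii_rules;
printf "Unicode rules: %d characters\n", scalar @unicode_rules;
print  join(' ', map { sprintf 'U+%04X', $_ } @unicode_rules), "\n";
```

The difference between the two lists is what a five-character summary leaves out once Unicode rules apply.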

        Update: Link fixed.

        Consonants, rather :)