\s+                      whitespace (\n, \r, \t, \f, and " ")
That's actually incorrect. \s matches 25 different characters, although locale (and EBCDIC) can change the set of characters matched. Even in the Latin-1 range, next line ("\x85") and no-break space ("\xA0") are matched by \s if either the pattern or the subject has the UTF-8 flag set.
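A quick way to see the UTF-8 flag dependence for yourself is the sketch below. It assumes an ASCII platform, no "use locale", no unicode_strings feature, and no /u or /a modifier on the pattern; matches_space is just a helper name made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string contains a character
# matched by \s, 0 otherwise.
sub matches_space {
    my ($str) = @_;
    return $str =~ /\s/ ? 1 : 0;
}

for my $char ("\x85", "\xA0") {
    my $native   = $char;      # 8-bit string, UTF-8 flag off
    my $upgraded = $char;
    utf8::upgrade($upgraded);  # same character, UTF-8 flag on

    printf "U+%04X  flag off: %d  flag on: %d\n",
        ord($char), matches_space($native), matches_space($upgraded);
}
```

Under those assumptions, both characters should fail to match while the flag is off and match once the string has been upgraded, even though the two strings compare equal with eq.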