http://www.perlmonks.org?node_id=1003542


in reply to match utf8

None of them deal with UTF-8. The regex engine matches against Unicode code points (characters), not against UTF-8 encoded bytes. Decode your input first (e.g. using Encode's decode), then \w will work.
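A minimal sketch of the difference, assuming the input arrives as raw UTF-8 bytes (the sample string and variable names here are made up for illustration, not taken from the original question):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the two UTF-8 bytes that encode "é" (U+00E9),
# as you would get them from a filehandle with no decoding layer.
my $bytes = "\xC3\xA9";

# Turn the byte string into a character string (one code point).
my $chars = decode('UTF-8', $bytes);

printf "undecoded length: %d\n", length $bytes;   # 2 (bytes)
printf "decoded length:   %d\n", length $chars;   # 1 (character)

# A single \w cannot cover the undecoded two-byte string,
# but it does match the one decoded character.
my $byte_match = $bytes =~ /\A\w\z/ ? 1 : 0;
my $char_match = $chars =~ /\A\w\z/ ? 1 : 0;

print "undecoded matches /\\A\\w\\z/: $byte_match\n";   # 0
print "decoded matches /\\A\\w\\z/:   $char_match\n";   # 1
```

If the data comes from a file, opening it with an '<:encoding(UTF-8)' layer should do the decoding for you, so an explicit decode call is only needed for data that is still bytes.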