PerlMonks |
You need use utf8; to tell Perl that your source file is encoded in UTF-8. That way non-ASCII string literals work the way you want them to.

    use strict;
    use warnings;
    use 5.010;
    use utf8;                               # the source file itself is UTF-8
    binmode STDOUT, ':encoding(UTF-8)';     # encode output as UTF-8

    my $str = "ครัวซองเเซนด์วิชไข่ดาว Croissant Egg Sandwich ครัวซองเเซนด์วิชไข่ดาว";
    # remove everything that is neither Latin script nor script-neutral (Common)
    $str =~ s/[^\p{Latin}\p{Common}]//g;
    # trim leading and trailing whitespace
    $str =~ s/^\s+|\s+$//g;
    say $str;
    __END__
    Croissant Egg Sandwich

See also: Character Encodings in Perl.

Updated to unlinkify the brackets, and to exclude \p{Common} instead of \s from removal.

In reply to Re: How to remove other language character from a string
by moritz
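
Note that use utf8; only covers literals in the source file. If the string comes from outside the program (a file, a socket, a form field), it arrives as bytes and needs decoding before \p{...} can match on characters. A minimal sketch of that case, assuming the input is UTF-8 encoded; keep_latin is a hypothetical helper name, not something from the original question:

```perl
use strict;
use warnings;
use Encode qw(decode);

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: takes UTF-8 encoded bytes, returns a character
# string containing only Latin-script and script-neutral characters.
sub keep_latin {
    my ($bytes) = @_;
    my $str = decode('UTF-8', $bytes);      # bytes -> characters
    $str =~ s/[^\p{Latin}\p{Common}]//g;    # drop other scripts
    $str =~ s/^\s+|\s+$//g;                 # trim whitespace
    return $str;
}

# "\xE0\xB8\x84" is the UTF-8 encoding of a Thai letter (U+0E04)
my $bytes = "\xE0\xB8\x84 Croissant Egg Sandwich \xE0\xB8\x84";
print keep_latin($bytes), "\n";
```

When reading from a file, opening the handle with an ':encoding(UTF-8)' layer does the decoding for you, so the explicit decode call is not needed there.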