PerlMonks
Re^3: how to unicode filenames? by zentara (Archbishop)
on Jun 28, 2012 at 08:42 UTC ( [id://978857]=note )
Hi, I would like to share my Unicode battles with you, since it seems we are both fighting the same battle. After a few Unicode-related posts, yours being one of them, I decided to try to make a little utility I wrote, named vgrep, Unicode aware. It was quite a hit-or-miss transformation. See Gtk2 Visual Grep.

I had to add the -CS perlrun switch, use the unicode::all module, and even after all that, I still needed to call Encode::decode() in many places to get the desired output. Even though my Linux locale is set to en_US.UTF-8 in my .bashrc, I still needed to run input strings and filenames through decode. I'm using Perl 5.14.1.

It works, but it definitely seems to my sensibilities that it should be simpler. I guess the problem comes from having many files and filenames coming in through the net, or left over from previous Latin-1 Linux installations, which are not UTF-8. The general rule I seem to be seeing is "treat all input as binary", then decode.

My vgrep program still emits some errors when searching through PDF files, which are detected as text by -T but contain binary images; and I don't understand why File::Find doesn't automatically see Unicode filenames, without my having to decode $File::Find::name.

I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh
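
For what it's worth, here is a minimal sketch of the "treat all input as binary, then decode" rule applied to File::Find names. It is not the code from vgrep. The helper names decode_name and list_files are hypothetical, and the Latin-1 fallback is an assumption based on the leftover Latin-1 filenames mentioned above; a different legacy encoding would need a different fallback.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);
use File::Find;

# Turn a raw byte string from the filesystem into Perl characters.
# Try strict UTF-8 first; if the bytes are not valid UTF-8, assume
# they are Latin-1 (an assumption, see the note above).
sub decode_name {
    my ($bytes) = @_;
    my $copy  = $bytes;    # decode() may modify its argument when CHECK is set
    my $chars = eval { decode('UTF-8', $copy, FB_CROAK) };
    return defined $chars ? $chars : decode('ISO-8859-1', $bytes);
}

# Walk a directory tree and return the decoded names of plain files.
# File::Find hands back bytes, so each name is decoded explicitly.
sub list_files {
    my ($dir) = @_;
    my @names;
    find(sub { push @names, decode_name($File::Find::name) if -f $_ }, $dir);
    return @names;
}
```

To print the result, set an output layer first, for example binmode(STDOUT, ':encoding(UTF-8)'), then print "$_\n" for list_files('.'). Keeping the original bytes around is still useful, since open() and friends want the name exactly as the filesystem stores it.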