
Re^5: Is File::Find Unicode-(Conformant|Compliant|Enabled|Capable)?

by Anonymous Monk
on Jun 08, 2010 at 01:46 UTC (#843602)

in reply to Re^4: Is File::Find Unicode-(Conformant|Compliant|Enabled|Capable)?
in thread Is File::Find Unicode-(Conformant|Compliant|Enabled|Capable)?

That fails both the easiness test and the simplicity test. It's also not portable programming.

Meh, maybe :D

Is File::Find magically Unicode-capable on operating systems besides Microsoft Windows?...

Yes, in that it "assumes byte strings both as arguments and results", and the underlying operating system calls will happily take and give Unicode if they support that. The Win32 version specifically uses the ANSI versions of the system calls (the Unicode ones are called the wide system calls), so you get short names (that's Win32 backwards compatibility).
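To make the "byte strings in, byte strings out" point concrete, here is a minimal sketch. It assumes a filesystem that stores names as UTF-8 bytes without normalizing them (typical on Linux); the file name is made up for illustration, and this is not a claim about how File::Find behaves on Windows.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Find ();
use File::Temp qw(tempdir);

# Assumption: the filesystem stores names as UTF-8 bytes and does not
# normalize them (typical on Linux).
my $root = tempdir(CLEANUP => 1);

# Hypothetical file name containing non-ASCII characters.
my $name_chars = "\x{65e5}\x{672c}.txt";
my $name_bytes = encode('UTF-8', $name_chars);

open my $fh, '>', "$root/$name_bytes" or die "Can't create file: $!";
close $fh;

# File::Find passes byte strings through in both directions,
# so decoding the results is left to the caller.
my @found;
File::Find::find(
    sub {
        return unless -f $_;
        my $bytes = $_;
        push @found, decode('UTF-8', $bytes);
    },
    $root,
);

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @found;
```

The module never looks inside the names; whether the decoded result is meaningful depends entirely on the encoding the OS and filesystem use.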

Modern Perl really ought to be able to do file handling stuff well (it does), and it ought to be able to handle Unicode well (it does now), and it ought to be able to do both at once well (it doesn't - not on Windows).

Yeah, Modern Perl really ought to be without bugs too :) See this note from perlunicode:

One reason why Perl does not attempt to resolve the role of Unicode in these cases is that the answers are highly dependent on the operating system and the file system(s). For example, whether filenames can be in Unicode, and in exactly what kind of encoding, is not exactly a portable concept. Similarly for the qx and system: how well will the 'command line interface' (and which of them?) handle Unicode?

Does anyone here know: Is File::Find going to be enhanced soon to handle Unicode directory and file names? It's a core module.

*yawn* here is a patch:

--- 2010-06-07 18:03:58.671875000 -0700
+++ 2010-06-07 18:33:52.109375000 -0700
@@ -6,7 +6,16 @@
 our $VERSION = '1.16';
 require Exporter;
 require Cwd;
-
+BEGIN { eval {
+    use Win32::Unicode::Native(); ## TODO fix exporting
+    use subs qw( opendir readdir closedir stat );
+    *File::Find::opendir = *Win32::Unicode::Native::opendir ;
+    *File::Find::readdir = *Win32::Unicode::Native::readdir ;
+    *File::Find::closedir = *Win32::Unicode::Native::closedir ;
+    *File::Find::stat = *Win32::Unicode::File::statW;
+#~ use subs qw( lstat ); *File::Find::lstat = *Win32::Unicode::File::statW; ## bogus , TODO BUG, Can't stat .: No such file or directory
+ };
+}
 #
 # Modified to ensure sub-directory traversal order is not inverded by stack
 # push and pops. That is remains in the same order as in the directory file,
@@ -892,13 +901,14 @@
     $dir= $dir_name; # $File::Find::dir
+    my $DIRHANDLE;
     # Get the list of files in the current directory.
-    unless (opendir DIR, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
+    unless (opendir $DIRHANDLE, ($no_chdir ? $dir_name : $File::Find::current_dir)) {
         warnings::warnif "Can't opendir($dir_name): $!\n";
         next;
     }
-    @filenames = readdir DIR;
-    closedir(DIR);
+    @filenames = readdir $DIRHANDLE;
+    closedir($DIRHANDLE);
     @filenames = $pre_process->(@filenames) if $pre_process;
     push @Stack,[$CdLvl,$dir_name,"",-2] if $post_process;
@@ -1156,13 +1166,14 @@
     $dir = $dir_name; # $File::Find::dir
+    my $DIRHANDLE;
     # Get the list of files in the current directory.
-    unless (opendir DIR, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
+    unless (opendir $DIRHANDLE, ($no_chdir ? $dir_loc : $File::Find::current_dir)) {
         warnings::warnif "Can't opendir($dir_loc): $!\n";
         next;
     }
-    @filenames = readdir DIR;
-    closedir(DIR);
+    @filenames = readdir $DIRHANDLE;
+    closedir($DIRHANDLE);
     for my $FN (@filenames) {
         if ($Is_VMS) {
Seems to work, but it needs a test case, and Win32::Unicode::File::stat needs some help... Good luck. I hope you submit a bug report (perlbug) and get this patched quickly.
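For anyone turning the patch into something submittable, here is a rough sketch of the same override trick in isolation. One thing worth checking: `use` runs at compile time, so wrapping it in `eval { }` will not trap a missing module, whereas `require` inside `eval` will. `My::Walker` and `My::Hypothetical::WideDir` are made-up names standing in for File::Find and a wide-character implementation; nothing here uses the real Win32::Unicode API.

```perl
use strict;
use warnings;

package My::Walker;    # hypothetical stand-in for File::Find

# Predeclaring the name makes calls in this package resolve to a
# package sub rather than to the builtin.
use subs qw(readdir);

our $calls = 0;        # counts how often the fallback wrapper ran

BEGIN {
    # Hypothetical module name. require runs when this block runs,
    # so eval can trap a failure to load it.
    my $have_wide = eval { require My::Hypothetical::WideDir; 1 };

    *My::Walker::readdir = $have_wide
        ? \&My::Hypothetical::WideDir::readdir
        : sub { $calls++; return CORE::readdir($_[0]); };
}

sub list_dir {
    my ($path) = @_;
    opendir my $dh, $path or die "Can't opendir($path): $!";
    my @names = readdir($dh);    # calls My::Walker::readdir, not the builtin
    closedir $dh;
    return grep { $_ ne '.' && $_ ne '..' } @names;
}

package main;

my @names = My::Walker::list_dir('.');
print "$_\n" for @names;
```

Because the hypothetical module does not exist, the fallback wrapper around `CORE::readdir` is installed, which is roughly what a patched File::Find would want on non-Windows systems.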

-- Some guy who happened to stop by and tune in for a minute
