PerlMonks  

rename to UTF8 filename

by exilepanda (Monk)
on Jul 30, 2012 at 04:20 UTC (#984343)
exilepanda has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have a file-renaming task:

Step 1. Open and read an Excel file with 2 columns: the first is ID, the second is Name.

Step 2. Read a specified folder; its files are named after IDs that can be found in the ID column of the Excel file.

Step 3. Rename each file from its ID to the paired name from the Excel file.

Problem: the file names turn into garbled ("monster") characters when the name contains characters outside ANSI.

Since I am reading the names from Excel, I assumed the source (the text of the name) is surely UTF-8. Either way, the rename() step does not work correctly.

I did a little research and tried Win32API::File::MoveFile() and Win32::Unicode::File::moveW(), but without success. I also tried generating a batch file and running it in cmd, still without success.

My question has 2 parts:
1. Is there anything I need to set up in the environment before doing this task? For example, do I need binmode STDOUT, ":utf8", use utf8, or use Encode?

2. What can be done to get a correct rename?

Thanks in advance.

Re: rename to UTF8 filename
by Anonymous Monk on Jul 30, 2012 at 04:30 UTC

    Since I am reading the names from Excel, I assumed the source (the text of the name) is surely UTF-8.

    Well, the docs don't agree with your assumption; see http://search.cpan.org/perldoc/Win32::OLE#CP. You need to use CP_UTF8.

    I did a little research and tried Win32API::File::MoveFile() and Win32::Unicode::File::moveW(), but without success. I also tried generating a batch file and running it in cmd, still without success.
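A minimal sketch of what that setting might look like, assuming the names are read from Excel through Win32::OLE on Windows. The `CP` option and the `CP_UTF8` constant are the ones described in the Win32::OLE documentation linked above; the platform guard is an addition so that the snippet does nothing on other systems.

```perl
use strict;
use warnings;

# Win32::OLE only exists on Windows, so load it at run time and skip elsewhere.
if ( $^O eq 'MSWin32' ) {
    require Win32::OLE;

    # Ask Win32::OLE to translate OLE strings to and from UTF-8
    # instead of the default ANSI codepage (CP_ACP).
    Win32::OLE->Option( CP => Win32::OLE::CP_UTF8() );
}
```

The option has to be set before any cell values are fetched, since it only affects strings converted after that point.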

    And the error message was? Maybe you need to post your code :)

      Thanks for the reply. Um... to narrow down the possible sources of error, I cut and pasted the 2 columns into a text file using Notepad and saved it as UTF-8. Please tell me if that is still not a form Perl is happy with. Here's my code; there are 2 code paths, and the commented-out lines are version 2:
      use Win32API::File 0.08 qw( :ALL );
      #use Win32::Unicode::File;
      binmode STDOUT, ":utf8";

      open F, "Result_fineName.txt";
      #binmode F, ":utf8"; # binmode or not still won't work
      my @list = <F>;
      close F;
      chomp @list;

      my %hash;
      foreach (@list) {
          my ( $id, $name ) = split /\t/, $_, 2;
          $name =~ s/[\\\/\<\>\?\:\|\*]/ /g;
          $name =~ s/ {2,}/ /g;
          $hash{$id} = $name;
      }

      chdir ( "./doRename/2.4" ) ;
      my $source = `dir /b`;
      my @src = split /\n/, $source;
      chomp @src;

      foreach my $f ( @src ) {
          my @parts = split /[ _]/, $f;
          print $f ;
          MoveFile $f, "$hash{$parts[1]}.mp3" or print fileLastError() ; # Win32API::File
          #moveW $f, "$hash{$parts[1]}.mp3" or print errorW; # Win32::Unicode::File
          print $/;
      }
      There are 2 sample files in folder 2.4.
      One should be renamed to a file name that starts with ANSI characters. With the Win32API::File way, no error is raised and the rename happens, but the name turns into garbled characters.
      With the Win32::Unicode::File way, it raises an error: Undefined subroutine Errno::ERROR_FILE_EXISTS called at C:/Perl/site/lib/Unicode/Error.pm line 31.


      The other should be renamed to a purely Japanese file name.
      With the Win32API::File way, it raises an error: The filename, directory name, or volume label syntax is incorrect
      With the Win32::Unicode::File way, it does nothing: no error and no rename.

      Note: before running this from my cmd prompt, I changed the codepage to 65001. Without this, the error message comes out in Chinese (I guess), and it also shows up garbled.
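One way to take the guesswork out of the "binmode or not" question is to decode the list explicitly and strip the byte order mark that Notepad typically writes at the start of a UTF-8 file. The following is only a sketch under those assumptions: `parse_name_list` is a hypothetical helper, not part of the posted script, and the sample data is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn the raw bytes of the ID<TAB>name list
# into a hash of ID => sanitized name, both as decoded character strings.
sub parse_name_list {
    my ($bytes) = @_;

    # Decode once, explicitly, so every later string operation sees characters.
    my $text = decode( 'UTF-8', $bytes );

    # Drop a leading byte order mark so it does not stay glued to the first ID.
    $text =~ s/\A\x{FEFF}//;

    my %name_for;
    for my $line ( split /\r?\n/, $text ) {
        my ( $id, $name ) = split /\t/, $line, 2;
        next unless defined $id && length $id && defined $name && length $name;

        # Same clean-up as the posted script: replace characters that are not
        # allowed in Windows file names, then squeeze runs of spaces.
        $name =~ s{[\\/<>?:|*]}{ }g;
        $name =~ s/ {2,}/ /g;

        $name_for{$id} = $name;
    }
    return \%name_for;
}

# Usage: a two-line list starting with a BOM (illustrative data only).
my $bytes = "\xEF\xBB\xBF"
          . "001\t\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E\r\n"
          . "002\tA: B?\r\n";
my $names = parse_name_list($bytes);
```

Without the BOM removal, the first key would be the BOM character followed by `001`, so a lookup by the plain ID would miss it. The resulting names are character strings, which still have to be handed to a Unicode-aware rename function.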

        ... to narrow down the possible sources of error, I cut and pasted the 2 columns into a text file using Notepad and saved it as UTF-8 ... and it also shows up garbled.

        ???

        moveW works for me: start cmd, chcp 65001, perl kebab.pl > keb.txt, type keb.txt, notepad keb.txt

        #!/usr/bin/perl --
        use strict;
        use warnings;
        use Path::Class;
        use File::Slurp;
        use Data::Dump;
        use Win32::Unicode::File;

        our $thisf = file(__FILE__)->absolute;
        our $thisd = $thisf->dir;
        our $tmp = 'fafafa';
        chdir $thisd or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;

        my $names = file('utfnamelist.txt')->absolute;
        #~ my $bin = read_file( $names, { binmode => ':raw' } ) ; dd $bin;
        my $bin = "\xEF\xBB\xBF# a utf-8 file-o kebab's\r\n"
          . "utfnamelist.txt    \xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt\r\n"
          . "\xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt    ra\xC5\xBEnji\xC4\x87.txt\r\n"
          . "ra\xC5\xBEnji\xC4\x87.txt    \xC4\x87evap.txt\r\n"
          . "\xC4\x87evap.txt    \xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt\r\n"
          . "\xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt    kebab.txt\r\n";

        mkdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        chdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        write_file( $names->basename, \$bin );

        binmode STDOUT, ':encoding(UTF-8)';
        print "\x{feff}"; #BOM
        ddir();

        open my($fh), '<:encoding(UTF-8)', $names or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        while(<$fh>){
            next if $. == 1; # /#/;
            my( $here, $there ) = split /\s+/, $_;
            next unless $here and $there;
            print "# moveW $here => $there\n";
            #~ dd [ $here, $there ];
            moveW( $here, $there ) or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
            ddir();
        }

        chdir '..';
        dir($tmp)->rmtree("verbose","cautious");

        sub ddir {
            my $dir = `cmd /x /c "dir /b "`;;
            #~ use Encode::Detective qw' detect ';
            #~ warn detect($dir) ; # "UTF-8"
            utf8::decode($dir);
            print "# dir # ", $dir ;
        }
        __END__
        # dir # utfnamelist.txt
        # moveW  utfnamelist.txt   =>    ћевап.txt
        # dir # ћевап.txt
        # moveW  ћевап.txt   =>    ražnjić.txt
        # dir # ražnjić.txt
        # moveW  ražnjić.txt   =>    ćevap.txt
        # dir # ćevap.txt
        # moveW  ćevap.txt   =>    кебапче.txt
        # dir # кебапче.txt
        # moveW  кебапче.txt   =>    kebab.txt
        # dir # kebab.txt
        unlink fafafa\kebab.txt
        rmdir fafafa
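The escaped bytes in `$bin` and the names in the output above are the same data before and after decoding, which is why the script opens the list with `<:encoding(UTF-8)` before handing the names to moveW. A small sketch of that relationship, using only the core Encode module; the variable names are illustrative and not taken from the script:

```perl
use strict;
use warnings;
use Encode qw(decode);

# The first target name from the script's $bin, as raw UTF-8 bytes.
my $bytes = "\xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt";

# Decoding turns 14 bytes into 9 characters (5 Cyrillic letters plus ".txt").
my $chars = decode( 'UTF-8', $bytes );

printf "%d bytes, %d characters\n", length($bytes), length($chars);

# Encode again only at the output boundary.
binmode STDOUT, ':encoding(UTF-8)';
print "$chars\n";
```

Passing the undecoded `$bytes` to a function that expects characters treats each byte as its own character, which is one common way to end up with garbled file names.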
        
