http://www.perlmonks.org?node_id=984345


in reply to rename to UTF8 filename

Since I am reading the names from Excel, I assumed the source (the text of the names) is definitely UTF-8

Well, the docs don't agree with your assumption. See http://search.cpan.org/perldoc/Win32::OLE#CP: you need to use CP_UTF8.
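As a rough illustration of the CP_UTF8 suggestion, here is a minimal sketch. It assumes the names are fetched through Win32::OLE; the helper name `ole_use_utf8` is made up for this example, and the guard is only there so the snippet does nothing harmful on a non-Windows perl.

```perl
use strict;
use warnings;

# Hypothetical helper: switch Win32::OLE string translation to UTF-8.
# Returns 1 if the option was set, 0 when not running on Windows.
sub ole_use_utf8 {
    return 0 unless $^O eq 'MSWin32';
    require Win32::OLE;
    # The CP option selects the code page used when OLE strings are
    # converted to Perl strings; the default is the ANSI code page.
    Win32::OLE->Option( CP => Win32::OLE::CP_UTF8() );
    return 1;
}

my $switched = ole_use_utf8();
print $switched
    ? "Win32::OLE code page set to CP_UTF8\n"
    : "Not on Windows, Win32::OLE left untouched\n";
```

The option would need to be set before the cell values are read from Excel, otherwise characters outside the ANSI code page may already have been replaced.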

I did a little study, and I tried Win32API::File::MoveFile() and Win32::Unicode::File::moveW(), but without success. I also tried creating a batch file and running it in cmd, still without success.

And the error message was? Maybe you need to post your code :)

Re^2: rename to UTF8 filename
by exilepanda (Friar) on Jul 30, 2012 at 05:52 UTC
    Thanks for the reply. To narrow down the possible sources of error, I cut and pasted the 2 columns into a text file with Notepad and saved it as UTF-8. Please tell me if that is still not an encoding Perl is happy with. Here's my code; it has 2 paths, see the commented-out lines for version 2:
    use Win32API::File 0.08 qw( :ALL );
    #use Win32::Unicode::File;
    binmode STDOUT, ":utf8";
    open F, "Result_fineName.txt";
    #binmode F, ":utf8"; # binmode or not still won't work
    my @list = <F>;
    close F;
    chomp @list;
    my %hash;
    foreach (@list) {
        my ( $id, $name ) = split /\t/, $_, 2;
        $name =~ s/[\\\/\<\>\?\:\|\*]/ /g;
        $name =~ s/ {2,}/ /g;
        $hash{$id} = $name;
    }
    chdir ( "./doRename/2.4" ) ;
    my $source = `dir /b`;
    my @src = split /\n/, $source;
    chomp @src;
    foreach my $f ( @src ) {
        my @parts = split /[ _]/, $f;
        print $f ;
        MoveFile $f, "$hash{$parts[1]}.mp3" or print fileLastError() ; # Win32API::File
        #moveW $f, "$hash{$parts[1]}.mp3" or print errorW; # Win32::Unicode::File
        print $/;
    }
    There are 2 sample files in folder 2.4.
    One will be renamed to a file name that starts with an ANSI char. If I use the Win32API::File way, no error is raised and the rename happens, but the name turns into monster chars.
    If I use the Win32::Unicode::File way, it raises an error: Undefined subroutine Errno::ERROR_FILE_EXISTS called at C:/Perl/site/lib/Unicode/Error.pm line 31.


    The other will be renamed to a pure Japanese file name.
    If I use the Win32API::File way, it raises an error: The filename, directory name, or volume label syntax is incorrect
    If I use the Win32::Unicode::File way, it does nothing: no error, and no rename.

    Note: before I run this from my cmd prompt, I change the codepage to 65001. Without this, the error message is raised in Chinese (I guess), as it showed up as monster chars as well.
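One thing that may explain the "monster chars" is that the list file is read as raw bytes: the names then reach the rename call as undecoded UTF-8, and Notepad's byte order mark stays glued to the first id. The sketch below shows the difference between the two ways of reading, using an in-memory file so it runs anywhere. The sample line (id `001` and a three-character Japanese name) is invented for the example, and the idea that a non-wide rename call reinterprets the bytes in the ANSI code page is an assumption, simulated here with cp1252.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Invented sample: one "id<TAB>name" line as Notepad would save it
# (UTF-8 with a leading BOM, CRLF line ending).
my $file_bytes = "\xEF\xBB\xBF" . "001\t"
    . encode( 'UTF-8', "\x{65E5}\x{672C}\x{8A9E}" ) . "\r\n";

# 1) Read without a decoding layer, as in the posted script.
open my $raw_fh, '<', \$file_bytes or die "open: $!";
my $raw_line = <$raw_fh>;
close $raw_fh;
$raw_line =~ s/\r?\n\z//;
my ( $raw_id, $raw_name ) = split /\t/, $raw_line, 2;

# 2) Read through an :encoding(UTF-8) layer and drop the BOM.
open my $utf_fh, '<:encoding(UTF-8)', \$file_bytes or die "open: $!";
my $line = <$utf_fh>;
close $utf_fh;
$line =~ s/^\x{FEFF}//;    # BOM written by Notepad
$line =~ s/\r?\n\z//;      # chomp that tolerates CRLF
my ( $id, $name ) = split /\t/, $line, 2;

# What the raw bytes look like when taken as an ANSI code page
# (cp1252 is only a stand-in for whatever the system code page is).
my $misread = decode( 'cp1252', $raw_name );

printf "raw:     id is %d bytes long, name is %d bytes\n",
    length $raw_id, length $raw_name;
printf "decoded: id is '%s', name is %d characters\n", $id, length $name;
printf "misread: %d characters instead of %d\n",
    length $misread, length $name;
```

With raw reading, the hash key for the first line is not `001` but the BOM bytes followed by `001`, so the lookup by id can silently fail for that entry as well.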

      .. to narrow down the possible sources of error, I cut and pasted the 2 columns into a text file with Notepad and saved it as UTF-8 .... as it showed up as monster chars as well.

      ???

      moveW works for me: start cmd, chcp 65001, perl kebab.pl > keb.txt, type keb.txt, notepad keb.txt

      #!/usr/bin/perl --
      use strict; use warnings;
      use Path::Class;
      use File::Slurp;
      use Data::Dump;
      use Win32::Unicode::File;

      our $thisf = file(__FILE__)->absolute;
      our $thisd = $thisf->dir;
      our $tmp = 'fafafa';
      chdir $thisd or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;

      my $names = file('utfnamelist.txt')->absolute;
      #~ my $bin = read_file( $names, { binmode => ':raw' } ) ; dd $bin;
      my $bin = "\xEF\xBB\xBF# a utf-8 file-o kebab's\r\n"
          . "utfnamelist.txt \xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt\r\n"
          . "\xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt ra\xC5\xBEnji\xC4\x87.txt\r\n"
          . "ra\xC5\xBEnji\xC4\x87.txt \xC4\x87evap.txt\r\n"
          . "\xC4\x87evap.txt \xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt\r\n"
          . "\xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt kebab.txt\r\n";

      mkdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
      chdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
      write_file( $names->basename, \$bin );

      binmode STDOUT, ':encoding(UTF-8)';print "\x{feff}";#BOM
      ddir();
      open my($fh), '<:encoding(UTF-8)', $names or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
      while(<$fh>){
          next if $. == 1; # /#/;
          my( $here, $there ) = split /\s+/, $_;
          next unless $here and $there;
          print "# moveW $here => $there\n";
          #~ dd [ $here, $there ];
          moveW( $here, $there ) or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
          ddir();
      }
      chdir '..';
      dir($tmp)->rmtree("verbose","cautious");

      sub ddir {
          my $dir = `cmd /x /c "dir /b "`;;
          #~ use Encode::Detective qw' detect ';
          #~ warn detect($dir) ; # "UTF-8"
          utf8::decode($dir);
          print "# dir # ", $dir ;
      }
      __END__
      # dir # utfnamelist.txt
      # moveW  utfnamelist.txt   =>    ћевап.txt
      # dir # ћевап.txt
      # moveW  ћевап.txt   =>    ražnjić.txt
      # dir # ražnjić.txt
      # moveW  ražnjić.txt   =>    ćevap.txt
      # dir # ćevap.txt
      # moveW  ćevap.txt   =>    кебапче.txt
      # dir # кебапче.txt
      # moveW  кебапче.txt   =>    kebab.txt
      # dir # kebab.txt
      unlink fafafa\kebab.txt
      rmdir fafafa
      
        .. to narrow down the possible sources of error, I cut and pasted the 2 columns into a text file with Notepad and saved it as UTF-8 .... as it showed up as monster chars as well.

        ???

        It's because my original work reads the list from an Excel file.

        Thanks, but sorry, it won't work for me: ERRRR(2)(No such file or directory)(2)(The system cannot find the file specified) at test6.pl line 25.

        Line 25 is  open my($fh), '<:encoding(UTF-8)', $names or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;

        I've checked against the output: folder fafafa is built correctly, and so is the content of utfnamelist.txt. It just won't do the rest. Is there anything else I can check?
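One thing that could be checked, assuming the script is unchanged from the one posted above: `$names` is made absolute before the `chdir` into fafafa, while the file is written under its bare name after the `chdir`. The `open` then looks for utfnamelist.txt in the script's own directory, which would only succeed if a copy already exists there. The sketch below reproduces that pattern with core modules only (File::Spec and a temporary directory stand in for Path::Class and the script directory), so it illustrates the suspected cause rather than confirming it.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use File::Temp qw(tempdir);

my $start = getcwd();
my $base  = tempdir( CLEANUP => 1 );    # stands in for the script's directory
chdir $base or die "chdir $base: $!";

# Resolved now, relative to $base (like file(...)->absolute in the script).
my $names = File::Spec->rel2abs('utfnamelist.txt');

mkdir 'fafafa' or die "mkdir: $!";
chdir 'fafafa' or die "chdir: $!";

# Written under its bare name, so it lands inside fafafa.
open my $out, '>', 'utfnamelist.txt' or die "open for write: $!";
print {$out} "placeholder\n";
close $out;

my $abs_exists = -e $names            ? 1 : 0;   # path in $base
my $rel_exists = -e 'utfnamelist.txt' ? 1 : 0;   # path in fafafa

print "absolute path found: $abs_exists\n";
print "bare name found:     $rel_exists\n";

chdir $start or die "chdir back: $!";   # leave the temp dir before cleanup
```

If this matches what happens on your machine, opening `$names->basename` (or computing `$names` after the `chdir` into fafafa) would be the change to try.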