
Re^2: rename to UTF8 filename

by exilepanda (Pilgrim)
on Jul 30, 2012 at 05:52 UTC (#984351)


in reply to Re: rename to UTF8 filename
in thread rename to UTF8 filename

Thanks for the reply. To narrow down the possible sources of error, I cut and pasted the two columns into a text file with Notepad and saved it as UTF-8. Please tell me if that is still not an encoding Perl is happy with. Here is my code; there are two code paths (see the commented-out lines for version 2):
use Win32API::File 0.08 qw( :ALL );
#use Win32::Unicode::File;
binmode STDOUT, ":utf8";
open F, "Result_fineName.txt";
#binmode F, ":utf8"; # binmode or not still won't work
my @list = <F>;
close F;
chomp @list;
my %hash;
foreach (@list) {
    my ( $id, $name ) = split /\t/, $_, 2;
    $name =~ s/[\\\/\<\>\?\:\|\*]/ /g;
    $name =~ s/ {2,}/ /g;
    $hash{$id} = $name;
}
chdir ( "./doRename/2.4" ) ;
my $source = `dir /b`;
my @src = split /\n/, $source;
chomp @src;
foreach my $f ( @src ) {
    my @parts = split /[ _]/, $f;
    print $f ;
    MoveFile $f, "$hash{$parts[1]}.mp3" or print fileLastError() ; # Win32API::File
    #moveW $f, "$hash{$parts[1]}.mp3" or print errorW; # Win32::Unicode::File
    print $/;
}
There are 2 sample files in folder 2.4.
One will be renamed to a file name that starts with an ANSI character. If I use the Win32API::File way, no error is raised and the rename happens, but the new name turns into monster (garbled) characters.
If I use the Win32::Unicode::File way, it raises an error: Undefined subroutine Errno::ERROR_FILE_EXISTS called at C:/Perl/site/lib/Unicode/Error.pm line 31.


The other will be renamed to a purely Japanese file name.
If I use the Win32API::File way, it raises an error: The filename, directory name, or volume label syntax is incorrect
If I use the Win32::Unicode::File way, it does nothing: no error and no rename.
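
A possible explanation for the two Win32API::File symptoms, offered as an assumption rather than something confirmed in this thread: plain MoveFile is the byte-string ("ANSI") entry point, so UTF-8 bytes handed to it are interpreted in the active ANSI code page, while the wide entry point MoveFileW expects NUL-terminated UTF-16LE. The sketch below only shows the encoding step; `to_wide` is a hypothetical helper name, and the MoveFileW call in the comment is untested.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode a Perl character string as the NUL-terminated
# UTF-16LE byte string that Windows "W" (wide) functions expect.
sub to_wide {
    my ($name) = @_;
    return encode( 'UTF-16LE', $name . "\0" );
}

# Assumed usage on Windows (untested here):
#   use Win32API::File qw( :ALL );
#   MoveFileW( to_wide($old), to_wide($new) ) or print fileLastError();
```

The input to `to_wide` must already be a decoded character string, not raw UTF-8 bytes read from the list file.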

Note: before running this from my cmd prompt, I changed the code page to 65001. Without this, the error message is shown in Chinese (I guess), since it shows up as monster characters as well.
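
One thing worth ruling out in the list-reading step is that the names are still raw UTF-8 bytes (and that the first id carries a byte order mark, which Notepad's UTF-8 save adds on the Windows versions I know of). The following is a minimal sketch, not the poster's code: `parse_name_list` is a hypothetical helper that decodes the bytes, strips a leading BOM, and applies the same two substitutions as the loop above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn the raw bytes of a tab-separated "id<TAB>name"
# list into a hash of id => sanitized file name (Perl character strings).
sub parse_name_list {
    my ($bytes) = @_;
    my $text = decode( 'UTF-8', $bytes );    # bytes -> characters
    $text =~ s/\A\x{FEFF}//;                 # drop a leading BOM, if any
    my %name_for;
    for my $line ( split /\r?\n/, $text ) {
        next unless length $line;
        my ( $id, $name ) = split /\t/, $line, 2;
        next unless defined $name;
        $name =~ s{[\\/<>?:|*]}{ }g;         # same character set as the original loop
        $name =~ s/ {2,}/ /g;                # collapse runs of spaces
        $name_for{$id} = $name;
    }
    return %name_for;
}
```

It would be called on the file contents read in raw mode, for example `my %hash = parse_name_list($raw);`, so that every name reaching the rename call is a character string.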

Replies are listed 'Best First'.
Re^3: rename to UTF8 filename
by Anonymous Monk on Jul 30, 2012 at 07:56 UTC

    .. to narrow down the err possibility, I've cut and paste the 2 cols into a text file, use notepad, and save as UTF-8.... as it showed in monster as well.

    ???

    moveW works for me: start cmd, run chcp 65001, then perl kebab.pl > keb.txt, then type keb.txt or notepad keb.txt

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Path::Class;
    use File::Slurp;
    use Data::Dump;
    use Win32::Unicode::File;

    our $thisf = file(__FILE__)->absolute;
    our $thisd = $thisf->dir;
    our $tmp   = 'fafafa';
    chdir $thisd or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
    my $names = file('utfnamelist.txt')->absolute;
    #~ my $bin = read_file( $names, { binmode => ':raw' } ) ; dd $bin;
    my $bin = "\xEF\xBB\xBF# a utf-8 file-o kebab's\r\n"
        . "utfnamelist.txt \xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt\r\n"
        . "\xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt ra\xC5\xBEnji\xC4\x87.txt\r\n"
        . "ra\xC5\xBEnji\xC4\x87.txt \xC4\x87evap.txt\r\n"
        . "\xC4\x87evap.txt \xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt\r\n"
        . "\xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt kebab.txt\r\n";
    mkdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
    chdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
    write_file( $names->basename, \$bin );
    binmode STDOUT, ':encoding(UTF-8)';
    print "\x{feff}"; #BOM
    ddir();
    open my($fh), '<:encoding(UTF-8)', $names or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
    while(<$fh>){
        next if $. == 1; # /#/;
        my( $here, $there ) = split /\s+/, $_;
        next unless $here and $there;
        print "# moveW $here => $there\n";
        #~ dd [ $here, $there ];
        moveW( $here, $there ) or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        ddir();
    }
    chdir '..';
    dir($tmp)->rmtree("verbose","cautious");

    sub ddir {
        my $dir = `cmd /x /c "dir /b "`;
        #~ use Encode::Detective qw' detect ';
        #~ warn detect($dir) ; # "UTF-8"
        utf8::decode($dir);
        print "# dir # ", $dir ;
    }
    __END__
    # dir # utfnamelist.txt
    # moveW  utfnamelist.txt   =>    ћевап.txt
    # dir # ћевап.txt
    # moveW  ћевап.txt   =>    ražnjić.txt
    # dir # ražnjić.txt
    # moveW  ražnjić.txt   =>    ćevap.txt
    # dir # ćevap.txt
    # moveW  ćevap.txt   =>    кебапче.txt
    # dir # кебапче.txt
    # moveW  кебапче.txt   =>    kebab.txt
    # dir # kebab.txt
    unlink fafafa\kebab.txt
    rmdir fafafa
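
The ddir() sub above relies on utf8::decode to turn the bytes captured from `dir /b` into characters before they pass through the :encoding(UTF-8) layer on STDOUT. Here is a small sketch of that one step, assuming the console output really is UTF-8 (as it should be under code page 65001); `decode_listing` is a hypothetical name, and the sample bytes are taken from the script's own name list.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a captured directory listing in place and
# report whether the bytes were valid UTF-8.
sub decode_listing {
    my ($bytes) = @_;
    my $text = $bytes;                 # work on a copy
    my $ok   = utf8::decode($text);    # false (and $text untouched) if not valid UTF-8
    return ( $ok, $text );
}

# Bytes as `dir /b` would emit them for the first renamed file (assumed).
my ( $ok, $listing ) =
    decode_listing("\xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt\n");
```

A false return value is a cheap way to detect that the console was not actually in code page 65001 when the listing was captured.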
    
      .. to narrow down the err possibility, I've cut and paste the 2 cols into a text file, use notepad, and save as UTF-8.... as it showed in monster as well.

      ???

      It's because my original work reads the list from an Excel file.

      Thanks, but sorry, it won't work for me: ERRRR(2)(No such file or directory)(2)(The system cannot find the file specified) at test6.pl line 25.

      Line 25 is: open my($fh), '<:encoding(UTF-8)', $names or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;

      I've checked the output: folder fafafa is created correctly, as is the content of utfnamelist.txt. It just won't do the rest. Is there anything else I can check?

        Bah! Sorry about that (a bug: the program relied on a pre-existing utfnamelist.txt). Try this:

        #!/usr/bin/perl --
        use strict;
        use warnings;
        use Path::Class;
        use File::Slurp;
        use Data::Dump;
        use Win32::Unicode::File;

        our $thisf = file(__FILE__)->absolute;
        our $thisd = $thisf->dir;
        our $tmp   = 'fafafa';
        chdir $thisd or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        mkdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        chdir $tmp or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        my $names = file('utfnamelist.txt')->absolute;
        #~ my $bin = read_file( $names, { binmode => ':raw' } ) ; dd $bin;
        my $bin = "\xEF\xBB\xBF# a utf-8 file-o kebab's\r\n"
            . "utfnamelist.txt \xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt\r\n"
            . "\xD1\x9B\xD0\xB5\xD0\xB2\xD0\xB0\xD0\xBF.txt ra\xC5\xBEnji\xC4\x87.txt\r\n"
            . "ra\xC5\xBEnji\xC4\x87.txt \xC4\x87evap.txt\r\n"
            . "\xC4\x87evap.txt \xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt\r\n"
            . "\xD0\xBA\xD0\xB5\xD0\xB1\xD0\xB0\xD0\xBF\xD1\x87\xD0\xB5.txt kebabNOMORE.txt\r\n";
        write_file( $names->basename, \$bin );
        binmode STDOUT, ':encoding(UTF-8)';
        print "\x{feff}"; #BOM
        ddir();
        open my($fh), '<:encoding(UTF-8)', \$bin or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
        while(<$fh>){
            next if $. == 1; # /#/;
            my( $here, $there ) = split /\s+/, $_;
            next unless $here and $there;
            print "# moveW $here => $there\n";
            #~ dd [ $here, $there ];
            moveW( $here, $there ) or die sprintf q/ERRRR(%d)(%s)(%d)(%s)/, $!,$!,$^E,$^E;
            ddir();
        }
        chdir '..';
        dir($tmp)->rmtree("verbose","cautious");

        sub ddir {
            my $dir = `cmd /x /c "dir /b "`;
            #~ use Encode::Detective qw' detect ';
            #~ warn detect($dir) ; # "UTF-8"
            utf8::decode($dir);
            print "# dir # ", $dir ;
        }
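
The "No such file or directory" from the first version is consistent with how relative paths are made absolute: the result depends on the current directory at the moment of the call, so a path resolved before chdir into fafafa does not point at the file written afterwards. The sketch below illustrates this with core modules only; it uses File::Spec->rel2abs as a stand-in for Path::Class's ->absolute, and `resolve_before_and_after_chdir` is a hypothetical helper working in a throwaway temp directory.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: resolve the same relative name before and after a
# chdir, create the file after the chdir, and report which path exists.
sub resolve_before_and_after_chdir {
    my ($name) = @_;
    my $start = getcwd();
    my $base  = tempdir( CLEANUP => 1 );
    chdir $base or die "chdir $base: $!";
    my $early = File::Spec->rel2abs($name);    # relative to $base
    mkdir 'fafafa' or die "mkdir: $!";
    chdir 'fafafa' or die "chdir: $!";
    my $late = File::Spec->rel2abs($name);     # relative to $base/fafafa
    open my $out, '>', $name or die "open $name: $!";
    close $out or die "close $name: $!";
    my %seen = (
        early        => $early,
        late         => $late,
        early_exists => ( -e $early ? 1 : 0 ),
        late_exists  => ( -e $late  ? 1 : 0 ),
    );
    chdir $start or die "chdir $start: $!";    # leave the temp dir so it can be removed
    return %seen;
}
```

In the second version of the script the ->absolute call comes after the chdir, and the loop reads from the in-memory string, so it no longer depends on where the list file happens to live.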

        Sorry but won't work for me. .... Anywhere I can further check with ?

        I don't know, you've just described an impossible situation
