PerlMonks  

Re: How to fix wrongly encoded filenames?

by graff (Chancellor)
on Mar 18, 2014 at 03:37 UTC (#1078721)


in reply to How to fix wrongly encoded filenames?

Do you still have the original tar file that came from the Unix system? If so, you should be able to open it with Archive::Tar and get the raw byte strings of the file names. If they really are encoded as iso-8859-1, then it's trivial to decode those strings to utf8 (and, if necessary, re-encode them to whatever works on your Windows server).
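To illustrate the decoding step, here is a minimal sketch using the core Encode module. The sub name latin1_name_to_utf8 is made up for this example, and it assumes the names really are iso-8859-1 bytes:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes a raw file name as stored in the tar header
# (assumed to be iso-8859-1 bytes) and returns the same name as UTF-8 bytes.
sub latin1_name_to_utf8 {
    my ($raw) = @_;
    my $chars = decode('iso-8859-1', $raw);   # bytes -> Perl character string
    return encode('UTF-8', $chars);           # characters -> UTF-8 bytes
}
```

If the server wants something other than UTF-8, only the encoding name passed to encode() should need to change.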

If that's possible, then maybe you want to just delete the first attempt from the Windows server and try again using Archive::Tar (instead of 7zip, whatever that is): iterate through the tar file, decode the non-ASCII names into Perl-internal utf8 (re-encoding for Windows if necessary), then create directories and files on the server filesystem as needed to unpack the tar contents.
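Something along these lines might do it. This is only a sketch: the sub name and its parameters are invented here, the source names are assumed to be iso-8859-1, and the right target encoding for your server is something you would have to determine yourself. It uses Archive::Tar->iter, which hands back one entry at a time instead of reading the whole archive into memory, though each file's content is still held in memory while it is written out.

```perl
use strict;
use warnings;
use Archive::Tar;
use Encode qw(decode encode);
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: unpack $tarfile under $destdir, re-encoding each
# entry name from iso-8859-1 (assumed) to $fs_encoding.
# Returns the number of plain files written.
sub unpack_with_fixed_names {
    my ($tarfile, $destdir, $fs_encoding) = @_;

    # iter() returns a closure that yields one Archive::Tar::File per call
    my $next = Archive::Tar->iter($tarfile)
        or die "Cannot read $tarfile: " . Archive::Tar->error;

    my $count = 0;
    while (my $entry = $next->()) {
        # raw bytes from the tar header -> characters -> target encoding
        my $name   = decode('iso-8859-1', $entry->full_path);
        my $target = File::Spec->catfile($destdir, encode($fs_encoding, $name));

        if ($entry->is_dir) {
            make_path($target);
            next;
        }
        next unless $entry->is_file;    # skip links, devices and so on

        make_path(dirname($target));
        open my $out, '>:raw', $target or die "Cannot write $target: $!";
        my $data = $entry->get_content;
        print {$out} defined $data ? $data : '';
        close $out or die "Cannot close $target: $!";
        $count++;
    }
    return $count;
}
```

A call could look like unpack_with_fixed_names('backup.tar', 'unpacked', 'cp1252'), where the file name, the directory and the code page are placeholders. Note that the sketch trusts the entry names, so an archive from an untrusted source would need a check for ".." and absolute paths first.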

Who knows, maybe you'll want to decode/recode the file contents while you're at it.


Re^2: How to fix wrongly encoded filenames?
by Anonymous Monk on Mar 18, 2014 at 06:17 UTC
    That is what was tried before. It fails for two reasons. First, performance: you have to handle each file separately, and there may be up to 5000 files.

    Second, size: the tarballs to be handled are several gigabytes each.
