PerlMonks  

Re^2: Writing UTF8 Filename

by eserte (Deacon)
on Nov 17, 2007 at 16:34 UTC ( [id://651420]=note )


in reply to Re: Writing UTF8 Filename
in thread Writing UTF8 Filename

The current state of support for non-ASCII characters in file names is not what I would call "stable" (or "sane" or "worth the hassle").
To put it plainly: it's non-existent.

Maybe we'll have encoded filenames support in perl 5.12?
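
For context, here is a minimal sketch of the manual workaround that the lack of built-in support forces on you: encode the name to bytes yourself before handing it to open, and decode what readdir gives back. It assumes a filesystem that stores the bytes as given and expects UTF-8 names (typical on Linux); the file name and the temporary directory are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempdir);

# Scratch directory, removed when the program exits.
my $dir = tempdir(CLEANUP => 1);

# A character string (Greek alpha, beta, gamma), not yet bytes.
my $name = "\x{3b1}\x{3b2}\x{3b3}.txt";

# Encode explicitly; LEAVE_SRC keeps encode() from clobbering $name.
my $bytes = encode('UTF-8', $name, Encode::FB_CROAK | Encode::LEAVE_SRC);

open my $fh, '>', "$dir/$bytes" or die "Cannot create file: $!";
print {$fh} "hello\n";
close $fh or die "Cannot close file: $!";

# readdir returns raw bytes, so decode them back into characters.
opendir my $dh, $dir or die "Cannot open directory: $!";
my @found = map { decode('UTF-8', $_) } grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;
```

Every module that touches the filesystem has to repeat this by hand, and has to guess the right encoding, which is the gap being complained about.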

Replies are listed 'Best First'.
Re^3: Writing UTF8 Filename
by graff (Chancellor) on Nov 17, 2007 at 17:00 UTC
    Maybe we'll have encoded filenames support in perl 5.12?

    My "current state of support" comment was a reference to OS-level issues (on whatever OS). I would not expect such support from perl any time soon, given that there is no consistent form of OS support.

      Well, Win32 has had stable support for Unicode filenames for many years. Perl's support for that is woeful but I'm working on that. I'd make a joke about "Real" operating systems, but it seems the Unix mongers may need a break from that in order to recover their sense of humor. (:

      - tye        

        One problem, even with Win32, is that you can have multiple filesystems on a single system, even within a single tree. Not every filesystem handles filenames the same way. Any solution for Perl would be incomplete without the possibility to override the encoding decision per path.

        I'm hoping for a solution that is sufficiently abstracted that all platforms can use it. Win32's implementation would probably be a bit easier than one for, say, Linux, but even if you have to set things explicitly per path, it's better than what we have now. The following is copied from a post to p5p a while ago.

        I tend to agree; however, pragmas tend to be global, program- or package-wide, and what suits best here is an individual, per-call flag.

        Global is a problem in most cases, but I feel it would be perfect here, simply because the filesystem is equally global. In fact, it's even longer lived than your Perl program :)

        Better yet, global variables can be localized to dynamic scope. This is good, because when you set the encoding for /foo, it should work for encoding-unaware modules too.

        Maybe a hash would be nice:

        ${^FS_ENCODING}{foo} = 'A';
        ${^FS_ENCODING}{foo}{bar} = 'B';
        ${^FS_ENCODING}{foo}{bar}{baz}{quux} = 'auto';
        open my $fh, ">", "/foo/bar/baz/quux/blah/hello.txt";
        Which then actually does:
        open my $fh, ">", join("/",
            "",
            encode(detect_encoding("/"), "foo"),
            encode("A", "bar"),
            encode("B", "baz"),
            encode("B", "quux"),
            encode(detect_encoding("/foo/bar/baz/quux"), "blah"),
            encode(detect_encoding("/foo/bar/baz/quux/blah"), "hello.txt"),
        );
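
The lookup described above can be approximated in today's Perl, as a rough sketch only. Everything here is hypothetical: `%FS_ENCODING` is an ordinary package hash keyed by directory path standing in for the proposed `${^FS_ENCODING}`, `detect_encoding` is a stub that always answers UTF-8, and `encode_path` is an invented helper that handles absolute paths only. ISO-8859-1 and UTF-8 play the roles of 'A' and 'B'.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for ${^FS_ENCODING}: directory => encoding of the
# names stored inside it. Subdirectories inherit until overridden.
our %FS_ENCODING = (
    '/foo'              => 'iso-8859-1',   # plays the role of 'A'
    '/foo/bar'          => 'UTF-8',        # plays the role of 'B'
    '/foo/bar/baz/quux' => 'auto',
);

# Hypothetical detector; a real one would ask the OS or the locale.
sub detect_encoding { my ($dir) = @_; return 'UTF-8' }

# Encode each component with the encoding in effect for its parent directory.
sub encode_path {
    my ($path) = @_;
    my @parts    = grep { length } split m{/}, $path;
    my $dir      = '';       # directory containing the current component
    my $encoding = 'auto';   # nothing configured for "/" means "detect"
    my @encoded;
    for my $part (@parts) {
        my $key = length $dir ? $dir : '/';
        $encoding = $FS_ENCODING{$key} if exists $FS_ENCODING{$key};
        my $name = $encoding eq 'auto' ? detect_encoding($key) : $encoding;
        push @encoded, encode($name, $part);
        $dir .= "/$part";
    }
    return join '/', '', @encoded;
}

# "caf\x{e9}" lives in /foo, so it is written as a single Latin-1 byte.
my $latin1 = encode_path("/foo/caf\x{e9}/x");

# The setting can be localized to a dynamic scope, so encoding-unaware
# code called from inside the block sees the override too.
my $utf8;
{
    local $FS_ENCODING{'/foo'} = 'UTF-8';
    $utf8 = encode_path("/foo/caf\x{e9}/x");
}
my $restored = encode_path("/foo/caf\x{e9}/x");
```

A flat hash keyed by path is used here because a nested hash cannot hold both a string and a sub-hash under the same key, which the `{foo} = 'A'` and `{foo}{bar} = 'B'` notation would require.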

        Juerd # { site => 'juerd.nl', do_not_use => 'spamtrap', perl6_server => 'feather' }
