PerlMonks  

Re: How to stat a file with a Unicode (UTF16-LE) filename in Windows?

by ikegami (Pope)
on Feb 06, 2009 at 06:12 UTC (#741808)


in reply to How to stat a file with a Unicode (UTF16-LE) filename in Windows?

Of course, calling stat (an emulation of a Unix system call) is a very roundabout way of calling GetFileTime. The work has already been done for you in Win32API::File::Time.

use strict;
use warnings;

use Encode qw( encode );
use Win32API::File::Time qw( GetFileTime );

{
    # The file name consists of a black heart (U+2665).
    my $fn = encode('UCS-2le', "\x{2665}");

    local ${^WIDE_SYSTEM_CALLS} = 1;

    my ($atime, $mtime, $ctime) = GetFileTime($fn)
        or die("GetFileTime: $^E\n");

    print("atime: ", scalar(localtime($atime)), "\n");
    print("mtime: ", scalar(localtime($mtime)), "\n");
    print("ctime: ", scalar(localtime($ctime)), "\n");
}

atime: Fri Feb 6 00:44:39 2009
mtime: Fri Feb 6 00:44:39 2009
ctime: Fri Feb 6 00:44:39 2009
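
To make the encoding step concrete, here is a minimal sketch of what the encoded file name looks like as bytes. The hex_bytes() helper is hypothetical (it is not part of Encode); it only dumps whatever encode() returns.

```perl
use strict;
use warnings;

use Encode qw( encode );

# hex_bytes() is a hypothetical helper for illustration only.
# It encodes $string with $encoding and returns the resulting
# bytes as space-separated, two-digit hex values.
sub hex_bytes {
    my ($encoding, $string) = @_;
    my $bytes = encode($encoding, $string);
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# U+2665 becomes one 16-bit unit, least significant byte first.
print hex_bytes('UCS-2LE', "\x{2665}"), "\n";   # 65 26
```

The low byte comes first because the encoding is little-endian, which is the byte order the wide-character file name is passed in above.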


Re^2: How to stat a file with a Unicode (UTF16-LE) filename in Windows?
by alanhaggai (Initiate) on Feb 06, 2009 at 06:27 UTC

    It is working well. Now I understand why I was not able to stat(). I did not use Symbol, and gensym(). I will read about them. Also, in Windows, internally, which encoding is used for filenames? UTF16-le or UCS-2le?

    Thanks again for sparing your time and for the great code that you have posted.

      Also, in Windows, internally, which encoding is used for filenames? UTF16-le or UCS-2le?

      It is my tenuous understanding that the difference between UTF-16 and UCS-2 is that UTF-16 can encode characters above U+FFFF (by using two 16-bit words, a surrogate pair) and UCS-2 cannot. I haven't seen any support for multi-word characters in Windows, so I believe it's UCS-2. In practice, for characters up to U+FFFF, it doesn't matter which one you use.
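
      The difference can be seen by counting 16-bit words, as in this rough sketch. The utf16_units() helper is hypothetical, and U+1F600 is just an arbitrary example of a character above U+FFFF.

```perl
use strict;
use warnings;

use Encode qw( encode );

# utf16_units() is a hypothetical helper for illustration only.
# It returns how many 16-bit words $string occupies in UTF-16LE.
sub utf16_units {
    my ($string) = @_;
    return length(encode('UTF-16LE', $string)) / 2;
}

print utf16_units("\x{2665}"), "\n";    # 1 (fits in a single word)
print utf16_units("\x{1F600}"), "\n";   # 2 (needs a surrogate pair)
```

      For a name like the black heart used earlier in the thread, both encodings produce the same two bytes.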

      I did not use Symbol, and gensym()

      The documentation for OsFHandleOpen clearly defines what is acceptable, and an undefined lexical isn't one of those.

        Oh I see. Thank you.
      For what it is worth (not much) the MSDN says the following:
      Windows stores the long file names on disk in Unicode. ...The valid character set for these long file names is the NTFS character set, less one character: the colon (':') ...

      I have searched high and low for a definition of "the NTFS character set" but could not find a thing.

      The MSDN also offers conventions (like the Pirate's code) for naming files:
      Use any character in the current code page for a name, except characters in the range 0 through 31 or any character explicitly disallowed by the file system. A name can contain characters in the extended character set (128-255). However, it cannot contain the following reserved characters: < > : " / \ |
      This implies single-byte characters, which contradicts the statement above. The phrase "any character explicitly disallowed by the file system" is wonderful when they do not seem to define what those characters might be.
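
      As a rough sketch only, the part of that convention that is actually spelled out could be checked like this. The has_disallowed_char() helper is hypothetical, and the reserved list is taken verbatim from the passage quoted above; it makes no attempt to cover whatever "the file system" additionally disallows.

```perl
use strict;
use warnings;

# Reserved characters exactly as listed in the quoted MSDN passage.
my $RESERVED = qr{[<>:"/\\|]};

# has_disallowed_char() is a hypothetical helper for illustration only.
# It returns 1 if $name contains a character in the range 0 through 31
# or one of the reserved characters, and 0 otherwise.
sub has_disallowed_char {
    my ($name) = @_;
    return ($name =~ /[\x00-\x1F]/ || $name =~ $RESERVED) ? 1 : 0;
}

print has_disallowed_char("notes.txt"), "\n";      # 0
print has_disallowed_char("a:b.txt"), "\n";        # 1
print has_disallowed_char("\x{2665}.txt"), "\n";   # 0
```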
        See http://en.wikipedia.org/wiki/Filename section 'Comparison of filename limitations'
