PerlMonks  

Re: How to stat a file with a Unicode (UTF16-LE) filename in Windows?

by ikegami (Pope)
on Feb 06, 2009 at 06:12 UTC (#741808)


in reply to How to stat a file with a Unicode (UTF16-LE) filename in Windows?

Of course, calling stat (a unix system call emulation) is a very roundabout way of calling GetFileTime. The work has already been done for you in Win32API::File::Time.

    use strict;
    use warnings;

    use Encode               qw( encode );
    use Win32API::File::Time qw( GetFileTime );

    {
        # The file name consists of a black heart (U+2665).
        my $fn = encode('UCS-2le', "\x{2665}");

        local ${^WIDE_SYSTEM_CALLS} = 1;

        my ($atime, $mtime, $ctime) = GetFileTime($fn)
            or die("GetFileTime: $^E\n");

        print("atime: ", scalar(localtime($atime)), "\n");
        print("mtime: ", scalar(localtime($mtime)), "\n");
        print("ctime: ", scalar(localtime($ctime)), "\n");
    }

    atime: Fri Feb 6 00:44:39 2009
    mtime: Fri Feb 6 00:44:39 2009
    ctime: Fri Feb 6 00:44:39 2009

Replies are listed 'Best First'.
Re^2: How to stat a file with a Unicode (UTF16-LE) filename in Windows?
by alanhaggai (Novice) on Feb 06, 2009 at 06:27 UTC

    It is working well. Now I understand why I was not able to stat(). I did not use Symbol, and gensym(). I will read about them. Also, in Windows, internally, which encoding is used for filenames? UTF16-le or UCS-2le?

    Thanks again for sparing your time and for the great code that you have posted.

      Also, in Windows, internally, which encoding is used for filenames? UTF16-le or UCS-2le?

      It is my tenuous understanding that the difference between UTF-16 and UCS-2 is that UTF-16 can encode characters above U+FFFF (the first 64K code points) and UCS-2 cannot. I haven't seen any support for multi-word characters in Windows, so I believe it's UCS-2. In practice, for characters below that limit, it doesn't matter which one you use.
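      To make the distinction concrete, here is a minimal sketch using only the core Encode module. The helper name utf16le_hex is made up for illustration. It shows that a character below U+FFFF, such as the heart used above, encodes to the same bytes under either name, while a character above U+FFFF needs two 16-bit words (a surrogate pair) in UTF-16.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical helper: hex dump of a string encoded as UTF-16LE.
sub utf16le_hex {
    my ($str) = @_;
    my $bytes = encode('UTF-16LE', $str);
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# U+2665 fits in one 16-bit word.
print utf16le_hex("\x{2665}"), "\n";     # 65 26

# U+1F600 is above U+FFFF, so UTF-16 uses a surrogate pair (two words).
print utf16le_hex("\x{1F600}"), "\n";    # 3D D8 00 DE

# For a character below U+FFFF, both encodings give identical bytes.
my $same = encode('UCS-2LE', "\x{2665}") eq encode('UTF-16LE', "\x{2665}");
print $same ? "same bytes\n" : "different bytes\n";
```

      So for a file name like the one in this thread, either encoding name passed to encode() should produce the same byte string.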

      I did not use Symbol, and gensym()

      The documentation for OsFHandleOpen clearly defines what is acceptable, and an undefined lexical isn't one of those.
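      As a rough illustration of the difference, the sketch below contrasts an undefined lexical with what Symbol::gensym returns. It does not call OsFHandleOpen itself (that needs Windows and a native handle). Instead it uses a plain open() on an in-memory string, on the assumption that a glob reference behaves the same way wherever a file handle is expected.

```perl
use strict;
use warnings;
use Symbol qw( gensym );

my $undef_lexical;     # just undef: not a handle of any kind
my $fh = gensym();     # reference to a fresh anonymous typeglob

print defined($undef_lexical) ? "defined\n" : "undef\n";   # undef
print ref($fh), "\n";                                      # GLOB

# The glob reference can be used as a file handle.
# Here it reads from an in-memory string instead of a real file.
my $data = "hello\n";
open($fh, '<', \$data) or die("open: $!\n");
my $line = <$fh>;
close($fh);

print $line;                                               # hello
```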

        Oh I see. Thank you.
      For what it is worth (not much) the MSDN says the following:
      Windows stores the long file names on disk in Unicode. ...The valid character set for these long file names is the NTFS character set, less one character: the colon (':') ...

      I have searched high and low for a definition of "the NTFS character set" but could not find a thing.

      The MSDN also offers conventions (like the Pirate's code) for naming files:
      Use any character in the current code page for a name, except characters in the range 0 through 31 or any character explicitly disallowed by the file system. A name can contain characters in the extended character set (128-255). However, it cannot contain the following reserved characters: < > : " / \ |
      This implies single-byte characters, which contradicts the statement above. The phrase "any character explicitly disallowed by the file system" is wonderful when they do not seem to define what those characters might be.
        See http://en.wikipedia.org/wiki/Filename section 'Comparison of filename limitations'
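      For what it's worth, the conventions quoted above are easy to turn into a rough check. This is only a sketch: the function name looks_like_valid_name is made up, and the character class adds ? and *, which Windows also reserves but which are missing from the list as quoted. It says nothing about whatever else "the file system" might disallow.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name passes the quoted naming conventions.
sub looks_like_valid_name {
    my ($name) = @_;
    return 0 if !defined($name) || $name eq '';
    return 0 if $name =~ /[\x00-\x1F]/;       # characters 0 through 31
    return 0 if $name =~ m{[<>:"/\\|?*]};     # reserved (? and * are an addition)
    return 1;
}

print looks_like_valid_name('report.txt')     ? "ok\n" : "rejected\n";   # ok
print looks_like_valid_name("\x{2665}.txt")   ? "ok\n" : "rejected\n";   # ok
print looks_like_valid_name('a:b.txt')        ? "ok\n" : "rejected\n";   # rejected
print looks_like_valid_name("tab\there.txt")  ? "ok\n" : "rejected\n";   # rejected
```

      Note that the check operates on decoded characters, so a name like the heart above passes even though it is outside any single-byte code page.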
