
Re: Directory operations and Unicode

by patcat88 (Deacon)
on Nov 06, 2011 at 02:39 UTC (#936221)


in reply to Directory operations and Unicode

How would someone use Windows' UTF-16LE Unicode-aware APIs from XS, given Perl's "ASCII until infected with UTF-8" string system? I looked for examples of MultiByteToWideChar in Perl's source and on CPAN.
1. In win32/win32.c#l815 in perl.git and https://metacpan.org/source/COSIMO/Win32-API-0.60/API.xs#L139, MultiByteToWideChar is always called with CP_ACP (the ANSI code page).
2. In cpan/Win32/Win32.xs#l161 in perl.git, the SV is checked for the UTF8 flag; if it is set, the string is converted as CP_UTF8, otherwise as CP_ACP.
3. Or is it better to force all SVs to UTF-8 using the Perl API and then always call MultiByteToWideChar with CP_UTF8, as in https://metacpan.org/source/BJOERN/Win32-MultiLanguage-0.72/MultiLanguage.xs#L51?
4. Or is it best to leave it to the caller to encode their scalars as UTF-16LE using Encode and pass those "binary garbage" scalars to XS, which then does nothing but an SvPV and casts the PV (a char *) to a wchar_t *, the way it's done in Win32API::File?
5. Or use mbstowcs instead of MultiByteToWideChar? mbstowcs relies on a NUL terminator to find the end of the string, which is not exactly safe.
6. In all the examples of MultiByteToWideChar so far, the flags parameter has been zero, except for https://metacpan.org/source/SOMMAR/Win32-SqlServer-2.007/convenience.cpp#L51, where it's MB_PRECOMPOSED for ASCII and 0 for UTF-8. What's the correct handling for the flags parameter of MultiByteToWideChar?
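
To make the concern in point 5 concrete, here is a minimal C sketch, assuming the default "C" locale and an ASCII-only payload. The helper name convert_with_mbstowcs is hypothetical and not taken from any of the modules linked above. It shows that mbstowcs has no input-length parameter and therefore stops at the first NUL byte, whereas a Perl scalar carries its own length and may contain embedded NULs. MultiByteToWideChar, by contrast, accepts an explicit byte count (cbMultiByte).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper for illustration only: convert a NUL-terminated
 * multibyte string with mbstowcs() and return the number of wide
 * characters written (excluding the terminator), or (size_t)-1 if the
 * input contains an invalid multibyte sequence. */
size_t convert_with_mbstowcs(const char *buf, wchar_t *out, size_t out_cap)
{
    assert(buf != NULL && out != NULL);
    /* mbstowcs() takes an output capacity but no input length: it reads
     * buf until it meets a NUL byte, so anything after an embedded NUL
     * is silently ignored. */
    return mbstowcs(out, buf, out_cap);
}
```

Given a 5-byte payload such as "ab\0cd", the helper converts only the leading "ab"; the remaining bytes never reach the wide buffer, which is why a length-aware conversion is the safer fit for data coming out of an SV.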


Re^2: Directory operations and Unicode
by Anonymous Monk on Nov 06, 2011 at 05:18 UTC
Re^2: Directory operations and Unicode
by nikosv (Hermit) on Nov 06, 2011 at 05:21 UTC
    When using Win32::COM I subsequently use the OS COM facilities, hence I bypass any Wide APIs, call the Scripting.FileSystemObject and access the filesystem in UTF:
    Win32::OLE->Option(CP => Win32::OLE::CP_UTF8);
    $obj = Win32::OLE->new('Scripting.FileSystemObject');
    and manipulate its methods, for example :
    $folder = $obj->GetFolder(".");
    $collection = $folder->{Files};
    If you want to keep your sanity, do not start looking into the wide APIs! :)
      When using Win32::COM I subsequently use the OS COM facilities, hence I bypass any Wide APIs, call the Scripting.FileSystemObject and access the filesystem in UTF

      No. You don't bypass the Wide APIs, you wrap them in a ridiculously large stack of other APIs. After uselessly burning a lot of CPU cycles, the Scripting.FileSystemObject finally ends up calling the Wide APIs.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        hence I bypass any Wide APIs

        In the sense that I don't have to deal with or think about them.

        you wrap them using a ridiculously large stack of other APIs. After uselessly burning a lot of CPU cycles

        That might be true, but it gets the job done. What would you suggest doing instead?
