Re: Spreadsheet::ParseXLSX filename non Latin Tk getOpenFile

by vr (Curate)
on Nov 21, 2018

in reply to Spreadsheet::ParseXLSX filename non Latin Tk getOpenFile

I wouldn't rely on short names "on any desktop machine": 8.3 name generation could easily have been disabled on some of them. Besides, the underlying MS API may be buggy, though I didn't investigate thoroughly. It seems that the short name is not looked up (read from the directory) if the long name itself has the 8.3 shape and uses only CP1252 (? - not sure) characters.

E.g. my Windows CP is not 1252 and doesn't contain "ñ" at all. Therefore, for the directory "Buñuel", a "short" (actually, longer) name "BUUEL~1" is generated, and it must be supplied to non-complying software, such as Perl :).

D:\>mkdir Buñuel

D:\>dir /x
# skipped

21/11/2018  17:01    <DIR>          BUUEL~1      Buñuel

# skipped
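The guess above (no short-name lookup when the long name already has the 8.3 shape and fits CP1252) can be written down as a small predicate. This is only a sketch of that hypothesis, not of what Windows actually does: the helper name looks_like_83_in_cp1252 and the sample names are made up for illustration, and the 8.3 pattern is deliberately simplified.

```perl
use strict;
use warnings;
use utf8;
use Encode qw/ encode /;

# Hypothetical helper: true if $name has the 8.3 shape (1-8 characters,
# optionally a dot plus 1-3 characters, no whitespace) and every
# character can be encoded in CP1252.
sub looks_like_83_in_cp1252 {
    my ($name) = @_;
    return 0 unless $name =~ /\A[^.\s]{1,8}(?:\.[^.\s]{1,3})?\z/;
    my $copy = $name;    # encode() may modify its argument when CHECK is set
    return eval { encode( 'cp1252', $copy, Encode::FB_CROAK ); 1 } ? 1 : 0;
}

# Illustrative names only; output is UTF-8 encoded.
for my $name ( 'Café', 'x.xlsx', 'Русский' ) {
    my $verdict = looks_like_83_in_cp1252($name) ? 'suspect' : 'not suspect';
    print encode( 'UTF-8', "$name: $verdict\n" );
}
```

Names flagged as "suspect" would be the ones worth checking against `dir /x` before trusting whatever the API returns for them.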

Then I put "x.xlsx" into this directory, and:

use strict;
use warnings;
use feature 'say';
use utf8;
use Spreadsheet::ParseXLSX;
use Win32::API;
use Win32::LongPath;
use Encode qw/ encode decode /;
use Test::More;

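# Import the wide-character API; the input path is passed as a raw pointer ('N').
Win32::API::More-> Import( 'kernel32', 'GetShortPathNameW', 'NPN', 'N' ) or die;

my $dir = 'D:\Buñuel';
my $tmp = encode 'UTF16LE', "$dir\0";                       # NUL-terminated UTF-16LE string
my $raw = pack 'p', $tmp;                                   # native pointer to that string
my $ptr = unpack +( length($raw) == 8 ? 'Q' : 'L' ), $raw;  # keep the full pointer on 64-bit Perl
my $buf = "\0" x 200;                                       # 100 UTF-16 code units = 200 bytes
my $len = GetShortPathNameW( $ptr, $buf, 100 ) or die;      # size is given in code units

my $short = substr +( decode 'UTF16LE', $buf ), 0, $len;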

is $dir, $short,                "but they should not be the same!";
is shortpathL( $dir ), $short,  "because Win32 API is used anyway";


my $parser   = Spreadsheet::ParseXLSX-> new;
#my $workbook = $parser-> parse( shortpathL( "$dir/x.xlsx" ));   # will die here
my $workbook = $parser-> parse( 'D:\BUUEL~1/x.xlsx' );          # go on living


ok 1 - but they should not be the same!
ok 2 - because Win32 API is used anyway
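
One detail of the GetShortPathNameW call worth spelling out: the W functions measure buffer sizes in UTF-16 code units, not bytes, so the Perl-side buffer needs two bytes per unit. Below is a minimal sketch of that arithmetic in plain Perl. It does not call the API at all. It only simulates the API writing into the buffer, and the path and the capacity of 100 units are illustrative assumptions.

```perl
use strict;
use warnings;
use utf8;
use Encode qw/ encode decode /;

my $path  = 'D:\Café';                          # illustrative path
my $wide  = encode 'UTF-16LE', "$path\0";       # NUL-terminated wide string
my $units = length($wide) / 2;                  # code units, terminator included

# A buffer that can hold $max_units code units needs twice as many bytes.
my $max_units = 100;                            # assumed capacity
my $buf       = "\0" x ( 2 * $max_units );

# Simulation only: pretend the API copied $wide into the buffer and
# returned the length in code units without the terminator.
substr( $buf, 0, length($wide) ) = $wide;
my $returned = $units - 1;

# Decode the whole buffer, then keep only the characters the "API" reported.
my $result = substr( decode( 'UTF-16LE', $buf ), 0, $returned );
```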
