http://www.perlmonks.org?node_id=1061041


in reply to Re: open sqilite path unicode
in thread open sqilite path unicode

Hello, I'm running ActiveState Perl 5.16.

Here is the simple code I'm working on right now with some (maybe) useful error messages. The database is in a folder containing a "ü" in its name...

use strict;
use warnings;
use utf8;    # the pragma name is lowercase; "use UTF8" does not enable it
use Tk;
use DBI;     # DBI->connect is provided by DBI, which loads the driver
use DBD::SQLite;
use Data::Dumper;
$Data::Dumper::Useqq = 1;

my $mw = MainWindow->new;
my $types = [
    ['Text',      '.db'],
    ['All Files', '*'],
];
my $file_1 = $mw->getOpenFile(-filetypes => $types);

# just looking at how the path is
print Dumper($file_1);

# connecting to sqlite
my $dbh = DBI->connect(
    "dbi:SQLite:$file_1", "", "",
    { RaiseError => 1, AutoCommit => 1, PrintError => 0 },
);

... connection error

Re^3: open sqilite path unicode
by soonix (Canon) on Nov 03, 2013 at 13:36 UTC

    What file system are you using there? NTFS or one of the FAT versions? (I know about PHP having issues with NTFS and non-ASCII characters)

    And, perhaps you could diagnose what readdir returns as name of the concerned folder...

      I'm working on a Windows 8 machine (64-bit), NTFS: not working. I just tried the same script, same database, same ActiveState on a Windows 7 machine, NTFS... and it works... I'm confused.

        Don't forget that there are different ways to support Unicode in a file system. I've worked a lot with Unicode text in Perl and it handles it fairly well; nothing to complain about. File names are another matter, because the system surrounding the application complicates things. Unicode names can be stored in several encodings: UTF-8 is common on Linux (Unix?), but there are also UTF-16 (BE or LE), UCS-2, and UCS-4, and there may be other custom schemes.
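To make the difference between these encodings concrete, here is a minimal sketch that prints the bytes of "ü" (U+00FC, the character in the folder name) under a few of them. The helper name encoded_hex is made up for illustration, and cp1252 is included only as an assumption about a typical Western Windows code page; the original post does not say which one is in use.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode a character string and return its bytes
# as space-separated hex pairs.
sub encoded_hex {
    my ($encoding, $text) = @_;
    return join ' ', unpack '(H2)*', encode($encoding, $text);
}

my $u_umlaut = "\x{FC}";    # "ü", U+00FC

# cp1252 is an assumed example of a Windows ANSI code page.
for my $encoding (qw(UTF-8 UTF-16LE UTF-16BE cp1252)) {
    printf "%-9s %s\n", $encoding, encoded_hex($encoding, $u_umlaut);
}
# UTF-8     c3 bc
# UTF-16LE  fc 00
# UTF-16BE  00 fc
# cp1252    fc
```

Comparing these byte patterns with what a program actually receives as a path is one way to tell which encoding a given layer is using.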

        So, to solve the problem, first learn how Unicode characters are stored in your file system. Here readdir can definitely help: read the contents of the directory, dump the file and directory names as hex, and see which encoding is used. Of course, if you intend your application to run on different systems, you will have to learn how each of them supports Unicode and work out how your program can detect the correct encoding automatically.
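The readdir check described above could be sketched roughly as follows. The helper name hex_bytes and the choice of the current directory as a default are assumptions made for illustration; pass the parent of the folder containing "ü" as the first argument. Note that the output shows the bytes Perl's readdir hands back, which may not be the same as what the file system stores internally.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes of a name as space-separated
# hex pairs. Assumes $name holds raw bytes, as returned by readdir.
sub hex_bytes {
    my ($name) = @_;
    return join ' ', unpack '(H2)*', $name;
}

# Assumed default: inspect the current directory unless a path is given.
my $dir = @ARGV ? $ARGV[0] : '.';

opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
for my $entry (sort grep { $_ ne '.' && $_ ne '..' } readdir $dh) {
    # Print each name next to its byte values for comparison.
    printf "%-30s %s\n", $entry, hex_bytes($entry);
}
closedir $dh;
```

If the "ü" shows up as c3 bc, the name arrived as UTF-8; a single fc would suggest a single-byte code page such as Latin-1 or cp1252.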