in reply to Connect SQLite with unicode directory
Filesystems in general don't know about encoding.
Unixish filesystems (and the APIs) usually expose the filename as a binary blob, which matches well with using UTF-8 encoded filenames.
Windows filesystems (and the APIs) usually expose the filename as Wide Characters, so if you get the filename as UTF-8, you need to translate it to Wide Characters and you also need to use the Wide APIs (CreateFileW etc) to access such files.
As a workaround to these issues, I am a fan of Text::Unidecode (and Text::CleanFragment) to downcase characters to ASCII.
Personally, I try to avoid non-ASCII characters in the functional parts of programs and instead use the named entities:
my $PathCorpusDB2 = "\N{LOWER CASE LATIN LETTER U WITH DIAERESIS}/data +baseTest2.db";
This still won't solve your problem with Umlauts in the charset though. I think that using CreateFileW() with UTF-8 encoded filenames should work, but I don't know how to tell SQLite that.
|
---|