Re: Connect SQLite with unicode directory

by Corion (Pope)
in reply to Connect SQLite with unicode directory

Filesystems in general don't know about encoding.

Unixish filesystems (and the APIs) usually expose the filename as a binary blob, which matches well with using UTF-8 encoded filenames.
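A minimal sketch of what that means in Perl, assuming a Unixish system that stores the octets as given: encode the character string to UTF-8 yourself before handing it to open, and decode whatever readdir returns. The file name here is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use File::Temp qw(tempdir);

# Hypothetical file name containing U+00FC (u with diaeresis).
my $name  = "\N{U+00FC}ber.db";
my $bytes = encode('UTF-8', $name);    # the octets the filesystem will see

my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/$bytes" or die "Cannot create file: $!";
close $fh;

# readdir hands back octets, not characters, so decode them again.
opendir my $dh, $dir or die "Cannot read directory: $!";
my @found = map { decode('UTF-8', $_) } grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;

printf "%d characters, %d octets\n", length($name), length($bytes);
```

Some filesystems normalize names (older macOS filesystems decompose the umlaut), so the name read back is not guaranteed to compare equal to the one written.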

Windows filesystems (and the APIs) usually expose the filename as Wide Characters, so if you get the filename as UTF-8, you need to translate it to Wide Characters and you also need to use the Wide APIs (CreateFileW etc) to access such files.
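The translation step can be sketched with Encode alone. This only shows the conversion from UTF-8 octets to a NUL-terminated UTF-16LE string, which is the layout the Wide APIs expect; actually calling CreateFileW needs a module that wraps the Win32 API, which is not shown here. The path is a made-up example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Input as it might arrive: UTF-8 octets for a hypothetical path.
my $utf8_bytes = "\xC3\xBC/test.db";

my $chars = decode('UTF-8', $utf8_bytes);        # Perl character string
my $wide  = encode('UTF-16LE', $chars . "\0");   # NUL-terminated wide string

printf "%d characters, %d bytes of UTF-16LE\n", length($chars), length($wide);
```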

As a workaround to these issues, I am a fan of Text::Unidecode (and Text::CleanFragment) to transliterate characters to ASCII.
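A small sketch of that workaround, assuming Text::Unidecode is installed from CPAN. The file name is invented for the example. Note that the transliteration is lossy, so two different names can end up as the same ASCII string.

```perl
use strict;
use warnings;
use Text::Unidecode qw(unidecode);

# Hypothetical file name with an umlaut and a sharp s.
my $name  = "\N{U+00FC}ber-Gr\N{U+00F6}\N{U+00DF}e.db";

# unidecode() replaces each non-ASCII character with an ASCII approximation.
my $ascii = unidecode($name);

print "$ascii\n";
```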

Personally, I try to avoid non-ASCII characters in the functional parts of programs and instead use the named characters:

my $PathCorpusDB2 = "\N{LATIN SMALL LETTER U WITH DIAERESIS}/databaseTest2.db";
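The \N{...} escape only accepts the official Unicode character name, so when in doubt it is easy to look the name up with the core charnames module. A quick sketch:

```perl
use strict;
use warnings;
use charnames ':full';

# Look up the official name for code point U+00FC ...
my $official = charnames::viacode(0xFC);

# ... and go the other way, from name to code point.
my $code = charnames::vianame('LATIN SMALL LETTER U WITH DIAERESIS');

printf "U+%04X is %s\n", $code, $official;
```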

This still won't solve your problem with umlauts in the path, though. I think that using CreateFileW() with the filename converted from UTF-8 to Wide Characters should work, but I don't know how to tell SQLite to do that.

