Encoding problem(reposting in more detail)

by john_oshea (Priest)
on May 29, 2006 at 11:40 UTC

in reply to Encoding problem(reposting in more detail)

It appears that your filenames are in ISO-8859-7, so I'm wondering whether you could simply serve the HTML page(s) with ISO-8859-7 encoding as well. That would reduce the number of from_to conversions to just the final one, which puts the contents of $passage into the database.
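For reference, here is a minimal sketch of what that one remaining conversion might look like, assuming $passage holds ISO-8859-7 bytes and the database expects UTF-8. The sample bytes are hypothetical, and the database insert itself is left out.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Hypothetical sample data: the Greek letters alpha, beta, gamma
# as ISO-8859-7 bytes (one byte per letter).
my $passage = "\xE1\xE2\xE3";

# from_to converts the bytes in place and returns the new length
# in octets, or undef if the conversion fails.
my $length = from_to($passage, 'iso-8859-7', 'utf-8');
die "Conversion failed\n" unless defined $length;

# Show the result as hex so it does not depend on the terminal's encoding.
my $hex = join ' ', map { sprintf '%02X', ord } split //, $passage;
print "$length bytes: $hex\n";    # prints "6 bytes: CE B1 CE B2 CE B3"
```

Each Greek letter takes two bytes in UTF-8, which is why the length doubles.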

Encoding problem(reposting in more detail)
by Nik on May 29, 2006 at 12:40 UTC
    Well, that can work, and I thought of it, but I want to keep using UTF-8. If I can find out why Greek filenames aren't UTF-8 by default when I save them, and correct that, then I won't have to do any encoding conversion at all.

      OK, I've done some digging in your previous posts and have found this snippet from you:

      Because Windows is unable to save Greek filenames in UTF-8 format as well (it saves only the file contents that way, I'm afraid)

      which makes me wonder whether Windows is in fact saving the filenames correctly and you actually have a display issue in cmd.exe. Googling for 'windows greek filenames cmd.exe' gets me this as one of the results, which has, about halfway down, the following:

      While the reason for their existence might be odd, whitespaces, dots, percentage signs are perfectly legal characters for file names in Windows 2000 (though they weren't in DOS). The only illegal characters for file names in Windows are: \ / : * ? " < > | Anything else is OK, including international characters, which may seem odd to you if your FS is FAT and your codepage is not set correctly (or simply because you don't speak Russian/Greek/Hebrew/Arabic/etc).

      Unfortunately I don't 'do' Windows well enough to confirm if this is the issue or not, but, at the very least, something like this should get you started.
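
      One way to tell a storage problem from a display problem is to look at the raw bytes Perl receives for each filename, rather than at what cmd.exe draws. The following is only a sketch: hex_bytes is a helper name made up for this example, and the directory is assumed to be the current one. Greek letters showing up as single bytes, mostly in the C1 to FE range, would suggest ISO-8859-7 or the similar Windows-1253 code page; two-byte sequences starting with CE or CF would suggest UTF-8.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: render a string as space-separated hex bytes,
      # e.g. "E1 E2 E3".
      sub hex_bytes {
          my ($bytes) = @_;
          return join ' ', map { sprintf '%02X', ord } split //, $bytes;
      }

      # Assumption: the files of interest are in the current directory.
      my $dir = '.';

      opendir my $dh, $dir or die "Cannot open $dir: $!";
      for my $name (sort grep { $_ ne '.' && $_ ne '..' } readdir $dh) {
          # Print the name as the terminal shows it, then its raw bytes.
          printf "%-40s %s\n", $name, hex_bytes($name);
      }
      closedir $dh;
      ```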

      Hope that helps

