PerlMonks  

Encoding problem(reposting in more detail)

by Nik (Initiate)
on May 29, 2006 at 08:29 UTC ( [id://552268]=perlquestion )

Nik has asked for the wisdom of the Perl Monks concerning the following question:

The same question remains unanswered, although I have asked it in many IRC channels and forums: why aren't the Greek file name strings UTF-8, as I saved them when creating the files? If they were, they would have appeared correctly in the popup menu, and they wouldn't require three re-encodings back and forth between UTF-8 and ISO-8859-7. Here is the code again:
    my @files = <../data/text/*.txt>;
    my @display_files = map /([^\/]+)\.txt/, @files;

    # 1st conversion, so the names appear correctly in the popup menu
    Encode::from_to($_, "ISO-8859-7", "utf8") for @display_files;

    print br;
    print start_form( action=>'index.pl' );
    print h1( {class=>'lime'}, "&#917;&#960;&#941;&#955;&#949;&#958;&#949; &#964;&#959; &#954;&#949;&#943;&#956;&#949;&#957;&#959; &#960;&#959;&#965; &#963;&#949; &#949;&#957;&#948;&#953;&#945;&#966;&#941;&#961;&#949;&#953; => ",
              popup_menu( -name=>'select', -values=>\@display_files ),
              submit('&#917;&#956;&#966;&#940;&#957;&#953;&#963;&#951;'));
    print end_form;

    my $passage = param('select') || "&#913;&#961;&#967;&#953;&#954;&#942; &#931;&#949;&#955;&#943;&#948;&#945;!";

    # 2nd conversion, so the file the user selected from the popup menu can be opened
    Encode::from_to($passage, "utf8", "ISO-8859-7") if param();

    if ( param('select') )
    {
        open(FILE, "<../data/text/$passage.txt") or die $!;
        local $/;
        $data = <FILE>;
        close(FILE);

        # 3rd conversion, so the selected name can be inserted as "UTF8" in the database
        Encode::from_to($passage, "ISO-8859-7", "utf8");

        $select = $dbh->prepare( "UPDATE guestlog SET passage=?, date=?, counter=counter+1 WHERE host=?" );
        $select->execute( $passage, $date, $host );
    }
    else
    {
        # more code
    }
As you can see, this re-encoding is becoming very tedious, and not only in this script (index.pl) but in others as well.
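
One possible way to cut down the three conversions, sketched under the thread's assumption that the file system hands back names as ISO-8859-7 bytes: convert each name once while building the menu, and remember which on-disk name each UTF-8 label came from. The helper `build_name_map` and the variable `$fs_encoding` are hypothetical names and not part of the original script.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: names come back from the file system as ISO-8859-7 bytes.
# Change this if the real encoding turns out to be something else.
my $fs_encoding = 'ISO-8859-7';

# Hypothetical helper: map each UTF-8 menu label to its on-disk byte name.
sub build_name_map {
    my (@raw_names) = @_;
    my %map;
    for my $raw (@raw_names) {
        my $chars = decode($fs_encoding, $raw);   # bytes -> characters
        my $label = encode('UTF-8', $chars);      # characters -> UTF-8 bytes
        $map{$label} = $raw;                      # keep the on-disk form
    }
    return \%map;
}

# Example: the three letters alpha, beta, gamma as ISO-8859-7 bytes.
my $names = build_name_map("\xE1\xE2\xE3");
my @menu_labels = sort keys %$names;              # values for popup_menu
```

With such a map, the submitted value (UTF-8, provided the page is served as UTF-8) could be looked up to get the on-disk name for open() and passed unchanged to the database, so the second and third from_to calls would no longer be needed. A submitted value that is not in the map would simply never reach open().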

Replies are listed 'Best First'.
Re: Encoding problem(reposting in more detail)
by john_oshea (Priest) on May 29, 2006 at 11:40 UTC

    It appears that you've got filenames in ISO-8859-7, so I'm wondering if you couldn't just serve the html page(s) out with ISO-8859-7 encoding as well? This would reduce the number of from_to conversions to the final one that inputs the contents of $passage into the database.

    I've not seen your previous postings, so apologies in advance if this is repeating previous comments. Hope that helps.
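
A minimal sketch of that suggestion, assuming the page is sent with an ISO-8859-7 charset so that the file names can go into the menu and come back from the form unchanged. Both helper names are hypothetical; only the final conversion for the database remains.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: HTTP header that declares the page charset.
sub content_type_header {
    my ($charset) = @_;
    return "Content-Type: text/html; charset=$charset\r\n\r\n";
}

# Hypothetical helper: the one remaining conversion, for the database.
# Takes ISO-8859-7 bytes and returns UTF-8 bytes.
sub to_utf8_for_db {
    my ($iso_bytes) = @_;
    return encode('UTF-8', decode('ISO-8859-7', $iso_bytes));
}
```

The script would print content_type_header('ISO-8859-7') before any HTML and call to_utf8_for_db($passage) just before the UPDATE. Whether the browser really submits the form in the page's charset should be checked rather than assumed.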

      Well, that can work and I thought of it, but I want to keep using UTF-8. If I can find out why the Greek file name strings aren't UTF-8 by default, as I saved them, and correct that, then I won't have to do any re-encoding at all.

        OK, I've done some digging in your previous posts and have found this snippet from you:

        Because Windows is unable to save Greek filenames in UTF-8 format as well (it only saves the file contents that way, I'm afraid)

        which makes me wonder if Windows is saving the filenames correctly and that you actually have a display issue in cmd.exe. Googling for 'windows greek filenames cmd.exe' gets me this as one of the results, which has, about halfway down, the following:

        While the reason for their existence might be odd, whitespace, dots and percentage signs are perfectly legal characters for file names in Windows 2000 (though they weren't in DOS). The only illegal characters for file names in Windows are: \ / : * ? " < > | Anything else is OK, including international characters, which may seem odd to you if your FS is FAT and your codepage is not set correctly (or simply because you don't speak Russian/Greek/Hebrew/Arabic/etc).

        Unfortunately I don't 'do' Windows well enough to confirm if this is the issue or not, but, at the very least, something like this should get you started.

        Hope that helps
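
To separate a display problem from an encoding problem, it may help to look at the raw bytes Perl receives for each file name instead of what cmd.exe shows. The sketch below uses two hypothetical helpers: one tests whether a byte string is well-formed UTF-8, the other prints the bytes in hex. Note that a pure ASCII name is valid in both encodings, so only names containing Greek letters tell you anything.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns 1 if the byte string is well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $ok = eval {
        # FB_CROAK dies on malformed input; LEAVE_SRC keeps $bytes untouched.
        Encode::decode('UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

# Hypothetical helper: space-separated hex dump of the raw bytes.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}
```

Running both helpers over the names returned by the glob in the original script would show, for example, a Greek alpha as `CE B1` if the name is UTF-8 or as a single `E1` if it is ISO-8859-7.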
