
Weird encoding after grabbing filenames

by Nik (Initiate)
on Jun 16, 2009 at 19:42 UTC (#772123=perlquestion)

Nik has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have the following code:

#!/usr/bin/perl -w
use strict;
use CGI::Carp qw/fatalsToBrowser/;
use CGI qw/:standard/;
use DBI;
use Encode;

print "Content-Type: text/html\n\n";

my @files = glob "$ENV{'DOCUMENT_ROOT'}/data/text/*.txt";
my @menu_files = map {/([^\/]+)\.txt$/} @files;
Encode::from_to(@menu_files, 'ISO-8859-7', 'utf8');

print "@files";
print "@menu_files";

I don't know why, but when I get all the files from the '/data/text' folder and print them, the path that precedes each file appears fine, while the filename itself appears as squares (weird encoding).
Here is the output of this code: http://tech-nikos.gr/cgi-bin/test.pl
I tried to switch the encoding from Greek to UTF-8, but whether I use the conversion or not, the output remains the same. Any ideas why?

Replies are listed 'Best First'.
Re: Weird encoding after grabbing filenames
by moritz (Cardinal) on Jun 16, 2009 at 19:48 UTC
    print "Content-Type: text/html\n\n";

    Always include the encoding, i.e. Content-Type: text/html; charset=utf-8\n\n (or let CGI generate that for you, since you use it anyway).

    Also, you have to pass a scalar as the first argument to from_to, not an array. Read the docs for usage information.
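To illustrate the point about from_to() wanting a scalar, here is a minimal sketch. The sample data is hypothetical: it assumes the names arrive as ISO-8859-7 bytes, as the thread suggests, and uses the three bytes E1 E2 E3, which are the Greek letters alpha, beta, gamma in that encoding.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical sample data: the bytes E1 E2 E3 are alpha, beta, gamma
# in ISO-8859-7; 'readme' is plain ASCII.
my @menu_files = ("\xE1\xE2\xE3", 'readme');

# from_to() converts its first argument in place, so it is called once
# per element. The loop variable aliases each array element, which means
# the array itself is updated.
for my $name (@menu_files) {
    Encode::from_to($name, 'ISO-8859-7', 'UTF-8');
}

# Print the converted bytes as hex so the result is readable on any terminal.
print join(' ', map { sprintf '%02X', ord } split //, $menu_files[0]), "\n";
# prints: CE B1 CE B2 CE B3
```

The ASCII-only element comes out unchanged, since ASCII characters have the same byte values in both encodings.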

      Yes, I do use the CGI equivalent in longer scripts, that is, 'print header( -charset=>'utf-8' );'.

      Yes, but I need to re-encode all filenames (@menu_files) to UTF-8 in one step. Can't it be done without a repetition loop?
      Why is the path of the files displayed correctly while the filename appears like this?

        Can't it be done without a repetition loop?

        You want to repeat an action without a loop?

        Well, I suppose you could do

        from_to($menu_files[0], 'ISO-8859-7', 'UTF-8') if @menu_files >= 1;
        from_to($menu_files[1], 'ISO-8859-7', 'UTF-8') if @menu_files >= 2;
        from_to($menu_files[2], 'ISO-8859-7', 'UTF-8') if @menu_files >= 3;
        from_to($menu_files[3], 'ISO-8859-7', 'UTF-8') if @menu_files >= 4;
        die("Need more!") if @menu_files >= 5;

        Does it count as a loop if the repeating is done by the person rather than the computer?

        Or, if all you want to do is hide the loop:

        sub from_to_multi {
            my $fr = shift;
            my $to = shift;
            from_to($_, $fr, $to) for @_;
        }

        from_to_multi('ISO-8859-7', 'UTF-8', @menu_files);

        But then you end up with two loops. One to place the elements on the stack, and one to process the elements on the stack.

        When I visited your web page (http://tech-nikos.gr/cgi-bin/test.pl), I was able to get a sensible display by telling my browser to treat the page as ISO-8859-7 (Greek). But I gather you want the text to be in UTF-8, which I think would be a good idea.

        As you follow moritz's good advice, you have to respect the docs regarding Encode::from_to(). Here's an easy way to do the required loop in a single line of code:

        Encode::from_to($_, 'ISO-8859-7', 'utf8') for (@menu_files);

        The path strings show up fine because they consist only of plain ASCII characters. Only the file names contain non-ASCII characters, and if the web server and the browser don't agree on the encoding of those characters, they come out as noise.

        (updated to fix grammar in first sentence)
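
Putting the advice in this thread together, the following is a rough sketch rather than a drop-in fix. It assumes the file names on disk really are ISO-8859-7 bytes. The helper name basenames_as_utf8 and the sample paths are made up for illustration, and the list of paths stands in for the result of glob. Instead of from_to() it uses Encode's decode() and encode(), which return new strings and leave the input untouched.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes the assumed filesystem encoding and a list of
# paths, and returns the base names (without ".txt") as UTF-8 byte strings.
sub basenames_as_utf8 {
    my ($fs_encoding, @paths) = @_;
    return map {
        my ($name) = m{([^/]+)\.txt\z};
        # Skip paths that do not match; otherwise decode the raw bytes to
        # characters and re-encode them as UTF-8.
        defined $name ? encode('UTF-8', decode($fs_encoding, $name)) : ();
    } @paths;
}

# Hypothetical sample paths standing in for glob() output. The bytes
# E1 E2 E3 are alpha, beta, gamma in ISO-8859-7.
my @paths = (
    "/var/www/data/text/\xE1\xE2\xE3.txt",
    "/var/www/data/text/plain.txt",
);

my @menu = basenames_as_utf8('ISO-8859-7', @paths);

# Declare the charset so the browser and the script agree on the encoding.
print "Content-Type: text/html; charset=utf-8\n\n";
print join("<br>\n", @menu), "\n";
```

The map hides the loop without mutating @paths, and the charset in the header is what lets the browser interpret the non-ASCII bytes as intended.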
