http://www.perlmonks.org?node_id=960836

packetstormer has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

I have a REALLY strange display problem that I can only assume is a bug in Perl Templates (very unlikely, I know!). If you stay with me I will explain what is happening:

I have a small MySQL database with a table being queried. Below are the function that queries the database, the file that calls the function, and the template file that displays the data. The first set displays the data correctly. The second set queries the same database and the same table; the only difference is the WHERE clause. Yet the data is displayed incorrectly.

Working function, file and template

    # Calling file
    #!/usr/bin/perl
    use strict;
    use Template;
    use CGI;
    use CGI::Session ( '-ip_match' );
    use DBI;
    use Data::Dumper;
    require "class.pm";

    my $cgi = new CGI;
    my @cases = show_cases_home($library_id,$users_id);
    my $template = Template->new();
    my $file = 'templates/home.tt';
    my $vars = { cases => @cases };
    $template->process($file,$vars) || die $template->error(), "\n";

    # SUB ROUTINE IN CLASS FILE
    sub show_cases_home {
        my $dbh = new_dbh();
        my @cases;
        my $library_id = $_[0];
        my $assigned_to = $_[1];
        #my $query = 'SELECT c.*, DATE_FORMAT(c.added_date, "%d/%m/%Y"), b.firstname, b.surname from cases c, borrower b
        #WHERE c.library_id=? AND c.assigned_to = ? and c.borrower_id = b.borrower_id order by added_date desc';
        my $query = 'SELECT c.caseid,c.status,c.case_header,DATE_FORMAT(c.added_date, "%d/%m/%Y"), b.firstname, b.surname
                     FROM cases c LEFT JOIN borrower b USING (borrower_id)
                     WHERE c.library_id=? AND c.assigned_to =? AND c.status = "open" order by added_date desc';
        my $sth = $dbh->prepare($query);
        $sth->execute($library_id,$assigned_to);
        while (my @row = $sth->fetchrow_array()) {
            push @cases, \@row;
        }
        return \@cases;
    }

    # TEMPLATE FILE SHOWING ALL CORRECTLY
    [% INCLUDE templates/header_auth.inc %]
    [% FOREACH wah IN cases %]
    <td><a href="show_case.pl?caseid=[% wah.0 %]">[% wah.0 %]</a></td><td>[% wah.3 %]</td><td>[% wah.2 %]</td><td>[% wah.4 %] [% wah.5 %]</td><td>[% wah.1 %]</td>
    </tr>
    [% END %]
    # END of WORKING FILES

The next set uses the same class file, the same dbh connection and the same table. However, data containing accented characters, e.g. "Séan", doesn't display correctly. Note: both template files use the same header (UTF8), call the same database (using the same dbh call with UTF8 enabled) and query the same table.

    use strict;
    use Template;
    use CGI;
    use CGI::Session ( '-ip_match' );
    use DBI;
    use Data::Dumper;
    use Sphinx::Search;
    require "class.pm";

    my $cgi = new CGI;
    my $dbh = new_dbh();
    my $caseid = $cgi->param("caseid");
    my $users_id = "31";
    my @lines = word_test($caseid,$users_id);

    # Sub routine
    sub word_test {
        my $dbh = new_dbh();
        my @cases;
        my $caseid = $_[0];
        my $assigned_to = $_[1];
        my $query = 'SELECT caseid,status,case_header,DATE_FORMAT(added_date, "%d/%m/%Y")
                     FROM cases where caseid = ? and assigned_to = ? order by added_date desc';
        my $sth = $dbh->prepare($query);
        $sth->execute($caseid,$assigned_to);
        while (my @row = $sth->fetchrow_array()) {
            push @cases, \@row;
        }
        return \@cases;
    }

    # TEMPLATE FILE
    [% INCLUDE templates/header_auth.inc %]
    [% FOREACH row IN lines %]
    "[% row.0 %] [% row.2 %] " <br />
    [% END %]
    # END FILE

If I remove the caseid from the function, the data seems to be returned and displayed correctly.
I have spent hours on this today and I am hoping it is a stupid problem that someone will spot straight away!

Anyone!?

Re: Bug in Template?
by Ralesk (Pilgrim) on Mar 21, 2012 at 23:29 UTC

    I apologise for being off-topic right in the first reply, but:

    my $vars = { cases => @cases };

    This poked me in the eye, and I don’t think you can do that — well, you can, but you probably shouldn’t. This will create $vars with { cases => $cases[0], "$cases[1]" => $cases[2], … } and that’s more than likely not what you want.
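The flattening described above can be seen in a minimal sketch. The rows in `@cases` are made-up stand-ins for query results, and the variable names `$flattened` and `$by_ref` are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for the query results.
my @cases = ( [ 1, 'open' ], [ 2, 'open' ], [ 3, 'closed' ] );

# List flattening: the array expands inside the hash constructor, so the
# rows become alternating keys and values after the key "cases".
my $flattened = { cases => @cases };

# Passing a reference keeps every row under the single key "cases".
my $by_ref = { cases => \@cases };

printf "flattened: %d keys\n", scalar keys %$flattened;
printf "by_ref:    %d key, %d rows\n",
    scalar keys %$by_ref, scalar @{ $by_ref->{cases} };
```

In the posted code, show_cases_home() returns an array reference, so @cases holds exactly one element and `{ cases => @cases }` happens to produce the intended structure. Writing `cases => \@cases`, or storing the returned reference in a scalar, would probably be less fragile.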

Re: Bug in Template?
by Ralesk (Pilgrim) on Mar 21, 2012 at 23:42 UTC

    Being more on-topic now, there are a few places where character encoding, and in particular UTF-8, can go wrong: the database, the database connection, the code (e.g. a bad encode/decode), and the output.

    You didn’t say what kind of output you get instead of “Séan”. I think we’ll need that information to debug this.
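One way to collect that information is to dump the string as code points instead of relying on what the browser or terminal shows. The sketch below is only an illustration: `describe()` is a hypothetical helper, not part of any module, and the two sample strings are assumptions about what the database layer might hand back.

```perl
use strict;
use warnings;

# describe() is a hypothetical helper: it reports the length, the internal
# UTF8 flag and the code point of every character in a string.
sub describe {
    my ($str) = @_;
    my $flag  = utf8::is_utf8($str) ? 'on' : 'off';
    my $codes = join ' ', map { sprintf 'U+%04X', ord($_) } split //, $str;
    return sprintf 'length=%d utf8_flag=%s codes=[%s]', length($str), $flag, $codes;
}

my $chars = "S\x{E9}an";     # character string: 4 characters
my $bytes = "S\xC3\xA9an";   # the same name as UTF-8 encoded bytes: 5 bytes

print describe($chars), "\n";
print describe($bytes), "\n";
```

Calling something like this on `row.2` just before it goes to the template should show whether the value is a 4-character string or a 5-byte UTF-8 sequence, which narrows down the layer where things go wrong.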

Re: Bug in Template?
by remiah (Hermit) on Mar 21, 2012 at 23:55 UTC
    Hello packetstormer

    You said
    > (using the same dbh call with UTF8 enabled)
    This may mean the strings are in Perl's internal (decoded) UTF-8 form. If so, encode them to external UTF-8 before you pass them to Template.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Encode qw(encode decode);
        use Template;

        my @chars_not_encoded=();
        my @chars_encoded=();
        #foreach my $code ( hex('3041') .. hex('3096') ){
        foreach my $code ( hex('00C0') .. hex('00F0') ){
            push @chars_not_encoded, chr($code);
            push @chars_encoded, encode('utf8', chr($code)) ;
        };
        my $t =Template->new();
        #corrupt output
        $t->process("test.tmpl", {lines=>\@chars_not_encoded}, "log_noenc" ) or die $t->error();
        #OK
        $t->process("test.tmpl", {lines=>\@chars_encoded}, "log_enc" ) or die $t->error();

    And the template:

        <html>
        <head>
        <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
        </head>
        <body>
        [% FOREACH item IN lines %]
        item=#[% item %]#<br>
        [% END %]
        </body>
        </html>

    I am also confused about how Template handles encoding, and there seems to be a lot to read about these problems (for example, Template::Provider::Encoding)... good luck
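The usual pattern behind this advice is to decode bytes when they enter the program and encode characters when they leave it. The sketch below uses only the core Encode module. The input bytes are an assumption about what a database might return, and whether encoding before the template step is the right place depends on how the output handle is set up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from a database or file (assumed to be UTF-8).
my $bytes_in = "S\xC3\xA9an";

# Decode once at the boundary: 5 bytes become 4 characters.
my $text = decode('UTF-8', $bytes_in);

# Character operations now see whole characters, including the accented one.
my $upper = uc $text;

# Encode once on the way out, and write to a handle without an encoding layer.
my $bytes_out = encode('UTF-8', $upper);
binmode STDOUT;
print $bytes_out, "\n";
```

Encoding twice, or encoding and then also printing through a `:utf8` layer, produces the familiar "Ã©" style of double-encoded output, so only one of the two should be in effect.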

      Oh ... it seems I am totally confused.
      I'll post again later when I have cleared my mind.

        This does not seem to be a problem with Template. I would also like advice on this.

        The é in “Séan” is probably U+00E9 in the Unicode table (http://www.utf8-chartable.de/unicode-utf8-table.pl). I thought that decoding it to Perl's internal UTF-8 and then encoding it as UTF-8 before passing it to Template would work, but it does not. Even without Template, the behavior is strange.

            #!/usr/bin/perl
            use strict;
            use warnings;
            use Encode qw(is_utf8 encode decode);
            use Template;

            my(@raw, @decoded_internal_utf8,@encoded_raw_utf8,@encoded_internal_utf8);
            my @chars=hex('00C0') .. hex('00F0'); #target characters
            #my @chars=hex('3041') .. hex('3096'); #hiragana

            foreach my $code ( @chars ){
                my($raw, $chr);
                $raw =chr($code);
                if ( is_utf8($raw) ){
                    $chr=$raw;
                } else {
                    $chr=decode('utf8',$raw);
                }
                push @raw, $raw;
                push @decoded_internal_utf8, $chr;
                push @encoded_raw_utf8 , encode('utf8', $raw);
                push @encoded_internal_utf8, encode('utf8', $chr);
            }
            print "======================\n";
            print "perl=$^X : version=$]\n";
            print "1.###raw\n";
            print "#$_#\n" for @raw;
            print "2.###decoded_intenal_utf8\n";
            #print "#$_#\n" for @decoded_internal_utf8;
            print "3.###encoded_raw_utf8\n";
            print "#$_#\n" for @encoded_raw_utf8;
            print "4.###encoded_internal_utf8\n";
            print "#$_#\n" for @encoded_internal_utf8;

        It is strange that only No. 3 works in this case. I usually print characters as in No. 4. Japanese characters such as hiragana seem to have no problem (for example, '3041' .. '3096').

        I saw a similar problem in "Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma?". At that time I didn't understand it well and thought a newer version would not have the problem. Is this the same issue? I tried with 5.012002 and 5.014002; they print exactly the same output apart from the version number.
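A possible reading of the difference between the two ranges, offered as a sketch and not a confirmed diagnosis: chr() returns a string without the internal UTF8 flag for code points below 0x100 and with the flag for larger ones, so the is_utf8() test in the script takes a different branch for é than for hiragana. `hex_dump()` below is a hypothetical helper used only for this illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode is_utf8);

# hex_dump() is a hypothetical helper: it shows each character as hex.
sub hex_dump { join ' ', map { sprintf '%02X', ord($_) } split //, $_[0] }

my $latin = chr(0xE9);      # U+00E9, below 0x100: UTF8 flag is off
my $kana  = chr(0x3041);    # U+3041, above 0xFF:  UTF8 flag is on

# Case 3: encode the character string to UTF-8 bytes.
my $latin_bytes = encode('utf8', $latin);
my $kana_bytes  = encode('utf8', $kana);

# Case 4 for U+00E9: is_utf8() is false, so the posted script calls
# decode(), which treats the single byte 0xE9 as (malformed) UTF-8 input.
my $latin_decoded = decode('utf8', $latin);

printf "U+00E9 flag=%d encoded=%s\n", is_utf8($latin) ? 1 : 0, hex_dump($latin_bytes);
printf "U+3041 flag=%d encoded=%s\n", is_utf8($kana)  ? 1 : 0, hex_dump($kana_bytes);
printf "decode() of the U+00E9 string gives U+%04X\n", ord($latin_decoded);
```

Under this reading, case 1 prints the single byte 0xE9, which is not valid UTF-8 on its own, and case 4 encodes a string that decode() has already altered. For hiragana the flag is on, so cases 3 and 4 encode the same string and both work. The is_utf8() flag describes internal storage only, so it is generally not a reliable test of whether a string still needs decoding.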