PerlMonks  

Encoding issue after upgrade

by Bod (Parson)
on May 06, 2024 at 21:08 UTC ( [id://11159309]=perlquestion )

Bod has asked for the wisdom of the Perl Monks concerning the following question:

After a server change, we are getting lots of strange characters from an encoding issue.
Double spaces and emojis are displayed as Â

I think this issue is related to the change of Perl version from 5.16.3 to 5.36.0. From the perldelta documents, I note there have been some changes to the way Perl handles UTF-8 encoding, but I don't understand the implications of these.

We've also upgraded MariaDB from 10.5 to 10.11, but both the character set and the collation are unchanged: utf8mb4 and utf8mb4_general_ci respectively.

This issue is not just about data that was created prior to the change. Although emojis created after the change are not mutilated, double spaces are.

All web output is UTF8 encoded using:
Content-Type: text/html; charset=UTF-8

Any suggestions where I should look to solve this issue?
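For readers following along, here is a minimal sketch of one common way an Â can end up next to a space. It assumes (the thread does not confirm this) that the "double spaces" are stored as U+00A0 NO-BREAK SPACE followed by an ordinary space, and that somewhere in the pipeline already-encoded UTF-8 bytes are treated as Latin-1 characters and encoded a second time. The helper name `hex_bytes` is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumption: a "double space" is U+00A0 (NO-BREAK SPACE) plus a plain space.
my $text = "\x{A0} ";

# Correct UTF-8 encoding: the NBSP becomes the two bytes C2 A0.
my $utf8_bytes = encode('UTF-8', $text);

# The mistake being illustrated: treat those bytes as Latin-1 characters
# and encode them to UTF-8 a second time.
my $as_latin1 = decode('ISO-8859-1', $utf8_bytes);
my $double    = encode('UTF-8', $as_latin1);

# Hypothetical helper: show a byte string as space-separated hex pairs.
sub hex_bytes { join ' ', map { sprintf '%02x', ord } split //, shift }

print hex_bytes($utf8_bytes), "\n";   # c2 a0 20
print hex_bytes($double), "\n";       # c3 82 c2 a0 20
```

A browser told that the page is UTF-8 decodes `c3 82` as Â, which is why the stray character shows up immediately before the space. Whether this is what happens on the server described here is only a guess until the actual bytes are inspected.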

Replies are listed 'Best First'.
Re: Encoding issue after upgrade
by ikegami (Patriarch) on May 06, 2024 at 22:20 UTC

    I think this issue is related to the change of Perl version from 5.16.3 to 5.36.0.

    Unlikely.

    Any suggestions where I should look to solve this issue?

    Provide a demonstration, such as a minimal program that exhibits the problem.

      Provide a demonstration, such as a minimal program that exhibits the problem.

      If I knew how to reproduce the problem, I wouldn't be asking for places to look for the problem!

      In changing from one server setup to another, transferring the data from one instance of MariaDB to another via SQL dumpfiles, and running the same Perl scripts albeit under a different version of Perl, we have gone from correctly rendering webpages to webpages containing numerous Â characters.

      An example blog post on our test site. The Â characters did not appear prior to the change.

        Compare the database data bytewise (old versus new). Log each function's input arguments, compare the logs on the old and new systems. Run each function on the old and new system, compare the returned data (should have been covered by tests).

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
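The bytewise comparison suggested above could be sketched roughly as follows. The two sample strings are hypothetical stand-ins for the same column value fetched from the old and the new server; the helper names `hex_dump` and `first_diff` are invented for this example. On the database side, MariaDB's `HEX()` function can produce a comparable dump of a stored value.

```perl
use strict;
use warnings;

# Render a byte string as space-separated hex pairs.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

# Return the offset of the first differing byte, or -1 if identical.
sub first_diff {
    my ($old, $new) = @_;
    my $limit = length($old) < length($new) ? length($old) : length($new);
    for my $i (0 .. $limit - 1) {
        return $i if substr($old, $i, 1) ne substr($new, $i, 1);
    }
    return length($old) == length($new) ? -1 : $limit;
}

# Hypothetical sample values, not taken from the real database.
my $old_bytes = "a\xC2\xA0b";
my $new_bytes = "a\xC3\x82\xC2\xA0b";

printf "old: %s\n", hex_dump($old_bytes);
printf "new: %s\n", hex_dump($new_bytes);
printf "first difference at byte %d\n", first_diff($old_bytes, $new_bytes);
```

If the dumps from both servers are identical, the stored data is probably not the culprit and attention can move to the code that reads or prints it.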

        If I knew how to reproduce the problem

        You didn't say anything about the problem being intermittent. You made it sound like it was the opposite, that you always got the junk characters with previously-inserted data. Is this not the case? That would make the problem reproducible.

        And since it is reproducible, the request is very straightforward. Simply remove everything that's not relevant. You can cut down on huge swaths of code by determining if it's a problem with the data coming from the DB, or if it's a problem with the output.
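As a rough aid for deciding whether the data coming from the DB is already damaged, the heuristic below classifies a raw byte string. It is only a sketch: the function names `try_decode_utf8` and `classify` are invented here, the input is assumed to be the undecoded bytes as fetched from the database, and the "possibly double-encoded" verdict is a guess that can produce false positives on unusual text.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Strict UTF-8 decode; returns the decoded string, or undef if invalid.
sub try_decode_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
}

# Heuristic classification of a raw byte string.
sub classify {
    my ($bytes) = @_;
    return 'ascii' unless $bytes =~ /[^\x00-\x7F]/;

    my $once = try_decode_utf8($bytes);
    return 'not utf-8' unless defined $once;

    # If every decoded character fits in a single byte, the text may itself
    # be UTF-8 bytes that were encoded a second time. Check by mapping the
    # characters back to bytes and attempting another strict decode.
    if ($once !~ /[^\x00-\xFF]/) {
        my $twice = try_decode_utf8(encode('ISO-8859-1', $once));
        return 'possibly double-encoded' if defined $twice;
    }
    return 'utf-8';
}

print classify("plain text"), "\n";            # ascii
print classify("a\xC2\xA0b"), "\n";            # utf-8
print classify("a\xC3\x82\xC2\xA0b"), "\n";    # possibly double-encoded
print classify("a\xA0b"), "\n";                # not utf-8
```

If values straight from the DB come back as 'possibly double-encoded', the import or the connection settings deserve a look first; if they come back as 'utf-8', the output side is the more likely place to investigate.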

        If it truly isn't reproducible, then please re-explain the problem more clearly.
