PerlMonks
Yet another Encoding issue...
by Bod (Parson)
on Jun 01, 2024 at 19:34 UTC ( [id://11159743]=perlquestion )
Bod has asked for the wisdom of the Perl Monks concerning the following question:

I'm using AI::Chat to create an AI-powered chat for practising Turkish. The first part is for the AI to analyse the Turkish supplied by the user (me) and check it for errors. Because Turkish uses some non-Latin characters in its alphabet, this has created another character encoding issue for me. To eliminate the OpenAI API and AI::Chat, I have created this test script that demonstrates the issue... (no apologies for inline CSS, marto - this is a quick and dirty test script!)
The incl::HTML module (here renamed to incl::HTMLtest) takes the URL query string and splits it into key/value pairs that it puts into %data.

In this minimalistic script, text is entered into <div id="userChat"> and sent back to the Perl script when the button is clicked. This uses the fetch API. The content is in $data{'userChat'}, which is just sent back as a very simple JSON object to be written into <div id="chatBox">.

This works as expected until we introduce non-Latin characters - for example "café", which gets displayed as "cafÃ©".

I've captured the query string before decoding and it is "userChat=caf%C3%A9". It seems very strange to me that we start off with four characters in "café" and seem to get to five with "caf%C3%A9", which gets decoded as five characters...

The code that does the decoding in incl::HTML looks like this. I cannot recall where it came from, but it has been working for many, many years and has definitely handled Turkish characters in the past under Perl v5.16.3. I wonder if it is failing after the change to Perl v5.36.0.
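The four-versus-five discrepancy can be reproduced in isolation. What follows is only a minimal sketch, not the decoding code from incl::HTML: the variable names are made up for illustration, and it assumes the browser sent the text as UTF-8 before percent-encoding it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the value captured from the query string.
my $raw = 'caf%C3%A9';

# Percent-decode: each %XX escape becomes exactly one byte.
(my $bytes = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;

# $bytes now holds five bytes: c, a, f, 0xC3, 0xA9.
printf "bytes: %d\n", length $bytes;

# Interpreting those bytes as UTF-8 gives back the four characters typed.
my $chars = decode('UTF-8', $bytes);
printf "chars: %d\n", length $chars;
```

In UTF-8 the character é is the two-byte sequence C3 A9, so the percent-encoded form counts bytes rather than characters. The five "characters" only appear when those bytes are treated as one character each.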
I am beginning to think that I will never understand this mysterious world of character encodings... then I remember that for many, many years references, especially hashrefs, were a total mystery to me, and now I use them without having to think too hard about it. This is in no small part thanks to the Monastery, and I'm hoping a similar magical revelation might be bestowed on me for character encoding! Everything was so much easier when all we had was ASCII!