PerlMonks  

Problem with SOAP::Lite and accented characters.

by jdtoronto (Prior)
on Dec 11, 2006 at 19:57 UTC (#589132=perlquestion)
jdtoronto has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed monks,

SOAP::Lite has bitten me yet again. I have an application that queries a server, which returns data from a MySQL database. Everything is fine as long as there are no 'oddball' characters in the data. For example, I need do nothing more than add an é (0xE9 according to the Windows character map) and the whole thing stops. The table in MySQL has been defined with a character set of UTF-8.

I assume I am doing something inherently stupid; surely no reasonable module would reject standard character sets like that?

In the following example method, taken from my server, I have whittled the code down to the point where I just set the name.

sub _10533 {    # test method
    my ( $class, $message ) = @_;
    use XML::Simple;
    my $ref = XMLin($message);
    $ref->{data}->{firstname} = 'joé';
    return XMLout( $ref, KeepRoot => 1 );
}
If the name is 'joe' then the client gets the XML; if it is 'joé' then no data is returned.

So what am I doing wrong?

jdtoronto

Update: corrected typo, thanks marto.

Re: Problem with SOAP::Lite and accented characters.
by Joost (Canon) on Dec 11, 2006 at 20:02 UTC
      Good question, but exactly the same thing happened when the database table had a default character set of latin-1.

      jdtoronto

        OK, but an encoding mismatch might still be the source of the problem: if your (SOAP) XML prolog and/or HTTP headers end up with the wrong indication for the character encoding, it's likely you'll run into trouble somewhere. IIRC, Perl will use either UTF-8 or Latin-1 for 'high-bit' encoding.

        You probably expect your XML to be in one or the other, so you need to be sure the data in it is correctly encoded. If everything else is working correctly, doing an Encode::encode() to utf-8 or latin-1 over the whole resulting XML response and a binmode() to :bytes (or possibly just a binmode to utf8 if that's what you're using) should work.
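A minimal sketch of the encode-then-output step suggested above, using only the core Encode module. The XML string is built by hand purely for illustration (in the original it would come from XMLout), and the variable names are hypothetical; whether this cures the SOAP::Lite problem is not confirmed in this thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical response body; stands in for the result of XMLout().
# "\x{e9}" is the single character e-acute (U+00E9) in a Perl character string.
my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
        . qq{<data><firstname>jo\x{e9}</firstname></data>};

# Encode the whole response to UTF-8 bytes so the payload matches
# the encoding declared in the XML prolog.
my $bytes = encode( 'UTF-8', $xml );

# The data is already encoded, so write it out as plain bytes.
binmode( STDOUT, ':bytes' );
print $bytes, "\n";
```

The point is that the encoding is applied exactly once, at the boundary, and that it agrees with what the prolog (and any HTTP Content-Type header) declares.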

Re: Problem with SOAP::Lite and accented characters.
by clinton (Priest) on Dec 11, 2006 at 20:43 UTC
    In your simple case, unless you have

     use utf8;

    in your script, the 'joé' is not being interpreted as UTF8. You may want to try:

    $ref->{data}->{firstname} = "jo\x{e9}";

    to be sure that it is interpreted correctly.

    When I retrieve UTF8 from MySQL, I do:

     utf8::decode($value)

    to make sure that it is correctly interpreted as UTF8
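As a rough illustration of the two suggestions above, the sketch below compares a "\x{e9}" literal with UTF-8 bytes that have been run through utf8::decode. The byte string only simulates what a database driver might hand back (an assumption; no MySQL connection is involved), and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Character string: \x{e9} is one character (U+00E9); no "use utf8" is needed
# because the source file itself contains only ASCII.
my $literal = "jo\x{e9}";

# Simulated raw UTF-8 bytes, as a driver that does not decode might return them.
my $value  = "jo\xC3\xA9";
my $before = length $value;    # counts bytes here: 4

# Decode in place; returns false if the bytes are not valid UTF-8.
my $ok = utf8::decode($value);

printf "decoded ok: %s, length %d -> %d, equals literal: %s\n",
    ( $ok ? 'yes' : 'no' ), $before, length($value),
    ( $value eq $literal ? 'yes' : 'no' );
```

After decoding, the value is a three-character string that compares equal to the literal, which is what the rest of the program (and any later encode step) expects to work with.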
