Posting utf-8 data with LWP

by ajt (Prior)
on Sep 23, 2002 at 12:34 UTC ( #200072=perlquestion )
ajt has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks, I have hit a Perl 5.6.x Unicode awkwardness.

In my application I was posting iso-8859-1 encoded XML to a SAP Business Connector Server. The server was recently upgraded, and it now uses utf-8 encoding for both input and output.

So I removed all my iso-8859-1 to utf-8 encoding on the output stage, and used XML::LibXSLT to template it up, and all is well.

The input stage is more of a problem. The input data is iso-8859-1 encoded from browser form input. I've done all the magic I want on it to create an XML file, but the SAP BC server doesn't like it in this encoding, even with the correct encoding declaration. So I converted the data string to utf-8 with the encodeToUTF8 function in XML::LibXML. However, when LWP POSTs this to the BC Server, BC complains that the data is no longer valid XML.

What I think is happening is that the one-octet test character E9 (U+00E9) becomes the two-octet sequence C3 A9 in utf-8. When LWP calculates the length of the string it seems to count characters, not octets, so LWP seems to truncate the file by one octet (in this example), producing invalid XML, and the BC Server dies. If I pad the post with a bunch of spaces at the end, the file goes through okay, because trailing whitespace is ignored by BC's XML parser.
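To make the suspected mismatch concrete, here is a minimal sketch of how the same test character gives different lengths depending on whether characters or octets are counted. It assumes the core Encode module, which ships with Perl 5.8 and later; on 5.6.x the `bytes` pragma would probably be the nearest equivalent for getting an octet count. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One-character test string: U+00E9 (e with acute accent).
my $chars = "\x{E9}";

# Encode to UTF-8; the result is a byte string, one element per octet.
my $octets = encode('UTF-8', $chars);

my $char_len  = length $chars;               # counts characters
my $octet_len = length $octets;              # counts octets
my $hex       = uc unpack('H*', $octets);    # octets as hex digits

print "characters: $char_len, octets: $octet_len, hex: $hex\n";
```

If a Content-Length is derived from the character count while the body is sent as octets, the body would come up short by one octet for each such character, which would match the truncation described above.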

Q1: Does this sound like a plausible explanation?

Q2: Can anyone think of a more elegant solution?
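One possible direction for Q2, offered as a sketch rather than a confirmed fix: turn the XML into utf-8 octets before handing it to LWP, and set Content-Length from the octet count explicitly. The URL and payload below are hypothetical placeholders, Encode again assumes Perl 5.8 or later, and whether this cures the behaviour seen under 5.6.x is untested.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Request;

# Hypothetical endpoint and payload, for illustration only.
my $url = 'http://bc.example.invalid/invoke';
my $xml = qq{<?xml version="1.0" encoding="utf-8"?>\n<name>Ren\x{E9}</name>\n};

# Convert the character string to UTF-8 octets before it reaches LWP.
my $octets = encode('UTF-8', $xml);

# Build the POST with the octet count as Content-Length.
my $req = HTTP::Request->new(
    'POST', $url,
    [
        'Content-Type'   => 'text/xml; charset=utf-8',
        'Content-Length' => length($octets),
    ],
    $octets,
);

# To send it: LWP::UserAgent->new->request($req);
print $req->header('Content-Length'), "\n";
```

Because the body is already a byte string, the declared length and the octets actually sent should agree, and the trailing-space padding should no longer be needed.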

As ever humble thanks in advance....