PerlMonks  


Thanks again. This code was a bit of a mess, and your comments and the others' have helped me see what was going wrong. I apologise for not providing better information, but there was a lot of code for something which should have been quite simple. This is what the original code did:

  1. Opened the data file with encoding(UTF-8)
  2. Read a line of comma-separated strings from it and split them on the comma
  3. Put the split fields into a hash with keys describing the data
  4. Passed the hash to a hand-written function that tried to produce an application/x-www-form-urlencoded string, but this function was broken and instead just stuck an '&' between each key=value, so it wasn't form-encoded at all
  5. Passed the resulting string into NFKD and did the substitution as I described earlier
  6. Passed the resulting string into encode to encode it as UTF-8
  7. Passed the resulting string into an LWP POST
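
To make step 4 concrete, here is a minimal sketch of what a join-only "encoder" does to a value that contains an '&'. The record, its field names, and the helper name naive_join are invented for illustration; the real function was not posted and may have differed in detail.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical record; field names and values are invented for illustration.
my %record = (name => "Zo\x{eb} & Co", city => "K\x{f6}ln");

# Roughly what the broken helper is described as doing: join key=value
# pairs with '&' and escape nothing. (An assumption, not the original code.)
sub naive_join {
    my (%h) = @_;
    return join '&', map { "$_=$h{$_}" } sort keys %h;
}

my $body = naive_join(%record);
print "$body\n";

# Anything that splits the body on '&' now sees three pieces for two
# fields, because the '&' inside the name value was never escaped.
my @pairs = split /&/, $body;
printf "%d pieces for %d fields\n", scalar @pairs, scalar keys %record;
```

The non-ASCII characters also go out raw, which is presumably why normalizing them away appeared to help.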

So it was horribly broken because it did not form-encode properly, and the NFKD was a workaround the original author discovered, which I suspect only works because the API does normalization itself (which would not surprise me). I replaced the hand-written (incorrect) form encoding with build_urlencoded from WWW::Form::UrlEncoded and, as you both state, the NFKD is then a no-op, as is the substitution, since a properly form-encoded string is plain ASCII, and it works. The confusion arose because, when it originally didn't work (without the NFKD), he was apparently told by the API support to turn characters with diacritics into plain characters. The actual code was a lot more complicated than this, and the more I looked at it the more problems I found, so I've spent most of the day rewriting it.
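
A small sketch of why NFKD has nothing left to do once the body is properly form-encoded. To keep it runnable with core modules only, form_escape below is a hand-rolled stand-in for build_urlencoded (it escapes everything outside the RFC 3986 unreserved set and writes spaces as '+'); it is not the module's implementation, and the record is again invented.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Unicode::Normalize qw(NFKD);

# Core-only stand-in for build_urlencoded (illustration only): UTF-8
# encode the text, percent-escape every byte that is not unreserved or a
# space, then turn spaces into '+'.
sub form_escape {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    $bytes =~ s/([^A-Za-z0-9\-._~ ])/sprintf '%%%02X', ord $1/ge;
    $bytes =~ tr/ /+/;
    return $bytes;
}

# Build key=value pairs with both sides escaped, joined by '&'.
sub form_body {
    my (%h) = @_;
    return join '&',
        map { form_escape($_) . '=' . form_escape($h{$_}) } sort keys %h;
}

# Hypothetical record; field names and values are invented for illustration.
my %record = (name => "Zo\x{eb} & Co", city => "K\x{f6}ln");

my $body = form_body(%record);
print "$body\n";

# The encoded body contains only ASCII, and NFKD leaves ASCII untouched.
print "NFKD is a no-op on the encoded body\n" if NFKD($body) eq $body;
```

With the diacritics carried as percent-escaped UTF-8 bytes, the API receives the original characters intact, so there is no need to strip them first.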

Thanks again for your insights.


In reply to Re^2: Strange Unicode normalization question by mje
in thread Strange Unicode normalization question by mje
