
Re^3: Can someone please write a *working* JSON module (Send money)

by Corion (Patriarch)
on Oct 24, 2021 at 10:23 UTC (#11137957)

in reply to Re^2: Can someone please write a *working* JSON module (Send money)
in thread Can someone please write a *working* JSON module

Since JSON is (supposed to be) UTF-8, you merely need to mark the resulting data as UTF-8, that is, set the UTF8 flag on the scalars you create. You could even do this for all string data, assuming that all your input has been verified as UTF-8. See for example Re: Bypass utf-8 encoding/decoding?; the function/macro you want is newSVpvn_utf8.

Obviously, this implies that you're trusting your input data to actually be valid UTF-8...
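If you would rather not trust the input, the check can happen before the flag is set. Below is a minimal sketch in plain C, assuming you want strict UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF). The name `is_valid_utf8` is hypothetical and not part of the Perl API; inside XS, perlapi also documents `is_utf8_string` for a similar purpose.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a Perl API function).
 * Returns true if buf[0..len) is well-formed UTF-8. Rejects truncated
 * sequences, overlong forms, surrogates and code points above U+10FFFF. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point being assembled */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80) {                     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* 2-byte sequence */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {    /* 3-byte sequence */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {    /* 4-byte sequence */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;   /* sequence cut off at the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;                   /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;                   /* surrogate, not valid in UTF-8 */
        if (cp > 0x10FFFF)
            return false;                   /* beyond the Unicode range */

        i += need + 1;
    }
    return true;
}
```

A caller would run this over the raw bytes first and only pass a true value as the last argument of newSVpvn_utf8 when it succeeds. What to do with a buffer that fails (croak, replace the bad bytes, or return it as plain bytes) is a policy decision this sketch leaves open.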


Replies are listed 'Best First'.
Re^4: Can someone please write a *working* JSON module (Send money)
by cnd (Acolyte) on Oct 24, 2021 at 11:06 UTC
    newSVpvn_utf8 sounds awesome! Is there some simple way to detect invalid UTF-8?

    I guess something, somewhere, knows this - since croak() is the bane of my existence right now: email subject lines which may or may not have been truncated somewhere are 100% guaranteed to spew invalid UTF-8 at *some* point.

    Is there some way perl can auto-magically handle UTF-16 as well? e.g. (from the RFC): "... UTF-16 surrogate pair. So, for example, a string containing only the G clef character (U+1D11E) may be represented as "\uD834\uDD1E"." (those 4 bytes (and/or 12 characters) are also an example of why truncated text breaks everything I expect)

      See Encode::Unicode for the translations between the various Unicode encodings.

      I think converting from UTF-16 to UTF-8 is merely a mathematical transformation between two encodings of the same code point, so you can easily model that. As for the backslash escapes: in JSON a \uXXXX escape is always a UTF-16 code unit, never UTF-8, so an escape in the range D800-DBFF followed by one in the range DC00-DFFF is a surrogate pair that has to be combined into a single code point before you encode it as UTF-8.
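The transformation for the G clef example from the RFC can be sketched in plain C as follows. This is only an illustration of the arithmetic, not code from any existing JSON module; `combine_surrogates` and `encode_utf8` are hypothetical names, and the sketch assumes the two \uXXXX escapes have already been parsed into integers.

```c
#include <stddef.h>

/* Hypothetical helper: combine a UTF-16 surrogate pair into one code point.
 * Returns 0 if hi/lo are not a valid high/low surrogate pair. */
unsigned long combine_surrogates(unsigned int hi, unsigned int lo)
{
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return 0;
    return 0x10000UL + ((unsigned long)(hi - 0xD800) << 10) + (lo - 0xDC00);
}

/* Hypothetical helper: encode one code point as UTF-8 into out[].
 * Returns the number of bytes written (1 to 4), or 0 if cp is a lone
 * surrogate or lies above U+10FFFF. */
size_t encode_utf8(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* unpaired surrogate */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* outside the Unicode range */
}
```

For "\uD834\uDD1E", combine_surrogates(0xD834, 0xDD1E) gives 0x1D11E, and encoding that yields the four bytes F0 9D 84 9E. A truncated subject line that ends after the first escape leaves a high surrogate with no partner, which is exactly the case where both helpers return 0 and the caller has to decide how to recover.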
