
Re^2: UTF8 with YAML or JSON

by SBECK (Friar)
on Jun 29, 2012 at 17:36 UTC (#979157)

in reply to Re: UTF8 with YAML or JSON
in thread UTF8 with YAML or JSON

The 'use utf8' pragma is actually included in my module (though I omitted it from the simple code I posted here). However, I've played with it quite a bit and never got the results I wanted either.

In the code posted here, the variable $in DOES contain UTF8 characters, but when you pass it to Load (which is obviously outside the scope of anything in this script) it gets converted. For that reason, though technically correct, I don't think that adding the pragma here will have any impact. If I'm wrong, though, I'm certainly open to correction. I'm definitely not a UTF8 expert.

Re^3: UTF8 with YAML or JSON
by zwon (Abbot) on Jun 29, 2012 at 17:59 UTC
    never got the results I wanted either

    So do you actually want \x{c4}\x{83}, as YAML::Syck returns to you, or do you want \x{103}?

      I want the characters included in the data structure to be EXACTLY what I included in the text that got parsed. So, if I send in a string which contains a scalar of UTF8 values, then I should see UTF8 values in the data structure. YAML::Syck does this. YAML/YAML::XS/JSON/JSON::XS all take the scalars with UTF8 values in them and produce data structures containing Perl encodings.
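To make the two behaviours described above concrete, here is a minimal sketch using the core JSON::PP module rather than any of the modules actually tested in this thread. The JSON text and the key `name` are made up for illustration. It feeds the same UTF-8 octets to a decoder that expects octets and to one that expects already-decoded characters.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical input: JSON text as raw UTF-8 octets.
# The letter U+0103 is stored as the two bytes C4 83.
my $octets = qq({"name":"\xc4\x83"});

# decode_json expects UTF-8 octets and returns decoded character strings.
my $decoded = decode_json($octets);

# With the utf8 option off, the input is treated as a string of characters,
# so each octet is carried through as a separate character.
my $raw = JSON::PP->new->utf8(0)->decode($octets);

printf "decode_json: length %d, first code point U+%04X\n",
    length($decoded->{name}), ord($decoded->{name});
printf "utf8(0):     length %d, first code point U+%04X\n",
    length($raw->{name}), ord($raw->{name});
```

The first result should be a single character, U+0103. The second should keep two characters, which appears to be the behaviour attributed to YAML::Syck here.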

        I should see UTF8 values ... YAML::Syck does this

        I don't see this from your example. YAML::Syck returns you two latin1 characters instead of the single \x{103} that the file contains, which is exactly the opposite of what you say you want. YAML::XS expects UTF-8 octets on input, checks that they are correct UTF-8, and returns decoded characters. I have the impression that you don't realise what you are getting from the modules. Maybe you should use Dump from Devel::Peek to inspect the values instead of Dumper. Also, if you add

        use open ":utf8"; use open ":std";
        to your script, it will be clear to you that YAML::Syck doesn't return ă, but Ä followed by the control character \x{83}.
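The distinction drawn above can be checked without any YAML module. The following is a minimal sketch, assuming only the core Encode module. It starts from the two characters \x{c4}\x{83} and compares them with the single character obtained by decoding the same bytes as UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Two characters in the Latin-1 range, as YAML::Syck is described as returning.
my $octets = "\xc4\x83";

# Interpreting those same bytes as UTF-8 yields the single character U+0103.
my $chars = decode('UTF-8', $octets);

# Report the length and code points of each string.
for my $s ($octets, $chars) {
    printf "length %d: %s\n", length($s),
        join(' ', map { sprintf('U+%04X', ord($_)) } split(//, $s));
}

# Printing through a UTF-8 output layer makes the difference visible:
# the first string is written as two characters, the second as one.
binmode STDOUT, ':encoding(UTF-8)';
print "$octets\n$chars\n";
```

Dump from Devel::Peek shows the same difference at a lower level, including whether the UTF8 flag is set on each scalar.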
