Did the JSON module change?

by nilesOien (Novice)
on Mar 01, 2018 at 17:07 UTC (#1210167, perlquestion)
nilesOien has asked for the wisdom of the Perl Monks concerning the following question:

We have perl code that used to produce JSON that looked, in part, something like this :

 "count" : "3"

And then after installing a new module via cpanm, which caused a lot of updates, it produced JSON that looks like this :

 "count" : 3

Notice that there are no quotes around the number now. This caused much confusion, as it made the downstream parser of the JSON choke. After a lot of chasing, it seems that after the update, if you do math with a hash entry, the entry changes type from a string to a number, even if that entry is not used to store the output. It's probably best illustrated by the script below.

    #!/usr/bin/perl
    use warnings;
    use strict;
    use JSON;

    my $json = JSON->new->allow_nonref->allow_unknown->allow_blessed->pretty(1);

    # Take the string "1", put it in a hash, encode as JSON and print.
    # This prints :
    # {
    #    "num" : "1"
    # }
    # The quotes around the number 1 are significant, they mean
    # that perl treats the hash entry as a string (not surprising).
    my $data1;
    $data1->{"num"} = "1";
    my $body1 = $json->encode($data1);
    print $body1;

    # Take the number (not string!) 2, put it in a hash,
    # encode as JSON and print.
    # This prints :
    # {
    #    "num" : 2
    # }
    # There are no quotes around the 2, since perl
    # treats it as a number.
    # Again, not surprising.
    my $data2;
    $data2->{"num"} = 2;
    my $body2 = $json->encode($data2);
    print $body2;

    # Take the string "3", put it in a hash, do some math with it,
    # then encode as JSON and print.
    # This prints :
    # {
    #    "num" : 3
    # }
    # So, perl is treating the hash entry, which was a string,
    # as a number, because using it to do math seems to cause
    # perl to treat it as a number in the JSON.
    # Not sure if the hash entry changed type, or had some internal
    # flag set on it, but this is surprising, and this behavior
    # seems to be a departure from the past.
    # Did something change in the JSON module?
    my $data3;
    $data3->{"num"} = "3";
    my $addr = $data3->{"num"} + 7;
    my $body3 = $json->encode($data3);
    print $body3;

    exit 0;

We are running on CentOS 6.8 with this perl :

    /usr/bin/perl --version
    This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

What changed? Is it the JSON module? It seems like it used to make a number out of the string in a temporary way, do math with it, and then discard that number, but now it actually changes the type of the hash entry. And is this new way the desired behavior? Is this change documented? It was hard to figure out in our case.

Thanks, all - Niles Oien.

Replies are listed 'Best First'.
Re: Did the JSON module change?
by Corion (Pope) on Mar 01, 2018 at 17:46 UTC

    Perl has no fixed distinction between "string" and "number". This is a bit problematic for modules like JSON that try to guess such concepts from circumstantial evidence.

    My approach is to try to explicitly state whether I want a value to be treated as string or number by performing either string concatenation on it or an addition:

        my $string = 3;
        my $number = "42";
        my $data = {
            string => "".$string,
            number => 0+$number,
        };
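
A minimal sketch of that approach, assuming the core JSON::PP module rather than the JSON front end used in the thread. The helper names `as_string` and `as_number` are hypothetical; the point is only that each returns a fresh scalar that has been used in just one way, so the encoder has nothing ambiguous to guess from.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helpers (not part of any module): each returns a NEW
# scalar that has only ever been a string, or only ever been a number.
sub as_string { return "" . $_[0] }
sub as_number { return 0 + $_[0] }

# canonical() sorts the keys so the output order is stable.
my $json = JSON::PP->new->canonical;

my $count = "3";
my $total = $count + 7;    # numeric use of $count, as in the original post

my $encoded = $json->encode({
    count => as_string($count),
    total => as_number($total),
});
print "$encoded\n";        # {"count":"3","total":10}
```

Forcing the type at the point where the data structure is built keeps the decision next to the encoding step, instead of depending on how a variable happened to be used earlier in the program.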

      I agree explicit typing (if "typing" is the right word, I'm not even sure if perl has types as such, but I'll talk about "typing" because I think you know what I mean) is probably best. What surprised us is that we didn't think something like this :

      my $var = "3"; my $otherVar = $var + 0;

      Would change the typing on $var. It's as if in the past, perl created a temporary variable, a copy of $var that was treated as a number, and used that in the math without messing with $var, and now that's not the case? Or maybe the new JSON module is looking at something different to determine variable type? As you say it must be hard for JSON to figure that out, so maybe they're clutching at a different straw in that determination now. I kind of suspect that's the case, and I'd be interested to know. Is there a way to contact the JSON module author(s)?

        It's as if in the past, perl created a temporary variable, a copy of $var that was treated as a number, and used that in the math without messing with $var, and now that's not the case?

        All that's changing are the variable's flags, and also the contents of its IV (integer) and NV (float) slots. You can see those internals using the Devel::Peek module - eg:

            use Devel::Peek;
            $var = "3";
            Dump $var;
            print "^^^^^^^^^^\n";
            $x = $var + 0;
            Dump $var;
            print "^^^^^^^^^^\n";
            $var += 0;
            Dump $var;
            print "^^^^^^^^^^\n";

        For me, that outputs:

            SV = PV(0x4fae04) at 0x612d94
              REFCNT = 1
              FLAGS = (POK,pPOK)
              PV = 0x5ffdcc "3"\0
              CUR = 1
              LEN = 12
            ^^^^^^^^^^
            SV = PVIV(0x610f84) at 0x612d94
              REFCNT = 1
              FLAGS = (IOK,POK,pIOK,pPOK)
              IV = 3
              PV = 0x5ffdcc "3"\0
              CUR = 1
              LEN = 12
            ^^^^^^^^^^
            SV = PVIV(0x610f84) at 0x612d94
              REFCNT = 1
              FLAGS = (IOK,pIOK)
              IV = 3
              PV = 0x5ffdcc "3"\0
              CUR = 1
              LEN = 12
            ^^^^^^^^^^

        and you can see that the FLAGS have changed from each step to the next.
        Those changes can have a bearing on the values that are printed out.

        Additionally, the flags that are currently set, and the ways in which they change have differed over the years - so if JSON is looking at the flags then it is possible that behaviour may change from one version of perl to another.
        And if JSON has actually changed the way it treats values based on the flags then that could also change behaviour.

        For example, wrt $var in the middle of the script where the flags are (IOK,POK,pIOK,pPOK), it may well have once been the case that the value would be displayed surrounded by quotes if the POK and pPOK flags were set, else no quotes. (This would make sense).
        Subsequently, the condition could have changed to not surrounding the value with quotes if the IOK and pIOK flags were set, else surround with double quotes. (This would also make sense.)
        Two different treatments - both sane.

        The changes to the flags are all part of perl's commitment to "least surprise". It generally works quite well, but there's no way of guaranteeing what programmers will do with the flags.
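
For a program that wants to check those flags itself rather than read a dump, here is a rough sketch using the core B module. The helper `ok_flags` is hypothetical, and it reports only the public IOK, NOK and POK flags. How any particular JSON encoder interprets them is a separate question that this sketch does not answer.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which public "OK" flags are set on the
# scalar behind a reference. A reference is passed so that the scalar
# itself is inspected rather than a copy of it.
sub ok_flags {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'NOK' if $flags & B::SVf_NOK();
    push @set, 'POK' if $flags & B::SVf_POK();
    return join ',', @set;
}

my $var = "3";
print "fresh string:     ", ok_flags(\$var), "\n";

my $x = $var + 0;          # read-only numeric use of $var
print "after \$var + 0:   ", ok_flags(\$var), "\n";

$var += 0;                 # assigns a pure number back to $var
print "after \$var += 0:  ", ok_flags(\$var), "\n";
```

The three lines printed should correspond to the three FLAGS lines in a Devel::Peek dump of the same steps, minus the private p-prefixed flags.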

Re: Did the JSON module change?
by VinsWorldcom (Parson) on Mar 01, 2018 at 18:16 UTC

    What version of JSON module? I'm on Windows 10 64-bit with Strawberry 5.24.1 MSWin32-x64-multi-thread. It seems your desired output works for me.

        Perl> use JSON;
        Perl>
        Perl> my $json =
        More?   JSON->new->allow_nonref->allow_unknown->allow_blessed->pretty(1);
        Perl> my $data3;
        Perl> $data3->{"num"} = "3";
        Perl> my $addr = $data3->{"num"} + 7;
        Perl> my $body3 = $json->encode($data3);
        Perl> print $body3;
        {
           "num" : "3"
        }
        Perl> print $JSON::VERSION;
        2.90
        Perl> exit

      Using what you give above, I have version 2.97001

      I think that's pretty close to the latest? Also I just installed JSON::XS and it seems like it makes no difference.

      EDIT : I take it back, I had screwed up the installation of JSON::XS. When I got it installed correctly, this :

          my $data3;
          $data3->{"num"} = "3";
          my $addr = $data3->{"num"} + 7;
          my $body3 = $json->encode($data3);
          print $body3;

      Does indeed print this :

      { "num" : "3" }

      I'm guessing the difference is in the version of the JSON module?

Re: Did the JSON module change?
by choroba (Bishop) on Mar 02, 2018 at 17:10 UTC
    That's why Cpanel::JSON::XS::Type exists:
        #!/usr/bin/perl
        use warnings;
        use strict;

        use Cpanel::JSON::XS;
        use Cpanel::JSON::XS::Type;

        my $json = Cpanel::JSON::XS->new
            ->allow_nonref->allow_unknown->allow_blessed->pretty(1);

        my $type = { 'num' => JSON_TYPE_INT };

        my $data1;
        $data1->{"num"} = "1";
        my $body1 = $json->encode($data1, $type);
        print $body1;

        my $data2;
        $data2->{"num"} = 2;
        my $body2 = $json->encode($data2, $type);
        print $body2;

        my $data3;
        $data3->{"num"} = "3";
        my $addr = $data3->{"num"} + 7;
        my $body3 = $json->encode($data3, $type);
        print $body3;

      This seems like the definitive option for editing the script.

      I think there's also an issue in how I've been managing packages, though, which I've touched on in another reply.


Re: Did the JSON module change?
by ikegami (Pope) on Mar 01, 2018 at 19:24 UTC

    [Ignore/delete this post. I misunderstood the issue. The hash is still a red herring, but the rest doesn't apply. Differences in the choice of back-end could still be relevant, but it's not the only possibility. I don't have time to look deeper right now.]

    The hash is a red herring. The results of mathematical operators are numbers, and the result of concatenation is a string. Period.

        $ perl -MJSON -E'
           say encode_json({ a=>"7", b=>8, c=>"1"+"8", d=>1 . 0 });
        '
        {"a":"7","b":8,"c":9,"d":"10"}

    You also asked about changes in behaviour. Keep in mind that JSON is merely a front end for JSON::XS, and it falls back to JSON::PP if JSON::XS is not available. That means its behaviour could theoretically change based on which modules you have installed.
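
One way to see which implementation is actually in play on a given machine is to ask the front end. This is a small sketch, assuming the JSON module is installed and recent enough to provide the `is_xs` and `is_pp` methods; the helper name `backend_summary` is hypothetical.

```perl
use strict;
use warnings;
use JSON ();

# Hypothetical helper: describe which implementation the JSON front
# end selected when it was loaded on this machine.
sub backend_summary {
    my $json = JSON->new;
    my $kind = $json->is_xs ? 'XS'
             : $json->is_pp ? 'PP'
             :                'unknown';
    return sprintf 'JSON %s using a %s backend', JSON->VERSION, $kind;
}

print backend_summary(), "\n";
```

Running this before and after a batch of module updates shows whether the backend changed. The JSON documentation also describes a PERL_JSON_BACKEND environment variable for pinning the choice, which has to be set before the module is loaded.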

Re: Did the JSON module change?
by nilesOien (Novice) on Mar 02, 2018 at 00:39 UTC

    So, as I noted above, when I installed JSON::XS, the behavior went back to what it was before, this :

        my $data3;
        $data3->{"num"} = "3";
        my $addr = $data3->{"num"} + 7;
        my $body3 = $json->encode($data3);
        print $body3;

    Prints this :

    { "num" : "3" }

    What's the moral of the story here? Is it that I should always install JSON::XS? Or that I should type explicitly by doing stuff like :

    $var = "" . $var;

    Before sending stuff off to JSON? And/or did I just get a "bad" release of JSON for my purposes? Or all of the above?

      What's the moral of the story here?

      That's a very important question and I suspect you might end up with a hatful of different answers. Here's mine.

      You said that the problem occurred "after installing a new module via cpanm, which caused a lot of updates" and that you are running /usr/bin/perl on CentOS 6. This means that you are using modules installed directly from CPAN via an automated installer (cpanm in this case) with the system perl. I'd say that the moral here is not to do that.

      CentOS 6 has packages for both JSON and JSON::XS available via yum. It's therefore absolutely fine to use these with the system perl. However, as soon as you go monkeying around with non-packaged modules all bets are off.

      Many monks here will contend that you should not use the system perl for any userspace applications at all and instead install your own perl from source, keeping the two perls entirely separate. This approach has its merits.

      I'm an old hand and am (mostly) happy enough to use the system perl for my tasks, but I am very careful not to pollute it with conflicting modules or to install newer versions of packaged modules into the system tree. Usually I will install modules by hand rather than use cpan/cpanplus/cpanm so I can see exactly what is happening at each step and even decide for myself which dependencies need updating, etc. On those rare occasions when using an automated installer is worth it, I will ensure that the installs go into a user-controlled tree rather than a system tree, so they will not affect any OS use of the system perl and will be much simpler to back out if required.

      So for me the moral is: don't mix the system perl with user-installed modules unless you are very, very sure you know what you are doing. If in doubt, install your own perl which you can mess up without banjaxing your system.

        The TL;DR is : I suspect you're right.

        Some more detail : For my own personal machines, I use Arch Linux. This is VERY lightweight, to the point that you have to install a cron package to be able to run cron jobs. For my Arch machines, I've been using cpanm, because there are few to no perl modules installed by default. That seems to have worked OK, because cpanm winds up managing pretty much all the perl modules.

        CentOS is a different animal. It comes with a reasonable number of perl modules installed by default. And we always wind up wanting some module that is not yet available in the system package manager (i.e. yum). The one I wanted in this case was GeoIP2, which is pretty new. Up until now I've been using cpanm for anything not available in yum. But after this, I'm thinking the way to go may well be to download the source from CPAN and build by hand.

        I'm certainly open to hearing about other, better management schemes. I'm relatively new to perl, for one thing. And what I've come up with as described here wasn't something I thought about too much, I just kind of fell into it. A workmate was also suggesting that building by hand may be the way to go.

        Having my own separate perl... Hmmm.... Also something to think about.

