PerlMonks  

Outputting JSON with sorted names/keys

by pryrt (Abbot)
on Jan 26, 2020 at 00:15 UTC

pryrt has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to create a simple JSON file. I wanted to get the names/keys output in a specific (non-alphanumeric) order (to make it easier on the human users of the JSON -- I know it's irrelevant to any automated parsers). Using the JSON module, I can get unsorted or alphanumerically sorted output. Manually rolling it, I can get whatever order I specify. Is there an option I'm not seeing in the JSON docs for altering the sort order? Or is there another module that supports a user-supplied sort order?

sscce

use warnings;
use strict;
use autodie;
use JSON;
use 5.010;

my @order = qw/id disp ver auth/;
my $dat = [ ({ id => 'a', disp => 'a', ver => 'a', auth => 'a' })x2 ];  # dummy data

local $\ = "\n";
print 'Unsorted => ', JSON->new->indent->space_after->encode( $dat );
print 'Alpha Sorted => ', JSON->new->indent->space_after->canonical->encode( $dat );
print 'My Order => ', manual_ordered_json( $dat, \@order );

use Data::Dumper;
sub manual_ordered_json{
    my @list = @{$_[0]};
    my @ordr = @{$_[1]};
    my $out = "[\n";
    for my $i ( 0 .. $#list ) {
        my $h = $list[$i];
        $out .= " {\n";
        for my $j ( 0 .. $#ordr ) {
            my $k = $ordr[$j];
            next unless defined $k;
            next unless exists $h->{$k};
            $out .= sprintf qq| "%s": "%s"|, $k, $h->{$k} // '<undef>';
            $out .= ',' if $j < $#ordr;
            $out .= "\n";
        }
        $out .= " }";
        $out .= "," if $i < $#list;
        $out .= "\n";
    }
    $out .= "]\n";
    return $out;
}

(It's not overly important; this is mostly as a learning opportunity; my manual_ordered_json could have finished this one-off job for me already, but I was hoping to expand my toolbase knowledge.)
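For the hand-rolled route, one possible refinement is to let a JSON encoder quote each key and value, so that embedded quotes and undef come out as valid JSON while the key order stays under your control. This is only a sketch: `ordered_object` is a hypothetical helper name, and it assumes JSON::PP (core since Perl 5.14) is available.

```perl
use strict;
use warnings;
use JSON::PP;

# Encoder used only for individual keys and values (bare scalars included).
my $piece = JSON::PP->new->allow_nonref;

# ordered_object() is a hypothetical helper, not part of any module.
# It emits the keys of %$hash in the order given by @$order and
# silently skips keys that the hash does not contain.
sub ordered_object {
    my ($hash, $order) = @_;
    my @pairs =
        map  { $piece->encode("$_") . ': ' . $piece->encode($hash->{$_}) }
        grep { exists $hash->{$_} } @$order;
    return '{ ' . join(', ', @pairs) . ' }';
}

my @order = qw/id disp ver auth/;
my %row   = (auth => 'x "y"', id => 'a', ver => undef);   # no 'disp' key
my $text  = ordered_object(\%row, \@order);
print "$text\n";
```

Because the separators are joined rather than appended per key, a skipped key cannot leave a dangling comma behind.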

Thanks

Replies are listed 'Best First'.
Re: Outputting JSON with sorted names/keys
by 1nickt (Canon) on Jan 26, 2020 at 02:45 UTC

    Hi pryrt, take a look at jq. I almost always pipe JSON output through it, whether from scripts or one-liners. (Maybe its sort_by is enough?)

    Hope this helps!


    The way forward always starts with a minimal test.

      ooh. jq looks interesting. If I start doing more with JSON, I'll probably want to play with that, because a structure-aware sed-like tool would be quite useful.

        I usually plug jq as being like "awk for JSON data", in that I think of it more in terms of doing "awk-y" things with it than sed-like things: e.g. I want to pull these "fields" (some set of keys from an object) out of each line (each textual line of the file being a JSON object).

        At any rate seconding it as a highly recommended useful thing to have lying around on your PATH somewhere. Check out the @csv and @sh formatting modifiers which can be handy wrangling JSON from the command line (search for "Format strings and escaping" in the manual).

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

Re: Outputting JSON with sorted names/keys
by tobyink (Canon) on Jan 26, 2020 at 18:25 UTC

      Thanks! By using Tie::Hash::MultiValueOrdered to create the initial hash, and then JSON::MultiValueOrdered to serialize it into JSON, the serialized output was in the right order.

      (As you can tell from my commented-out line, I had a brain failure in my first attempt, and converted the ordered hash into a plain hashref, thus losing the original order. Once I realized that, I got it working.)

        You may find a wrapper like this helpful:

        sub h {
            tie my %hash, 'Tie::Hash::MultiValueOrdered';
            while (my ($k, $v) = splice @_, 0, 2) {
                $hash{$k} = $v;
            }
            \%hash;
        }

        That way you can build your data structure pretty naturally, just using h(...) for a hashref instead of {...}:

        my $people = [
            h( name => "Alice", age => 21 ),
            h( name => "Bob", name => "Robert", age => 22 ),
        ];
        print JSON::MultiValueOrdered->new(pretty=>1)->encode($people);

      JSON::MultiValueOrdered allows [...] duplicate keys like:

      { "a": 1, "b": 2, "a": 3 }

      ... but the RFCs say that keys SHOULD be unique. On the other hand, the 2nd edition of ECMA-404 has seriously f...ed up the spec, while pretending to specify the same thing as the RFCs. See Re^4: Outputting JSON with sorted names/keys.

      In short, if you want to play it safe, don't assume any ordering on JSON objects, and make sure all keys of an object are unique. Or clearly document that you assume other rules.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
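To illustrate why duplicate keys are risky, here is a small sketch of what happens when the example document above is read by a decoder that maps objects to plain Perl hashes. It assumes JSON::PP; other decoders may behave differently (some reject duplicates outright).

```perl
use strict;
use warnings;
use JSON::PP;

my $text = '{ "a": 1, "b": 2, "a": 3 }';

# A plain Perl hash can hold each key only once.
my $data = JSON::PP->new->decode($text);

# Only one "a" survives; with JSON::PP the later value overwrites the earlier one.
my $again = JSON::PP->new->canonical->encode($data);
print "$again\n";
```

The first value for "a" is gone after decoding, which is the kind of loss JSON::MultiValueOrdered is meant to avoid.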

        On the other hand... be liberal in what you accept. JSON::MultiValueOrdered was written with that principle in mind: to accept JSON documents that contain multiple values for keys or rely on key order. But it also allows them to be round-tripped.

        Either way, it's a SHOULD and not a MUST. (Except in the I-JSON spec, which I think you missed out in your comparison of specs?) As long as you know what you're doing and have valid reasons to do so, you can forget SHOULDs. And if you're using JSON::MultiValueOrdered instead of JSON::PP or JSON::MaybeXS, it's probably for a reason.

        Also, bear in mind that I started work on JSON::MultiValueOrdered more than seven years ago, so it predates all the JSON RFCs (except RFC 4627) and ECMA 404. The format description on json.org was also a lot more concise back then.

Re: Outputting JSON with sorted names/keys
by LanX (Saint) on Jan 26, 2020 at 00:23 UTC
    > I know it's irrelevant to any automated parsers

    I seem to remember that the specs are unclear if sorted hashes (aka objects) are allowed in JSON

    Update

    I seem to remember that Daxim was complaining once about contradictory specs.

    Some searching told me that there are JSON parsers which preserve order, most famously V8.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

      > I seem to remember that the specs are unclear if sorted hashes (aka objects) are allowed in JSON.

      https://www.json.org/ is quite clear:

      JSON is built on two structures:

      • A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
      • An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

      [...]

      An object is an unordered set of name/value pairs.

      So, "objects" are not sorted, they have no implied order, but you are free to write them out sorted by any criteria you like. But you should not expect that the ordering is retained.
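A short sketch of that point, assuming JSON::PP: two documents that differ only in key order decode to the same data, and the order you get back on re-encoding is whatever the encoder chooses (here, canonical means alphabetical), not the order that was written.

```perl
use strict;
use warnings;
use JSON::PP;

my $first  = '{"id":"a","disp":"b","ver":"c"}';
my $second = '{"ver":"c","id":"a","disp":"b"}';

my $codec = JSON::PP->new->canonical;

# Decode into plain hashes, then write them back out with sorted keys.
my ($one, $two) = map { $codec->encode( $codec->decode($_) ) } $first, $second;

print $one eq $two ? "same object\n" : "different objects\n";
print "$one\n";
```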

      BTW: I think the name "object" is an unfortunate choice, because it is no object in the sense of object-oriented programming.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
        > BTW: I think the name "object" is an unfortunate choice, because it is no object in the sense of object-oriented programming.

        It is correct in JS, they don't have plain hashes, you have to (ab)use objects.

        In other words you have to take care not to inherit keys. 🙄

        NB: OOP is classless in JS

        > https://www.json.org/ is quite clear:

        Well, what I remember were two contradictory RFCs, but I can't find the discussion anymore.

        Please keep in mind that, as with UTF-8, the definitions changed over time from more relaxed to more strict.

        This leaves some early adopters like Perl sometimes in a limbo of backwards compatibility.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

        > An object is an unordered set of name/value pairs.

        To make it worse, I'm pretty sure that in JS this "object" should be written capitalized as "Object".

        "Object" is the base "object" instance in JS from which all other objects inherit°, including the objects Array and Function. A literal "Object" is written with key value pairs surrounded by curlies. "object" is a type.

        From a Perl perspective objects are best understood as tied hashes which lookup missing keys via the "prototype" chain (exposed in FF thru __proto__ ).

        I'm starting to agree that Crockford should have avoided the term "object" in his JSON definition...

        update

        Demo from the FF console (please note that __proto__ is not necessarily available in other dialects)

        > hash = {a:1}
        Object { a: 1 }
        > typeof hash
        "object"
        > array = [0,1,2]
        Array(3) [ 0, 1, 2 ]
        > typeof array
        "object"
        > array.__proto__
        Array []
        > array.__proto__.__proto__
        Object { … }

        update

        > h={} // empty hash?
        Object { }
        > h["constructor"] // well, beware of inherited attributes
        function Object()

        footnote

        °) seems like newer ECMA versions allow the creation of objects without prototype. This would actually be very close to a Perl hash.

        > h = Object.create(null)
        Object { }
        > h["constructor"]
        undefined
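The "tied hash with a prototype chain" analogy can be sketched in Perl itself. This is an illustration only, not how any JS engine works: `ProtoHash` is a made-up class name, and it builds on Tie::StdHash from the core Tie::Hash module.

```perl
use strict;
use warnings;

package ProtoHash;            # hypothetical class name, for illustration only
use Tie::Hash;                # provides Tie::StdHash
our @ISA = ('Tie::StdHash');

my %proto_of;                 # tied object => its prototype (a hashref or undef)

sub TIEHASH {
    my ($class, $proto) = @_;
    my $self = bless {}, $class;
    $proto_of{$self} = $proto;
    return $self;
}

# Own keys win; a missing key is looked up along the prototype chain.
sub FETCH {
    my ($self, $key) = @_;
    return $self->{$key} if exists $self->{$key};
    my $proto = $proto_of{$self};
    return $proto ? $proto->{$key} : undef;
}

package main;

tie my %base, 'ProtoHash', undef;        # plays the role of Object.prototype
$base{constructor} = 'function Object()';

tie my %h, 'ProtoHash', \%base;          # roughly like h = {} in JS
$h{a} = 1;

my %plain = (a => 1);                    # ordinary Perl hash, no chain

print "inherited: $h{constructor}\n";
print "own keys:  @{[ keys %h ]}\n";
print "plain:     ", (defined $plain{constructor} ? 'defined' : 'undef'), "\n";
```

Here `keys` and `exists` only see own keys (they are inherited unchanged from Tie::StdHash), while a fetch falls through to the prototype, which mirrors the `h["constructor"]` surprise in the console session. The ordinary hash behaves like the `Object.create(null)` case.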

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

Re: Outputting JSON with sorted names/keys (Updated)
by LanX (Saint) on Jan 26, 2020 at 01:13 UTC
    > Is there an option I'm not seeing in the JSON docs for altering the sort-order?

    Are you aware of JSON::PP#sort_by ?

    Not sure if I'm missing your point though...

    Update

    To elaborate more: JSON.pm has the choice between different backends which must implement its interface.

    But a backend can also provide more extended features.

    See demo in my answer here.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

      > Are you aware of JSON::PP#sort_by ?

      as a demo:

      (NB: I changed your indentation levels to make it identical, otherwise please uncomment the whitespace eraser prior to testing. )

      use warnings;
      use strict;
      use autodie;
      BEGIN { $ENV{PERL_JSON_BACKEND}='JSON::PP'; }
      use JSON;
      use Test::More;
      use 5.010;

      my @order = qw/id disp ver auth/;
      my $dat = [ ({ id => 'a', disp => 'a', ver => 'a', auth => 'a' })x2 ];  # dummy data

      local $\ = "\n";
      #print 'Unsorted => ', JSON->new->indent->space_after->encode( $dat );
      #print 'Alpha Sorted => ', JSON->new->indent->space_after->canonical->encode( $dat );
      print "Pryrt's Order => ",
          my $pryrt = manual_ordered_json( $dat, \@order );

      # --- Sorting with JSON::PP
      my %ORDER;
      @ORDER{@order} = 1..@order;

      my $sort = sub {
          package JSON::PP;
          ($ORDER{$a} // 1e999) <=> ($ORDER{$b} // 1e999)
              or $a cmp $b
      };

      print "LanX' Order => ",
          my $lanx = JSON->new->sort_by($sort)->indent->space_after->encode( $dat );

      # # --- erase whitespace diffs
      # s/\s//g for $pryrt, $lanx;

      is($lanx,$pryrt,"same JSON");
      done_testing();

      use Data::Dumper;
      sub manual_ordered_json{
          my @list = @{$_[0]};
          my @ordr = @{$_[1]};
          my $out = "[\n";
          for my $i ( 0 .. $#list ) {
              my $h = $list[$i];
              $out .= "   {\n";
              for my $j ( 0 .. $#ordr ) {
                  my $k = $ordr[$j];
                  next unless defined $k;
                  next unless exists $h->{$k};
                  $out .= sprintf qq|      "%s": "%s"|, $k, $h->{$k} // '<undef>';
                  $out .= ',' if $j < $#ordr;
                  $out .= "\n";
              }
              $out .= "   }";
              $out .= "," if $i < $#list;
              $out .= "\n";
          }
          $out .= "]\n";
          return $out;
      }

      --->

      Pryrt's Order => [
         {
            "id": "a",
            "disp": "a",
            "ver": "a",
            "auth": "a"
         },
         {
            "id": "a",
            "disp": "a",
            "ver": "a",
            "auth": "a"
         }
      ]

      LanX' Order => [
         {
            "id": "a",
            "disp": "a",
            "ver": "a",
            "auth": "a"
         },
         {
            "id": "a",
            "disp": "a",
            "ver": "a",
            "auth": "a"
         }
      ]

      ok 1 - same JSON
      1..1

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

        >> Are you aware of JSON::PP#sort_by?

        Thanks! I knew that JSON used JSON::PP or JSON::XS under the hood, but I hadn't thought to go searching through the backend modules for features not advertised (edit: "...not advertised at the top level"). That's just what I was hoping would exist somewhere. And thanks for the example to go with it: works great for me.
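For reference, a minimal sketch of the same idea with JSON::PP loaded directly rather than through JSON.pm. Instead of switching package inside the sub, it reads the keys from `$JSON::PP::a` and `$JSON::PP::b`, since JSON::PP runs the comparison in its own package. The variable names and the extra `zeta`/`extra` keys are made up for the illustration; keys missing from the order list sort last, alphabetically.

```perl
use strict;
use warnings;
use 5.010;
use JSON::PP;

my @order = qw/id disp ver auth/;
my %rank;
@rank{@order} = 0 .. $#order;

# JSON::PP calls the comparison from inside package JSON::PP, so the
# two keys being compared arrive in $JSON::PP::a and $JSON::PP::b.
my $by_rank = sub {
    my ($x, $y) = ($JSON::PP::a, $JSON::PP::b);
    ($rank{$x} // scalar @order) <=> ($rank{$y} // scalar @order)
        or $x cmp $y;
};

my $json = JSON::PP->new->sort_by($by_rank);
my $text = $json->encode({
    zeta => 'z', auth => 'a', ver => 'a', disp => 'a', id => 'a', extra => 'e',
});
print "$text\n";
```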

      Rest assured: JSON::XS supports the same feature.
        > Rest assured: JSON::XS supports the same feature.

        Interesting. Since you resurrected this thread, my curiosity has been piqued, and I couldn't remember whether I'd tried to do the same with JSON::XS three years ago or not.

        Searching through the JSON::XS codebase finds 0 instances of sort_by , so I am not sure what line of code in the source of JSON::XS could be implementing that function (though maybe there's an inheritance in the XS source, because I'm not sure what the equivalent of use/require/parent/base/@ISA are in XS). Because of that uncertainty, I took the working code that LanX had posted three years ago, and changed every instance of JSON::PP to JSON::XS . When I ran that, I got the message sort_by is not supported by JSON::XS. at C:\....\11111902.pl line 32. and the is() test failed, because the $lanx version was not sorted. Based on that experiment, I cannot see how to use sort_by with JSON::XS in an equivalent manner to how it's used with JSON::PP.

        Could you show an example of JSON::XS using sort_by (using the same data and structure as in the working example that LanX posted earlier)? And if it's just called something other than sort_by in the JSON::XS version, please let me know what the right name is (and whether or not it's documented in JSON::XS's POD). Because if it is possible, I'd like to see how. Thanks.
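One way to check this kind of claim without reading the XS source is to ask each backend class whether it has a `sort_by` method at all. The sketch below only probes the classes directly; it says nothing about the wrapper methods JSON.pm adds on top of a backend, and JSON::XS is simply reported as not installed if it is missing.

```perl
use strict;
use warnings;

my %supports;
for my $class (qw/JSON::PP JSON::XS/) {
    # Turn Some::Module into Some/Module.pm so it can be required by name.
    (my $file = "$class.pm") =~ s{::}{/}g;
    if ( eval { require $file; 1 } ) {
        $supports{$class} = $class->can('sort_by') ? 'yes' : 'no';
    }
    else {
        $supports{$class} = 'not installed';
    }
}

printf "%-10s sort_by: %s\n", $_, $supports{$_} for sort keys %supports;
```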

Re: Outputting JSON with sorted names/keys
by Anonymous Monk on Jan 26, 2020 at 00:34 UTC
