make duplicate JSON keys an error

by daxim (Curate)
on Oct 24, 2013 at 11:28 UTC ( [id://1059431] )

daxim has asked for the wisdom of the Perl Monks concerning the following question:

I need a JSON validator (grammar/syntax, not schema) or something similar; the JSON module unfortunately allows duplicate keys in its input. What can you recommend?

I'm asking because RFC 4627 §2.2 is relaxed about duplicate keys, but RFC 6902 §A.13 is strict.

Example: I want

[ { "op": "add", "path": "/baz", "value": "qux", "op": "remove" } ]
to throw an error.

Replies are listed 'Best First'.
Re: make duplicate JSON keys an error
by sundialsvc4 (Abbot) on Oct 24, 2013 at 13:19 UTC

    Musing about this ... while it might be nice to be able to detect such an error in what the data-supplier gives us, it might also be necessary to anticipate the problem and compensate for it in some reasonable way, i.e. associate an arrayref of values with the key, pushing values onto it in the order received (sketched below). If the supplier is using an Unenlightened Language that is doing the wrong thing, we just might have to find a way to live with it. If we want to / have to, as we so often find that we must.

    Our pragmatic situation just might be: “I realize that the incoming data is incorrect. I can’t do anything about it, because the supplier of these data is a Leviathan company that runs at a glacial pace, and I can’t wait two years for them to decide they won’t accept a trouble-ticket from us. Meanwhile, I must not lose data, especially without knowing if or that I did.”

    Over the years, I have dealt with a number of data-supplier situations where they “just switched to JSON,” often apparently writing their own code or template to do it, and they simply treated it “like XML/SOAP,” which does permit an arbitrary number of identically-named groups to appear in any container. They just did it that way, and fixing their code is not on their radar and never will be.
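
    For instance, a minimal sketch of that accumulate-into-an-arrayref idea as a tied hash (hypothetical package name; wiring it into a decoder still requires a parser that lets you supply the hash it fills, as in tobyink's reply below):

use strict;
use warnings;

{
    package Tie::Hash::Accumulate;
    use Tie::Hash ();
    use parent -norequire, qw(Tie::StdHash);

    sub STORE {
        my ($hash, $key, $value) = @_;
        if (exists $hash->{$key}) {
            # Repeated key: keep every value, in the order received.
            # (Naive: a legitimate arrayref value would be conflated
            # with an accumulated one.)
            $hash->{$key} = [ $hash->{$key} ]
                unless ref $hash->{$key} eq 'ARRAY';
            push @{ $hash->{$key} }, $value;
        }
        else {
            $hash->{$key} = $value;
        }
    }
}

tie my %h, 'Tie::Hash::Accumulate';
$h{op} = 'add';
$h{op} = 'remove';
# $h{op} is now ['add', 'remove'] instead of silently losing 'add'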

Re: make duplicate JSON keys an error
by tobyink (Canon) on Oct 28, 2013 at 16:35 UTC

    JSON::Tiny::Subclassable is my fork of davido's JSON::Tiny that makes it easier to subclass usefully. In fact, I wrote it to handle duplicate keys usefully. But if you want to throw an error for duplicates instead, it's not much extra effort...

use strict;
use warnings;

# A tied hash that dies if you try to insert a key that already exists.
{
    package Tie::Hash::NoOverwrite;
    use Tie::Hash ();
    use Carp;
    use parent -norequire, qw(Tie::StdHash);

    sub STORE {
        my ($hash, $key, $value) = @_;
        # Ignore these packages for the purpose of reporting errors
        local our @CARP_NOT = qw(
            JSON::MultiValueDie
            JSON::Tiny::Subclassable
        );
        croak "Duplicate key: $key" if exists $hash->{$key};
        $hash->{$key} = $value;
    }
}

# A JSON decoder that ties all hashes with Tie::Hash::NoOverwrite.
{
    package JSON::MultiValueDie;
    use parent qw(JSON::Tiny::Subclassable);
    sub _new_hash { tie my %h, 'Tie::Hash::NoOverwrite'; return \%h }
}

my $json = JSON::MultiValueDie->new;

$json->decode(q/ { "a":1, "b":2 } /);
print $json->error || 'ok', "\n";

$json->decode(q/ { "a":1, "a":2 } /);
print $json->error || 'ok', "\n";
Re: make duplicate JSON keys an error
by vsespb (Chaplain) on Oct 24, 2013 at 17:22 UTC
    You could try to write code (perhaps a single regexp) that prepends a unique prefix to all keys in the JSON string:
    { "1_op": "add", "2_path": "/baz", "3_value": "qux", "4_op": "remove" + }

    You don't even have to check whether a given string is a key (it's OK to damage values).
    Next, parse with the JSON module; when you find both "1_op" and "4_op" in the same hash, that means an error.
    UPD:
    Alpha version:
use strict;
use warnings;
use JSON::XS;
use Data::Dumper;

my $s = <<"END";
[ { "op": "add", "path": "/baz", "value": "qux", "op": "remove" } ]
END

# Prefix every quoted string (keys and string values alike) with a
# unique number, so duplicate keys can no longer collapse.
my $x = 1;
$s =~ s/\"([^\"]+)\"/"\"".++$x."_".$1."\""/ge;

# Parse; the filter dies as soon as two keys share the same
# original name once the prefix is stripped.
my $j = JSON::XS->new()->filter_json_object(sub {
    my %seen;
    for (keys %{shift()}) {
        die unless /^\d+\_(.*)$/;
        die "key [$1] already seen" if $seen{$1}++;
    }
});
$j->decode($s);
    prints
    key [op] already seen
    UPD2:
    You can even avoid double parsing: just remove the prepended numbers in filter_json_object and return the corrected data. In the end you'll get the correct hash on the first pass. A sketch follows.
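
    Something along these lines (a sketch; it relies on JSON::XS's documented behaviour that filter_json_object replaces the decoded object with whatever single value the callback returns):

use strict;
use warnings;
use JSON::XS;

my $j = JSON::XS->new->filter_json_object(sub {
    my ($obj) = @_;
    my (%seen, %clean);
    for my $k (keys %$obj) {
        my ($orig) = $k =~ /^\d+_(.*)\z/s
            or die "unexpected key [$k]";
        die "key [$orig] already seen" if $seen{$orig}++;
        my $v = $obj->{$k};
        # String values were prefixed too, so strip them as well.
        # (Strings nested inside plain arrays would still need
        # cleanup; this sketch only handles the flat case.)
        $v =~ s/^\d+_//s if defined $v && !ref $v;
        $clean{$orig} = $v;
    }
    return \%clean;   # replaces the prefixed object in the result
});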
Re: make duplicate JSON keys an error
by Bloodnok (Vicar) on Oct 24, 2013 at 12:10 UTC
    I know nothing about the JSON module, but couldn't you get it (the JSON module) to do the parsing and then implement a post-JSON, grep-based duplicate-key error generator?

    Just a thought in an idle passing moment ...

    A user level that continues to overstate my experience :-))

      This must have been a very short idle moment... ;-)

      The above JSON snippet gets parsed without errors. The problem is that the resulting hash has only one key 'op', whose value is the last one parsed. So the resulting hash has only three distinct keys. You can't tell from the parsing result whether the delivered JSON snippet was invalid according to the mentioned RFC.
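
      For illustration, a minimal sketch with the stock JSON module:

use strict;
use warnings;
use JSON;
use Data::Dumper;

# The duplicate "op" member parses without complaint; the second
# occurrence silently overwrites the first.
my $data = decode_json(
    '[ { "op": "add", "path": "/baz", "value": "qux", "op": "remove" } ]'
);
print Dumper $data->[0];
# e.g. $VAR1 = { 'op' => 'remove', 'path' => '/baz', 'value' => 'qux' }
# (key order may vary)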

      UPDATE: Looking at the documentation of the JSON module shows that the behaviour can be influenced by several variables, but none of them targets your problem. Incremental parsing doesn't seem to solve it either. So, my conclusion: make a feature request for a behaviour config variable that disallows duplicate keys. I know that won't help you at the moment.

      Best regards
      McA

Re: make duplicate JSON keys an error
by Anonymous Monk on Oct 24, 2013 at 19:48 UTC

    If it's just the exception you want, why not just copy the JSON::PP module, change its name to something like PP_Strict, and insert a check in sub object() right before assigning the value to the key?

      I can think of several reasons why not:

      JSON::PP is apparently an OO class, but it is not factored and compartmentalised well enough to be subclassed into what I want. It has lots of internal global state expressed as upper-scoped variables, as parsers traditionally do, and it refers to the original package name a lot, further complicated by the fact that it treats the package name specially in order to work together with the optional ::XS variant. Just renaming the package and inserting the check as you suggest is not enough to make it work.

      Having to keep the modified copy in sync with upstream is annoying.

      Copy-pasta code opens another can of worms. So far the project contains only code we wrote ourselves, uniformly licenced. Since we distribute the code, I would have to spend time figuring out what is affected licence-wise by this "foreign object", and in the end it might well turn out that we only needed three additional lines here and there; that's not at all what I like to spend my time on as a programmer.

      Therefore I would go this direction only as a last resort.

Re: make duplicate JSON keys an error
by hdb (Monsignor) on Oct 24, 2013 at 16:22 UTC

      That's because Randal's code disallows some whitespace. Here's a version that allows more whitespace and parses the above example: Regexp.pm (it'll soon be on CPAN as JSON::Decode::Regexp 0.03).

      Also, here's a version that checks for duplicate keys: Regexp.pm.
