PerlMonks  

JSON parser as a single Perl Regex

by merlyn (Sage)
on Sep 26, 2012 at 19:25 UTC (#995856, CUFP)

$client has a script that needs minimal module support but has to parse JSON. I couldn't find anything like YAML::Tiny (which I borrowed to remove the YAML dependency), so I hacked up this regex to parse and extract JSON. It doesn't handle Unicode yet, but that wasn't a client requirement.
#!/usr/bin/env perl
use Data::Dumper qw(Dumper);

my $FROM_JSON = qr{

(?&VALUE) (?{ $_ = $^R->[1] })

(?(DEFINE)

(?<OBJECT>
  (?{ [$^R, {}] })
  \{
    (?: (?&KV) # [[$^R, {}], $k, $v]
      (?{ # warn Dumper { obj1 => $^R };
         [$^R->[0][0], {$^R->[1] => $^R->[2]}] })
      (?: , (?&KV) # [[$^R, {...}], $k, $v]
        (?{ # warn Dumper { obj2 => $^R };
           [$^R->[0][0], {%{$^R->[0][1]}, $^R->[1] => $^R->[2]}] })
      )*
    )?
  \}
)

(?<KV>
  (?&STRING) # [$^R, "string"]
  : (?&VALUE) # [[$^R, "string"], $value]
  (?{ # warn Dumper { kv => $^R };
     [$^R->[0][0], $^R->[0][1], $^R->[1]] })
)

(?<ARRAY>
  (?{ [$^R, []] })
  \[
    (?: (?&VALUE) (?{ [$^R->[0][0], [$^R->[1]]] })
      (?: , (?&VALUE) (?{ # warn Dumper { atwo => $^R };
                         [$^R->[0][0], [@{$^R->[0][1]}, $^R->[1]]] })
      )*
    )?
  \]
)

(?<VALUE>
  \s*
  (
      (?&STRING)
    |
      (?&NUMBER)
    |
      (?&OBJECT)
    |
      (?&ARRAY)
    |
      true (?{ [$^R, 1] })
    |
      false (?{ [$^R, 0] })
    |
      null (?{ [$^R, undef] })
  )
  \s*
)

(?<STRING>
  (
    "
    (?:
      [^\\"]+
    |
      \\ ["\\/bfnrt]
#    |
#      \\ u [0-9a-fA-F]{4}
    )*
    "
  )

  (?{ [$^R, eval $^N] })
)

(?<NUMBER>
  (
    -?
    (?: 0 | [1-9]\d* )
    (?: \. \d+ )?
    (?: [eE] [-+]? \d+ )?
  )

  (?{ [$^R, eval $^N] })
)

) }xms;

sub from_json {
  local $_ = shift;
  local $^R;
  eval { m{\A$FROM_JSON\z}; } and return $_;
  die $@ if $@;
  return 'no match';
}

while (<>) {
  chomp;
  print Dumper from_json($_);
}

-- Randal L. Schwartz, Perl hacker

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
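
For readers who have not met `$^R` before, the sketch below shows the same accumulator-threading idea on a much smaller grammar: a comma-separated list of unsigned integers. It is only an illustration, not the code from the post. The names `$INT_LIST`, `$PARSED`, and `parse_int_list` are made up for this example, and the result is copied into a package variable rather than into `$_` as the original does.

```perl
use strict;
use warnings;

our $PARSED;    # hypothetical package variable that receives the final result

# Hypothetical mini-grammar: a comma-separated list of unsigned integers.
# Each code block returns a new array reference, which becomes the next value
# of the regex result variable, so the accumulator is threaded through the match.
my $INT_LIST = qr{
    \A
    (?{ [] })                                  # seed an empty accumulator
    (\d+) (?{ [ @{$^R}, 0 + $^N ] })           # append the first item
    (?: , (\d+) (?{ [ @{$^R}, 0 + $^N ] }) )*  # append each remaining item
    \z
    (?{ $PARSED = $^R })                       # copy the finished list out
}x;

sub parse_int_list {
    my ($text) = @_;
    local $PARSED;
    return $text =~ $INT_LIST ? $PARSED : undef;
}

print join('+', @{ parse_int_list('1,22,333') }), "\n";    # prints 1+22+333
print defined parse_int_list('1,,2') ? "match\n" : "no match\n";    # prints no match
```

The posted parser does the same thing with nested pairs of the form `[previous state, new value]`, which is what lets each rule hand its value back to whichever rule called it.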

Re: JSON parser as a single Perl Regex
by Anonymous Monk on Sep 26, 2012 at 21:06 UTC
    As an alternative you can use fatpack to get the minimal module support. http://search.cpan.org/~ether/App-FatPacker-0.009010/bin/fatpack
Re: JSON parser as a single Perl Regex
by Anonymous Monk on Sep 27, 2012 at 01:34 UTC

    It's quicker to fail on buggy JSON than RFC: A walkthrough from JSON ABNF to Regexp::Grammars

    sub fa {
        my $time = time;
        my $ref = from_json(@_);
        $time = time-$time;
        print "T$time ", Dumper( $ref ),"\n";
    }

    fa(q{["double extra comma",,]}); ## THE BUGGY
    fa(q{[1,[2,[3],[]]]}); ## THE REGULAR
    fa( q{[{"k":"v"},{"v":"k"}] } );
    fa( q{{"ro":["sham","bo"],"t":{"i":{"c":{"t":{"o":"c"}}}}}} );
    __END__
    T5 $VAR1 = 'no match';

    T0 $VAR1 = [
              1,
              [
                2,
                [
                  3
                ],
                []
              ]
            ];

    T0 $VAR1 = [
              {
                'k' => 'v'
              },
              {
                'v' => 'k'
              }
            ];

    T0 $VAR1 = {
              'ro' => [
                        'sham',
                        'bo'
                      ],
              't' => {
                       'i' => {
                                'c' => {
                                         't' => {
                                                  'o' => 'c'
                                                }
                                       }
                              }
                     }
            };
      It'd probably fail much faster if I used ratcheting in VALUE.

      -- Randal L. Schwartz, Perl hacker
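
"Ratcheting" here means refusing to backtrack into a subpattern once it has matched. In Perl that is usually written as an atomic group, `(?> ... )`. The sketch below shows only the semantics on a toy pattern. How it would be applied to VALUE is not spelled out in this thread, so treat this as background rather than the proposed fix. The variable names are invented for the example.

```perl
use strict;
use warnings;

# Ordinary greedy quantifier: \d+ first takes all four digits, then gives one
# back so that the trailing \d can match.
my $greedy = qr{\A \d+ \d \z}x;

# Atomic group: once \d+ has taken all four digits it never gives any back,
# so the trailing \d has nothing left and the match fails immediately.
my $ratcheted = qr{\A (?> \d+ ) \d \z}x;

for my $case ( [ greedy => $greedy ], [ ratcheted => $ratcheted ] ) {
    my ($name, $re) = @$case;
    printf "%-9s %s\n", $name, '1234' =~ $re ? 'match' : 'no match';
}
```

Because an atomic group discards its backtracking positions, a failure later in the input cannot trigger a retry of every alternative split of what was already consumed, which is where the time goes on input like the "double extra comma" case above.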


Re: JSON parser as a single Perl Regex
by spazm (Monk) on Sep 27, 2012 at 23:26 UTC
Re: JSON parser as a single Perl Regex
by Tommy (Chaplain) on Sep 30, 2012 at 01:12 UTC

    *picks jaw back up off the floor*

    That's amazing. Seriously that's really awesome, Randal. Wow. Just wow.

    --
    Tommy
    $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'
      Deepest apologies for misspelling your name. It has been corrected.
      --
      Tommy
      $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'
Re: JSON parser as a single Perl Regex
by Anonymous Monk on Oct 01, 2012 at 10:30 UTC

    How do you use this script?

    I tried feeding it various .json files, and it always prints 'no match', frequently after consuming all 8 gigs of my memory first.

      Worked for me. Give me an example of a one-line file that broke.

      -- Randal L. Schwartz, Perl hacker
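
The script as posted reads its input one line at a time and parses each line as a complete JSON document, so a file whose JSON spans several lines is never seen as a whole. Assuming that was part of the problem (the thread does not confirm it), a minimal sketch of reading a whole file into one string follows. `slurp_file` is a hypothetical helper, not part of the original script, and the temporary file exists only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read an entire file into a single string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # undefined record separator: one read returns the whole file
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Demonstration with a throwaway multi-line JSON file.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "[\n  1,\n  2\n]\n";
close $out;

my $json = slurp_file($path);
my $newlines = () = $json =~ /\n/g;
print "read one string containing $newlines newlines\n";
```

The resulting string could then be passed to `from_json` in a single call. Note that in the grammar as posted only VALUE skips surrounding whitespace, while an object key must follow `{` or `,` directly, so pretty-printed objects such as `{ "k": "v" }` would still come back as 'no match'.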


Re: JSON parser as a single Perl Regex
by merlyn (Sage) on Oct 10, 2012 at 03:41 UTC
    Looking back on this a few weeks later, I think the most amazing thing is that...
    if the match succeeds, the thing it is matched against is replaced with the actual data structure. Yes, a match turns into a substitute!

    -- Randal L. Schwartz, Perl hacker

