PerlMonks  

Re^3: Epoch based parser

by kcott (Archbishop)
on Aug 14, 2013 at 06:48 UTC


in reply to Re^2: Epoch based parser
in thread Epoch based parser

"When I try to decode_json the entire file, it doesnt allow, as I believe decode_json looks for the braces to see if it finished. How can I parse multiline json from a file?"

With the sample input you posted, you can just read records as being delimited with "}\n" by setting the input record separator: $/ (see perlvar: Variables related to filehandles). You can then remove the embedded "\n" and "\n+" strings with s/\n[+]?//gm (see perlre if you're unfamiliar with that). Here's a modification of my original code that does this. [Note: you haven't supplied data that matches your original 12:15 or subsequent any:20 — I've made an additional change in order to get some output.]

#!/usr/bin/env perl -l

use strict;
use warnings;

use JSON;
use Time::Piece;

my $wanted_minute = 40;

{
    local $/ = "}\n";

    while (<DATA>) {
        s/\n[+]?//gm;
        my $data = decode_json $_;

        for (@{$data->{aaData}}) {
            print "@$_" if is_wanted_time(@$_[0,1]);
        }
    }
}

sub is_wanted_time {
    for (@_) {
        my $t = gmtime $_;
        return 1 if $t->min == $wanted_minute;
    }

    return 0;
}

__DATA__
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}

Output:

$ pm_epoch_from_json_2.pl
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348

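As a quick check on why both rows are printed with the wanted minute set to 40, the two timestamps from the sample row can be converted on their own. This is only a sketch: utc_minute is a hypothetical helper name that is not part of the script above; it does the same gmtime/min lookup as is_wanted_time.

```perl
use strict;
use warnings;
use Time::Piece;    # overrides gmtime to return a Time::Piece object

# Hypothetical helper (not in the original script): the UTC minute-of-hour
# for an epoch timestamp.
sub utc_minute {
    my ($epoch) = @_;
    return gmtime($epoch)->min;
}

print utc_minute(1375976271), "\n";    # startTime from the sample row: 37
print utc_minute(1375976430), "\n";    # endTime from the sample row: 40
```

Only the end time falls in minute 40, which is enough because is_wanted_time returns 1 as soon as either value matches.
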
"The file input would be something like: ..."

I doubt it!

That looks like you've just pasted it from the web page including the leading '+'s indicating text wrapping at 70 characters. Assuming that's right, you only need "s/\n//gm" for the regex.
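
To illustrate the difference between the two substitutions, here is a small sketch. The unwrap helper and its $strip_plus flag are hypothetical names used only for this example. The /m modifier is left out because it only changes how ^ and $ match, and neither pattern uses them.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): join a multi-line record into one
# line. With $strip_plus true, a '+' directly after a newline is treated as a
# wrap marker and removed as well.
sub unwrap {
    my ($record, $strip_plus) = @_;

    if ($strip_plus) {
        $record =~ s/\n[+]?//g;    # data pasted with '+' wrap markers
    }
    else {
        $record =~ s/\n//g;        # data that only contains plain newlines
    }

    return $record;
}

print unwrap(qq{["1\n+375976271"]}, 1), "\n";    # ["1375976271"]
print unwrap(qq{["1\n375976271"]},  0), "\n";    # ["1375976271"]
```

Note that the first form would also remove a genuine '+' that happened to start a line, so the plain s/\n//g is the safer choice when the file has no wrap markers.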

-- Ken

Re^4: Epoch based parser
by spikeinc (Acolyte) on Aug 14, 2013 at 07:37 UTC
    Hi Ken,

    Thanks for a quick response.

    Note: you haven't supplied data that matches your original 12:15 or subsequent any:20 — I've made an additional change in order to get some output.

    Indeed, I have two scripts: one that does the time-based filtering you gave me, and another that takes a file as input and parses the JSON in it. The script in the post above takes its input from a file (JSON output from a web page, logged to that file) and parses it.

    I want to attach the file, but I am not sure how to do that. I tried it as above, but it gives me the following error:

    use strict;
    use warnings;
    use Time::Local;
    use JSON;
    use File::Read;

    my $jsonc = read_file ('H:\Work\perl\latest\Scripts\logs\json.txt');
    {
        local $/ = "}\n";
        while ($jsonc) {
            s/\n[+]?//gm;
            my $data = decode_json $_;
            for (@{$data->{aaData}}) {
                #my print/parse function
            }
        }
    }

    What does while (<DATA>) do? Does it read until the end of the __DATA__ section? If so, is it correct for me to use while ($jsonc) in my case above? I understood the substitution part, thanks a lot. However, when I parse my file I get the error below:

    c:\perl>perl json.pl
    Use of uninitialized value $_ in substitution (s///) at get.pl line 14.
    malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at get.pl line 15.

    The content of json.txt looks pretty much the same as what you have in the __DATA__ section above, and I can clearly see that the first JSON block ends with "]]}\n". So it's supposed to work as you mentioned.

      When I first read your latest post, the code looked like this:

      my $jsonc = read_file ('H:\Work\perl\latest\Scripts\logs\json.txt');
      local $/ = "}";
      while ($jsonc) {
          s/\n[+]?//gm;
          my $data = decode_json $jsonc;
          for (@{$data->{aaData}}) {
              #my print function
          }

      I wrote some more example code, dug up a few more references to help you out and started to respond. Upon doing so, I find you've changed that code to this:

      my $jsonc = read_file ('H:\Work\perl\latest\Scripts\logs\json.txt');
      {
          local $/ = "}\n";
          while ($jsonc) {
              s/\n[+]?//gm;
              my $data = decode_json $_;
              for (@{$data->{aaData}}) {
                  #my print/parse function
              }
          }
      }

      If you make changes then clearly indicate what you've changed! See "How do I change/delete my post?".

      Here's the new example code I wrote:

      #!/usr/bin/env perl -l

      use strict;
      use warnings;

      use JSON;
      use Time::Piece;

      my $json_file = 'json.txt';
      my $wanted_minute = 40;

      open my $json_fh, '<', $json_file or die "Can't read '$json_file': $!";

      {
          local $/ = "}\n";

          while (<$json_fh>) {
              s/\n[+]?//gm;
              my $data = decode_json $_;

              for (@{$data->{aaData}}) {
                  print "@$_" if is_wanted_time(@$_[0,1]);
              }
          }
      }

      close $json_fh;

      sub is_wanted_time {
          for (@_) {
              my $t = gmtime $_;
              return 1 if $t->min == $wanted_minute;
          }

          return 0;
      }

      With this input:

      $ cat json.txt
      {"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
      remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
      +375976271"
      ,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
      +1034,348"]]}
      {"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
      remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
      +375976271"
      ,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
      +1034,348"]]}

      Here's the output (it's the same as from the last example, i.e. pm_epoch_from_json_2.pl):

      $ pm_epoch_from_json_3.pl
      1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348
      1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348
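
On the question of while (<DATA>) versus while ($jsonc): reading from a filehandle in a while condition assigns the next record to $_ on every pass, whereas while ($jsonc) only tests whether the string is true and never sets $_, which matches the "uninitialized value $_" warning. If the file has already been slurped into a string, one possible approach is to open an in-memory filehandle on that string and read records from it. The following is a sketch only: decode_records is a hypothetical helper name, and the core module JSON::PP is assumed here as a stand-in for JSON so that the example has no non-core dependencies.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, assumed here in place of JSON

# Hypothetical helper (illustration only): split an already-slurped string
# into "}\n"-terminated records and decode each one.
sub decode_records {
    my ($text) = @_;
    my @decoded;

    # A reference to a scalar opens an in-memory filehandle on its contents.
    open my $fh, '<', \$text or die "Can't open in-memory handle: $!";

    {
        local $/ = "}\n";

        # Perl treats this as: while (defined(my $record = <$fh>)) { ... }
        while (my $record = <$fh>) {
            $record =~ s/\n[+]?//g;         # join wrapped lines, drop '+' markers
            next unless length $record;     # skip anything left empty
            push @decoded, decode_json($record);
        }
    }

    close $fh;
    return @decoded;
}

my $sample = qq({"aaData":[["1\n+375976271","1375976430"]]}\n)
           . qq({"aaData":[["1375976271","1375976430"]]}\n);

print scalar(decode_records($sample)), "\n";    # number of records decoded: 2
```

Opening the file directly, as in the example above, avoids the extra step; the in-memory handle is only useful when the slurped string is needed for something else as well.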

      Here's some more references:

      -- Ken

        G'Day Ken. I am extremely sorry for making the edits without notifying you. I was trying out a couple of things and wanted to post the latest version.

        However, your logic really helped and I thank you for that. I learnt a few things and also implemented some more regexp related filtering and it works great!!!

        I appreciate your help and am thankful to you for helping me learn Perl so far :)
