http://www.perlmonks.org?node_id=1049388



"When I try to decode_json the entire file, it doesnt allow, as I believe decode_json looks for the braces to see if it finished. How can I parse multiline json from a file?"

With the sample input you posted, you can just read records as being delimited with "}\n" by setting the input record separator: $/ (see perlvar: Variables related to filehandles). You can then remove the embedded "\n" and "\n+" strings with s/\n[+]?//gm (see perlre if you're unfamiliar with that). Here's a modification of my original code that does this. [Note: you haven't supplied data that matches your original 12:15 or subsequent any:20 — I've made an additional change in order to get some output.]

#!/usr/bin/env perl -l

use strict;
use warnings;

use JSON;
use Time::Piece;

my $wanted_minute = 40;

{
    local $/ = "}\n";

    while (<DATA>) {
        s/\n[+]?//gm;
        my $data = decode_json $_;

        for (@{$data->{aaData}}) {
            print "@$_" if is_wanted_time(@$_[0,1]);
        }
    }
}

sub is_wanted_time {
    for (@_) {
        my $t = gmtime $_;
        return 1 if $t->min == $wanted_minute;
    }

    return 0;
}

__DATA__
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}

Output:

$ pm_epoch_from_json_2.pl
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348

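To see why a $wanted_minute of 40 produces output for this sample, here is a small check of the two epoch values. It uses the same Time::Piece gmtime call as is_wanted_time; the @fields layout, the %minute hash and the printed format are only my own choices for illustration.

```perl
use strict;
use warnings;
use Time::Piece;    # overrides gmtime to return a Time::Piece object

# startTime and endTime values taken from the sample record.
my @fields = (
    [ startTime => 1375976271 ],
    [ endTime   => 1375976430 ],
);

my %minute;    # field name => UTC minute of that timestamp

for my $field (@fields) {
    my ($name, $epoch) = @$field;
    my $t = gmtime $epoch;
    $minute{$name} = $t->min;
    printf "%-9s %s UTC, minute %d\n", $name, $t->datetime, $t->min;
}

# Prints:
# startTime 2013-08-08T15:37:51 UTC, minute 37
# endTime   2013-08-08T15:40:30 UTC, minute 40
```

Only endTime falls in minute 40, which is enough for is_wanted_time to return 1 for the row.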
"The file input would be something like: ..."

I doubt it!

That looks like you've just pasted it from the web page including the leading '+'s indicating text wrapping at 70 characters. Assuming that's right, you only need "s/\n//gm" for the regex.
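As a minimal sketch of that simpler case, assuming the real file has plain embedded newlines and no leading '+' markers: the record below is the sample with the '+'s taken out, read through an in-memory filehandle so the example stands alone. JSON::PP (a core module) is swapped in for JSON here only so it runs without a CPAN install, and parse_records is a hypothetical helper name, not something from the original script.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; exports decode_json, as JSON does

# Hypothetical helper: read "}\n"-delimited records from a filehandle,
# join each record's lines, and return the decoded structures.
sub parse_records {
    my ($fh) = @_;
    my @records;

    local $/ = "}\n";    # one JSON object per read

    while (my $record = <$fh>) {
        $record =~ s/\n//gm;    # no '+' markers, so only newlines are removed
        push @records, decode_json($record);
    }

    return @records;
}

# Assumed file content: the sample record without the wrap markers.
my $text = <<'END';
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
1034,348"]]}
END

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @records = parse_records($fh);
close $fh;

for my $data (@records) {
    for my $row (@{ $data->{aaData} }) {
        print "@$row\n";
    }
}
```

For a real file, the same helper should work with a handle opened on the file's path instead of on \$text.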

-- Ken