Re^3: Epoch based parser

"When I try to decode_json the entire file, it doesnt allow, as I believe decode_json looks for the braces to see if it finished. How can I parse multiline json from a file?"

With the sample input you posted, you can just read records as being delimited with "}\n" by setting the input record separator: $/ (see perlvar: Variables related to filehandles). You can then remove the embedded "\n" and "\n+" strings with s/\n[+]?//gm (see perlre if you're unfamiliar with that). Here's a modification of my original code that does this. [Note: you haven't supplied data that matches your original 12:15 or subsequent any:20 — I've made an additional change in order to get some output.]

#!/usr/bin/env perl -l

use strict;
use warnings;

use JSON;
use Time::Piece;

my $wanted_minute = 40;

{
    local $/ = "}\n";

    while (<DATA>) {
        s/\n[+]?//gm;

        my $data = decode_json $_;

        for (@{$data->{aaData}}) {
            print "@$_" if is_wanted_time(@$_[0,1]);
        }
    }
}

sub is_wanted_time {
    for (@_) {
        my $t = gmtime $_;
        return 1 if $t->min == $wanted_minute;
    }

    return 0;
}

__DATA__
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}

Output:

$ pm_epoch_from_json_2.pl
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348

"The file input would be something like: ..."

I doubt it!

That looks like you've just pasted it from the web page including the leading '+'s indicating text wrapping at 70 characters. Assuming that's right, you only need "s/\n//gm" for the regex.
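
A minimal sketch of that simpler case, assuming the real file contains plain line breaks and no '+' markers. The two records in $raw are made up for illustration, and the in-memory filehandle is only there to keep the example self-contained; JSON::PP is used because it ships with Perl, but decode_json from JSON should behave the same way here.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical sample: two records wrapped with plain newlines, no '+' markers.
my $raw = <<'EOT';
{"sColumns":"startTime,endTime,
remoteNode","aaData":[["1375976271","1375976430","LAN"]]}
{"sColumns":"startTime,endTime,
remoteNode","aaData":[["1375976271","1375976430","WAN"]]}
EOT

# Read from the string as if it were a file.
open my $fh, '<', \$raw or die "Can't open in-memory file: $!";

my @rows;
{
    local $/ = "}\n";               # each read returns one JSON object
    while (my $record = <$fh>) {
        $record =~ s/\n//g;         # only newlines need removing
        my $data = decode_json($record);
        push @rows, @{ $data->{aaData} };
    }
}
close $fh;

print "@$_\n" for @rows;
```

The substitution has to run before decode_json because a literal newline inside a JSON string is not valid JSON.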

-- Ken

Re^4: Epoch based parser by spikeinc (Acolyte) on Aug 14, 2013 at 07:37 UTC
Hi Ken, thanks for the quick response.

"Note: you haven't supplied data that matches your original 12:15 or subsequent any:20 - I've made an additional change in order to get some output."

Indeed, I have two scripts: one that does the time-based filtering as you gave me, and another that takes a file as input and then parses the JSON. The script in the above post takes its input (JSON output from a web page, logged to a file) and parses it. I want to attach the file but I am not sure how I can do that. I tried it as above, with the following code, but it gives me an error:

use strict;
use warnings;
use Time::Local;
use JSON;
use File::Read;

my $jsonc = read_file ('H:\Work\perl\latest\Scripts\logs\json.txt');
{
    local $/ = "}\n";
    while ($jsonc) {
        s/\n[+]?//gm;
        my $data = decode_json $_;
        for (@{$data->{aaData}}) {
            #my print/parse function
        }
    }
}

What does having while(<DATA>) do? Does it read until the end of `__DATA__`? If so, is it correct that in my case above I do a while($jsonc)? I understood the substitution part, thanks a lot. However, when I parse my file I get the error below:

c:\perl>perl json.pl
Use of uninitialized value $_ in substitution (s///) at get.pl line 14.
malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at get.pl line 15.

The content of json.txt looks pretty much the same as what you have in the `__DATA__` section above, and I can clearly see that the first JSON block ends with "]]}\n". So it's supposed to work as you mentioned.
Re^5: Epoch based parser by kcott (Archbishop) on Aug 14, 2013 at 09:26 UTC
When I first read your latest post, the code looked like this:

my $jsonc = read_file ('H:\Work\perl\latest\Scripts\logs\json.txt');
local $/ = "}";
while ($jsonc) {
    s/\n[+]?//gm;
    my $data = decode_json $jsonc;
    for (@{$data->{aaData}}) {
        #my print function
    }

I wrote some more example code, dug up a few more references to help you out and started to respond. Upon doing so, I find you've changed that code to this:

my $jsonc = read_file ('H:\Work\perl\latest\Scripts\logs\json.txt');
{
    local $/ = "}\n";
    while ($jsonc) {
        s/\n[+]?//gm;
        my $data = decode_json $_;
        for (@{$data->{aaData}}) {
            #my print/parse function
        }
    }
}

If you make changes then clearly indicate what you've changed! See "How do I change/delete my post?".

Here's the new example code I wrote:

#!/usr/bin/env perl -l

use strict;
use warnings;

use JSON;
use Time::Piece;

my $json_file = 'json.txt';
my $wanted_minute = 40;

open my $json_fh, '<', $json_file or die "Can't read '$json_file': $!";

{
    local $/ = "}\n";

    while (<$json_fh>) {
        s/\n[+]?//gm;

        my $data = decode_json $_;

        for (@{$data->{aaData}}) {
            print "@$_" if is_wanted_time(@$_[0,1]);
        }
    }
}

close $json_fh;

sub is_wanted_time {
    for (@_) {
        my $t = gmtime $_;
        return 1 if $t->min == $wanted_minute;
    }

    return 0;
}

With this input:

$ cat json.txt
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}
{"DisplayRecords":"12","Records":"12","sColumns":"startTime,endTime,
remoteNode,srcIP,srcPort,destIP,destPort,egress,ingress","aaData":[["1
+375976271"
,"1375976430","LAN","D0:05:FE","172.20.30.2",1093,"172.20.28.2",1330,"
+1034,348"]]}

Here's the output (it's the same as from the last example, i.e. `pm_epoch_from_json_2.pl`):

$ pm_epoch_from_json_3.pl
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348
1375976271 1375976430 LAN D0:05:FE 172.20.30.2 1093 172.20.28.2 1330 1034,348

Here are some more references:

For `__DATA__` (which you asked about): perldata: Scalar value constructors (Special Literals section).
For `<filehandle>` (which you don't seem to know about): perlop: I/O Operators.

I've used the builtin open function here. I don't know why you've used the CPAN module File::Read; it seems unnecessary here. And, just for completeness, here's the close documentation.

-- Ken
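
On the while($jsonc) question, the error messages follow from how the two loop conditions differ: `while (<$fh>)` reads one record per iteration and assigns it to `$_`, whereas `while ($jsonc)` only tests whether the string is true, so `$_` is never set (hence "Use of uninitialized value $_") and the loop has nothing to end it. If the file contents are already in a string, one option is to open a filehandle on that string. The sketch below assumes that approach; the contents of $jsonc are hypothetical stand-ins for the slurped file.

```perl
use strict;
use warnings;

# Stand-in for the slurped file contents (hypothetical data).
my $jsonc = qq({"a":1}\n{"b":2}\n);

# while ($jsonc) { ... } would only test the string for truth:
# nothing is read and $_ is never assigned.

# A filehandle opened on a scalar reference reads from the string instead.
open my $fh, '<', \$jsonc or die "Can't open in-memory file: $!";

my @records;
{
    local $/ = "}\n";        # records end with "}\n"
    while (<$fh>) {          # shorthand for: while (defined($_ = <$fh>))
        s/\n//g;             # drop the newlines, keep the closing brace
        push @records, $_;
    }
}
close $fh;

print "$_\n" for @records;
```

Each element of @records is then a complete JSON object that can be passed to decode_json.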
Re^6: Epoch based parser by spikeinc (Acolyte) on Aug 15, 2013 at 05:41 UTC
G'Day Ken. I am extremely sorry for making the edits without flagging them. I was trying out a couple of things and wanted to post the latest version. Your logic really helped, and I thank you for that. I learnt a few things, also implemented some more regexp-based filtering, and it works great! I appreciate your help and am thankful to you for helping me learn Perl so far :)

