Help Extract these lines

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have this code. What I want is to have this hash data structure:

my %rec = (
    "telephone" => "xx-ada-qwebasd",
    "car"       => "fasda-asd-123123-fkja123a",
    "ball"      => "97f921-a312-fas2",
);

This is what I tried so far... and it does not work... Help?

#!/usr/bin/perl

use strict;

my %rec;
while(<DATA>) {
    s/\"//g;
    if ( /id:/ ... /id:/)  {
        my ($a) = /id: (\S+),/;
        my ($b) = /name: (\S+),/;
        print "$a - $b\n";
        $rec{$b} = $a;
    }
}
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",


Replies are listed 'Best First'.
Re: Help Extract these lines by hdb (Monsignor) on Jun 17, 2013 at 19:33 UTC
For any of the proposed solutions to work, the "id" and "name" entries have to come on alternating lines. Under this assumption, we do not need to bother reading line by line:

```perl
use strict;
use warnings;
use Data::Dumper;

{ local $/; $_ = <DATA>; }           # slurp the whole DATA section into $_
my %rec = reverse /"([^:]*?)",/g;    # matches give id, name, ...; reverse turns that into name => id
print Dumper \%rec;
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
```
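The `reverse` trick above depends on the captured list strictly alternating id, name, id, name. A hedged variant is to match each id/name pair explicitly, so the pairing is visible in the pattern itself. This is only a sketch: it assumes every "id" line is immediately followed by its "name" line, and `$text` is an illustrative stand-in for the slurped DATA section.

```perl
use strict;
use warnings;

# Sample input, copied from the question.
my $text = <<'END';
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
END

my %rec;
# Each successful match captures one id in $1 and the following name in $2.
while ( $text =~ /"id":\s*"([^"]*)",\s*"name":\s*"([^"]*)"/g ) {
    $rec{$2} = $1;    # key by name, value is the id
}

print "$_ => $rec{$_}\n" for sort keys %rec;
```

An "id" line that is not directly followed by a "name" line is simply skipped by this pattern, rather than shifting every later pair by one.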
Re: Help Extract these lines by Preceptor (Deacon) on Jun 17, 2013 at 18:25 UTC
Your loop:

```perl
while ( <DATA> ) ...
```

takes one line at a time. Adding

```perl
print "Line is:", $_, " END\n";
```

will give you a hint as to what's going on:

```
Line is:id: xx-ada-qwebasd,
 END
Line is:name: telphone,
 END
Line is:id: fasda-asd-123123-fkja123a,
 END
Line is:name: car,
 END
Line is:id: 97f921-a312-fas2,
 END
Line is:name: ball,
 END
```

What you need to do is grab two lines at a time, and there are various ways of doing that. Off the top of my head: reading a second line (if the format is fixed), or using grep. I'll have a think and see if I can come up with something elegant.
Re^2: Help Extract these lines by Anonymous Monk on Jun 17, 2013 at 18:31 UTC
I thought of using a range operator to grab the two lines. But I'm confused as to why it is not grabbing the two lines.
Re^3: Help Extract these lines by Preceptor (Deacon) on Jun 17, 2013 at 18:38 UTC
Because the 'while' loop runs first, populating $_ with one line of DATA. Your range operator then applies to that single line: it only tests whether the current line falls inside a range and never joins lines, which is why it doesn't work.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %rec;
while ( my $line = <DATA> ) {
    $line .= <DATA>;      # append the next line, so $line holds one id/name pair
    $line =~ s/\"//g;
    $line =~ s/,//g;
    my ( $id, $name ) = ( $line =~ m/id: (\S+)\nname: (\S+)/mg );
    print "$id = $name\n";
    $rec{$name} = $id;    # key by name, as the question asks
}
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
```

That, I think, does the trick. (Basically, it grabs two lines in one go, but isn't ideal if your data structure is more complicated.) I suspect there's something more clever you could do to parse a file and pull out pattern matches, but I think most of those approaches would involve slurping the whole file into a scalar (which may be fine, but can go wrong with large files).
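To make the point about the range operator concrete: `/id:/ ... /id:/` is a flip-flop that only reports, line by line, whether the current line lies inside a range. It never concatenates lines, so `$_` still holds a single line, and the `id` and `name` patterns can never both match in the same iteration. A small sketch (the array `@lines` is an illustrative stand-in for the DATA section, and `//` needs Perl 5.10 or later):

```perl
use strict;
use warnings;

my @lines = (
    'id: xx-ada-qwebasd,',
    'name: telphone,',
    'id: fasda-asd-123123-fkja123a,',
    'name: car,',
);

my @seen;
for (@lines) {
    # The flip-flop only says whether the current line is inside a range.
    my $in_range = ( /id:/ ... /id:/ ) ? 1 : 0;
    my ($id)   = /id: (\S+),/;      # can only match on an "id" line
    my ($name) = /name: (\S+),/;    # can only match on a "name" line
    push @seen, [ $in_range, defined $id ? 1 : 0, defined $name ? 1 : 0 ];
    printf "in range: %d  id: %-28s name: %s\n",
        $in_range, $id // '(undef)', $name // '(undef)';
}
```

On every line at least one of `$id` and `$name` is undefined, which is why the original `$rec{$b} = $a` never stores a complete pair.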
Re: Help Extract these lines by 2teez (Vicar) on Jun 17, 2013 at 19:04 UTC
Something like this:

```perl
use warnings;
use strict;
use Data::Dumper;

my $value;
my %data;
while (<DATA>) {
    chomp;
    $value = $1 if /.id.+\s"(.+?)"/;
    if (/.name.+\s"(.+?)"/) {
        $data{$1} = $value;
    }
}
$Data::Dumper::Pair = ":";    # specify hash key/value separator
print Dumper \%data;
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
```

Output:

```
$VAR1 = {
          'ball':'97f921-a312-fas2',
          'car':'fasda-asd-123123-fkja123a',
          'telphone':'xx-ada-qwebasd'
        };
```

If you tell me, I'll forget. If you show me, I'll remember. If you involve me, I'll understand. --- Author unknown to me
Re: Help Extract these lines by Laurent_R (Canon) on Jun 17, 2013 at 18:52 UTC
Maybe something like this (quick proposal, untested):

```perl
my %my_hash;
while <DATA> {
    chomp;
    my $id = (split /:/, $_)[1] if /id/;
    if (/name/) {
        my $name =~ /: "(\d+)"/;
        $my_hash{$name} = $id;
    }
}
```
Re^2: Help Extract these lines by ww (Archbishop) on Jun 17, 2013 at 21:32 UTC
"untested." Yup: the code in Re: Help Extract these lines doesn't even compile. There is a syntax error at line 2 (which, in turn, produces a spurious error about closing curly braces). If you didn't program your executable by toggling in binary, it wasn't really programming!
Re^3: Help Extract these lines by Laurent_R (Canon) on Jun 17, 2013 at 22:25 UTC
Alright, I just typed an approach for a solution directly on the web page, not having access to a Perl compiler when I did it (and I said that I could not test). What I typed does not compile, but it takes less than 5 seconds to find out why and to correct it. Fix one really easy-to-find compile error (the missing parens after the while keyword) and one other small stupid mistake, just as easy to find and to correct, and it basically works:

```perl
use Data::Dumper;

my %my_hash;
my $id;
while (<DATA>) {
    chomp;
    s/\r//g;
    $id = (split /:/, $_)[1] if /id/;
    if (/name/) {
        my ($name) = /\: "(\w+)"/;    # list-context match returns the captured name
        $my_hash{$name} = $id;
    }
}
print Dumper \%my_hash;
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
```

which prints:

```
$ perl 2_lines.pl
$VAR1 = {
          'ball' => ' "97f921-a312-fas2",',
          'car' => ' "fasda-asd-123123-fkja123a",',
          'telphone' => ' "xx-ada-qwebasd",'
        };
```

I admit the data still needs a bit more cleaning, but that's really boilerplate code, and I think that this untested code basically gave a very good pointer towards the solution.
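The remaining cleanup mentioned above could look something like this. It is a minimal sketch that assumes every value has the `"value",` shape of the sample data; `$raw` and `$clean` are illustrative names, not part of the code in the post.

```perl
use strict;
use warnings;

# A value as left behind by split /:/ on an "id" line.
my $raw = ' "97f921-a312-fas2",';

# Capture just the text between the double quotes.
my ($clean) = $raw =~ /"([^"]*)"/;

print "$clean\n";    # prints 97f921-a312-fas2
```

Capturing between the quotes drops the leading space, the quotes, and the trailing comma in one step, instead of stripping each with its own substitution.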
Re: Help Extract these lines by ww (Archbishop) on Jun 17, 2013 at 21:37 UTC
Perhaps worth noting: most of the proposed solutions above do NOT extend easily or gracefully if the sample data is just one element of a multi-record set. To me (YMMV), that raises the question: "Is the data stored in a reasonable fashion?" (If so, it's easy enough to extend by initially storing the data to arrays, and then using an AoA or HoA.)
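One possible reading of that suggestion, sketched under the assumption that the same name can occur in several records: collect every id per name in a hash of arrays (HoA). The names `@records` and `%ids_for` are hypothetical, and the second "car" id is invented purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical (id, name) pairs; the second "car" id is made up
# here to show a repeated name.
my @records = (
    [ 'xx-ada-qwebasd',            'telphone' ],
    [ 'fasda-asd-123123-fkja123a', 'car' ],
    [ 'zz-000-example',            'car' ],
);

my %ids_for;
for my $rec (@records) {
    my ( $id, $name ) = @$rec;
    push @{ $ids_for{$name} }, $id;    # autovivifies the array on first use
}

for my $name ( sort keys %ids_for ) {
    print "$name: @{ $ids_for{$name} }\n";
}
```

With a plain `$rec{$name} = $id`, the second "car" id would silently overwrite the first; the array keeps both.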
