Help Extract these lines

by Anonymous Monk
on Jun 17, 2013 at 17:59 UTC
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have this code. What I want is to have this hash data structure:
%rec = { "telephone": "xx-ada-qwebasd", "car": "fasda-asd-123123-fkja123a", "ball": "97f921-a312-fas2", };
This is what I tried so far... and it does not work... Help?
#!/usr/bin/perl
use strict;

my %rec;
while(<DATA>) {
    s/\"//g;
    if ( /id:/ ... /id:/) {
        my ($a) = /id: (\S+),/;
        my ($b) = /name: (\S+),/;
        print "$a - $b\n";
        $rec{$b} = $a;
    }
}

__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",

Re: Help Extract these lines
by Preceptor (Chaplain) on Jun 17, 2013 at 18:25 UTC

    Your loop:

    while ( <DATA> ) ...

    Takes one line at a time.

    print "Line is:", $_, " END\n";

    Will give you a hint as to what's going on:

    Line is:id: xx-ada-qwebasd,
     END
    Line is:name: telphone,
     END
    Line is:id: fasda-asd-123123-fkja123a,
     END
    Line is:name: car,
     END
    Line is:id: 97f921-a312-fas2,
     END
    Line is:name: ball,
     END

    What you need to do is grab two lines at a time, and there are various ways of doing that. Off the top of my head: read a second line inside the loop (if the format is fixed), or use grep. I'll have a think and see if I can come up with something elegant.

      I thought of using a range operator to grab the two lines. But I'm confused as to why it is not grabbing the two lines.

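        Because the 'while' loop runs first and puts just one line of DATA into $_ on each pass. The range operator and both of your matches are then applied to that single line, so the id and name patterns can never both match in the same iteration, which is why it doesn't work.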
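        To see this concretely, here is a small sketch (not from the original posts) that records what the three-dot range operator returns for each of the six sample lines once the quotes are stripped. The names @lines and @states are made up for illustration. The condition turns out to be true on five of the six lines, but every pass still sees only one line, so the id and name captures never succeed together.

```perl
use strict;
use warnings;

# The sample lines as they look after s/\"//g in the question's loop.
my @lines = (
    'id: xx-ada-qwebasd,',
    'name: telphone,',
    'id: fasda-asd-123123-fkja123a,',
    'name: car,',
    'id: 97f921-a312-fas2,',
    'name: ball,',
);

my @states;    # hypothetical helper: one flip-flop result per line
for (@lines) {
    # In scalar context "..." is the flip-flop operator: it returns a
    # sequence number while the range is open and '' when it is closed.
    my $state = /id:/ ... /id:/;
    push @states, $state;

    # Both captures are attempted against the same single line in $_.
    my ($id)   = /id: (\S+),/;
    my ($name) = /name: (\S+),/;
    printf "%-4s id=%-28s name=%s\n",
        ( $state eq '' ? '-' : $state ),
        $id   // '(undef)',
        $name // '(undef)';
}
```

        On every line one of the two captures comes back undefined, which is where the broken output of the original attempt comes from.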

        #!/usr/bin/perl
        use strict;
        use warnings;

        my %rec;
        while ( my $line = <DATA>) {
            $line .= <DATA>;      # append the following line, so $line holds an id/name pair
            $line =~ s/\"//g;
            $line =~s/,//g;
            my ( $id, $name ) = ( $line =~ m/id: (\S+)\nname: (\S+)/mg );
            print "$id = $name\n";
            $rec{$name} = $id;    # keyed by name, as in the structure the question asks for
        }

        __DATA__
        "id": "xx-ada-qwebasd",
        "name": "telphone",
        "id": "fasda-asd-123123-fkja123a",
        "name": "car",
        "id": "97f921-a312-fas2",
        "name": "ball",

        That, I think, does the trick. (Basically, it grabs two lines in one go, but it isn't ideal if your data structure is more complicated.) I suspect there's something cleverer you could do to parse a file and pull out the matching patterns, but I think most of those approaches would involve reading the whole file into a single scalar (which may be fine, but can go wrong with large files).
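        One way the two-lines-at-a-time idea might be made a little more forgiving is sketched below. This is an illustration rather than anything posted in the thread: the input with a dangling "id" line is invented, and the in-memory filehandle merely stands in for DATA so the snippet is self-contained.

```perl
use strict;
use warnings;

# Hypothetical input with a dangling "id" line at the end.
my $text = <<'END';
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
END

# Read from the string through an in-memory filehandle.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my %rec;
while ( defined( my $id_line = <$fh> ) ) {
    my $name_line = <$fh>;
    last unless defined $name_line;    # odd line count: no partner line

    my ($id)   = $id_line   =~ /"id":\s*"([^"]*)"/;
    my ($name) = $name_line =~ /"name":\s*"([^"]*)"/;
    next unless defined $id and defined $name;    # skip malformed pairs

    $rec{$name} = $id;
}
close $fh;

print "$_ => $rec{$_}\n" for sort keys %rec;
```

        The unpaired last line is simply dropped here; whether that or a warning is the right reaction depends on the real data.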

Re: Help Extract these lines
by Laurent_R (Parson) on Jun 17, 2013 at 18:52 UTC

    Maybe something like this (quick proposal, untested):

    my %my_hash;
    while <DATA> {
        chomp;
        my $id = (split /:/, $_)[1] if /id/;
        if (/name/) {
            my $name =~ /: "(\d+)"/;
            $my_hash{$name} = $id;
        }
    }


      Yup, Re: Help Extract these lines doesn't even compile.
      Syntax error at Ln 2 ( which, in turn, produces a spurious error re closing curly braces ).

      If you didn't program your executable by toggling in binary, it wasn't really programming!

        Alright, I just typed an approach for a solution directly on the web page, not having access to a Perl compiler when I did it (and I said that I could not test). What I typed does not compile, but it takes less than 5 seconds to find out why and to correct it. One really easy to find compile error (adding parens after the while keyword) and one other small stupid mistake just as easy to find and to correct, and it basically works:

        use Data::Dumper;

        my %my_hash;
        my $id;
        while (<DATA>) {
            chomp;
            s/\r//g;
            $id = (split /:/, $_)[1] if /id/;
            if (/name/) {
                my $name = $1 if /\: "(\w+)"/;
                $my_hash{$name} = $id;
            }
        }
        print Dumper \%my_hash;

        __DATA__
        "id": "xx-ada-qwebasd",
        "name": "telphone",
        "id": "fasda-asd-123123-fkja123a",
        "name": "car",
        "id": "97f921-a312-fas2",
        "name": "ball",
        which prints:
        $ perl
        $VAR1 = {
                  'ball' => ' "97f921-a312-fas2",',
                  'car' => ' "fasda-asd-123123-fkja123a",',
                  'telphone' => ' "xx-ada-qwebasd",'
                };
        I admit the data still needs a bit more cleaning, but that's really boilerplate code, and I think this untested code still pointed quite well towards the solution.
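        For what it's worth, the remaining cleanup could look something like the sketch below. It is not part of the original post: it starts from the hash exactly as dumped above and strips the leading space, the quotes and the trailing comma from each value.

```perl
use strict;
use warnings;

# The hash as printed above, with the leftover space, quotes and comma.
my %my_hash = (
    'ball'     => ' "97f921-a312-fas2",',
    'car'      => ' "fasda-asd-123123-fkja123a",',
    'telphone' => ' "xx-ada-qwebasd",',
);

# values() returns aliases to the stored values, so editing $value
# changes the hash in place.
for my $value ( values %my_hash ) {
    $value =~ s/^\s*"//;      # leading whitespace and opening quote
    $value =~ s/",?\s*$//;    # closing quote, optional comma, trailing whitespace
}

print "$_ => $my_hash{$_}\n" for sort keys %my_hash;
```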
Re: Help Extract these lines
by 2teez (Priest) on Jun 17, 2013 at 19:04 UTC

    Something like this:

    use warnings;
    use strict;
    use Data::Dumper;

    my $value;
    my %data;
    while (<DATA>) {
        chomp;
        $value = $1 if /.id.+\s"(.+?)"/;
        if (/.name.+\s"(.+?)"/) {
            $data{$1} = $value;
        }
    }
    $Data::Dumper::Pair = ":";    # specify hash key/value separator
    print Dumper \%data;

    __DATA__
    "id": "xx-ada-qwebasd",
    "name": "telphone",
    "id": "fasda-asd-123123-fkja123a",
    "name": "car",
    "id": "97f921-a312-fas2",
    "name": "ball",

    $VAR1 = {
              'ball':'97f921-a312-fas2',
              'car':'fasda-asd-123123-fkja123a',
              'telphone':'xx-ada-qwebasd'
            };

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Help Extract these lines
by hdb (Prior) on Jun 17, 2013 at 19:33 UTC

    For any of the proposed solutions to work, the "id" and "name" entries have to come on alternating lines. Under this assumption, we do not need to bother reading line by line:

    use strict;
    use warnings;
    use Data::Dumper;

    { local $/; $_=<DATA>; }
    my %rec = reverse /"([^:]*?)",/g;
    print Dumper \%rec;

    __DATA__
    "id": "xx-ada-qwebasd",
    "name": "telphone",
    "id": "fasda-asd-123123-fkja123a",
    "name": "car",
    "id": "97f921-a312-fas2",
    "name": "ball",
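    A possible variation on the same slurp-everything idea, sketched here as an illustration and not taken from the thread: match each "id"/"name" pair explicitly instead of relying on reverse over a flat list of values. The here-document stands in for the DATA section so the snippet runs on its own.

```perl
use strict;
use warnings;

# The sample data from the question, held in one scalar.
my $text = <<'END';
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
END

my %rec;
# Each successful match consumes one id line plus the name line right after it.
while ( $text =~ /"id":\s*"([^"]*)",\s*"name":\s*"([^"]*)"/g ) {
    $rec{$2} = $1;    # name => id
}

print "$_ => $rec{$_}\n" for sort keys %rec;
```

    An "id" that is not directly followed by a "name" is skipped rather than shifting every later pair by one, which is the failure mode of the flat-list approach.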
Re: Help Extract these lines
by ww (Bishop) on Jun 17, 2013 at 21:37 UTC
    Perhaps worth noting: most of the proposed solutions above do NOT extend easily or gracefully if the sample data is just one element of a multi-record set.

    To me (YMMV), that raises the question: "Is the data stored in a reasonable fashion?"
    (If so, it's easy enough to extend by initially storing the data to arrays, and then using an AOA or HOA.)

    If you didn't program your executable by toggling in binary, it wasn't really programming!
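    As a rough illustration of the HOA idea mentioned above (a sketch only, under the assumption that the same name may turn up in several records): the extra "car" entry and its id are invented for the example, and %ids_for is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical multi-record input: "car" occurs in two records.
my @lines = (
    '"id": "xx-ada-qwebasd",',
    '"name": "telphone",',
    '"id": "fasda-asd-123123-fkja123a",',
    '"name": "car",',
    '"id": "97f921-a312-fas2",',
    '"name": "ball",',
    '"id": "0000-made-up-id",',
    '"name": "car",',
);

my %ids_for;    # hash of arrays: name => [ id, id, ... ]
my $id;
for my $line (@lines) {
    if ( $line =~ /"id":\s*"([^"]*)"/ ) {
        $id = $1;    # remember the id until its name line arrives
    }
    elsif ( $line =~ /"name":\s*"([^"]*)"/ and defined $id ) {
        push @{ $ids_for{$1} }, $id;
        undef $id;    # each id is used for one name only
    }
}

for my $name ( sort keys %ids_for ) {
    print "$name => @{ $ids_for{$name} }\n";
}
```

    With a plain hash the second "car" would silently overwrite the first; the array keeps both ids.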
