http://www.perlmonks.org?node_id=1039430

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have this code. What I want is to have this hash data structure:
%rec = ( "telephone" => "xx-ada-qwebasd", "car" => "fasda-asd-123123-fkja123a", "ball" => "97f921-a312-fas2", );
This is what I tried so far... and it does not work... Help?
#!/usr/bin/perl
use strict;

my %rec;
while (<DATA>) {
    s/\"//g;
    if ( /id:/ ... /id:/ ) {
        my ($a) = /id: (\S+),/;
        my ($b) = /name: (\S+),/;
        print "$a - $b\n";
        $rec{$b} = $a;
    }
}
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",

Replies are listed 'Best First'.
Re: Help Extract these lines
by hdb (Monsignor) on Jun 17, 2013 at 19:33 UTC

    For any of the proposed solutions to work, the "id" and "name" entries have to alternate, one per line. Under this assumption, there is no need to read line by line:

    use strict;
    use warnings;
    use Data::Dumper;

    # Slurp the whole DATA section into $_
    { local $/; $_ = <DATA>; }

    # Captures come out as id, name, id, name, ...; reverse gives name => id pairs
    my %rec = reverse /"([^:]*?)",/g;
    print Dumper \%rec;

    __DATA__
    "id": "xx-ada-qwebasd",
    "name": "telphone",
    "id": "fasda-asd-123123-fkja123a",
    "name": "car",
    "id": "97f921-a312-fas2",
    "name": "ball",
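To make the `reverse` trick above a little more visible, here is a minimal sketch that splits it into two steps. The variable names `$text` and `@values` are mine, not hdb's, and the sample is only the first two records from the thread, held in a string instead of DATA. It still assumes strictly alternating "id" and "name" lines.

```perl
use strict;
use warnings;

# First two records from the thread, joined into one string (stands in for the slurped DATA)
my $text = join '', map { "$_\n" } (
    '"id": "xx-ada-qwebasd",',
    '"name": "telphone",',
    '"id": "fasda-asd-123123-fkja123a",',
    '"name": "car",',
);

# Step 1: in list context, /g returns every capture in file order: id, name, id, name
my @values = $text =~ /"([^:]*?)",/g;

# Step 2: reversing the list turns it into name, id, name, id,
# which a hash assignment reads as key => value pairs
my %rec = reverse @values;

print "$_ => $rec{$_}\n" for sort keys %rec;
```

If a "name" line were missing, the pairs would shift by one and keys and values would silently swap, which is why the alternating-lines assumption matters here.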
Re: Help Extract these lines
by Preceptor (Deacon) on Jun 17, 2013 at 18:25 UTC

    Your loop:

    while ( <DATA> ) ...

    Takes one line at a time.

    print "Line is:", $_, " END\n";

    Will give you a hint as to what's going on:

    Line is:id: xx-ada-qwebasd,
     END
    Line is:name: telphone,
     END
    Line is:id: fasda-asd-123123-fkja123a,
     END
    Line is:name: car,
     END
    Line is:id: 97f921-a312-fas2,
     END
    Line is:name: ball,
     END

    What you need to do is grab two lines at a time, and there are various ways of doing that. Off the top of my head: reading a second line inside the loop (if the format is fixed), or using grep. I'll have a think and see if I can come up with something elegant.

      I thought of using a range operator to grab the two lines. But I'm confused as to why it is not grabbing the two lines.

        Because the 'while' loop goes off first, populating $_ with one line of DATA. Your range operator then applies to that, which is why it doesn't work.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my %rec;
        while ( my $line = <DATA> ) {
            $line .= <DATA>;    # append the following line, so $line holds an id/name pair
            $line =~ s/\"//g;
            $line =~ s/,//g;
            my ( $id, $name ) = ( $line =~ m/id: (\S+)\nname: (\S+)/mg );
            print "$id = $name\n";
            $rec{$id} = $name;
        }
        __DATA__
        "id": "xx-ada-qwebasd",
        "name": "telphone",
        "id": "fasda-asd-123123-fkja123a",
        "name": "car",
        "id": "97f921-a312-fas2",
        "name": "ball",

        That, I think, does the trick. (Basically, it grabs two lines in one go, but it isn't ideal if your data structure is more complicated.) I suspect there's something cleverer you could do to parse the file and pull out the pattern matches, but I think most of those approaches would involve slurping the whole file into a scalar (which may be fine, but can go wrong with large files).
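To illustrate why the range operator in the original question never sees an id and a name together, here is a rough trace of what happens on each pass through the loop. It is only a sketch: `@lines` holds the first four sample lines with the quotes already stripped, and `@trace` is a name I made up for collecting the results.

```perl
use strict;
use warnings;

# Sample lines as the loop sees them after s/\"//g
my @lines = (
    'id: xx-ada-qwebasd,',
    'name: telphone,',
    'id: fasda-asd-123123-fkja123a,',
    'name: car,',
);

my @trace;
for (@lines) {
    # In scalar context the range operator is a flip-flop: it only reports
    # whether the current line lies inside a range, it never joins lines.
    my $in_range = ( /id:/ ... /id:/ ) ? 1 : 0;

    # Each match still runs against the single line in $_
    my ($id)   = /id: (\S+),/;
    my ($name) = /name: (\S+),/;

    push @trace, [ $in_range, $id, $name ];
    printf "in_range=%d id=%s name=%s\n",
        $in_range, $id // 'undef', $name // 'undef';
}
```

On every line, one of the two captures is undefined, because `$_` only ever holds one line. The flip-flop being true does not change that.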

Re: Help Extract these lines
by 2teez (Vicar) on Jun 17, 2013 at 19:04 UTC

    Something like this:

    use warnings;
    use strict;
    use Data::Dumper;

    my $value;
    my %data;
    while (<DATA>) {
        chomp;
        $value = $1 if /.id.+\s"(.+?)"/;    # remember the most recent id
        if (/.name.+\s"(.+?)"/) {
            $data{$1} = $value;             # pair the name with that id
        }
    }
    $Data::Dumper::Pair = ":";    # specify hash key/value separator
    print Dumper \%data;

    __DATA__
    "id": "xx-ada-qwebasd",
    "name": "telphone",
    "id": "fasda-asd-123123-fkja123a",
    "name": "car",
    "id": "97f921-a312-fas2",
    "name": "ball",
    Output:
    $VAR1 = {
              'ball':'97f921-a312-fas2',
              'car':'fasda-asd-123123-fkja123a',
              'telphone':'xx-ada-qwebasd'
            };

    If you tell me, I'll forget.
    If you show me, I'll remember.
    If you involve me, I'll understand.
    --- Author unknown to me
Re: Help Extract these lines
by Laurent_R (Canon) on Jun 17, 2013 at 18:52 UTC

    Maybe something like this (quick proposal, untested):

    my %my_hash;
    while <DATA> {
        chomp;
        my $id = (split /:/, $_)[1] if /id/;
        if (/name/) {
            my $name =~ /: "(\d+)"/;
            $my_hash{$name} = $id;
        }
    }

      "untested."

      Yup, Re: Help Extract these lines doesn't even compile.
      Syntax error at Ln 2 ( which, in turn, produces a spurious error re closing curly braces ).


      If you didn't program your executable by toggling in binary, it wasn't really programming!

        Alright, I just typed an approach for a solution directly on the web page, not having access to a Perl compiler when I did it (and I said that I could not test). What I typed does not compile, but it takes less than 5 seconds to find out why and to correct it. One really easy to find compile error (adding parens after the while keyword) and one other small stupid mistake just as easy to find and to correct, and it basically works:

        use Data::Dumper;

        my %my_hash;
        my $id;
        while (<DATA>) {
            chomp;
            s/\r//g;
            $id = (split /:/, $_)[1] if /id/;
            if (/name/) {
                my $name = $1 if /\: "(\w+)"/;
                $my_hash{$name} = $id;
            }
        }
        print Dumper \%my_hash;

        __DATA__
        "id": "xx-ada-qwebasd",
        "name": "telphone",
        "id": "fasda-asd-123123-fkja123a",
        "name": "car",
        "id": "97f921-a312-fas2",
        "name": "ball",
        which prints:
        $ perl 2_lines.pl
        $VAR1 = {
                  'ball' => ' "97f921-a312-fas2",',
                  'car' => ' "fasda-asd-123123-fkja123a",',
                  'telphone' => ' "xx-ada-qwebasd",'
                };
        I admit the data still needs a bit more cleaning, but that's really boilerplate code, and I think that this untested code basically pointed quite clearly towards the solution.
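For what it's worth, the remaining cleanup could look something like the sketch below. The helper name `clean_value` and the `@raw` list are assumptions of mine for illustration; the inputs are the three values exactly as they appear in the Dumper output above.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading whitespace and quote,
# then the closing quote, trailing comma and trailing whitespace
sub clean_value {
    my ($value) = @_;
    $value =~ s/^\s*"?//;
    $value =~ s/"?,?\s*$//;
    return $value;
}

# Values as printed in the output above
my @raw = (
    ' "97f921-a312-fas2",',
    ' "fasda-asd-123123-fkja123a",',
    ' "xx-ada-qwebasd",',
);

my @clean = map { clean_value($_) } @raw;
print "$_\n" for @clean;
```

In the loop itself this would amount to calling the helper on the result of the `split` before storing it in `$id`.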
Re: Help Extract these lines
by ww (Archbishop) on Jun 17, 2013 at 21:37 UTC
    Perhaps worth noting: most of the proposed solutions above do NOT extend easily or gracefully if the sample data is just one element of a multi-record set.

    To me (YMMV), that raises the question: "Is the data stored in a reasonable fashion?"
    (If so, it's easy enough to extend by initially storing the data to arrays, and then using an AOA or HOA.)


    If you didn't program your executable by toggling in binary, it wasn't really programming!
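
As a rough sketch of the HoA idea mentioned above, and assuming that "multi-record" means the same name can turn up with more than one id, each name could map to an array of ids. The hash name `%ids_by_name` is mine, and the last sample line is changed to a second "car" purely to show the effect; it is not in the original data.

```perl
use strict;
use warnings;

my @lines = (
    '"id": "xx-ada-qwebasd",',
    '"name": "telphone",',
    '"id": "fasda-asd-123123-fkja123a",',
    '"name": "car",',
    '"id": "97f921-a312-fas2",',
    '"name": "car",',    # hypothetical duplicate name, not in the original data
);

my %ids_by_name;    # hash of arrays: name => [ id, id, ... ]
my $id;
for my $line (@lines) {
    if ( $line =~ /^"id":\s*"([^"]*)"/ ) {
        $id = $1;    # remember the id until its name arrives
    }
    elsif ( $line =~ /^"name":\s*"([^"]*)"/ and defined $id ) {
        push @{ $ids_by_name{$1} }, $id;
        undef $id;    # a name without a preceding id is skipped
    }
}

for my $name ( sort keys %ids_by_name ) {
    print "$name: @{ $ids_by_name{$name} }\n";
}
```

With a plain hash, the second "car" would overwrite the first; the array keeps both ids in input order.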