PerlMonks  

Move data into AoH

by vitoco (Pilgrim)
on Sep 24, 2013 at 19:36 UTC ( #1055528=perlquestion )
vitoco has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks!

I have a text file with a data structure, like this:

[{field1:value1,field2:value2,field3:value3}
,{field1:value4,field2:value5,field3:value6}
,{field1:value7,field2:value8,field3:value9}
]

There are no newlines (I put them to show records), and both field names and values may or may not be enclosed by quotes. Special chars in text values are escaped with backslash.

Is there a simple way to read this into a perl structure? I mean, an AoH...

EDIT: To clarify, as values can contain colons, commas and escaped quotes, I cannot write a simple regexpr. I guess I must parse it somehow, and I don't want to reinvent the wheel.

Re: Move data into AoH
by ww (Bishop) on Sep 24, 2013 at 19:41 UTC
    Why an AoH? Why not a simple hash?

    But to answer your question, yes, "there (is) a simple way...."

    The how-to will be found at Data Type: Hash

    Study there, and if you get stuck, come back with the code you've written, the error and warning messages, and any narrative explication that you wish.

      Why an AoH? Why not a simple hash?

      Because it is an array... I edited the OP ;-)

      BTW, I edited only the numbered sequence of values from the example... the [] and {} remained the same.

      Also, some of the fields may or may not be present in a record.

Re: Move data into AoH
by VinsWorldcom (Priest) on Sep 24, 2013 at 21:05 UTC

    How are you getting / generating the text file?

    If the name / value pairs are separated by '=>' and the values are quoted, this can be done with something as simple as eval:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @array = @{eval <DATA>};
    print Dumper \@array;

    __DATA__
    [{field1=>'value1',field2=>'value2',field3=>'value3'},{field1=>'value4',field2=>'value5',field3=>'value6'},{field1=>'value7',field2=>'value8',field3=>'value9'}]

    And the output:

    VinsWorldcom C:\Users\VinsWorldcom\tmp> test.pl
    $VAR1 = [
              {
                'field1' => 'value1',
                'field2' => 'value2',
                'field3' => 'value3'
              },
              {
                'field1' => 'value4',
                'field2' => 'value5',
                'field3' => 'value6'
              },
              {
                'field1' => 'value7',
                'field2' => 'value8',
                'field3' => 'value9'
              }
            ];
Re: Move data into AoH
by LanX (Abbot) on Sep 24, 2013 at 21:07 UTC
    If it's JSON, just use a JSON parser like JSON.pm

    If not, please specify how it differs from JSON and which results you expect.
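
    A minimal sketch of that approach, using JSON::PP (in core since Perl 5.14, same interface as JSON.pm) on data shaped like the OP's but with the quotes JSON requires:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use JSON::PP;        # core drop-in for JSON.pm
    use Data::Dumper;

    my $json = '[{"field1":"value1","field2":"value2"},{"field1":"value3"}]';

    # decode_json returns the Perl structure directly:
    # here an array reference of hash references, i.e. an AoH.
    my $data = decode_json($json);

    print "First record, field1: $data->[0]{field1}\n";
    print Dumper $data;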

    Cheers Rolf

    ( addicted to the Perl Programming Language)

Re: Move data into AoH
by hdb (Parson) on Sep 25, 2013 at 06:24 UTC

    JSON is a good starting point, but it requires field names and values in quotes. See the following code which does not work due to lack of quotes:

    use strict;
    use warnings;
    use JSON;
    use Data::Dumper;

    my $json = <<EOT;
    [{field1:value1,field2:value2,field3:value3}
    ,{field1:value4,field2:value5,field3:value6}
    ,{field1:value7,field2:value8,field3:value9}
    ]
    EOT

    my $data = decode_json( $json );
    print Dumper $data;

    If you can't write a simple regex to fix the quote issue, I suggest looking into Text::CSV. With the help of that module you might be able to parse your text file. It really depends on how quotes, colons and commas interact in your file.
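
    Text::CSV isn't in core, so as a rough illustration of the same idea — splitting on commas while respecting quotes and backslash escapes — here is a sketch using the core module Text::ParseWords instead (an assumption for illustration, not hdb's exact suggestion):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::ParseWords;   # core; quote- and escape-aware splitting

    my $line = q{field1:"a, b",field2:value2};

    # parse_line(delimiter, keep, text): commas inside the quoted
    # section do not split, and with keep=0 the quotes are stripped.
    my @fields = parse_line(',', 0, $line);

    print "$_\n" for @fields;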

Re: Move data into AoH
by vitoco (Pilgrim) on Sep 25, 2013 at 14:02 UTC

    I didn't realize that it was JSON. I'd never tried it before, only XML. I almost started to code a parser from scratch!!! I'm glad I asked here first.

    I tried the JSON module successfully. Quotes and escapes were processed OK. Only one thing got my attention: booleans become a blessed var for the first record, and the remaining records with the same value reference it. I can live with that :-)
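
    That behaviour can be demonstrated in a few lines: the pure-Perl backend decodes every true/false to a shared, blessed JSON::PP::Boolean singleton, which is why Data::Dumper shows later occurrences as references back to the first (a sketch using core JSON::PP):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use JSON::PP;
    use Scalar::Util qw(refaddr);

    my $data = decode_json('[{"ok":true},{"ok":true},{"ok":false}]');

    # Each boolean decodes to a blessed object...
    print ref $data->[0]{ok}, "\n";   # JSON::PP::Boolean

    # ...and equal booleans are references to the very same object.
    print "shared\n"
        if refaddr($data->[0]{ok}) == refaddr($data->[1]{ok});

    # In boolean context they still behave as plain true/false.
    print "truthy\n" if $data->[0]{ok};
    print "falsy\n"  unless $data->[2]{ok};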

    Thank you!

Re: Move data into AoH
by ww (Bishop) on Sep 25, 2013 at 17:00 UTC

    Your latest message, quoted below, leaves me even more baffled about your problem case.

    Harking back to my original question, '(w)hy an AoH? Why not simple hash?', I suggest the following approach lends itself to further transformation of the records into a simple hash (using the "value" elements as the keys) of the fields:

    #!/usr/bin/perl
    use 5.016;
    use warnings;
    use Data::Dumper;

    =head
    vitoco's msg:
    vitoco says in Re^2: Move data into AoH: The data file happens to be a
    JSON structure, an array of records, very like an AoH. I know how to
    manage them in perl, but as text values may have the special chars
    inside, there was no simple regexpr to parse that.
    =cut

    # 1055528
    my @arr;
    my $recsep = q(,);
    while ( <DATA> ) {
        @arr = split /$recsep/, $_;
    }
    my $i = 1;
    for my $record (@arr) {
        say "\t $i: $record \n";
        ++$i;
    }
    say "\n\n";
    say Dumper @arr;

    =head
    vitoco: There are no newlines (I put them to show records),
    ww: So here's data with extra newlines removed but with (from OP) some
    regex-special characters... namely, the square brackets... which vitoco
    says prevent writing a "simple regexpr" -- OK, this uses split and a
    regular expression (without any post-5.8 bells and whistles), but the
    outcome isn't changed.
    =cut

    __DATA__
    [{field1:value1,field2:value2,field3:value3},{field1:value4,field2:value5,field3:value6},{field1:value7,field2:value8,field3:value9}]
    When executed:
    C:\>perl D:\_Perl_\PMonks\1055528.pl
         1: [{field1:value1
         2: field2:value2
         3: field3:value3}
         4: {field1:value4
         5: field2:value5
         6: field3:value6}
         7: {field1:value7
         8: field2:value8
         9: field3:value9}]

    $VAR1 = '[{field1:value1';
    $VAR2 = 'field2:value2';
    $VAR3 = 'field3:value3}';
    $VAR4 = '{field1:value4';
    $VAR5 = 'field2:value5';
    $VAR6 = 'field3:value6}';
    $VAR7 = '{field1:value7';
    $VAR8 = 'field2:value8';
    $VAR9 = 'field3:value9}]';

    Am I hopelessly off target with respect to your intent? If so, can you explain more clearly for the guy (\me) currently playing 'village idiot'? (Remember, every village needs one.)

      Given the following table from a database:

      +------+-----------+--------------------+
      | id   | title     | comments           |
      +------+-----------+--------------------+
      | 2911 | Rush Hour | Fun :-)            |
      +------+-----------+--------------------+
      | 3217 | Titanic   | Drama, too long    |
      +------+-----------+--------------------+
      | 6518 | Bambi     |                    |
      +------+-----------+--------------------+
      | 7388 | Star Wars | "I'm your father!" |
      +------+-----------+--------------------+

      In a JSON structure, this would be something like the following (without newlines to show the records):

      [{"id":2911,"title":"Rush Hour","comments":"Fun :-)"}
      ,{"id":3217,"title":"Titanic","comments":"Drama, too long"}
      ,{"id":6518,"title":"Bambi"}
      ,{"id":7388,"title":"Star Wars","comments":"\"I'm your father!\""}
      ]

      My intention was to read this "special" file and translate it into a tab delimited CSV-like file, filling missing fields with default values.

      The problem with parsing this very long line is that it's not possible to split on commas without looking at the context, because commas appear inside the data, as do colons, quotes (escaped with a backslash), and the structure delimiters ([ ] and { } — not the regex special chars, BTW).

      JSON module can parse that file without problems, returning the AoH structure just like it was intended:

      use JSON;
      use Data::Dumper;

      my $json = <DATA>;
      my $data = decode_json($json);
      print Dumper $data;

      __DATA__
      [{"id":2911,"title":"Rush Hour","comments":"Fun :-)"},{"id":3217,"title":"Titanic","comments":"Drama, too long"},{"id":6518,"title":"Bambi"},{"id":7388,"title":"Star Wars","comments":"\"I'm your father!\""}]
      $VAR1 = [
                {
                  'title' => 'Rush Hour',
                  'id' => 2911,
                  'comments' => 'Fun :-)'
                },
                {
                  'title' => 'Titanic',
                  'id' => 3217,
                  'comments' => 'Drama, too long'
                },
                {
                  'title' => 'Bambi',
                  'id' => 6518
                },
                {
                  'title' => 'Star Wars',
                  'id' => 7388,
                  'comments' => '"I\'m your father!"'
                }
              ];

      Of course, if I had to manage this structure in memory, I'd use HoH (with id as the main key) or HoA (to save space if all other fields are present in every record) instead of AoH... Or maybe a single hash (as you proposed first) with a key composed of the id value of a record and the field name for every other values... But this should be another discussion.
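
      The tab-delimited translation described above might be sketched like this (the column order and the empty-string default are assumptions for illustration, not vitoco's actual spec):

      #!/usr/bin/perl
      use strict;
      use warnings;
      use JSON::PP;   # core; same interface as JSON.pm for this use

      my $json = '[{"id":2911,"title":"Rush Hour","comments":"Fun :-)"},'
               . '{"id":6518,"title":"Bambi"}]';

      my @columns = qw(id title comments);   # assumed column order
      my $default = '';                      # assumed default for missing fields

      my $records = decode_json($json);
      for my $rec (@$records) {
          # exists-or-default per column keeps every row rectangular
          print join("\t",
              map { exists $rec->{$_} ? $rec->{$_} : $default } @columns), "\n";
      }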

      Was I clear this time? I'm sorry, I'm translating from Spanish on the fly... ;-)

        Was I clear this time?

        Yes! Thank you for making the effort, and for posting your code, which will likely be a blessing for future Seekers....

        I'm translating from Spanish on the fly... ;-)

        ... and in this case, doing it so well that I think I'm entirely clear about your intent and suspect you know a good deal more about JSON, likely know more about Perl and certainly have had an ups-and-downs intro to the Monastery.

        So go forth; sin no more, but confess (with code and data) here if you run into further problems.


        If you didn't program your executable by toggling in binary, it wasn't really programming!

Node Type: perlquestion [id://1055528]
Approved by chacham