
Re: Move data into AoH

by ww (Archbishop)
on Sep 25, 2013 at 17:00 UTC (#1055715)

in reply to Move data into AoH

Your latest message, quoted below, leaves me even more baffled about your problem case.

Harking back to my original question, '(w)hy an AoH? Why not simple hash?', I suggest the following approach lends itself to further transformation of the records into a simple hash (using the "value" elements as the keys) of the fields:

#!/usr/bin/perl
use 5.016;
use warnings;
use Data::Dumper;

=head vitoco's msg:

vitoco says Re Re^2: Move data into AoH The data file happens to be a
JSON structure, an array of records, very like an AoH. I know how to
manage them in perl, but as text values may have the special chars
inside, there was no simple regexpr to parse that.

=cut

# 1055528

my @arr;
my $recsep = q(,);

while ( <DATA> ) {
    @arr = split /$recsep/, $_;
}

my $i=1;
for my $record(@arr) {
    say "\t $i: $record \n";
    ++$i;
}

say "\n\n";
say Dumper @arr;

=head vitoco:

There are no newlines (I put them to show records),

ww: So here's data with extra newlines removed but with (from OP) some
regex-special characters... namely, the square brackets... which
vitoco says prevent writing a "simple regexpr" -- OK, this uses split
and a regular expression (without any post-5.8 bells and whistles),
but the outcome isn't changed.

=cut

__DATA__
[{field1:value1,field2:value2,field3:value3},{field1:value4,field2:value5,field3:value6},{field1:value7,field2:value8,field3:value9}]
When executed:
C:\>perl D:\_Perl_\PMonks\
         1: [{field1:value1
         2: field2:value2
         3: field3:value3}
         4: {field1:value4
         5: field2:value5
         6: field3:value6}
         7: {field1:value7
         8: field2:value8
         9: field3:value9}]

$VAR1 = '[{field1:value1';
$VAR2 = 'field2:value2';
$VAR3 = 'field3:value3}';
$VAR4 = '{field1:value4';
$VAR5 = 'field2:value5';
$VAR6 = 'field3:value6}';
$VAR7 = '{field1:value7';
$VAR8 = 'field2:value8';
$VAR9 = 'field3:value9}]';

Am I hopelessly off target with respect to your intent? If so, can you explain more clearly for the guy (\me) currently playing 'village idiot'? (Remember, every village needs one.)

Re^2: Move data into AoH
by vitoco (Friar) on Sep 26, 2013 at 13:39 UTC

    Given the following table from a database:

    +------+-----------+--------------------+
    | id   | title     | comments           |
    +------+-----------+--------------------+
    | 2911 | Rush Hour | Fun :-)            |
    +------+-----------+--------------------+
    | 3217 | Titanic   | Drama, too long    |
    +------+-----------+--------------------+
    | 6518 | Bambi     |                    |
    +------+-----------+--------------------+
    | 7388 | Star Wars | "I'm your father!" |
    +------+-----------+--------------------+

    In a JSON structure, this would be something like the following (the real file has no newlines; I added them here only to show the records):

    [{"id":2911,"title":"Rush Hour","comments":"Fun :-)"}
    ,{"id":3217,"title":"Titanic","comments":"Drama, too long"}
    ,{"id":6518,"title":"Bambi"}
    ,{"id":7388,"title":"Star Wars","comments":"\"I'm your father!\""}
    ]

    My intention was to read this "special" file and translate it into a tab delimited CSV-like file, filling missing fields with default values.

    The problem parsing this very long line is that it's not possible to split by comma without looking at the context, because it appears in data, just like colons, quotes (escaped with a backslash), and structure indicators ([ ] and { }, not the regex special chars, BTW).
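    The pitfall described above can be sketched as follows. This is only an illustration on the sample data from this post: it counts what a context-free split on commas produces and compares that with the number of fields a real JSON parser finds. JSON::PP (core since Perl 5.14) is an assumption made here so the sketch runs without installing anything; the variable names are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP;   # assumption: core JSON::PP stands in for the JSON module

# Sample data from the post, on one line as in the real file.
my $json = '[{"id":2911,"title":"Rush Hour","comments":"Fun :-)"},'
         . '{"id":3217,"title":"Titanic","comments":"Drama, too long"},'
         . '{"id":6518,"title":"Bambi"},'
         . '{"id":7388,"title":"Star Wars","comments":"\"I\'m your father!\""}]';

# Naive approach: split on every comma, ignoring context.
my @pieces = split /,/, $json;

# Context-aware approach: let a JSON parser find the fields.
my $records = decode_json($json);
my $fields  = 0;
$fields += keys %$_ for @$records;

printf "naive split: %d pieces\n", scalar @pieces;   # 12
printf "real fields: %d\n",        $fields;          # 11

# The extra piece is the tail of a comment that contained a comma.
my @broken = grep { /too long/ } @pieces;
print "broken piece: $broken[0]\n";
```

    The mismatch (12 pieces for 11 fields) comes entirely from the comma inside "Drama, too long"; escaped quotes and colons in the data would break other simple patterns in the same way.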

    The JSON module can parse that file without problems, returning the AoH structure as intended:

    use JSON;
    use Data::Dumper;

    my $json = <DATA>;
    my $data = decode_json($json);
    print Dumper $data;

    __DATA__
    [{"id":2911,"title":"Rush Hour","comments":"Fun :-)"},{"id":3217,"title":"Titanic","comments":"Drama, too long"},{"id":6518,"title":"Bambi"},{"id":7388,"title":"Star Wars","comments":"\"I'm your father!\""}]
    $VAR1 = [
              {
                'title' => 'Rush Hour',
                'id' => 2911,
                'comments' => 'Fun :-)'
              },
              {
                'title' => 'Titanic',
                'id' => 3217,
                'comments' => 'Drama, too long'
              },
              {
                'title' => 'Bambi',
                'id' => 6518
              },
              {
                'title' => 'Star Wars',
                'id' => 7388,
                'comments' => '"I\'m your father!"'
              }
            ];
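    The stated goal, a tab-delimited file with defaults for missing fields, might then look roughly like this. It is a sketch under assumptions, not the code actually used: the column order in @columns, the default text in %default, and the choice to replace embedded tabs and newlines with spaces are all hypothetical, and JSON::PP is used in place of JSON only so the example runs on a stock Perl.

```perl
use strict;
use warnings;
use JSON::PP;   # assumption: core JSON::PP stands in for the JSON module

my $json = '[{"id":2911,"title":"Rush Hour","comments":"Fun :-)"},'
         . '{"id":3217,"title":"Titanic","comments":"Drama, too long"},'
         . '{"id":6518,"title":"Bambi"},'
         . '{"id":7388,"title":"Star Wars","comments":"\"I\'m your father!\""}]';

my $records = decode_json($json);

my @columns = qw(id title comments);       # hypothetical column order
my %default = ( comments => '(none)' );    # hypothetical default values

my @lines;
for my $rec (@$records) {
    my @row = map {
        # Use the record's value, else the column default, else empty.
        my $v = $rec->{$_} // $default{$_} // '';
        $v =~ s/[\t\r\n]/ /g;   # keep stray tabs/newlines from breaking a row
        $v;
    } @columns;
    push @lines, join "\t", @row;
}

# Header line first, then one line per record.
print "$_\n" for join("\t", @columns), @lines;
```

    Commas, colons and quotes in the values need no special handling at this point, because the parser has already separated structure from data.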

    Of course, if I had to manage this structure in memory, I'd use HoH (with id as the main key) or HoA (to save space if all other fields are present in every record) instead of AoH... Or maybe a single hash (as you proposed first) with a key composed of the id value of a record and the field name for every other value... But this should be another discussion.
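    For illustration only, two of the in-memory layouts mentioned above can be derived from the decoded AoH as sketched here. The names %by_id and %flat and the "id:field" key format are hypothetical choices, and JSON::PP again stands in for JSON.

```perl
use strict;
use warnings;
use JSON::PP;   # assumption: core JSON::PP stands in for the JSON module

my $json = '[{"id":2911,"title":"Rush Hour","comments":"Fun :-)"},'
         . '{"id":3217,"title":"Titanic","comments":"Drama, too long"},'
         . '{"id":6518,"title":"Bambi"},'
         . '{"id":7388,"title":"Star Wars","comments":"\"I\'m your father!\""}]';

my $records = decode_json($json);

# HoH: id => { field => value, ... }
my %by_id;
for my $rec (@$records) {
    my %fields = %$rec;             # shallow copy; the AoH stays untouched
    my $id     = delete $fields{id};
    $by_id{$id} = \%fields;
}

# Single flat hash keyed by "id:field" (hypothetical key format).
my %flat;
for my $id (keys %by_id) {
    for my $field (keys %{ $by_id{$id} }) {
        $flat{ join ':', $id, $field } = $by_id{$id}{$field};
    }
}

print $by_id{3217}{title}, "\n";       # Titanic
print $flat{'7388:comments'}, "\n";    # "I'm your father!"
print exists $by_id{6518}{comments} ? "has comments\n" : "no comments\n";
```

    The HoH keeps each record together and makes lookup by id direct; the flat hash trades that grouping for a single level of keys.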

    Was I clear this time? I'm sorry, I'm translating from Spanish on the fly... ;-)

      Was I clear this time?

      Yes! Thank you for making the effort, and for posting your code, which will likely be a blessing for future Seekers....

      I'm translating from Spanish on the fly... ;-)

      ... and in this case, doing it so well that I think I'm entirely clear about your intent and suspect you know a good deal more about JSON, likely know more about Perl and certainly have had an ups-and-downs intro to the Monastery.

      So go forth; sin no more, but confess (with code and data) here if you run into further problems.

      If you didn't program your executable by toggling in binary, it wasn't really programming!

        Well, I'm used to mangling tons of data from many different sources and formats, but I didn't know about JSON. I had only figured out the data format and its potential problems. That's why I started this thread! As some monks pointed out, the JSON module was what I needed this time... at least for a reasonable number of records! :-)

        BTW: in Re^2: Move data into AoH I said: "if I had to manage this structure in memory". I meant: "if I had to manage this dataset in memory".
