PerlMonks  

array of hashes, categorized by array index

by tevus_oriley (Novice)
on Jun 26, 2013 at 21:10 UTC (#1040844)
tevus_oriley has asked for the wisdom of the Perl Monks concerning the following question:

I have a log file I am parsing that will eventually become a report. I split each line of the log file into its individual fields, and count selected fields in a hash.
    %report = ();
    while (my $line = <$log_file>) {
        my @fields = (split / /, $line);
        $report{$_}++ foreach @fields[2,4,5,9,7,1];
    }

That works fine... and then I realized I need each field categorized for the reporting part later. I think I need an array of hashes; but I can't figure out how to name each hash after the index number of the @fields array as I'm creating it. The existing hash above could be changed to the main array @report. Then each of the selected index numbers would become the name of a different hash within @report.

The following horrible code creates hashes for each line of input instead of a hash for each selected field.

    my @report;
    while (my $line = <$log_file>) {
        my @fields = (split / /, $line);
        foreach (@fields[2,4,5,9,7,1]) {
            my %hash;
            foreach (@fields[2,4,5,9,7,1]) {
                $hash{$_}++;
            }
            push @report, \%hash;
        }
    }

Sample log lines:

    483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01
    483 DS dx-14 1 james list3 23.456.12.7 ty-on lx-on B 01
    769 XO dx-32 5 sully nolist widgets.com ty-on lx-on V 07

So, for example, when the array of hashes is created, one of the hashes in the array would be

    @report = (
        {   #### 2 or fields2, something like that
            dx-32 => 2,
            dx-14 => 1,
        },
        {   #### fields3, and so on

Any thoughts?

Re: array of hashes, categorized by array index
by LanX (Canon) on Jun 26, 2013 at 21:29 UTC
    my @report;
    while (my $line = <$log_file>) {
        my @fields = (split / /, $line);
        $report[$_]{$fields[$_]}++ for 2,4,5,9,7,1;
    }

    should do! (untested)

    But I would rather use names for each column in a HoH (since you are skipping indices), so

       $report{ $column[$_] }{ $fields[$_] }++  for 2,4,5,9,7,1;

    might be better! (for a hash my %report of course)
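A minimal sketch of the named-column variant suggested above. The thread never says what the fields mean, so the names in `@column` are purely hypothetical placeholders, and the sample lines are held in an array instead of being read from `$log_file`:

```perl
use strict;
use warnings;

# Hypothetical column names, one per field position (assumption: the
# original post does not name the fields).
my @column = qw(id type device count user list site ty lx grade code);

my @lines = (
    '483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01',
    '483 DS dx-14 1 james list3 23.456.12.7 ty-on lx-on B 01',
    '769 XO dx-32 5 sully nolist widgets.com ty-on lx-on V 07',
);

my %report;
for my $line (@lines) {
    my @fields = split ' ', $line;
    # Count each value under the name of its column; the inner hashes
    # are autovivified on first use.
    $report{ $column[$_] }{ $fields[$_] }++ for 2, 4, 5, 9, 7, 1;
}

# Print one line per selected column, values in alphabetical order.
for my $name (sort keys %report) {
    my $counts = $report{$name};
    print "$name: ",
        join(', ', map { "$_=$counts->{$_}" } sort keys %$counts), "\n";
}
```

With names as keys, the reporting code can ask for `$report{device}` without having to remember that the device was field 2.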

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    UPDATE

    using column-numbers as names/keys in a HoH

    use strict;
    use warnings;
    use Data::Dump qw/dd/;

    my %report;
    while (my $line = <DATA>) {
        next if $line =~ /^\s*$/;    # skip empty lines
        my @fields = (split / /, $line);
        $report{$_}{$fields[$_]}++ for 2,4,5,9,7,1;
    }
    dd \%report;

    __DATA__
    483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01
    483 DS dx-14 1 james list3 23.456.12.7 ty-on lx-on B 01
    769 XO dx-32 5 sully nolist widgets.com ty-on lx-on V 07

    Is that what you want?

    {
      1 => { DS => 1, OS => 1, XO => 1 },
      2 => { "dx-14" => 1, "dx-32" => 2 },
      4 => { charles => 1, james => 1, sully => 1 },
      5 => { list3 => 1, list4 => 1, nolist => 1 },
      7 => { "ty-off" => 1, "ty-on" => 2 },
      9 => { B => 1, C => 1, V => 1 },
    }
      That's exactly what I want; the HoH is a better idea. Thanks for that, LanX.
Re: array of hashes, categorized by array index
by BrowserUk (Pope) on Jun 26, 2013 at 21:32 UTC

    Why "name" the fields? They are indexed numerically, so use the indexes to index your report array.

    The following is a dump of @report. It shows that (from the 3-line sample supplied) field[0] contains 2x'483' and 1x'769'; field[1] contains 1 each of 3 values: 'DS', 'OS', 'XO'; and so on.

    C:\test\primes>..\junk83
    [
      { 483 => 2, 769 => 1 },
      { DS => 1, OS => 1, XO => 1 },
      { "dx-14" => 1, "dx-32" => 2 },
      { 1 => 2, 5 => 1 },
      { charles => 1, james => 1, sully => 1 },
      { list3 => 1, list4 => 1, nolist => 1 },
      { "23.456.12.7" => 1, "aardvark.com" => 1, "widgets.com" => 1 },
      { "ty-off" => 1, "ty-on" => 2 },
      { "lx-on" => 3 },
      { B => 1, C => 1, V => 1 },
      { "01" => 2, "07" => 1 },
    ]

    Code to produce the above output:

    #! perl -slw
    use strict;
    use Data::Dump qw[ pp ];

    my @report;
    while( <DATA> ) {
        my @f = split ' ', $_;
        for my $field ( 0 .. $#f ) {
            $report[ $field ]{ $f[ $field ] }++;
        }
    }
    pp \@report;

    __DATA__
    483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01
    483 DS dx-14 1 james list3 23.456.12.7 ty-on lx-on B 01
    769 XO dx-32 5 sully nolist widgets.com ty-on lx-on V 07

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: array of hashes, categorized by array index
by Eily (Hermit) on Jun 26, 2013 at 21:50 UTC

    What's wrong with your "horrible" code is that you don't use the data in the outer loop. If it is written like this:

    OUTER: foreach my $outerField (@fields[2,4,5,9,7,1]) {
        my %hash;
        INNER: foreach my $innerField (@fields[2,4,5,9,7,1]) {
            $hash{$innerField}++;
        }
        push @report, \%hash;
    }

    you can see that $outerField is declared but never used. You're actually just doing the same thing six times in a row.

    If I were you, I'd put the result of the split into a hash instead of an array, because selecting fields by number doesn't make the code easier to read.

    use strict;
    use warnings;
    use Data::Dumper;

    my %report;
    my $nameCount=0;
    my @names = map { $_.++$nameCount } ('Key', ) x 10;
    # Here the names will be Key1 to Key10,
    # But you should actually give explicit names with something like
    # my @names = qw/Princess Leia Obiwan Kenobi Anakin Skywalker Darth Vader Han Solo/;

    while (my $line = <DATA>) {
        my %fields;
        @fields{@names} = (split / /, $line); # Here we put the result into a slice of %fields
        foreach my $name (@names) {
            my $value = $fields{$name};
            $report{$name}{$value}++; # Auto vivification here, $report{$name} magically becomes a hashref
        }
    }
    print Dumper \%report;

    __DATA__
    483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01
    483 DS dx-14 1 james list3 23.456.12.7 ty-on lx-on B 01
    769 XO dx-32 5 sully nolist widgets.com ty-on lx-on V 07

    $VAR1 = {
              'Key10' => {
                           'C' => 1,
                           'B' => 1,
                           'V' => 1
                         },
              'Key2' => {
                          'XO' => 1,
                          'OS' => 1,
                          'DS' => 1
                        },
              'Key1' => {
                          '483' => 2,
                          '769' => 1
                        },
              'Key5' => {
                          'james' => 1,
                          'sully' => 1,
                          'charles' => 1
                        },
              'Key8' => {
                          'ty-on' => 2,
                          'ty-off' => 1
                        },
              'Key4' => {
                          '1' => 2,
                          '5' => 1
                        },
              'Key6' => {
                          'nolist' => 1,
                          'list3' => 1,
                          'list4' => 1
                        },
              'Key3' => {
                          'dx-32' => 2,
                          'dx-14' => 1
                        },
              'Key7' => {
                          'aardvark.com' => 1,
                          'widgets.com' => 1,
                          '23.456.12.7' => 1
                        },
              'Key9' => {
                          'lx-on' => 3
                        }
            };
    
    So if you want to access the information on the site name, and have named the seventh key "siteName", you just have to go for $report{siteName} instead of $report[6] (the seventh field sits at array index 6).
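As a rough illustration of the reporting step that the named keys make easier, the sketch below walks a small hand-copied slice of the dumped structure. The names `siteName` and `device` are hypothetical stand-ins for Key7 and Key3, and the output layout is only one possible choice:

```perl
use strict;
use warnings;

# Hand-copied slice of the dumped structure; "siteName" and "device" are
# hypothetical names standing in for Key7 and Key3.
my %report = (
    siteName => { 'aardvark.com' => 1, 'widgets.com' => 1, '23.456.12.7' => 1 },
    device   => { 'dx-32' => 2, 'dx-14' => 1 },
);

my @summary;
for my $name (sort keys %report) {
    my $counts = $report{$name};
    # Most frequent value first; ties are broken alphabetically so the
    # output order is stable from run to run.
    my @values = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b }
                 keys %$counts;
    push @summary,
        "$name: " . join(', ', map { "$_ ($counts->{$_})" } @values);
}
print "$_\n" for @summary;
```

The explicit sort matters because Perl hashes have no guaranteed key order, which is also why the Dumper output above lists Key10 before Key2.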

    Apart from the use of a hash instead of an array, it's pretty much the same as LanX's and BrowserUk's solutions.

Re: array of hashes, categorized by array index
by bdalzell (Sexton) on Jun 26, 2013 at 22:52 UTC

    In your log line sample, two lines start with the same number. However, if that first number identifies the log line, and you use a sample where all the first numbers are different, the following shows how to create a hash of arrays where the key is the first number in the line.

    #!/usr/bin/perl
    use strict;

    my @sample = ("483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01",
                  "495 DS dx-14 1 james list3 23.456.12.7 ty-on lx-on B 01",
                  "769 XO dx-32 5 sully nolist widgets.com ty-on lx-on V 07");
    my %hoa;
    foreach my $line (@sample){
        my @array = split(/ /, $line);
        my $key = $array[0];
        $hoa{$key} = \@array;    #this is a reference
    }
    my @keys = keys %hoa;
    foreach my $item(@keys){
        print "log line $item\n";
        my @array = @{$hoa{$item}};    #this is dereferenced
        foreach my $part(@array){
            print "$part - ";
        }
        print "\n";
    }

    This will not work if you eliminate the my designation for @array in the first loop: each iteration needs its own fresh array, otherwise every key would end up referring to the same one.
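One way to see why the declaration matters is to compare an array declared once outside the loop with one declared inside it. This is a sketch under the assumption that "eliminating the my" amounts to sharing a single array across iterations; the variable names `@shared` and `%broken` are made up for the illustration, and the sample lines are shortened:

```perl
use strict;
use warnings;

my @sample = ('483 OS dx-32', '495 DS dx-14');

# Declared once, outside the loop: every hash value is a reference to
# this same array, so later iterations overwrite earlier data.
my (@shared, %broken);
for my $line (@sample) {
    @shared = split ' ', $line;
    $broken{ $shared[0] } = \@shared;
}

# Declared inside the loop: each iteration gets a fresh array, so each
# key keeps its own line.
my %hoa;
for my $line (@sample) {
    my @array = split ' ', $line;
    $hoa{ $array[0] } = \@array;
}

print "shared: $broken{483}[1]\n";    # data from the last line processed
print "fresh:  $hoa{483}[1]\n";       # data from line 483 itself
```

In the shared case both keys point at one array, so key 483 reports the fields of line 495.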
Re: array of hashes, categorized by array index
by sundialsvc4 (Monsignor) on Jun 26, 2013 at 23:58 UTC

    Usually, when faced with requirements like these, I grab the various “columns” into hash buckets, just so that I can easily refer to them by name. (I also use the "use constant" pragma to define those names in the code.)

    Then, when it comes time to actually generate an array out of the thing, I simply iterate through a qw// list of those strings, so that it is the order of the entries in this list (effortlessly changed to suit the present whim of the marketing department ...) that defines the column-order.   You can rearrange them to your heart’s content and nothing else happens throughout the code.

    By contrast, if you try to use actual array-indices ... well, the day will come when the third change of the day happens and “you miss just one of them ...”
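The approach described above might look roughly like the following sketch. The column names and the output order are invented for the example (the thread does not name the fields), and only the first four fields are named to keep it short:

```perl
use strict;
use warnings;

# Hypothetical names for the first four fields of a log line.
my @names = qw(id type device count);

# The order of this list alone defines the column order of the output.
my @order = qw(device type id);

my $line = '483 OS dx-32 1 charles list4 aardvark.com ty-off lx-on C 01';

my %row;
@row{@names} = split ' ', $line;    # hash slice; surplus fields are dropped

my @out = @row{@order};             # pull the columns out in report order
print join("\t", @out), "\n";
```

Reordering or dropping a column then means editing `@order` and nothing else, which is the maintenance advantage over hard-coded array indices.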
