
Re^2: making presence/absence table from a hash of arrays

by reubs85 (Acolyte)
on Sep 06, 2011 at 09:27 UTC

in reply to Re: making presence/absence table from a hash of arrays
in thread making presence/absence table from a hash of arrays

Thanks for the responses everyone,

Incidentally, it is not 'homework'; I am a PhD genome biologist still getting to grips with the finer details of Perl - I am going to use the code as part of a script that will allow me to count genes which are shared across different species. The example I wrote was merely for ease of readership.

So whilst it's true that I should learn more about the grep and map functions (although asking questions is part of learning, I think), I'm not skipping my way through some homework assignment.

Thanks again!

Re^3: making presence/absence table from a hash of arrays
by Marshall (Prior) on Sep 06, 2011 at 16:19 UTC
    It is certainly possible to get a lot done without fancy map statements. There is certainly something to be said for doing something straightforward with foreach loops. Don't worry about being compact/terse - do something that is easy for you to understand - worry about more complex constructs when you are writing a lot more Perl.

    See code below. Perl is great at translating one thing into another thing - the hash table. So I just make a hash table to translate the column name into an array index. This perhaps could also have been statically declared, but I wanted to make it flexible. For each row in the table, I just zero out an array, use the name2Index translator to turn on the appropriate elements, and then print that row.

    #!/usr/bin/perl -w
    use strict;
    use Data::Dump qw(pp);

    my $header_row= 'one two three four five';

    my %table = (
        row_1 => [qw(one five two)],
        row_2 => [qw(four two)],
        row_3 => [qw(three one five four)],
    );

    my %name2Index;
    my $col=0;
    foreach my $col_head (split ' ',$header_row)
    {
        $name2Index{$col_head} = $col++;
    }
    print "name2Index table = ",pp(\%name2Index),"\n\n";

    foreach my $row (sort keys %table)
    {
        my @bitmap = (0) x keys %name2Index;
        foreach my $col_name (@{$table{$row}})
        {
            $bitmap[$name2Index{$col_name}] = 1;
        }
        print "$row = @bitmap\n";
    }
    __END__
    name2Index table = { five => 4, four => 3, one => 0, three => 2, two => 1 }

    row_1 = 1 1 0 0 1
    row_2 = 0 1 0 1 0
    row_3 = 1 0 1 1 1
      my $header_row= 'one two three four five';
      ...
      my %name2Index;
      ...
      my $col=0;
      foreach my $col_head (split ' ',$header_row)
      {
          $name2Index{$col_head} = $col++;
      }

      Or more simply as:

      my @header_row = qw/ one two three four five /;
      ...
      my %name2Index;
      ...
      @name2Index{ @header_row } = 0 .. $#header_row;

        The innermost foreach loop can also be replaced with array/hash slices:

        #!/ichigo/perl
        use v5.12;
        use warnings;
        use strict;

        my @array = qw/ one two three four five /;

        my %hash = (
            row_1 => [ qw/ one five two / ],
            row_2 => [ qw/ four two / ],
            row_3 => [ qw/ three one five four / ],
        );

        say "@array";

        # Get index mapping.
        my %index;
        @index{@array} = 0..$#array;

        my @zero = (0) x @array;
        my @one = (1) x @array;

        for my $key (sort keys %hash) {
            # Zero-out a bit array.
            my @bits = @zero;

            # Flip bits to one using index.
            @bits[@index{@{$hash{$key}}}] = @one;

            say "$key = @bits";
        }
        "Or more simply as:" I would say that is debatable.

        I think that sometimes we get a bit carried away with the "zoom" of Perl and don't emphasize the basics for beginners, i.e. we would do well to consider the audience when suggesting code.

        The OP is a biologist, not a SW person. The purpose of my post was to show code that only used the most basic parts of beginning Perl - something simple - both from the program logic and the syntax. Also, this code may actually run faster than some more terse versions!
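        Whether the loop or the slice is faster is something to measure rather than assume. A rough sketch of how one might check, using the core Benchmark module; the sub names index_by_loop and index_by_slice are made up for this illustration, and the timings will vary by machine and Perl version, so no results are shown:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @header = qw(one two three four five);

# Hypothetical helper: build the name-to-index table with a plain loop.
sub index_by_loop {
    my %index;
    my $col = 0;
    foreach my $name (@header) {
        $index{$name} = $col++;
    }
    return \%index;
}

# Hypothetical helper: build the same table with a hash slice.
sub index_by_slice {
    my %index;
    @index{@header} = 0 .. $#header;
    return \%index;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    loop  => \&index_by_loop,
    slice => \&index_by_slice,
});
```

        For a five-column table either way is unlikely to matter; the difference, if any, only shows up when the table is built many times.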

        For the OP: what jwkrahn is demonstrating here is called a "hash slice". It combines multiple hash assignment statements, such as $hash{a}=0; $hash{b}=1; into a single statement: @hash{'a','b'}=(0,1); This is great and cool stuff, but a plain old foreach() loop is just fine. Shorter code does not necessarily run faster - in fact, sometimes it runs slower - but it is sometimes easier to write for those "in the know". Perl is loaded with idioms. Extensive use of them is not necessary to write good, solid, clear, high-performing code.
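        To make the equivalence concrete, here is a minimal sketch that builds the same table both ways. The names %by_loop, %by_slice and @names are invented for this example and do not come from the code earlier in the thread:

```perl
use strict;
use warnings;

# Hypothetical keys, chosen only for illustration.
my @names = qw(a b c);

# The long way: one assignment per key inside a foreach loop.
my %by_loop;
my $i = 0;
foreach my $name (@names) {
    $by_loop{$name} = $i++;
}

# The hash slice: all the assignments in a single statement.
my %by_slice;
@by_slice{@names} = (0 .. $#names);

# Both hashes now hold the same key/value pairs.
foreach my $name (@names) {
    print "$name: loop=$by_loop{$name} slice=$by_slice{$name}\n";
}
```

        Note the leading @ on the slice: it is still %by_slice being assigned to, but the sigil says "a list of values" rather than "one value".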

        The syntax for a hash slice, @hash{...}, looks similar to that of an array stored as a hash value, @{$hash{value}}, but it is not the same thing. The HoA (Hash of Arrays) form, @{$hash{value}}, is the "take home", "use it often", "get used to seeing it" message here. A hash slice is less often encountered.
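        A small side-by-side sketch of the two look-alike forms may help. The data here is made up for illustration (it only loosely mirrors the table used earlier), and @picked is an invented name:

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my %table = (
    row_1 => [qw(one five two)],
    row_2 => [qw(four two)],
);
my %index = (one => 0, two => 1, three => 2);

# Hash of arrays: dereference the array stored under ONE key.
my @row_1 = @{ $table{row_1} };
print "row_1 holds: @row_1\n";    # prints "row_1 holds: one five two"

# Hash slice: fetch the values for SEVERAL keys of a plain hash at once.
my @picked = @index{qw(one three)};
print "slice gives: @picked\n";   # prints "slice gives: 0 2"
```

        The position of the braces is the giveaway: @{ ... } wraps an expression that yields an array reference, while @name{ ... } wraps a list of keys.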

        Anyway, my point here is that a hash slice is probably not "easier" for a beginner to understand because of the syntax.
