
Re: Table manipulation, array or hash?

by ack (Deacon)
on Mar 23, 2010 at 17:36 UTC


in reply to Table manipulation, array or hash?

I took an approach somewhat similar to the other responders' and used a Hash of Hashes (HoH), combining the two data values for each entry into a single string that can later be split(), so as to avoid a third level of nested data structure.

My approach (with just the principal code snippet) is shown below:

    my %myHash = ();
    my %tempHash = ();

    foreach (@lines){
        my($key1,$key2,$val1,$val2,$rest) = split(/\s+/,$_,5);
        my $combinedValue = sprintf("%2s,%2s",$val1,$val2);
        $key1 =~ /SNP(\d+)/;
        my $indx = $1;
        if(exists $myHash{$key2}){
            %tempHash = %{$myHash{$key2}};
            $tempHash{$indx} = $combinedValue;
            $myHash{$key2} = {%tempHash};
        } else {
            %tempHash = ();   # start empty so entries from a previous key do not leak in
            $tempHash{$indx} = $combinedValue;
            $myHash{$key2} = {%tempHash};
        }
    }

    foreach my $key (sort keys %myHash){
        my %tempHash2 = %{$myHash{$key}};
        my $line2output = "$key ";
        foreach my $sortedKey (sort keys %tempHash2){
            $line2output .= sprintf(" %2s %2s",split(',',$tempHash2{$sortedKey}));
        }
        print "$line2output\n";
    }

I have also put the OP's example data input into an array, @lines, to simplify my testing. Assuming that the lines are being read in from a file, one would do a foreach (<INPUT>){} rather than my foreach (@lines){} structure.
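For comparison with the combined-string trick described above, here is a minimal sketch of what the third level would look like if each pair were stored as an array reference instead. The input lines and the helper name build_table are made up for illustration (the OP's real data is not reproduced here), and the numeric sort of the SNP numbers is my own choice, not something the snippet above does.

```perl
use strict;
use warnings;

# Hypothetical input lines in the form "SNP<n> <key2> <val1> <val2> <rest>".
my @lines = (
    'SNP1 s1 A G extra',
    'SNP2 s1 C T extra',
    'SNP1 s2 G G extra',
);

# build_table is a hypothetical helper: it returns a reference to a
# hash of hashes whose leaves are two-element array references.
sub build_table {
    my @input = @_;
    my %table;
    for my $line (@input) {
        my ($key1, $key2, $val1, $val2) = split /\s+/, $line, 5;
        # Skip any line whose first field carries no SNP number.
        my ($indx) = $key1 =~ /SNP(\d+)/ or next;
        # Third level: an array ref instead of a "val1,val2" string.
        $table{$key2}{$indx} = [ $val1, $val2 ];
    }
    return \%table;
}

my $table = build_table(@lines);

for my $key (sort keys %$table) {
    my $row = $table->{$key};
    print $key;
    # Numeric sort so that SNP10 would follow SNP9 rather than SNP1.
    printf ' %2s %2s', @{ $row->{$_} } for sort { $a <=> $b } keys %$row;
    print "\n";
}
```

The trade-off is one more level of dereferencing in exchange for not having to split() on output.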

I hope this helps and shows yet another approach that works. I didn't spend a lot of time optimizing or simplifying. I figure that is a worthwhile exercise for the reader and the OP.

ack Albuquerque, NM

Replies are listed 'Best First'.
Re^2: Table manipulation, array or hash?
by GrandFather (Saint) on Mar 23, 2010 at 20:14 UTC
    one would do a foreach (<INPUT>){}

    No, one wouldn't. One might do while (<$inFile>) {...}, however. Perl for loops work on lists and will generally build the whole list first (except in a few special cases), so the code you suggested would slurp the entire file into memory, something that should generally be avoided.
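    As a minimal sketch of the line-at-a-time form, the following reads from an in-memory filehandle so that it runs without a real file. The sample text and the helper name fields_per_line are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the real data file.
my $sample = "SNP1 A 1 2 x\nSNP2 A 3 4 y\n";

# fields_per_line is a hypothetical helper: it reads one line at a
# time from the given filehandle and records each line's field count.
sub fields_per_line {
    my ($fh) = @_;
    my @counts;
    # In the while condition the readline is in scalar context, so only
    # one line is held at a time; foreach (<$fh>) would evaluate the
    # readline in list context and read every line up front.
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split ' ', $line;
        push @counts, scalar @fields;
    }
    return @counts;
}

# Opening a reference to a scalar gives an in-memory filehandle.
open my $inFile, '<', \$sample or die "Cannot open in-memory file: $!";
my @counts = fields_per_line($inFile);
close $inFile;

print "@counts\n";
```

    With a real file, only the open call would change; the loop stays the same.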

    That aside, I find your sample code very 'busy' with repeated code and needless (and poorly named) variables. Contrast it with the following:

        my %dataHash;

        foreach (@lines) {
            my ($key1, $key2, $val1, $val2, $rest) = split(/\s+/, $_, 5);
            my $combinedValue = "$val1,$val2";

            $key1 =~ /SNP(\d+)/;
            $dataHash{$key2}{$1} = $combinedValue;
        }

        foreach my $key (sort keys %dataHash) {
            my %tempHash2 = %{$dataHash{$key}};

            print $key;
            printf(" %2s %2s", split(',', $tempHash2{$_})) for sort keys %tempHash2;
            print "\n";
        }

    In a teaching context it is desirable to present the cleanest code you can and to demonstrate best practices. Worthwhile exercises for the reader generally entail extending the code in various ways, not trying to compensate for the sample's deficiencies.


    True laziness is hard work
