PerlMonks  

Re: Obtaining combinations of hash keys and values

by choroba (Cardinal)
on Apr 28, 2016 at 16:39 UTC ( [id://1161790]=note )


in reply to Obtaining combinations of hash keys and values

If the number of fragments you want to combine isn't dynamic (i.e. it's always 2), just write the nested loops:
    #!/usr/bin/perl
    use warnings;
    use strict;

    my %fragments = (
        1 => {
            F => 'TTAAGTAGCATCGATTTATAGCATCGACTAGTAA',
            R => 'TTACTAGTCGATGCTATAAATCGATGCTACTTAA',
        },
        2 => {
            F => 'TTAGCTACGATCAGCTACGATCGAGCGACTACGTAGCAA',
            R => 'TTGCTACGTAGTCGCTCGATCGTAGCTGATCGTAGCTAA',
        },
    );

    my %combinations;
    for my $fragment (keys %fragments) {
        for my $other_fragment (keys %fragments) {
            next if $fragment eq $other_fragment;
            for my $cognate (qw( F R )) {
                for my $other_cognate (qw( F R )) {
                    $combinations{"$fragment$cognate$other_fragment$other_cognate"}
                        = $fragments{$fragment}{$cognate}
                        . $fragments{$other_fragment}{$other_cognate};
                }
            }
        }
    }

    use Data::Dumper;
    print Dumper(\%combinations);

This generates eight entries, not four, because it also creates the reversed pairs (2F1F etc.). To avoid that, change the "next" condition so that each pair of fragments is visited only once:

next if $fragment ge $other_fragment;

If you need to combine N fragments, see Algorithm::Loops.
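If you would rather not pull in a module, the nested loops can also be generalized with plain recursion. This is only a minimal sketch using core Perl, not the Algorithm::Loops approach; the helper name combine and the short toy sequences are made up for illustration, and it assumes numeric fragment keys and that only ascending key order is wanted (as with the "ge" condition above).

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Toy data (hypothetical, much shorter than real sequences).
my %fragments = (
    1 => { F => 'AAA', R => 'TTT' },
    2 => { F => 'CCC', R => 'GGG' },
    3 => { F => 'ACG', R => 'CGT' },
);

# Recursively pick $n fragments in ascending key order and, for each,
# one orientation (F or R). Results are stored in %$out as
# label => concatenated sequence.
sub combine {
    my ($n, $keys, $start, $label, $seq, $out) = @_;
    if ($n == 0) {
        $out->{$label} = $seq;
        return;
    }
    for my $i ($start .. $#$keys) {
        my $fragment = $keys->[$i];
        for my $cognate (qw( F R )) {
            combine($n - 1, $keys, $i + 1,
                    "$label$fragment$cognate",
                    $seq . $fragments{$fragment}{$cognate},
                    $out);
        }
    }
}

my @keys = sort { $a <=> $b } keys %fragments;
my %combinations;
combine(2, \@keys, 0, '', '', \%combinations);   # 2 = fragments per combination
print scalar(keys %combinations), " combinations\n";
```

Changing the first argument of combine to 3 would join three fragments per combination instead of two.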

($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

Re^2: Obtaining combinations of hash keys and values
by Anonymous Monk on Apr 29, 2016 at 09:01 UTC
    Thanks! How scalable would this prove to be if thousands of input fragments are used?
      For thousands of fragments, your result set could get large. Since you are obtaining 4 results for each combination, the total number of results would come to 4 * the total number of combinations.

      • 4 * C(1000,2) == 1,998,000
      • 4 * C(5000,2) == 49,990,000
      • 4 * C(10000,2) == 199,980,000
      • 4 * C(20000,2) == 799,960,000
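Since C(n,2) is n * (n - 1) / 2, the counts are easy to recompute for any number of fragments. A small sketch (the sub name result_count is just for illustration):

```perl
use warnings;
use strict;

# Four results (FF, FR, RF, RR) for each unordered pair of fragments.
sub result_count {
    my ($n) = @_;
    return 4 * $n * ($n - 1) / 2;
}

for my $n (1000, 5000, 10_000, 20_000) {
    printf "%6d fragments -> %d results\n", $n, result_count($n);
}
```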

      It depends on how you want to use or analyze the results, and on whether you want to print them to a file or store them in an array or hash.

      Update: When I ran a test here against a sample FASTA file, I generated 124 fragments and stored the combinations in a hash. Total memory use was 12,848,152 bytes for the 30,504 combinations (about 421 bytes per combination). Because the number of combinations grows quadratically with the number of fragments, I'd guess that with 1000 or more fragments you would probably run out of memory.

      When I used an array instead of a hash, the memory used was slightly less: 10,888,016 bytes (roughly 357 bytes per combination).
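To get a rough feel for how those measurements scale, you could extrapolate linearly from the per-combination figures. This is only a back-of-the-envelope sketch: the real cost depends on sequence length and on your perl build, and the sub name estimated_bytes is made up for illustration.

```perl
use warnings;
use strict;

# Bytes per stored combination, taken from the measurements above.
my %bytes_per_combination = ( hash => 421, array => 357 );

# Estimated memory for all pairwise combinations of $fragments fragments,
# assuming the per-combination cost stays the same (an assumption).
sub estimated_bytes {
    my ($fragments, $per_combination) = @_;
    my $combinations = 4 * $fragments * ($fragments - 1) / 2;
    return $combinations * $per_combination;
}

for my $n (124, 1000, 5000) {
    for my $store (sort keys %bytes_per_combination) {
        printf "%5d fragments, %-5s: ~%.1f MB\n",
            $n, $store,
            estimated_bytes($n, $bytes_per_combination{$store}) / 1e6;
    }
}
```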

Re^2: Obtaining combinations of hash keys and values
by Anonymous Monk on Apr 29, 2016 at 09:13 UTC
    I'm guessing I will need to print to an output file instead of storing combinations in a hash for large input amounts otherwise the RAM requirement is off the charts!
      You can't print before you've read the whole file, otherwise you don't know the counts. If this is really a problem (test it and see), you can store just line numbers in the hash instead of the full blocks, and then read the file a second time and print the numbers (but only print the first block for each corresponding key!).
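For the pairwise combinations from the original reply, the fragments themselves still have to be in memory, but the combinations do not: each one can be printed as soon as it is built. A minimal sketch under some assumptions: numeric fragment keys, a tab-separated "label, sequence" output format chosen here for illustration, and a hypothetical helper named print_combinations.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Toy data (hypothetical; real sequences would be much longer).
my %fragments = (
    1 => { F => 'AAA', R => 'TTT' },
    2 => { F => 'CCC', R => 'GGG' },
    3 => { F => 'ACG', R => 'CGT' },
);

# Write every pairwise combination to $fh as "label<TAB>sequence",
# one per line, without keeping the results in memory.
# Returns the number of lines written.
sub print_combinations {
    my ($fh, $fragments) = @_;
    my @keys  = sort { $a <=> $b } keys %$fragments;
    my $count = 0;
    for my $i (0 .. $#keys - 1) {
        for my $j ($i + 1 .. $#keys) {
            for my $cognate (qw( F R )) {
                for my $other_cognate (qw( F R )) {
                    print {$fh} "$keys[$i]$cognate$keys[$j]$other_cognate\t",
                        $fragments->{ $keys[$i] }{$cognate}
                        . $fragments->{ $keys[$j] }{$other_cognate}, "\n";
                    $count++;
                }
            }
        }
    }
    return $count;
}

my $written = print_combinations(\*STDOUT, \%fragments);
print STDERR "$written combinations written\n";
```

To write to a file instead, open a filehandle with open my $out, '>', $filename and pass $out in place of \*STDOUT.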

