I'm not sure I understood correctly. Does this do the trick? I made one slight modification: the count of "AG"s (for example) goes in $counts{AG}, not $count_AG.
use strict;
use warnings;
use Algorithm::Loops qw( MapCar );
my @site1 = qw(
AATKKM
123456
);
my @site2 = qw(
GGGGGG
!@#$%^
);
my $site1;
my $site2;
my @pairs;
my %counts;
foreach $site1 (@site1) {
    my @site1_parts = split(//, $site1);
    foreach $site2 (@site2) {
        my @site2_parts = split(//, $site2);
        MapCar { $counts{$_[0].$_[1]}++ }
            \@site1_parts, \@site2_parts;
    }
}
print($_, ': ', $counts{$_}, $/) foreach (sort keys %counts);
__END__
output
======
1!: 1
1G: 1
2@: 1
2G: 1
3#: 1
3G: 1
4$: 1
4G: 1
5%: 1
5G: 1
6G: 1
6^: 1
A!: 1
A@: 1
AG: 2
K$: 1
K%: 1
KG: 2
MG: 1
M^: 1
T#: 1
TG: 1
#! perl -slw
use strict;
use List::Util qw[ min ];
use Data::Dumper;
my @site1 = qw[ abcde cdefg efghi ];
my @site2 = qw[ zyxwv xwvut vutrs ];
my %counts;
for my $site1 ( @site1 ){
    for my $site2 ( @site2 ) {
        $counts{ substr( $site1, $_, 1 ) . substr( $site2, $_, 1 ) }++
            for 0 .. min( length( $site1 )-1, length( $site2 )-1 );
    }
}
print Dumper \%counts;
__END__
P:\test>400340.pl
$VAR1 = {
'bu' => 1,
'cv' => 2,
'fu' => 2,
'av' => 1,
'hw' => 1,
'ax' => 1,
'du' => 2,
'gv' => 2,
'es' => 1,
'is' => 1,
'ct' => 1,
'iv' => 1,
'gx' => 1,
'fr' => 1,
'cx' => 2,
'dy' => 1,
'az' => 1,
'ev' => 3,
'ez' => 1,
'et' => 2,
'it' => 1,
'dw' => 2,
'hu' => 1,
'dr' => 1,
'ex' => 2,
'fy' => 1,
'cz' => 1,
'by' => 1,
'fw' => 2,
'bw' => 1,
'gs' => 1,
'hr' => 1,
'gt' => 2
};
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
I read the description twice and still did not understand what you want to do.
Your description is ambiguous. Because the second string is all the same char, we have to infer which of two things you want:
- char1 of str1 against char1 of str2, char2 of str1 against char2 of str2, and so on, or
- all the permutations, i.e. char1 of str1 against char(1..n) of str2, char2 of str1 against char(1..n) of str2, ...
I am assuming you want all the permutations. There are two general approaches: brute force and a more clever approach. Brute force won't scale well, as it is O(n^2) in the string length. The clever approach is O(n) plus a small O(k^2) whisker, where k is the number of distinct tokens, to fill in the table of token pairs. As you can see, both approaches yield the same results. Assuming those are the results you wanted ;-)
use Data::Dumper;
my $site1 = 'AATKKM' x 20;
my $site2 = 'GGGGGA' x 20;
# brute force
my (%hash, $loops);
for my $base1( split //, $site1 ) {
    for my $base2( split //, $site2 ) {
        $hash{"${base1}_$base2"}++;
        $loops++;
    }
}
# precount incidence of tokens
my (%s1, %s2, %hash_e, $loops_e);
do{ $s1{$_}++; $loops_e++ } for split //, $site1;
do{ $s2{$_}++; $loops_e++ } for split //, $site2;
# still a loop within a loop, but there are now far fewer iterations
# as we only do one per token pair and calculate the total pairs
# from our precount data
for my $base1( keys %s1 ) {
    for my $base2( keys %s2 ) {
        $hash_e{"${base1}_$base2"} = $s1{$base1}*$s2{$base2};
        $loops_e++;
    }
}
print "Brute force loops $loops\nEfficient Loops $loops_e\n\n";
print Data::Dumper->Dump([\%hash, \%s1, \%s2, \%hash_e], [qw( hash s1 s2 hash_e)] );
__DATA__
Brute force loops 14400
Efficient Loops 248
$hash = {
'T_G' => '2000',
'A_A' => '800',
'T_A' => '400',
'M_G' => '2000',
'M_A' => '400',
'K_G' => '4000',
'K_A' => '800',
'A_G' => '4000'
};
$s1 = {
'A' => '40',
'K' => '40',
'T' => '20',
'M' => '20'
};
$s2 = {
'G' => '100',
'A' => '20'
};
$hash_e = {
'T_G' => '2000',
'A_A' => '800',
'T_A' => '400',
'M_G' => '2000',
'M_A' => '400',
'K_G' => '4000',
'K_A' => '800',
'A_G' => '4000'
};
(From memory!) Your algorithm is O(N^2) but mine is O(N)
True enough, but which N?
I don't think your code comes close to answering the question. You're treating all the strings in each array as a single concatenated string. I'm not, and I don't believe that's what the OP wants.
Using your algorithm on my test data produces this:
Which is solving a totally different problem from the one I believe the OP wants to solve.
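To make the difference between the two readings concrete, here is a minimal sketch that runs both on the same tiny input. The sub names count_positionwise and count_all_combinations are made up for illustration, and the input strings AAT and GGA are not from the OP's data. The first sub pairs characters that share an index within each pair of strings; the second concatenates each array and multiplies per-character counts, which is my reading of the precount approach (it uses plain concatenated keys rather than the underscore-joined keys so the two results can be compared directly).

```perl
use strict;
use warnings;

# Reading A (position-wise): for every pair of strings, pair up the
# characters that share an index, stopping at the shorter string.
sub count_positionwise {
    my ($site1, $site2) = @_;    # array refs of strings
    my %counts;
    for my $s1 (@$site1) {
        for my $s2 (@$site2) {
            my ($len1, $len2) = (length $s1, length $s2);
            my $last = ($len1 < $len2 ? $len1 : $len2) - 1;
            $counts{ substr($s1, $_, 1) . substr($s2, $_, 1) }++ for 0 .. $last;
        }
    }
    return \%counts;
}

# Reading B (all combinations): treat each array as one concatenated
# string, precount each character, then multiply the counts per pair.
sub count_all_combinations {
    my ($site1, $site2) = @_;    # array refs of strings
    my (%s1, %s2, %counts);
    $s1{$_}++ for split //, join '', @$site1;
    $s2{$_}++ for split //, join '', @$site2;
    for my $c1 (keys %s1) {
        for my $c2 (keys %s2) {
            $counts{"$c1$c2"} = $s1{$c1} * $s2{$c2};
        }
    }
    return \%counts;
}

my $pos = count_positionwise(     ['AAT'], ['GGA'] );
my $all = count_all_combinations( ['AAT'], ['GGA'] );

print "position-wise:    ", join(', ', map { "$_=$pos->{$_}" } sort keys %$pos), "\n";
print "all combinations: ", join(', ', map { "$_=$all->{$_}" } sort keys %$all), "\n";
```

On this input the position-wise reading gives AG=2, TA=1, while the all-combinations reading gives AA=2, AG=4, TA=1, TG=2. Both are fast in their own terms; they simply count different things, so the "which N?" question only matters once the OP says which reading is intended.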