Re^2: Using grep in a scalar context

in reply to Re: Using grep in a scalar context
in thread Using grep in a scalar context

I have no bioinformatics background, but I'd like to offer a couple of comments on your code, specifically the version that counts overlapping letter pairs (would 'digrams' be an appropriate term for these?).

my %acids;
for(my $i = 0; $i < length($string)-1; $i++){
    my $amino = substr($string, $i, 2);
    if(exists $acids{$amino}){
        $acids{$amino}++;
    }else{
        $acids{$amino} = 1;
    }
    #print "$amino\n";
}

It is not necessary to check for the existence of a hash key before incrementing its value: the key is created on first use, and ++ treats the initial undefined value as 0 without raising a warning. The body of this for-loop can therefore be reduced to a single statement:
++$acids{ substr $string, $i, 2 }
This will almost certainly yield a speed benefit.
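
To make the point concrete, here is a minimal sketch of the reduced loop running under strict and warnings. The sample string is the one used in the test further down; the range-based loop is just my shorthand for the C-style loop above, not something from your code.

```perl
use strict;
use warnings;

my $string = 'ABCCCDEAB';   # sample input, same as in the test below
my %acids;

# No exists() check: the first ++ on a missing key creates it with value 1,
# and no "uninitialized value" warning is raised.
for my $i (0 .. length($string) - 2) {
    ++$acids{ substr $string, $i, 2 };
}

print "$_ => $acids{$_}\n" for sort keys %acids;
```

For this string the loop should report AB and CC twice each and BC, CD, DE and EA once each.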

Alternatively, in 5.10+ versions of Perl, the entire for-loop can be replaced by a single regex (tested):
$string =~ m{ (?= (..) (?{ ++$pairs2{$^N} }) (*FAIL)) }xms;
This may or may not increase speed; you will have to Benchmark this for yourself. The alternate regex
m{ (?= .. (?{ ++$pairs2{${^MATCH}} }) (*FAIL)) }xmsp
also works (note the additional /p regex modifier) and may be slightly faster because no capturing group is used. Again, Benchmark-ing will tell the tale.
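
If you want a starting point for the comparison, something along these lines might do. It is only a sketch: the sub names and the sample input are made up for illustration, and the core Benchmark module's cmpthese is used to run each variant for about one CPU second. The hash updated inside the regex code block is a package variable (as in the test below), since lexicals in such blocks were unreliable in older Perls.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

our %pairs;    # package variable updated from inside the regex code block

# Approach 1: substr loop with a bare increment.
sub count_substr {
    my ($s) = @_;
    my %count;
    ++$count{ substr $s, $_, 2 } for 0 .. length($s) - 2;
    return \%count;
}

# Approach 2: collect every overlapping pair via a capturing lookahead.
sub count_lookahead {
    my ($s) = @_;
    my %count;
    ++$count{$_} for $s =~ /(?=(..))/gs;
    return \%count;
}

# Approach 3: count inside the regex, forcing backtracking with (*FAIL).
sub count_code_block {
    my ($s) = @_;
    local %pairs;
    $s =~ m{ (?= (..) (?{ ++$pairs{$^N} }) (*FAIL)) }xms;
    return { %pairs };
}

# Hypothetical input: a 20-letter alphabet repeated to 10,000 characters.
my $sample = 'ACDEFGHIKLMNPQRSTVWY' x 500;

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    substr_loop => sub { count_substr($sample) },
    lookahead   => sub { count_lookahead($sample) },
    code_block  => sub { count_code_block($sample) },
});
```

The relative rankings will likely depend on your Perl version and on the length and composition of your real sequences, so it is worth substituting representative data for the sample string.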

>perl -wMstrict -le
"use Test::More tests => 2;
 use Data::Dump;
 ;;
 my $string = 'ABCCCDEAB';
 ;;
 my %pairs1;
 $pairs1{$_}++ for $string =~ /(?=(..))/g;
 ;;
 local our %pairs2;
 $string =~ m{ (?= .. (?{ ++$pairs2{${^MATCH}} }) (*FAIL)) }xmsp;
 ;;
 my %pairs3;
 for (my $i = 0;  $i < length($string) - 1;  ++$i) {
   ++$pairs3{ substr $string, $i, 2 }
   }
 ;;
 dd \%pairs1, \%pairs2, \%pairs3;
 is_deeply \%pairs1, \%pairs2, '1 & 2, same results';
 is_deeply \%pairs1, \%pairs3, '1 & 3, same results';
"
1..2
(
  { AB => 2, BC => 1, CC => 2, CD => 1, DE => 1, EA => 1 },
  { AB => 2, BC => 1, CC => 2, CD => 1, DE => 1, EA => 1 },
  { AB => 2, BC => 1, CC => 2, CD => 1, DE => 1, EA => 1 },
)

ok 1 - 1 & 2, same results
ok 2 - 1 & 3, same results
