
I have no bioinformatics background, but I'd like to offer a couple of comments on your code, specifically the version that counts overlapping letter pairs (would 'digrams' be an appropriate term for these?).

    my %acids;
    for(my $i = 0; $i < length($string)-1; $i++){
        my $amino = substr($string, $i, 2);
        if(exists $acids{$amino}){
            $acids{$amino}++;
        }else{
            $acids{$amino} = 1;
        }
        #print "$amino\n";
    }

Because it is not necessary to check for the existence of a hash key before incrementing its value (due to autovivification), the body of this for-loop can be reduced to a single statement:
    ++$acids{ substr $string, $i, 2 }
This will almost certainly yield a speed benefit, since each iteration then does one hash lookup instead of two.
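
As a minimal sketch of the reduced loop (the sample string is borrowed from the test run further down; the range-based loop is just one way to write it, not the only one):

```perl
use strict;
use warnings;

# Sample input taken from the test run later in this post.
my $string = 'ABCCCDEAB';

my %acids;
for my $i (0 .. length($string) - 2) {
    # ++ on a key that does not exist yet creates it;
    # the undefined value counts as 0 and no warning is issued.
    ++$acids{ substr $string, $i, 2 };
}

print "$_ => $acids{$_}\n" for sort keys %acids;
```

For a string shorter than two characters the range is empty, so the loop body never runs and %acids stays empty.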

Alternatively, in 5.10+ versions of Perl, the entire for-loop can be replaced by a single regex (tested):
    $string =~ m{ (?= (..) (?{ ++$pairs2{$^N} }) (*FAIL)) }xms;
This may or may not increase speed; you will have to Benchmark this for yourself. The alternate regex
    m{ (?= .. (?{ ++$pairs2{${^MATCH}} }) (*FAIL)) }xmsp
also works (note the additional /p regex modifier, which ${^MATCH} requires) and may be slightly faster because no capturing group is used. Again, benchmarking will tell the tale.
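
If you want a starting point for that comparison, here is a rough sketch using the core Benchmark module. The random test string, its length, and the sub names are all made up for illustration; substitute your real sequence data before drawing any conclusions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 10_000 random letters from a 20-letter alphabet.
my @alphabet = ('A' .. 'T');
my $string   = join '', map { $alphabet[ rand @alphabet ] } 1 .. 10_000;

# Plain loop with substr, no exists() check.
sub count_substr {
    my %pairs;
    ++$pairs{ substr $string, $_, 2 } for 0 .. length($string) - 2;
    return \%pairs;
}

# Capturing lookahead in list context with /g.
sub count_lookahead {
    my %pairs;
    ++$pairs{$_} for $string =~ /(?=(..))/gs;
    return \%pairs;
}

# Embedded code block. A package variable is used here (as in the test
# run below) on the assumption that lexicals inside (?{ ... }) can
# misbehave on Perls older than 5.18.
our %pairs2;
sub count_code_block {
    local %pairs2;
    $string =~ m{ (?= (..) (?{ ++$pairs2{$^N} }) (*FAIL)) }xms;
    return { %pairs2 };
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    substr     => \&count_substr,
    lookahead  => \&count_lookahead,
    code_block => \&count_code_block,
});
```

The rates will depend on your Perl version and on the length and makeup of the string, so treat any single run as a hint rather than a verdict.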

    >perl -wMstrict -le
    "use Test::More tests => 2;
     use Data::Dump;
     ;;
     my $string = 'ABCCCDEAB';
     ;;
     my %pairs1;
     $pairs1{$_}++ for $string =~ /(?=(..))/g;
     ;;
     local our %pairs2;
     $string =~ m{ (?= .. (?{ ++$pairs2{${^MATCH}} }) (*FAIL)) }xmsp;
     ;;
     my %pairs3;
     for (my $i = 0; $i < length($string) - 1; ++$i) {
       ++$pairs3{ substr $string, $i, 2 }
       }
     ;;
     dd \%pairs1, \%pairs2, \%pairs3;
     is_deeply \%pairs1, \%pairs2, '1 & 2, same results';
     is_deeply \%pairs1, \%pairs3, '1 & 3, same results';
    "
    1..2
    (
      { AB => 2, BC => 1, CC => 2, CD => 1, DE => 1, EA => 1 },
      { AB => 2, BC => 1, CC => 2, CD => 1, DE => 1, EA => 1 },
      { AB => 2, BC => 1, CC => 2, CD => 1, DE => 1, EA => 1 },
    )
    ok 1 - 1 & 2, same results
    ok 2 - 1 & 3, same results

In reply to Re^2: Using grep in a scalar context by AnomalousMonk
in thread Using grep in a scalar context by newbie1991
