Re: Capture Lookahead

Like this?

#!/usr/bin/perl
use strict;
use warnings;

my $str = do {local $/; <DATA>};
$str =~ s/\s+//g;
my $len = length $str;

while (--$len > 780) {
    printf "%3d : %s\n", $_, substr( $str, $_, $len )
        for 1 .. ( length( $str ) - $len );
}

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?

"Science is about questioning the status quo. Questioning authority".

The "good enough" may be good enough for the now, and perfection may be unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

Re^2: Capture Lookahead by Cristoforo (Curate) on Jul 26, 2005 at 02:57 UTC
Yes, and thanks! I made a slight adjustment. The real reason for generating these substrings is to look for palindromes, so I need to check every possible substring of the FASTA string (here, 800 chars). This code does not check for palindromes yet, but that is trivial to add; I just needed to get the munging part correct. Running the program as is created an 86 MB file, which I won't be doing again; I'll print only the palindromes instead. It does take some time, though, and with a larger FASTA string, testing every substring may take a while. Thanks, everyone. I'll be working at this for a while now. The reason I am doing this is that I saw that Mathematica does a palindrome check in pretty terse terms (someone had a link to Mathematica here in the Monks a few days ago).

Chris

#!/usr/bin/perl
use strict;
use warnings;

my $str = do {local $/; <DATA>};
$str =~ s/\s+//g;
my $len = length $str;

do {
    printf "%3d : %s\n", $_, substr( $str, $_, $len )
        for 0 .. ( length( $str ) - $len );
} while (--$len > 3);
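The palindrome check mentioned above as trivial to add could look roughly like the sketch below. It keeps the same longest-first enumeration but stores only the substrings that read the same in both directions. The name `find_palindromes`, the minimum length of 4, and the short sample string are assumptions for illustration, not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return every palindromic
# substring of $str that is at least $min characters long, as
# [offset, substring] pairs, longest lengths first.
sub find_palindromes {
    my ( $str, $min ) = @_;
    my @found;
    for ( my $len = length $str; $len >= $min; $len-- ) {
        for my $off ( 0 .. length( $str ) - $len ) {
            my $sub = substr $str, $off, $len;
            # scalar reverse flips the characters of the string
            push @found, [ $off, $sub ] if $sub eq scalar reverse $sub;
        }
    }
    return @found;
}

# Short made-up sample instead of the 800-char FASTA data
printf "%3d : %s\n", @$_ for find_palindromes( 'TTAGGGATACAT', 4 );
```

This still visits every substring, so the running time grows quickly with the length of the input; it only avoids writing the non-palindromes to disk.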
Re^3: Capture Lookahead by fishbot_v2 (Chaplain) on Jul 26, 2005 at 11:46 UTC
If you want to find palindromes, why not do it with a regex directly?

#!/usr/bin/perl
use strict;
use warnings;

my $str = do {local $/; <DATA>};
$str =~ s/\s+//g;

while ( $str =~ m/( (..+) .? (??{ reverse $2 }) )/xgc ) {
    print pos( $str ) . ": $1\n";
    pos( $str ) = $-[0] + 1;    # slide pos back to the left
}

prints:

14: AGGGA
21: TACAT
25: GTTG
55: GAAAAAAAG
...etc...

I'm not sure how it would compare with the two-stage approach for speed, though. It is much faster if you make the quantifier non-greedy (`..+?`), but then you end up with the shortest palindrome at each position, rather than the longest. I make the assumption that you don't care about palindromes shorter than 4 characters. If you raise that minimum, things get faster.

Update: I tested, and it looks like the `substr` approach is considerably faster, particularly if you do it in a single pass.
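For comparison, here is one possible `substr`-based counterpart to the regex: for each start offset it tries the longest candidate first and stops at the first palindrome, reporting the end position the way `pos()` would. This is only a sketch and not necessarily the code that was timed in the update; the name `longest_at_each_start`, the minimum length of 4, and the sample string are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): for each start offset,
# keep only the longest palindrome of at least $min characters that
# begins there. Returns [end_position, substring] pairs in offset order.
sub longest_at_each_start {
    my ( $str, $min ) = @_;
    my $n = length $str;
    my @found;
    for my $off ( 0 .. $n - $min ) {
        for ( my $len = $n - $off; $len >= $min; $len-- ) {
            my $sub = substr $str, $off, $len;
            next unless $sub eq scalar reverse $sub;
            push @found, [ $off + $len, $sub ];    # end position, like pos()
            last;                                  # longest found; next offset
        }
    }
    return @found;
}

# Short made-up sample instead of the FASTA data
print "$_->[0]: $_->[1]\n" for longest_at_each_start( 'TTAGGGATACAT', 4 );
```

Raising `$min` shortens the inner loop in the same way that raising the minimum length speeds up the regex.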