PerlMonks  

Re: More efficient way for this pattern match?

by GrandFather (Saint)
on Apr 14, 2015 at 01:40 UTC ( [id://1123347]=note )


in reply to More efficient way for this pattern match?

No, the split function is not a good way to do this trick. A regular expression is a better bet:

#!/usr/bin/perl
use strict;
use warnings;
use File::Copy;

my $str1 =
    'MCCAALAPPMAATVGPESIWLWIGTIGMTLGTLYFVGRGRGVRDRKMQEFYIITIFITTI'
  . 'AAAMYFAMATGFGVT-------------EVMVG----DE---ALTIYWARYADWLFTTPLLLLDLSLLA'
  . 'GANRN----TIATLIG-LDVFMIG---T---GAIAALSST-PGTRIAWWAIST--GALL--ALLYVLVG'
  . 'TLSENARNRAPEVA--SLFGRLRNLVIALWFLYPVVWILGT---EGTFGILP--LYWETAAFMVLDLSA'
  . 'KVGFGVILLQSRSVLERVATPTAAPT';
my $str2 =
    '--OOOOOOOOOOOOOOOOMMMMMMMMMMMMMMMMMMMMMIIIIIIIIIIMMMMMMMMMMM'
  . 'MMMMMMMMMMOOOOO-------------OOOOO----OO---OOOOMMMMMMMMMMMMMMMMMMMMMII'
  . 'IIIII----MMMMMMM-MMMMMMM---M---MMMMMMOOO-OOOOMMMMMMMM--MMMM--MMMMMMMM'
  . 'MMIIIIIIIIIIII--IIIIMMMMMMMMMMMMMMMMMMMMO---OOO-OOOO--OOOMMMMMMMMMMMM'
  . 'MMMMMMMMMIIIIIIIIIIIII----';

while ($str2 =~ /(-+)/g) {
    my ($start, $end) = ($-[0], $+[0]);
    my $matchLen = $end - $start;

    next if substr($str1, $start, $matchLen) =~ /^-+$/;

    my $chIdx = $end == length($str2) ? $start - 1 : $end;
    substr ($str2, $start, $matchLen, substr($str2, $chIdx, 1) x $matchLen);
}

print $str2;

Prints:

OOOOOOOOOOOOOOOOOOMMMMMMMMMMMMMMMMMMMMMIIIIIIIIIIMMMMMMMMMMMMMMMMMMMMM +OOOOO-------------OOOOO----OO---OOOOMMMMMMMMMMMMMMMMMMMMMIIIIIII----M +MMMMMM-MMMMMMM---M---MMMMMMOOO-OOOOMMMMMMMM--MMMM--MMMMMMMMMMIIIIIIII +IIII--IIIIMMMMMMMMMMMMMMMMMMMMO---OOOOOOOO--OOOMMMMMMMMMMMMMMMMMMMMMI +IIIIIIIIIIIIIIII
Perl is the programming world's equivalent of English

Replies are listed 'Best First'.
Re^2: More efficient way for this pattern match?
by ikegami (Patriarch) on Apr 14, 2015 at 03:00 UTC
    Modifying $str2 is resetting pos($str2), so you're doing a lot of unneeded work. Adding pos($str2) = $end; at the end of the loop addresses this issue.
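A minimal sketch of that suggestion, assuming the loop from the parent node is wrapped in a sub. The name fill_gaps and the short sample strings are made up for illustration and are not from the thread. The only functional change is the pos($str2) = $end; line, which lets the next m//g iteration resume after the run just handled instead of rescanning from offset 0.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the loop from the parent node.
sub fill_gaps {
    my ($str1, $str2) = @_;

    while ($str2 =~ /(-+)/g) {
        my ($start, $end) = ($-[0], $+[0]);
        my $matchLen = $end - $start;

        # Leave the run alone where $str1 is all dashes over the same span.
        next if substr($str1, $start, $matchLen) =~ /^-+$/;

        my $chIdx = $end == length($str2) ? $start - 1 : $end;
        substr($str2, $start, $matchLen, substr($str2, $chIdx, 1) x $matchLen);

        # Modifying $str2 reset pos($str2); restore it so the search
        # continues after this run. The replacement has the same length
        # as the run, so $end is still the right offset.
        pos($str2) = $end;
    }

    return $str2;
}

print fill_gaps('ABCDEF--GH', 'OO--MM--II'), "\n";    # OOMMMM--II
```

The output should be the same as without the extra line; the difference is only in how much of the string gets rescanned after each substitution.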
Re^2: More efficient way for this pattern match?
by hdb (Monsignor) on Apr 14, 2015 at 12:08 UTC

    Your script seems to make some additional assumptions that I cannot find in the original question. For example, for

    my $str1='--M--CCA';
    my $str2='-----OOO';

    your script prints OOOOOOOO while I would have thought it should be --O--OOO?

    UPDATE: In order to avoid any confusion, the strings above are NOT part of the original question but examples I constructed assuming they could occur. The purpose of this was to highlight a situation where the proposed script would deliver something that violates the original requirements. Not sure the example is really relevant.

      Good catch. It's a bug. The line:

      next if substr($str1, $start, $matchLen) =~ /^-+$/;

      should be more like (untested):

      next if ! $matchLen || substr($str1, $start, $matchLen) !~ /[^-]/;
      Perl is the programming world's equivalent of English
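Note that the revised test still skips or fills each dash run as a whole, so for hdb's strings it appears to give OOOOOOOO again ('--M--' contains a non-dash, so the entire run is filled). One possible way to get --O--OOO is to decide per character. The sketch below is not from the thread: the sub name fill_gaps_per_char is hypothetical, and it assumes the requirement is the one implied by hdb's example, namely that a dash in $str2 is filled only where $str1 has a non-dash at the same offset, and that both strings have the same length.

```perl
use strict;
use warnings;

# Hypothetical helper: fill a dash in $str2 only where $str1 has a
# non-dash character at the same offset. Assumes equal-length strings.
sub fill_gaps_per_char {
    my ($str1, $str2) = @_;
    my $out = $str2;

    # Scan the unmodified $str2 and write into $out, so pos($str2)
    # is never reset by a modification.
    while ($str2 =~ /-+/g) {
        my ($start, $end) = ($-[0], $+[0]);

        # Fill character: the one after the run, or the one before it
        # when the run reaches the end of the string.
        my $chIdx = $end == length($str2) ? $start - 1 : $end;
        next if $chIdx < 0;    # $str2 is nothing but dashes
        my $fill = substr($str2, $chIdx, 1);

        for my $i ($start .. $end - 1) {
            substr($out, $i, 1) = $fill if substr($str1, $i, 1) ne '-';
        }
    }

    return $out;
}

print fill_gaps_per_char('--M--CCA', '-----OOO'), "\n";    # --O--OOO
```

Where every dash run in $str2 lines up with a span of $str1 that is either all dashes or has no dashes, this should behave like the whole-run version.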
Re^2: More efficient way for this pattern match?
by Anonymous Monk on Apr 14, 2015 at 07:14 UTC
    Thanks so much!
Re^2: More efficient way for this pattern match?
by Anonymous Monk on Apr 14, 2015 at 06:46 UTC

    No, the split function is not a good way to do this trick. A regular expression is a better bet:

    FWIW, the split function, same as the match operator, takes a regular expression ;)
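To illustrate the point: split takes a pattern as its first argument, and when that pattern contains a capturing group the separators are returned along with the fields. A small sketch with a made-up sample string (not from the thread):

```perl
use strict;
use warnings;

my $str = 'OO--MM-III';

# The capturing group makes split return the dash runs as well as
# the text between them.
my @pieces = split /(-+)/, $str;

print join('|', @pieces), "\n";    # OO|--|MM|-|III
```

This gives the runs but not their offsets, which the match-based loop gets directly from @- and @+.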
