How can I do this action WITHOUT split?

by Anonymous Monk
on Feb 22, 2014 at 16:45 UTC
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow Monks!
I am trying to optimize some code that I have, which is supposed to do the following:
If you have 2 strings, like the following:
$str_no_dash=' IIIIIIIIIIIIIIIIIIIIIIIIIIIMMMMMMMOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOMMMMMM +MMMIIIMMMMMMMMMMMOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOMMMMMMMMMI +IIMMMMMMMMMOOOOMMMMMMMMMIIMMMMMMMMMOOOOOOOOOOOOOOOOOMMMMMMMMMIIIIMMMM +MMMMMOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOMMMMMMMIIIIIMMM +MMMMMMMMOOOOOOOOOOOOOOOOOOOMMMMMMMMMMMIIIIIMMMMMMMMMMMOOOOOOOOOOOO'; $str_with_dash='VNRV-L-K--R-------PL------A-------N---------PV----N--- +--D----V--G---V--T-----L-----G------T-------------------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +---RG----E--R----RG--E-F-----D-L--G----WN----A----N-------D---------- +----------A-------AR-----F---RM----T----G-----A-----A--E-N----S-N-SF- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +-----------------------------------------------R--------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +------------------------------------DQF--Q-----L-N-R-Q---A----IA--P-- +S----------------AQ--F---------K-------L----D---R----------D--------T +--V--L--------N--V-------E--FDY-LHD---------------------------------- +--------------------------------------------------------------------- +------------------------------------------------------R--R---T--S---D +--Q--GI-------------------------------------P--AYR------------------- +------------------GRPV---------------DVP-IN----TY-Y-GSAD------------- +----------------------------------------------------------------GVNSS +YNDV-S----A-K--SA-----------------T-V--------S----L-------------D---- +-H-----R---------------F----N-----------D-S-L-S-F------------------HG 
+--A------------------------------------------------------------------ +-------------------------------------------I--R--AY------------------ +--------DF--SL------ER----------------------------------------------- +-----K-N---Y-V-----T----Y----E--P-I---K-TA--------------------------- +--------------------------------------------------------------------- +-----------------------------------------------AHP--------------VV--T +--L--D-Q--S--T------------------------------------------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +RQRTDH-----G--I-D---G--L--F--EL----------------------------------T--- +-QK---TS-----------LFGM---------------------------------------------- +--------------------------RH--E----------L---L--------Y------G--L---- +---E--LS--Q----Q--Q-K------------------------------------------------ +--------------------------------------------------------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +---------------------F--DTIY--------S---------------V-S-------------- +---------------------------------------------------------------K---VA +-TYDLFNP----------------------------------QPVVLPG--VP----------TGT--- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +----------RA----K-----------------------TNASTVVG-LA-----G--V-Y----A-- +------Q-------D--L--IS-----------------------L---------T---E--H------ +-------W---K-----V---L--A---G----------------------------L----------- +--R---F-------------------------------------DY---------------------L- +-------------------------------NQ-------------I---------------------- +RHDYTSSNV------------------------------------------------------------ 
+--------------------------------------------------------------------- +--------------------------------------------------------------------- +--------------------------------------------------------------------- +--------------------------NLDRTDHAW----SP-------R---V---------------- +-------------------G--LI------Y-------------------------------E--P--- +----L---DW----------------------------------L--TL--Y----G--SFSQ------ +-----S-F------S-------------------------------------P-L--A----------- +--D----T----L-------I-S--SG------------A----------------------------- +---------------------------------------------------------------';

to create a new string, which will have the same letters as $str_no_dash, but with - in the same positions as $str_with_dash.
The thing is that I can do it with split, but I think there could be a faster solution using a for loop... Can you show me how?
@final=();
@split_str_no_dash   = split(//, $str_no_dash);
@split_str_with_dash = split(//, $str_with_dash);
$count_pos  = -1;
$count_pred = -1;
@str_final  = ();
foreach $b (@split_str_with_dash) {
    if ($b =~ /\./ or $b =~ /-/) {
        # gap character: copy it through unchanged
        $count_pos++;
        $str_final[$count_pos] = $b;
    }
    else {
        # any other character: take the next letter of $str_no_dash
        $count_pos++;
        $count_pred++;
        $str_final[$count_pos] = $split_str_no_dash[$count_pred];
    }
}
$final_string = join("", @str_final);
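To make the goal concrete, here is a minimal sketch of the same split-based idea on a toy input. The short strings and the name `project_labels` are hypothetical, chosen only for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Split-based approach, tidied: walk the gapped string and take the
# next label whenever the current character is not a gap.
sub project_labels {
    my ($labels, $aligned) = @_;
    my @label_chars = split //, $labels;
    my $next = 0;
    my @out;
    for my $char (split //, $aligned) {
        if ($char eq '-' or $char eq '.') {
            push @out, $char;                   # keep the gap as-is
        }
        else {
            push @out, $label_chars[$next++];   # consume one label
        }
    }
    return join '', @out;
}

# Toy data: 5 labels, 5 non-gap characters in the gapped string.
print project_labels('IIMMO', 'A-C--DE-F'), "\n";   # prints I-I--MM-O
```

Each non-gap character is replaced by the next label in order, while the gaps stay where they were.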

Replies are listed 'Best First'.
Re: How can I do this action WITHOUT split?
by dave_the_m (Prior) on Feb 22, 2014 at 17:14 UTC
    my $chars = $str_no_dash;
    my $final_string = $str_with_dash;
    $final_string =~ s{[^-]}{substr($chars,0,1,'')}ge;


      Beat me to it!

      my $r = reverse $str_no_dash;
      $str_with_dash =~ s/[^-]/chop $r/ge;

      Chapeau! I always forget that you can put a function call in the replacement part of the substitution operator.
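As a small illustration of that point: with the /e modifier, the replacement part of s/// is evaluated as Perl code for every match, so it can call any function. The sketch below uses a closure as the label source; the name `project_with_subst` and the toy strings are assumptions made for the example.

```perl
use strict;
use warnings;

sub project_with_subst {
    my ($labels, $aligned) = @_;

    # A closure that hands out one label per call.
    my $pos = 0;
    my $next_label = sub { substr $labels, $pos++, 1 };

    # /e evaluates the replacement as code; /g repeats it for every
    # non-dash character. The copy in $out keeps $aligned untouched.
    (my $out = $aligned) =~ s/[^-]/$next_label->()/ge;
    return $out;
}

print project_with_subst('IIMMO', 'A-C--DE-F'), "\n";   # prints I-I--MM-O
```

Unlike the `substr(..., 0, 1, '')` and `chop` variants, this version does not consume the label string; it only advances an index.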


Re: How can I do this action WITHOUT split?
by McA (Priest) on Feb 22, 2014 at 17:10 UTC


    In C this solution would also work and would be MUCH faster:

    my $source_pos = 0;
    my $end_string = $str_with_dash;
    for (my $i = 0; $i < length $str_with_dash; $i++) {
        next if substr($str_with_dash, $i, 1) eq '-';
        substr($end_string, $i, 1) = substr($str_no_dash, $source_pos, 1);
        $source_pos++;
    }
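Which variant is fastest in Perl is best measured rather than guessed. The following is a rough sketch using the core Benchmark module; the test data and the sub names (`via_split`, `via_subst`, `via_substr`) are hypothetical, and results will vary with Perl version and input size.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a toy gapped string repeated to a larger size.
my $labels  = 'IIMMO'     x 1000;
my $aligned = 'A-C--DE-F' x 1000;

sub via_split {
    my @l = split //, $labels;
    my $i = 0;
    return join '', map { $_ eq '-' ? '-' : $l[$i++] } split //, $aligned;
}

sub via_subst {
    my $i = 0;
    (my $out = $aligned) =~ s/[^-]/substr $labels, $i++, 1/ge;
    return $out;
}

sub via_substr {
    my $out = $aligned;
    my $src = 0;
    for my $i (0 .. length($aligned) - 1) {
        next if substr($aligned, $i, 1) eq '-';
        substr($out, $i, 1) = substr($labels, $src++, 1);
    }
    return $out;
}

# All three must agree before timing them.
die "results differ"
    unless via_split() eq via_subst() and via_subst() eq via_substr();

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    split  => \&via_split,
    subst  => \&via_subst,
    substr => \&via_substr,
});
```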

    Best regards

Re: How can I do this action WITHOUT split?
by ww (Archbishop) on Feb 22, 2014 at 18:15 UTC

    Do you recognize that you'll lose specificity (detail) by doing what you specify?

    Since $str_no_dash has a charset of just three alphabetic characters, while $str_with_dash has at least 5 times as many distinct alphas (I didn't use a regex to count for me and I ran out of fingers...), the processing you specify is a one-way system. That means there'll be no way to pass on all the detail in $str_with_dash, nor to recover all that detail from the result of the processing.
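    For anyone who would rather not count on their fingers, a short sketch that counts the distinct letters in a string; the sub name `distinct_letters` and the sample strings are illustrative assumptions.

```perl
use strict;
use warnings;

# Return the number of distinct ASCII letters in a string.
sub distinct_letters {
    my ($string) = @_;
    my %seen;
    $seen{$_}++ for $string =~ /[A-Za-z]/g;   # list-context match: every letter
    return scalar keys %seen;
}

print distinct_letters('IIMMOOMI'), "\n";          # prints 3
print distinct_letters('VNRV-L-K--R--PL'), "\n";   # prints 6
```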

    Come, let us reason together: Spirit of the Monastery
      No, this was not the question ww...
      The sequence with the 3 letters is the output of a prediction algorithm, which has 3 labels only (in this case I, M and O).
      The sequence with the hyphens is the same sequence but refers to the amino-acid sequence instead (so yes, there are 20 letters in that case).
      If you remove the hyphens, you will see that both sequences are of the same length, so what I wanted is to replace each letter of the amino-acid sequence with the respective predicted label of it.
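      The equal-length assumption is cheap to verify before doing the replacement. A minimal sketch, with hypothetical helper names `residue_count` and `lengths_match`:

```perl
use strict;
use warnings;

# Number of non-hyphen characters in the gapped sequence.
sub residue_count {
    my ($aligned) = @_;
    my $gaps = ($aligned =~ tr/-//);   # tr with an empty replacement only counts
    return length($aligned) - $gaps;
}

# True when there is exactly one label per non-hyphen character.
sub lengths_match {
    my ($labels, $aligned) = @_;
    return length($labels) == residue_count($aligned);
}

print lengths_match('IIMMO', 'A-C--DE-F') ? "ok\n" : "length mismatch\n";   # prints ok
```

      Note that this counts every character of the label string, so stray whitespace in it would show up as a mismatch.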
      Thanks to everyone for answering, now my code runs much faster!
Re: How can I do this action WITHOUT split?
by jwkrahn (Monsignor) on Feb 23, 2014 at 10:36 UTC
    ( my $final_string = $str_with_dash ) =~ s| ( [^.-] ) | $str_no_dash =~ / ( . ) /xsg && $1 |xeg;
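    A brief sketch of why this works, as far as I can tell: a match with /g in scalar context remembers where it stopped (see pos), so each evaluation of the replacement picks up the next character of the label string. The sub name `project_with_match` and the toy strings below are assumptions for illustration.

```perl
use strict;
use warnings;

# Scalar-context //g walks the string one match at a time.
my $demo = 'IMO';
while ($demo =~ /(.)/sg) {
    print "$1 found, next match starts at ", pos($demo), "\n";
}

sub project_with_match {
    # $labels is a fresh copy here, so its match position starts unset.
    my ($labels, $aligned) = @_;
    (my $out = $aligned) =~ s{([^.-])}{ $labels =~ /(.)/sg && $1 }eg;
    return $out;
}

print project_with_match('IIMMO', 'A-C--DE-F'), "\n";   # prints I-I--MM-O
```

    One thing to keep in mind: the position is stored on the label string itself, so an earlier //g match on the same variable could make the walk start somewhere other than the beginning.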

