http://www.perlmonks.org?node_id=1005373

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hey there,
I have 2 strings, with one being a substring of the other, like:
$big_string="MAAAAATLRGAMVGPRGAGLPGARARGLLCGARPGQLPLRTPQAVSLSSKSGLSRGRKVILSALGMLAAGGAGLAVALHSAVSASDLELHPPSYPWSHRGLLSSLDHTSIRRGFQVYKQVCSSCHSMDYVAYRHLVGVCYTEDEAKALAEEVEVQDGPNEDGEMFMRPGKLSDYFPKPYPNPEAARAANNGALPPDLSYIVRARHGGEDYVFSLLTGYCEPPTGVSLREGLYFNPYFPGQAIGMAPPIYNEVLEFDDGTPATMSQVAKDVCTFLRWAAEPEHDHRKRMGLKMLLMMGLLLPLVYAMKRHKWSVLKSRKLAYRPPK";
$small_string="SDLELHPPSYPWSHRGLLSSLDHTSIRRGFQVYKQVCSSCHSMDYVAYRHLVGVCYTEDEAKALAEEVEVQDGPNEDGEMFMRPGKLSDYFPKPYPNPEAARAANNGALPPDLSYIVRARHGGEDYVFSLLTGYCEPPTGVSLREGLYFNPYFPGQAIGMAPPIYNEVLEFDDGTPATMSQVAKDVCTFLRWAAEPEHDHRKRMGLKMLLMMGLLLPLVYAMKRHKWSVLKSRKLAYRPPK";
For the small string I have a label string:
$label_for_small="OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO +OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO +OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO +OOOOOOOOOOOOOOOMMMMMMMMMMMMMMMMMMIIIIIIIIIIIIIIIIII";

What I would like to do is to first identify the "small" substring into the "big" one, and then add, respectively, 'I's and 'O's (or the opposite) in the start and end positions. I'm thinking to use index, would this be efficient solution? And how would I know how many 'O's and 'I's must I add in order for the "small" substring to grow into the length of the "big" one?
Thanks!

Re: indexing in strings
by roboticus (Chancellor) on Nov 24, 2012 at 14:02 UTC

    Yes, index looks like a fine way to find the small string inside the big one. I don't know the rule for adding 'O' and 'I' to your string, but index tells you how many characters from the front the small string is located, so given that value and your rule for adding 'I' and 'O', it should be straightforward. Don't forget that a simple way to generate a repeated string of characters is the x operator:

    my $text     = 'The quick red fox jumped over the lazy brown dog';
    my $search   = 'quick';
    my $location = index($text, $search);    # -1 when $search is absent
    if ($location < 0) {
        print "$search not found!\n";
    }
    else {
        # "!" x $location repeats "!" once per character of offset
        print "$search found $location characters from the front"
            . ("!" x $location), "\n";
    }

    (Note: totally untested)
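
    A minimal sketch of the same idea, for readers who want to see how index behaves at the edges. The helper name find_offset is hypothetical (it is not from the thread); it only wraps index so that "not found" comes back as undef instead of -1, and the loop uses the x operator to draw a marker under the match position.

```perl
use strict;
use warnings;

# find_offset is a hypothetical helper name, not part of the original post.
# It returns the zero-based character offset of $needle inside $haystack,
# or undef when index() reports -1 (needle not present).
sub find_offset {
    my ($haystack, $needle) = @_;
    my $pos = index($haystack, $needle);
    return $pos < 0 ? undef : $pos;
}

my $text = 'The quick red fox jumped over the lazy brown dog';

for my $search ('quick', 'The', 'cat') {
    my $pos = find_offset($text, $search);
    if (defined $pos) {
        # '-' x $pos repeats '-' $pos times (an empty string when $pos is 0)
        print "'$search' at offset $pos: ", '-' x $pos, "^\n";
    }
    else {
        print "'$search' not found\n";
    }
}
```

    Note that an offset of 0 is a valid match at the very start of the string, which is why the test is "less than zero" (or defined-ness here) rather than plain truthiness.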

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Thanks so much, I just started looking at it :)
Re: indexing in strings
by BrowserUk (Patriarch) on Nov 24, 2012 at 14:38 UTC
    I have 2 strings, with one being a substring of the other, like: $big_string=... & $small_string=... For the small string I have a label string: $label_for_small=... What I would like to do is to first identify the "small" substring into the "big" one, and then add, respectively, 'I's and 'O's (or the opposite) in the start and end positions.

    Add Is & Os ... to what? The big string? The small string? The label string? A post-it on the back of your keyboard?

    I'm thinking to use index, would this be efficient solution?

    Yes. index locates strings within strings efficiently.

    But your description of what you then want to do is about as clear as mud.

    And how would I know how many 'O's and 'I's must I add in order for the "small" substring to grow into the length of the "big" one?

    So you want to grow the small string to be the same length as the big one? If so, what is the label for?

    Your label seems to contain the Is & Os you mention, but as the label is shorter than the big string, and you have many of each, they do not represent the start and end positions of the matched substring? And what are the Ms for?

    Try asking your question again, this time imagine you are explaining it to someone with no knowledge of computers.

    You might also like to give an example of (much shorter!) inputs and the required output(s).

    A picture is worth a thousand words.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

Re: indexing in strings
by AnomalousMonk (Archbishop) on Nov 24, 2012 at 15:59 UTC

    As others have remarked, I find the problem statement confusing. (In particular, these tired, old optics would have appreciated less eye-bezoggling example strings!) However, a couple of code examples that might help to define the problem:

    >perl -wMstrict -le
    "my $big = 'xyzzyABCDfoobar';
     my $small = 'ABCD';
     ;;
     (my $padded = $big) =~
         s{ \A (.*?) (\Q$small\E) (.*) }
          { 'o' x length($1) . $2 . 'i' x length($3) }xmse
         or die 'small substring not found';
     print qq{'$big'};
     print qq{'$padded'};
     ;;
     $padded = '';
     my $is = index $big, $small;
     die 'small substring not found' if $is < 0;
     my $op = $is;
     my $ip = length($big) - length($small) - $op;
     $padded = 'o' x $op . $small . 'i' x $ip;
     print qq{'$big'};
     print qq{'$padded'};
    "
    'xyzzyABCDfoobar'
    'oooooABCDiiiiii'
    'xyzzyABCDfoobar'
    'oooooABCDiiiiii'

    Update: Added die on substitution failure in substitution example.
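
    The index-based variant can also be applied to the label string rather than to the small string itself. This is only a sketch under an assumption the original question never confirms: that the label should be extended to the length of the big string by putting one character in front (for every position before the match) and another behind (for every position after it). The sub name extend_label and the short sample label 'OOMI' are hypothetical.

```perl
use strict;
use warnings;

# extend_label is a hypothetical helper, not from the thread.
# ASSUMPTION: positions of $big before the match are labelled with $lead,
# positions after the match with $trail. Swap the two arguments if the
# real rule is the opposite.
sub extend_label {
    my ($big, $small, $label, $lead, $trail) = @_;

    # The label is expected to annotate $small character by character.
    die "label and small string differ in length\n"
        unless length($label) == length($small);

    my $start = index($big, $small);
    die "small string not found in big string\n" if $start < 0;

    # Characters left over after the match ends.
    my $after = length($big) - $start - length($small);

    return ($lead x $start) . $label . ($trail x $after);
}

my $big   = 'xyzzyABCDfoobar';
my $small = 'ABCD';
my $label = 'OOMI';    # made-up label, one character per character of $small

my $full = extend_label($big, $small, $label, 'I', 'O');
print "$big\n$full\n";
```

    The number of characters to add in front is simply the value index returns, and the number to add behind is whatever remains of the big string once the offset and the small string's length are subtracted, so the result always has the same length as the big string.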

Re: indexing in strings
by LanX (Saint) on Nov 24, 2012 at 15:31 UTC
    The first step for a good program is a clear task description!

    > ...and then add, respectively, 'I's and 'O's (or the opposite) in the start and end positions.

    ??? Please clarify!

    Cheers Rolf