PerlMonks  

Re: Bracketing Substring(s) in the String

by sgifford (Prior)
on Aug 25, 2005 at 16:48 UTC ( [id://486627]=note )


in reply to Bracketing Substring(s) in the String

The problem is that once you've added brackets to the string, index no longer finds substrings that the brackets now interrupt. Regular expressions can deal with this more easily. This code builds a pattern that allows an optional bracket between letters to find the substrings, then it "de-nests" the brackets to get the results you give above.
    sub put_bracket {
        my ($str, $ar) = @_;
        foreach my $subs (@$ar) {
            # Construct a regexp with [\[\]]? between all the letters
            my $newsub = join('[\[\]]?', split(//, $subs));
            $str =~ s/($newsub)/[$1]/g;
        }

        # Now de-nest the brackets in the string
        my $depth  = 0;
        my $newstr = '';
        foreach my $c (split(//, $str)) {
            if ($c eq "\[") {
                $newstr .= $c if ($depth++ == 0);
            }
            elsif ($c eq "\]") {
                $newstr .= $c if (--$depth == 0);
            }
            else {
                $newstr .= $c;
            }
        }
        print "$newstr\n";
        return;
    }
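To make the pattern construction concrete, here is a minimal sketch of what the join in put_bracket produces for one substring and how it matches across a bracket left by an earlier pass. The sample values 'TAC' and 'AC[GT]AC' are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Same construction as in put_bracket: allow one optional bracket between letters.
my $subs    = 'TAC';    # illustrative substring, not from the thread
my $pattern = join('[\[\]]?', split(//, $subs));
print "$pattern\n";     # T[\[\]]?A[\[\]]?C

# A string in which an earlier pass has already bracketed "GT"
my $str = 'AC[GT]AC';
$str =~ s/($pattern)/[$1]/g;
print "$str\n";         # AC[G[T]AC]
```

The nested result is what the de-nesting loop then collapses: only the outermost opening and closing brackets are kept, giving AC[GTAC].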

Replies are listed 'Best First'.
Re^2: Bracketing Substring(s) in the String
by monkfan (Curate) on Oct 28, 2005 at 07:16 UTC
    Dear sgifford,
    First of all, I want to apologize for having to come back to you with this question after some time.

    Since your solution above is so important to me, I need to turn to you for this.
    I truly don't know how to go about it. I hope you won't mind.

    Your code above provides correct solutions in 99% of cases, except for the following one.
    Example 1:
    # Given:
    my $s5 = 'CTGGGTATGGGT';
    my @a5 = qw(GTATG TGGGT);
    Your code above returns
    C[TGGGTATGGGT]
    instead of the correct one:
    CTGG[GTATGGGT]
    The explanation is as follows: TGGGT occurs twice in $s5.
    $s5 = "CTGGGTATGGGT";
            TGGGT
               GTATG    --|
                  TGGGT --|-- Only these two satisfy,
                              since they follow the order and delimiters of the given array.
    Now, why is the latter the correct answer? Because in the array @a5 = qw(GTATG TGGGT), the string "TGGGT" comes after "GTATG", so the bracketed region should follow the order of the given array, and its span should be delimited by the array as well. By that I mean the bracketed regions, whether disjoint or overlapping, should always start with the first element of the array and end with the last element of the array.
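One possible reading of this ordering rule can be sketched as follows. This is not the solution from the thread: the helper name bracket_in_order is hypothetical, and the sketch assumes that each array element is matched at its leftmost occurrence that starts no earlier than the previous element's match, that a missing element is simply skipped, and that overlapping matches share one pair of brackets.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): bracket substrings in array order.
sub bracket_in_order {
    my ($str, $ar) = @_;
    my @spans;       # merged [start, end] index pairs, in left-to-right order
    my $from = 0;    # earliest position at which the next substring may start

    foreach my $subs (@$ar) {
        my $pos = index($str, $subs, $from);
        next if $pos < 0;    # assumption: skip substrings that are not found
        my $end = $pos + length($subs) - 1;
        if (@spans && $pos <= $spans[-1][1]) {
            # Overlaps the previous span: extend it instead of nesting brackets
            $spans[-1][1] = $end if $end > $spans[-1][1];
        }
        else {
            push @spans, [$pos, $end];
        }
        $from = $pos;        # later substrings may overlap this one, not precede it
    }

    # Insert brackets from right to left so that earlier offsets stay valid
    foreach my $span (reverse @spans) {
        substr($str, $span->[1] + 1, 0, ']');
        substr($str, $span->[0],     0, '[');
    }
    return $str;
}

print bracket_in_order('CTGGGTATGGGT', [qw(GTATG TGGGT)]), "\n";    # CTGG[GTATGGGT]
```

Because the search works on the original string with index and records positions, no brackets are present while matching, so the bracket-tolerant regexp is not needed in this variant. Whether the leftmost-occurrence choice is right for every input depends on requirements the post does not fully spell out.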

    Let me give another example; I hope it clarifies.
    I should also state that the length of each string in the array is always fixed. In our examples they are always of length 5.

    Is there a way I can modify your code above so that it can handle such cases? I hope to hear from you again. I'll try not to bother you again after this.

    Update 2: I think I've got the solution. Thanks so much, sgifford, and sorry for the trouble.

    Regards,
    Edward
