Re: strip out anything inbetween brackets

The naive approach would be

$string =~ s/\(.*\)//;

Which would do the trick in this particular case, but would convert "this is a (blah) and this is not a (blah)" into "this is a ", because the greedy .* runs all the way to the last closing bracket. That is why you should use a non-greedy quantifier:

$string =~ s/\(.*?\)//;

This does the trick...
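
To make the difference concrete, here is a small sketch that runs both substitutions on the same input. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $input = "this is a (blah) and this is not a (blah)";

# Greedy: .* runs to the LAST closing bracket, so everything in between goes too.
(my $greedy = $input) =~ s/\(.*\)//;

# Non-greedy: .*? stops at the FIRST closing bracket it can reach.
(my $lazy = $input) =~ s/\(.*?\)//;

print "greedy: '$greedy'\n";   # greedy: 'this is a '
print "lazy:   '$lazy'\n";     # lazy:   'this is a  and this is not a (blah)'
```

Note that without /g the non-greedy version only removes the first group.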

Don't forget, however, to use the /g switch (for global substitutions). Also, your example has the result as being "this is a" (notice there's no space after the a...)

If that's what you want, you just need to include \s* on both ends of your regular expression...

OTOH, that would turn "this is a (blah) bleh" into "this is ableh", which is probably not what you want... O:-)
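
A quick sketch of the three behaviours, on a made-up input. The last variant (replace with a single space, then trim) is only one possible compromise and an assumption on my part, not something the original question asked for.

```perl
use strict;
use warnings;

my $input = "this is a (blah) bleh and a (blah)";

# /g removes every bracketed group, but leaves the surrounding whitespace.
(my $global = $input) =~ s/\(.*?\)//g;

# \s* on both ends also eats the whitespace, gluing neighbouring words together.
(my $glued = $input) =~ s/\s*\(.*?\)\s*//g;

# Possible compromise (an assumption): collapse each group and its
# surrounding whitespace to one space, then trim both ends.
(my $spaced = $input) =~ s/\s*\(.*?\)\s*/ /g;
$spaced =~ s/^\s+|\s+$//g;

print "global: '$global'\n";   # global: 'this is a  bleh and a '
print "glued:  '$glued'\n";    # glued:  'this is ableh and a'
print "spaced: '$spaced'\n";   # spaced: 'this is a bleh and a'
```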

Re^2: strip out anything inbetween brackets by reasonablekeith (Deacon) on Apr 05, 2005 at 14:51 UTC
Well, I was going to post to say you should really be using a negated character class, rather than having all that backtracking going on. I was pretty sure it'd be faster, and it's what I would normally do when coding regexes like this. I did a quick benchmark first, and it turns out I was wrong: the negated character class gets relatively less efficient the longer the data it has to scoop up. In the run below it managed roughly a third of the rate of the non-greedy version.

    use strict;
    use Benchmark qw(:all);

    my $count = 50000;
    my $replacement_string = "this is a (" . "a" x 1000 . ") test";

    cmpthese($count, {
        'negated' => sub {
            my $text = $replacement_string;
            $text =~ s|\([^)]*\)||sg;
        },
        'backtrack' => sub {
            my $text = $replacement_string;
            $text =~ s|\(.*?\)||sg;
        },
    });

    OUTPUT
                 Rate   negated backtrack
    negated    8562/s        --      -67%
    backtrack 26316/s      207%        --

I still think there's something to be said for the character class, as it is more explicit (after all, we are trying to match anything other than the closing bracket), but it is certainly slower. This surprised me, so I thought I'd post it, in case it surprised anyone else.
Re^2: strip out anything inbetween brackets by jhourcle (Prior) on Apr 05, 2005 at 14:48 UTC
The second one will work provided that you don't have nested parens: `This is a ((very important) blah)` If there's a possibility of that sort of thing happening, you'll probably want to look at Pustular Postulant's recommendation, and not use a regex. (I don't know that exact module, so I can't say whether it will handle this or whether you need to look for something else.) I've typically run into this problem with SGML, so I used a parser written specifically for HTML or XML. I don't know if there's something that does nested braces and the like.
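
To show what goes wrong with nesting, here is a small sketch. The loop at the end is not from this thread; it is one simple workaround, assuming the brackets are balanced, that repeatedly strips the innermost group (one with no brackets inside it) until nothing is left to strip. Unbalanced brackets are left in place.

```perl
use strict;
use warnings;

my $input = "This is a ((very important) blah)";

# The non-greedy pattern stops at the first ")", leaving a stray " blah)".
(my $flat = $input) =~ s/\(.*?\)//g;

# Workaround sketch (an assumption, not a module or method named above):
# strip innermost groups until no substitution happens.
my $nested = $input;
1 while $nested =~ s/\([^()]*\)//g;

print "flat:   '$flat'\n";     # flat:   'This is a  blah)'
print "nested: '$nested'\n";   # nested: 'This is a '
```

For anything more complicated than plain balanced brackets, a real parser is still the safer choice.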
Re^2: strip out anything inbetween brackets by Anonymous Monk on Apr 05, 2005 at 14:19 UTC
Perfect, thank you for your help.
Re^3: strip out anything inbetween brackets by cog (Parson) on Apr 05, 2005 at 14:24 UTC
You are welcome, Anonymous Monk. You should also consider creating an account on this site :-)


	PerlMonks