http://www.perlmonks.org?node_id=544944

fraktalisman has asked for the wisdom of the Perl Monks concerning the following question:

I want to modify a string of HTML code, using a simple regex, so that every link (<a ...) whose href starts with "http" is given a target="_blank" attribute, unless it already has a target.

The regex I've come up with so far produces the desired output, but it is awfully slow and far from elegant. It looks for links, extracts the href and the remaining attributes, and checks that the remaining parts don't already contain the word target:

$test=~s/<a ([^t|>]*?[^a|>]*?[^r|>]*?[^g|>]*?[^e|>]*?[^t|>]*?)href=(" +??http:[^"]*?" ??)([^t|>]*?[^a|>]*?[^r|>]*?[^g|>]*?[^e|>]*?[^t|>]*?)> +/<a $1 href=$2$3 target="_blank">/gosi;

I want to improve my regex, but I wonder how. What I couldn't find anywhere in the documentation or tutorials is how to match anything (like .*?) unless it contains a certain expression (target). Maybe I'd be better off using HTML::Parser in this case, but I'd still like to know how the expression could be optimized.
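One way to express "match anything unless it contains target" is a negative lookahead, (?!...). A sequence such as [^t|>]*?[^a|>]*? does not exclude the word target: each bracket is a character class that excludes single characters (here t, | and >), not a word. The sketch below is only an illustration under some assumptions: the sub name add_blank_target is made up for this example, href values are assumed to be double-quoted, and any target= appearing anywhere inside the tag (even inside a URL's query string) counts as an existing target.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): adds target="_blank"
# to <a> tags whose href starts with "http" and that have no target yet.
sub add_blank_target {
    my ($html) = @_;
    $html =~ s{
        <a\b                      # start of an anchor tag (not <abbr etc.)
        (?![^>]*\btarget\s*=)     # negative lookahead: no target= inside this tag
        (                         # capture all attributes of the tag
            [^>]*\bhref\s*=\s*"\s*http[^"]*"   # a double-quoted href starting with http
            [^>]*                 # any remaining attributes
        )
        >                         # end of the opening tag
    }{<a$1 target="_blank">}gxi;
    return $html;
}

print add_blank_target('<a href="http://example.com">x</a>'), "\n";
```

The lookahead is checked once, right after <a, and [^>]* keeps it from running past the end of the tag, so the pattern should need far less backtracking than a chain of lazy quantifiers. For HTML that is not under your control, a real parser such as HTML::Parser is likely to be more robust than any regex.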