Re: Problem with <> and regex
by golux (Pilgrim) on Mar 11, 2014 at 15:15 UTC
Hi luxlunae, The "<*" part of your regex means "match any number of less-than characters (<), including zero". So the whole thing will get rid of any run of "<" (possibly empty) that is immediately followed by a single ">", which also means a bare ">" on its own gets removed.
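A minimal sketch to see this behaviour for yourself. The sample string is made up for illustration, and $words is only assumed to be the variable holding the line:

```perl
use strict;
use warnings;

# Hypothetical sample line (not taken from the original question).
my $words = 'x <b>y</b> <<>z';

# "<*>" = zero or more "<" followed by exactly one ">".
# Copy first so $words itself is left untouched.
(my $out = $words) =~ s/<*>//g;

print "$out\n";   # prints "x <by</b z"
```

Only the ">" characters (and the "<<>" run) disappear; the opening "<" of each tag is left behind because it is not directly followed by ">".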
Closer (though still not correct) is the pattern "<.*>", which means "get rid of "<" and ">" and anything between". The reason it's still not correct is that ".*" is greedy: it will delete multiple <...> ... <...> from the line in one go, including the text between them (try it and see). That is to say, on a line with several tags it matches (and deletes) everything from the first "<" to the last ">".
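To try the greedy behaviour, here is a small sketch using the pattern "<.*>"; the sample line is invented, and $words is an assumed variable name:

```perl
use strict;
use warnings;

# Hypothetical sample line with two pairs of tags.
my $words = 'one <b>two</b> three <i>four</i> five';

# Greedy ".*" runs from the first "<" all the way to the last ">".
(my $out = $words) =~ s/<.*>//g;

print "$out\n";   # prints "one  five" (note the two spaces)
```

Everything between the first and last angle bracket is gone, including "two", "three" and "four".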
A real solution would be the pattern "<[^>]+>", where the "[^>]+" part means "1 or more of any character except greater-than (>)". That regex should therefore get rid of all occurrences of <...> in the line, without removing the non-tag text in between.
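A minimal sketch of the negated character class in action, again with a made-up sample line and an assumed variable name $words:

```perl
use strict;
use warnings;

# Hypothetical sample line with two pairs of tags.
my $words = 'one <b>two</b> three <i>four</i> five';

# "[^>]+" cannot run past a ">", so each match is a single <...> tag.
(my $out = $words) =~ s/<[^>]+>//g;

print "$out\n";   # prints "one two three four five"
```

Because the class excludes ">", the match has to stop at the first closing bracket, and the /g flag then repeats the substitution for every remaining tag.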
Edit: it's worth pointing out that another solution would be to use the "non-greedy" modifier "?" in the "still not correct" example I gave above, turning ".*" into ".*?", which would have the effect of matching the shortest possible "<...>" each time, and thus avoid swallowing multiple pairs.
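The non-greedy variant can be sketched the same way. The sample strings are invented for illustration, and the second case is only there to show one situation where the two working patterns differ:

```perl
use strict;
use warnings;

# Hypothetical sample line with two pairs of tags.
my $words = 'one <b>two</b> three <i>four</i> five';

# ".*?" stops at the first ">" it can reach, so each tag is matched on its own.
(my $out = $words) =~ s/<.*?>//g;
print "$out\n";   # prints "one two three four five"

# Hypothetical tag that spans a line break.
my $split = "a <b\nclass='x'>c";
(my $lazy  = $split) =~ s/<.*?>//g;     # "." does not match "\n" without /s
(my $class = $split) =~ s/<[^>]+>//g;   # "[^>]" does match "\n"
```

On a single line the two approaches give the same result. When a tag contains a newline, "<.*?>" leaves it alone unless the /s modifier is added, while "<[^>]+>" still removes it.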
Edit 2: fixed misspelling of "$word" to "$words".