http://www.perlmonks.org?node_id=110464


in reply to how-to strip empty HTML tags like <b> </b>

The usual answer is to use HTML::TokeParser if you want a reliable way to parse HTML. For this *particular* task, though, a well-constructed regex should suffice. The one below strips every <b> </b> pair that is empty, meaning it contains nothing but whitespace and/or &nbsp; entities. I believe it covers all bases.

my $text = join '', <DATA>;
print $text;

# Remove <b>...</b> pairs that contain only whitespace and/or &nbsp; entities.
# \s already matches newlines; /i handles <B>, /g removes every occurrence.
$text =~ s#<\s*b\s*>(?:\s|&nbsp;)*<\s*/\s*b\s*>##ig;
print $text;

__DATA__
<p>test
<p>test<b></b>
<p>test<b >&nbsp; </b>
<p>test<b> </b>
<p>test<b ></b >
<p>test<B></B>
<p>test<B> </B>
<p>test<B ></B>
<p>test<B > </B >
<p>test<B> &nbsp; &nbsp; </B>
<p><b>I am not empty!</b>

Re: Answer: how-to strip empty HTML tags like b /b
by Hofmator (Curate) on Sep 06, 2001 at 16:33 UTC

    I wouldn't strip the non-breaking spaces. Removing the bold tags is probably fine, but removing the &nbsp; can mess up the layout of a table.

    Just my euro 0.02

    -- Hofmator
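
Hofmator's suggestion can be sketched as a small variation on the regex from the original post: capture whatever sits between the empty tags and put it back, so only the <b> and </b> themselves disappear. This is just a sketch, not something posted in the thread; the helper name unwrap_empty_bold and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): drop empty <b>...</b> pairs
# but keep the whitespace or &nbsp; entities that sat between the tags.
sub unwrap_empty_bold {
    my ($html) = @_;
    # Same pattern as the original post, except the filler is captured
    # in $1 and substituted back instead of being deleted.
    $html =~ s{<\s*b\s*>((?:\s|&nbsp;)*)<\s*/\s*b\s*>}{$1}ig;
    return $html;
}

# Illustrative input: a table cell padded with &nbsp; inside an empty <b>.
my $sample = '<td><b>&nbsp;</b></td><td><b>I am not empty!</b></td>';
print unwrap_empty_bold($sample), "\n";
```

The &nbsp; in the first cell survives, so the cell still has content, while the non-empty bold text in the second cell is left alone.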