how-to strip empty HTML tags like <b> </b>

by russmann (Initiate)

on Sep 06, 2001 at 04:03 UTC

russmann has asked for the wisdom of the Perl Monks concerning the following question (regular expressions):

how-to strip empty HTML tags like <b> </b>

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: how-to strip empty HTML tags like <b> </b> by tachyon (Chancellor) on Sep 06, 2001 at 05:40 UTC
A more complete option is not Perl at all: use HTML Tidy, written by the W3C's Dave Raggett. I just tested it, and it strips empty tags as well as doing many other useful things. Get a free copy at http://www.w3.org/People/Raggett/tidy/ or a Win32 GUI version at http://perso.wanadoo.fr/ablavier/TidyGUI/
Re: how-to strip empty HTML tags like <b> </b> by tachyon (Chancellor) on Sep 06, 2001 at 04:56 UTC
The normal answer would be to use HTML::TokeParser if you want a reliable way to parse HTML. For this particular task a well-constructed regex should suffice. This will strip all the <b> </b> pairs that are empty, including ones that hold only whitespace or &nbsp; entities. I believe it covers all bases.

```perl
# Slurp the test data, show it, strip the empty bold pairs, show it again.
my $text = join '', <DATA>;
print $text;
# Optional whitespace is allowed inside the tags; /i also matches <B>.
$text =~ s#<\s*b\s*>(?:[\s\n]|&nbsp;)*<\s*/\s*b\s*>##ig;
print $text;
__DATA__
<p>test
<p>test<b></b>
<p>test<b >  </b>
<p>test<b> </b>
<p>test<b ></b >
<p>test<B></B>
<p>test<B> </B>
<p>test<B ></B>
<p>test<B > </B >
<p>test<B>     </B>
<p><b>I am not empty!</b>
```
Re: Answer: how-to strip empty HTML tags like <b> </b> by Hofmator (Curate) on Sep 06, 2001 at 16:33 UTC
I wouldn't replace the non-breaking spaces. You could perhaps remove the bold tags, but removing the &nbsp; can mess up the layout of a table. Just my euro 0.02 -- Hofmator
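Hofmator's point can be sketched in Perl as follows. This is only an illustration, not code from the thread: the helper name `unwrap_empty_bold` is hypothetical, and the pattern is a variant of the bold-tag substitution above that captures the whitespace and `&nbsp;` content and puts it back instead of deleting it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove <b>...</b> pairs whose
# content is only whitespace and/or &nbsp; entities, but keep that content
# so table cells that rely on &nbsp; still render.
sub unwrap_empty_bold {
    my ($html) = @_;
    # $1 holds the whitespace/&nbsp; run found between the tags.
    $html =~ s{<\s*b\s*>((?:\s|&nbsp;)*)<\s*/\s*b\s*>}{$1}gi;
    return $html;
}

print unwrap_empty_bold("<td><b>&nbsp;</b></td>\n");
print unwrap_empty_bold("<p><b>I am not empty!</b></p>\n");
```

The tags go away while the cell keeps its `&nbsp;`; a bold pair with real text inside is left untouched.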
Re: how-to strip empty HTML tags like <b> </b> by Hofmator (Curate) on Sep 06, 2001 at 17:20 UTC
I'd also go with tachyon's suggestion of HTML Tidy, but if you are trying to do this quick and dirty somewhere in the middle of a script, I'd use this regex: `$text =~ s#<\s*([^>]*)\s*>[\s\n]*<\s*/\s*\1\s*>##ig;` It should remove any empty tags that don't contain any attributes (not just bold tags), so it works on `__DATA__ <i> </ I> < B ></b> < em> < / eM >`
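One case a single pass of a generic empty-tag regex does not cover is nesting, such as `<b><i> </i></b>`, where the outer pair only becomes empty after the inner one is removed. A minimal sketch of one way to handle that, assuming a stricter tag-name capture (`[a-z][a-z0-9]*`) than the regex above and a hypothetical helper name `strip_empty_tags`, is to repeat the substitution until nothing changes:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): strip attribute-less tag pairs
# that contain only whitespace, repeating so that nested empties collapse.
sub strip_empty_tags {
    my ($html) = @_;
    # s///g returns the number of replacements; loop until it makes none.
    # \1 with /i lets <i> pair with </I>.
    1 while $html =~ s{<\s*([a-z][a-z0-9]*)\s*>\s*<\s*/\s*\1\s*>}{}gi;
    return $html;
}

print strip_empty_tags("<p>x<b><i> </i></b>y</p>"), "\n";
```

Note that this removes every empty attribute-less pair, including ones such as `<td></td>` or `<p></p>` that may matter for layout, so it is best limited to markup where that is acceptable.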
Re: how-to strip empty HTML tags like <b> </b> by Anonymous Monk on Apr 17, 2003 at 19:10 UTC
You might find this interesting: we made a JavaScript function that trims leading/trailing spaces from field values, but found it did not strip the non-breaking space character (&nbsp;) when it appeared. Before fixing it, our function looked like this:
-------------------------------------
String.prototype.Trim = function() { return this.replace(/(^\s*)|(\s*$)/g, ""); }
-------------------------------------
After discovering that values that included &nbsp; were not getting trimmed, I found the following to be true: &nbsp; = chr(160) = \xA0. So I have now modified the Trim function to read:
-------------------------------------
String.prototype.Trim = function() { return this.replace(/(^[\s\xA0]*)|([\s\xA0]*$)/g, ""); }
-------------------------------------
(Please note: use SQUARE BRACKETS around the character class, as shown. They would not display on this page when I first typed them in, so I originally posted the example with curly brackets.)
And it works! FYI, to clean field values, our JavaScript code calls it this way: document.forms[0].myField.value.Trim();
Hope this helps someone, Susan Henesy
Originally posted as a Categorized Answer.
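For readers doing the same cleanup on the Perl side, the trim idea carries over directly. The sketch below is an illustration rather than anything posted in the thread: `trim_nbsp` is a hypothetical name, and it assumes the string is either Latin-1 bytes or already decoded characters, so that the non-breaking space appears as the single character `\xA0` (in undecoded UTF-8 bytes it would be the two-byte sequence `\xC2\xA0` and would need decoding first).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): trim ordinary whitespace and
# non-breaking spaces (chr(160), \xA0) from both ends of a string.
sub trim_nbsp {
    my ($s) = @_;
    $s =~ s/\A[\s\xA0]+//;   # leading whitespace and NBSP
    $s =~ s/[\s\xA0]+\z//;   # trailing whitespace and NBSP
    return $s;
}

my $field = "\xA0 hello world \xA0\t";
print "[", trim_nbsp($field), "]\n";
```

Listing `\xA0` explicitly in the character class avoids depending on whether `\s` treats it as whitespace for a given string and set of pragmas.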