http://www.perlmonks.org?node_id=1041009


in reply to Question for regex experts

I am not sure how robust my proposal is (it may well create havoc elsewhere in your HTML), but have a look:

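use strict;
use warnings;

my $html = "&gt;lt;amp; something else &lt;gt;amp;something else ;";
# For each run of entities sharing a single leading '&', split the run
# on ';' and put an '&' back in front of every entity name.
$html =~ s|&(([^ ;]+;)+)| join '', map { "&$_;" } split /;/, $1 |ge;
print "$html\n";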

UPDATE: \w is probably more robust than [^ ;], since it matches only word characters and so cannot run across markup such as < or >.
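
For what it is worth, here is a minimal sketch of the \w variant, wrapped in a sub so the two cases are easy to compare. The name fix_entity_runs and the second test string are my own inventions for illustration, not something from the original question, and I have only checked it against these two inputs.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the thread):
# restore the '&' missing from every entity after the first in a run.
sub fix_entity_runs {
    my ($html) = @_;
    # \w+; matches one entity name plus its ';'. The group repeats for as
    # long as the names follow each other with nothing in between.
    $html =~ s{&((?:\w+;)+)}{ join '', map { "&$_;" } split /;/, $1 }ge;
    return $html;
}

# Same input as above.
print fix_entity_runs("&gt;lt;amp; something else &lt;gt;amp;something else ;"), "\n";
# prints: &gt;&lt;&amp; something else &lt;&gt;&amp;something else ;

# A case where [^ ;] would run through the tag and produce "&amp;&<b>x;",
# while \w stops at the '<' and leaves the text alone.
print fix_entity_runs("&amp;<b>x;"), "\n";
# prints: &amp;<b>x;
```

Note that \w does not match '#', so numeric entities such as &#38; are left untouched by this version. Whether that matters depends on what is in your data.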