PerlMonks  

Re: Where did HTML::Sanitizer go?

by redgreen (Priest)
on Nov 17, 2009 at 01:53 UTC ( #807592=note )


in reply to Where did HTML::Sanitizer go?

You might also want to consider HTML Tidy (http://tidy.sourceforge.net/) for cleaning up your HTML.

While not a Perl solution, it does bring some sanity to HTML tag soup.


Re^2: Where did HTML::Sanitizer go?
by wazoox (Prior) on Nov 17, 2009 at 10:41 UTC
    Well, it's not the same use case. I use HTML Tidy all the time for static HTML code, but HTML::Sanitizer looked like a really great solution for "purifying" input.
Re^2: Where did HTML::Sanitizer go?
by grantm (Parson) on Nov 17, 2009 at 21:09 UTC
    Another option for tidying the HTML before sanitising it is XML::LibXML. Its HTML parsing methods (parse_html_string, parse_html_file and parse_html_fh) cope gracefully with mismatched tag nesting, broken quoting and other common offences. You can then call the toStringHTML method on the resulting document to produce nice clean HTML.
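
    A minimal sketch of the XML::LibXML approach described above. The sample markup and variable names are made up for illustration. The recover and suppress_* parser options are an assumption about how you would want parse errors handled, and they need a reasonably recent XML::LibXML (1.70 or later). Note that this only repairs the markup; it does not strip dangerous tags or attributes, so it is not a replacement for a sanitiser.

```perl
use strict;
use warnings;
use XML::LibXML;

# Deliberately broken markup (hypothetical input): mis-nested <b>/<i>,
# an unquoted attribute value, and an unclosed <p>.
my $soup = '<p>Some <b>bold <i>text</b></i> <a href=foo.html>link</a>';

my $parser = XML::LibXML->new;

# recover => 1 asks libxml2 to carry on past parse errors instead of dying;
# the suppress_* options keep its diagnostics off STDERR.
my $doc = $parser->parse_html_string(
    $soup,
    { recover => 1, suppress_errors => 1, suppress_warnings => 1 },
);

# Serialise the repaired tree back out as HTML.
my $clean = $doc->toStringHTML;
print $clean;
```

    The output is a complete document (libxml2 adds the html and body wrappers), so if you only want a fragment you would have to pull the children of the body element out of $doc yourself.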
