
Re: How to parse HTML5?

by Corion (Patriarch)
on Mar 08, 2016 at 11:59 UTC


in reply to How to parse HTML5?

Maybe <section> tags are not supposed to be nested and that's why HTML::Tidy is complaining about them?

Why do you want to parse HTML? Personally, I like HTML::TreeBuilder, which gives me a tree I can later query for content. If you want to clean up HTML, maybe you can use the ->as_HTML method of the resulting HTML::Element to pretty-print it.
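
A minimal sketch of that approach, assuming HTML::TreeBuilder is installed from CPAN. The sample markup and variable names are made up for illustration. By default the parser ignores tags it does not know, which may include HTML5 elements such as <section>, so the sketch turns that off with ignore_unknown(0).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input; note the unclosed <p> tags.
my $html = '<html><body><section><h1>Title</h1>'
         . '<p>First paragraph<p>Second paragraph</section></body></html>';

my $tree = HTML::TreeBuilder->new;
$tree->ignore_unknown(0);      # keep tags the parser does not know about
$tree->parse_content($html);   # parse the whole string and finish the tree

# Query the tree: collect every <p> element and print its text.
my @paragraphs = $tree->look_down( _tag => 'p' );
print $_->as_text, "\n" for @paragraphs;

# Pretty-print the repaired tree: default entity escaping,
# two-space indent, and all optional end tags written out.
print $tree->as_HTML( undef, '  ', {} );

$tree->delete;                 # release the tree when done
```

Keep in mind that the tree you get back has already been repaired by the parser, so this is useful for querying and cleaning up, not for reporting what was wrong in the input.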

Re^2: How to parse HTML5?
by NRan (Novice) on Mar 08, 2016 at 12:13 UTC

    Could you give me a short example using HTML::TreeBuilder?

    I also need an error log with the line number and column number of each problem,

    so that the end user can go to that line and correct it.

    Is that possible?

    Thanks
    Nikhil Ranjan

      What kind of errors do you want? If you're after finding malformed HTML, HTML::Tidy is better, because HTML::TreeBuilder will automatically correct much of the HTML.
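
      A rough sketch of such an error log, assuming HTML::Tidy is installed from CPAN. The file name and sample markup are invented for illustration, and the exact messages depend on the tidy library underneath, which may not know HTML5 elements such as <section>. parse() only collects messages and leaves the input untouched.

      ```perl
      use strict;
      use warnings;
      use HTML::Tidy;

      # Hypothetical sample input with an unclosed <b> element.
      my $html = join "\n",
          '<html>',
          '<head><title>Test</title></head>',
          '<body>',
          '<p>Unclosed <b>bold text',
          '</body>',
          '</html>';

      my $tidy = HTML::Tidy->new;
      $tidy->ignore( type => TIDY_INFO );      # skip purely informational notes
      $tidy->parse( 'example.html', $html );   # report only, no correction

      # Each message carries its own position in the input.
      my @messages = $tidy->messages;
      for my $msg (@messages) {
          my $kind = $msg->type == TIDY_ERROR ? 'error' : 'warning';
          printf "%s:%d:%d: %s: %s\n",
              $msg->file, $msg->line // 0, $msg->column // 0, $kind, $msg->text;
      }
      ```

      The clean() method is the one that returns corrected markup, so leaving it out keeps the fixing manual.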

        No, I don't want auto-correction.

        I want to make the corrections manually.

        Suppose a <p> is missing: the tool should only give me an error log, not correct it.

        Thanks
        Nikhil Ranjan
