PerlMonks  

"combined" HTML *and* CSS parser?

by telcontar (Beadle)
on Apr 12, 2005 at 19:05 UTC (#447136)
telcontar has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,


let me see if I can explain this in a satisfactory way.

I have been using HTML::Tree to parse HTML documents, and I could use some CSS parser to parse CSS files (or statements). But how do I put them in context, that is, use them together? Parsing separately gives me two parse trees, but what I'd rather have is a single tree where I can access CSS properties at the level of HTML elements.

For instance, if I have a simple CSS rule, and I get a parse tree out of it, how am I to know which HTML elements it will apply to? This depends on the property (since some properties have no rendering effect on some types of elements), on inheritance, etc.

In short, is there any way to combine an HTML parser and a CSS parser so that, in the HTML parse tree, one can tell which elements effectively have which CSS properties? (Other than taking apart some browser's source code.)

Does this make any sense at all?


John

Re: "combined" HTML *and* CSS parser?
by brian_d_foy (Abbot) on Apr 12, 2005 at 22:44 UTC

    Sure it's possible. You have to keep some state to know which rules apply to which tags (DIV, SPAN, and so on each define a scope), then look in your CSS data structure to find the rules that match. You just have to apply things the way the CSS spec says you should.

    I'm not sure HTML::Tree will be low-level enough for you to do that, but you can always write your own parser as a subclass of HTML::Parser if nothing else does what you want.
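
The approach described above can be sketched roughly as follows. This is an editorial illustration, not code from the thread. It assumes HTML::TreeBuilder (part of the HTML::Tree distribution) is installed, and it replaces a real CSS parser with a small hand-written rule table (`@rules`) and a hand-picked set of inheritable properties (`%inherited`); both are hypothetical stand-ins. Only bare tag, `.class` and `#id` selectors are handled, and specificity is ignored: later rules simply override earlier ones.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical rule table standing in for the output of a CSS parser.
my @rules = (
    { selector => 'body',  props => { color => 'black', 'font-size' => '12px' } },
    { selector => 'div',   props => { border => '1px solid' } },
    { selector => '.note', props => { color => 'red' } },
);

# Assumption: a tiny subset of the properties CSS 2 marks as inherited.
my %inherited = map { $_ => 1 } qw(color font-size font-family);

# True if a simple selector (tag, .class or #id) matches the element.
sub matches {
    my ($elem, $selector) = @_;
    if ($selector =~ /^\.(.+)$/) {
        my $wanted = $1;
        return scalar grep { $_ eq $wanted }
            split ' ', ($elem->attr('class') // '');
    }
    if ($selector =~ /^#(.+)$/) {
        return (($elem->attr('id') // '') eq $1) ? 1 : 0;
    }
    return ($elem->tag eq lc($selector)) ? 1 : 0;
}

# Walk from the root down to the element, carrying inherited
# properties along and applying every matching rule at each level.
sub effective_style {
    my ($elem) = @_;
    my %style;
    for my $node (reverse($elem->lineage), $elem) {
        # Only inheritable properties survive the step to a child.
        %style = map { $_ => $style{$_} } grep { $inherited{$_} } keys %style;
        for my $rule (@rules) {
            next unless matches($node, $rule->{selector});
            %style = (%style, %{ $rule->{props} });
        }
    }
    return \%style;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<body><div><p class="note">Hi</p><p>Plain</p></div></body>'
);

for my $p ($tree->look_down(_tag => 'p')) {
    my $style = effective_style($p);
    print join(', ', map { "$_: $style->{$_}" } sort keys %$style), "\n";
}
```

With the sample markup, the `p` carrying class `note` should inherit `font-size` from `body` and get its own `color`, while the `border` set on `div` is not passed down because it is not in the inheritable set. A real implementation would also need selector specificity, combinators, and the cascade order from the CSS spec.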

    --
    brian d foy <brian@stonehenge.com>
      Hi,

      yes, of course it's possible ;-) (and there's More Than One Way To Do It ...) But implementing all of CSS Level 2 (or a large subset of it) to combine it with an HTML parser is, well, a lot of work. I thought maybe someone had already done that, or there was some module to do it. Not that I don't want to give it a try; rather, I'd prefer not to waste time doing something someone else has already done.

      I'm also rather new at Perl, so it would probably take rather long. Since I have a specific set of tasks that I would need to check, maybe I can use HTML::Tree and the CSS distribution after all. Although I must say that I find the documentation of the CSS distribution lacking, and wonder why there are no methods to navigate the object tree. I will probably just have to access the properties directly and have a look at the code.

      Thanks for the reply!


      John
Re: "combined" HTML *and* CSS parser?
by wfsp (Abbot) on Apr 13, 2005 at 07:10 UTC
    I had an overweight style sheet.

    I wanted to know if there were styles in the style sheet not used by the HTML, or styles in the HTML that were not in the style sheet. Also, the least used styles: could they be 'refactored' (either out of existence or into separate sheets)?

    I came up with this. There are many caveats; see in particular Your Mother's comments. The style sheet did lose some weight though!

    Hope it's helpful, John

    Update: Fixed link
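
The kind of audit described in this reply (styles defined but unused, styles used but undefined, and usage counts) can be sketched as below. This is an editorial illustration and not wfsp's linked script. It assumes HTML::TreeBuilder is installed, looks only at class selectors, and pulls them out of the style sheet with a crude regular expression instead of a real CSS parser, so comments, `@media` blocks and similar constructs are not handled. The sample `$css` and `$html` strings and the helper names are hypothetical.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample inputs.
my $css = <<'CSS';
p.note, div.box { color: red }
.unused { margin: 0 }
CSS

my $html = '<div class="box"><p class="note big">Hi</p>'
         . '<p class="note">Again</p></div>';

# Collect class names that appear in selectors (the text before each "{").
sub css_classes {
    my ($sheet) = @_;
    my %defined;
    while ($sheet =~ /([^{}]+)\{[^}]*\}/g) {
        my $selectors = $1;
        $defined{$1}++ while $selectors =~ /\.(-?[A-Za-z_][\w-]*)/g;
    }
    return \%defined;
}

# Count how many elements use each class.
sub html_classes {
    my ($markup) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($markup);
    my %used;
    for my $elem ($tree->look_down(sub { defined $_[0]->attr('class') })) {
        $used{$_}++ for split ' ', $elem->attr('class');
    }
    $tree->delete;    # free the tree explicitly
    return \%used;
}

my $defined = css_classes($css);
my $used    = html_classes($html);

my @unused  = grep { !$used->{$_} }    sort keys %$defined;
my @missing = grep { !$defined->{$_} } sort keys %$used;

print "Defined but never used: @unused\n";
print "Used but never defined: @missing\n";
print "Usage counts, least used first:\n";
for my $class (sort { $used->{$a} <=> $used->{$b} || $a cmp $b } keys %$used) {
    print "  $class: $used->{$class}\n";
}
```

Classes that show up in the first list are candidates for removal from the sheet, and the usage counts give a starting point for deciding which rarely used styles might move to a separate sheet.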
