PerlMonks  

Who has used HTML::Parser??

by Anonymous Monk
on Jul 06, 2000 at 03:01 UTC ( [id://21248]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I just got the HTML::Parser 3.1 module and I am trying to figure out how to use it. I have no prior experience with modules, and the biggest problem I am having with this one is that I can't figure out what the elements of a parser object are. The documentation shows how to create a new parser data structure and perform some operations on it, but I can't find what the different parts of this structure are. Moreover, I can't seem to find where the main parser function is. I need to figure out what the parser function returns. If anyone has any experience with this module, please help. Thanks, Shaheeb.

Replies are listed 'Best First'.
Re: Who has used HTML::Parser??
by nardo (Friar) on Jul 06, 2000 at 04:56 UTC
    The "main parser function" is either parse() or parse_file(), depending on whether you have the HTML in memory or on disk. The parser calls three handler methods as it works through the document: start when an opening tag is encountered, end when a closing tag is encountered, and text when text is found. You need to supply these methods yourself. Version 2 of HTML::Parser requires you to subclass HTML::Parser:

    #!/usr/bin/perl
    use strict;

    {
        package SampleParser;
        use base qw(HTML::Parser);

        # Called for each opening tag
        sub start {
            my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
            print "Tag: $tagname\n";
            # $attrseq lists attribute names in source order;
            # $attr maps each name to its value
            foreach my $at (@{$attrseq}) {
                print "Attribute: $at = $attr->{$at}\n";
            }
        }

        # Called for each run of text between tags
        sub text {
            my ($self, $origtext) = @_;
            print "Text: $origtext\n";
        }
    }

    my $html = '<html><head><title>this is the title</title><body bgcolor="white">Hello</body></html>';
    my $sp = SampleParser->new;
    $sp->parse($html);
    $sp->eof;    # signal end of document so any buffered text is flushed

    Version 3, however, looks like it lets you specify which functions to use for start, end, and text in the constructor (see the documentation for an example of this).
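
    For anyone who wants to see roughly what the version 3 style looks like, here is a minimal sketch based on the handler interface described in the HTML::Parser documentation, not a tested recipe for 3.1 specifically. The handler names (on_start, on_end, on_text) and the @events array are my own choices for illustration; only the constructor options and the argspec strings come from the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();

# Collected output lines (hypothetical helper; a real script might
# simply print inside the handlers instead).
my @events;

# Handlers are plain subs. The arguments each one receives are chosen by
# the argspec string passed alongside it in the constructor below.
sub on_start {
    my ($tagname, $attr, $attrseq) = @_;
    push @events, "Tag: $tagname";
    # $attrseq preserves the order the attributes appeared in the source
    push @events, "Attribute: $_ = $attr->{$_}" for @$attrseq;
}

sub on_end {
    my ($tagname) = @_;
    push @events, "End: $tagname";
}

sub on_text {
    my ($text) = @_;
    push @events, "Text: $text";
}

my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [ \&on_start, 'tagname, attr, attrseq' ],
    end_h       => [ \&on_end,   'tagname' ],
    text_h      => [ \&on_text,  'dtext' ],   # dtext: text with entities decoded
);

my $html = '<html><head><title>this is the title</title>'
         . '<body bgcolor="white">Hello</body></html>';

$p->parse($html);   # may be called repeatedly with successive chunks
$p->eof;            # tell the parser the document is complete

print "$_\n" for @events;
```

    On the original question of what the parser function returns: according to the documentation, parse() returns the parser object itself rather than any parsed data, and parse_file() does the same unless the file can't be opened, in which case it returns undef. The results reach you through the handlers.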
Re: Who has used HTML::Parser??
by ZZamboni (Curate) on Jul 06, 2000 at 04:08 UTC
Re: Who has used HTML::Parser??
by beppu (Hermit) on Jul 06, 2000 at 09:50 UTC

    I've used HTML::Parser twice so far, and it's not hard to use at all. On perlmonks.org, I've put up a script called delirium which uses HTML::Parser in a simplistic way. I've also written another script called lchtml, a text filter that turns HTML tags and attributes to lower case. lchtml gives HTML::Parser a bit more of a workout, and you'll probably find it more informative.

    They're both available for browsing at
    http://opensource.lineo.com/cgi-bin/cvsweb/scripts/little/
    delirium and lchtml

    coder equ "beppu" ; asm and perl 4ever
