PerlMonks  

HTML Search Engine/Parser

by Anonymous Monk
on Jul 14, 2001 at 03:57 UTC ( [id://96671]=perlquestion )


Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Before I duplicate someone else's work, does anyone know of a good HTML search engine/parser that is available in Perl (or even C/C++)? What I'm looking for is a really well-designed and powerful parser that can, among other things, keep a hierarchical list of the tags it's currently searching through. For example, if I have:

<BODY>
<TABLE>
<TR><TD><IMG>
</TD></TR></TABLE></BODY>

And I want to find all IMG tags, it should know whether those tags are nested, as above, inside BODY, TABLE, TR, TD.

Any ideas?

Replies are listed 'Best First'.
Re: HTML Search Engine/Parser
by LD2 (Curate) on Jul 14, 2001 at 04:25 UTC
Re: HTML Search Engine/Parser
by MZSanford (Curate) on Jul 14, 2001 at 08:25 UTC
    I am in agreement, LD2 was definitely correct with HTML::Parser ... I have used it for complex HTML parsing and found it to be stable.
    OH, a sarcasm detector, that’s really useful
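
For readers wondering how an event-driven parser can answer the original question, here is a rough sketch of the idea: keep a stack of open tags and record a copy of it whenever the tag of interest starts. It uses Python's standard html.parser rather than HTML::Parser itself, so the names PathTracker, find_paths and VOID_TAGS are made up for illustration, and the handling of unclosed tags is a simplifying assumption, not how any particular module behaves.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PathTracker(HTMLParser):
    """Hypothetical helper: records the open-tag path at each wanted tag."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted   # tag name to look for (lowercase)
        self.stack = []        # currently open tags, outermost first
        self.found = []        # one ancestor path per match

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this handler.
        if tag == self.wanted:
            self.found.append(list(self.stack))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag; a closing tag
        # with no matching open tag is ignored (an assumption of this sketch).
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_paths(html, wanted="img"):
    """Return the list of ancestor paths, one per occurrence of `wanted`."""
    tracker = PathTracker(wanted)
    tracker.feed(html)
    tracker.close()
    return tracker.found


sample = "<BODY>\n<TABLE>\n<TR><TD><IMG>\n</TD></TR></TABLE></BODY>"
print(find_paths(sample))
```

The same shape should carry over to a start handler and an end handler in any callback-style parser: push on start, pop on end, and inspect the stack when the tag you care about shows up.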
Re: HTML Search Engine/Parser
by agent00013 (Pilgrim) on Jul 14, 2001 at 15:52 UTC
    HTML::LinkExtor is also good for parsing. You can use it for links as well as for other tags if you do it right.

    If you look at the grabLinks function in my URL Checking Spider you can see an example of how I used it to grab all the links and images from a web site. (The script spiders through a series of pages, so with some modification and additional functions, you might be able to set it up as a search engine.) I hope that gives you a start. Good luck.
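
To give a feel for what grabbing links and images involves, here is a small sketch in Python's standard library. It is not the grabLinks function mentioned above and not the HTML::LinkExtor API; LinkCollector, extract_links and LINK_ATTRS are hypothetical names, and limiting the table to A and IMG tags is an assumption made to keep the example short.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Which attribute carries the URL for each tag we care about.
# Only links and images here; extend the table for other tags.
LINK_ATTRS = {"a": "href", "img": "src"}


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects (tag, absolute_url) pairs."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_name = LINK_ATTRS.get(tag)
        if attr_name is None:
            return
        for name, value in attrs:
            if name == attr_name and value:
                # Resolve relative URLs against the page they came from.
                self.links.append((tag, urljoin(self.base_url, value)))


def extract_links(html, base_url=""):
    """Return every link and image URL found in `html`, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


page = '<a href="/about">About</a><img src="pic.png">'
for tag, url in extract_links(page, "http://example.com/dir/page.html"):
    print(tag, url)
```

A spider would then feed each collected A link back into the same routine, keeping a set of URLs already visited so it does not loop.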
