PerlMonks  

Re: Scraping HTML: orthodoxy and reality

by PodMaster (Abbot)
on Jul 08, 2003 at 08:08 UTC (#272219)


in reply to Scraping HTML: orthodoxy and reality

After seeing HP200LX:: on CPAN, I suggest you put it in HP::4600::Status(Scrape)? (or something like HP::Printer::4600, whatever more or less corresponds to the HP naming convention ;) and suggest to the author of HP200LX:: that he rename his to HP::200:: yada yada.

As for your notes on the reality of HTML scraping, check out YAPE::HTML; it's regex based.

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.

