PerlMonks  

Re: Parsing webpages

by CountZero (Bishop)
on Jan 28, 2013 at 07:29 UTC (#1015636)


in reply to Parsing webpages

"Trade Me" has a published API and it will be much easier to use this API rather than scrape the site.

Actually, using the API is the only authorised way to automate access to the website:

4.1.c You may not use a robot, spider, scraper or other unauthorised automated means to access the Website or information featured on it for any purpose.

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics


Re^2: Parsing webpages
by tel2 (Scribe) on Jan 28, 2013 at 21:52 UTC
    Thanks CountZero,

    Good points.  Didn't realise that.
