Re: Parsing webpages

by CountZero (Bishop)
on Jan 28, 2013 at 07:29 UTC (#1015636)

in reply to Parsing webpages

"Trade Me" has a published API, and using it will be much easier than scraping the site.

Actually, using the API is the only authorised way to automate access to the website:

4.1.c You may not use a robot, spider, scraper or other unauthorised automated means to access the Website or information featured on it for any purpose.


"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics

Re^2: Parsing webpages
by tel2 (Monk) on Jan 28, 2013 at 21:52 UTC
    Thanks CountZero,

    Good points.  Didn't realise that.
