Re: Parsing webpages

by CountZero (Bishop)
on Jan 28, 2013 at 07:29 UTC

in reply to Parsing webpages

"Trade Me" has a published API, and using it will be much easier than scraping the site.

Actually, using the API is the only authorised way to automate access to the website:

4.1.c You may not use a robot, spider, scraper or other unauthorised automated means to access the Website or information featured on it for any purpose.


"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics

Replies are listed 'Best First'.
Re^2: Parsing webpages
by tel2 (Pilgrim) on Jan 28, 2013 at 21:52 UTC
    Thanks CountZero,

    Good points.  Didn't realise that.
