Re: HTTP Scripting

by ajt (Prior)
on Nov 29, 2002 at 12:30 UTC

in reply to HTTP Scripting


Perl is good at this kind of thing. It has modules to connect to web servers (LWP), handle cookies and passwords, and parse HTML (the HTML::* modules).

Perl has several HTML/XML parsers: some are general-purpose, and others are dedicated to a single job, e.g. link extractors and header parsers.

You could argue that your choice is so wide that it becomes daunting!
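To make that a little more concrete, here is a minimal sketch of how two of those pieces can fit together: LWP::UserAgent with an in-memory cookie jar to fetch a page, and HTML::LinkExtor as one example of a dedicated parser. The sub names fetch_page and extract_links, the sample HTML, and the www.example.com base URL are my own placeholders for illustration, not anything from the original question. Treat it as one reasonable starting point rather than the way to do it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

# Fetch a page with a cookie-aware user agent and return the decoded body.
# (fetch_page is a hypothetical helper name used for this sketch.)
sub fetch_page {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(
        cookie_jar => {},    # empty hash ref asks LWP for an in-memory cookie jar
        timeout    => 15,
    );
    my $response = $ua->get($url);
    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Return the absolute URLs of all <a href="..."> links in an HTML string.
# (extract_links is a hypothetical helper name used for this sketch.)
sub extract_links {
    my ($html, $base) = @_;
    # No callback: links are collected internally and made absolute against $base.
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($html);
    $parser->eof;
    my @urls;
    for my $link ($parser->links) {
        my ($tag, %attr) = @$link;
        push @urls, "$attr{href}" if $tag eq 'a' && defined $attr{href};
    }
    return @urls;
}

# With a URL on the command line, fetch that page; otherwise parse a
# small inline sample so the script runs without network access.
my $url = shift @ARGV;
my ($html, $base);
if (defined $url) {
    ($html, $base) = (fetch_page($url), $url);
}
else {
    $html = '<a href="/one">1</a> <img src="x.png"> <a href="two.html">2</a>';
    $base = 'http://www.example.com/dir/';    # placeholder base URL
}
print "$_\n" for extract_links($html, $base);
```

Run it with no arguments to see the links from the inline sample, or pass a URL to fetch and scan a real page. For sites that want a password, LWP::UserAgent also has a credentials method for HTTP authentication, which I have left out here to keep the sketch short.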

I would suggest the following books. Perl and LWP is all about connecting to, collecting from, and parsing web data. I would also suggest Data Munging with Perl; it's a little older and more generic (it covers more than just web automation), but it's a fine book with good examples of web data mining. Web Client Programming with Perl is old and out of print, but it's freely available as an OpenBook from O'Reilly, and quite useful.

I would also check out merlyn's columns as I think there are some good examples in there with good descriptions. There may also be something in's article archive.


Replies are listed 'Best First'.
Re: Re: HTTP Scripting
by marinersk (Priest) on Nov 29, 2002 at 15:24 UTC
    Thanks, ajt, excellent book references, and things which will gladly join my growing library.
