PerlMonks  

Auto Login to Websites

by Anonymous Monk
on May 19, 2005 at 07:46 UTC (id://458542)

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Is it possible to create a script that will automatically log in to a website, download the HTML and images, and then email them? (Or even do a print screen or capture the information as a JPEG?) The reason is that our business keeps going over its broadband cap. I want to automate the process of logging in to the site and have this information sent to me on a periodic basis. Any suggestions or alternatives MOST welcome. THANKS

Replies are listed 'Best First'.
Re: Auto Login to Websites
by tphyahoo (Vicar) on May 19, 2005 at 08:02 UTC
    Sure. Have a Super Search around PerlMonks for WWW::Mechanize or LWP::UserAgent, along with "login". You need to know Perl pretty well to get this to work, though. If you are feeling trusting, post the website, username, and password, and someone here may do it for fun or the XP.

    If you're feeling less trusting, you could post a project to rentacoder, elance, scriptlance, etc. You should be able to find someone to do this for you at low cost without too much effort. Or check out dragonchild's node listing monks who are available for contract work.

    You could also do this without Perl using iOpus Internet Macros. I would go with Perl, though.

    You could also try this with wget. The login is the tricky part, since wget's support for HTTP POST requests is not as good as Perl's, but this might help, or you might be able to find a better howto by hitting your favorite search engines.

    The above applies to logging in and downloading. For emailing, well, without Perl there might be some utility you can call from a batch file; otherwise yes, you could do it with Perl.

    Good luck.

Re: Auto Login to Websites
by Thilosophy (Curate) on May 19, 2005 at 08:11 UTC
    You can log in to a website and download things with WWW::Mechanize. There are a couple of modules on CPAN to send emails; just search for SMTP or Mail.
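
    As a rough illustration of the WWW::Mechanize approach, here is a minimal sketch. The URLs, the form position, and the field names (username, password) are all placeholders made up for this example; the real ones have to be read from the login page of the site in question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Placeholder URLs (assumptions for this sketch, not from the thread).
my $login_url = 'https://isp.example.com/login';
my $usage_url = 'https://isp.example.com/account/usage';

# Log in and return the HTML of the usage page.
sub fetch_usage_page {
    my ($user, $pass) = @_;

    # autocheck => 1 makes any failed request die with an error message.
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($login_url);

    # Fill in and submit the first form on the page.
    # The field names here are hypothetical.
    $mech->submit_form(
        form_number => 1,
        fields      => { username => $user, password => $pass },
    );

    # The cookie jar inside $mech carries the session to this request.
    $mech->get($usage_url);
    return $mech->content;
}

# Only touch the network when a username and password are given:
#   perl usage.pl myuser mypassword
if (@ARGV == 2) {
    my $html = fetch_usage_page(@ARGV);
    open my $out, '>:encoding(UTF-8)', 'usage.html'
        or die "Cannot write usage.html: $!";
    print {$out} $html;
    close $out;
}
```

    Sites that log in through JavaScript or HTTP authentication rather than a plain HTML form would need a different approach.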

    I am not sure how to do a screen capture, or if that is a good idea (bandwidth-wise and usability-wise). You can download all the images linked from the page, but if possible, try to stick to the text content. This will make sending the email easier, too, as you do not have to worry about attachments.

    I am also not sure how this is going to save bandwidth, though. Your script will access the site just as your browser would (consuming just as much bandwidth), and sending an image by email is no slimmer than downloading it via HTTP.

    Are you sure that accessing this web site is the cause of going over your broadband limit? How often every day do you do this? And would the script do it less often?

Re: Auto Login to Websites
by inman (Curate) on May 19, 2005 at 08:18 UTC
    The answer depends on whether you want to work with the whole of a website or just part of it. It also depends on your network setup.

    The easiest solution would be to buy more data transfer from your ISP or choose an ISP that doesn't impose such limits.

    A fairly easy solution would be to access the Internet through a proxy server which was configured to cache the files that had previously been accessed. This works very well if a number of the employees in your company access the same pages.

    You could also mirror the whole or part of the site and maintain a local copy. The initial download will take up a lot of your data transfer limit but (if the software is any good) the update process should only validate the cached files and not download unchanged ones. Check out wget for a start.

    Downloading a discrete set of pages should be easy enough with a script. You will have to handle the login and then work through the target pages, following links, downloading target pages and changing the links so that they refer to local files. Look at the section on mirroring in the LWP documentation.
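
    The "only re-download what changed" idea can be sketched with the mirror method of LWP::UserAgent, which sends an If-Modified-Since header when the local file already exists. This is only an illustration: the mirror_pages helper and the command-line convention are made up for the example, and it does not follow links or rewrite them to point at local files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Mirror each URL to its local file.
# Takes a user agent and a list of (URL => file) pairs.
# Returns a hash of URL => HTTP status code.
sub mirror_pages {
    my ($ua, %targets) = @_;
    my %status;
    for my $url (sort keys %targets) {
        # If the local file exists, mirror() asks the server whether the
        # page is newer; an unchanged page comes back as a 304 with no body.
        my $res = $ua->mirror( $url, $targets{$url} );
        $status{$url} = $res->code;
    }
    return %status;
}

# Usage (hypothetical): perl mirror.pl URL FILE [URL FILE ...]
if (@ARGV && @ARGV % 2 == 0) {
    my $ua     = LWP::UserAgent->new( timeout => 30 );
    my %status = mirror_pages( $ua, @ARGV );
    for my $url (sort keys %status) {
        my $note = $status{$url} == 304 ? 'unchanged'
                 : $status{$url} == 200 ? 'downloaded'
                 :                        'failed';
        print "$status{$url} $note $url\n";
    }
}
```

    Because WWW::Mechanize is a subclass of LWP::UserAgent, an already logged-in Mechanize object can be passed in place of $ua so that the session cookies are sent with each request. The saving only materialises if the server honours If-Modified-Since, which dynamically generated pages often do not.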

Re: Auto Login to Websites
by Anonymous Monk on May 22, 2005 at 07:32 UTC
    Thanks for your comments. I am currently looking into WWW::Mechanize. I work for a printing business running Windows servers. Our ISP only mails us our bandwidth usage once it hits 80%. Writing a script to run once a week or so will give us a better idea of our usage (rather than logging on manually, etc.). I want to send this info to my boss. I think I have all the information I need, thanks to the comments. I just need to play around and find a way to grab a given image from a page, then attach it and email it. Thanks again.
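
    For the last step (grab one image from a page, attach it, and mail it), one possible sketch uses find_image from WWW::Mechanize and the CPAN module MIME::Lite. The thread does not name a mail module, so MIME::Lite is just one choice among several. The helper names, the URL pattern, the addresses, the SMTP host, and the assumption that the image is a PNG are all placeholders.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use MIME::Lite;

# Save the first image whose URL matches $pattern on the page that
# $mech currently has loaded. Returns the local file name.
sub save_image {
    my ($mech, $pattern, $file) = @_;
    my $img = $mech->find_image( url_regex => $pattern )
        or die "No image matching $pattern on " . $mech->uri . "\n";
    # ':content_file' tells LWP to write the response body straight to disk.
    $mech->get( $img->url_abs, ':content_file' => $file );
    return $file;
}

# Build a multipart message with a short text body and the image attached.
sub build_report {
    my (%arg) = @_;
    my $msg = MIME::Lite->new(
        From    => $arg{from},
        To      => $arg{to},
        Subject => $arg{subject},
        Type    => 'multipart/mixed',
    );
    $msg->attach( Type => 'TEXT', Data => $arg{body} );
    $msg->attach(
        Type        => 'image/png',    # assumes the image is a PNG
        Path        => $arg{image},
        Filename    => 'usage.png',
        Disposition => 'attachment',
    );
    return $msg;
}

# Run only when a page URL is given:
#   perl report.pl https://isp.example.com/account/usage
if (my $page = shift @ARGV) {
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($page);    # log in first if the page requires it
    my $file = save_image( $mech, qr/usage/i, 'usage.png' );
    my $msg  = build_report(
        from    => 'script@example.com',
        to      => 'boss@example.com',
        subject => 'Weekly bandwidth usage',
        body    => "Usage graph attached.\n",
        image   => $file,
    );
    $msg->send( 'smtp', 'mail.example.com' );    # placeholder SMTP host
}
```

    Sending through SMTP rather than a local sendmail binary suits a Windows server, and the script can then be run weekly from the Task Scheduler.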
