PerlMonks  

How do I access a password protected site and access data?

by jaydon (Novice)
on Jun 28, 2005 at 15:34 UTC (#470674=perlquestion)
jaydon has asked for the wisdom of the Perl Monks concerning the following question:

I need to access a site where the login page uses an https URL and then get data off an http URL.
I'm approaching this problem by first doing a GET HTTP::Request of the login page and then a POST HTTP::Request to the login page and providing the username and password.

I then plan to do another GET to the data url.


My problem is in the POST.

When I look at the returned page, it is the same as what gets returned for the initial GET, where the user is getting prompted for the login info.


I'm new to Perl and would really appreciate any advice.

Here is my code:

use LWP::UserAgent;
use HTTP::Cookies;
use LWP::Simple;
use HTTP::Request;
use strict;

my $ua = LWP::UserAgent->new;
$ua->agent("Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)");
my $cj = HTTP::Cookies->new(file => "lwp.cookie", autosave => 1);
$ua->cookie_jar($cj);
$cj->save;

my $req1 = HTTP::Request->new(GET => 'https://www.samplesite.com/myaccount/default.asp?sp=login');
my $res1 = $ua->request($req1);
if ($res1->is_success) {
    print "1st step complete\n";
} else {
    print "Error: " . $res1->status_line . "\n";
}

my $req2 = HTTP::Request->new(POST => 'https://www.samplesite.com/myaccount/default.asp?sp= login',
    [username => "samp2", password => "data362"],);
if ($res2->is_success) {
    print $res2->content;
    print "Step 2 ocmplete\n";
} else {
    print "Error: " . $res2->status_line . "\n";
}

Re: How do I access a password protected site and access data?
by marto (Chancellor) on Jun 28, 2005 at 15:41 UTC
    Hi,

    Perhaps you should consider using WWW::Mechanize to do this.
    I think you may find it easier to implement what you are trying to do.

    Hope this helps.

    Martin

      Thank you for your response. Since I'm new to Perl, I will need to study up on that module. However, does that mean I cannot use LWP::UserAgent and HTTP::Request to do this? What are the limitations (so I understand why I cannot use them)?

      Thank you.

        Hi,

        Looking back at this question a day later raises a query in my mind. Does the page you are trying to access have some kind of login form where you are required to enter a valid username and password, or is a pop up window displayed asking for valid credentials?

        Either way, in the past I have implemented screen scraping / data processing of sites using WWW::Mechanize. The previous link shows some examples of how easy it is to process forms (login or otherwise), follow links and return page content for processing.
        I have done the same thing using other methods, but in my experience using WWW::Mechanize is easier to implement.

        It is also worthwhile reading merlyn's column Web scraping with WWW::Mechanize (Apr 03), which I found very informative.

        Hope this helps

        Martin
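
For readers who want to see what the WWW::Mechanize route might look like for a form-based login, here is a minimal sketch. The helper name fetch_after_login is hypothetical, and the field names username and password are taken from the code in the question; the real form may use different names, so treat both as assumptions.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: log in through an HTML form, then fetch a second page.
# The field names 'username' and 'password' come from the question's code
# and are an assumption about the real form.
sub fetch_after_login {
    my ($login_url, $data_url, $user, $pass) = @_;

    # autocheck => 1 makes failed requests die; Mechanize keeps cookies
    # in an in-memory jar by default, so the login session carries over.
    my $mech = WWW::Mechanize->new(autocheck => 1);

    # Load the login page, then fill in and submit the form that
    # contains both named fields.
    $mech->get($login_url);
    $mech->submit_form(
        with_fields => { username => $user, password => $pass },
    );

    # Fetch the data page with the session cookie now in the jar.
    $mech->get($data_url);
    return $mech->content;
}
```

It would be called as fetch_after_login($login_url, $data_url, $user, $pass), with both URLs supplied by the caller. Because the same object handles the https login and the later http fetch, no separate cookie handling is needed.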
Re: How do I access a password protected site and access data?
by shiza (Hermit) on Jun 28, 2005 at 16:44 UTC
    This will do the trick (LWP::UserAgent and HTTP::Request):
    # define user agent
    my $ua = LWP::UserAgent->new();
    $ua->agent("USER/AGENT/IDENTIFICATION");

    # make request
    my $request = HTTP::Request->new(GET => $URI);

    # authenticate
    $request->authorization_basic($user, $pass);

    # send the request and accept the response
    my $response = $ua->request($request);
      Obviously I'm doing something wrong... as that didn't work. Do I need to create an HTTP::Headers object first?
        Try dropping these use statements:
        use LWP::Simple;
        use HTTP::Request;
        use HTTP::Cookies;
        and add:
        use LWP;
        so your script will look something like this:
        #!/usr/bin/perl
        use strict;
        use LWP;
        use LWP::UserAgent;

        my $URI  = 'https://www.domain.com/protected_realm';
        my $user = 'foo';
        my $pass = 'bar';

        # define user agent
        my $ua = LWP::UserAgent->new();
        $ua->agent("USER/AGENT/IDENTIFICATION");

        # make request
        my $request = HTTP::Request->new(GET => $URI);

        # authenticate
        $request->authorization_basic($user, $pass);

        # send the request and accept the response
        my $response = $ua->request($request);

        # get content of response
        my $content = $response->content();

        # do whatever you need to do with the content here
        print $content;

        exit;
        It would be wise to add error checking in there as well. :)
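
One possible shape for that error checking is sketched below. The helper name content_or_die is hypothetical and not part of LWP; it relies only on the is_success, status_line and content methods of HTTP::Response. The responses are built by hand so the example runs without network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper (not from the thread): return the body on success,
# die with the HTTP status line otherwise.
sub content_or_die {
    my ($response) = @_;
    die "Request failed: " . $response->status_line . "\n"
        unless $response->is_success;
    return $response->content;
}

# Hand-built responses stand in for what $ua->request() would return.
my $ok  = HTTP::Response->new(200, 'OK', [], 'protected data');
my $bad = HTTP::Response->new(401, 'Unauthorized');

print content_or_die($ok), "\n";

# eval traps the die so the script can report the failure and carry on.
my $body = eval { content_or_die($bad) };
print "caught: $@" unless defined $body;
```

In the script above, the call would become something like my $content = content_or_die($ua->request($request));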

        Also, check out the docs for LWP here: LWP

        And the LWP::UserAgent docs for your cookie related needs: LWP::UserAgent
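
If the site uses an HTML login form rather than HTTP basic authentication, the following sketch shows one way to build the form POST with LWP. In HTTP::Request->new the third argument is the header list, so an array reference of username and password placed there is sent as request headers rather than as form data; the POST function from HTTP::Request::Common puts the fields in the request body instead. The URL and field names are copied from the original question and are assumptions about the real site. The sketch only builds and prints the request; nothing is sent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST);

# Values copied from the question; the real form may use different
# field names (check the name attributes of its input elements).
my $login_url = 'https://www.samplesite.com/myaccount/default.asp?sp=login';

# An in-memory cookie jar lets the session cookie set at login be
# sent along with later requests made through the same user agent.
my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new);

# POST() url-encodes the fields into the request body and sets the
# matching Content-Type header.
my $login = POST($login_url, [username => 'samp2', password => 'data362']);

print $login->method, "\n";
print $login->header('Content-Type'), "\n";
print $login->content, "\n";

# To actually log in and then fetch the data page (not run here):
#   my $res  = $ua->request($login);
#   my $data = $ua->get($data_url);   # $data_url supplied by the caller
```

The request also has to be passed to $ua->request before its response can be inspected, and the same $ua has to be reused for the later GET so that the cookie jar carries the session.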
