Re: LWP UserAgent to script Mediawiki reads

by PikMaster (Initiate)
on Oct 30, 2012 at 14:25 UTC (#1001514)


in reply to LWP UserAgent to script Mediawiki reads

Hi.

You need to do a GET request first, to get the value of the input field wpLoginToken.

    use HTTP::Request::Common qw(GET POST);    # provides the GET and POST helpers

    my $req;
    my $response;

    $req = "$wikiurl?title=Special:UserLogin&returnto=Main+Page";
    print "req = GET $req\n";
    $response = $ua->request( GET $req );

    # look for a cookie-set request from the server, i.e.
    # mediawiki_isg_session=954d4797c18ca9054f14c2675af3255e; path=/; HttpOnly
    # (header() in list context returns every Set-Cookie value)
    foreach my $c ($response->header('Set-Cookie')) {
        # server attempting to set a cookie
        print "Server header set-cookie = $c\n";
    }

    my @lines = split /\n/, $response->content();
    foreach (@lines) {
        if (/name="wpLoginToken" value="([0-9a-z]+)"/) {
            my $token = $1;
            $params{'wpLoginToken'} = $token;
            print "found token=$token, line=$_\n";
        }
    }

    $cookie_jar->extract_cookies( $response );
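The regex above only matches when `name` comes directly before `value` in the tag, and MediaWiki versions may differ in attribute order. Below is a minimal sketch of an order-independent lookup in core Perl. The helper name `input_value` and the sample markup are illustrative assumptions, not part of the original script or of MediaWiki.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the value of the
# first <input> tag that has both the given name and a value attribute,
# whatever order the attributes appear in. Returns undef if none is found.
sub input_value {
    my ($html, $name) = @_;
    while ($html =~ /<input\b([^>]*)>/gi) {
        my $attrs = $1;
        next unless $attrs =~ /\sname="\Q$name\E"/;
        return $1 if $attrs =~ /\svalue="([^"]*)"/;
    }
    return;
}

# Made-up sample markup, for illustration only.
my $html = <<'HTML';
<form>
<input type="hidden" value="abc123" name="wpLoginToken" />
<input type="text" name="wpName" />
</form>
HTML

my $token = input_value($html, 'wpLoginToken');
print "found token=$token\n" if defined $token;
```

With a live response you would pass `$response->content()` instead of the sample string. For anything beyond a quick script, a real HTML parser is the safer choice.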

Then pass this token in your next request, the one that actually logs in by submitting the username and password:

    $req = $wikiurl . "?title=Special:UserLogin&action=submitlogin&type=login&returnto=Main+Page";
    print "req = POST $req\n";
    $params{'wpLoginAttempt'} = "Log in";
    $response = $ua->request(
        POST $req,
        Content_Type => 'application/x-www-form-urlencoded',
        Content      => [ %params ]
    );

    $loggedIn = 0;
    # header() in list context returns every Set-Cookie value,
    # whether the server sent one such header or several
    foreach my $cookie ($response->header('Set-Cookie')) {
        print "Server header set-cookie: ======== $cookie\n";
        if ($cookie =~ /UserID=\d+;/i) {
            $loggedIn = 1;    # Success!
            last;
        }
    }
    print "Login result = $loggedIn\n";
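The success test above boils down to "did the server set a cookie containing UserID=<digits>". Here is a small sketch of that check as a standalone function, so it can be tried without a wiki. The function name `login_cookie_seen` and the sample cookie strings are made up for illustration; the actual cookie prefix depends on the wiki's configuration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any of the given Set-Cookie values
# carries a numeric UserID, which is what the post treats as "logged in".
sub login_cookie_seen {
    my @set_cookie = @_;
    for my $cookie (@set_cookie) {
        return 1 if $cookie =~ /UserID=\d+;/i;
    }
    return 0;
}

# Made-up header values, for illustration only.
my @sample = (
    'mediawiki_session=954d4797c18ca9054f14c2675af3255e; path=/; HttpOnly',
    'mediawikiUserID=42; path=/; HttpOnly',
);

print "Login result = ", login_cookie_seen(@sample), "\n";
```

Against a real response this would be called as `login_cookie_seen($response->header('Set-Cookie'))`.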

Now you will need to get the editToken and keep passing it along:

    my $LastEditToken = undef;
    my $body = $response->content();
    if ($body =~ /value="([0-9a-z\+\\]+)"\s+name="editToken"/) {
        $LastEditToken = $1;
        print "found token=$LastEditToken\n";
    }

Whenever you want to do anything further with the wiki, keep passing the editToken, e.g. for importing XML dumps:

    sub importXML {
        # fetch a token from the import form if we do not have one yet
        if (not defined $LastEditToken) {
            my $response = $ua->request( GET "$wikiurl?title=Special:Import" );
            if ($response->content() =~ /value="([0-9a-z\+\\]+)"\s+name="editToken"/) {
                $LastEditToken = $1;
                print "==found token=$LastEditToken\n";
            }
        }

        my $url = "$wikiurl?title=Special:Import&action=submit";
        print "Sending request to $url,\n using token $LastEditToken\n";
        my $response = $ua->request(
            POST $url,
            Content_Type => 'multipart/form-data',
            Content      => [
                'action'        => 'submit',
                'source'        => 'upload',
                'editToken'     => $LastEditToken,
                'MAX_FILE_SIZE' => $MaxXmlSize,
                'xmlimport'     => [$filepath],    # array ref = file upload
            ]
        );

        if ($response->is_success) {
            # remember the fresh token for the next request
            if ($response->content() =~ /value="([0-9a-z\+\\]+)"\s+name="editToken"/) {
                $LastEditToken = $1;
                print "==found token=$LastEditToken\n";
            }
            return 1;
        }
        else {
            # show the server's reply to help diagnose the failure
            print $response->content();
            return 0;
        }
    }
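The "keep the last token, refresh it whenever a page offers a new one" pattern is repeated in several places above. One way to keep it in a single spot is a small closure, sketched here in core Perl. The name `make_token_store` is hypothetical; the regex is the one used in the post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that remembers the most recent
# editToken. Call it with page content to refresh the token, or with no
# argument to read the current one.
sub make_token_store {
    my $last;
    return sub {
        my ($html) = @_;
        if (defined $html
            && $html =~ /value="([0-9a-z\+\\]+)"\s+name="editToken"/) {
            $last = $1;
        }
        return $last;
    };
}

my $edit_token = make_token_store();

# Made-up markup, for illustration only.
$edit_token->('<input type="hidden" value="0a1b2c+\\" name="editToken" />');
$edit_token->('<html>a page without a token</html>');    # keeps the old one

print "current token=", $edit_token->(), "\n";
```

In the script above, each `$response->content()` would be fed to `$edit_token->(...)` after a request, and `$edit_token->()` would supply the `editToken` form field.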


Re^2: LWP UserAgent to script Mediawiki reads
by PikMaster (Initiate) on Oct 30, 2012 at 14:31 UTC

    The login part may be a little unclear, so here is a summary of the form parameters passed (used in $ua->request):

    my %params = ();
    $params{'wpName'}         = 'your_user_name';
    $params{'wpPassword'}     = 'your_password';
    $params{'wpLoginAttempt'} = "Log in";
    $params{'wpLoginToken'}   = $token;    # Login token obtained previously
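Neither post shows how `$ua`, `$cookie_jar`, and `$wikiurl` are created, so here is one plausible setup, offered as an assumption rather than as the author's actual code. The URL is a placeholder. The request is only built and printed, not sent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST);

# Placeholder URL: substitute your own wiki's index.php.
my $wikiurl = 'http://wiki.example.com/index.php';

# An in-memory cookie jar attached to the user agent, so session
# cookies from the first GET are sent back on later requests.
my $cookie_jar = HTTP::Cookies->new;
my $ua         = LWP::UserAgent->new( cookie_jar => $cookie_jar );

my %params = (
    wpName     => 'your_user_name',
    wpPassword => 'your_password',
);

# Build (but do not send) the login request to inspect what would go out.
my $req = POST "$wikiurl?title=Special:UserLogin&action=submitlogin&type=login",
    Content_Type => 'application/x-www-form-urlencoded',
    Content      => [ %params ];

print $req->method, " ", $req->uri, "\n";
print $req->content, "\n";
```

When the jar is attached to the user agent like this, LWP normally stores and replays cookies on its own, so an explicit `extract_cookies` call is probably only needed if the jar is managed separately.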
