OK, here are more pieces to the puzzle. The issue is not yet resolved so I would still appreciate input on this.
I searched further through the MediaWiki docs and ultimately ended up on the MediaWiki IRC channel (http://www.mediawiki.org/wiki/MediaWiki_on_IRC). With help from the people there, I found the following:
- The query shown in the test code in the parent node (my $titles = $mw->api(...)) works. It was verified against two test sites:
# $mw->{config}->{api_url} = 'http://test.wikipedia.org/w/api.php';
# $mw->{config}->{api_url} = 'https://secure.wikimedia.org/wikipedia/test/w/api.php';
Therefore, I don't think the issue is with either the module or the API call.
- As mentioned in the parent node, the query URL works fine when accessed from a browser (specifically, Firefox 3.0.5).
- I compared the headers from the browser (captured via Live HTTP Headers) with those in the $mw object (via Data::Dumper). Apart from the extra detail Data::Dumper shows about the SSL certificate and the like, they looked equivalent. The only difference that stood out was that the Perl code used the POST method while the browser used GET.
- After examining the output of Dumper( $mw ), I noticed that although the HTTP::Response object contained only the 403 error shown in the parent node (and the stack trace contained no new information), the content of the returned page was not empty and may be significant:
'_content' =>
'<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>Access forbidden!</title>
<link rev="made" href="mailto:you@example.com" />
<style type="text/css"><!--/*--><![CDATA[/*><!--*/
body { color: #000000; background-color: #FFFFFF; }
a:link { color: #0000CC; }
p, address {margin-left: 3em;}
span {font-size: smaller;}
/*]]>*/--></style>
</head>
<body>
<h1>Access forbidden!</h1>
<p>
You don\'t have permission to access the requested object.
It is either read-protected or not readable by the server.
</p>
<p>
If you think this is a server error, please contact
the <a href="mailto:you@example.com">webmaster</a>.
</p>
<h2>Error 403</h2>
<address>
<a href="/">cabig-kc.nci.nih.gov</a><br />
<span>Sat Mar 28 02:18:28 2009<br />
Apache</span>
</address>
</body>
</html>
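The GET versus POST difference noted in the list above can be isolated from everything else. Here is a minimal sketch that builds the same query both ways with HTTP::Request::Common and prints each request without sending it. The endpoint is the test site mentioned above; the query parameters are placeholders I made up for illustration, not the query from the parent node.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(GET POST);
use URI;

# Endpoint from the test sites above. The parameters are illustrative
# placeholders (assumption), not the query used in the parent node.
my $api_url = 'http://test.wikipedia.org/w/api.php';
my %query   = (
    action  => 'query',
    list    => 'allpages',
    aplimit => 5,
    format  => 'xml',
);
my @pairs = map { $_ => $query{$_} } sort keys %query;

# GET: the parameters travel in the URL, which is what the browser does.
my $uri = URI->new($api_url);
$uri->query_form(@pairs);
my $get = GET($uri);

# POST: the same parameters travel in a form-encoded request body.
my $post = POST($api_url, [@pairs]);

# Show both requests side by side; nothing is sent over the network.
print $get->as_string, "\n", $post->as_string, "\n";
```

To actually send one of them, something like LWP::UserAgent->new( agent => 'Mozilla/5.0' )->request( $get ) followed by printing $response->status_line should show whether the server treats the two methods differently.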
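My conclusion is that, despite setting the agent to 'Mozilla/5.0', the program still does not look enough like a browser. The 403 page above is signed "Apache" and appears to be a generic web server error page rather than API output, which suggests the request is refused before it ever reaches MediaWiki. My naive assessment is that the server is rejecting the request because it looks too much like a bot, even though the functionality is clearly available, since the same request from a browser works.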
So my question becomes: How do I make the program look more like a browser? Did I miss something in the headers? I can post more information if requested, but I don't know what to look for.
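For reference, this is the kind of thing I have in mind: a rough sketch, not a known fix, that gives an LWP::UserAgent object browser-like default headers and prints the request it would send. The header values are illustrative guesses at what Firefox sends and should be replaced with the ones captured by Live HTTP Headers. As far as I understand the MediaWiki::API docs, the same calls could be made on $mw->{ua}, but I am assuming that here.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;

# Same agent string as in my test code; the full string from the
# browser capture may be worth trying instead.
$ua->agent('Mozilla/5.0');

# Illustrative values (assumption): copy the real ones from the capture.
$ua->default_header( 'Accept' =>
    'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' );
$ua->default_header( 'Accept-Language' => 'en-us,en;q=0.5' );
$ua->default_header( 'Accept-Charset'  => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7' );

# prepare_request applies the default headers without sending anything,
# so the result can be compared line by line with the browser capture.
my $request  = HTTP::Request->new( GET => 'http://test.wikipedia.org/w/api.php' );
my $prepared = $ua->prepare_request($request);

print $prepared->as_string;
```

If the printed headers match the browser's and the server still answers 403, then the headers are probably not the deciding factor and the request method or something else must be.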
My dear monks, what am I missing?
Thanks