I have compared WWW::Curl::Easy with LWP and found that, for our use, the former was a lot faster on our local network for HTTP GET requests. No big surprise there, as libcurl is written in C. However, on POST requests WWW::Curl::Easy is much slower than LWP in my test case. Our setup is a little too complicated to explain in full here: it is nginx as a reverse proxy in front of starman processes, but the main point is that both tests go to the same server. The test code is:
use strict;
use warnings;
use Benchmark qw(:all);
use WWW::Curl::Easy;
use WWW::Curl::Form; # loaded explicitly; logincurl and postcurl2 call WWW::Curl::Form->new
use HTTP::Cookies;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;
my $cj = HTTP::Cookies->new(file => 'cookieslwp', autosave => 1);
my $ua = LWP::UserAgent->new;
$ua->cookie_jar($cj);
$ua->default_header('Content-Type' => 'text/plain;charset=UTF-8');
my $curl_hdr = '';
my $curl_hdr_fh = undef;
open $curl_hdr_fh, ">", \$curl_hdr;
my $curlp = WWW::Curl::Easy->new();
$curlp->setopt(CURLOPT_NOPROGRESS, 1); # shut off the built-in progress meter
$curlp->setopt(CURLOPT_HEADER, 0); # don't include hdr in body
$curlp->setopt(CURLOPT_ENCODING, 'gzip');
$curlp->setopt(CURLOPT_PROXY, ''); # no proxy
open $curl_hdr_fh, "+>", \$curl_hdr;
$curlp->setopt(CURLOPT_WRITEHEADER, $curl_hdr_fh);
$curlp->setopt(CURLOPT_COOKIEFILE, 'cookiescurl');
$curlp->setopt(CURLOPT_COOKIEJAR, 'cookiescurl');
# either of the following speeds curl up massively
#$curlp->setopt(CURLOPT_TCP_NODELAY, 1);
#$curlp->setopt(CURLOPT_FORBID_REUSE, 1);
logincurl();
loginlwp();
timethese(1000,
{ curl => sub {postcurl2()},
lwp => sub {postlwp2()}});
sub loginlwp {
my $url = 'http://xxx.yyy.zzz:82/v1';
my %parameters = (method => 'login',
username => 'xxx',
password => 'yyy');
my $request = POST $url, \%parameters;
$request->content_type('application/x-www-form-urlencoded;charset=utf-8');
my $response = $ua->request($request);
if (!$response->is_success) {
# method calls are not interpolated inside a double-quoted string
die "Failed to get url - " . $response->code . ", " . $response->status_line;
}
print $response->content, "\n";
}
sub logincurl {
seek $curl_hdr_fh, 0, 0;
truncate $curl_hdr_fh, 0;
my $url = 'http://xxx.yyy.zzz:82/v1';
$curlp->setopt(CURLOPT_URL, $url);
my $form = WWW::Curl::Form->new;
$form->formadd('method', 'login');
$form->formadd('username', 'xxx');
$form->formadd('password', 'yyy');
$curlp->setopt(CURLOPT_HTTPPOST, $form);
my @hdrs = ('Expect:');
$curlp->setopt(CURLOPT_HTTPHEADER, \@hdrs);
my $response_body;
$curlp->setopt(CURLOPT_WRITEDATA,\$response_body);
my $retcode = $curlp->perform;
my $response_code = $curlp->getinfo(CURLINFO_HTTP_CODE);
if ($retcode != 0 || ($response_code != 200 && $response_code != 304)) {
die "Failed to get url - $response_code, ", $curlp->errbuf;
}
print $response_body, "\n\n";
}
sub postlwp2 {
my $url = 'http://xxx.yyy.zzz:82/v1/data/xxx.dat';
my %parameters = (method => 'xxx');
my $request = POST $url, \%parameters;
$request->content_type('application/x-www-form-urlencoded;charset=utf-8');
my $response = $ua->request($request);
if (!$response->is_success) {
# method calls are not interpolated inside a double-quoted string
die "Failed to get url - " . $response->code . ", " . $response->status_line;
}
# print $response->content, "\n";
}
sub postcurl2 {
seek $curl_hdr_fh, 0, 0;
truncate $curl_hdr_fh, 0;
my $url = 'http://xxx.yyy.zzz:82/v1/data/xxx.dat';
$curlp->setopt(CURLOPT_URL, $url);
my $form = WWW::Curl::Form->new;
$form->formadd('method', 'account');
$curlp->setopt(CURLOPT_HTTPPOST, $form);
my @hdrs = ('Expect:');
$curlp->setopt(CURLOPT_HTTPHEADER, \@hdrs);
my $response_body;
$curlp->setopt(CURLOPT_WRITEDATA,\$response_body);
my $retcode = $curlp->perform;
my $response_code = $curlp->getinfo(CURLINFO_HTTP_CODE);
if ($retcode != 0 || ($response_code != 200 && $response_code != 304)) {
die "Failed to get url - $response_code, ", $curlp->errbuf;
}
#print $response_body, "\n";
}
As you can see the POSTed data is small and the returned data is around 128 bytes + HTTP headers. All POSTs return exactly the same data. The results for the code as it stands above are:
Benchmark: timing 1000 iterations of curl, lwp...
curl: 48 wallclock secs ( 0.10 usr + 0.08 sys = 0.18 CPU) @ 5555.56/s (n=1000)
(warning: too few iterations for a reliable count)
lwp: 16 wallclock secs ( 4.68 usr + 0.69 sys = 5.37 CPU) @ 186.22/s (n=1000)
It seems as though Curl uses hardly any CPU compared with LWP, and yet it takes three times as much real time to run 1000 iterations (48 s against 16 s). Note that the rate Benchmark prints is based on CPU time, so the wallclock column is the figure that matters here. This surprised me, as it suggested something was holding Curl up, almost as if it was waiting. I looked for differences between Curl and LWP and first noticed that LWP closes the connection by default whereas Curl keeps it open. Then I discovered that setting CURLOPT_TCP_NODELAY sped Curl up massively. Lastly, I discovered that stopping Curl from reusing the socket (CURLOPT_FORBID_REUSE) also made it faster than LWP, though not quite as fast as setting CURLOPT_TCP_NODELAY. Results with CURLOPT_TCP_NODELAY or CURLOPT_FORBID_REUSE set are:
Benchmark: timing 1000 iterations of curl, lwp...
curl: 12 wallclock secs ( 0.31 usr + 0.13 sys = 0.44 CPU) @ 2272.73/s (n=1000)
lwp: 16 wallclock secs ( 4.66 usr + 0.55 sys = 5.21 CPU) @ 191.94/s (n=1000)
Now, it has been a while since I did any serious C programming with sockets, but the results with CURLOPT_TCP_NODELAY or CURLOPT_FORBID_REUSE rang alarm bells. With the Nagle algorithm, I thought the first write was guaranteed to be sent immediately, but further small writes to the socket may be held back until the earlier data has been acknowledged. The results with CURLOPT_TCP_NODELAY suggest to me that the form data is not being written at the same time as the POST headers and is therefore being held up. Also, if CURLOPT_FORBID_REUSE is used, it is possible that Curl does a shutdown(write) on the socket, which expedites the data. Really, I'd like to reuse the connection to the HTTP server, but I'm also loath to set CURLOPT_TCP_NODELAY. Any ideas?
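To make that suspicion concrete, here is a rough sketch of the difference between the two patterns. It is not what libcurl does internally; build_post_request is a hypothetical helper of mine, and the host and path are just the placeholders from the test code. The idea is that if the headers and the small body are assembled into one buffer and handed to the socket in a single write, there is no second small segment for Nagle to hold back. If the body goes out in a separate write, it can sit waiting for the ACK of the header segment, and a delayed ACK on the server side (often a few tens of milliseconds) would be consistent with 48 seconds for 1000 requests.

```perl
use strict;
use warnings;

# Hypothetical helper: assemble request line, headers and body into ONE
# buffer, so that a single write/syswrite call can send the whole request.
sub build_post_request {
    my ($host, $path, $body) = @_;
    my $head = join "\r\n",
        "POST $path HTTP/1.1",
        "Host: $host",
        'Content-Type: application/x-www-form-urlencoded',
        'Content-Length: ' . length($body),
        '', '';    # the two empty strings yield the blank line ending the headers
    return $head . $body;
}

# Pattern suspected in the slow case: two writes on the same socket.
#   syswrite($sock, $head);   # sent immediately
#   syswrite($sock, $body);   # small segment, may wait for the ACK of $head
#
# Pattern that avoids the stall without TCP_NODELAY: one write.
#   syswrite($sock, build_post_request($host, $path, $body));
```

The commented-out syswrite calls assume $sock is an already connected TCP socket, for example from IO::Socket::INET.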
UPDATE1 I have confirmed with strace that libcurl is sending the POST header in one write to the socket and the form data in a second write to the socket.
UPDATE RESOLUTION It appears formadd adds the form data as multipart/form-data and, because of the way libcurl is written, that results in two writes to the socket: one for the POST header fields and one for the POSTed data. In my case the posted data is very small, which brings Nagle into play. I changed the code to remove the formadd calls and use CURLOPT_POST and CURLOPT_POSTFIELDS instead; this results in a single write to the socket and no Nagle delay. Thanks to the libcurl mailing list for helping with this.
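For anyone hitting the same thing, the change looks roughly like the sketch below. It is a minimal illustration rather than my exact code: form_escape, form_body and post_urlencoded are hypothetical helper names, and the hand-rolled escaping is only there to keep the example free of extra modules (URI::Escape or similar would do the same job). The curl options themselves (CURLOPT_POST, CURLOPT_POSTFIELDS, CURLOPT_POSTFIELDSIZE) are standard libcurl options exported by WWW::Curl::Easy.

```perl
use strict;
use warnings;
use WWW::Curl::Easy;

# Percent-encode one key or value for application/x-www-form-urlencoded.
sub form_escape {
    my ($s) = @_;
    utf8::encode($s) if utf8::is_utf8($s);
    $s =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord $1)/ge;
    $s =~ tr/ /+/;    # spaces become '+' in form encoding
    return $s;
}

# Turn (key, value, key, value, ...) into "k1=v1&k2=v2".
sub form_body {
    my (@pairs) = @_;
    my @parts;
    while (my ($k, $v) = splice @pairs, 0, 2) {
        push @parts, form_escape($k) . '=' . form_escape($v);
    }
    return join '&', @parts;
}

# POST a urlencoded body instead of a WWW::Curl::Form (multipart) object.
# Returns (curl return code, HTTP status, response body).
sub post_urlencoded {
    my ($curl, $url, @pairs) = @_;
    my $body = form_body(@pairs);    # kept in scope until perform returns
    my $response_body = '';
    $curl->setopt(CURLOPT_URL, $url);
    $curl->setopt(CURLOPT_POST, 1);
    $curl->setopt(CURLOPT_POSTFIELDS, $body);
    $curl->setopt(CURLOPT_POSTFIELDSIZE, length $body);
    $curl->setopt(CURLOPT_WRITEDATA, \$response_body);
    my $retcode = $curl->perform;
    my $http_code = $curl->getinfo(CURLINFO_HTTP_CODE);
    return ($retcode, $http_code, $response_body);
}
```

In the benchmark above this would stand in for the WWW::Curl::Form lines in postcurl2, called as post_urlencoded($curlp, $url, method => 'account'), with the same return-code check afterwards. With CURLOPT_POSTFIELDS libcurl sends the body as application/x-www-form-urlencoded by default, so no extra Content-Type header is needed.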