PerlMonks  

WWW::Curl::Easy slower than LWP on POSTs (RESOLVED)

by mje (Curate)
on Sep 13, 2011 at 19:32 UTC ( [id://925760]=perlquestion )

mje has asked for the wisdom of the Perl Monks concerning the following question:

I compared WWW::Curl::Easy with LWP and found that, for our use, the former was a lot faster on our local network for HTTP GET requests. No big surprise there, as Curl is all C. However, on POST requests WWW::Curl::Easy is much slower than LWP in my test case. Our setup is a little too complicated to explain in full here: it is nginx as a reverse proxy in front of starman processes. The main thing is that both halves of the test code go to the same server. The test code is:

use strict;
use warnings;
use Benchmark qw(:all);
use WWW::Curl::Easy;
use WWW::Curl::Form;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;

# LWP client with its own cookie jar
my $cj = HTTP::Cookies->new(file => 'cookieslwp', autosave => 1);
my $ua = LWP::UserAgent->new;
$ua->cookie_jar($cj);
$ua->default_header('Content-Type' => 'text/plain;charset=UTF-8');

# In-memory buffer that collects the response headers curl receives
my $curl_hdr = '';
open my $curl_hdr_fh, "+>", \$curl_hdr
    or die "Cannot open in-memory header buffer: $!";

# Curl handle, reused for every request
my $curlp = WWW::Curl::Easy->new();
$curlp->setopt(CURLOPT_NOPROGRESS, 1);    # shut off the built-in progress meter
$curlp->setopt(CURLOPT_HEADER, 0);        # don't include hdr in body
$curlp->setopt(CURLOPT_ENCODING, 'gzip');
$curlp->setopt(CURLOPT_PROXY, '');        # no proxy
$curlp->setopt(CURLOPT_WRITEHEADER, $curl_hdr_fh);
$curlp->setopt(CURLOPT_COOKIEFILE, 'cookiescurl');
$curlp->setopt(CURLOPT_COOKIEJAR, 'cookiescurl');
# either of the following speeds curl up massively
#$curlp->setopt(CURLOPT_TCP_NODELAY, 1);
#$curlp->setopt(CURLOPT_FORBID_REUSE, 1);

logincurl();
loginlwp();

timethese(1000, {
    curl => sub { postcurl2() },
    lwp  => sub { postlwp2() },
});

sub loginlwp {
    my $url = 'http://xxx.yyy.zzz:82/v1';
    my %parameters = (method => 'login', username => 'xxx', password => 'yyy');
    my $request = POST $url, \%parameters;
    $request->content_type('application/x-www-form-urlencoded;charset=utf-8');
    my $response = $ua->request($request);
    if (!$response->is_success) {
        # method calls do not interpolate inside a string, so concatenate
        die "Failed to get url - " . $response->code . ", " . $response->status_line;
    }
    print $response->content, "\n";
}

sub logincurl {
    # reset the header buffer before each request
    seek $curl_hdr_fh, 0, 0;
    truncate $curl_hdr_fh, 0;
    my $url = 'http://xxx.yyy.zzz:82/v1';
    $curlp->setopt(CURLOPT_URL, $url);
    my $form = WWW::Curl::Form->new;
    $form->formadd('method', 'login');
    $form->formadd('username', 'xxx');
    $form->formadd('password', 'yyy');
    $curlp->setopt(CURLOPT_HTTPPOST, $form);
    my @hdrs = ('Expect:');    # suppress the "Expect: 100-continue" header
    $curlp->setopt(CURLOPT_HTTPHEADER, \@hdrs);
    my $response_body;
    $curlp->setopt(CURLOPT_WRITEDATA, \$response_body);
    my $retcode = $curlp->perform;
    my $response_code = $curlp->getinfo(CURLINFO_HTTP_CODE);
    if ($retcode != 0 || ($response_code != 200 && $response_code != 304)) {
        die "Failed to get url - $response_code, ", $curlp->errbuf;
    }
    print $response_body, "\n\n";
}

sub postlwp2 {
    my $url = 'http://xxx.yyy.zzz:82/v1/data/xxx.dat';
    my %parameters = (method => 'xxx');
    my $request = POST $url, \%parameters;
    $request->content_type('application/x-www-form-urlencoded;charset=utf-8');
    my $response = $ua->request($request);
    if (!$response->is_success) {
        die "Failed to get url - " . $response->code . ", " . $response->status_line;
    }
    # print $response->content, "\n";
}

sub postcurl2 {
    # reset the header buffer before each request
    seek $curl_hdr_fh, 0, 0;
    truncate $curl_hdr_fh, 0;
    my $url = 'http://xxx.yyy.zzz:82/v1/data/xxx.dat';
    $curlp->setopt(CURLOPT_URL, $url);
    my $form = WWW::Curl::Form->new;
    $form->formadd('method', 'account');
    $curlp->setopt(CURLOPT_HTTPPOST, $form);
    my @hdrs = ('Expect:');
    $curlp->setopt(CURLOPT_HTTPHEADER, \@hdrs);
    my $response_body;
    $curlp->setopt(CURLOPT_WRITEDATA, \$response_body);
    my $retcode = $curlp->perform;
    my $response_code = $curlp->getinfo(CURLINFO_HTTP_CODE);
    if ($retcode != 0 || ($response_code != 200 && $response_code != 304)) {
        die "Failed to get url - $response_code, ", $curlp->errbuf;
    }
    #print $response_body, "\n";
}

As you can see the POSTed data is small and the returned data is around 128 bytes + HTTP headers. All POSTs return exactly the same data. The results for the code as it stands above are:

Benchmark: timing 1000 iterations of curl, lwp...
      curl: 48 wallclock secs ( 0.10 usr +  0.08 sys =  0.18 CPU) @ 5555.56/s (n=1000)
            (warning: too few iterations for a reliable count)
       lwp: 16 wallclock secs ( 4.68 usr +  0.69 sys =  5.37 CPU) @ 186.22/s (n=1000)

It seems as though Curl uses hardly any CPU compared with LWP, and yet it takes more real time to run 1000 iterations. This surprised me, as it suggested something was holding Curl up, almost as if it was waiting. I looked for differences between Curl and LWP and first noticed that LWP closes the connection by default whereas Curl keeps it open. Then I discovered that setting CURLOPT_TCP_NODELAY sped Curl up massively. Lastly, I discovered that stopping Curl from reusing the socket (CURLOPT_FORBID_REUSE) also made it faster than LWP, though not quite as fast as setting CURLOPT_TCP_NODELAY. Results with CURLOPT_TCP_NODELAY or CURLOPT_FORBID_REUSE set are:

Benchmark: timing 1000 iterations of curl, lwp...
      curl: 12 wallclock secs ( 0.31 usr +  0.13 sys =  0.44 CPU) @ 2272.73/s (n=1000)
       lwp: 16 wallclock secs ( 4.66 usr +  0.55 sys =  5.21 CPU) @ 191.94/s (n=1000)
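
As a rough back-of-the-envelope check on the two runs above, the wallclock totals can be turned into a per-request cost. The sketch below only reuses the figures from the benchmark output; the labels `curl_default` and `curl_tuned` are names made up here for the first and second curl runs.

```perl
use strict;
use warnings;

# Wallclock seconds for 1000 iterations, copied from the benchmark output.
# The hash keys are illustrative labels, not anything Benchmark prints.
my $iterations = 1000;
my %wallclock  = (
    curl_default => 48,    # first run, no extra options
    curl_tuned   => 12,    # second run, CURLOPT_TCP_NODELAY or CURLOPT_FORBID_REUSE
    lwp          => 16,
);

# Convert a wallclock total in seconds to milliseconds per request.
sub ms_per_request {
    my ($secs, $count) = @_;
    return 1000 * $secs / $count;
}

for my $name (sort keys %wallclock) {
    printf "%-13s %5.1f ms/request\n", $name,
        ms_per_request($wallclock{$name}, $iterations);
}

# Extra wallclock time per POST in the slow curl run.
my $stall_ms = ms_per_request($wallclock{curl_default} - $wallclock{curl_tuned},
                              $iterations);
printf "extra wait in default curl run: %.1f ms/request\n", $stall_ms;
```

The slow run spends roughly 36 ms more per request than the tuned one while using almost no CPU, which fits the impression that each POST is waiting on something rather than doing work.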

Now it has been a while since I did any serious C programming with sockets, but the results with CURLOPT_TCP_NODELAY or CURLOPT_FORBID_REUSE rang alarm bells. With the Nagle algorithm, I thought the first write was guaranteed to be sent immediately, but further writes to the socket may be held back until an ACK is received for the data already sent. The results with CURLOPT_TCP_NODELAY suggest to me that the form data is not being written at the same time as the POST headers and is therefore being held up. Also, if CURLOPT_FORBID_REUSE is used, it is possible Curl does a shutdown(write) on the socket, which expedites the data. Really, I'd like to reuse the connection to the HTTP server, but I'm also loath to set CURLOPT_TCP_NODELAY. Any ideas?
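
For readers less familiar with the socket-level option behind CURLOPT_TCP_NODELAY, here is a minimal sketch using only Perl's core Socket module. It does not touch libcurl or the network; it just creates an unconnected TCP socket and toggles TCP_NODELAY, the option that switches the Nagle algorithm off for that socket. The helper names `nodelay_enabled` and `set_nodelay` are made up for this illustration.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP TCP_NODELAY);

# Return 1 if TCP_NODELAY is set on the socket (Nagle disabled), else 0.
sub nodelay_enabled {
    my ($sock) = @_;
    my $packed = getsockopt($sock, IPPROTO_TCP, TCP_NODELAY)
        or die "getsockopt: $!";
    return unpack('i', $packed) ? 1 : 0;
}

# Turn TCP_NODELAY on or off for the socket.
sub set_nodelay {
    my ($sock, $on) = @_;
    setsockopt($sock, IPPROTO_TCP, TCP_NODELAY, pack('i', $on ? 1 : 0))
        or die "setsockopt: $!";
    return;
}

# An unconnected TCP socket is enough to inspect and change the option.
socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die "socket: $!";

my $before = nodelay_enabled($sock);    # Nagle is normally enabled by default
set_nodelay($sock, 1);
my $after = nodelay_enabled($sock);

print "TCP_NODELAY before: $before, after: $after\n";
close $sock;
```

With the option set, small writes are sent as soon as they are issued instead of being held back while earlier data is unacknowledged, at the cost of possibly sending more small packets.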

UPDATE1 I have confirmed with strace that libcurl is sending the POST header in one write to the socket and the form data in a second write to the socket.

UPDATE RESOLUTION It appears formadd adds form data as multipart and, because of the way libcurl is written, that results in two writes to the socket: one for the POST header fields and one for the POSTed data. In my case the posted data is very small, which causes Nagle to come into effect. I changed the code to remove the formadd calls and use CURLOPT_POST and CURLOPT_POSTFIELDS instead. This results in only a single write to the socket and no Nagle delay. Thanks to the libcurl mailing list for helping with this.
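
The changed code is not shown in the post, so the following is only a sketch of what the CURLOPT_POST / CURLOPT_POSTFIELDS variant might look like. The helpers `form_escape`, `form_body` and `post_urlencoded` are hypothetical names introduced here. WWW::Curl::Easy is loaded at run time inside `post_urlencoded` and its constants are called by their fully qualified names, so the file compiles even where the module is not installed; with `use WWW::Curl::Easy;` at the top, the bare constant names from the test code would work as well.

```perl
use strict;
use warnings;

# Percent-encode one key or value for application/x-www-form-urlencoded.
# Spaces become %20 here; '+' is the other commonly accepted encoding.
sub form_escape {
    my ($s) = @_;
    utf8::encode($s) if utf8::is_utf8($s);    # work on bytes, not characters
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord $1)/ge;
    return $s;
}

# Build "k1=v1&k2=v2" from a flat list of key/value pairs, order preserved.
sub form_body {
    my @pairs = @_;
    my @out;
    while (my ($k, $v) = splice @pairs, 0, 2) {
        push @out, form_escape($k) . '=' . form_escape($v);
    }
    return join '&', @out;
}

# Hypothetical replacement for the formadd-based POST: send the fields as a
# single urlencoded body via CURLOPT_POSTFIELDS on an existing handle.
sub post_urlencoded {
    my ($curl, $url, @pairs) = @_;
    require WWW::Curl::Easy;                  # loaded only when actually used
    my $body = form_body(@pairs);
    my $response_body = '';
    $curl->setopt(WWW::Curl::Easy::CURLOPT_URL(),           $url);
    $curl->setopt(WWW::Curl::Easy::CURLOPT_POST(),          1);
    $curl->setopt(WWW::Curl::Easy::CURLOPT_POSTFIELDS(),    $body);
    $curl->setopt(WWW::Curl::Easy::CURLOPT_POSTFIELDSIZE(), length $body);
    $curl->setopt(WWW::Curl::Easy::CURLOPT_WRITEDATA(),     \$response_body);
    my $retcode   = $curl->perform;
    my $http_code = $curl->getinfo(WWW::Curl::Easy::CURLINFO_HTTP_CODE());
    return ($retcode, $http_code, $response_body);
}

# The body that would replace $form->formadd('method', 'account'):
print form_body(method => 'account'), "\n";
```

In postcurl2 the call would then be something like `post_urlencoded($curlp, $url, method => 'account')`, with the return code and HTTP status checked as before. Note that this sends the fields urlencoded rather than as multipart, which is also what the LWP half of the benchmark does.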

Replies are listed 'Best First'.
Re: WWW::Curl::Easy slower than LWP on POSTs
by onelesd (Pilgrim) on Sep 13, 2011 at 22:03 UTC

    It appears you already did all the hard work to diagnose the performance and come to your own conclusion. All I can offer is to say what you most likely already know: performance optimization can sometimes amount to you spinning your wheels to fix a problem that may not need to be fixed.

    If I had gone as far down this path as you I would choose the module/library that does what I want without potentially introducing new bugs to gain a few seconds of performance here and there.

Re: WWW::Curl::Easy slower than LWP on POSTs (RESOLVED)
by deMize (Monk) on Mar 27, 2013 at 04:04 UTC
    Question: Old post, but have you tried comparing WWW::Curl with just using backticks to call curl and capture its output?


    Demize

      No, I didn't. Are you suggesting you think it is quicker?

        Not suggesting. I have no idea and was hoping you'd find out.


        Demize
