
Develop a forward HTTP proxy server

by josef (Acolyte)
on Nov 04, 2013 at 16:27 UTC (#1061146=perlquestion)
josef has asked for the wisdom of the Perl Monks concerning the following question:

Hello PerlMonks,

I find it very interesting to develop a flexible and powerful forward HTTP proxy server in Perl. I imagine it is possible to write an HTTP proxy server that supports 400-1000 requests/second.
For example, the widely used proxy Squid uses one non-blocking socket with a big select() poll. For my purposes I tend toward either pre-forking or polling (select, epoll, kqueue), because both are easy to develop. I'm not clear which technique is better suited. Which CPAN modules are suitable? What performance can I expect? Do I need a blocking or a non-blocking socket? How does it scale?

I have tested some modules: forks, AnyEvent, EV, IO::Socket, IO::Async, IO::Select, IO::Poll, HTTP::Daemon, HTTP::Proxy, Net::Server, LWP::UserAgent, Furl and others, but I need a direction.

Google can't help me much; one working proxy server using pre-forking was published by Randal L. Schwartz (http://www.stonehenge.com/merlyn/WebTechniques/col34.html) 15 years ago.

Best regards

Josef

Re: Develop a forward proxy server
by oiskuu (Pilgrim) on Nov 04, 2013 at 18:19 UTC
    Not that I've used either, but there is a Perl-based reverse proxy called Perlbal that uses Danga::Socket.
    The latter (Danga::Socket) is epoll/kqueue backed and ought to scale well.
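For readers unfamiliar with the event-driven style, here is a minimal sketch of the idea. It deliberately uses the core IO::Select module (plain select()) instead of Danga::Socket or epoll/kqueue, so it shows the pattern rather than the scalable backend. The function name run_select_loop, the handler callback and the $max_events stop condition are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Run a single-process, select()-based loop on an already-listening socket.
# $handler gets the bytes just read and returns the bytes to send back.
# $max_events (hypothetical knob for demos) stops the loop after that many
# handled reads; pass 0 to loop forever.
sub run_select_loop {
    my ($listener, $handler, $max_events) = @_;
    $listener->blocking(0);
    my $select  = IO::Select->new($listener);
    my $handled = 0;

    while (1) {
        for my $fh ($select->can_read(1)) {
            if ($fh == $listener) {
                # New client: accept it and watch it for readability.
                my $client = $listener->accept or next;
                $client->blocking(0);
                $select->add($client);
                next;
            }
            my $n = sysread($fh, my $buf, 65536);
            if (!$n) {
                # EOF or read error: forget this client.
                $select->remove($fh);
                close $fh;
                next;
            }
            # A real proxy would buffer partial writes; this sketch does not.
            syswrite($fh, $handler->($buf));
            $handled++;
            return $handled if $max_events && $handled >= $max_events;
        }
    }
}

# Example call (commented out so the file can be loaded without blocking):
# my $listener = IO::Socket::INET->new(
#     LocalAddr => '127.0.0.1', LocalPort => 8080, Listen => 128,
#     ReuseAddr => 1, Proto => 'tcp') or die "listen: $!";
# run_select_loop($listener, sub { uc $_[0] }, 0);
```

One process serves all clients here, so a slow handler stalls everyone; that is the main trade-off against the pre-forking approach.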
Re: Develop a forward HTTP proxy server
by vsespb (Hermit) on Nov 09, 2013 at 21:54 UTC

    I got 250 req/s with HTTP::Daemon + fork() + LWP::UserAgent + my own code and logic (not small) + sharing data between workers and the master process via sockets on _each_ request, on modern desktop hardware. I think I was testing with tiny requests on localhost, and I had around 100 worker processes.

    Note that the slowest part here is LWP::UserAgent (used for outgoing requests to the upstream proxy).

    So I think 400 is possible if you replace LWP::UserAgent with something like .*Curl.*.

    Not sure about 1000.
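To make the fork()-based layout concrete, here is a minimal pre-forking sketch. It is not the setup described above: it uses only the core IO::Socket::INET module instead of HTTP::Daemon, does no HTTP parsing and makes no upstream request. The names spawn_workers and stop_workers and the handler callback are hypothetical.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use POSIX ();

# Fork $n_workers children that all accept() on the same listening socket.
# Each worker serves one connection at a time with blocking I/O.
# $handler receives the raw request text and returns the raw response text.
# Returns the worker PIDs to the parent.
sub spawn_workers {
    my ($listener, $n_workers, $handler) = @_;
    my @pids;
    for (1 .. $n_workers) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid) { push @pids, $pid; next; }

        # Child: serve connections until the parent sends SIGTERM.
        $SIG{TERM} = 'DEFAULT';
        while (my $client = $listener->accept) {
            # Reads only the first chunk; a real proxy must read the
            # complete request (headers and body).
            my $n = sysread($client, my $request, 65536);
            print {$client} $handler->($request) if $n;
            close $client;
        }
        POSIX::_exit(0);
    }
    return @pids;
}

# Terminate all workers and reap them.
sub stop_workers {
    my @pids = @_;
    kill 'TERM', @pids;
    waitpid($_, 0) for @pids;
}

# Example call (commented out so loading the file starts nothing):
# my $listener = IO::Socket::INET->new(
#     LocalAddr => '127.0.0.1', LocalPort => 8080, Listen => 128,
#     ReuseAddr => 1, Proto => 'tcp') or die "listen: $!";
# my @pids = spawn_workers($listener, 100,
#     sub { "HTTP/1.0 200 OK\r\nContent-Length: 3\r\n\r\nok\n" });
```

The kernel hands each incoming connection to one of the workers blocked in accept(), so throughput is bounded by the worker count and by how long each request holds a worker.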

      Exactly, LWP::UserAgent is too slow for a proxy server.
      My benchmark with some user agents (LWP::UserAgent, LWP::Simple, HTTP::Lite, WWW::Curl::Easy and Furl):
      Test 1: run on localhost, ideal case for no network latency, no cache
      URI: http://localhost, GET method
      Compare COUNT: 10000

      Benchmark: timing 10000 iterations of Furl, curl, lite, lwp, lwp_simple...
            Furl: 5.90996 wallclock secs (1.92 usr + 0.62 sys = 2.55 CPU) @3926/s
            curl: 6.10753 wallclock secs (1.16 usr + 1.70 sys = 2.87 CPU) @3487/s
            lite: 5.68429 wallclock secs (2.53 usr + 0.55 sys = 3.09 CPU) @3240/s
             lwp: 16.2804 wallclock secs (12.20 usr + 0.50 sys = 12.70 CPU) @787/s
      lwp_simple: 16.7128 wallclock secs (12.92 usr + 0.54 sys = 13.46 CPU) @742/s

                   Rate lwp_simple   lwp  lite  curl  Furl
      lwp_simple  743/s         --   -6%  -77%  -79%  -81%
      lwp         788/s         6%    --  -76%  -77%  -80%
      lite       3241/s       336%  311%    --   -7%  -17%
      curl       3488/s       369%  343%    8%    --  -11%
      Furl       3926/s       429%  398%   21%   13%    --

      Test 2: run on internet, real case with network latency, cache active on remote server
      URI: http://google.de, GET method
      Compare COUNT: 1000

      Benchmark: timing 1000 iterations of Furl, curl, lite, lwp, lwp_simple...
            Furl: 99.9638 wallclock secs (0.75 usr + 0.52 sys = 1.27 CPU) @790/s
            curl: 42.9186 wallclock secs (0.24 usr + 0.39 sys = 0.63 CPU) @1580/s
            lite: 14.2738 wallclock secs (0.48 usr + 0.24 sys = 0.72 CPU) @1391/s
             lwp: 102.792 wallclock secs (3.88 usr + 0.44 sys = 4.31 CPU) @231/s
      lwp_simple: 432.2 wallclock secs (3.73 usr + 0.48 sys = 4.22 CPU) @237/s

                   Rate   lwp lwp_simple  Furl  lite  curl
      lwp         232/s    --        -2%  -71%  -83%  -85%
      lwp_simple  237/s    2%         --  -70%  -83%  -85%
      Furl        790/s  241%       233%    --  -43%  -50%
      lite       1391/s  500%       487%   76%    --  -12%
      curl       1580/s  581%       567%  100%   14%    --
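Output in this shape comes from the core Benchmark module. The sketch below shows one way such a comparison could be set up; it is not the script that produced the numbers above. The helper names build_agents and compare_agents are hypothetical, only agents that are actually installed are included, and WWW::Curl::Easy is left out because its setup is more verbose.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Build a name => coderef table for the user agents that are installed.
# Each coderef performs one GET of $url and discards the response.
sub build_agents {
    my ($url) = @_;
    my %agents;
    if (eval { require LWP::UserAgent; 1 }) {
        my $ua = LWP::UserAgent->new;
        $agents{lwp} = sub { $ua->get($url) };
    }
    if (eval { require Furl; 1 }) {
        my $furl = Furl->new;
        $agents{Furl} = sub { $furl->get($url) };
    }
    if (eval { require HTTP::Lite; 1 }) {
        # A fresh object per request avoids having to reset state.
        $agents{lite} = sub { HTTP::Lite->new->request($url) };
    }
    return \%agents;
}

# Time $count iterations of each agent, then print the comparison chart.
sub compare_agents {
    my ($count, $agents) = @_;
    my $results = timethese($count, $agents);
    cmpthese($results);
    return $results;
}

# Example call (commented out; it needs a web server on localhost):
# compare_agents(10000, build_agents('http://localhost/'));
```

Keep in mind when reading the tables that Benchmark derives the @N/s rate from CPU seconds (usr + sys), not from wallclock time, so for network-bound runs the wallclock column is the better guide to real throughput.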

      LWP::UserAgent and LWP::Simple share the same underlying code, hence the nearly identical performance.
      Furl, Curl and HTTP::Lite achieve good performance for requests on localhost (test 1).
      In real-world conditions (test 2), all agents show a performance drop of at least 50%:
      LWP -70%, Furl -80%, HTTP::Lite -58% and Curl -55%
      Furl shows the most dramatic drop, but it is still faster than LWP. I think HTTP::Lite and Curl perform very well and deserve more recognition, especially HTTP::Lite.
      Note: the tests use only the GET method; with POST the picture may be different.
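The percentage drops quoted above can be recomputed from the per-agent rates in the two benchmark runs. A small sketch, using the @N/s figures from the timing lines; depending on rounding and on which of the two rate columns is used, the results should land within about a point of the quoted values.

```perl
use strict;
use warnings;

# Rates (requests/second) copied from the two benchmark runs above.
my %rate_local  = (lwp => 787, Furl => 3926, lite => 3240, curl => 3487);
my %rate_remote = (lwp => 231, Furl => 790,  lite => 1391, curl => 1580);

# Percentage change from $before to $after (negative means slower).
sub percent_change {
    my ($before, $after) = @_;
    return 100 * ($after - $before) / $before;
}

for my $agent (sort keys %rate_local) {
    printf "%-5s %+.1f%%\n", $agent,
        percent_change($rate_local{$agent}, $rate_remote{$agent});
}
```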

      Josef

Node Type: perlquestion [id://1061146]
Approved by keszler
Front-paged by Corion