PerlMonks  

parallelism v.python

by perl-diddler (Chaplain)
on Feb 28, 2014 at 18:53 UTC ( [id://1076596]=CUFP )

Someone was asking on a list why their program wasn't getting expected results (it had to do with "unit of measurement" inconsistencies). But to test, they'd written the program in python, and for 9 threads they got < 2 cores of utilization. I was curious how perl would do, so I tried to follow the python structure as much as possible.

First the programs, python:

    #!/usr/bin/python
    import operator
    import hashlib
    from threading import Thread

    def ticks_all():
        with open('/proc/stat') as f:
            cpu = f.readline().split()
        return (int(cpu[1]), int(cpu[3]))

    def ticks_process():
        with open('/proc/self/stat') as f:
            cpu = f.readline().split()
        return (int(cpu[13]), int(cpu[14]))

    def do_work():
        d = hashlib.md5()
        d.update('nobody inspects')
        for i in xrange(0, 10000000):
            d.update(' the spammish repetition')

    before_all_user, before_all_sys = ticks_all()
    before_process_user, before_process_sys = ticks_process()

    threads = []
    for i in xrange(0, 8):
        t = Thread(target=do_work)
        threads.append(t)
        t.start()

    for t in threads:
        t.join()

    after_process_user, after_process_sys = ticks_process()
    after_all_user, after_all_sys = ticks_all()

    print 'delta process: user:', after_process_user - before_process_user, 'sys:', after_process_sys - before_process_sys
    print 'delta all: user:', after_all_user - before_all_user, 'sys:', after_all_sys - before_all_sys

Then my attempt at a perl approximation (I don't really know python, so if anyone sees anywhere I booboo'd, feel free to politely point it out ;-).

    #!/usr/bin/perl
    use 5.16.0;
    use threads;

    sub open_for_read($) {
        open(my $handle, "<$_[0]") or die "opening $_[0]: $!";
        $handle
    }

    sub ticks_all {
        my $f = open_for_read("/proc/stat");
        return (split ' ', <$f>)[1,3]
    }

    sub ticks_process() {
        my $f = open_for_read("/proc/self/stat");
        return (split ' ', <$f>)[13,14]
    }

    sub dowork () {
        use Digest::MD5;
        my $d = Digest::MD5->new;
        $d->add('nobody inspects');
        $d->add(' the spammish repetition') for (0 .. 10_000_000)
    }

    my ($before_all_user, $before_all_sys) = ticks_all();
    my ($before_process_user, $before_process_sys) = ticks_process();

    my @threads;
    for my $i (0 .. 8) {
        my $t = threads->create(\&dowork);
        push @threads,$t
    }

    $_->join() foreach @threads;

    my ($after_all_user, $after_all_sys) = ticks_all();
    my ($after_process_user, $after_process_sys) = ticks_process();

    #(note: changing perl defaults for print)
    $, = " ";     #put spaces between output fields
    $\ = "\n";    #add LF to end of lines by default

    print 'delta process: user:', $after_process_user - $before_process_user,
        ' sys:', $after_process_sys - $before_process_sys;
    print 'delta all: user:', $after_all_user - $before_all_user,
        ' sys: ', $after_all_sys - $before_all_sys;

The results:

    > export TIMEFORMAT="%2Rsec %2Uusr %2Ssys (%P%% cpu)"
    > time python ticks.py
    delta process: user: 9263 sys: 2987
    delta all: user: 6034 sys: 2178
    67.35sec 92.64usr 29.89sys (181.94% cpu)
    > time perl /tmp/pticks
    delta process: user: 2917 sys: 3
    delta all: user: 2926 sys: 25
    3.36sec 29.20usr 0.03sys (870.05% cpu)
    ---
    For 9 threads:
    lang    #thrds  #coresuse  %efficiency
    python  9       1.82       20.2%
    perl    9       8.70       96.7%

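
The #coresuse and %efficiency columns follow from the `time` output: cores used is (user + sys) CPU seconds divided by elapsed seconds, and efficiency is that figure divided by the thread count. A minimal Python 3 sketch of that arithmetic, assuming 9 threads for both runs as the table states (the helper name `utilization` is illustrative, not from the original scripts):

```python
def utilization(real_s, user_s, sys_s, threads):
    """Return (cores_used, efficiency_percent) from `time`-style figures."""
    cores_used = (user_s + sys_s) / real_s
    efficiency = 100.0 * cores_used / threads
    return cores_used, efficiency

# Figures copied from the timing output in the post; 9 threads assumed for both.
for lang, real_s, user_s, sys_s in [("python", 67.35, 92.64, 29.89),
                                    ("perl", 3.36, 29.20, 0.03)]:
    cores, eff = utilization(real_s, user_s, sys_s, threads=9)
    print(f"{lang:6s} cores_used={cores:.2f} efficiency={eff:.1f}%")
```
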
I tried to keep the semantics as close to the python program as I could. I even used python-style indentation where practical (I did split the prints at the end... something python seems to have problems with...)

Replies are listed 'Best First'.
Re: parallelism v.python
by LanX (Saint) on Mar 01, 2014 at 13:54 UTC
Re: parallelism v.python
by oiskuu (Hermit) on Feb 28, 2014 at 21:54 UTC

    See python docs. I believe this is a problem with their hashlib implementation. (In python2.6 ?)

      Perhaps this is (part of) the cause?

      CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once
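
If the GIL is the bottleneck, one common workaround is to use processes instead of threads, so that each worker has its own interpreter and its own lock. The following is a minimal Python 3 sketch, assuming `multiprocessing.Pool` is an acceptable stand-in for `threading.Thread`; the function names, worker count, and small iteration count are illustrative and not from the original script. (The current hashlib documentation also says the GIL is released only for data larger than 2047 bytes, so the short strings hashed here would not benefit from that.)

```python
import hashlib
from multiprocessing import Pool

def do_work(iterations):
    """Hash the same short strings as the original test; return the hex digest."""
    d = hashlib.md5()
    d.update(b'nobody inspects')
    for _ in range(iterations):
        d.update(b' the spammish repetition')
    return d.hexdigest()

def run_in_processes(workers, iterations):
    """Run do_work in `workers` separate processes and collect the digests."""
    with Pool(processes=workers) as pool:
        return pool.map(do_work, [iterations] * workers)

if __name__ == '__main__':
    # Small count so the sketch finishes quickly; the original used 10000000.
    digests = run_in_processes(workers=4, iterations=1000)
    print(len(digests), "digests computed")
```

Whether this closes the gap with the perl numbers is untested here; it only shows the shape of a process-based variant.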

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Holy batpoo! Yeah, that might explain a bit...
        BTW, been thinking about this a bit...

        If it was just a matter of parallelism V non-parallelism, that would be 1 issue.

        But the 2nd big issue: Look at the total cpu time used:

        lang    #thrds  #cores_used  %efficiency  clocktime  cputime(Usr+Sys)
        python  9       1.82         20.2%        67.35s     122.53s
        perl    9       8.70         96.7%        3.36s      29.23s

        From the above, stated as a multiplier, perl is about 4.8x as efficient as python (96.7% vs. 20.2%) in making use of multicore resources.

        In wall-clock time, python takes 64 seconds longer than the 3.36s that perl needed. Python is about 1900% SLOWER (roughly 20x the elapsed time).

        Someone mentioned python might not be optimal in parallelism due to threading problems.

        So look at the actual amount of CPU time each used to do the work: 122.53s vs. 29.23s. For heavy number crunching, perl (with max precision possible in x86-64 HW) takes less than 1/4th the CPU time, i.e. perl is 4.2x the speed of python.
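
The ratios quoted in this reply can be re-derived from the table. A short Python 3 sketch of that arithmetic, using only the figures already posted (the variable names are illustrative):

```python
# Figures from the posted table: efficiency %, clock seconds, CPU seconds.
python_run = {"eff": 20.2, "clock": 67.35, "cpu": 122.53}
perl_run = {"eff": 96.7, "clock": 3.36, "cpu": 29.23}

eff_ratio = perl_run["eff"] / python_run["eff"]        # multicore efficiency ratio
extra_clock = python_run["clock"] - perl_run["clock"]  # extra wall-clock seconds
pct_slower = 100.0 * extra_clock / perl_run["clock"]   # percent slower, wall clock
cpu_ratio = python_run["cpu"] / perl_run["cpu"]        # total CPU time ratio

print(f"efficiency ratio: {eff_ratio:.2f}x")
print(f"extra wall-clock: {extra_clock:.2f}s ({pct_slower:.0f}% slower)")
print(f"CPU time ratio:   {cpu_ratio:.2f}x")
```

The last figure is the one least affected by threading, since it compares total CPU time rather than elapsed time.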
        ----

        (Notes collected for a response to the "said" note writer...)

Re: parallelism v.python
by BrowserUk (Patriarch) on Feb 28, 2014 at 19:56 UTC

    Whoops! Just noticed this is in CUFP! :)

    Is there a question here?


