http://www.perlmonks.org?node_id=11133179

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

I came across the MCE::Map utility and wanted to try it out.

Here's what I see...

testuser@linux-mint:~$ cat just_map.pl
use strict;
use warnings;
my @biglist = (1..10000000);
my @squared_big_list = map { $_ * $_ } @biglist;
testuser@linux-mint:~$ time perl just_map.pl
real    0m1.743s
user    0m1.496s
sys     0m0.248s
testuser@linux-mint:~$ cat mce_map.pl
use strict;
use warnings;
use MCE::Map;
my @biglist = (1..10000000);
my @squared_big_list = mce_map { $_ * $_ } @biglist;
testuser@linux-mint:~$ time perl mce_map.pl
real    0m5.401s
user    0m16.008s
sys     0m4.658s
testuser@linux-mint:~$

Why does it take longer with MCE::Map? I tried it with various versions of perl (5.16, 5.24, 5.32 and 5.34) and got similar results. Either I'm reading something wrong or my understanding is messed up.

Replies are listed 'Best First'.
Re: I thought MCE::Map is faster than usual map. What am I missing? (updated)
by LanX (Saint) on May 27, 2021 at 22:29 UTC
    Parallelization is always limited by the overhead for communication ...

    From the MCE::Map documentation:

    The time for mce_map below also includes the time for data exchanges between the manager and worker processes.

    Please try

    my @squared_big_list = mce_map_s { $_ * $_ } 1,10000000;

    Even faster is mce_map_s; useful when input data is a range of numbers. Workers generate sequences mathematically among themselves without any interaction from the manager process.

    (untested)

    Update: to elaborate more on what AnoMonk said here:

    $_ * $_ is so fast that even a slight communication overhead counts. The more complex the calculation, the less significant the communication becomes.

    It's the ratio between the two that matters.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
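
To make the compute-versus-communication ratio concrete, here is a minimal sketch using only core modules. It does not use MCE and does not reproduce how MCE moves data; a Storable freeze/thaw round trip is an assumed stand-in for the cost of shipping input and results between processes. The helpers `seconds_for` and `round_trip` and the "heavy" block are hypothetical names made up for the illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Time::HiRes qw(time);

# Hypothetical helper: wall-clock seconds taken by a code ref.
sub seconds_for {
    my ($code) = @_;
    my $start = time();
    $code->();
    return time() - $start;
}

# Assumed stand-in for inter-process transfer: serialize a list and read it back.
sub round_trip {
    my (@values) = @_;
    return @{ thaw(freeze(\@values)) };
}

my @input = (1 .. 200_000);

# The cheap block from the question and a deliberately heavier one.
my %blocks = (
    cheap => sub { $_[0] * $_[0] },
    heavy => sub { my $x = $_[0]; $x = sqrt($x * $x + $_) for 1 .. 50; $x },
);

for my $name (sort keys %blocks) {
    my $block = $blocks{$name};
    my @out;
    my $compute  = seconds_for(sub { @out = map { $block->($_) } @input });
    my $transfer = seconds_for(sub { round_trip(@input); round_trip(@out) });
    printf "%-5s compute %.3fs, transfer %.3fs\n", $name, $compute, $transfer;
}
```

The transfer time is roughly the same for both blocks, while the compute time is not. Only when the compute time dominates is there anything worth splitting across workers.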

Re: I thought MCE::Map is faster than usual map. What am I missing?
by Anonymous Monk on May 28, 2021 at 01:01 UTC

    I thought MCE::Map is faster than usual map. What am I missing? Either I'm reading something wrong or my understanding is messed up.

    Hi

    Why do you think MCE::Map is supposed to be faster?

    This is what the MCE::Map documentation says:

    This module provides a parallel map implementation via Many-Core Engine. MCE incurs a small overhead due to passing of data. A fast code block will run faster natively. However, the overhead will likely diminish as the complexity increases for the code. ...

    The description continues with code examples and timings like the ones you posted, showing how the speed changes...

Re: I thought MCE::Map is faster than usual map. What am I missing?
by perlfan (Vicar) on Jun 02, 2021 at 23:19 UTC
    MCE is cool, but it just adds a convenient communication layer on top of actual child OS perl processes spawned via fork. If you want real OS threads (what most of the non-Perl world considers threads), you'll need to look at something like Inline::C + OpenMP + Alien::OpenMP or PDL::ParallelCPU (pthreads).

    Given what I consider the semantics of map, you may also want to look at Parallel::ForkManager::Segmented, which gives some convenient sugar for chunking up work among child processes (on top of the awesome Parallel::ForkManager). I think it will reuse the child processes rather than ending them prematurely, thus saving the overhead of spawning new ones. Even if it doesn't maintain an actual pool, you're at least amortizing the start-up time over multiple items processed rather than spawning one new process per item.

    > I tried it with various versions of perl (5.16, 5.24, 5.32 and 5.34) and I get similar results

    This is because you're likely testing on the same hardware and OS, which means same forking overhead. It's not rocket surgery.
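
The chunking idea mentioned above can be sketched with nothing but core modules. This is not how Parallel::ForkManager::Segmented or MCE is implemented; it is a bare fork-and-pipe illustration of paying the process start-up cost once per chunk instead of once per item. The function name `chunked_parallel_map` is hypothetical, and results are passed back with Storable as an assumption of convenience.

```perl
use strict;
use warnings;
use POSIX ();
use Storable qw(freeze thaw);

# Hypothetical helper: split the items into contiguous chunks, fork one child
# per chunk, and return the results in input order.
sub chunked_parallel_map {
    my ($code, $workers, @items) = @_;
    return () unless @items;
    $workers = @items if $workers > @items;

    my $per_chunk = int((@items + $workers - 1) / $workers);   # ceiling division
    my @children;

    while (my @chunk = splice(@items, 0, $per_chunk)) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {                       # child: process one whole chunk
            close $reader;
            binmode $writer;
            my @out = map { $code->($_) } @chunk;
            print {$writer} freeze(\@out);
            close $writer;                     # flushes the pipe
            POSIX::_exit(0);                   # skip END blocks and parent buffers
        }

        close $writer;                         # parent keeps only the read end
        binmode $reader;
        push @children, [$pid, $reader];
    }

    my @results;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $frozen = do { local $/; <$reader> };   # read until the child closes
        close $reader;
        waitpid($pid, 0);
        die "child $pid returned no data" unless defined $frozen && length $frozen;
        push @results, @{ thaw($frozen) };
    }
    return @results;
}

my @squares = chunked_parallel_map(sub { $_[0] * $_[0] }, 4, 1 .. 20);
print "@squares\n";
```

With 4 workers and 20 items, each child handles 5 items, so there are 4 forks rather than 20. For a block as cheap as squaring, the fork and serialization costs will still outweigh the work itself, which is the point made throughout this thread.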