Re^2: Could there be a ThreadedMapReduce (instead of DistributedMapReduce)?

by tphyahoo (Vicar)
on Oct 20, 2006 at 11:23 UTC

in reply to Re: Could there be a ThreadedMapReduce (instead of DistributedMapReduce)?
in thread Could there be ThreadedMapReduce (and/or ForkedMapReduce) instead of DistributedMapReduce?

I originally started with Parallel::ForkManager (P::FM) as well, because it looked simple. But it uses forks, not threads, so it couldn't be applied to my problem.

(Specifically, concatenating a string in a distributed way, where order wasn't important; see "using parallel processing to concatenate a string, where order of concatenation doesn't matter".)

It didn't work, in a way that violated my expectations, because parallelization, like I said, is painful. In this case I would have needed to use threads.
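A minimal sketch of the kind of failure I mean, using plain fork from core Perl rather than P::FM itself (the variable name $result and the loop bounds are made up for illustration). Each forked child appends to its own private copy of the string, so the parent never sees any of the appends:

```perl
use strict;
use warnings;

# Hypothetical example: try to build a string from forked children.
my $result = '';

for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child process: this append changes only the child's copy.
        $result .= ":$n";
        exit 0;
    }
}

# Parent process: wait until every child has exited.
1 while wait() != -1;

# The parent's copy was never touched.
print "parent sees: '$result'\n";
```

This prints parent sees: '' because nothing a child writes to $result travels back to the parent. Getting the pieces back would need some explicit channel (pipes, files, and so on), which is exactly the kind of plumbing I was hoping an abstraction would hide.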

Perhaps there could be a ForkedMapReduce as well as a ThreadedMapReduce. The important thing to me is a reliable abstraction that Does What I Mean.

Re^3: Could there be a ThreadedMapReduce (instead of DistributedMapReduce)?
by Anonymous Monk on Oct 20, 2006 at 15:50 UTC
    The important thing to me is a reliable abstraction that Does What I Mean.

    What's wrong with this?

      It has locks.

      BrowserUK missed that on the first go, and had to add them later. So, it's not a clean abstraction. If ikegami hadn't stepped in, we might have had a subtle, hard-to-find bug. (There are several layers of dialogue between these two about that, and these are smart people, so that's what I mean about parallelism being very hard to wrap your mind around.)

      Here, it seems like not such a big deal. But the code that I wanted to parallelize was also an artificially simple example.

      The point is, you couldn't just cut and paste code that was executing serially, change it in one place, and have it execute in parallel. Close maybe, but not quite. But what if my example hadn't been so simple? What if there had been two loops I wanted to process in parallel; what if they interacted somehow? Pain.

      That said, I am giving this code a careful study to see if I can apply it to the threadedreduce stub code I am working on at Re^2: Could there be a ThreadedMapReduce (instead of DistributedMapReduce)?

        Is this too complicated for you?

        #! perl -slw
        use strict;
        use threads;
        use threads::shared::Scalar;

        my $shared = threads::shared::Scalar->new;

        for ( 1.. 10 ) {
            async {
                sleep 1 + rand 3;
                $shared->value .= ":$_";
            };
        }

        $_->join for threads->list;

        print $shared->value;

        __END__
        C:\test>579015
        :1:2:3:5:6:9:4:7:8:10

        C:\test>579015
        :1:2:6:9:3:5:7:10:4:8

        C:\test>579015
        :4:5:6:7:8:9:3:1:2:10

        Update: A slightly simpler version:

        #! perl -slw
        use strict;
        use threads::shared::Scalar;

        our $T ||= 10;
        our $N ||= 1000;

        my $shared = threads::shared::Scalar->new;

        for( 1 .. $T ) {
            async{
                for( 1 .. $N ) {
                    $shared->value++;
                }
            };
        }

        waitall;

        print $shared->value;

        __END__
        C:\test>sharedScalar.plt -T=1000 -N=1000
        1000000

        C:\test>sharedScalar.plt -T=500 -N=1000
        500000

        C:\test>sharedScalar.plt -T=500 -N=12345
        6172500

        C:\test>sharedScalar.plt -T=123 -N=10
        1230
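For comparison, here is a rough sketch of the same string concatenation written against the core threads and threads::shared modules only, with the lock spelled out. It assumes a perl built with ithreads support. The names $shared and @workers are illustrative, and this is not meant as the code either poster actually ran:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# One string visible to every thread.
my $shared :shared = '';

my @workers = map {
    my $n = $_;
    threads->create(sub {
        # Take the lock before the read-modify-write;
        # it is released when this block is left.
        lock($shared);
        $shared .= ":$n";
    });
} 1 .. 10;

# Wait for every worker before reading the result.
$_->join for @workers;

print "$shared\n";
```

The order of the pieces can differ from run to run, but each of 1 .. 10 should appear exactly once. The explicit lock() call is the part that is easy to forget, which is the complaint about locks made earlier in the thread.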

