PerlMonks

The State of Parallel Computing in perl 2007?

by jettero (Prior)

on Jan 21, 2007 at 15:20 UTC (#595771=perlquestion)
jettero has asked for the wisdom of the Perl Monks concerning the following question:

What are people using for parallel packages these days?

I've been daydreaming for some time about a miniature OS that runs in the systray on Win32 and as a daemon on Linux... You know, one that runs perl and has some kind of shared filesystem and/or shared memory.

It occurred to me recently that it might already exist. I looked at Parallel::Pvm a little and it seems to go the right direction perhaps. jcwren talks about parallelism a little here: Parallel Processing, Processes, and Threads. I found one article here that mentioned Parallel::Pvm and some other system that seemed abandoned/forgotten (but it wasn't perl based). I can't find that article now...

I found some offsite things (eg, parawiki and Parallel_computing), but I was looking for perl specific things — all the nodes here that talk about it seem to be several years old — or at least, I don't know how to look for the newer ones.

At one point, I had hoped POE would help — and for all I know, it does; but it seemed woefully single threaded to me.

Again, what are people using for parallel packages these days? I have a sneaking suspicion it hasn't changed all that much since the older posts I've found. That, or everyone just uses threads and/or message passing by hand maybe?

UPDATES and CLARIFICATIONS:

  1. I intentionally didn't say what I meant by parallel because I'm interested in any links people have. Personally, I'm mostly interested in multi-computer scenarios, but multi-processor scenarios would be interesting to read about also.

-Paul

Re: The State of Parallel Computing in perl 2007?
by zentara (Chancellor) on Jan 21, 2007 at 17:04 UTC
    Are you talking about multiple processors working on a single problem, or just processes running in parallel, sharing data somehow? It's a big topic, and you need to narrow down what Parallel means.

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
[reply]
Re: The State of Parallel Computing in perl 2007?
by diotalevi (Canon) on Jan 21, 2007 at 18:28 UTC
[reply]

      If your interest in Mozart/Oz is due to its distributed and parallel computing aspects, you might also find Erlang interesting if you haven't already encountered it.

      I find the Erlang cui repl preferable to the Oz emacs-based interface, but if you like emacs that will be less of a consideration.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
[reply]

        Tying the language to an editor has almost completely ruled it out for me. What an odd thing to do. It does still sound pretty interesting though. I'm installing erlang presently. I've heard people mention it before.

        I feel like I'm taking away a "perl doesn't really have much of this yet" feeling from the two posts above this though. Is that the case?

        -Paul

[reply]

        No, my interest in Mozart comes from its integrated constraint solvers. The distributed computation stuff is just gravy.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

[reply]

        An attractive talk was just posted to Lambda the Ultimate about Erlang and concurrency: LCA2007: Concurrency and Erlang.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

[reply]
Re: The State of Parallel Computing in perl 2007?
by diotalevi (Canon) on Jan 21, 2007 at 23:33 UTC
[reply]
Re: The State of Parallel Computing in perl 2007?
by toma (Vicar) on Jan 22, 2007 at 07:42 UTC
    I don't know if you would consider it parallel programming or not, but I use memcached to get more than one computer into the act. There are several perl modules that use it.

    Unlike many things in parallel programming, memcached is easy.

    POE::Wheel::Run will allow you to use multiple processes, which should provide parallelism on a multiprocessor machine. To use it in windows, I used Cygwin since POE::Wheel::Run required the more full-featured fork/exec.

    Another easy way to do parallel computing is to use a web server. One program can make requests to multiple servers, or a single server with multiple CPUs, and get them all working on parts of a problem. You can use POE to control the flow of making multiple web requests and synching back up when they return.

    It should work perfectly the first time! - toma
[reply]

      I saw in the POE docs that there were references to needing a serializer. I didn't get far enough with it to see that you can fork. Does it support load balancing and things? I've got the POE::Wheel::Run docs up presently, and I'm seeing that it forks a child process, but I thought it was for things like spawning 'cat' or 'ls' or whatever...

      That probably isn't what I have in mind, but I do still wonder if POE has some kind of built in shared memory multi-processor and/or multi-computer features. It seems like it should.

      -Paul

[reply]
        I use POE::Wheel::Run to spawn four Perl programs.
        1. A web server built from HTTP::Daemon. This provides a browser-based GUI. This web server also spawns programs that can get content from other web servers.
        2. A live link to a large CAD program.
        3. A live link to a large circuit simulator.
        4. A terminal that provides user messages and a command line, for development and for cases where the GUI doesn't have deep enough functionality.
        I hadn't thought about this program as parallel processing until I saw your question. I have only recently begun running the application on multi-cpu machines.

        The program uses message passing through several mechanisms:

        • STDIN, STDOUT, and STDERR of child processes.
        • Dropping files. The CAD package uses this for input.
        • Web calls. I recently switched from LWP to curl because of deployment difficulties. I had trouble getting my installer to automate the configuration of LWP.
        • Environment variables are used to send parameters into child programs. I had trouble with platform differences in the handling of command-line arguments. This is possibly due to differences in quoting and escaping.
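
        Two of the mechanisms in the list above (talking to a child over STDIN/STDOUT, and passing parameters through the environment) can be sketched with core Perl alone. This is only an illustration, not the application described here: it uses IPC::Open2 rather than POE::Wheel::Run, the variable name WORKER_SCALE is made up, and the child is a one-line Perl program started via $^X.

```perl
use strict;
use warnings;
use IPC::Open2;

# Pass the parameter through the environment instead of the command line,
# which avoids quoting and escaping differences between platforms.
# WORKER_SCALE is a hypothetical name chosen for this sketch.
$ENV{WORKER_SCALE} = 3;

# Child: read one number per line on STDIN, print it scaled on STDOUT.
# -n loops over input lines, -l strips and re-adds the newline.
my $pid = open2(
    my $from_child, my $to_child,
    $^X, '-lne', 'print $_ * $ENV{WORKER_SCALE}',
);

# Send the work items, then close the pipe so the child sees end-of-file.
print {$to_child} "$_\n" for 1 .. 4;
close $to_child;

# Collect the replies and reap the child.
chomp(my @results = <$from_child>);
close $from_child;
waitpid $pid, 0;

print "results: @results\n";    # results: 3 6 9 12
```

        With only a few short lines in flight, writing everything before reading is safe. For larger volumes the parent would need to interleave reads and writes (or use an event loop such as POE) to avoid both sides blocking on full pipes.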
        It should work perfectly the first time! - toma
[reply]
      You use memcached as a database? Not a good idea. Use a database for that. Memcached doesn't consider it a problem to drop your data silently if it runs out of free RAM. That's typical for a cache.
[reply]
        No, I don't think I mentioned using memcached as a database. Memcached allows me to use multiple machines to cache data in a transparent manner. It allows me to use more, cheaper machines rather than one big expensive machine. I have heard people describe this tactic as "build out, not up."

        A common use case is to cache the results of an SQL query. I use the query as the cache key and the value is the result set from the query. I check the cache to see if the dataset is there. If it is, I get it from the cache. If it isn't, I run the SQL and put the results in the cache. This provides me with a huge speedup.

        If the data in the database gets updated, I flush the cache and start again. This is not a problem for parts of my application, so those are the parts where I use memcached. Instead of storing a lot of data in a Perl data structure in a mod_perl program and counting on the copy-on-write mechanism to save RAM, I use memcached.
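
        The check-the-cache, fall-back-to-SQL, flush-on-update pattern described above could look roughly like the following. It is a sketch under assumptions: My::FakeCache is a hypothetical in-memory stand-in so the code runs without a memcached server, and run_sql is a placeholder for the real database call. The stand-in mirrors the get, set and flush_all methods of Cache::Memcached, so a real client object could be swapped in. Because memcached keys are limited to 250 bytes and may not contain whitespace, the query text is hashed with Digest::MD5 instead of being used as the key directly.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical in-memory stand-in for a memcached client.
package My::FakeCache;
sub new       { my ($class) = @_; return bless { store => {} }, $class }
sub get       { my ($self, $key) = @_; return $self->{store}{$key} }
sub set       { my ($self, $key, $value) = @_; $self->{store}{$key} = $value; return 1 }
sub flush_all { my ($self) = @_; %{ $self->{store} } = (); return 1 }

package main;

my $cache    = My::FakeCache->new;
my $db_calls = 0;

# Placeholder for the real database query; counts how often it is hit.
sub run_sql {
    my ($sql) = @_;
    $db_calls++;
    return [ [ 1, 'alice' ], [ 2, 'bob' ] ];
}

# Return the cached result set if present, otherwise query and cache it.
sub cached_query {
    my ($sql) = @_;
    my $key  = 'sql:' . md5_hex($sql);
    my $rows = $cache->get($key);
    return $rows if defined $rows;

    $rows = run_sql($sql);
    $cache->set($key, $rows);
    return $rows;
}

# Three identical queries hit the database only once.
cached_query('SELECT id, name FROM users') for 1 .. 3;
print "database calls: $db_calls\n";              # database calls: 1

# After the underlying data changes, flush and start again.
$cache->flush_all;
cached_query('SELECT id, name FROM users');
print "database calls after flush: $db_calls\n";  # database calls after flush: 2
```

        Since a cache may drop entries at any time, the code never assumes a value is present: a miss simply costs one extra trip to the database.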

        It should work perfectly the first time! - toma
[reply]

        memcached is just about the coolest thing ever btw. It's on the top of my list of things to learn next. UPDATE: oops, wrong parent. Eh.

        -Paul

[reply]
Re: The State of Parallel Computing in perl 2007?
by markatwork (Initiate) on Jan 22, 2007 at 12:07 UTC
    From the realms of "I saw this once and I thought it looked interesting", rather than being anything I've actually used,
    WSRF::Lite at
    http://www.sve.man.ac.uk/Research/AtoZ/ILCT
    looks interesting.

    Googling for 'perl grid' brings back a few links that seem to concern the areas you're looking at.

    Regards Mark
[reply]
Re: The State of Parallel Computing in perl 2007?
by moklevat (Priest) on Jan 22, 2007 at 15:35 UTC
    Hi jettero,

    I occasionally work on "trivially" or "embarrassingly" parallelizable problems. For me this most often involves computing a single statistic from a dataset over a large combination of parameters. The most efficient solution for me has been to use R with the Rmpi package to interface with MPI. Depending on the scope of the task, I may also use MySQL for distributing the dataset and collecting the results using RMySQL.

    I could see doing the same thing in perl with PDL, Parallel::MPI, and your favorite database.
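
    For a sense of what "embarrassingly parallel" means in practice, here is a minimal single-machine sketch in core Perl. It does not use MPI, PDL or a database: it swaps in plain fork and pipes, with one child per parameter value, and the dataset and statistic (the mean of x**p) are invented for illustration. It assumes a Unix-like system where fork is native.

```perl
use strict;
use warnings;

my @data      = (2, 4, 4, 4, 5, 5, 7, 9);   # made-up dataset
my @exponents = (1, 2, 3);                  # made-up parameter grid

# The statistic computed for each parameter: mean of x ** p.
sub statistic {
    my ($p, @x) = @_;
    my $sum = 0;
    $sum += $_ ** $p for @x;
    return $sum / @x;
}

# Start one child per parameter; each reports back over its own pipe.
my %reader_for;
for my $p (@exponents) {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                        # child process
        close $reader;
        print {$writer} statistic($p, @data), "\n";
        close $writer;
        exit 0;
    }

    close $writer;                          # parent keeps only the read end
    $reader_for{$p} = $reader;
}

# Gather one line of output from every child, then reap them all.
my %result;
for my $p (@exponents) {
    my $line = readline $reader_for{$p};
    chomp $line;
    $result{$p} = $line;
    close $reader_for{$p};
}
1 while wait() != -1;

printf "p=%d mean=%s\n", $_, $result{$_} for @exponents;
```

    No child needs anything from any other child, which is what makes the problem trivially parallel. Spreading the same work over several machines is where MPI, PVM or a shared database would come in.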

[reply]
      I clicked through to ::MPI a little, but the low version number and update from 1999 kinda scare me off. I have looked at PDL enough to wish I had columns of numbers to process.

      -Paul

[reply]
        I initially had to choose between PVM and MPI, and I ended up using MPI only because that was the first thing I tried and it happened to work for me. From what I had read at the time, PVM should work just as well as MPI for trivially parallelizable tasks. I would not guess that the MPI module is so trivial that it did not warrant any changes, but it does look like the PVM module has seen more development activity.
[reply]
Re: The State of Parallel Computing in perl 2007?
by erix (Curate) on Feb 18, 2007 at 17:11 UTC

    As a multi-computer scenario, Condor might be interesting for you.

    Condor lets you submit a program/batchfile/shellscript to a queue of many machines (nodes). Every one of these nodes needs to have a condor client installed. The condor client advertises the resources that that particular machine has on offer. This information is then used to match your job requirements to any number of machines. Advertised attributes are things like: CPU-type, OS-type, Amount of memory, free disk space, etc. If enough clients are available, your jobs will run simultaneously.
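
    The matching step can be pictured with a toy example. The Perl below is not Condor's configuration or matchmaking syntax, only an illustration of the idea: every machine name, attribute name and number in it is invented.

```perl
use strict;
use warnings;

# Invented "advertisements": what each node says it has on offer.
my @machines = (
    { name => 'node1', os => 'LINUX',   memory_mb => 512,  disk_mb => 2000 },
    { name => 'node2', os => 'WINDOWS', memory_mb => 2048, disk_mb => 8000 },
    { name => 'node3', os => 'LINUX',   memory_mb => 2048, disk_mb => 500  },
    { name => 'node4', os => 'LINUX',   memory_mb => 4096, disk_mb => 9000 },
);

# Invented job requirements.
my %job = ( os => 'LINUX', min_memory_mb => 1024, min_disk_mb => 1000 );

# A machine is eligible only if it satisfies every requirement.
my @matches = grep {
       $_->{os}        eq $job{os}
    && $_->{memory_mb} >= $job{min_memory_mb}
    && $_->{disk_mb}   >= $job{min_disk_mb}
} @machines;

print "eligible: ", join(', ', map { $_->{name} } @matches), "\n";   # eligible: node4
```

    With more eligible machines than this, several copies of a job could be handed out at once, which is where the parallelism comes from.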

    Condor can use dedicated machines, or take advantage of idle clients: running only at designated times (at night, for instance) or monitoring machine activity and kicking in after some idle period.

    Obviously, because clients need to be installed on all machines, it needs some organisation (=politics) to get authorization to run your programs on a sizable group of machines.

[reply]
Re: The State of Parallel Computing in perl 2007?
by casiano (Pilgrim) on May 22, 2008 at 12:35 UTC
    If you have several UNIX platforms with Perl installed and SSH access, then you can use GRID::Machine to have Perl interpreters running on those nodes and make them collaborate. The best part is that you don't have to ask administrators to install any additional software.

    I have written a tutorial (GRID::Machine::perlparintro) that through a simple example introduces how to use Perl via GRID::Machine to exploit the computing power of idle workstations.

    Hope it Helps

    Casiano

[reply]
