
Re: Run a script in parallel mode

by sundialsvc4 (Abbot)
on May 26, 2015 at 18:01 UTC ( [id://1127866]=note )


in reply to Run a script in parallel mode

I would also caution you to first test your ruling assumption: that 20 processes running in parallel will actually complete the total job faster. I am not so sure.

In fact, I don’t believe it.

You see, you talk about “a big file.” That means I/O, and therefore a procedure that is likely to be fundamentally I/O-bound. The completion time of the procedure probably won’t be bound by the speed of the CPU, nor by the availability of cores. Instead, it will be bound by how fast the I/O subsystem can move data into and out of the computer’s memory. (As a simple test, run the time command on the existing Java program, and compare the wall time, reported as “real,” to the CPU time, which is “user” plus “sys.” I’ll wager that the CPU time is much smaller. That would mean that the process spends most of its time waiting for I/O operations to complete.)
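For anyone who would rather script that wall-time versus CPU-time comparison than read it off the time command, here is a minimal sketch. It is in Python rather than Perl, it assumes a Unix-like system (the resource module is not available on Windows), and the function name wall_and_cpu is just an illustrative choice.

```python
import resource
import subprocess
import time


def wall_and_cpu(cmd):
    """Run cmd to completion; return (wall_seconds, cpu_seconds) for that child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # blocks until the child exits
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time = user + system time charged to the child that just finished.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


# Hypothetical usage; the program and file names are placeholders:
#   wall, cpu = wall_and_cpu(["java", "YourProgram", "bigfile.dat"])
#   print(f"wall {wall:.2f}s  cpu {cpu:.2f}s  ratio {cpu / wall:.0%}")
```

A ratio well below 100% suggests the program spends most of its time waiting rather than computing, which is the I/O-bound case described above.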

The simplest test would be to do this: open five shell windows, make five identical copies of your test file, start the same program in each window on its own copy, and start your stopwatch. If you discover that all five instances, running in parallel, finish in about the time that a single instance takes when run alone (rather than the time of five runs back to back), then it might be profitable to pursue (and to implement) your theory. (As a further test, split the file into five pieces, by whatever means, and run the test again. If parallelism really helps, all five pieces, running in parallel, should complete in roughly one-fifth of the time that the whole file takes in a single run.)
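The stopwatch test can also be scripted. The following is only a sketch, in Python, of one way to do it: it starts a list of commands at the same moment and reports how long it takes for all of them to finish. The name time_parallel and the file names in the usage comment are assumptions for illustration.

```python
import subprocess
import time


def time_parallel(commands):
    """Start every command at once, wait for all, return elapsed wall seconds."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for cmd in commands]
    codes = [proc.wait() for proc in procs]  # wait for every process
    elapsed = time.perf_counter() - start
    if any(code != 0 for code in codes):
        raise RuntimeError(f"some commands failed, exit codes: {codes}")
    return elapsed


# Hypothetical usage; program and file names are placeholders:
#   single = time_parallel([["java", "YourProgram", "copy1.dat"]])
#   five = time_parallel([["java", "YourProgram", f"copy{i}.dat"]
#                         for i in range(1, 6)])
#   print(f"five in parallel took {five / single:.1f}x one run alone")
```

A result near 1x says that the parallel runs did not slow each other down; a result near 5x says that they simply queued up behind the same bottleneck.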

If you don’t clearly see such results, and I predict that you will not, then save your effort. The odds are that it would not be profitably spent, IMHO, and if that is the case, it is better to find out sooner rather than later.

Replies are listed 'Best First'.
Re^2: Run a script in parallel mode
by marioroy (Prior) on Jun 02, 2015 at 23:42 UTC

    MCE applies "graceful" IO while reading input. Only a single worker reads at any given time. This allows for sequential IO which is typically faster than random IO, especially for mechanical drives.
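    To illustrate the idea of only one worker reading at a time, here is a rough Python sketch. It is not MCE's implementation and does not use MCE; the names process_in_chunks and handle_chunk are made up for the example. Workers take turns pulling the next chunk of lines under a lock, so the file is consumed sequentially, and each worker then processes its chunk outside the lock.

```python
import itertools
import threading


def process_in_chunks(path, handle_chunk, workers=4, chunk_lines=1000):
    """Workers take turns reading consecutive chunks; only one reads at a time."""
    read_lock = threading.Lock()
    results_lock = threading.Lock()
    results = []  # completion order, not file order

    with open(path) as fh:
        def worker():
            while True:
                with read_lock:  # serialize reads: one reader at any moment
                    chunk = list(itertools.islice(fh, chunk_lines))
                if not chunk:  # end of file reached
                    return
                out = handle_chunk(chunk)  # processing happens outside the lock
                with results_lock:
                    results.append(out)

        threads = [threading.Thread(target=worker) for _ in range(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return results


# Hypothetical usage: count lines per chunk in a placeholder file.
#   counts = process_in_chunks("bigfile.dat", len)
```

    Python threads will not speed up CPU-heavy work, so treat this purely as a picture of the read pattern, not as a benchmark.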
