go ahead... be a heretic PerlMonks

### Re^3: randomising file order returned by File::Find

by BrowserUk (Pope) on Mar 01, 2011 at 22:23 UTC (#890848)

The downside of that mechanism is control. If, as the OP says later, the need to suspend or terminate the processing early arises, then you're stuck with starting the whole process over from scratch. Same thing if the number of workers varies up or down.

With the server/clients approach, you can pause and restart the clients, or knock out half of them--or double them--and the processing continues without duplication, automatically redistributing itself to accommodate the changes.
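
A minimal sketch of that idea, written in Python rather than Perl purely for illustration. Everything named here (`Dispenser`, `client`, `run_clients`, the file names, the `limit` parameter) is hypothetical; nothing in this thread specifies how the server actually hands out work. The assumption is simply that one server owns the list of files and each client asks for the next one when it is ready:

```python
import queue
import threading


class Dispenser:
    """Server side: owns the pending files and hands them out one at a time."""

    def __init__(self, files):
        self.pending = queue.Queue()
        for name in files:
            self.pending.put(name)
        self.done = []                  # files whose processing has finished
        self.lock = threading.Lock()

    def next_file(self):
        """Return the next unprocessed file, or None when nothing is left."""
        try:
            return self.pending.get_nowait()
        except queue.Empty:
            return None

    def mark_done(self, name):
        with self.lock:
            self.done.append(name)


def client(dispenser, limit=None):
    """Client side: pull files until none remain or `limit` files are handled."""
    handled = 0
    while limit is None or handled < limit:
        name = dispenser.next_file()
        if name is None:
            break
        # ... real processing of `name` would happen here ...
        dispenser.mark_done(name)
        handled += 1


def run_clients(dispenser, count, limit=None):
    """Start `count` clients and wait for all of them to exit."""
    threads = [threading.Thread(target=client, args=(dispenser, limit))
               for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


files = [f"file{i:03d}" for i in range(100)]
d = Dispenser(files)

run_clients(d, count=4, limit=5)    # 4 clients that each stop early: a pause
halfway = len(d.done)

run_clients(d, count=8)             # resume later with double the clients
print(halfway, len(d.done), sorted(d.done) == files)
```

Because the clients never decide for themselves which files are theirs, stopping them, or changing how many there are, does not disturb what has already been done. The sketch does not cover a client dying mid-file; a real server would have to track in-flight files and hand them out again.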

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^4: randomising file order returned by File::Find
by jeffa (Bishop) on Mar 01, 2011 at 22:27 UTC

"... then you're stuck with starting the whole process over from scratch."

True ... but Hadoop scales linearly, meaning what used to take multiple hours or days to run now only takes a few hours, maybe even a few minutes. Such termination becomes trivial. I do not know how familiar you are with Hadoop/cloud computing.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---

"True ... but Hadoop scales linearly, meaning what used to take multiple hours or days to run now only takes a few hours, maybe even a few minutes."

So does the server/clients scheme. The difference is in the level of control.

"Such termination becomes trivial."

For some types of processing. For other types, the cost of throwing away the results of a job when it is 99% complete and starting over can be very high.
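
To put rough numbers on that, here is a back-of-the-envelope comparison in Python. The figures (1,000,000 files, 8 workers, killed at 99%) are invented for illustration and do not come from this thread, and the second function assumes per-file results are saved as each file completes, so only work in flight is lost:

```python
def redone_after_full_restart(total_items, percent_done):
    """Items whose finished work is thrown away when the whole job starts over."""
    return total_items * percent_done // 100


def redone_with_work_queue(workers):
    """Upper bound on items redone when only in-flight work is lost:
    at most one partially processed item per worker."""
    return workers


# Hypothetical figures, not taken from the thread.
total_items, percent_done, workers = 1_000_000, 99, 8
print(redone_after_full_restart(total_items, percent_done))  # 990000
print(redone_with_work_queue(workers))                       # 8
```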

"I do not know how familiar you are with Hadoop/cloud computing."

Not so much. But it isn't so different with stuff I was doing 15 years ago on a server farm.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

"So does the server/clients scheme. The difference is in the level of control."

Right, with Hadoop, all that hard work is done for you. Why roll another wheel?

"Not so much. But it isn't so different with stuff I was doing 15 years ago on a server farm."

Except that now the hardware can (realistically) support the volumes of data being processed. You should check it out.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---

