How about we have a bet on whether clusters are going away?
I didn't say or imply that "clusters were going away".
Only that the bar at which clusters need to be resorted to will be raised: those who currently have to use smaller (4- to 32-way) clusters to get their work done will soon no longer need to deal with the latency, restricted bandwidth and topological problems involved with clusters, nor the complexity and expense of cluster-based software solutions, because they'll be able to use simpler, cheaper, cluster-in-a-box solutions.
Google, arguably the biggest commercial users of clusters, also use commodity hardware. Will Google still be using clusters in 2010? Of course. But what say you that:
- Instead of their clusters averaging 2000 commodity PCs, they use 256 commodity multi-CPU machines?
- And that by making that transition, the "locality optimisation" they employ in their clusters to conserve bandwidth gets the huge boost that most data read and written by each of their cluster workers is not just on the local hard disk, but 'local' in memory?
- And the current chunk size of 64MB used by their GFS chunk servers becomes (say) 256MB or bigger?
- And their throughput grows accordingly, because of the reduction in the frequency with which data needs to be transported between machines?
Will Google move to using threads? Consider the possibilities.
For each given MapReduce job, they currently deploy M map tasks and R reduce tasks (where M is usually some multiple of R), each living on a different machine within a (~2000-machine) cluster. The intermediate outputs from the M map tasks are written/replicated to the local disks of two or three chunk servers within the same cluster. Each of the reduce tasks then reads these intermediate results from one or other of those chunk servers, processes them, and writes/replicates its results to two or three other chunk servers.
Now, imagine if each group of 1 map task + N reduce tasks all ran within the same machine? Instead of each piece of intermediate data making 6 network transports, those reads and writes can benefit from the localisation optimisation that Google already use. That reduces bandwidth consumption immediately, and by quite a large factor.
Now further imagine that instead of 1 map task and N reduce tasks per machine reading and writing to the local hard disk, you instead deploy 1 map thread and N reduce threads per machine. Now there is no need for the intermediate data to leave RAM at all.
You've gone from 6 cross-network transfers for each piece of intermediate data to 1 read and 1 write from local memory. How would that affect performance?
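To make the shape of that argument concrete, here is a minimal sketch (in Python, purely illustrative; Google's actual MapReduce implementation is C++ and far more involved) of 1 map thread feeding N reduce threads through in-memory queues. The `run_mapreduce` function, the word-count example, and the hash-based partitioning are all my own assumptions for illustration; the point is simply that the intermediate key/value pairs never touch a disk or a network.

```python
import threading
import queue
from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn, n_reducers=4):
    """One map thread + N reduce threads in a single process.
    Intermediate data passes through in-memory queues only."""
    queues = [queue.Queue() for _ in range(n_reducers)]
    partials = [defaultdict(list) for _ in range(n_reducers)]

    def mapper():
        for record in records:
            for key, value in map_fn(record):
                # Partition intermediate pairs by key hash,
                # analogous to partitioning work across reduce tasks.
                queues[hash(key) % n_reducers].put((key, value))
        for q in queues:
            q.put(None)  # sentinel: no more input for this reducer

    def reducer(i):
        while True:
            item = queues[i].get()
            if item is None:
                break
            key, value = item
            partials[i][key].append(value)

    threads = [threading.Thread(target=mapper)]
    threads += [threading.Thread(target=reducer, args=(i,))
                for i in range(n_reducers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Apply the reduce function to each key's collected values.
    return {k: reduce_fn(k, vs)
            for p in partials for k, vs in p.items()}

# Word count, the canonical MapReduce example.
lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = run_mapreduce(
    lines,
    map_fn=lambda line: [(w, 1) for w in line.split()],
    reduce_fn=lambda k, vs: sum(vs))
print(counts["the"])  # 3
print(counts["fox"])  # 2
```

In a real deployment each machine would run this whole pipeline over its local chunk of input; only the final, much smaller, reduced results would need to cross the network.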
And another big argument against multi-threading is that it is hard to do. We have enough trouble finding people who can program semi-competently.
I really did lose you right at the top of the OP, didn't I? Had you read on, you would have realised that about 70% of my post was spent stating (in rather more detail) the difficulties that currently prevent threaded code from being written and deployed. It then went on to suggest that there is a solution, but since you're dead set against threading, I won't bore you further by repeating it here.
A final note. Computing did not begin or end with the PC.
I'm well aware of that. I've lived and worked through it. My first programs were written to run on a DEC-10 running TOPS. My college code ran mostly on a PDP-11/45. My first database project was on clustered (twinned) PDP-11/60s. The first commercial project I independently architected ran on a BBC Micro using 6502 machine code. My first interpreted language was REXX running under CMS over VM/370 XA on an IBM mainframe. Fully half my experience is writing and architecting software that runs on machines other than PCs: from embedded systems on microcomputers; to database work on minis; to Big Stuff on Big Iron.
From e-commerce (when it was still called EDP); through scientific work using images to visualise huge quantities of data; through database work deploying and retrieving literally millions of paper (OMR) university entrance examination papers trans-nationally across the breadth of 6 entire West African countries (3 jumbo jets full of paper in either direction), processing and collating the information into another jumbo jet full of paper reports in 3 weeks. And much more.
You (and merlyn) rail on about your respective depths of experience, but from my perspective, based upon the experience you have outlined here, you both have fewer years than me, and far narrower commercial experience. So please, stop trying to 'put me in my place' with your knowledge and depth of experience.
But just for grins, even the latest supercomputers are PCs. At least in name :)
Re^7: Parrot, threads & fears for the future.
by tilly (Archbishop) on Oct 25, 2006 at 00:46 UTC