One of the things the people at Google noticed is, that a lot of their problems amount to process a lot of input data to compute some derived data from that input data. For example, they process the ca. 4 billion web pages and compute an index from them. These input data are very diverse: document records, log files, on-disk data structures, etc. and require lots of CPU time. They have the infrastructure to deal with it, but they wanted a framework for automatic en efficient distribution and parallelization of the jobs across their clusters. Even better if it provided fault-tolerance and scheduled I/O. Status monitoring would also be nice.

So over the last year, they devised MapReduce- inspired by the map and reduce primitives present in the functional language Lisp. Most of their operations involved applying a map operation to each logical "record" in their input in order to compute a set of intermediate key/value pairs, and then applying a reduce operation to all the values that shared the same key, in order to combine the derived data appropriately.

More information can be read here, or be seen in this video (some 23 minutes into the presentation).

I don't know in which programming language MapReduce was written, but could you write the basic functions in Perl?

In reply to Google's MapReduce by BioGeek

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":