Perl: the Markov chain saw  
PerlMonks 
Two handy tools for nested arrays: mapat and transposeby tmoertel (Chaplain) 
on Dec 07, 2004 at 01:25 UTC ( #412800=perlmeditation: print w/replies, xml )  Need Help?? 
Here is a fun and flexible way to solve many problems that involve
nested array data, such as the problem posed by knirirr in
"Mean and standard deviation amongst multiple matrices." In this meditation, we will create two small helper
functions that let us transform and restructure array data. Together,
they make it easy to solve problems like the one posed in the
referenced post.
The problemThe problem, in short, is that we have an array of matrices and we want to apply a function "matrixwise" across the array. For example, let us consider the twomatrix case: Our goal is to compute the following matrix from $mats: for some given function f. (In the case of the knirirr's problem, as I understand it, f will compute the mean and standard deviation of its arguments.) We can think about the array of matrices $mats as itself a 3dimensional matrix. Its dimensions, most significant first, are (m,r,c), where m gives the position of the submatrix we want, r gives the row, and c gives the column:
For example, the value at coordinate (0,0,0) is 1; and the value at (1,0,0) is 5. A common solutionWith this representation in mind, we can arrive at the following, typical approach to a solution: To see that it works, let us create a few functions to help us examine the results: Now we can see that the nestedforloop solution does indeed compute the desired result:
One problem with the nestedfor solution, however, is that it is hardcoded for the specifics of this particular, 3dimensional case. Another problem is that the code has a high density of array operations, providing many opportunities to mix up index variables or make similar coding mistakes. Nesteddata helpersIn daytoday coding, we frequently process arrays, and arrays of arrays, and AoAoAs, and similar structures. It might therefore be worthwhile to invest some time in helper functions that let us manipulate nested data more directly. One such helper is the everhandy mapat. It is similar to map except that it allows us to map at a particular depth of nesting in arbitrarilynested arrays: To see how mapat works, let us target the various depths of nesting in our original 3dimensional input matrix $mats:
Note that mapat preserves the structure of its input down to the targeted depth. This is an important property that lets us process nested values "in structure," without having to tear down and build up arrays manually, like we did with the nestedforloop solution. While it is probably clear that mapat is a general and useful function for working with nested arrays, the function doesn't actually solve our original problem. Examining the output, we can see that mapat(2, ...) provides the correct output structure, but not the correct values. The problem is that the elements passed to f are not grouped matrixwise. We are passing in rows extracted from our 3dimensional array, and we ought to be passing in matrixwise, rowcolumn stacks. That is, for each submatrix M and row R, we are passing to f all of the values whose coordinates are given by (m=M,r=R,c=*), where * matches any value. We ought to be passing in (m=*,r=R,c=C) for each valid rowcolumn pair (R,C). To provide the correct values to mapat, we can restructure $mats so that m becomes its leastsignificant dimension. In other words, we can convert $mats's dimensions from (m,r,c) to (r,c,m). Then mapat(2, ...) can be used to apply f to exactly the desired elements: (r=R,c=C,m=*). We can use matrix transposition to perform the restructuring. Transposition swaps the order of the two mostsignificant dimensions of a matrix. It is commonly used in spreadsheets to convert rows into columns and vice versa. Here is how it works for the 2dimensional case: I usually implement transpose in terms of zip, which is a handy helper for functional programming. There are many other implementations lying around if you don't like this one:
With transpose in hand, we can restructure our original input matrix in two steps. First, we transpose at depth 0 to reorder the first two dimensions: Then, we use mapat to transpose the result of the previous step at depth 1, in effect transposing the result's submatrices. This swaps the second and third dimensions: Note that the output values are grouped matrixwise. This is exactly the grouping that we want. The final step is to use mapat(2, ...) to apply f to the second "level" of our restructured matrix: And that is our desired output. Putting all of the above together results in our final function, matrixwise_map: It does exactly what we want, in one fell swoop:
Back to our original problemTo return to the knirirr's original problem, here is how we might compute the matrixwise mean of the input matrices: The matrixwise standard deviations can be computed similarly: Create a stdev function and matrixwise_map it over $mats. That's it! What have we gained?In solving a particular nestedarray problem, we created two small functions mapat and transpose that we can use to solve many kinds of related problems. The mapat function lets us transform nested array data without disturbing its structure, and transpose lets us restructure dimensional data by rearranging dimensions. As an example of what these functions can do, we created matrixwise_map. It lets us apply functions matrixwise over an array of matrices. Together or alone, all three of these functions are handy. If you find yourself writing a lot of code to process array data, consider whether you could benefit from functions like these or from problemsolving approaches similar to those we explored in this meditation. Thanks!Thanks for taking the time to read this meditation. If you can see any way to improve my writing or explanations, please let me know. Cheers, Tom Moertel : Blog / Talks / CPAN / LectroTest / PXSL / Coffee / Movie Rating Decoder
Back to
Meditations

