Re: Things I have learnt

At the risk of providing an inferior answer amid higher authorities, I'll try to explain the Schwartzian Transform.

First, one has to understand why it is useful. Assume you want to sort a list by, for example, ignoring the first character in each element. In other words, you want to sort 'Babc' before "Abcd". You could write such a sort like this:

my @sorted = sort { substr( $a, 1 ) cmp substr( $b, 1 ) } @unsorted;
[download]

And that works fine. But remember that the comparison routine of a sort function gets called many times. ...possibly several times per element being sorted. It's somewhat inefficient calling substr twice for each comparison, over and over and over again as the list gets sorted.

There is a better way. What if, instead of calling substr over and over again, multiple times per element being sorted, you called it once per element and stored the results? If you can create, store, and access this cached list fast enough, it will be more efficient than doing complex calculations over and over again on the same element throughout the course of the sort.

The Schwartzian Transform does this. It creates a cache of whatever complex computation you had in mind, and saves it alongside the original data, so that it's easy to keep track of the real data and its cached pre-computed sort criteria at the same time.

The most common form of the Schwartzian Transform takes a flat list, and turns it into a list of references to two-element arrays. Here's how our substr( $string, 1 ) might work in a ST.

@original       |                @transform
                |               [0]          [1]
_________       |         _________________________
abc             |               abc          bc
asdf            |               asdf         sdf
jkas            |               jkas         kas
qwert           |               qwert        wert
uiop            |               uiop         iop
[download]

So in the above example:

@original = qw/abc asdf jkas qwert uiop/;
[download]

...and...

@transformed = ( [ 'abc'  ,  'bc'   ],
                 [ 'asdf' ,  'sdf'  ],
                 [ 'jkas' ,  'kas'  ],
                 [ 'qwert',  'wert' ],
                 [ 'uiop' ,  'iop'  ]  );
[download]

Now lets look at the complete transform, as commonly seen:

my @sorted = map { $_->[0] }
                 sort { $a->[1] cmp $b->[1] }
                     map { [ $_, substr( $_, 1 ) ] }
                         @unsorted;
[download]

Look at this from the bottom up:

Start with a list called @unsorted
Feed @unsorted into map. The bottom map takes each element of @unsorted and turns it into an anonymous array with two elements. The first is the original string, and the second is the cached complex calculation.
Feed the list of array-refs to sort. Inside sort, dereference each array-ref to compare the 2nd elements (the cached versions).
Finally, feed this list of anon-array-refs (now sorted into the final order) into the top map. This final step strips away the cached information, leaving only the original strings (the array-refs get thrown away too).
The sorted original strings get assigned to @sorted

In actuality, this particular example may not be all that more efficient, since substr is pretty fast, and anonymous array creation and dereferencing takes time. But for longer lists, and particularly for more complex sort computations, the ST shines.

I hope this helps. :)

Dave

Comment on Re: Things I have learnt Select or Download Code


Do you know where your variables are?
	PerlMonks