Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

At the risk of providing an inferior answer amid higher authorities, I'll try to explain the Schwartzian Transform.

First, one has to understand why it is useful. Assume you want to sort a list by, for example, ignoring the first character in each element. In other words, you want to sort 'Babc' before "Abcd". You could write such a sort like this:

my @sorted = sort { substr( $a, 1 ) cmp substr( $b, 1 ) } @unsorted;

And that works fine. But remember that the comparison routine of a sort function gets called many times. ...possibly several times per element being sorted. It's somewhat inefficient calling substr twice for each comparison, over and over and over again as the list gets sorted.

There is a better way. What if, instead of calling substr over and over again, multiple times per element being sorted, you called it once per element and stored the results? If you can create, store, and access this cached list fast enough, it will be more efficient than doing complex calculations over and over again on the same element throughout the course of the sort.

The Schwartzian Transform does this. It creates a cache of whatever complex computation you had in mind, and saves it alongside the original data, so that it's easy to keep track of the real data and its cached pre-computed sort criteria at the same time.

The most common form of the Schwartzian Transform takes a flat list, and turns it into a list of references to two-element arrays. Here's how our substr( $string, 1 ) might work in a ST.

@original | @transform | [0] [1] _________ | _________________________ abc | abc bc asdf | asdf sdf jkas | jkas kas qwert | qwert wert uiop | uiop iop

So in the above example:

@original = qw/abc asdf jkas qwert uiop/;


@transformed = ( [ 'abc' , 'bc' ], [ 'asdf' , 'sdf' ], [ 'jkas' , 'kas' ], [ 'qwert', 'wert' ], [ 'uiop' , 'iop' ] );

Now lets look at the complete transform, as commonly seen:

my @sorted = map { $_->[0] } sort { $a->[1] cmp $b->[1] } map { [ $_, substr( $_, 1 ) ] } @unsorted;

Look at this from the bottom up:

  1. Start with a list called @unsorted
  2. Feed @unsorted into map. The bottom map takes each element of @unsorted and turns it into an anonymous array with two elements. The first is the original string, and the second is the cached complex calculation.
  3. Feed the list of array-refs to sort. Inside sort, dereference each array-ref to compare the 2nd elements (the cached versions).
  4. Finally, feed this list of anon-array-refs (now sorted into the final order) into the top map. This final step strips away the cached information, leaving only the original strings (the array-refs get thrown away too).
  5. The sorted original strings get assigned to @sorted

In actuality, this particular example may not be all that more efficient, since substr is pretty fast, and anonymous array creation and dereferencing takes time. But for longer lists, and particularly for more complex sort computations, the ST shines.

I hope this helps. :)


In reply to Re: Things I have learnt by davido
in thread Things I have learnt by neilh

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and a log crumbles through the grate...

    How do I use this? | Other CB clients
    Other Users?
    Others surveying the Monastery: (5)
    As of 2018-05-20 14:37 GMT
    Find Nodes?
      Voting Booth?