Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

I started to write this as a reply to Random array sorting, a question asked by kidd. I realized I was addressing a larger issue and that it didn't really belong there even though the example I use is related. I believe this is my first meditation.

A couple years ago, when I was at a different company than I am now, one of my co-workers asked pretty much the same question kidd did. He needed to randomize lines in a file, wanted to do it with perl, and was looking for the best (most efficient) way. My reply was the same as merlyn's to kidd, "it's a FAQ." The lead toolsmith, however, suggested another answer so simple, elegant, and efficient that I was simply blown away. His not only benchmarked much better but would have won if we were playing golf too.

To put things in perspective, our toolsmith was no slouch. If you work on a unix platform, the chances are that you have run code written by him. He has spent time maintaining some significant GNU projects and has contributed substantially to many others.

The solution that he proposed for randomizing an array is quite beautiful. You may want to use it. Here's a word of caution: DON'T. It is, unfortunately, incorrect.

@random = sort { .5 <=> rand(1) } @array; # DO NOT USE THIS!

Even when I first saw this, something about it felt wrong. It was too simple. I pointed out that the behavior of qsort(), the C function which acts as the basis for Perl's sort, is undefined when the comparison function doesn't produce an ordering. That, however, didn't seem to be much of an argument as A) it appeared to work and B) we wanted the ordering to be undefined. As long as it didn't add or remove elements, it should work, right? I gave up. He was right. It was beautiful. I would never be the developer he was. As far as I know, his solution went into production code and may well remain there today.

Sometime after I left that company, the question came up again. I looked a little closer at that most elegant of solutions. I never really tested it for correctness. Here's what I found:

$ perl -le '@n=(1..3);for(1..100000){@m=sort{.5<=>rand}@n;$h{"@m"}++}f +or(sort{$h{$b}<=>$h{$a}}keys%h){print"$_: $h{$_}"}' 1 2 3: 24990 2 1 3: 24892 2 3 1: 12572 1 3 2: 12541 3 1 2: 12514 3 2 1: 12491
Notice how, of the six possible orderings, two of them came up twice as many times as any of the four others. Not very random, is it? The problem lies in the way a quicksort works. Determining the details is left as an exercise.

This story has several morals. Among them are "no one is infallible," "it isn't a solution until it has been thoroughly tested," and "elegance is worthless without correctness." Another lesson I learned was that, since Perl hides so much power under the surface, sometimes a simple solution isn't simple at all.

I'm wondering if any other monks have stories of great solutions that turned out to be subtly wrong. I offer this thread up as a place to collect them so that they may serve as reminders to ourselves and warnings to others.

Edit: Added readmore tags.

-sauoq
"My two cents aren't worth a dime.";

In reply to When the Best Solution Isn't by sauoq

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others examining the Monastery: (5)
    As of 2020-01-18 20:25 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?
      Notices?