Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re^5: Better mousetrap (getting top N values from list X)

by sleepingsquirrel (Hermit)
on Feb 03, 2005 at 23:39 UTC ( #427867=note: print w/ replies, xml ) Need Help??


in reply to Re^4: Better mousetrap (getting top N values from list X)
in thread Better mousetrap (getting top N values from list X)

And you'll notice, it still has to process the whole list on the first pass, so lazy lists don't really come into play.
Yes, lazy lists are what make the whole scheme elegant. The sort returns a lazy list of values, which is nothing more than a thunk, a promise to do some work in the future (i.e. there is no computation done, and there are no elements sitting around in memory). When you end up needing the value in @sorted[0] that thunk is evaluated (you get the minimal value in O(n) time). And when you need @sorted[1] you get that in another O(log(n)) time. Etc. You might want to check out the lazy language Haskell for all kinds of funky stuff that can be done lazily (and all perl monks know the value of Laziness!)


-- All code is 100% tested and functional unless otherwise noted.


Comment on Re^5: Better mousetrap (getting top N values from list X)
Select or Download Code
Re^6: Better mousetrap (getting top N values from list X)
by BrowserUk (Pope) on Feb 04, 2005 at 00:04 UTC

    Okay. I see what you are saying now. The slice across the return from sort generates a lazy list.

    So the code:

    my @data = getdata( 100000 ); my @topN = ( sort( @data ) )[ 1 .. 10 ]; undef( @data ); for my $top ( @topN ) { ## << COrrect per 427878 ## do something with $top }

    Passes the whole list of data to sort, (which takes a copy of it in case we modify or discard it) and returns an iterator which gets bound to @top, so that each time we access the next element of @top in the for loop, the iterator is called via the magic of tieing, and the copy of the 100,000 elements that we passed to sort gets scanned to produce the next of our 5 return values.

    Lazy huh?!


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      Essentially. Although I think you probably meant...
      for my $top ( @topN ) { ## do something with $top }


      -- All code is 100% tested and functional unless otherwise noted.

        Corrected. Thanks.

        You do see the contradiction though?

        For an algorithm that produces a subset m of a set N, duplicating N so that you can produce the subset lazily always consumes more storage--not to mention the overhead of tieing the iterator, retaining it's state etc.

        <soapbox>

        That's one of my bug bears with FP. They claim it produces proveably correct code by avoiding side effects, or putting things into variables and so forth, but what they gloss over is that you still retain state, you just put it all on the either the program stack or continuation stack or some other internal stack where the programmer cannot see it, or test it or verify it. Or control it.

        Until they find a way to prove that the FP interpreter or compiler (usually written in C!) cannot corrupt it's own stack, every other claim of proveability is built on sand.

        </soapbox>


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://427867]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (4)
As of 2014-08-22 02:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (145 votes), past polls