http://www.perlmonks.org?node_id=212228


in reply to The longest word ....

Just a word on all the solutions that use sort as the means of getting the longest word:

Don't

This is incredibly wasteful. A byproduct of calling sort is that you also solve the problem of determining the ordering of all the elements amongst themselves. All that information in unnecessary to solve the problem at hand, and what is more, the performance certainly isn't O(n).

You can find the longest word in a single pass of a list (and BrowserUk, if you can only do this in two passes, or with a hash, well, I'm glad you don't work for me :). You can't sort an arbitrary list in a single pass, (or if you can, I'd got a few people I'd like you to meet). This means that the sort-based solutions are not going to scale as well as a single-pass approach.

<update>This is the code I was thinking of (simplifying the problem of where the words come from for the sake of the argument):

my $max = 0; my @longest; for $word( @words ) { my $length = length $word; if( $max < $length ) { $max = $length; @longest = (); push @longest, $word; } elsif( $max == $length ) { push @longest, $word; } }

At the end of this single pass through the words, you will have a counter holding the length of the longest word(s), and an array that contains the word(s). No hashes needed.

</update>

And this is exactly the kind of program that people are going to start throwing dictionaries at. What is more, sort more or less insists that your set fits in memory, whereas the my approach needs only about as much space as the largest word.


print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

Replies are listed 'Best First'.
Re: Re: The longest word ....
by BrowserUk (Pope) on Nov 12, 2002 at 10:01 UTC

    Caveats noted, the problem wasn't "the longest word" but the longest words, which is either

    • Two passes.

      1 to get the length, one extract the words of that length.

    • Requires building a hash (or similar) to record the lengths and the words.

    The former would work on a list greater than memory, the latter is likely to fail before the sort.

    Anyone who "throws a wordlist" at an algorithm without understanding its limitations, gets what they deserve.

    In the context of the requested solution: A shorter way of finding the longest words in a supplied string (already memory bound) using sort was just a quick option. I didn't read the question as requiring a failsafe nor scalable solution. Did you?

    After all

    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

    is probably not the most efficient way of printing "just another bofh" either.


    Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
    Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
    Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
    Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

      Another point of view without using sort:

      sub lword{ push(@{$a[length]},$_)for(pop)=~/\b\w+/g;@{$a[-1]} }

      Or clean the array every iteration too:
      sub lword2{ for((pop)=~/\b\w+/g){push(@{$a[length]},$_);@b[0..-2]=''}@{$a[-1]} }

      Update:(Changed "while" with "for" after testing)

      $anarion=\$anarion;

      s==q^QBY_^=,$_^=$[x7,print

Re: Re: The longest word ....
by AltBlue (Chaplain) on Nov 12, 2002 at 14:58 UTC
    Just a word on all the solutions that use sort as the means of getting the longest word
    even more abstract: don't use sort on a data structure if you need just a single value. isn't it obvious ?! :))
     
    heh, the rest is history, you just didn't want to understand what i meant... I thought it was pretty obvious as long as I clearly spoken in my entire message of wordS ... with the exception of the title :P~
    thx again to BrowserUk for clearing things up for you with his reply. :)

    --
    AltBlue.

Re: Re: The longest word ....
by FamousLongAgo (Friar) on Nov 12, 2002 at 17:37 UTC
    Of course sort is wasteful of processing cycles, but what are we trying to conserve? These are web pages, it's 2002, we have RAM and MHz to spare. The poster is just satisfying his curiosity about finding long words ( and doing it in minimum keystrokes ), it's not meant to be an enterprise application.

    I agree that klugey code is bad, and we should all be aware of scaling / performance issues, but sometimes I think we can be needlessly Puritan about it.
Re: Re: The longest word ....
by sauoq (Abbot) on Nov 12, 2002 at 20:03 UTC

    He asked for the shortest solution, grinder, not the best.

    In my first post in this thread I cautioned that it could be done in one pass.

    -sauoq
    "My two cents aren't worth a dime.";