
Re: speed up one-line "sort|uniq -c" perl code

by perrin (Chancellor)
on Apr 10, 2003 at 02:32 UTC ( [id://249519]=note )

in reply to speed up one-line "sort|uniq -c" perl code

I know this isn't really the answer you were looking for, but I think you should consider getting a better sort program. I don't think there are any limitations on the GNU one, and it is typically faster than doing the same thing in Perl.

Re^2: speed up one-line "sort|uniq -c" perl code (speed)
by tye (Sage) on Apr 10, 2003 at 16:47 UTC

    sort needs both time and space to perform the sort no matter how cleverly implemented. I find it hard to imagine a system that is so poorly configured that it can't handle sorting a paltry 500kB file. But I don't think that really matters in this particular case.

    There is a reason that "sort -u" came to be. It is much slower to sort all 57000 instances of several IPs and then throw all but one of each away. So I think "sort | uniq -c" would be much slower than using Perl.

    Unfortunately, it doesn't appear that even GNU sort has bothered to implement a -u option that counts the duplicates.

                    - tye
      Thanks for making me realize a typo.
      The file that I am parsing is 500MB, not 500kB....
      That's why sort freaks out.
