PerlMonks  

Re: perl functionality like unix's "sort -u" and "uniq -c"

by ambs (Pilgrim)
on Apr 08, 2005 at 15:02 UTC (#446115)


in reply to perl functionality like unix's "sort -u" and "uniq -c"

Steve, can I ask if you are doing that because you really need to do it in Perl, or just to learn? I mean, those two unix programs are very efficient, so why not just use them?

Also, you can open shell pipes as if they were files. So, you can easily do:

open(PIPE, "sort file | uniq -c |") or die "pipe failed: $!";
while (<PIPE>) { }
close(PIPE);

OK, I know I didn't show you how to write those scripts in Perl, but I hope I made you think about whether you really need to rewrite those tools in Perl.

Alberto Simões

Re^2: perl functionality like unix's "sort -u" and "uniq -c"
by jfroebe (Parson) on Apr 08, 2005 at 15:12 UTC
    I generally agree with you, but remember that if you are trying to make the perl script as portable as possible, you will want to avoid using unix commands if you can. Even minor differences between the GNU versions of the tools and the unix (Solaris, etc.) versions can break the script because of slightly different behaviors. Also, in CGI scripts you will want to avoid calling the utility programs for both security and performance reasons.

    But that's just a 10-second observation.

    Jason L. Froebe

    Team Sybase member

    No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

      I'll give you a 4-year observation. Having done something very similar to what the OP is asking, and not only for sort and uniq but for many things that could be done in shell (I know, because I used to maintain the shell script that did it), I can tell you that portability can be problematic. It's difficult enough ensuring portability for perl ;-), never mind a whole host of subcommands.

      I make extensive use of trivial-looking modules such as File::Copy, and especially File::Spec, and grepping through files is commonplace.
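
For readers who have not used those modules, here is a minimal sketch of the kind of thing meant, assuming a throwaway temporary directory. The file names and the grep_file helper are hypothetical, made up for illustration; only File::Spec->catfile, File::Copy's copy, and File::Temp's tempdir are real module calls.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return the lines of a file that match a pattern,
# roughly what "grep PATTERN FILE" would print.
sub grep_file {
    my ($path, $pattern) = @_;
    open(my $fh, '<', $path) or die "Cannot read $path: $!";
    my @matches = grep { /$pattern/ } <$fh>;
    close($fh);
    return @matches;
}

# Build paths with File::Spec instead of hard-coding "/" separators.
my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'input.txt');
my $backup = File::Spec->catfile($dir, 'input.bak');

# Create some sample input (invented data, for illustration only).
open(my $out, '>', $source) or die "Cannot write $source: $!";
print {$out} "alpha\nbeta\nalphabet\n";
close($out) or die "Cannot close $source: $!";

# File::Copy replaces a call to the external "cp" command.
copy($source, $backup) or die "Copy failed: $!";

# Grep through the copy without spawning "grep".
my @hits = grep_file($backup, qr/^alpha/);
print @hits;
```

Nothing here shells out, so it does not depend on which cp or grep happens to be installed.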

      However, that's just the 1-year view. The view from the following 3 years is that once you've started down the road of conversion to pure perl, you'll find better, faster, and easier ways of doing things. You'll find out, perhaps, that you don't really need the sort. In shell, you need the sort to make uniq work the way you want, because uniq only collapses adjacent duplicate lines. In perl, you don't: just use a hash. Bang! Speed improvement. In shell, you need to use temporary files if you want to feed the same input into multiple filters (sort it for one output, sort and uniq for another output, then diff, just to see what is duplicated). In perl, you can keep it in memory (if it's small enough). Bang! More speed improvement.

      Reducing, and outright removing, your dependency on the shell is just the first step to cleaner, faster, and easier-to-maintain scripting.
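
The hash approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual code: the helper names (count_lines, sort_unique, duplicated) and the sample data are made up, and the input is assumed to fit in memory as a list of already-chomped lines.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each line occurs,
# the information "sort | uniq -c" would report.
sub count_lines {
    my @lines = @_;
    my %count;
    $count{$_}++ for @lines;
    return \%count;
}

# Hypothetical helper: unique lines in sorted order, like "sort -u".
# The sort is only for the output order; the hash does the de-duplication.
sub sort_unique {
    my @lines = @_;
    my %seen = map { $_ => 1 } @lines;
    my @sorted = sort keys %seen;
    return @sorted;
}

# Hypothetical helper: lines that occur more than once, taken from the
# same in-memory counts (no temporary files, no diff).
sub duplicated {
    my ($count) = @_;
    my @dups = grep { $count->{$_} > 1 } sort keys %$count;
    return @dups;
}

# Sample data, invented for illustration.
my @lines = qw(pear apple pear fig apple pear);

my $count = count_lines(@lines);
printf "%7d %s\n", $count->{$_}, $_ for sort keys %$count;

print "unique: ",     join(' ', sort_unique(@lines)), "\n";
print "duplicated: ", join(' ', duplicated($count)),  "\n";
```

The counts are built in a single pass over the input, and the same hash then answers both the "how many of each" and the "which ones are duplicated" questions.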

      Or, at least, that's my experience with perl.
