PerlMonks  

state preserving uniq

by Chady (Priest)
on Mar 01, 2003 at 08:50 UTC (#239674=snippet)

Description:

Removes duplicate lines like UNIX uniq, but without having to sort the lines first. It preserves line order by keeping the first occurrence of each line.

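#!/usr/bin/perl
use strict;
use warnings;

my %list;
my $i = 0;
while (<>) {
    # Test the same lowercased key that is stored below; testing $_
    # itself let a repeated mixed-case line overwrite its first entry.
    next if exists $list{lc $_};
    $list{lc $_} = [ $i++, $_ ];
}

# Print the kept lines in their original input order.
print map { $_->[1] }
      sort { $a->[0] <=> $b->[0] } values %list;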

Re: state preserving uniq
by xmath (Hermit) on Mar 01, 2003 at 10:54 UTC
    Don't mean to be rude, but this can be done a lot simpler:
    perl -ne '$s{lc $_}++ or print'
    (It's actually a standard example to which I added lc, and unlike your version it prints out the line right at its first occurrence, rather than first slurping in the whole file)
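Spelled out as a script, the one-liner above might look roughly like the following. This is only a sketch: the sub name `uniq_stream`, the `%seen` hash, and the sample input are illustrative and do not come from the original post. It runs on an in-memory filehandle so the behaviour is easy to check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# uniq_stream is a hypothetical helper name used only for this sketch.
# It reads lines from $in and prints each one to $out the first time
# its lowercased form is seen, so output order matches input order.
sub uniq_stream {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        # Post-increment returns the old count: 0 (false) on first sight.
        print {$out} $line unless $seen{lc $line}++;
    }
    return;
}

# Demonstration on in-memory filehandles instead of real files.
my $input = "apple\nBanana\nAPPLE\nbanana\ncherry\n";
open my $in, '<', \$input or die "open: $!";
my $output = '';
open my $out, '>', \$output or die "open: $!";
uniq_stream($in, $out);
close $out;
print $output;   # apple, Banana, cherry: first spellings, original order
```

Because each line is printed as soon as it is first seen, only the set of keys is held in memory, not the lines themselves in order.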
      Yeah, but my code uses a lot more memory than yours. ;)

      A typical example of how NOT to code in Perl. (I should get more sleep)

      nice one-liner
      He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

      Chady | http://chady.net/
      And that can be sped up even more...
      perl -ne 'print if 1 == ++$s{lc $_}'
      (The difference is that you don't have to create temporary variables.)
        Although I can't test it right now, I sincerely doubt that your version is faster, especially if you compare the optrees.

        While your version does save copying the old integer value to a temporary (you realize the temporary SV is allocated at compile time, right?), it takes two extra ops, which I'm pretty sure cost more CPU time.

        And in any case the difference is too small (especially compared to lowercasing the string and doing a hash lookup) to justify adding to the code complexity.
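One way to settle the speed question empirically would be the core Benchmark module. The sketch below is an assumption-laden stand-in, not a measurement from this thread: the sub names and the synthetic input are made up, and each variant counts the lines it would keep instead of printing them, so that I/O does not dominate the timing. The op trees themselves can be inspected separately with the core B::Concise backend, for example `perl -MO=Concise -ne '$s{lc $_}++ or print'`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input with many repeats; the data is invented for this sketch.
my @lines = map { "line " . ($_ % 1000) . "\n" } 1 .. 20_000;

# Variant from the first reply: post-increment, act when the old count was 0.
sub post_increment {
    my %s;
    my $kept = 0;
    $s{lc $_}++ or $kept++ for @lines;
    return $kept;
}

# Variant from the follow-up: pre-increment, act when the new count is 1.
sub pre_increment {
    my %s;
    my $kept = 0;
    for (@lines) { $kept++ if 1 == ++$s{lc $_} }
    return $kept;
}

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    post_increment => \&post_increment,
    pre_increment  => \&pre_increment,
});
```

Results will vary with the Perl version and the input, so any difference it reports should be read as indicative only.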

      Or even:
      $s{+lc}++||print
      :-)

      Makeshifts last the longest.

        Yeah, of course; I was aiming for simplicity though, I wasn't golfing. :-)

        (it yields the same op-tree btw)

Re: state preserving uniq
by steves (Curate) on Mar 01, 2003 at 12:13 UTC

    Why the lc? uniq as I know it is not case insensitive. You might want to point out that behavior to any potential users.
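To make the difference concrete, here is a small sketch in which case folding is opt-in rather than built in. The sub name `uniq_lines` and its `$ignore_case` flag are hypothetical, introduced only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# uniq_lines and its $ignore_case flag are hypothetical names for this sketch.
# Returns the first occurrence of each line, in original order.
sub uniq_lines {
    my ($ignore_case, @lines) = @_;
    my %seen;
    # The hash key is the lowercased line only when case is to be ignored.
    return grep { !$seen{ $ignore_case ? lc($_) : $_ }++ } @lines;
}

my @input = ("Foo\n", "foo\n", "bar\n", "Foo\n");
my @sensitive   = uniq_lines(0, @input);   # keeps "Foo", "foo", "bar"
my @insensitive = uniq_lines(1, @input);   # keeps "Foo", "bar"

print "case-sensitive:\n",   @sensitive;
print "case-insensitive:\n", @insensitive;
```

With the flag off, "Foo" and "foo" both survive; with it on, only the first spelling is kept, which is what the snippet and the one-liners in this thread do.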

Re: state preserving uniq
by Intrepid (Deacon) on Jul 30, 2003 at 19:02 UTC

    This discussion also took place over here where a version is presented that takes its input on STDIN (like a typical *nix filter) as well as in the arguments in @ARGV.
