PerlMonks  

state preserving uniq

by Chady (Priest)
on Mar 01, 2003 at 08:50 UTC (#239674, snippet)

Description:

Removes duplicate lines like UNIX uniq, but without having to sort the lines first. Line order is preserved, and the first occurrence of each line is kept.

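#!/usr/bin/perl
use strict;
use warnings;

my %list;
my $i = 0;

while (<>) {
    # Look up the lowercased line, the same key that is stored below
    next if exists $list{lc $_};
    # Remember the first-seen position and the original text
    $list{lc $_} = [ $i++, $_ ];
}

# Print the kept lines in the order they first appeared
print map { $_->[1] }
      sort { $a->[0] <=> $b->[0] } values %list;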

Re: state preserving uniq
by xmath (Hermit) on Mar 01, 2003 at 10:54 UTC
    Don't mean to be rude, but this can be done a lot simpler:
    perl -ne '$s{lc $_}++ or print'
    (It's actually a standard example to which I added lc, and unlike your version it prints each line right at its first occurrence, rather than first slurping in the whole file.)
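For readers who find the one-liner terse, here is a minimal sketch of the same idea written out as a script. The sub name `filter_stream` and the sample input are made up for illustration and are not from the thread; the in-memory filehandles only serve to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, printing each line
# the first time it is seen (compared case-insensitively via lc).
sub filter_stream {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        # Post-increment returns the old count: 0 (false) on first sight
        print {$out} $line unless $seen{lc $line}++;
    }
    return;
}

# Demonstration on in-memory filehandles (sample data is an assumption)
my $input  = "Apple\nbanana\napple\nBanana\ncherry\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$output or die "cannot open output: $!";
filter_stream($in, $out);
close $out;

print $output;
```

Because each line is printed as soon as it is read, only the hash of seen keys is held in memory, not the whole file.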
      Yeah, but my code uses a lot more memory than yours. ;)

      A typical example of how NOT to code in Perl. (I should get more sleep.)

      Nice one-liner.
      He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

      Chady | http://chady.net/
      And that can be sped up even more...
      perl -ne 'print if 1 == ++$s{lc $_}'
      (The difference is that you don't have to create temporary variables.)
        Although I can't test it right now, I sincerely doubt that your version is faster, especially if you compare the optrees:

        While your version does save copying the old integer value to a temporary (you realize the temporary SV is allocated at compile time, right?), it takes two extra ops, which I'm pretty sure cost more CPU time.

        And in any case the difference is too small (especially compared to lowercasing the string and doing a hash lookup) to justify the added code complexity.
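One way to settle the speed question empirically is the core Benchmark module. The following is only a rough sketch: the sub names, the synthetic data, and the counting loop (used instead of print so that I/O does not dominate) are all assumptions, and results will vary by Perl version and machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input (an assumption): 100,000 lines, 1,000 distinct values
my @lines = map { "line " . ($_ % 1000) . "\n" } 1 .. 100_000;

# Post-increment form, as in: $s{lc $_}++ or print
sub count_postinc {
    my ($lines) = @_;
    my (%s, $n);
    $n = 0;
    for (@$lines) { $s{lc $_}++ or $n++ }
    return $n;
}

# Pre-increment form, as in: print if 1 == ++$s{lc $_}
sub count_preinc {
    my ($lines) = @_;
    my (%s, $n);
    $n = 0;
    for (@$lines) { $n++ if 1 == ++$s{lc $_} }
    return $n;
}

# Run each variant for at least one CPU second and print a comparison table
cmpthese(-1, {
    postinc => sub { count_postinc(\@lines) },
    preinc  => sub { count_preinc(\@lines) },
});
```

Both subs count first occurrences the same way, so any difference in the table reflects only the two increment styles plus measurement noise.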

      Or even:
      $s{+lc}++||print
      :-)

      Makeshifts last the longest.

        Yeah, of course; I was aiming for simplicity though, I wasn't golfing. :-)

        (it yields the same op-tree btw)

Re: state preserving uniq
by steves (Curate) on Mar 01, 2003 at 12:13 UTC

    Why the lc? uniq as I know it is not case insensitive. You might want to note that behavior to any potential users.
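To make the difference concrete, here is a small sketch in which case folding is an explicit choice. The sub `uniq_lines` and its `$fold_case` parameter are hypothetical names used for illustration, not part of the original snippet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop later duplicates, keeping first occurrences.
# When $fold_case is true, lines are compared case-insensitively.
sub uniq_lines {
    my ($fold_case, @lines) = @_;
    my %seen;
    return grep { !$seen{ $fold_case ? lc($_) : $_ }++ } @lines;
}

my @input = ("Foo\n", "foo\n", "bar\n", "Foo\n");

my @sensitive   = uniq_lines(0, @input);   # keeps Foo, foo, bar
my @insensitive = uniq_lines(1, @input);   # keeps Foo, bar

print "case-sensitive:\n",   @sensitive;
print "case-insensitive:\n", @insensitive;
```

With folding off, "Foo" and "foo" are treated as different lines, which matches the default behavior of uniq described above.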

Re: state preserving uniq
by Intrepid (Deacon) on Jul 30, 2003 at 19:02 UTC

    This discussion also took place over here where a version is presented that takes its input on STDIN (like a typical *nix filter) as well as in the arguments in @ARGV.
