state preserving uniq

by Chady (Priest)
on Mar 01, 2003 at 08:50 UTC (#239674=snippet)

Removes duplicate lines like UNIX uniq, but without having to sort the lines first. It preserves line order, keeping the first occurrence of each line. Note that the comparison is case-insensitive.


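my %list; my $i = 0;
while (<>) {
    # Look up the same lowercased key that is stored below.
    next if (defined $list{lc($_)});
    $list{lc($_)} = [ $i++, $_ ];   # [ position, original line ]
}

# Print the kept lines in order of first appearance.
print map { $_->[1] }
      sort { $a->[0] <=> $b->[0] } values %list;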
Replies are listed 'Best First'.
Re: state preserving uniq
by xmath (Hermit) on Mar 01, 2003 at 10:54 UTC
    Don't mean to be rude, but this can be done a lot simpler:
    perl -ne '$s{lc $_}++ or print'
    (It's actually a standard example to which I added lc. Unlike your version, it prints each line right at its first occurrence, rather than first slurping in the whole file.)
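For readers less used to one-liners, here is a minimal sketch of the same idea written out as a script. The -n switch wraps the code in a while (<>) loop, and the post-increment returns the old count, which is false only the first time a key is seen. The helper name uniq_stream and the in-memory sample input are assumptions made for illustration, not part of the original posts.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, skipping any line whose
# lowercased form has been seen before. Each new line is printed at once,
# so only the set of keys is held in memory, not the whole file.
sub uniq_stream {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        # Post-increment yields the old count: 0 (false) on first sight.
        print {$out} $line unless $seen{lc $line}++;
    }
    return;
}

# Demo on an in-memory filehandle (sample data is made up).
my $input = "Foo\nbar\nfoo\nBAR\nbaz\n";
open my $in, '<', \$input or die "cannot open in-memory input: $!";
uniq_stream($in, \*STDOUT);   # prints Foo, bar, baz on separate lines
```
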
      Or even:

      Makeshifts last the longest.

        Yeah, of course; I was aiming for simplicity though, I wasn't golfing. :-)

        (it yields the same op-tree btw)

      Yeah, but my code uses a lot more memory than yours. ;)

      A typical example of how NOT to code in Perl. (I should get more sleep.)

      nice one-liner
      He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

      Chady
      And that can be sped up even more...
      perl -ne 'print if 1 == ++$s{lc $_}'
      (The difference is that you don't have to create temporary variables.)
        Although I can't test it right now, I sincerely doubt that your version is faster, especially if you compare the optrees:

        While your version does save copying the old integer value to a temporary (you realize the temporary sv is allocated at compile-time, right?), it takes two extra ops, which I'm pretty sure cost more CPU time.

        And in any case the difference is too small (especially compared to lowercasing the string and doing a hash lookup) to justify the added code complexity.
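Neither poster measured the two forms, so the speed question is left open in the thread. One way to check it would be the core Benchmark module. The sketch below is only an assumption about how such a test could look: the sub names and the synthetic input are invented, and a counter stands in for print so that I/O does not dominate the timing. Results will vary by Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic input (made up): 10,000 lines, 500 distinct values ignoring case.
my @lines = map { ($_ % 2 ? "Line " : "line ") . ($_ % 500) . "\n" } 1 .. 10_000;

# Form from the first reply: post-increment, keep the line when the old count is 0.
sub count_postinc {
    my (%s, $kept);
    $kept = 0;
    for (@lines) { $s{lc $_}++ or $kept++ }
    return $kept;
}

# Form from the later reply: pre-increment, keep the line when the new count is 1.
sub count_preinc {
    my (%s, $kept);
    $kept = 0;
    for (@lines) { $kept++ if 1 == ++$s{lc $_} }
    return $kept;
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, { postinc => \&count_postinc, preinc => \&count_preinc });
```
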

Re: state preserving uniq
by steves (Curate) on Mar 01, 2003 at 12:13 UTC

    Why the lc? uniq as I know it is not case insensitive. You might want to note that behavior to any potential users.
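To make the difference concrete, here is a small sketch in which case folding is opt-in rather than built in. The helper name uniq_lines and its ignore_case option are hypothetical, chosen only for this example. Also keep in mind that UNIX uniq collapses only adjacent duplicates, whereas every version in this thread removes duplicates anywhere in the input.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first occurrence of each line, in input order.
# With ignore_case => 1 it behaves like the snippet; by default lines are
# compared exactly.
sub uniq_lines {
    my ($lines, %opt) = @_;
    my %seen;
    return grep {
        my $key = $opt{ignore_case} ? lc($_) : $_;
        !$seen{$key}++;
    } @$lines;
}

my @in = ("Apple\n", "apple\n", "pear\n", "Apple\n");
print uniq_lines(\@in);                     # prints Apple, apple, pear
print uniq_lines(\@in, ignore_case => 1);   # prints Apple, pear
```
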

Re: state preserving uniq
by Intrepid (Deacon) on Jul 30, 2003 at 19:02 UTC

    This discussion also took place over here where a version is presented that takes its input on STDIN (like a typical *nix filter) as well as in the arguments in @ARGV.
