state preserving uniq

by Chady (Priest)
on Mar 01, 2003 at 08:50 UTC (#239674=snippet)

Removes duplicate lines like UNIX uniq, but without having to sort the lines first; it preserves line order by keeping the first occurrence of each line.


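my (%list, $i);
while (<>) {
    # Skip lines already seen; keys are lowercased, so the check must be too
    next if defined $list{lc $_};
    $list{lc $_} = [ $i++, $_ ];
}

# Print the kept lines in their original order
print map { $_->[1] }
      sort { $a->[0] <=> $b->[0] } values %list;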

Replies are listed 'Best First'.
Re: state preserving uniq
by xmath (Hermit) on Mar 01, 2003 at 10:54 UTC
    Don't mean to be rude, but this can be done a lot simpler:
    perl -ne '$s{lc $_}++ or print'
    (It's actually a standard example to which I added lc, and unlike your version it prints each line right at its first occurrence, rather than first slurping in the whole file.)
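Spelled out longhand, the one-liner above might look like the following sketch. The sub name `first_seen_filter` and its filehandle arguments are illustrative, not part of the original post. It shows that each line is printed as soon as it is first seen, so only the lowercased keys and their counts stay in memory.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): copy lines from $in to $out,
# printing each line only the first time it is seen, ignoring case.
sub first_seen_filter {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        # Post-increment returns the old count: 0 (false) on first sight,
        # so the print runs only for the first occurrence.
        $seen{lc $line}++ or print {$out} $line;
    }
    return;
}
```

Calling `first_seen_filter(\*STDIN, \*STDOUT)` would make it behave as a filter on standard input.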
      Or even:

      Makeshifts last the longest.

        Yeah, of course; I was aiming for simplicity though, I wasn't golfing. :-)

        (it yields the same op-tree btw)

      Yeah, but my code uses a lot more memory than yours. ;)

      Typical example of how NOT to code in Perl. (I should get more sleep.)

      nice one-liner
      He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

      Chady
      And that can be sped up even more...
      perl -ne 'print if 1 == ++$s{lc $_}'
      (The difference is that you don't have to create temporary variables.)
        Although I can't test it right now, I sincerely doubt that your version is faster, especially if you compare the optrees:

        While your version does save copying the old integer value to a temporary (you realize the temporary sv is allocated at compile-time, right?), it takes two extra ops, which I'm pretty sure costs more cpu time.

        And in any case the difference is too small (especially compared to lowercasing the string and doing a hash lookup) to justify adding to the code complexity.
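One way to settle the speed question empirically would be the core Benchmark module. The sketch below is an assumption-laden setup, not something from the thread: the sub names and the generated input are made up, and `grep` replaces `print` so that only the increment-and-test expressions are compared, without I/O noise.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Illustrative input (an assumption): 50,000 lines with many
# case-insensitive duplicates.
my @lines;
for my $n (1 .. 50_000) {
    my $prefix = $n % 3 ? "Line " : "line ";
    push @lines, $prefix . ($n % 1000) . "\n";
}

# Variant from the first reply: keep the line when the old count was 0.
sub with_postinc {
    my %s;
    return [ grep { !$s{lc $_}++ } @lines ];
}

# Variant from the follow-up: keep the line when the new count is 1.
sub with_preinc {
    my %s;
    return [ grep { 1 == ++$s{lc $_} } @lines ];
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, { postinc => \&with_postinc, preinc => \&with_preinc });
```

Results will depend on the Perl version and the input, so any difference measured this way applies only to that setup.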

Re: state preserving uniq
by steves (Curate) on Mar 01, 2003 at 12:13 UTC

    Why the lc? uniq as I know it is not case insensitive. You might want to note that behavior to any potential users.
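To make the case handling explicit, the comparison could be made optional. This is a minimal sketch; `uniq_lines` and its `$fold_case` flag are hypothetical names, not part of the snippet.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the first occurrence of
# each line, in input order. Pass a true $fold_case to ignore case, as the
# snippet does; leave it false for a case-sensitive comparison.
sub uniq_lines {
    my ($lines, $fold_case) = @_;
    my %seen;
    return grep { !$seen{ $fold_case ? lc($_) : $_ }++ } @$lines;
}

my @input = ("Foo\n", "bar\n", "foo\n", "bar\n");
print "case-sensitive:\n",   uniq_lines(\@input, 0);
print "case-insensitive:\n", uniq_lines(\@input, 1);
```

With this input the case-sensitive call keeps `Foo`, `bar` and `foo`, while the case-insensitive call keeps only `Foo` and `bar`.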

Re: state preserving uniq
by Intrepid (Deacon) on Jul 30, 2003 at 19:02 UTC

    This discussion also took place over here where a version is presented that takes its input on STDIN (like a typical *nix filter) as well as in the arguments in @ARGV.
