Re^3: Deleting duplicate lines from file

by blazar (Canon)
on Feb 17, 2006 at 09:59 UTC


in reply to Re^2: Deleting duplicate lines from file
in thread Deleting duplicate lines from file

Yes, it is likely to be of some use. In fact, it's precisely what you need. In English it reads like this:

  • assign to @clean the return value of the last statement in the do block;
  • in the do block apply grep to @list. Do you know what grep is for? It will take a block (or an expression, but this is the "block form") and evaluate it for each element of the list passed to it. Each element of the list is aliased to $_ in turn. If the block returns a true value for a particular value of $_ then that $_ is included in the return value of grep, else it is discarded;
  • In this case the block consists of a single statement, namely !$dupe{$_}++. Now, $dupe{$_} is the value of the hash %dupe for the key $_. Thus $dupe{$_}++ is a counter for the occurrences of $_. Since post-increment returns the value from before the increment, it will be 0 (false) on the first occurrence and a number greater than zero (true) on the subsequent ones, so its negation !$dupe{$_}++ will be true for the first occurrence of $_ and false for all the others.
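
Putting the three points together, the statement being described presumably looks something like the sketch below. The names @list, @clean and %dupe come from the explanation above; the sample data and the declaration of %dupe inside the do block are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, just to have something to filter.
my @list = qw(apple pear apple fig pear apple);

my @clean = do {
    my %dupe;                       # assumed to be declared here, so it is scoped to the do block
    grep { !$dupe{$_}++ } @list;    # true only the first time a given value of $_ is seen
};

print "@clean\n";   # prints "apple pear fig"
```

Note that grep preserves the order of the input list, so the first occurrence of each element is the one that survives.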

Anything else?

So this seems to be exactly what you need. Except that your description suggests you don't really want to operate on lists and you don't need to assign to arrays. Thus all in all you could do something along the lines of:

my %saw; while (<$in>) { print if !$saw{$_}++; }

which can be compacted/golfed as in my other reply.
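
For completeness, here is a minimal self-contained sketch of the same loop wrapped in a sub. The name dedupe_lines and the in-memory filehandles are hypothetical, chosen only so the example runs without touching any real file; in practice $in and $out would be opened on your actual files.

```perl
use strict;
use warnings;

# dedupe_lines is a hypothetical helper: copy lines from $in to $out,
# skipping any line that has already been seen.
sub dedupe_lines {
    my ($in, $out) = @_;
    my %saw;
    while (my $line = <$in>) {
        print {$out} $line unless $saw{$line}++;
    }
    return;
}

# In-memory filehandles stand in for real files here (an assumption for the demo).
my $input = "a\nb\na\nc\nb\n";
open my $in, '<', \$input or die "Cannot open in-memory input: $!";

my $output = '';
open my $out, '>', \$output or die "Cannot open in-memory output: $!";

dedupe_lines($in, $out);
close $out;

print $output;   # prints the lines a, b, c, one per line
```

One caveat: lines are compared including their trailing newline, so a final line lacking a newline will not be recognised as a duplicate of an otherwise identical earlier line. chomp and re-add the newline if that matters for your data.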

UPDATE: speaking tongue-in-cheek (though previous experience tells me you won't listen anyway): the evidence is that your Perl knowledge is quite limited. So far so good: nobody can force you to be an expert or to become one in a minute. But it is just as evident that you're routinely using perl to get some job done. In that case it would be advisable to get acquainted with Perl's basic syntax, semantics and idioms. That is: asking here for help is fine, but I guarantee that spending some time reading an introductory book or tutorial will enable you to solve problems like this and to understand code like the above, since there's nothing particularly advanced or complex involved.
