Re: output unique lines only


by EdwardG (Vicar)

on Dec 06, 2005 at 16:44 UTC ( [id://514541]=note )

in reply to output unique lines only

Here's one approach:

- Use STDIN and STDOUT for input and output.
- Use a regex to extract the first 'column'. You could also use split, but since you care only about the first column it may be overkill.
- Use a hash to gather unique filenames.

Put it all together and you will have something like this:

# uniqfiles.pl
use strict; # helps prevent silly mistakes
use warnings; # helpful when writing code

my %uniq_fnames; # must be declared under 'use strict'

while (<>) {  # Reads from files named on the command line, or from STDIN if none
   if (/^(\w+)\t/) { # If the line starts with one or more 'word' characters followed by a tab...
      my $filename = $1; # ...assume we've got a filename captured
      $uniq_fnames{$filename} = 1; # ...and add it to our hash.
   }
}
print $_,"\n" for keys %uniq_fnames;  # prints to STDOUT (in no particular order), can be piped to a file

Then you could use this as follows:

perl uniqfiles.pl < my_non_unique_list_of_files > my_unique_list_of_files
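
For comparison, here is a rough sketch of the split alternative mentioned above. It is only an illustration, under the assumption that the input is tab-separated with the filename in the first column. The sub name unique_first_columns is made up for this example. Two differences from the regex version are worth knowing: \w does not match a dot, so a name like foo.txt would be skipped by /^(\w+)\t/, whereas split takes whatever precedes the first tab; and this version keeps names in first-seen order rather than hash order. A line with no tab at all is treated as a single column here, which may or may not be what you want.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the distinct first-column values
# from a filehandle, in the order they were first seen.
sub unique_first_columns {
    my ($fh) = @_;
    my (%seen, @unique);
    while (my $line = <$fh>) {
        chomp $line;
        # Limit of 2: we only need the text before the first tab
        my ($filename) = split /\t/, $line, 2;
        next unless defined $filename && length $filename;  # skip blank lines
        push @unique, $filename unless $seen{$filename}++;
    }
    return @unique;
}

# Act as a STDIN-to-STDOUT filter when run directly as a script
unless (caller) {
    print "$_\n" for unique_first_columns(\*STDIN);
}
```

If saved under a name of your choosing, it can be run with the same STDIN/STDOUT redirection as the script above.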