
Re: output unique lines only

by EdwardG (Vicar)
on Dec 06, 2005 at 16:44 UTC (#514541)


in reply to output unique lines only

Here's one approach:

  • Use STDIN and STDOUT for input and output
  • Use a regex to extract the first 'column'. You could also use split, but since you care only about the first column it may be overkill.
  • Use a hash to gather unique filenames

Put it all together and you will have something like this:

# uniqfiles.pl
use strict;      # helps prevent silly mistakes
use warnings;    # helpful when writing code

my %uniq_fnames; # must be declared, or 'use strict' refuses to compile

while (<>) {                          # Reads from STDIN (or from files named on the command line)
    if (/^(\w+)\t/) {                 # If the line starts with one or more 'word' characters followed by a tab...
        my $filename = $1;            # ...assume we've got a filename captured
        $uniq_fnames{$filename} = 1;  # ...and add it to our hash.
    }
}

print $_, "\n" for keys %uniq_fnames; # prints to STDOUT (in no particular order), can be piped to a file

Then you could use it as follows:

perl uniqfiles.pl < my_non_unique_list_of_files > my_unique_list_of_files
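
For comparison, here is a rough sketch of the split alternative mentioned in the list above. It is only one way to do it, and it rests on a few assumptions of mine: the sub name unique_first_columns is made up for illustration, the input is tab-separated, and you would like the names printed in first-seen order rather than in hash order. One difference worth knowing: \w matches only letters, digits and underscore, so the regex version skips names containing a dot or a dash, while split takes everything up to the first tab.

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct first tab-separated fields
# of the given lines, in the order they were first seen.
sub unique_first_columns {
    my @lines = @_;
    my %seen;      # first-column values encountered so far
    my @unique;    # the same values, kept in first-seen order
    for my $line (@lines) {
        chomp(my $copy = $line);                 # work on a copy, drop the newline
        my ($first) = split /\t/, $copy, 2;      # limit of 2: only the first field matters
        next unless defined $first && length $first;  # skip empty lines and empty first fields
        push @unique, $first unless $seen{$first}++;
    }
    return @unique;
}

# Reads from STDIN (or from files named on the command line), prints to STDOUT
print "$_\n" for unique_first_columns(<>);
```

It is invoked the same way as uniqfiles.pl. Note that a line with no tab at all is treated as a one-column line here, whereas the regex version would ignore it.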

 

