PerlMonks  

Re: move the line if particular column is duplicate or more than 1 entries

by poj (Priest)
on Mar 17, 2013 at 18:08 UTC (#1023929)


in reply to move the line if particular column is duplicate or more than 1 entries

I see that output 2 holds the duplicate record whose column 6 value (213) is less than 345, but the 345 appears on a later line, not an earlier one. To compare a line against both earlier and later lines, you either have to scan the file twice or read all the lines into a structure first and then write the output files.

This example scans the file twice:

    # hash to hold highest values
    my %col6=();
    while (my $line = <$data>) {
      chomp $line;
      my @fields = split "," , $line, -1;
      my $key = $fields[1].$fields[2];
      # store max values
      if ( $fields[5] > $col6{$key} ){
        $col6{$key} = $fields[5];
      }
    }

    # reset to start
    seek $data,0,0;

    # read file 2nd time
    while (my $line = <$data>) {
      chomp $line;
      my @fields = split "," , $line, -1;
      my $key = $fields[1].$fields[2];
      # reject lowest duplicate
      if ( $fields[5] < $col6{$key} ){
        # extra text added for debugging
        print OUTFILE_1 $line." - duplicate $key $col6{$key}\n";
      } else {
        print OUTFILE $line."\n";
      }
    }

Update: This simple example assumes column 6 values are never negative, because the first comparison for a key not yet seen treats the missing hash value as 0.
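For comparison, the other option mentioned above (reading all the lines into a structure before writing anything) might look roughly like the sketch below. It is only an illustration, not tested against the original data: the helper name split_duplicates and the three sample rows are made up, each row is assumed to have at least six comma-separated columns, and the key is assumed to be columns 2 and 3 as in the two-pass code. The exists() check is a small change that lets negative column 6 values work, and the "\0" separator keeps keys such as "ab"+"c" and "a"+"bc" apart.

```perl
use strict;
use warnings;

# Hypothetical helper: returns two array refs, the lines to keep and
# the lines rejected as lower-valued duplicates.
sub split_duplicates {
    my @lines = @_;
    my (%max, @rows);

    # Single read: remember every row and the highest column 6 per key.
    for my $line (@lines) {
        chomp $line;
        my @fields = split /,/, $line, -1;
        my $key = join "\0", @fields[1, 2];
        push @rows, [ $line, $key, $fields[5] ];

        # exists() instead of comparing against undef, so negatives are fine
        $max{$key} = $fields[5]
            if !exists $max{$key} || $fields[5] > $max{$key};
    }

    # Second loop runs over memory, not the file.
    my (@keep, @dups);
    for my $row (@rows) {
        my ($line, $key, $value) = @$row;
        if ($value < $max{$key}) { push @dups, $line }
        else                     { push @keep, $line }
    }
    return (\@keep, \@dups);
}

# Made-up sample rows, for illustration only.
my ($keep, $dups) = split_duplicates(
    "r1,a,b,x,y,213",
    "r2,c,d,x,y,10",
    "r3,a,b,x,y,345",
);
print "keep: $_\n" for @$keep;
print "dup:  $_\n" for @$dups;
```

The trade-off is memory: every line is held until the end, so for very large files the two-pass version with seek may still be the better choice.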

poj

