PerlMonks  

Re: extract values from file if greater than value

by Marshall (Canon)
on Jun 15, 2016 at 17:57 UTC


in reply to extract values from file if greater than value

I don't understand your requirements statement. You have file1 with 3 fields and file2 with 10K+ fields. The output winds up with, I guess, either 10K fields or 4 fields. Can you explain the requirements again?

Re^2: extract values from file if greater than value
by mulder4786 (Novice) on Jun 15, 2016 at 18:28 UTC
    The output can be anywhere from 3 to 10k fields, depending on how many of the columns in that row of file 2 meet the requirement of being greater than or equal to the third column of file 1.
      OK, here is something to consider that produces your output, to the best of my understanding at the moment:
      #!/usr/bin/perl
      use warnings;
      use strict;

      # this uses a "trick" to open an in memory file
      # like a file on the disk for testing purposes
      my $file1 =<<END;
      1 19002930 0.74
      1 19002931 -0.12
      END

      my $file2 =<<END;
      1 19002930 0.84 0.12 0.94
      1 19002931 0 -.20 .12
      END

      open (my $fh1, '<', \$file1) or die "$!";
      open (my $fh2, '<', \$file2) or die "$!";

      my $lineFile1;
      my $lineFile2;

      # compare line by line of both files
      # stop if either file "runs out of lines"
      # assumes that say: line 232 of file 1 goes with line 232 of file 2
      while (defined ($lineFile1=<$fh1>) and defined ($lineFile2=<$fh2>))
      {
          my ($line1Col1, $line1Col2, $line1Col3) = split ' ', $lineFile1;
          my ($line2Col1, $line2Col2, @file2rest) = split ' ', $lineFile2;
          if ($line1Col1 == $line2Col1 and $line1Col2 == $line2Col2)
          {
              print "$line1Col1 $line1Col2 ";
              print join" ", grep{ $_>=$line1Col3 }@file2rest;
              print "\n";
          }
          else
          {
              # col1 and col2 didn't match, so we do nothing
              # you can delete this else clause entirely
          }
      }
      __END__
      prints:
      1 19002930 0.84 0.94
      1 19002931 0 .12
      This could be written shorter, but I think this is what you want, and it will run very quickly (compared with R).
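For what "shorter" might look like, here is one possible sketch, not the poster's own version. It moves the per-row logic into a helper, filter_row, which is a hypothetical name introduced here for illustration, and it holds the sample data in arrays rather than in-memory filehandles. It keeps the same assumption as the script above: line N of file 1 goes with line N of file 2.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes one line from
# each file and returns the filtered output line, or undef when the first
# two columns of the two lines do not match.
sub filter_row {
    my ($line1, $line2) = @_;
    my ($key1a, $key1b, $threshold) = split ' ', $line1;
    my ($key2a, $key2b, @values)    = split ' ', $line2;
    return unless $key1a == $key2a and $key1b == $key2b;
    # keep only the file 2 values that are >= the file 1 threshold
    return join ' ', $key2a, $key2b, grep { $_ >= $threshold } @values;
}

# Same sample data as the script above, held in arrays for brevity.
my @file1 = ('1 19002930 0.74', '1 19002931 -0.12');
my @file2 = ('1 19002930 0.84 0.12 0.94', '1 19002931 0 -.20 .12');

for my $i (0 .. $#file1) {
    last if $i > $#file2;    # stop when either "file" runs out of lines
    my $row = filter_row($file1[$i], $file2[$i]);
    print "$row\n" if defined $row;
}
```

With real files, the same helper could be called from a while loop that reads one line from each filehandle, as in the script above. One small difference: when no value passes the threshold, this sketch prints only the two key columns, without the trailing space the original leaves.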
