http://www.perlmonks.org?node_id=1075298


in reply to Re^3: How to add column into array from delimited tab file
in thread How to add column into array from delimited tab file

I'm sorry, I do not understand this part...abit lost here:
for my $key ( keys %hash ) { if ( $key =~ /^dataS\d\dR\d$/ ) { print $key, @{ $hash{$key} }, "\n"; }
I want to be able manipulate the columns individually (thus attempting array) so I am not sure how hash can help in this case. I try running it but there's error which the script keep running without stopping. Instead, I attempted the matching of the colume names by using loop but I run into errors somewhere. I have my own code where I did a while loop to find the matching headers:
my $first = 1; for(my $i = 0; $i < $originalfilecount; $i++) { #read in the current file open CURINFILE, "<$files[$i]" or die "Error couldn't open file $fi +les[$i]\n"; print "$files[$i]\n"; if($first) { #if this is first file, find column locations my $firstline = <CURINFILE>; #read headerline chomp $firstline; my @columns = split (/\t/, $firstline); my $columncount = 0; # print "$firstline\n"; #check if print headers correctly ####### Column Headers for ID, TIME ######### while ($columncount <= $#columns && !($columns[$columncount] = +~ /ID/)) { $columncount ++; } $ID = $columncount; while ($columncount <= $#columns && !($columns[$columncount] = +~ /Time/)) { $columncount ++; } $masstimes = $columncount; while ($columncount <= $#columns && !($columns[$columnco +unt] =~ /Links/)) { $columncount ++; } $Links = $columncount; #check if column position is correct (so far it is correct) print "ID is at column: $ID\n"; #output = 0 print "Time is at column: $masstimes\n"; #output = 1 print "Links is at column: $Links\n"; #output = 33 #DataR Columns (got ERROR here where I can't run script at all if I ad +d this) # while($columncount <= $#columns && !(($columns[$columncount] = +~ /_data/))) { $columncount++; } $columns[$columncount] =~ /_dataS(\d+)R/; my $currentReplicateID = $1; my $currentReplicateCount = 1; $ctrlStartCol = $columncount++; while($columncount <= $#columns) { $columns[$columncount] =~ /_dataS(\d+)R/; my $newReplicateID = $1; if($newReplicateID ne $currentReplicateID) { push(@replicateCount, $currentReplicateCount); $currentReplicateID = $newReplicateID; $currentReplicateCount = 1; } else { $currentReplicateCount++; } $columncount++; } #add the last replicate in push(@replicateCount, $currentReplicateCount); ###### End of Data Column Headers #### ####### Read remainder of the file ############## while (<CURINFILE>) { #add metabolite ID, MZ, RT to an array chomp $_; my @templine = split (/\t/, $_); push(@tempratio, $templine[$metabolite]); push(@tempratio, $templine[$masstimes]); push(@tempratio, $templine[$rt]); #ERROR #add intensities from the samples my $columnIndex = $ctrlStartCol; for(my $k = 0; $k <= $i; $k++) { $columnIndex += $replicateCount[$k]; } for(my $j = 0; $j < $replicateCount[$i+1]; $j++) { push(@tempratio, $templine[$columnIndex+$j]); } } } # end of main if loop close CURINFILE; } #end of main for loop ############## Start of output ################## print "\nWriting output..."; #create a new Directory and open output files and print out those valu +es from the hash that meet the filtering criteria #filtering criteria: defaults set at pvalue < 0.05, 0.5 <ratio > 1.5. +User specified mkdir "$pathname" or die "Error couldn't create new Directory"; open my $OUT1, ">$pathname/Metabolite ID.txt" or die "error couldn't o +pen output file"; open my $OUT2, ">$pathname/masstimes.txt" or die "error couldn't open +output file"; open my $OUT3, ">$pathname/retentiontimes.txt" or die "error couldn't +open output file"; open my $OUT4, ">$pathname/intensitydata.txt" or die "error couldn't o +pen output file"; print $OUT1 "$tempratio[0]"; print $OUT2 "$tempratio[1]"; print $OUT3 "$tempratio[2]"; print $OUT4 "$tempratio[3]"; close $OUT1; close $OUT2; close $OUT3;
PS: The output only print 1st value for all OUT1 to OUT4 instead all the values in the whole columns...why?? my DataR names are something like this across the columns:
8899_Neg_Rep01_dataS01R01 8889_Neg_Rep02_dataS01R02 8889_Neg_Rep03_dataS01R03 7499_Neg_Rep01_dataS02R01 7499_Neg_Rep02_dataS02R02 7499_Neg_Rep03_dataS02R03 7709_Neg_Rep01_dataS05R01 7709_Neg_Rep02_dataS05R02 7709_Neg_Rep03_dataS05R03 (and so on...)
That's why I attempt a while loop to try match but I think somewhere is wrong? Not sure if it is the right way to do this.