http://www.perlmonks.org?node_id=1007127
ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:

Update: switched the order of $s[0] and $s[2] in the push.

Hello monks, I am wondering if there is a way to do what I have posted below. There are two files, each containing gene, sample, and position columns. First, I want to count the unique sample/gene pairs in file2 for a given gene in file1. Then I want to check whether a specific position/gene pair from file1 is present in file2. For the position/gene check the samples do not matter, so I was hoping for some sort of wildcard in my hash lookup that just tells me whether the pair exists or not. Can I do this with only one hash, or do I need two separate hashes? I was hoping there might be a wildcard character that would let me write something like if (exists($Pos_overlap{$s[5]}{*}{$s[2]})) { Thanks for your help!

use strict;
use warnings;
use Data::Dumper;

my %Gene_overlap;
my %Pos_overlap;

open(my $fh2, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
my @file2 = <$fh2>;
close $fh2;

open(my $fh1, '<', $ARGV[1]) or die "Cannot open $ARGV[1]: $!";
my @file1 = <$fh1>;
close $fh1;

foreach (@file2) {
    chomp;
    my @s = split(/\t/, $_);    # Split the validations file to get gene name and amino acid
    # Push every gene/sample/position combination into the hash
    push(@{ $Pos_overlap{$s[5]}{$s[0]}{$s[2]} }, $s[2]);
}

foreach (@file1) {
    chomp;
    my @s = split(/\t/, $_);    # Split the line to get the sample / position / gene
    if (exists($Pos_overlap{$s[5]})) {    # Check whether this gene is also found in file2
        # Print how many unique samples are seen for this gene (sample identity
        # across files does not matter, just how many unique ones there are)
        print $_ . "\t" . (keys %{ $Pos_overlap{$s[5]} });

        ### This is the part I can't get to work ###
        # Check whether the exact gene/position combination is present in both files
        if (exists($Pos_overlap{$s[5]}{$s[2]})) {
            # Print how many times the variant is seen; it would also be
            # acceptable to just print a "1" to say it exists in both files
            print "\t" . (keys %{ $Pos_overlap{$s[5]}{$s[2]} }) . "\n";
        }
        else { print "\t0\n"; }    # Print 0 if no gene/position is found
    }
    else { print $_ . "\t0\t0\n"; }    # No gene overlap
}

File1 structure

P15    1    17085713    C    S     MST1P9

File2 structure

005    1    17085712    C    S     MST1P9
006    1    17085712    C    S     MST1P9
006    1    17085713    C    S     MST1P9
007    1    17085712    C    S     MST1P9
006    1    17085713    C    S     MST1P9
006    1    17085713    C    S     MST1P9
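
To make the desired output concrete, here is a minimal sketch of the two-hash alternative mentioned in the question, since Perl hash keys have no wildcard form. It is only an illustration under assumptions: the helper name build_index, the variable names, and splitting on whitespace (instead of tabs) are all hypothetical, and the rows are the sample lines shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: builds two lookups from file2-style rows.
# Field 0 is the sample, field 2 the position, field 5 the gene,
# matching the indices used in the code above.
sub build_index {
    my @rows = @_;
    my (%samples_by_gene, %hits_by_gene_pos);
    for my $row (@rows) {
        my ($sample, $pos, $gene) = (split ' ', $row)[0, 2, 5];
        $samples_by_gene{$gene}{$sample} = 1;    # unique samples per gene
        $hits_by_gene_pos{$gene}{$pos}++;        # rows per gene/position, any sample
    }
    return (\%samples_by_gene, \%hits_by_gene_pos);
}

my @file2 = (
    "005 1 17085712 C S MST1P9",
    "006 1 17085712 C S MST1P9",
    "006 1 17085713 C S MST1P9",
    "007 1 17085712 C S MST1P9",
    "006 1 17085713 C S MST1P9",
    "006 1 17085713 C S MST1P9",
);
my ($samples, $hits) = build_index(@file2);

my $line = "P15 1 17085713 C S MST1P9";          # the File1 row
my ($pos, $gene) = (split ' ', $line)[2, 5];

# Check the gene level first so the lookup does not autovivify new keys.
my $n_samples = exists $samples->{$gene} ? scalar keys %{ $samples->{$gene} } : 0;
my $n_hits    = exists $hits->{$gene}    ? ($hits->{$gene}{$pos} // 0)        : 0;

print join("\t", $line, $n_samples, $n_hits), "\n";
```

With the sample rows above, this reports 3 unique samples for MST1P9 and 3 rows at position 17085713. If a single hash is preferred, the gene/sample/position structure could instead be scanned with grep over the sample keys for that gene, at the cost of a loop per lookup.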