PerlMonks  

subtracting values from 2 hashes

by lecb (Acolyte)
on Jun 21, 2014 at 18:59 UTC (#1090786, perlquestion)
lecb has asked for the wisdom of the Perl Monks concerning the following question:

I seem to be struggling with something basic here... I have 2 input files:

    CTACTCTCTCGTTTCCTAGGCTC -1.06
    CTACTCTCTCGTTTCCTAGGCTC -1.06
    CTACTCTCTCGTTTCCTAGGCTC -1.06
    CATGGTCTCATCTTCCTAGGGAG -2.32
    CATGGTCTCATCTTCCTAGGGAG -2.32
    TAAGGCAGCCCACCCGCAGGCTG -15.60
    AATAAGAGTAAGGACTTACTCTT -30.64
    TTTCCCTTTCCCCTGCCAGATCT -11.24
    TCTATCCTTTGTTTTACAGGAAC -3.05
    ACTGTGTATAAATACTTACATCC -16.93
    CGGTCCAGGCGTCGGCTACCTGG -22.77
    CGGTCCAGGCGTCGGCTACCTGG -22.77
    CAGGTACGTATTTTTCCAGGAAG -7.75
    CCTGGGAAGAATGTCCTACCTGA -22.07
    TTTCTCTTTCTTCAAACAGATGA -13.04
    CCCCTTTCAAGTGACTCACAAGA -22.38
    AGTGTCCTAGACGAAACACGTGA -17.22
    CCACAATCTGATCACATACCTGA -19.09
    GTGAGTGTCAGAGCCCTGTGGGC -31.44
    GGTGACCTTTAAGGGCAAAATGT -17.26
    GGTGACCTTTAAGGGCAAAATGT -17.26
    GGTGACCTTTAAGGGCAAAATGT -17.26
    TCGCCAAGGTCAGTGGCACAACT -31.06

and this is the second one

    CTACTCTCTCATTTCCTAGGCTC -1.82
    CTACTCTCTCATTTCCTAGGCTC -1.82
    CTACTCTCTCATTTCCTAGGCTC -1.82
    CATGGTCTCGTCTTCCTAGGGAG -1.06
    CATGGTCTCGTCTTCCTAGGGAG -1.06
    TAAGGCAGCTCACCCGCAGGCTG -13.51
    AATAAGAGTAAGGTCTTACTCTT -28.26
    TTTCCCTTTCCCTTGCCAGATCT -11.36
    TCTATCCTTTGCTTTACAGGAAC -3.27
    ACTGTGTATAAATGCTTACATCC -17.69
    CGGTCCAGGCGGCGGCTACCTGG -25.61
    CGGTCCAGGCGGCGGCTACCTGG -25.61
    CAGGTACGTGTTTTTCCAGGAAG -6.62
    CCTGGGAAGAATGTTCTACCTGA -22.07
    TTTCTCTTTCTCCAAACAGATGA -13.05
    CCCCTTTCATGTGACTCACAAGA -16.03
    AGTGTCCTAGACAAAACACGTGA -16.88
    CCACAATCTGAGCACATACCTGA -26.65
    GTGAGTGTCGGAGCCCTGTGGGC -26.06
    GGTGACCTTTAAAGGCAAAATGT -17.87
    GGTGACCTTTAAAGGCAAAATGT -17.87
    GGTGACCTTTAAAGGCAAAATGT -17.87
    TCGCCAAGGTCAGTAGCACAACT -35.97

The sequences are the same in both files, just the number varies. I want to be able to output:

[whatever sequence is] [difference between 2 values].

It's obvious to me to use a hash, but when I try and do the calculation in my script it falls down. I tried making some simpler dummy data and the @calc array worked, so I'm not sure where I'm going wrong. My code is here:

    #!/usr/bin/perl -w
    use strict;

    my $file1 = $ARGV[0];
    my $file2 = $ARGV[1];

    open (FILE1, $file1) or die "Uh oh.. unable to find file $file1"; ##Opens input file
    open (FILE2, $file2) or die "Unable to find $file2";

    my @maxent_unchanged = <FILE1>; #loads inputfile1 data into array
    close FILE1;

    my @maxent_with_variant = <FILE2>; ## loads ref genome
    close FILE2;

    my @NM;
    my @max_score_unchanged;
    my %max_unchanged;

    foreach my $line(@maxent_unchanged) {
        if ($line =~ m/[a-z]/i) {
            push (@NM, $line);
        }
        else {
            push (@max_score_unchanged, $line);
        }
    }

    my $i = 0;
    foreach my $lines(@maxent_unchanged) {
        $max_unchanged{$NM[$i]} = $max_score_unchanged[$i];
        $i++;
    }

    my @NM_ID;
    my @max_score_changed;
    my %max_changed;

    foreach my $line(@maxent_with_variant) {
        if ($line =~ m/[a-z]/i) {
            push (@NM_ID, $line);
        }
        else {
            push (@max_score_changed, $line);
        }
    }

    my $i = 0;
    foreach my $lines(@maxent_with_variant) {
        $max_changed{$NM_ID[$i]} = $max_score_changed[$i];
        $i++;
    }

    print %max_unchanged;
    print "\n";
    print "\n";
    print "\n";
    print %max_changed;

    my @calc;
    foreach my $key (keys(%max_changed)) {
        my $value1 = $max_unchanged{$key};
        my $value2 = $max_changed{$key};
        my $calc = $value1 - $value2;
        push (@calc, $calc);
    }

    use Data::Dumper;
    print Dumper @calc;

my other script which works is here:

    #!/usr/bin/perl -w
    use strict;

    my %hash;
    my %hash2;

    %hash = ('John', '-455.45', 'Jack', '-300.00', 'Tom', '-766.75');
    %hash2 = ('Jack', '-200.00', 'John', '-44.25', 'Tom', '-999.23');

    use Data::Dumper;
    print Dumper %hash2;
    print "\n";
    use Data::Dumper;
    print Dumper %hash;

    my @calc;
    foreach my $key (keys(%hash2)) {
        my $value1 = $hash{$key};
        my $value2 = $hash2{$key};
        my $calc = $value1 - $value2;
        push (@calc, $calc);
    }

    print "The difference is:", "\n";
    use Data::Dumper;
    print Dumper @calc;

When I Data::Dump the arrays, I can see that they hold the correct information, so I don't understand why I can't get what I want... Go easy on me.. I'm still a beginner...

Re: subtracting values from 2 hashes
by neilwatson (Curate) on Jun 21, 2014 at 19:18 UTC

    Too much info there. Perhaps narrow down your next post :). Welcome to Perl.

    I would read each file into separate hashes, so that $file{seq} = value. Then you compare the values from each.

        foreach my $key ( keys %file1 ) {
            $diff = $file1{$key} - $file2{$key};
        }

    Your open statements should use variables, not bare words (deprecated), and use an operator. Example:

    open( my $fh1, "<", $file1 ) or die "Cannot open [$file1], [$!]";

    Use [] for visual separation. $! contains any error message that Perl may have for you.
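
    A minimal sketch of the approach described above, not a definitive implementation. It assumes each input line holds one sequence and one score separated by whitespace; the helper names read_scores and score_differences are made up for illustration and do not come from the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read "SEQUENCE SCORE" records from an open filehandle into a hash
# (sequence => score). Assumes one whitespace-separated pair per line.
sub read_scores {
    my ($fh) = @_;
    my %score;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $seq, $value ) = split ' ', $line;
        next unless defined $value;    # skip blank or malformed lines
        $score{$seq} = $value;         # a repeated sequence overwrites the earlier entry
    }
    return %score;
}

# For every sequence present in both hashes, subtract the second score
# from the first. Sequences missing from the second hash are skipped.
sub score_differences {
    my ( $first, $second ) = @_;
    my %diff;
    for my $seq ( keys %$first ) {
        next unless exists $second->{$seq};
        $diff{$seq} = $first->{$seq} - $second->{$seq};
    }
    return %diff;
}

# Command-line use (hypothetical script name): perl diff_scores.pl file1 file2
if ( @ARGV == 2 ) {
    my ( $file1, $file2 ) = @ARGV;
    open( my $fh1, '<', $file1 ) or die "Cannot open [$file1], [$!]";
    open( my $fh2, '<', $file2 ) or die "Cannot open [$file2], [$!]";
    my %file1 = read_scores($fh1);
    my %file2 = read_scores($fh2);
    close $fh1;
    close $fh2;

    my %diff = score_differences( \%file1, \%file2 );
    printf "%s %.2f\n", $_, $diff{$_} for sort keys %diff;
}
```

    Because the hashes are keyed on the sequence, a sequence that appears on several lines collapses to a single entry, and a sequence found in only one file produces no output line.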

    Neil Watson
    watson-wilson.ca

      Thank you Neil - I will make sure not to put so much in next time (I was afraid if I didn't, and I had made an error somewhere further up, it would have been missed). Didn't mean to overcrowd!

      Please excuse my ignorance, but I have been trying to load both files into their own hashes, the two hashes are:

          %max_unchanged
          %max_changed

      Maybe there is a much simpler way.. but if I use this:

          my @calc;
          $diff;
          foreach my $key (keys %max_changed) {
              my $diff = $max_unchanged{$key} - $max_changed{$key};
              push (@calc, $diff);
          }
          print @calc, "\n";

      I still get the same error.. :-s...

      Re your second point, I wasn't too sure what you meant by

      "Your open statements should use variables, not bare words (deprecated), and use an operator."

      When you say "bare words", are you referring to a FILEHANDLE being called FILE?
      Many thanks, E

        Correct, instead of using file handles as bare words (FILE) we now use normal scalars.

        Neil Watson
        watson-wilson.ca

Re: subtracting values from 2 hashes
by poj (Priest) on Jun 21, 2014 at 20:03 UTC
    The sequences are the same in both files
    Are you sure? These are the first 3 records of each file; the 11th character is different.

    file1 = CTACTCTCTCGTTTCCTAGGCTC
    file2 = CTACTCTCTCATTTCCTAGGCTC
                      ^
    file1 = CTACTCTCTCGTTTCCTAGGCTC
    file2 = CTACTCTCTCATTTCCTAGGCTC
                      ^
    file1 = CTACTCTCTCGTTTCCTAGGCTC
    file2 = CTACTCTCTCATTTCCTAGGCTC
                      ^

    I don't think any are the same.
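
    One way to check this on the full data, sketched under assumptions: unmatched_keys and differences_by_position are hypothetical helper names, and pairing record N of file1 with record N of file2 is only a guess at the intended comparison when the sequences differ by a single base.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the keys of %$first that do not exist in %$second, in sorted order.
sub unmatched_keys {
    my ( $first, $second ) = @_;
    my @missing = grep { !exists $second->{$_} } sort keys %$first;
    return @missing;
}

# Pair records by position instead of by sequence (an assumption about the
# data): record N of the first list is compared with record N of the second.
# Each record is a "SEQUENCE SCORE" string.
# Returns a list of [ seq1, seq2, score1 - score2 ] array refs.
sub differences_by_position {
    my ( $records1, $records2 ) = @_;
    die "Record counts differ\n" unless @$records1 == @$records2;
    my @out;
    for my $i ( 0 .. $#$records1 ) {
        my ( $seq1, $score1 ) = split ' ', $records1->[$i];
        my ( $seq2, $score2 ) = split ' ', $records2->[$i];
        push @out, [ $seq1, $seq2, $score1 - $score2 ];
    }
    return @out;
}

# First record of each file from the question.
my %file1 = ( 'CTACTCTCTCGTTTCCTAGGCTC' => -1.06 );
my %file2 = ( 'CTACTCTCTCATTTCCTAGGCTC' => -1.82 );
print "Only in file1: $_\n" for unmatched_keys( \%file1, \%file2 );

my @rows = differences_by_position(
    ['CTACTCTCTCGTTTCCTAGGCTC -1.06'],
    ['CTACTCTCTCATTTCCTAGGCTC -1.82'],
);
printf "%s %s %.2f\n", @$_ for @rows;
```

    If unmatched_keys returns every key, a lookup such as $max_unchanged{$key} with a key taken from the other hash yields undef, so the subtraction has nothing valid to work on.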
    poj

      I am a numpty!!! Haha, no problem, I can easily fix it now! Thanks! x
