
Re^2: fasta hash

by morio56 (Initiate)
on Aug 26, 2011 at 13:51 UTC

in reply to Re: fasta hash
in thread fasta hash

Thanks. But then what will be the value for the id key, since the ids file only contains ids and nothing else?

Re^3: fasta hash
by ForgotPasswordAgain (Deacon) on Aug 26, 2011 at 22:37 UTC

    As moritz said, 1 (or ++) is a common choice of value, but it's not necessarily the best one. It doesn't usually matter these days for 100k elements, but it's better (thanks, Liz! ;) for memory size to do something like this:

    my ($undef);
    while (whatever...) {
        ....
        $hash{$key} = $undef;
    }

    This way each element points to the same $undef value. Otherwise, each element would point to a different copy of the value 1. That's a kind of "poor man's aliasing". For bonus points, you might look at Array::RefElem or Data::Alias.
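    A minimal sketch of that idiom, for illustration only. The helper name build_id_set and the second id are made up, and the sketch assumes only the keys matter. Because every value is undef, membership has to be tested with exists rather than by looking at the value.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build a lookup "set" of ids.
# Only the keys matter, so every value is assigned from one undef scalar.
sub build_id_set {
    my @ids = @_;
    my %seen;
    my $undef;                       # never assigned, so it stays undef
    $seen{$_} = $undef for @ids;
    return \%seen;
}

my $set = build_id_set('2056360013', '2056360014');   # second id is made up

# Values are undef, so test membership with exists, not with truthiness.
print exists $set->{'2056360013'} ? "found\n" : "missing\n";
print exists $set->{'9999'}       ? "found\n" : "missing\n";
```

    A hash slice such as @seen{@ids} = (); is another common way to fill such a set in one statement.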

Re^3: fasta hash
by moritz (Cardinal) on Aug 26, 2011 at 13:56 UTC

      I have changed the code, but now my problem seems to be that I can only access the last line of the output outside the loops. I wonder if there's a way to store the variables inside the loop to be accessible outside. The code looks like this now.

      if(@ARGV < 3){
          die "Not enough arguments\n";
      }
      $sequence="";
      $fastaID;
      open(FILE1,"$ARGV[0]") or die "No fasta file provided in command line: $!\n";
      while ($line=<FILE1>){
          chomp($line);
          if ($line=~/^\s*$/){
              next;
          }elsif ($line=~/^.*$/){
              $fastaID=$line;
              $fastahash{$fastaID}=1;
          }
      }
      open(FILE2,"$ARGV[2]") or die "No fasta file provided in command line: $!\n";
      while($line2=<FILE2>){
          chomp($line2);
          if ($line2=~/^>/){
              @data=split(" ",$line2);
              $fasta=$data[1];
              $sequence="";
          }else{
              $sequence.=$line2;
          }
      }
      if (exists $fastahash{$fasta}){
          print "$fastaID\t $sequence\n";
      }
      exit;

      And the output, which is just the last key in the fastahash, is:

      2056360013 Musacgagchagshgashcgahcgacacsasasasacsacsasasacacaascassacsaascascascascac

        Sadly you ignored half of my advice; it wasn't given without reason. Please reconsider your position.

        Your print can only be executed once because it's not inside a loop (it should be inside the while loop). Proper indentation would have made that obvious, and use strict would have told you that $fasta isn't in scope in the if (exists $fastahash{$fasta}) (if you had declared it properly inside the while loop).
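        One way this advice might look in practice is sketched below. It is not the code from the thread: the helper name matching_records and the sample data are made up, and it assumes the id is the second whitespace-separated field of each header line (as in the original $data[1]). The check runs every time a record is complete, that is, at each new header inside the loop and once more after the loop for the last record.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "id\tsequence" strings for every FASTA record
# whose id appears in the ids handle. The id is assumed to be the second
# whitespace-separated field of the header line.
sub matching_records {
    my ($ids_fh, $fasta_fh) = @_;

    my %wanted;
    while (my $line = <$ids_fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;        # skip blank lines
        $wanted{$line} = 1;
    }

    my @out;
    my ($id, $sequence);
    # Keep the record collected so far, if it is one we want.
    my $flush = sub {
        push @out, "$id\t$sequence"
            if defined $id && exists $wanted{$id};
    };

    while (my $line = <$fasta_fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            $flush->();                  # previous record is complete
            my @fields = split ' ', $line;
            $id       = $fields[1];
            $sequence = '';
        }
        else {
            $sequence .= $line;
        }
    }
    $flush->();                          # last record in the file
    return @out;
}

# Usage with in-memory filehandles and made-up data:
my $ids   = "id2\n";
my $fasta = ">x id1 first\nACGT\n>x id2 second\nGGCC\nTTAA\n";
open my $ids_fh,   '<', \$ids   or die "Cannot open ids: $!";
open my $fasta_fh, '<', \$fasta or die "Cannot open fasta: $!";
print "$_\n" for matching_records($ids_fh, $fasta_fh);
```

        With use strict in place, every variable has to be declared with my, so a variable used outside the block that declares it is reported at compile time instead of silently being empty.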
