PerlMonks  

Re^2: fasta hash

by morio56 (Initiate)
on Aug 26, 2011 at 13:51 UTC


in reply to Re: fasta hash
in thread fasta hash

Thanks. But then what will the value for the ID key be, since the IDs file contains only IDs and nothing else?


Re^3: fasta hash
by moritz (Cardinal) on Aug 26, 2011 at 13:56 UTC

      I have changed the code, but now my problem seems to be that I can only access the last line of the output outside the loops. I wonder if there's a way to store the variables inside the loop to be accessible outside. The code looks like this now.

      if(@ARGV < 3){
          die "Not enough arguments\n";
      }
      $sequence="";
      $fastaID;
      open(FILE1,"$ARGV[0]") or die "No fasta file provided in command line: $!\n";
      while ($line=<FILE1>){
          chomp($line);
          if ($line=~/^\s*$/){
              next;
          }elsif ($line=~/^.*$/){
              $fastaID=$line;
              $fastahash{$fastaID}=1;
          }
      }
      open(FILE2,"$ARGV[2]") or die "No fasta file provided in command line: $!\n";
      while($line2=<FILE2>){
          chomp($line2);
          if ($line2=~/^>/){
              @data=split(" ",$line2);
              $fasta=$data[1];
              $sequence="";
          }else{
              $sequence.=$line2;
          }
      }
      if (exists $fastahash{$fasta}){
          print "$fastaID\t $sequence\n";
      }
      exit;

      And the output, which is just the last key in the fastahash, is:

      2056360013 Musacgagchagshgashcgahcgacacsasasasacsacsasasacacaascassacsaascascascascac

        Sadly you ignored half of my advice; it wasn't given without reason. Please reconsider your position.

        Your print is executed only once because it's not inside a loop (it should be inside the while loop). Proper indentation would have made that obvious, and use strict would have told you that $fasta isn't in scope in the if (exists $fastahash{$fasta}) (if you had declared it properly inside the while loop).
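A minimal sketch of what that advice could look like in practice, not the original poster's final code. It enables strict and warnings, declares every variable in the narrowest scope, and checks each record as soon as it is complete, so every matching record is kept and not only the last one. The sub names read_ids and filter_fasta and the sample data are made up for illustration. The assumption that the ID is the second whitespace-separated field of the header is taken from the posted code ($data[1]).

```perl
use strict;
use warnings;

# Read one ID per line from a filehandle into a lookup hash.
sub read_ids {
    my ($fh) = @_;
    my %wanted;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines
        $wanted{$line} = 1;
    }
    return \%wanted;
}

# Return [id, sequence] pairs for FASTA records whose ID is in %$wanted.
# Assumption (from the posted code): the ID is the second
# whitespace-separated field of the header line.
sub filter_fasta {
    my ($fh, $wanted) = @_;
    my @hits;
    my ($id, $sequence);

    # Keep the record collected so far, if it is one we want.
    my $flush = sub {
        push @hits, [$id, $sequence]
            if defined $id && exists $wanted->{$id};
    };

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            $flush->();                  # finish the previous record
            $id       = (split ' ', $line)[1];
            $sequence = '';
        }
        elsif (defined $id) {
            $sequence .= $line;
        }
    }
    $flush->();                          # do not forget the last record
    return \@hits;
}

# Demo with in-memory filehandles and made-up data.
my $ids_text   = "id2\n\nid3\n";
my $fasta_text = ">seq id1 first\nACGT\n>seq id2 second\nGGCC\nTTAA\n>seq id3 third\nAAAA\n";
open my $ids_fh,   '<', \$ids_text   or die "Cannot open in-memory IDs: $!";
open my $fasta_fh, '<', \$fasta_text or die "Cannot open in-memory FASTA: $!";

my $hits = filter_fasta($fasta_fh, read_ids($ids_fh));
print "$_->[0]\t$_->[1]\n" for @$hits;
```

With real files, the two in-memory opens would be replaced by three-argument opens on the file names from @ARGV. The check happens both when a new header starts and once more after the loop, because the last record has no following header to trigger it.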

Re^3: fasta hash
by ForgotPasswordAgain (Deacon) on Aug 26, 2011 at 22:37 UTC

    As moritz said, 1 (or ++) is a common choice of value, but it's not necessarily the best one. It doesn't usually matter these days for 100k elements, but it's better (thanks, Liz! ;) for memory size to do something like this:

    my ($undef);
    while (whatever...) {
        ....
        $hash{$key} = $undef;
    }

    This way each element points to the same $undef value. Otherwise, each element would point to a different copy of the value 1. That's a kind of "poor man's aliasing". For bonus points, you might look at Array::RefElem or Data::Alias.
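One practical consequence of storing undefined values, shown as a small sketch with made-up IDs: membership then has to be tested with exists, because a plain truth test on the value fails. The sketch does not measure memory use. The hash-slice assignment is just one compact way to create keys with undefined values.

```perl
use strict;
use warnings;

my @ids = qw(id1 id2 id3);    # made-up IDs for illustration

# Variant 1: store 1 for every key.
my %seen_one;
$seen_one{$_} = 1 for @ids;

# Variant 2: a hash slice assigned the empty list creates every key
# with an undefined value.
my %seen_undef;
@seen_undef{@ids} = ();

# exists() answers "is this key present?" regardless of the value...
print "id2 present in both\n"
    if exists $seen_one{id2} && exists $seen_undef{id2};

# ...whereas a plain truth test only works for the value-1 variant.
print "truth test fails for undef values\n"
    unless $seen_undef{id2};
```

Code that already uses exists, as the posted script does, works the same with either variant.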
