PerlMonks  

Re^4: fasta hash

by morio56 (Initiate)
on Aug 26, 2011 at 15:12 UTC ([id://922670])


in reply to Re^3: fasta hash
in thread fasta hash

I have changed the code, but now my problem seems to be that I can only access the last line of the output outside the loops. I wonder if there's a way to store the variables set inside the loop so that they are accessible outside it. The code now looks like this:

if(@ARGV < 3){
    die "Not enough arguments\n";
}
$sequence="";
$fastaID;
open(FILE1,"$ARGV[0]") or die "No fasta file provided in command line: $!\n";
while ($line=<FILE1>){
    chomp($line);
    if ($line=~/^\s*$/){
        next;
    }elsif ($line=~/^.*$/){
        $fastaID=$line;
        $fastahash{$fastaID}=1;
    }
}
open(FILE2,"$ARGV[2]") or die "No fasta file provided in command line: $!\n";
while($line2=<FILE2>){
    chomp($line2);
    if ($line2=~/^>/){
        @data=split(" ",$line2);
        $fasta=$data[1];
        $sequence="";
    }else{
        $sequence.=$line2;
    }
}
if (exists $fastahash{$fasta}){
    print "$fastaID\t $sequence\n";
}
exit;

And the output, which is just the last key in the fastahash, is:

2056360013 Musacgagchagshgashcgahcgacacsasasasacsacsasasacacaascassacsaascascascascac

Replies are listed 'Best First'.
Re^5: fasta hash
by moritz (Cardinal) on Aug 26, 2011 at 16:38 UTC

    Sadly you ignored half of my advice; it wasn't given without reason. Please reconsider your position.

    Your print can only be executed once because it is not inside a loop; it should be inside the while loop. Proper indentation would have made that obvious, and use strict would have told you that $fasta isn't in scope at the if (exists $fastahash{$fasta}) check (had you declared it properly inside the while loop).
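    A minimal sketch of what this advice could look like in practice, not the original poster's final code. It assumes headers of the form "> ID extra" (so the ID is the second whitespace-separated field, as in the posted script) and an ID file with one ID per line. The helper names read_ids and print_wanted and the sample data are made up for illustration. Each record is printed from inside the while loop as soon as it is complete, with one extra call after the loop for the last record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read one ID per line into a lookup hash.
sub read_ids {
    my ($fh) = @_;
    my %wanted;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;       # skip blank lines
        $wanted{$line} = 1;
    }
    return \%wanted;
}

# Hypothetical helper: print every wanted record as "ID<TAB>sequence".
sub print_wanted {
    my ($fh, $wanted, $out) = @_;
    my ($id, $sequence);                # must outlive a single loop iteration
    my $flush = sub {
        print {$out} "$id\t$sequence\n"
            if defined $id && exists $wanted->{$id};
    };
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            $flush->();                 # the previous record is now complete
            my @data = split ' ', $line;
            ($id, $sequence) = ($data[1], '');
        }
        elsif (defined $id) {
            $sequence .= $line;         # sequence may span several lines
        }
    }
    $flush->();                         # the last record has no following header
}

# Demo with in-memory files (made-up data in the thread's format).
my $ids_text   = "2056360013\n";
my $fasta_text = "> 2056360012 a\nyyya\ncgag\n> 2056360013 b\nxxxx\ncgag\n";
open my $ids_fh,   '<', \$ids_text   or die "cannot open in-memory IDs: $!\n";
open my $fasta_fh, '<', \$fasta_text or die "cannot open in-memory FASTA: $!\n";
print_wanted($fasta_fh, read_ids($ids_fh), \*STDOUT);
```

    With the demo data, only the record for 2056360013 is printed, with its two sequence lines joined. Under use strict, a misspelled or undeclared variable such as $fasta becomes a compile-time error instead of a silent empty value.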

      Thanks for your help, but none of these things work, even if I follow your instructions to the letter. Putting the if block inside the while loop only prints the IDs and no sequences. Putting it in the big while loop prints several lines repeatedly.

        Instead of processing the second file line by line, it is possible to use the '>' character as the input record separator ($/). This simplifies processing of the file.

        #!/usr/bin/perl -w
        use strict;

        open (IDS, '<', "fastaIds") or   #with your data 2056360012 2056360013
            die "cannot open fastaIds $!\n";
        #-------------#
        my %ids;
        while (<IDS>)   #process first file with only ID's
        {
            chomp;
            $ids{$_}=1;
        }
        #-------------#
        $/='>';   #input record separator is now '>'
        while (<DATA>)
        {
            chomp;                      # now works on '>' not \n
            next if /^\s*$/;            # first record will be blank
            my ($id) = /^\s*(\d+)/;
            print ">$_" if $ids{$id};   #print all lines for this id
        }
        #-------------#
        =prints
        > 2056360012 1047627436237
        yyyacgagchagshgashcgahcgac
        acsasasasacsacsasasacaca
        ascassacsaascascascascac
        > 2056360013 1047627436238
        xxxxcgagchagshgashcgahcgac
        acsasasasacsacsasasacaca
        ascassacsaascascascascac
        =cut

        __DATA__
        > 2056360012 1047627436237
        yyyacgagchagshgashcgahcgac
        acsasasasacsacsasasacaca
        ascassacsaascascascascac
        > 2056360013 1047627436238
        xxxxcgagchagshgashcgahcgac
        acsasasasacsacsasasacaca
        ascassacsaascascascascac
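        The original question also asked how to keep values from inside the loop available afterwards. One possible variation on the record-separator approach, sketched here under the same assumptions about the data (numeric ID as the first field after '>'), is to collect the wanted sequences in a hash and return it. The helper name collect_wanted and the sample data are hypothetical. Using local limits the change to $/ to the sub, so other reads in the program still work line by line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return { id => sequence } for the wanted IDs only.
sub collect_wanted {
    my ($fh, $wanted) = @_;
    my %seq_for;
    local $/ = '>';                     # restored automatically when the sub returns
    while (my $record = <$fh>) {
        chomp $record;                  # strips a trailing '>', not "\n"
        next if $record =~ /^\s*$/;     # text before the first '>' is empty
        my ($header, @lines) = split /\n/, $record;
        my ($id) = $header =~ /^\s*(\d+)/;
        next unless defined $id && $wanted->{$id};
        $seq_for{$id} = join '', @lines;    # join the wrapped sequence lines
    }
    return \%seq_for;
}

# Demo with an in-memory file (made-up data in the thread's format).
my $fasta_text = "> 2056360012 a\nyyya\ncgag\n> 2056360013 b\nxxxx\ncgag\n";
open my $fh, '<', \$fasta_text or die "cannot open in-memory FASTA: $!\n";
my $seq_for = collect_wanted($fh, { '2056360013' => 1 });

# The hash is still available here, after the loop has finished.
print "$_\t$seq_for->{$_}\n" for sort keys %$seq_for;
```

        Holding every wanted sequence in memory is fine for small inputs; for very large FASTA files, printing each record inside the loop avoids that cost.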
