PerlMonks  

Re: reading files in directory

by Riales (Hermit)
on Mar 19, 2012 at 23:51 UTC (#960498)


in reply to reading files in directory

Which part are you having trouble with, exactly? Opening the fasta files for reading? Parsing the files? Using a hash?

What does a 'fasta' file look like? You say you want to store a sequence in a hash. What would you use as a key?


Re^2: reading files in directory
by anonym (Acolyte) on Mar 20, 2012 at 00:01 UTC

    The file format is

    >chr1
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATGCNNNNNNNNNNNNNNNNGCAT........
    >chr2
    ATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN.
    ....
    EOF

    I have split this multifasta file into 25 pieces, which are in my current working directory. However, I am confused about how to read all these fasta files and store each header line as a hash key with its sequence as the value. Thanks

      Well, the first thing you'll need to do is tell Perl what the relevant files are. The simplest way to do that is to pass the filenames in as command-line arguments. Look for @ARGV in perldoc perlvar for that.
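
      A minimal sketch of the @ARGV approach, just to show where the filenames end up. The variable name @filenames and the usage message are only illustrative choices:

```perl
use strict;
use warnings;

# @ARGV holds whatever was typed after the script name on the command line.
my @filenames = @ARGV;

if (@filenames) {
    print "Will read: $_\n" for @filenames;
}
else {
    # No arguments given: remind the user how to call the script.
    warn "Usage: $0 file1 [file2 ...]\n";
}
```

      Called as something like perl my_script.pl *.fa, the shell expands the wildcard before Perl starts, so @ARGV receives one entry per matching file. The .fa extension here is an assumption; use whatever your 25 pieces are actually called.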

      If the files are all in the same directory and there are no other files in the directory (and you don't want to type out 25 separate filenames), you could use a loop on STDIN to receive the filenames piped into your script:

      use Data::Dumper;

      my @filenames;
      push @filenames, $_ while (<STDIN>);
      chomp @filenames;

      print Dumper(@filenames);

      That way, you can get all your filenames into your script by simply doing:

      ls | my_script.pl

      As for actually opening each file for reading, you've already done that. What's confusing about reading the files? Have you checked perldoc -f open?
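
      In case it helps, here is a small sketch of opening each file with the three-argument form of open and a lexical filehandle. The helper name read_lines is made up for this example, and it assumes the filenames arrive in @ARGV:

```perl
use strict;
use warnings;

# Return all lines of one file with the trailing newlines removed.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";
    chomp(my @lines = <$fh>);
    close($fh);
    return @lines;
}

# Report how many lines each file named on the command line contains.
for my $path (@ARGV) {
    my @lines = read_lines($path);
    printf "%s: %d lines\n", $path, scalar @lines;
}
```

      Checking the return value of open and printing $! tells you right away whether a filename was wrong or unreadable, which is usually the first thing to rule out.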

      As for hashes, they're pretty simple:

      my %hash;                                   # Declare your hash.
      $hash{$key} = $sequence;                    # Assign $sequence to the hash with key $key.
      my $sequence_im_looking_for = $hash{$key};  # Get the sequence by looking it up with its key $key.

      It looks like you'll be alternating lines: line 1 is the key for the sequence in line 2, line 3 is the key for the sequence in line 4, etc.
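
      Putting the pieces together, a rough sketch of one way to fill the hash might look like this. It assumes each header starts with '>' and that everything after the '>' is the key. It also appends every following line to the current sequence, so it should cope with strictly alternating lines as well as sequences wrapped over several lines. The names read_fasta, %seq_for and %all are invented for the example:

```perl
use strict;
use warnings;

# Parse one FASTA file into a hash: header (without '>') => sequence.
sub read_fasta {
    my ($path) = @_;
    my (%seq_for, $header);
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';               # skip blank lines
        if ($line =~ /^>(.*)/) {
            $header = $1;                  # e.g. "chr1"
            $seq_for{$header} = '';
        }
        elsif (defined $header) {
            $seq_for{$header} .= $line;    # join wrapped sequence lines
        }
    }
    close($fh);
    return %seq_for;
}

# Merge the sequences from every file named on the command line.
my %all;
for my $path (@ARGV) {
    my %seqs = read_fasta($path);
    @all{ keys %seqs } = values %seqs;     # later files overwrite duplicate headers
}

printf "%s\t%d bases\n", $_, length $all{$_} for sort keys %all;
```

      If two of your files could contain the same header, the later one silently wins here; you may prefer to die or warn in that case instead.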
