http://www.perlmonks.org?node_id=958408


in reply to Creating 2 arrays from a larger array

Not sure of your data but this might help:
my @seqList;
my @sequences;
my $dnaString;
my $count = 0;
my @DNA = <DATA>;
foreach my $line (@DNA) {
    if ($line=~/^>(\S+)/){
        push (@seqList, $1);
        push (@sequences, $dnaString) if $count++;
        $dnaString = '';
    }
    else {
        chomp $line;
        $dnaString .= $line;
    }
}
push (@sequences, $dnaString); # need to push last one
__DATA__
>123 blah
abcdef
ghijkl
>456 de dah
mnopqr
>789 nothing wanted here
stuvwxyz

# OUTPUT
@seqList = ( '123', '456', '789' );
@sequences = ( 'abcdefghijkl', 'mnopqr', 'stuvwxyz' );

Re^2: Creating 2 arrays from a larger array
by imtakinbioinformatic (Initiate) on Mar 08, 2012 at 03:18 UTC
    Thanks tangent! I'm confused about how push (@sequences, $dnaString) ever runs. If the line matches >, the ID goes into the @seqList array, so how is the @sequences array being built?
      You are in a loop, so $dnaString is appended to every time a line doesn't match your start pattern. $dnaString is always one step behind @seqList, so when you reach the next start line, the previous $dnaString is pushed onto @sequences and then cleared for the next record. The best thing to do is try it: set $count = 1 and see what happens.

      Update: after re-reading this I'm a bit confused myself. Better to put a print statement within the loop:
      my $dnaString = '';
      ...
      if ($line=~/^>(\S+)/){
          print qq|Count: $count, Match: $1, String: $dnaString\n|;
          push (@seqList, $1);
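      To make the one-step lag visible, here is a minimal sketch that wraps the same loop in a sub so the starting value of $count can be varied. The sub name split_fasta, the @trace array, and the in-memory @lines list (standing in for <DATA>) are assumptions made for illustration, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: same logic as the loop above, but the initial
# value of $count is a parameter and each header match is recorded.
sub split_fasta {
    my ($start_count, @lines) = @_;
    my (@seqList, @sequences, @trace);
    my $dnaString = '';
    my $count     = $start_count;

    foreach my $line (@lines) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {
            # At this point $dnaString still holds the PREVIOUS record.
            push @trace, "Count: $count, Match: $1, String: '$dnaString'";
            push @seqList, $1;
            push @sequences, $dnaString if $count++;   # skipped when $count is 0
            $dnaString = '';
        }
        else {
            $dnaString .= $line;
        }
    }
    push @sequences, $dnaString;   # last record has no header after it
    return (\@seqList, \@sequences, \@trace);
}

# Sample records copied from the __DATA__ section of the original post.
my @lines = (
    ">123 blah\n", "abcdef\n", "ghijkl\n",
    ">456 de dah\n", "mnopqr\n",
    ">789 nothing wanted here\n", "stuvwxyz\n",
);

my ($ids, $seqs, $trace) = split_fasta(0, @lines);
print "$_\n" for @$trace;
print "ids:  @$ids\n";
print "seqs: @$seqs\n";

# Starting at 1 pushes the still-empty string at the first header.
my (undef, $shifted) = split_fasta(1, @lines);
print scalar(@$shifted), " sequences when \$count starts at 1\n";
```

      With $count starting at 0 the first header pushes nothing, so the two arrays line up. Starting at 1 adds an empty string at the front of @sequences, which shifts every sequence one slot away from its ID.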
        uggg still not getting it. No errors but the elements I printed were all the same. I'll keep working, thanks for explaining!
        Yeah here's the file http://www.med.nyu.edu/rcr/rcr/course/smutans.fas