PerlMonks  

to read particular lines

by tarakaramji (Initiate)
on May 30, 2012 at 18:06 UTC (#973354)
tarakaramji has asked for the wisdom of the Perl Monks concerning the following question:

target: AT3G19880A, length: 1168
miRNA : miR156a
length: 20
mfe: -19.3 kcal/mol
p-value: 0.607541
position 165

I have a file that consists of repeated records like the one above (the file is 20 GB). I have written a Perl script that works on a smaller file: it matches the patterns and retrieves the information.

open(FH,$file) || die "cant open file\n";
#@lines=<FH>;
#close FH;
while ($line=<FH>)
{
    chomp $line;
    if ($line =~ /^target:\s+(\S+)/) {
        push (@target,$1);    #print "$1\n";
    }
    elsif ($line =~ /^miRNA :\s+(\S+)/) {
        push (@mirna,$1);     #print "$1\n";
    }
    elsif ($line =~ /^p-value:\s+(\S+)/) {
        push (@score,$1);     #print "$1\n";
    }
    elsif ($line =~ /^position\s+(\S+)/) {
        push (@start,$1);     #print "$1\n";
    }
}
$length=@mirna;
for ($i = 0; $i <$length; $i++)
{
    print "$mirna[$i]\t$target[$i]\t$start[$i]\t$score[$i]\n";
}

This is the code I have written, for which the output is:

miR156a AT3G19833 151 0.607541
miR156a AT3G19883 11 0.607541
miR156a AT3G19883 12 0.607541
miR156a AT3G19773 15 0.607541
miR156a AT3G19833 161 0.607541
miR156a AT3G19780 163 0.607541

The code I have written runs on smaller files but not on bigger ones. Can anybody suggest a better way to run the program efficiently?

Re: to read particular lines
by thezip (Vicar) on May 30, 2012 at 20:30 UTC

    Let me format this for you, with some recoding as well.

    Update:

    This probably won't work for you since I wasn't sure what your input data looked like. My assumption was that it was multi-space delimited, but as I re-read your spec, it might have the actual labels interspersed in there.

    Perhaps you could include some sample data to clear this up?


    use strict;
    use warnings;
    use autodie;

    my $length;
    my $file = 'the filename...';

    open(my $fh, '<', $file) || die "cant open file\n";
    while (<$fh>) {
        chomp;
        my($target, $mirna, $score, $start) = split(/\s+/);
        print join("\t", $mirna, $target, $start, $score), "\n";
        $length = $.;
    }
    close $fh;

    This is the code I have written, for which the output is:

    miR156a AT3G19833 151 0.607541
    miR156a AT3G19883 11 0.607541
    miR156a AT3G19883 12 0.607541
    miR156a AT3G19773 15 0.607541
    miR156a AT3G19833 161 0.607541
    miR156a AT3G19780 163 0.607541

    I haven't tested this, and this might not work exactly as you need it to, but I think it avoids some of the problems you might have had in your version.


    What can be asserted without proof can be dismissed without proof. - Christopher Hitchens, 1949-2011
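
For a 20 GB input, one likely cause of the trouble is that the script in the question pushes every match onto four arrays and prints only after the whole file has been read. Below is a minimal sketch of a streaming alternative that prints each record as soon as it is complete, so only one record is held in memory at a time. It assumes that each labelled field starts its own line (as the anchored patterns in the question imply) and that `position` is the last field of a record. The name `stream_records` is hypothetical, and the sketch has not been run against the real data.

```perl
use strict;
use warnings;

# Read labelled records from the filehandle $in and write one tab-separated
# row per record to $out. Only the current record is kept in memory.
# Returns the number of rows written.
sub stream_records {
    my ($in, $out) = @_;
    my %rec;
    my $rows = 0;
    while (my $line = <$in>) {
        if ($line =~ /^target:\s+([^\s,]+)/) {
            %rec = (target => $1);          # a "target:" line starts a new record
        }
        elsif ($line =~ /^miRNA\s*:\s+(\S+)/) {
            $rec{mirna} = $1;
        }
        elsif ($line =~ /^p-value:\s+(\S+)/) {
            $rec{score} = $1;
        }
        elsif ($line =~ /^position\s+(\S+)/) {
            # Assumption: "position" closes the record, so print it now.
            $rec{start} = $1;
            print {$out} join("\t", map { $_ // '' }
                              @rec{qw(mirna target start score)}), "\n";
            $rows++;
            %rec = ();
        }
    }
    return $rows;
}

# When a filename is given on the command line, stream it to STDOUT.
if (@ARGV) {
    my $file = shift @ARGV;
    open(my $fh, '<', $file) or die "can't open $file: $!\n";
    stream_records($fh, \*STDOUT);
    close $fh;
}
```

Saved under a hypothetical name such as `extract.pl`, it could be run as `perl extract.pl bigfile > out.tsv`. The `target` pattern stops at a comma so that a trailing `, length: ...` on the same line is not captured with the identifier.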
