
Re^4: Comparing and getting information from two large files and appending it in a new file

by perlkhan77 (Acolyte)
on Apr 01, 2012 at 01:32 UTC (#962819)


in reply to Re^3: Comparing and getting information from two large files and appending it in a new file
in thread Comparing and getting information from two large files and appending it in a new file

Hi Graff, one more thing about the result file being produced: the number of lines with the Gm10 assembly in Methylation.gtf is 26637, so the final result should have the same number of lines, with a 0 value for those genes that have no CG, CHG or CHH count. Right now it prints only 8626 lines, including the header. Sorry to trouble you about that, but could you let me know what changes I should make in the code to make this possible?

Thanks again


Re^5: Comparing and getting information from two large files and appending it in a new file
by graff (Chancellor) on Apr 01, 2012 at 02:19 UTC
    To get one line of output for every line in your first input file, there are a few changes.
        ...
        my %methrange;
        my %methhash;   # this line had been further down in my prev. version, just move it up
        ...
            if ( /^\s*Gm10/ ) {
                my ( $bgn, $end, $methcount ) = (split)[3,4,10];
                $methrange{$bgn}{$end}{$_} = undef;
                $methhash{$_}{methcount} = $methcount;   # moved up from below
            }
        ...
                for my $end ( keys %{$methrange{$bgn}} ) {
                    if ( $position <= $end ) {
                        for my $match ( keys %{$methrange{$bgn}{$end}} ) {
                            $methhash{$match}{$class}++;   # methcount was moved from here
                        }
                    }
                }
        ...
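    The output loop is not shown in the snippet above, so the following is only a minimal sketch of how it might print a 0 for any class that was never counted. The sample data, the `@classes` list and the `@rows` array are assumptions made for illustration (as is the idea that the hash keys are chomped input lines); the defined-or operator `//` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical sample data: keys stand in for (chomped) lines of the
# first input file, as in %methhash above.
my %methhash = (
    "Gm10\tgeneA" => { methcount => 12, CG => 3, CHH => 1 },
    "Gm10\tgeneB" => { methcount => 7 },    # no CG/CHG/CHH hits at all
);

my @classes = qw(CG CHG CHH);               # assumed column order
my @rows;
for my $line ( sort keys %methhash ) {
    my $counts = $methhash{$line};
    # A class that was never incremented has no hash entry; print 0 for it.
    my @vals = map { $counts->{$_} // 0 } @classes;
    push @rows, join "\t", $line, $counts->{methcount}, @vals;
}
print "$_\n" for @rows;
```

    Because every Gm10 line now gets an entry in %methhash as soon as it is read, a loop of this shape would emit one row per input line, whether or not any position fell inside its range.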
    As for the benchmark, you said that your OP version "took forever". Was "forever" more than an hour and a half? (Did my version yield any improvement at all?) Do you have specific constraints about how much time can be taken up by a single run? If not, I'd say focus more on making sure the output is correct, rather than how long it takes to produce the output.
