PerlMonks
Re^4: Searching array against hash
by BrowserUk (Patriarch) on Aug 22, 2013 at 03:18 UTC [id://1050447]
"It will help if you are looking to retrieve a subsequence from the human genome, the FASTA file of which is about 5 Gb;"

I guess things have moved on. The version I have is just under 3GB and came in 25 files: chr(1-22, M, X, Y).

That said, if his 3 posted sequences are representative of his 900,000, his file is a tad under 900MB. If he can process that in "a few seconds", he could process your 5GB file in 5-and-a-bit times "a few seconds".

But, and here is the point: it will take Bio::DB::Fasta at least that same 5-and-a-bit times "a few seconds" to construct an index before he can start processing anything. So for a one-off process, there is a net loss.

Now the real crux. Given all the additional layers and overheads, how many times does he have to redo the process in order to obtain a net gain? (If ever.)

Then add to that the (potential) problems with installation; the learning curve of finding your way around the documentation for 897 modules to find the one that you want; and then learning how to use it to do what you want. Suddenly the reason becomes clear why so many bioinformaticians are looking for Lite alternatives to the Bio::Behemoth, and for simple procedures that get their work done, rather than becoming technical debt slaves to the byzantine Bio::Empire.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
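For anyone who wants to put numbers on the "how many times does he have to redo the process" question, here is a minimal sketch of the break-even arithmetic. The sub name `break_even_runs` and all the timings are hypothetical and purely illustrative; nothing here was measured against Bio::DB::Fasta. It only assumes a one-time index cost, a per-run cost for a plain sequential pass, and a (hopefully smaller) per-run cost once the index exists.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not from the post: how many runs before building
# an index pays for itself?
#   $index_cost : one-time seconds spent building the index
#   $scan_cost  : seconds per run for a plain sequential pass
#   $query_cost : seconds per run once the index exists
# Returns undef when indexed runs are no faster (never a net gain).
sub break_even_runs {
    my( $index_cost, $scan_cost, $query_cost ) = @_;
    my $saving = $scan_cost - $query_cost;
    return undef if $saving <= 0;

    # Smallest whole n satisfying: index_cost + n*query_cost < n*scan_cost
    return int( $index_cost / $saving ) + 1;
}

# Made-up timings: the index build costs about as much as one full scan.
for my $case ( [ 15, 15, 1 ], [ 20, 12, 2 ], [ 15, 5, 5 ] ) {
    my $runs = break_even_runs( @$case );
    printf "index=%ds scan=%ds query=%ds : %s\n", @$case,
        defined $runs ? "net gain from run $runs" : "never a net gain";
}
```

With an index that costs about one full pass, the first run is always a loss, which is the one-off case described above. How soon the later runs pay it back depends entirely on how much faster an indexed lookup really is for the job in hand.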