PerlMonks  

Re^4: Increasing the efficiency of a viral clonal expansion model

by ZWcarp (Beadle)
on Jul 06, 2011 at 19:34 UTC (#913064)


in reply to Re^3: Increasing the efficiency of a viral clonal expansion model
in thread Increasing the efficiency of a viral clonal expansion model

Yeah I suppose you're right. I just type it by habit. Good point!


Re^5: Increasing the efficiency of a viral clonal expansion model
by BrowserUk (Pope) on Jul 06, 2011 at 19:42 UTC

    How much memory does the code you posted in the root post use when run with that seed sequence and 120 iterations? And how long does it take to run?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      On a qrsh with ~18G it takes about 20 minutes. When I can't get a node with that kind of memory, it's a few hours. I've been dividing up the iterations as separate qsubs, which has been helping. Simplifying it to a Wright-Fisher model (constant population size) also greatly helps with speed, but this assumption has drawbacks in modeling the true variation seen in pandemic outbreaks, where you obviously don't have a constant population size.
        on a qrsh ~18G it takes about 20 minutes. When I can't get a node with that kind of memory its a few hours.

        Figures. When you fail to get a node with sufficient memory you are moving into swapping, and that will always kill performance big time. You need to avoid that at all costs.

        When I run your code here, the fan out varies widely depending upon the random patterns:

        c:\test>912999
        I:1 L:1 I:2 L:14 I:3 L:34 I:4 L:182 I:5 L:977 I:6 L:6578 I:7 L:58659
        Terminating on signal SIGINT(2)

        c:\test>912999
        I:1 L:1 I:2 L:13 I:3 L:142 I:4 L:784 I:5 L:6166 I:6 L:49483 I:7 L:299369
        Terminating on signal SIGINT(2)

        c:\test>912999
        I:1 L:1 I:2 L:1 I:3 L:2 I:4 L:13 I:5 L:95 I:6 L:138 I:7 L:624 I:8 L:2923
        Terminating on signal SIGINT(2)

        c:\test>912999
        I:1 L:1 I:2 L:2 I:3 L:5 I:4 L:29 I:5 L:53 I:6 L:294 I:7 L:1935
        Terminating on signal SIGINT(2)

        c:\test>912999
        I:1 L:1 I:2 L:2 I:3 L:14 I:4 L:57 I:5 L:157 I:6 L:1467 I:7 L:3200 I:8 L:23871 I:9 L:81714
        Terminating on signal SIGINT(2)

        I'm going to assume that qrsh and qsub are GRID apis?

        Whilst there is much that can be done to improve the performance of your posted code, given these statistics, it seems likely that the main constraint for your program is memory usage. Once your program moves into swapping, any titivations done to save a few microseconds here and there will just get drowned in the noise of disk (memory) thrashing.

        My suggestion would be to modify your script to monitor the size of the %allgen hash and when it reaches a size that is likely to push the minimum size node on your GRID into swapping, split the generations of that hash into (say) four files and qsub four nodes to read those files and pick up the algorithm from that point.

        So, (say) you run 20 iterations and generate 1 million mutations. You split those 1 million into 4 files and start four nodes to pick up from that point with 1/4 million candidates. When each of those nodes approaches 1 million mutations, you repeat the split. And so on.
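
        A minimal sketch of that check-and-split step might look like the following. It is not taken from the root post: the helper names (needs_split, split_hash_to_files, load_hash_from_file), the file prefix and the tab-separated file layout are all hypothetical, and it assumes %allgen is a flat hash whose keys and values are plain strings containing no tabs or newlines. If the real hash is nested, a serializer such as the core Storable module would be a better fit than plain text lines.

```perl
use strict;
use warnings;

# True once the hash holds at least $max_keys entries.
# $max_keys is an assumption: tune it to the smallest node you expect to get.
sub needs_split {
    my ( $hash, $max_keys ) = @_;
    return scalar( keys %$hash ) >= $max_keys;
}

# Write the entries of %$hash round-robin into $parts files named
# "$prefix.0", "$prefix.1", ... and return the list of file names.
# Assumes keys and values contain no tabs or newlines.
sub split_hash_to_files {
    my ( $hash, $parts, $prefix ) = @_;
    my ( @names, @handles );
    for my $i ( 0 .. $parts - 1 ) {
        my $name = "$prefix.$i";
        open my $fh, '>', $name or die "Cannot write $name: $!";
        push @names,   $name;
        push @handles, $fh;
    }
    my $n = 0;
    while ( my ( $key, $value ) = each %$hash ) {
        my $fh = $handles[ $n++ % $parts ];
        print {$fh} "$key\t$value\n";
    }
    for my $i ( 0 .. $#handles ) {
        close $handles[$i] or die "Cannot close $names[$i]: $!";
    }
    return @names;
}

# Read one such file back into a hash so that a fresh process
# can pick up the algorithm from that point.
sub load_hash_from_file {
    my ($name) = @_;
    my %hash;
    open my $fh, '<', $name or die "Cannot read $name: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $key, $value ) = split /\t/, $line, 2;
        $hash{$key} = $value;
    }
    close $fh;
    return \%hash;
}
```

        Inside the iteration loop you would call needs_split( \%allgen, 1_000_000 ) after each generation; when it returns true, call split_hash_to_files( \%allgen, 4, 'allgen.part' ), submit one job per returned file through whatever your scheduler provides, and have each job start by calling load_hash_from_file on its own file.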

        You'll need to judge the split points in the light of your knowledge of the systems available to you. On my (currently only 2GB) system, I've never managed to run your code past 10 iterations before the process moved into swapping.


