http://www.perlmonks.org?node_id=926774

uG has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a script to analyze American football play by play data. The first step:
# Use this for calculating running stats my $all_current_plays = $game->plays->search({ GameIdentifier => $play +->GameIdentifier, id => { '<=', $play->id } })->search(undef, { cache + => 1 } ); my $all_current_home_plays = $all_current_plays->search({ b_home_posse +ssion => 1 })->search(undef, { cache => 1 }); my $all_current_away_plays = $all_current_plays->search({ b_home_posse +ssion => 0 })->search(undef, { cache => 1 });
This is pretty obvious, it grabs the resultset for all the plays (home, away, both). So far, so good.
if(my $b = $all_current_home_plays->search({ b_pass => 1 })) { my $count = $b->count; if($count >0) { #CALCULATIONS NEEDING A PASS PLAY TO HAVE OCCURRED my $completes = $count - $b->search({ b_incomplete => +1 })->count; my $sum = $b->get_column('yards')->sum; my $ints = $b->search({ b_intercepted => 1 })->count; if(my $c = $b->search({ down => 1 })) { my $acount = $c->count; if($acount > 0) { $stats{HTO_1st_YPP} = $b->search({ down => 1 } +)->get_column('yards')->sum / $acount; my $totals; while(my $d = $c->next) { $totals += $d->_get_pass_score; } $stats{HTO_1st_Pass_Score_Avg} = $totals / $ac +ount; } }
The above is an example of where my bottle necks start happening. For this section, we are analyzing all the plays (that are passing on 1st down) up to the current play. The while loop is the real killer though, as on play 200 it has to loop through the (up to) 200 previous plays and call _get_pass_score (which is just a $row->get_column('yards')->sum / $row->count Now I have multiple sections like the one above (rushing, passing, the various downs, special teams, etc). Overall a game runs about 13,000 queries, but its taking 300 seconds. I attribute this to dbix::class object creation (my perl process stays at 100% CPU while mysql sits at 5% or lower). I really don't want to generate a summary table or keep track of all this stuff in globals (I did this before, and it ran lightning fast but the code was too hard to maintain), so i'm wondering what kind of hardware (if any) will drastically improve my times (this is all on an atom processor and 4 gigs of ram, so CPU power is lacking). 300 seconds per