I'm working on a script to analyze American football play by play data. The first step:
# Use this for calculating running stats
my $all_current_plays = $game->plays->search({ GameIdentifier => $play
+->GameIdentifier, id => { '<=', $play->id } })->search(undef, { cache
+ => 1 } );
my $all_current_home_plays = $all_current_plays->search({ b_home_posse
+ssion => 1 })->search(undef, { cache => 1 });
my $all_current_away_plays = $all_current_plays->search({ b_home_posse
+ssion => 0 })->search(undef, { cache => 1 });
This is pretty obvious, it grabs the resultset for all the plays (home, away, both). So far, so good.
if(my $b = $all_current_home_plays->search({ b_pass => 1 })) {
my $count = $b->count;
if($count >0) {
#CALCULATIONS NEEDING A PASS PLAY TO HAVE OCCURRED
my $completes = $count - $b->search({ b_incomplete =>
+1 })->count;
my $sum = $b->get_column('yards')->sum;
my $ints = $b->search({ b_intercepted => 1 })->count;
if(my $c = $b->search({ down => 1 })) {
my $acount = $c->count;
if($acount > 0) {
$stats{HTO_1st_YPP} = $b->search({ down => 1 }
+)->get_column('yards')->sum / $acount;
my $totals;
while(my $d = $c->next) {
$totals += $d->_get_pass_score;
}
$stats{HTO_1st_Pass_Score_Avg} = $totals / $ac
+ount;
}
}
The above is an example of where my bottle necks start happening. For this section, we are analyzing all the plays (that are passing on 1st down) up to the current play. The while loop is the real killer though, as on play 200 it has to loop through the (up to) 200 previous plays and call _get_pass_score (which is just a $row->get_column('yards')->sum / $row->count
Now I have multiple sections like the one above (rushing, passing, the various downs, special teams, etc). Overall a game runs about 13,000 queries, but its taking 300 seconds. I attribute this to dbix::class object creation (my perl process stays at 100% CPU while mysql sits at 5% or lower). I really don't want to generate a summary table or keep track of all this stuff in globals (I did this before, and it ran lightning fast but the code was too hard to maintain), so i'm wondering what kind of hardware (if any) will drastically improve my times (this is all on an atom processor and 4 gigs of ram, so CPU power is lacking). 300 seconds per
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.