http://www.perlmonks.org?node_id=332046


in reply to Data averages by time of day

This sort of problems can be handled with a DBMS engine.

Something like this?

#!/usr/bin/perl -w use strict; use DBI; my @data = ( ['2004-02-01 07:01', 82], ['2004-02-01 11:38', 172], ['2004-02-01 22:48', 154], ['2004-02-02 05:38', 107], ['2004-02-02 13:20', 117], ['2004-02-02 23:48', 188]); my $dbh = DBI->connect("dbi:SQLite:averages","","") or die $DBI::errstr; unless ( -s "averages") { $dbh->do(qq{ CREATE TABLE averages ( d datetime not null, val INTEGER)}) or die $DBI::errstr; for (@data) { $dbh->do( qq{INSERT INTO averages VALUES (?, ?)}, undef, @$_) or die $DBI::errstr } } my $query = qq{ SELECT substr(d,1,10) as day, CASE WHEN substr(d,12,2) BETWEEN 4 AND 12 THEN "1 morning" WHEN substr(d,12,2) BETWEEN 13 AND 18 THEN "2 afternoon" WHEN substr(d,12,2) BETWEEN 19 AND 24 THEN "3 evening" ELSE "4 night" END AS tm, AVG(val) FROM averages GROUP BY day, tm ORDER BY day, tm }; my $averages = $dbh->selectall_arrayref($query) or die $DBI::errstr; printf "%-10s %-15s %5.2f\n", @$_ for @$averages; __END__ output: 2004-02-01 1 morning 127.00 2004-02-01 3 evening 154.00 2004-02-02 1 morning 107.00 2004-02-02 2 afternoon 117.00 2004-02-02 3 evening 188.00

You can add/remove/modify the intervals by acting on the CASE statement inside the query.

If you want an average by time only, without the days, then use this query:

my $query = qq{ SELECT CASE WHEN substr(d,12,2) BETWEEN 4 AND 12 THEN "1 morning" WHEN substr(d,12,2) BETWEEN 13 AND 18 THEN "2 afternoon" WHEN substr(d,12,2) BETWEEN 19 AND 24 THEN "3 evening" ELSE "4 night" END AS tm, AVG(val) FROM averages GROUP BY tm ORDER BY tm }; my $averages = $dbh->selectall_arrayref($query) or die $DBI::errstr; printf "%-15s %5.2f\n", @$_ for @$averages; __END__ output: 1 morning 120.33 2 afternoon 117.00 3 evening 171.00
 _  _ _  _  
(_|| | |(_|><
 _|   

Replies are listed 'Best First'.
Re: Re: Data averages by time of day
by blue_cowdawg (Monsignor) on Feb 26, 2004 at 17:19 UTC

        Something like this?

    Something along those lines was one of the ideas I was playing around with but with finer blocks of time.

    select hg from readings where time > "04:00" and time < "4:30"; . . . and so on
    and then averaging the time slots. Where I have no values in a given time slot I could to linear interpolation. Abigail makes a very valid point that I am trying to wrap my head around to figure out how to implement it is the theory (fact?) that there may be a sinusoidal relationship to consider as well.

    I beginning to think that I have a good basis for a posting to CUFP when I get this working.


    Peter L. Berghold -- Unix Professional
    Peter at Berghold dot Net
       Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.