Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??

By no means an answer, but just food for thought. One thing you might consider is to just keep the timestamps as they are not worry so much about normalizing them up front.

I was coming up with a way of modeling stats for authenticated smtp spam and was trying to get a breakdown of hourly counts per relay ip and per customer email address. It ended up being much simpler to just store the raw values (epoch timestamp, ip, email) in a mysql table and then put some of the heavy lifting on the sql side.

The date handling you can use in mysql is quite good. I used statements like below to give me a very simple-to-process array of data on which to report.

So, to give myself the last the day's worth of data broken down by hour, I would do something like this:

SELECT timestamp, CONVERT_TZ(FROM_UNIXTIME(timestamp - timestamp % 3600 ), '+00:00', + '-5:00') AS Hourstamp, CONVERT_TZ(FROM_UNIXTIME(timestamp), '+00:00', '-5:00') AS 'Datest +ampEDT', FROM_UNIXTIME(timestamp) AS 'Datestamp', relay, email FROM relay_report WHERE FROM_UNIXTIME(timestamp) >= DATE_SUB(NOW(), INTERVAL 23 HOUR)

The imporant bit is the modulus math. This would give me the rows with Hourstamp being the time at the top of the current row's hour, the raw datestamp in Eastern time (yes, I eventually got around to installing the time zones in mysql and fixed this...), the raw datestamp (all our infrastructure uses GMT), and then the relay ip and email address. I could then easily group things by that hour stamp and have a nice breakdown of hourly rates. I had other things I wanted to do with the data while I was processing it which is why I didn't just do a GROUP BY right in the sql statement.

My primary goal was to come up with a way of detecting compromised customer accounts that were being used to relay spam via authenticated smtp, but I also was giving other departments a way to look at these stats. Doing it this way proved ridiculously easy to just run Excel with the odbc connector to pull data directly, because it was then trivial to create a quick pivot table and a few graphs and charts that were good for a few oohs and aahs on the conference room projector. (Plus date and time stamp handling in Excel is so effing difficult I half expected the Excel elites in there to hold me down with a blanket and pummel me with thick manuals because I'd handled the timestamps perfectly before the data ever got to Excel.)

Given your raw sample data, I'd probably do something similar. Start stuffing your epoch timestamp, key, and value into db rows as-is. Then select the info you want using a similar method to what I demonstrated. Play with the modulus math if you want to group by smaller increments, 60 seconds or whatever.

--
Andy


In reply to Re: How do I synchronize data to common timestamp by naChoZ
in thread How do I synchronize data to common timestamp by lewnewby

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others scrutinizing the Monastery: (8)
    As of 2014-10-02 07:57 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      What is your favourite meta-syntactic variable name?














      Results (50 votes), past polls