Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

By no means an answer, but just food for thought. One thing you might consider is to just keep the timestamps as they are not worry so much about normalizing them up front.

I was coming up with a way of modeling stats for authenticated smtp spam and was trying to get a breakdown of hourly counts per relay ip and per customer email address. It ended up being much simpler to just store the raw values (epoch timestamp, ip, email) in a mysql table and then put some of the heavy lifting on the sql side.

The date handling you can use in mysql is quite good. I used statements like below to give me a very simple-to-process array of data on which to report.

So, to give myself the last the day's worth of data broken down by hour, I would do something like this:

SELECT timestamp, CONVERT_TZ(FROM_UNIXTIME(timestamp - timestamp % 3600 ), '+00:00', + '-5:00') AS Hourstamp, CONVERT_TZ(FROM_UNIXTIME(timestamp), '+00:00', '-5:00') AS 'Datest +ampEDT', FROM_UNIXTIME(timestamp) AS 'Datestamp', relay, email FROM relay_report WHERE FROM_UNIXTIME(timestamp) >= DATE_SUB(NOW(), INTERVAL 23 HOUR)

The imporant bit is the modulus math. This would give me the rows with Hourstamp being the time at the top of the current row's hour, the raw datestamp in Eastern time (yes, I eventually got around to installing the time zones in mysql and fixed this...), the raw datestamp (all our infrastructure uses GMT), and then the relay ip and email address. I could then easily group things by that hour stamp and have a nice breakdown of hourly rates. I had other things I wanted to do with the data while I was processing it which is why I didn't just do a GROUP BY right in the sql statement.

My primary goal was to come up with a way of detecting compromised customer accounts that were being used to relay spam via authenticated smtp, but I also was giving other departments a way to look at these stats. Doing it this way proved ridiculously easy to just run Excel with the odbc connector to pull data directly, because it was then trivial to create a quick pivot table and a few graphs and charts that were good for a few oohs and aahs on the conference room projector. (Plus date and time stamp handling in Excel is so effing difficult I half expected the Excel elites in there to hold me down with a blanket and pummel me with thick manuals because I'd handled the timestamps perfectly before the data ever got to Excel.)

Given your raw sample data, I'd probably do something similar. Start stuffing your epoch timestamp, key, and value into db rows as-is. Then select the info you want using a similar method to what I demonstrated. Play with the modulus math if you want to group by smaller increments, 60 seconds or whatever.


In reply to Re: How do I synchronize data to common timestamp by naChoZ
in thread How do I synchronize data to common timestamp by lewnewby

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others rifling through the Monastery: (5)
    As of 2018-03-21 02:05 GMT
    Find Nodes?
      Voting Booth?
      When I think of a mole I think of:

      Results (263 votes). Check out past polls.