Perlify some awk'ing for SPAM graphing

by hacker (Priest)
on Jun 24, 2003 at 13:21 UTC ( #268489=perlquestion )
hacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on some SPAM graphing with mrtg and gnuplot, and have run into a bit of a conversion I'd like to try to migrate to perl. Basically I have the following:
grep ^"From " SPAM-mbox | \
  awk '{$1 = $2 = $6 = ""; \
        gsub(/-0[45]00/, ""); \
        gsub(" ", "-"); \
        print $0}' | \
  sort | \
  sed s/-$//g | \
  uniq -c > spamstatfile

This returns the following:

      1 --Fri-Jun-13--2003
      1 --Fri-Jun-20--2003
      2 --Mon-Jun-9--2003
     10 --Tue-Jun-24--2003
      3 --Wed-Jun-11--2003
      1 --Wed-Jun-18--2003

What I'm trying to do is swap the first and second columns, but I can't figure out how to do that easily (either with awk's backtracking, or with this converted into a perl one-liner). Once I can do this (without manually swapping the columns), I'll be shoving it into gnuplot with:

set xdata time
set timefmt "%b-%d-%Y"
plot "spamstatfile" using 1:2

Any ideas on how I can swap those columns either with perl or awk? Or perhaps a better way to do this entirely in perl?

jmcnamara's suggestion got me 99.99% of the way there. Here's a slightly modified version which now works:

perl -lane '$h{"$F[3]-$F[4]-$F[6]"}++ if /^From /; END { print $_, "\t", $h{$_} for keys %h }' SPAM > spamdates

And then I just run gnuplot -persist spam.gnuplot (my control file). Done.

Now to fancy it up a bit and add more heuristics.

Replies are listed 'Best First'.
Re: Perlify some awk'ing for SPAM graphing
by jmcnamara (Monsignor) on Jun 24, 2003 at 13:49 UTC

    For my mbox file the following one-liner gives the same output as your program, except that the columns are reversed (as requested) and the data is unsorted:
    perl -lane '$h{"--$F[2]-$F[3]-$F[4]--$F[6]"}++ if /^From /; END{print $_, "\t", $h{$_} for keys %h}' SPAM-mbox

    If you need sorted data you could use the system sort or add sort before keys above, i.e. for sort keys %h.


Re: Perlify some awk'ing for SPAM graphing
by Abigail-II (Bishop) on Jun 24, 2003 at 14:18 UTC
    I don't recall gnuplot requiring the columns to be used in order. Did you try plotting with using 2:1? That would be my first line of attack.


Re: Perlify some awk'ing for SPAM graphing
by bobn (Chaplain) on Jun 24, 2003 at 13:28 UTC
    perl -ane 'chomp @F; print "$F[1] $F[0]\n"' spamstatfile

    update: oops, you wanted edit in place, not to stdout:

    perl  -i.bak -ane  'chomp @F; print "$F[1] $F[0]\n"'  spamstatfile

    --Bob Niederman,
