
Perlify some awk'ing for SPAM graphing

by hacker (Priest)
on Jun 24, 2003 at 13:21 UTC (node #268489)
hacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on some SPAM graphing with mrtg and gnuplot, and have run into a bit of shell and awk that I'd like to migrate to perl. Basically I have the following:
grep ^"From " SPAM-mbox | \
  awk '{$1 = $2 = $6 = ""; \
        gsub(/-0[45]00/, ""); \
        gsub(" ", "-"); \
        print $0}' | \
  sort | \
  sed s/-$//g | \
  uniq -c > spamstatfile

This returns the following:

   1 --Fri-Jun-13--2003
   1 --Fri-Jun-20--2003
   2 --Mon-Jun-9--2003
  10 --Tue-Jun-24--2003
   3 --Wed-Jun-11--2003
   1 --Wed-Jun-18--2003

What I'm trying to do is swap the first and second columns, but I can't figure out how to do that easily (either with awk's backtracking, or with this converted into a perl one-liner). Once I can do this (without manually swapping the columns), I'll be shoving it into gnuplot with:

set xdata time
set timefmt "%b-%d-%Y"
plot "spamstatfile" using 1:2

Any ideas on how I can swap those columns either with perl or awk? Or perhaps a better way to do this entirely in perl?

jmcnamara's suggestion got me 99.99% of the way there. Here's a slightly modified version which now works:

perl -lane '$h{"$F[3]-$F[4]-$F[6]"}++ if /^From /; END { print $_, "\t", $h{$_} for keys %h }' SPAM > spamdates

And then I just run gnuplot -persist spam.gnuplot across my control file. Done.

Now to fancy it up a bit and add more heuristics.

Replies are listed 'Best First'.
Re: Perlify some awk'ing for SPAM graphing
by jmcnamara (Monsignor) on Jun 24, 2003 at 13:49 UTC

    For my mbox file the following one-liner gives the same output as your program, except that the columns are reversed (as requested) and the data is unsorted:
    perl -lane '$h{"--$F[2]-$F[3]-$F[4]--$F[6]"}++ if /^From /; END{print $_, "\t", $h{$_} for keys %h}' SPAM-mbox

    If you need sorted data, you could pipe the output through the system sort or add sort before keys in the END block.


Re: Perlify some awk'ing for SPAM graphing
by Abigail-II (Bishop) on Jun 24, 2003 at 14:18 UTC
    I don't recall gnuplot requiring the columns to be used in order. Did you try plotting with using 2:1? That would be my first line of attack.


Re: Perlify some awk'ing for SPAM graphing
by bobn (Chaplain) on Jun 24, 2003 at 13:28 UTC
    perl -ane 'chomp @F; print "$F[1] $F[0]\n"' spamstatfile

    update: oops, you wanted edit in place, not to stdout:

    perl  -i.bak -ane  'chomp @F; print "$F[1] $F[0]\n"'  spamstatfile

    --Bob Niederman
