PerlMonks  

grouping numbers

by ag4ve (Monk)
on Jul 11, 2013 at 11:49 UTC (#1043700)
ag4ve has asked for the wisdom of the Perl Monks concerning the following question:

I'm cross-posting this from the beginners mailing list.

Basically, I want to see the most active events in my logs.

Technically, I don't even want a static average: if one group is 5,6,7, I'd like it treated the same as 8,10,12,14,16, but probably with the latter ranked higher based on $distance/$numbers. As it is, I can't even get this to work, so...

use strict;
use warnings;
use Data::Dumper;

my $arr = [ 10, 7, 5, 10, 50, 70, 75, 72, 79, 80 ];

my $avg;
foreach my $i ($#$arr) {
    $avg = ($i + ($avg ? $avg : $i)) / 2;
}
print "avg [$avg]\n";

@$arr = sort {$a <=> $b} @$arr;

my $store;
foreach my $i (0 .. $#$arr) {
    $store->[$i]{num}    = $arr->[$i];
    $store->[$i]{thresh} = $avg;
    foreach my $store_i (0 .. $#$store) {
        if (abs($arr->[$i] - $store->[$store_i]{num}) <= $store->[$i]{thresh}) {
            push @{$store->[$i]{group}}, $arr->[$i];
        }
    }
}
print Dumper($store);

Re: grouping numbers
by 5mi11er (Deacon) on Jul 11, 2013 at 12:18 UTC
    I think you need to explain this better, I don't understand what you're trying to accomplish.

    -Scott

      I want to see if someone sends a ton of packets that I DROP in a short time vs someone that, over time, ends up sending the same number (or more) DROPped packets (or DENY, etc). Technically, I'm looking at a time stamp and converting it to epoch, but the general theory holds up to what I presented.
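      One way to make that concrete, offered only as a rough sketch and not as anything taken from the code in this thread: sort the epoch timestamps, start a new group whenever the gap to the previous timestamp exceeds a threshold, then rank the groups by events per unit of time. The names group_by_gap and rank_by_density, the gap of 10, and the minimum group size of 2 are all illustrative assumptions.

```perl
use strict;
use warnings;

# Split numbers (e.g. epoch timestamps) into groups, starting a new group
# whenever the gap to the previous value exceeds $max_gap.
# $max_gap is an assumed tuning knob, not something from the original post.
sub group_by_gap {
    my ($max_gap, @nums) = @_;
    my @sorted = sort { $a <=> $b } @nums;
    my @groups;
    for my $n (@sorted) {
        if (@groups && $n - $groups[-1][-1] <= $max_gap) {
            push @{ $groups[-1] }, $n;    # close enough: same group
        }
        else {
            push @groups, [$n];           # gap too large: new group
        }
    }
    return @groups;
}

# Rank groups by density (events per unit of span), densest first.
# Groups smaller than $min_size are skipped, since a lone event would
# otherwise look maximally dense. span + 1 avoids dividing by zero.
sub rank_by_density {
    my ($min_size, @groups) = @_;
    my @ranked;
    for my $g (@groups) {
        next if @$g < $min_size;
        my $span = $g->[-1] - $g->[0];
        push @ranked, { group => $g, density => scalar(@$g) / ($span + 1) };
    }
    my @sorted = sort { $b->{density} <=> $a->{density} } @ranked;
    return @sorted;
}

my @groups = group_by_gap(10, 10, 7, 5, 10, 50, 70, 75, 72, 79, 80);
for my $r (rank_by_density(2, @groups)) {
    printf "%-16s density %.2f\n", join(',', @{ $r->{group} }), $r->{density};
}
```

      With the sample numbers from the question this yields the groups 5,7,10,10 and 50 and 70,72,75,79,80, and the first group ranks above the last because it packs its events into a shorter span. Whether that matches the $distance/$numbers ranking the question has in mind is a guess.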

Re: grouping numbers
by mtmcc (Hermit) on Jul 11, 2013 at 12:27 UTC

    Agreed... More info please!

    Michael
Re: grouping numbers
by ww (Bishop) on Jul 11, 2013 at 12:44 UTC

    Serious Error: "can't even get this to work" is not recognized as a valid error description.

    Seriously, you'll need to tell us more about what you expected or want; why the output deviates from your heart's desire; and -- tho none appear when I test or execute your code -- error messages or warnings received from code you haven't shown.

    If I've misconstrued your question or the logic needed to answer it, I offer my apologies to all those electrons which were inconvenienced by the creation of this post.

      Seriously? The code runs fine for me.

      As for the expected output of the code, or rather what I'd like: since I'm not sure I'm on the right track with the code, what I'd like is three sets: 5,7,10,10; 50; and 70,72,75,79,80.

      I'm not sure how else to explain this - between the code I'm stuck on, the use case, and abstract....?

        What is significant about those groups of numbers?

        Perl can't mind read.

        Michael

        Not sure if this is any use, but it does give the expected output. It loops through the sorted list, working out the effect of adding each value to the current group's average. If the change is $diff or more, it starts a new group.

        #!perl
        use strict;

        my @data = (10,7,5,10,50,70,75,72,79,80);
        my @arr  = sort {$a<=>$b} @data;
        my $diff = 2;   # group closeness
        my @avg;        # [count,sum] per group
        my @grp;
        my $g = 0;

        for my $i (0..$#arr){
            my $val = $arr[$i];
            #print "value $val\n";
            if ($i == 0){
                push @{$grp[$g]},$val;
                $avg[$g] = [1,$val];
            } else {
                # work out new average with this element
                my ($n,$sum) = @{$avg[$g]};
                #print "count $n sum $sum\n";
                my $avg     = $sum/$n;
                my $new_avg = ($sum+$val)/($n+1);
                if (abs($new_avg - $avg) < $diff){
                    # join group
                    push @{$grp[$g]},$val;
                    $avg[$g] = [$n+1,$sum+$val];
                } else {
                    # start new group
                    ++$g;
                    push @{$grp[$g]},$val;
                    $avg[$g] = [1,$val];
                }
            }
        }

        # parenthesise join so "\n" is not joined in with a trailing comma
        for (@grp) {
            print join(',', @$_), "\n";
        }
        POJ
Re: grouping numbers
by QM (Vicar) on Jul 11, 2013 at 14:31 UTC
    This is suspicious:
    foreach my $i ($#$arr) { $avg = ($i + ($avg ? $avg : $i)) / 2; }

    This only runs once, with $i taking the value of the last index of $arr. You probably meant @$arr.
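    A small demonstration of that difference, as a sketch only (the variable names @via_last_index and @via_elements are made up for illustration):

```perl
use strict;
use warnings;

my $arr = [ 10, 7, 5, 10, 50, 70, 75, 72, 79, 80 ];

my (@via_last_index, @via_elements);

# $#$arr is a single scalar, the last index of the array (9 here),
# so this loop body runs exactly once.
foreach my $i ($#$arr) { push @via_last_index, $i }

# @$arr dereferences to the full list, so this visits every element.
foreach my $i (@$arr)  { push @via_elements, $i }

print 'foreach over $#$arr: ', "@via_last_index\n";   # 9
print 'foreach over @$arr:  ', "@via_elements\n";     # all ten values
```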

    The next problem is that this isn't an average in the normal sense, it's a weighted average where the last element gets much more weight than the first, more like:

    $avg = (...(($arr->[0] + $arr->[1])/2 + $arr->[2])/2 + ... + $arr->[$#$arr])/2;

    So perhaps you meant a decaying average, but it's not clear from the context. Here's something equivalent, with output:

    my @x = (10, 7, 5, 10, 50, 70, 75, 72, 79, 80);

    sub avg {
        my @arr = @_;
        my $avg;
        foreach my $i (@arr) {
            print "$i: $avg\n";
            $avg = ($i + ($avg ? $avg : $i))/2;
        }
        return $avg;
    }

    print avg(@x),"\n";
    10:                 # undef
    7: 10
    5: 8.5
    10: 6.75
    50: 8.375
    70: 29.1875
    75: 49.59375
    72: 62.296875
    79: 67.1484375
    80: 73.07421875
    76.537109375

    print avg(reverse @x),"\n";
    80:                 # undef
    79: 80
    72: 79.5
    75: 75.75
    70: 75.375
    50: 72.6875
    10: 61.34375
    5: 35.671875
    7: 20.3359375
    10: 13.66796875
    11.833984375

    Making a change to produce the uniformly weighted average:

    my @x = (10, 7, 5, 10, 50, 70, 75, 72, 79, 80);

    sub avg2 {
        my @arr = @_;
        my $sum;
        foreach my $i (@arr) {
            print "$i: $sum\n";
            $sum += $i;
        }
        return $sum/@arr;
    }

    print avg2(@x),"\n";
    10:                 # undef
    7: 10
    5: 17
    10: 22
    50: 32
    70: 82
    75: 152
    72: 227
    79: 299
    80: 378
    45.8

    print avg2(reverse @x),"\n";
    80:                 # undef
    79: 80
    72: 159
    75: 231
    70: 306
    50: 376
    10: 426
    5: 436
    7: 441
    10: 448
    45.8

    where the forward and reverse lists produce the same average.
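    For comparison, the same uniform average can be had without the hand-rolled loop. This is just a sketch: it uses sum from the core List::Util module, and the helper name mean is made up here.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Arithmetic mean of a plain list. Returns nothing for an empty list
# instead of dividing by zero.
sub mean {
    my @nums = @_;
    return unless @nums;
    return sum(@nums) / @nums;    # @nums in numeric context is the count
}

my @x = (10, 7, 5, 10, 50, 70, 75, 72, 79, 80);
print mean(@x), "\n";             # 45.8
print mean(reverse @x), "\n";     # 45.8 again: order does not matter
```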

    A lot of the syntax would be less cumbersome if you started with my @arr = (...).

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of
