Re^4: grouping numbers

by ag4ve (Monk)
on Jul 11, 2013 at 13:47 UTC (#1043729)


in reply to Re^3: grouping numbers
in thread grouping numbers

Oh, they're 'close' together.


Re^5: grouping numbers
by mtmcc (Hermit) on Jul 11, 2013 at 14:04 UTC
    Compared to -5000 and +5000, they're all 'close' together.

    I think you first need to think about how you would like to define 'close together' and 'far apart' in practical terms, and when you've worked that out, write some code.

    I'm happy to be corrected if I'm missing something...

    Good luck!

      That's why I had $avg. I don't really like using the average distance for this (I'd much prefer a scoring system where I collect everything and filter afterwards), but since I can't even get this working, it seems like a good starting point. I could work from there once this part works.
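For reference, one way the average-distance idea could look is sketched below. This is not the code from earlier in the thread: the sub name `group_by_avg_gap` and the sample numbers are made up for illustration, and "average distance" is assumed to mean the mean gap between sorted neighbours. A new group starts whenever a gap is larger than that average.

```perl
use strict;
use warnings;

# Hypothetical helper, not code from this thread: sort the numbers, then
# start a new group whenever the gap to the previous number exceeds the
# average gap between sorted neighbours.
sub group_by_avg_gap {
    my @sorted = sort { $a <=> $b } @_;
    return () unless @sorted;
    return ([@sorted]) if @sorted == 1;

    # Average distance between sorted neighbours: (max - min) / (n - 1).
    my $avg = ($sorted[-1] - $sorted[0]) / (@sorted - 1);

    my @groups = ([ shift @sorted ]);
    for my $n (@sorted) {
        if ($n - $groups[-1][-1] > $avg) {
            push @groups, [$n];           # gap larger than average: new group
        }
        else {
            push @{ $groups[-1] }, $n;    # close enough: extend current group
        }
    }
    return @groups;
}

my @groups = group_by_avg_gap(1, 2, 3, 50, 51, 52, 100);
print join(' | ', map { join(',', @$_) } @groups), "\n";
# prints: 1,2,3 | 50,51,52 | 100
```

The average is sensitive to outliers: one huge gap raises $avg enough that moderately large gaps no longer split anything, which may be part of why a scoring approach feels more attractive.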

        What you're trying to do is cluster analysis - naturally grouping data together in clusters (for some value of "naturally").

        Most approaches I'm aware of require you to know the number of clusters ahead of time (which sort of defeats the purpose).

        However, if you can come up with some heuristic, such as "any element of a cluster must be within 10% of the center point of the cluster's range", you might be able to quickly compute the results, and live with them. (Of course there are pathological cases where adding a new element changes the center point, causing other elements to be cast out.)
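A minimal sketch of that heuristic, under one possible reading: "center point" is taken to be the midpoint of the cluster's minimum and maximum, and "within 10%" to mean within 10% of that midpoint's absolute value. The sub name `cluster_by_center` and the `$tolerance` parameter are invented for this example. It walks the sorted numbers greedily and sidesteps the pathological case by refusing any element whose arrival would push an existing member out, starting a new cluster instead.

```perl
use strict;
use warnings;

# Hypothetical sketch: every member must lie within $tolerance (10% by
# default) of the midpoint of its cluster's range.
sub cluster_by_center {
    my ($numbers, $tolerance) = @_;
    $tolerance //= 0.10;

    my @clusters;
    for my $n (sort { $a <=> $b } @$numbers) {
        if (@clusters) {
            # Input is sorted, so the first member is the cluster minimum
            # and $n would become the new maximum.
            my $min    = $clusters[-1][0];
            my $center = ($min + $n) / 2;    # midpoint if $n joined

            # Checking the two extremes covers every member in between.
            if (($n - $min) / 2 <= $tolerance * abs($center)) {
                push @{ $clusters[-1] }, $n;
                next;
            }
        }
        push @clusters, [$n];                # does not fit: new cluster
    }
    return @clusters;
}

my @clusters = cluster_by_center([95, 100, 105, 200, 210, 400]);
print join(' | ', map { join(',', @$_) } @clusters), "\n";
# prints: 95,100,105 | 200,210 | 400
```

Because the tolerance is relative, values near zero end up as singleton clusters; an absolute tolerance may suit data that straddles zero better.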

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of
