Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked

Re^2: Sets of subsets

by jpearl (Scribe)
on Jul 22, 2009 at 20:44 UTC ( #782433=note: print w/replies, xml ) Need Help??

in reply to Re: Sets of subsets
in thread Sets of subsets

I probably didn't phrase it very well. Really I'm just interested in the partitions of the main set of things. So if there were three "things", there would only be 4 sets of subsets:

{t1}{t2}{t3} {t1,t2}{t3} {t1}{t2,t3} {t1,t3}{t2}

Since ordering doesn't matter. From these partitions I can calculate the distance or similarity between the groups from the attribute table (which has already been calculated). So for the above example I might have a table like:

t1 t2 t3 t1 0 1 2 t2 1 0 3 t3 2 3 0

Representing the difference between each thing. When I have the subsets from above I can calculate what the average difference is between the subsets using the difference values as the metric.

Important to note as well that the number of "things" themselves are relatively few. The attributes previously calculated are in the hundreds, but the things themselves number a few tens (max 24). So, I'm not too worried about the numbers exploding on me... yet.

Replies are listed 'Best First'.
Re^3: Sets of subsets
by JavaFan (Canon) on Jul 22, 2009 at 21:16 UTC
    Just look at the group t1 is in. t1 can be a group in itself. Or with each of t2, t3, ..., tk. Or any combination of them. Which means that if you have N "things", there are 2N-1-1 ways to be t1 in a group with other "t's" (-1 comes from that you seem to exclude the subset of all things together). And that's not even counting the different ways you can split up the group of things that aren't in the subset t1 belongs to.
      Definitely a good point. However, I'm sort of used to analyses sometimes taking upwards of days, so "blowing up" I guess might mean something else to me. You are correct though, very computationally complex. This was really just a random thought I had, I was wondering what sort of information I could extract out of this data. I imagine I'll end up going with a k-means clustering or something of that nature.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://782433]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (10)
As of 2017-01-18 22:49 GMT
Find Nodes?
    Voting Booth?
    Do you watch meteor showers?

    Results (166 votes). Check out past polls.