One other thing to notice is that a particular distribution can be identified by a binary string whose length is one less than the number of categories, and whose number of 1-bits is one less than the number of columns. Each 0-bit denotes that the next category is in the same column as the prior category, while a 1-bit denotes that the next category begins the next column over. Since the first category is forced into the first column, and the last category is forced into the last column, we get two freebies there.
So the total number of distributions of N categories into M columns is equal to the number of combinations of N-1 things taken M-1 at a time.
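If you want to check that count for other sizes, here's a minimal sketch in Python. The function name count_distributions is just something I made up for illustration, and it assumes the categories stay in their given order and every column gets at least one category.

```python
from math import comb

def count_distributions(n_categories, n_columns):
    """Count the ways to split ordered categories into non-empty columns.

    Hypothetical helper: picks which n_columns - 1 of the
    n_categories - 1 gaps between neighbours start a new column.
    """
    return comb(n_categories - 1, n_columns - 1)

# 6 categories into 4 columns: 5 gaps, choose 3 of them.
print(count_distributions(6, 4))  # 10
```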
Dunno if this helps, but it should keep you from brute forcing more than you need. {grin}
In fact, for your particular dataset (6 categories, 4 columns), you shouldn't need to brute force more than 10 tries (5 items taken 3 at a time).
Wow, that's less than I thought! But it desk checks properly. All you need is a good generating algorithm, and you can brute force this!
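For what it's worth, here's one possible generating algorithm, sketched in Python rather than anything from your code. It leans on itertools.combinations to pick where the 1-bits go. The name distributions and the sample category labels are my own inventions, and it assumes the same rules as above: categories keep their order and no column is left empty.

```python
from itertools import combinations

def distributions(categories, n_columns):
    """Yield every split of the ordered categories into n_columns columns.

    Hypothetical helper: each chosen position in 1..n-1 is a gap where
    the next category starts a new column (a 1-bit in the binary string).
    """
    n = len(categories)
    for breaks in combinations(range(1, n), n_columns - 1):
        # Column boundaries: start of list, each break, end of list.
        bounds = (0, *breaks, n)
        yield [categories[a:b] for a, b in zip(bounds, bounds[1:])]

# Made-up labels standing in for the 6 categories.
for layout in distributions(list("ABCDEF"), 4):
    print(layout)
```

Run against 6 categories and 4 columns, that loop visits 10 layouts, and you can score each one however you like and keep the best.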