Consider an arbitrary number of lists of arbitrary size:

qw( A A A A ) qw( B B ) qw( C C C ) qw( D )

As a "bag" this could be expressed as `{ A => 4, B => 2, C => 3, D => 1 }`.

The total number of elements from all lists (or all bag elements) is ten (but the general problem could have any result-set size, based on the inputs). Zip (or distribute) all bags into a single list where each bag is as uniformly distributed as possible. Many problems will have more than one solution, but any single "as optimal as possible" solution is adequate. The above bags could distribute like this:

qw( A C B A D C A B C A ) # I think...

One solution I considered was to convert each list into an iterator that spits out its contents at frequencies separated by undef, then zip the iterators. Each iterator would need to know its priority (its relative list size), and use that priority to pad the starting point with 'undef'. Something like this:

# Sublists generated by an iterator. my @p1 = ( 'A' , undef, undef, 'A', undef, undef, 'A', undef, und +ef, 'A' ); my @p2 = ( undef, 'C', undef, undef, undef, 'C', undef, undef, und +ef, 'C' ); my @p3 = ( undef, undef, 'B', undef, undef, undef, undef, 'B', und +ef, undef ); my @p4 = ( undef, undef, undef, 'D', undef, undef, undef, undef, und +ef, undef ); # Now zip them up. my @solution = grep { defined } zip @p1, @p2, @p3, @p4; print "@solution\n"; # Produces A C B A D C A B A C

This isn't exactly the same as the sample output I sought, but I think it's an equally optimal output, so it should be fine. But one problem is that we would need to sort the bags by size so that the largest always picks first, and so on. Why am I hooked on the idea of an iterator generating each sub-list as a frequency interspersed with undef? ...because it seems in my mind to be a more general approach, allowing for the possibility of infinite streams, possibly useful for task scheduling by priority.

So what is my question? Several:

- What class of problem is this? I thought maybe bin packing, but the bins are size one, and bin packing doesn't concern itself with uniformity of frequency.
- What algorithm would produce one of several optimal solutions in the best computational complexity and space?
- Is there a generalized solution that minimizes error if the closed list is converted to an infinite list? (Ie, instead of producing a result set of ten elements, produce a stream of uniform distribution based on the frequencies)
- What would a Perlish solution actually look like?

There isn't anything mission critical; I just recently was reading a discussion and started thinking... we're satisfying curiosity here. Last night I spent some time reading over the docs for List::Gen, which may actually be useful as it facilitates lazy generation, but I couldn't quite get my head around a List::Gen approach.

Dave

Back to
Seekers of Perl Wisdom