Re: "Intelligent" array joining (topological sort)
by tye (Sage) on Feb 05, 2004 at 22:56 UTC
|
#!/usr/bin/perl -w
use strict;
sub tsort
{
my( %pred, %succ );
for my $av ( @_ ) {
for my $i ( 1 .. $#$av ) {
$pred{$av->[$i]}{$av->[$i-1]} ||= 1;
$succ{$av->[$i-1]}{$av->[$i]} ||= 1;
}
$succ{$av->[-1]} ||= {};
}
my @output;
while( %succ ) {
my( $best, $count );
for my $item ( keys(%succ) ) {
my $preds= keys %{$pred{$item}};
if( ! defined($count) || $preds < $count ) {
$best= $item;
$count= $preds;
last if 0 == $count;
}
}
warn "Data contains a cycle, breaking at $best.\n"
if 0 < $count;
push @output, $best;
for my $succ ( keys %{$succ{$best}} ) {
delete $pred{$succ}{$best};
}
delete $succ{$best};
}
return wantarray ? @output : \@output;
}
my @array1= ( 1, 3, 4, 6 );
my @array2= ( 1, 2, 4, 6 );
my @array3= ( 1, 2, 3, 5 );
my @output= tsort( \( @array1, @array2, @array3 ) );
print "@output\n";
@output= tsort( [1,2,3,4], [1,3,2,4] );
print "@output\n";
output is
1 2 3 4 5 6
Data contains a cycle, breaking at 2.
1 2 3 4
| [reply] [Watch: Dir/Any] [d/l] [select] |
Re: "Intelligent" array joining
by dragonchild (Archbishop) on Feb 06, 2004 at 03:40 UTC
|
This sounds like a directed graph. I'd recommend looking at Graph::Directed. The following worked for me:
use Graph::Directed;
my @array1 = qw(dog cat rat mouse);
my @array2 = qw(dog rat mouse bird);
my @array3 = qw(cat rat fish mouse);
my $graph = Graph::Directed->new;
$graph->add_path(@array1);
$graph->add_path(@array2);
$graph->add_path(@array3);
my @toposort = $graph->toposort;
print "@toposort\n";
------
dog cat rat fish mouse bird
------
We are the carpenters and bricklayers of the Information Age.
Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
| [reply] [Watch: Dir/Any] [d/l] |
|
As a follow-up... I've noticed that any duplicates in a given array will mess up the topological sort. So, I just added a quick duplicate search:
use Graph::Directed;
my @array1 = qw(dog rat rat mouse);
my @array2 = qw(dog rat mouse bird);
my @array3 = qw(cat rat fish mouse);
my @elements;
push(@elements, \@array1, \@array2, \@array3);
my $graph = Graph::Directed->new;
for (0..$#elements) {
# Check for duplicate GUIDs in this set
my %seen = ();
my @dup = ();
foreach my $item (@{$elements[$_]}) {
if ($seen{$item}++) {
push(@dup, $item);
}
}
unless ($#dup == -1) {
print "Duplicate elements: @dup";
exit;
}
# If all's well, add to the path
$graph->add_path(@{$elements[$_]});
}
my @elements_ordered = $graph->toposort;
print "@elements_ordered\n";
| [reply] [Watch: Dir/Any] [d/l] |
|
Hey, wow! That's precisely what I needed. Pretty straightforward, too. I just didn't know what to look for.
Just goes to show... with Perl, if you've got a problem, somebody's already figured it out and written a module.
On to reading more about directed graphs and topological sorting...
Thanks!
| [reply] [Watch: Dir/Any] |
Re: "Intelligent" array joining
by jeffa (Bishop) on Feb 05, 2004 at 21:25 UTC
|
Unless i am missing something (which i usually am), why not just sort the results?
@union = sort grep { not $seen{$_}++ } (@union, @array3);
UPDATE: i think i see what you want now ... let's try using CHARS instead of INTS. In order
to have "intelligent" sorting, you have to provide the "intelligence" ... in this case, let's
give weights to each of the items we are dealing with:
my @array1 = ([b=>1], [d=>3], [z=>4], [e=>5]);
my @array2 = ([a=>2], [b=>1], [z=>4]);
my @array3 = ([d=>3], [e=>5]);
The rest of the code is mostly the same (i did not bother to see if this could be refactored
for efficiency), but since our arrays hold more arrays, we need to code appropriately:
my %seen;
my @union = grep { not $seen{$_->[0]}++ } (@array1, @array2);
undef %seen;
@union =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
grep { not $seen{$_->[0]}++ } (@union, @array3);
use Data::Dumper;
print Dumper \@union;
Hope this helps. :)
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Well, you see... I'm not trying to sort based on numerical order. I'm trying to sort based on the original order of the arrays. Let's try this again:
my @array1 = qw(dog cat rat mouse);
my @array2 = qw(dog rat mouse bird);
my @array3 = qw(cat rat fish mouse);
The output would then be:
dog cat rat mouse bird fish
Here, fish should come before mouse, and before bird.
So, I'm not sorting numerically, and I'm not sorting alphabetically... I'm trying to preserve the order of the original arrays...
To make this (perhaps) a little more clear, here's my specific problem. I'm taking data sets, with questions, and each question has a unique identifier, a GUID. For the most part, for each data set, the GUIDs will be in the same order, but every now and then, they change a question and assign a new GUID. Or, they insert a new question into the middle.
I need to combine all the data sets, and attempt to preserve the natural order of the questions.
Does that make a bit more sense? | [reply] [Watch: Dir/Any] [d/l] [select] |
Re: "Intelligent" array joining
by valentin (Abbot) on Feb 05, 2004 at 21:50 UTC
|
Let Tie::IxHash keep your order.
I did not understand which order is prefered for your example, but put them in your order into the hash. That's what you get back.
my @array1 = (1, 3, 4, 6);
my @array2 = (1, 2, 4, 6);
my @array3 = (1, 2, 3, 5);
tie my %seen, 'Tie::IxHash';
for (0..$#array1) {
my @v = sort ( $array1[$_], $array2[$_], $array3[$_] );
@seen{ @v } = 1;
}
print join ' ', keys %seen;
| [reply] [Watch: Dir/Any] [d/l] |
|
Hmmm... this appears to be a step in the right direction, though I don't need that "sort" in there. You see, numerical order is of no importance... only the order in which it appears in the array. It's like this:
my @array1 = (1, 3, 4, 6);
my @array2 = (1, 2, 4, 6);
my @array3 = (1, 2, 3, 5);
So, the program looks as the first array, and notes its order. Then, it looks at the second array, and notices that "2" hasn't been seen before. It notes that it goes after "1" and before "4". At this point, the logical order could be either
(1, 2, 3, 4, 6) or (1, 3, 2, 4, 6)
It doesn't matter which it chooses. Then, it examines the third array and realizes that "2" comes before "3". Plus, the "5" is new, and comes after "3". So, now the order could be:
(1, 2, 3, 4, 5, 6) or (1, 2, 3, 4, 6, 5)
Does that make sense? | [reply] [Watch: Dir/Any] [d/l] |
Re: "Intelligent" array joining
by runrig (Abbot) on Feb 05, 2004 at 23:40 UTC
|
This is not very thought out or tested, probably not very efficient on large lists, but I think its interesting, and it seems to work: #!/usr/bin/perl
use strict;
use warnings;
my @array1 = qw(dog cat rat mouse);
my @array2 = qw(dog rat mouse bird);
my @array3 = qw(cat rat fish mouse);
#my @array1= ( 1, 3, 4, 6 );
#my @array2= ( 1, 2, 4, 6 );
#my @array3= ( 1, 2, 3, 5 );
init_cmp($_) for \@array1, \@array2, \@array3;
my %animals;
undef @animals{@array1, @array2, @array3};
print join(",", keys %animals), "\n";
my @sorted = sort { my_cmp($a, $b) } keys %animals;
print join(",", @sorted), "\n";
{
my %lt;
sub init_cmp {
my ($this, @rest) = @{$_[0]};
while ( @rest ) {
$lt{$this}{$rest[0]} = 1;
$this = shift @rest;
}
}
sub is_lt {
my ($first, $next) = @_;
return unless exists $lt{$first};
return 1 if $lt{$first}{$next};
for my $mid ( keys %{$lt{$first}} ) {
return 1 if is_lt($mid, $next);
}
return;
}
sub my_cmp {
my ($first, $next) = @_;
return is_lt($first, $next) ? -1 : 1;
}
}
| [reply] [Watch: Dir/Any] [d/l] |
Re: "Intelligent" array joining
by QM (Parson) on Feb 06, 2004 at 01:47 UTC
|
I think the OP wants something like this:
#!/your/perl/here
use strict;
use warnings;
my @array1 = qw(dog cat rat mouse);
my @array2 = qw(dog rat mouse bird);
my @array3 = qw(cat rat fish mouse);
my %seen;
my @collection;
foreach my $item ( @array1, @array2, @array3 )
{
unless ( exists( $seen{$item} ) )
{
push @collection, $item;
$seen{$item}++;
}
}
print "@collection\n";
__END__
Which gives:
dog cat rat mouse bird fish
I'll leave it to OMAR to come up with an obfuscated compact form.
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Doesn't work. Try the following test:
my @array1 = qw(dog cat rat mouse);
my @array2 = qw(dog rat mouse bird);
my @array3 = qw(cat rat fish mouse);
my %seen;
my @collection;
foreach my $item ( @array1, @array2, @array3 )
{
unless ( exists( $seen{$item} ) )
{
push @collection, $item;
$seen{$item}++;
}
}
print "@collection\n";
@collection = ();
%seen = ();
foreach my $item ( @array2, @array1, @array3 )
{
unless ( exists( $seen{$item} ) )
{
push @collection, $item;
$seen{$item}++;
}
}
print "@collection\n";
__END__
Prints:
dog cat rat mouse bird fish
dog rat mouse bird cat fish
Theoretically, it should always give the same order, no matter what order the initial arrays are in.
------
We are the carpenters and bricklayers of the Information Age.
Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Theoretically, it should always give the same order, no matter what order the initial arrays are in.
Ah, I missed that in the specification. I suppose if the OP could have stated the problem better, s/he might have known where to find it in the first place, no? ;)
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] [Watch: Dir/Any] |
Re: "Intelligent" array joining
by Anonymous Monk on Feb 05, 2004 at 21:34 UTC
|
topological sort, aka tsort | [reply] [Watch: Dir/Any] |
Re: "Intelligent" array joining
by thospel (Hermit) on Feb 07, 2004 at 13:09 UTC
|
Several good answers all based on basically doing a topological sort have already been given.
Can this problem also be solved without essentially
doing a topological sort ?
When you consider as input a sequence of arrays where each
consists of two elements, that simply says
the first element must come before the second, and you
actually have the standard way to ask for a topological sort.
So no, in general you can't. Any solution must also be able to do a topological sort. | [reply] [Watch: Dir/Any] |
Re: "Intelligent" array joining
by Not_a_Number (Prior) on Feb 06, 2004 at 19:02 UTC
|
Bit late, I'm afraid, but this seems to do what you want:
use strict;
use warnings;
my @array1 = qw ( ant bee ant cow ant dog );
my @array2 = qw ( ant cat bee cow rat );
my @array3 = qw ( ant 1 3 5 7 9 1 2 3 4 5 6 7 8 9 X );
my @arrays;
my $length = 0;
for ( \@array1, \@array2, \@array3 ) {
$length = @$_ if @$_ > $length;
push @arrays, $_;
}
my %seen;
foreach my $i ( 0 .. $length - 1 ) {
for ( @arrays ) {
print "$_->[$i] " if $_->[$i] and not $seen{$_->[$i]}++;
}
}
Output: ant bee cat 1 3 cow 5 rat 7 dog 9 2 4 6 8 X
Maybe there's a way of avoiding two loops, though?
dave
Update: Single loop: my @arrays = \( @array1, @array2, @array3 );
my %seen;
my $i = 0;
{
my $count = 0;
for ( @arrays ) {
++$count and next unless $_->[$i];
print "$_->[$i] " unless $seen{$_->[$i]}++;
}
$i++;
redo unless $count >= @arrays;
}
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Doesn't work. Your method favors/regurgitates the order of the first array, but according to the second array, 'cat' should come between 'ant' and 'bee', not after.
| [reply] [Watch: Dir/Any] |
|
| [reply] [Watch: Dir/Any] |
Re: "Intelligent" array joining
by jdporter (Chancellor) on Aug 12, 2004 at 04:36 UTC
|
Well, I don't know if this is a topical sort -- I kind of doubt it -- but it seems to me to give satisfactory results, and handles "loops" quite gracefully. It's pretty efficient, but doesn't necessarily give the optimal result. Pass a list of arrayrefs.
sub concensus_sort
{
my( %pre, %suf );
for my $ar ( @_ )
{
my @a = @$ar;
my @pre;
while ( @a )
{
my $k = shift @a;
$pre{$k} ||= {};
$suf{$k} ||= {};
$pre{$k}{$_}++ for @pre;
$suf{$k}{$_}++ for @a;
push @pre, $k;
}
}
if ( $main::DEBUG )
{
for ( keys %pre )
{
for my $p ( keys %{ $pre{$_} } )
{
if ( exists $pre{$p}{$_} )
{
print "$_-$p AND $p-$_ !!!\n";
}
}
}
}
my @result;
for my $k (
sort {
keys(%{$pre{$a}}) <=> keys(%{$pre{$b}})
or
keys(%{$suf{$a}}) <=> keys(%{$suf{$b}})
} keys %suf
) {
push @result, $k;
# now remove all trace of it in the data structures:
delete $pre{$_}{$k} for keys %{ $suf{$k} };
delete $suf{$_}{$k} for keys %{ $pre{$k} };
delete $pre{$k};
delete $suf{$k};
}
@result
}
But it still doesn't give a pretty result when the input lists are disjoint.
| [reply] [Watch: Dir/Any] [d/l] |
Re: "Intelligent" array joining
by ngomong (Sexton) on Mar 11, 2004 at 17:00 UTC
|
Following up on this a bit more...
I found that duplicates in each input array would cause problems, so I put in a check for this (above, in a previous post).
What I'm finding now, is that problems arise when the input arrays are totally unique. That is, an interection of all the arrays would yield no elements.
Consider this code:
#!/usr/bin/perl
use strict;
use warnings;
use Graph::Directed;
my @array1= ( 1, 2, 3, 4);
my @array2= ( 5, 6, 7, 8);
my @array3= ( 9, 10, 11, 12);
my $graph = Graph::Directed->new;
$graph->add_path(@array1);
$graph->add_path(@array2);
$graph->add_path(@array3);
my @ordered = $graph->toposort();
print "@ordered";
Note that, for this example, I'm using numerals. However, the desired order is not alphabetical, ASCIIbetical, numerical, or anything similar. What's important is the order that the elements appear in the arrays.
The problem here, is that the vertices in this example are not strongly connected. So, rather than getting:
1 2 3 4 5 6 7 8 9 10 11 12
I get:
5 1 2 9 10 3 4 11 12 6 7 8
Is there a way have the topological sort expect unique arrays, with no connections, and weight them, based on the order they are received?
You see, sometimes the arrays will be connected, sometimes they won't... so I need a system that will account for both cases, without being explicitly told which is which. | [reply] [Watch: Dir/Any] [d/l] |