Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

How to make buckets of like data

by PRyanRay (Novice)
on Feb 05, 2013 at 19:26 UTC ( #1017261=perlquestion: print w/ replies, xml ) Need Help??
PRyanRay has asked for the wisdom of the Perl Monks concerning the following question:

Hello, Monks!

I am attempting to "bucketize" binary lists. The basic idea is if I have a list like this:

my @array = (["A","B"],["C","D"],["A","C"],["E","F"],["F","G"]);

I would like to (using hases or arrays) bucket letters that are associated. In this example, I would like to have an object like this:

my @array1 = ([A,B,C,D],[E,F,G]);

I have not had much luck. I have some code that is close but it is very clumsy and cumbersome. Here is the "meat" of my code that is attempting to do this. The variables $req1 and $req2 are the letters in the above example, and, yes, I know it is not correct, that is why I am here:

#!usr/bin/perl use strict; use warnings; my %buckets; my @array2 = (["A","B"],["C","D"],["A","C"],["E","F"],["F","G"]); my $count = 0; foreach(@array2){ my $req1 = $array2[$count][0]; my $req2 = $array2[$count][1]; my $j = 0; foreach my $key (keys %buckets){ if($key =~ /$req1/i){ $buckets{$key} .= $req2; $j++; last; } elsif($key =~ /$req2/i){ $buckets{$key} .= $req1; $j++; last; } elsif($buckets{$key} =~ /$req1/i){ $buckets{$key} .= $req2; $j++; last; } elsif($buckets{$key} =~ /$req2/i){ $buckets{$key} .= $req1; $j++; last; } } if($j == 0){$buckets{$req1} = $req2;} $count++; } foreach(keys %buckets){print $_,$buckets{$_},"\n";}

Am I missing something simple here? Any help would be appreciated.

The output from this is the following:

ABC

CD

EFG

See the problem is how to do the retroactive updating if a new hash entry was made.

Comment on How to make buckets of like data
Select or Download Code
Re: How to make buckets of like data
by BrowserUk (Pope) on Feb 05, 2013 at 20:07 UTC
    Am I missing something simple here?

    I don't believe you can do it in one pass. This works, but might be simplified?:

    #! perl -slw use strict; use Data::Dump qw[ pp ]; my @array = (['A','B'],['C','D'],['A','C'],['E','F'],['F','G']); my %h; for( @array ) { my( $x, $y ) = @$_; $h{ $x }{ $y } = 1; $h{ $y }{ $x } = 1; } OUTER: for my $x ( reverse sort keys %h ) { for my $y ( reverse sort keys %h ) { next if $x eq $y; if( exists $h{ $y }{ $x } ) { $h{ $y }{ $_ } = 1 for keys %{ $h{ $x } }; delete $h{ $x }; next OUTER; } } } my @buckets = map[ sort keys %{ $h{ $_ } } ], keys %h; pp \@buckets; __END__ C:\test>1017261 [["A", "B", "C", "D"], ["E", "F", "G"]]

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: How to make buckets of like data
by roboticus (Canon) on Feb 05, 2013 at 23:40 UTC

    PRyanRay:

    That was amusing.

    $ cat groupify_a.pl #!/usr/bin/perl use strict; use warnings; my %Grps; # Build the groups while (<DATA>) { my ($l,$r) = split /\s*,?\s+/, $_; last unless defined $r; my $gl = $Grps{$l}; my $gr = $Grps{$r}; if (defined $gl) { if (defined $gr) { if ($gr != $gl) { # Merge two groups my %new_gr = map { $_=>0 } @$gr, @$gl; my $new_gr = [ sort keys %new_gr ]; $Grps{$_} = $new_gr for @$new_gr; } } else { # Add $r existing group push @{$Grps{$r}=$gl},$r; } } elsif (defined $gr) { # Add $l to existing group push @{$Grps{$l}=$gr},$l; } else { # Create new group my $new_gr = [ $l, $r ]; $Grps{$_} = $new_gr for @$new_gr; } } #...and print them for my $K (keys %Grps) { next unless @{$Grps{$K}}>0; print "{ ", join(", ", splice @{$Grps{$K}}), " }\n"; } __DATA__ A, B C, D A, C E, F F, G $ perl groupify_a.pl { E, F, G } { A, B, C, D }

    I did a similar version using hashes, but I think this one is a little better. (Perhaps I could've said it better as "the hash one was a little uglier".)

    Update: I prefer Browser's version because it has better notation. I prefer the concept of mine, but can't figure out how to make the notation as nice as his...

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: How to make buckets of like data
by BrowserUk (Pope) on Feb 06, 2013 at 01:34 UTC

    A more efficient formulation:

    #! perl -slw use strict; use Data::Dump qw[ pp ]; my @array = (['A','B'],['C','D'],['A','C'],['E','F'],['F','G']); my @buckets; for( sort { $a->[0] cmp $b->[0] } @array ) { my( $x, $y ) = @$_; my $added = 0; for my $bkt ( @buckets ) { if( exists $bkt->{ $x } or exists $bkt->{ $y } ) { undef @{ $bkt }{ $x, $y }; ++$added; } } push @buckets, { $x, undef, $y, undef } unless $added; } $_ = [ sort keys %$_ ] for @buckets; pp \@buckets; __END__ C:\test>1017261-2 [["A", "B", "C", "D"], ["E", "F", "G"]]

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: How to make buckets of like data (connected components)
by LanX (Canon) on Feb 06, 2013 at 01:50 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1017261]
Approved by BrowserUk
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (14)
As of 2014-12-18 12:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (51 votes), past polls