Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer

Two dimensional sets intersection

by menth0l (Monk)
on Apr 10, 2013 at 13:12 UTC ( #1027966=perlquestion: print w/ replies, xml ) Need Help??
menth0l has asked for the wisdom of the Perl Monks concerning the following question:


I have two discontinuous sets of values and some pairs from each sets are "valid" and have assigned values, for example:
# subset 1 of (A..Z) set_1 = (A, B, C, X, Y, Z) # subset 2 of (a..z) set_2 = (q, w, r, t, u) # valid pairs valid = ((A,q) = 1, (A,u) = 6, (A,k) => 3, (C,t) = 10, (Z,u) = 30)
I have a problem with "translating" this to perl. I know I need some kind of map for storing information about valid pairs and their id (like python's map of tuples: (x,y) => z). But having two sets, each consisting of 100 elements requires 100x100=10000 combinations... Is there any quicker method for this?

If I'd have an index like this:
my %INDEX = { A => { q => 1, u => 6, k => 3 }, C => { t => 10 }, Z => { u => 30 }, M => { q => 100 } }
I could narrow search by creating intersection (e.g. using Set::Object):
# narrow by first set (set_1) x (keys %INDEX) = (set_1) x (A, C, M, Z) = (A, C, Z)
then i would be left with an array of hash-refs and do:
set_2 x (q, u, k) + set_2 x (t) + set_2 x (u) = (1, 6, 10, 30)

I don't like this idea though and I don't know if it would be faster than combinations hash checks. Is there a perl module that would help me with this?

Comment on Two dimensional sets intersection
Select or Download Code
Replies are listed 'Best First'.
Re: Two dimensional sets intersection
by daxim (Chaplain) on Apr 10, 2013 at 13:28 UTC
    This is a FAQ.
    use Hash::MultiKey; tie my %valid, 'Hash::MultiKey'; %valid = ( [qw(A q)] => 1, [qw(A u)] => 6, [qw(A k)] => 3, [qw(C t)] => 10, [qw(Z u)] => 30, );
Re: Two dimensional sets intersection
by Not_a_Number (Parson) on Apr 10, 2013 at 20:44 UTC

    No need to worry about narrowing your search or whatever. Hash lookups are fast.

    The code below creates a hash for a 100 x 100 matrix. It also assumes that each matrix pair has a value (worst case, given your 'spec'). To simplify validation, I make this value 1, but changing it to any other number makes no difference to the time required for lookup.

    Subsequently, I make another worst-case assumption, namely that your lists of search terms (what you call Subset 1 and Subset 2) are maximally large.

    On my machine, it finishes before I can blink. Run it and see:

    use strict; use warnings; # Create 100 x 100 datastructure: my @uc = ( 'AA' .. 'DV' ); my @lc = ( 'aa' .. 'dv' ); my %index; for my $k ( @uc ) { for my $v ( @lc ) { $index{$k}{$v} = 1; } } # Parse it: my $score; for my $k ( @uc ) { # or: for my $k ( @subset_1 ) for my $v ( @lc ) { # or: for my $v ( @subset_2 ) $score += $index{$k}{$v} || 0; } } print $score;
Re: Two dimensional sets intersection
by LanX (Chancellor) on Apr 10, 2013 at 14:29 UTC
    first of all I have to say that I have the impression that you are trying to reinvent a wheel from graph theory.

    Your structure looks for me like a weighted bipartite graph, and you are trying to find a subgraph.

    Did you check CPAN for related modules?

    > (like python's map of tuples: (x,y) => z)

    don't know enough about Python, but this can easily be simulated in Perl with old-style multidim hashes and grepping with a split on the keys.

    DB<116> $index{"A","a"}=1 => 1 DB<117> $index{"A","b"}=2 => 2 DB<118> $index{"B","a"}=3 => 3 DB<119> $index{"B","b"}=4 => 4 DB<120> grep { my ($k1,$k2) = split /$;/,$_; $k1 eq "A" } keys %inde +x => ("A\34a", "A\34b") DB<121> %set1=(A=>1) => ("A", 1) DB<122> %set2=(a=>1) => ("a", 2) DB<124> grep { my ($k1,$k2) = split /$;/,$_; $set1{$k1} and $set2{$k +2} } keys %index => "A\34a"

    This brute force search might be elegantš, but for sure not as efficient as looping thru your hierarchical structure.

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    1) especially when delegating the split to a function called by grep

    grep &splitted(\%set1,\%set2) , keys %index


    added missing keys

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1027966]
Approved by LanX
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (5)
As of 2016-04-29 08:49 GMT
Find Nodes?
    Voting Booth?
    :nehw tseb si esrever ni gnitirW

    Results (439 votes). Check out past polls.