Perl: the Markov chain saw PerlMonks

### Two dimensional sets intersection

by menth0l (Monk)
 on Apr 10, 2013 at 13:12 UTC Need Help??
menth0l has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I have two discontinuous sets of values and some pairs from each sets are "valid" and have assigned values, for example:
```# subset 1 of (A..Z)
set_1 = (A, B, C, X, Y, Z)
# subset 2 of (a..z)
set_2 = (q, w, r, t, u)
# valid pairs
valid = ((A,q) = 1, (A,u) = 6, (A,k) => 3, (C,t) = 10, (Z,u) = 30)
I have a problem with "translating" this to perl. I know I need some kind of map for storing information about valid pairs and their id (like python's map of tuples: (x,y) => z). But having two sets, each consisting of 100 elements requires 100x100=10000 combinations... Is there any quicker method for this?

If I'd have an index like this:
```my %INDEX = {
A => { q => 1, u => 6, k => 3 },
C => { t => 10 },
Z => { u => 30 },
M => { q => 100 }
}
I could narrow search by creating intersection (e.g. using Set::Object):
```# narrow by first set
(set_1) x (keys %INDEX) = (set_1) x (A, C, M, Z) = (A, C, Z)
then i would be left with an array of hash-refs and do:
```set_2 x (q, u, k) + set_2 x (t) + set_2 x (u) = (1, 6, 10, 30)

I don't like this idea though and I don't know if it would be faster than combinations hash checks. Is there a perl module that would help me with this?

Replies are listed 'Best First'.
Re: Two dimensional sets intersection
by daxim (Chaplain) on Apr 10, 2013 at 13:28 UTC
This is a FAQ.
```use Hash::MultiKey;
tie my %valid, 'Hash::MultiKey';
%valid = (
[qw(A q)] => 1,
[qw(A u)] => 6,
[qw(A k)] => 3,
[qw(C t)] => 10,
[qw(Z u)] => 30,
);
Re: Two dimensional sets intersection
by Not_a_Number (Prior) on Apr 10, 2013 at 20:44 UTC

No need to worry about narrowing your search or whatever. Hash lookups are fast.

The code below creates a hash for a 100 x 100 matrix. It also assumes that each matrix pair has a value (worst case, given your 'spec'). To simplify validation, I make this value 1, but changing it to any other number makes no difference to the time required for lookup.

Subsequently, I make another worst-case assumption, namely that your lists of search terms (what you call Subset 1 and Subset 2) are maximally large.

On my machine, it finishes before I can blink. Run it and see:

```use strict;
use warnings;

# Create 100 x 100 datastructure:

my @uc = ( 'AA' .. 'DV' );
my @lc = ( 'aa' .. 'dv' );

my %index;
for my \$k ( @uc ) {
for my \$v ( @lc ) {
\$index{\$k}{\$v} = 1;
}
}

# Parse it:

my \$score;
for my \$k ( @uc ) {    # or: for my \$k ( @subset_1 )
for my \$v ( @lc ) {  # or: for my \$v ( @subset_2 )
\$score += \$index{\$k}{\$v} || 0;
}
}

print \$score;
Re: Two dimensional sets intersection
by LanX (Bishop) on Apr 10, 2013 at 14:29 UTC
first of all I have to say that I have the impression that you are trying to reinvent a wheel from graph theory.

Your structure looks for me like a weighted bipartite graph, and you are trying to find a subgraph.

Did you check CPAN for related modules?

> (like python's map of tuples: (x,y) => z)

don't know enough about Python, but this can easily be simulated in Perl with old-style multidim hashes and grepping with a split on the keys.

```  DB<116> \$index{"A","a"}=1
=> 1

DB<117> \$index{"A","b"}=2
=> 2

DB<118> \$index{"B","a"}=3
=> 3

DB<119> \$index{"B","b"}=4
=> 4

DB<120> grep { my (\$k1,\$k2) = split /\$;/,\$_; \$k1 eq "A" } keys %inde
+x
=> ("A\34a", "A\34b")

DB<121> %set1=(A=>1)
=> ("A", 1)

DB<122> %set2=(a=>1)
=> ("a", 2)

DB<124> grep { my (\$k1,\$k2) = split /\$;/,\$_; \$set1{\$k1} and \$set2{\$k
+2} } keys %index
=> "A\34a"

This brute force search might be elegantš, but for sure not as efficient as looping thru your hierarchical structure.

Cheers Rolf

( addicted to the Perl Programming Language)

1) especially when delegating the split to a function called by grep

grep &splitted(\%set1,\%set2) , keys %index

##### update

Create A New User
Node Status?
node history
Node Type: perlquestion [id://1027966]
Approved by LanX
help
Chatterbox?
 [Discipulus]: hello monks and nuns! all was well inmy absecnce? [1nickt]: Tutto bene, Discipulus [jedikaiti]: Well, the Monastery is still standing, so the party wasn't that epic [Discipulus]: ;=) I will unavailable this week during eu morning time: I'm following a redhat 7 basci course, finally [1nickt]: bravo [jedikaiti]: wheee

How do I use this? | Other CB clients
Other Users?
As of 2017-12-11 17:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
What programming language do you hate the most?

Results (308 votes). Check out past polls.

Notices?