### Set Operators

 on Oct 24, 2002 at 17:51 UTC Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have frequently found myself in a position where I want to treat an array as a set, and wish there were "set operators" (or wish I knew if they existed) in perl. The main thing I want to do is test to see if a scalar is a member of a set that is inside an array. I.e.

if (\$a isamemberof @set) {...}

Does anyone know a 1 line way of checking for set membership like the above? Thanks for the help, Monks.

Mark

Replies are listed 'Best First'.
Re: Set Operators
by zigdon (Deacon) on Oct 24, 2002 at 17:57 UTC

While arrays are ill suited for this, there are several ways:

```if (grep /^\Q\$a\$/, @set) { ... }

But more likely, you'll be better off using a hash instead of an array. Then, it's trivial (and fast!) to do this:

```if (exists \$set{\$a}) { ... }

-- Dan

Re: Set Operators
by broquaint (Abbot) on Oct 24, 2002 at 17:59 UTC
You could use Set::Array if you specifically wanted to perform set operations on arrays. But just to see whether an element is in an array can be performed with grep() e.g
```my @a = qw( foo bar baz quux );
print "yup" if grep "bar" eq \$_, @a;

__output__

yup
Or if you want something a little more mind-bending check out the any function from Quantum::Superpositions.
HTH

_________
broquaint

Re: Set Operators
by demerphq (Chancellor) on Oct 24, 2002 at 18:13 UTC
Ok, as other replies have said the best way to do this is to not do this. :-) Use a hash in the first place when you think you will need "isamememberof" relationships.

But if you must do this for some reason then as the other replies have said you can use grep. Except that you probably dont want to do that. :-) grep() is slow for this type of job. grep() is intended for finding "all the members of an array that meet a set of criteria", but what you have asked for is "find if there is an element in the array that meets a set of criteria" which is pretty different. A simple (practically) textbook solution is

```my \$found=0;
\$_==\$scalar and (\$found=1),last for @array;
Where \$found will whether the array contains a value (in the case numerically) equivelent with \$scalar. A way to do this as an expression (ie in the if you gave as an example) is here
```if (do {my \$f=0; \$_==\$scalar and (\$f=1),last for @array; \$f}) {
}
Of course you could set it up in a sub or do any number of other things....

Remember using a hash in this situation is much smarter than using either this solution or grep()

--- demerphq
my friends call me, usually because I'm late....

Re: Set Operators
by fglock (Vicar) on Oct 24, 2002 at 18:00 UTC
if ( exists \$set{\$a} ) { ... }

but it only works for hashes.
If you really want "sets" there is a whole Set directory in CPAN.
(I'm guilty of Set::Infinite but there are many other modules)

but it only works for hashes.

As of Perl 5.6.0, the exists function can be used on arrays to test the existence of an element within an array. For example:

```my %hash = ( 'key' => 'value' );

print "exists \\$hash{'key'}\n" if exists \$hash{'key'};

my @array = ( 'value' );

print "exists \\$array[\\$index]\n" if exists \$array[0];

Under Perl 5.005.03 this code results in an error:

```exists operator argument is not a HASH element at test-5.00503.perl li
+ne 7.

Whereas, under Perl 5.6.0 or later, the code executes without error:

```exists \$hash{'key'}
exists \$array[\$index]

Cool huh? Much nicer than using a construct like (\$index <= \$#array) to test the existence of an array element - Especially if someone is performing magic with \$[ !

perl -e 'print+unpack("N",pack("B32","00000000000000000000000111010111")),"\n"'

Does anyone know how the exists function is implemented for arrays? Is it the equivalent of a linear search through the array, or is it more interesting? Essentially, does it have a worst case time complexity better than N? Oops. Didn't read that very carefully. Thought you could do an exists on a value, not an index.
Re: Set Operators
by zakzebrowski (Curate) on Oct 25, 2002 at 11:47 UTC
Take a look at quantum-superpositions on cpan.

----
Zak
Pluralitas non est ponenda sine neccesitate - mysql's philosphy

Just because I like Quantum::Superpositions, I thought I'd provide an example.

```use Quantum::Superpositions;

my @set = (1, 2, 3, 4, 5);

print "3 is a member!\n" if 3 == any(@set);
print "8 is a member!\n" if 8 == any(@set);

Admittedly, this isn't the fastest solution (see the responses above for those), but it maps well to how I like to think about set operations. Sacrificing speed for clarity. :)

-Bird
Re: Set Operators
by thor (Priest) on Oct 24, 2002 at 18:00 UTC
Check out perldoc -f grep. From what I understand of your request, this is probably what you want.

thor

Re: Set Operators
by gjb (Vicar) on Oct 25, 2002 at 14:51 UTC

If you don't care about the order of the elements, Set::Scalar may be the way to go since its methods names have names that make more sense in a mathematical sense.

It should be faster than Set::Array since one needs less bookkeeping if the order is irrelevant.

Hope this helps, -gjb-

Re: Set Operators
by SuperCruncher (Pilgrim) on Oct 25, 2002 at 22:12 UTC
I can't believe no one has mentioned this (admittedly it isn't directly related to the Anonymous Monk's question): it will be possible to manipulate Perl 6 arrays with set theory-style operators. There was a Perl 6 RFC proposed, and Damian Conway's talk at YAPC::Europe this year seemed to confirm that. About one half of the audience loved it, the rest seemed to be a bit concerned at some of the incredibly cool Perl 6 features that were presented.

<offtopic>
Now once "they" (Larry et al.) decide to forget the silly idea of removing foreach and having only for, I'll be completely happy with Perl 6. Apparently code like:

doesn't make sense or looks ugly. Well, I read that as "for each of my people in the address book, do something". It makes sense to me!
</offtopic>

Re: Set Operators
by jsegal (Friar) on Oct 25, 2002 at 17:29 UTC
Following up on some of the other posts, a quick and easy way to turn an array-based set into a hash-based set is:
```%hashset = map {\$_=>1} @arrayset;
If for some reason you are forced to have the set be an array (say you need to keep the order, or you are being passed this array from some module outside of your control), but you will be doing numerous (i.e. more than one) membership tests, it might be worthwhile converting it to a hash-based set, and doing your membership tests on the hash (via exists(\$hashset{\$a})
Best of luck...

--JAS
Funny that this question came up today, as I just had to figure this out on my own. I needed to exclude a list of items from a search, so I immediately made a list of the items:

my @items = qw (one two three etc);

but then I was stumped. Looping through the whole list every time to detect a match seems not to be an optimal method. And writing:

my %items = ( one=>1, two=>1, three=>1 );

doesn't feel very good either.

Map to the rescue!

my %items = map {\$_ => 1} qw (one two three);

I'm still fairly new to map, so I'm kinda happy that I found that one on my own... Take note, fellow acolytes! :)

++jsegal

The map operator might look nice, but it isn't always the most suitable operator to use. It is a very powerful tool, but IMHO it is not the best for this job.

The classic idiom for this type of "set" operation is to use the 'x' (multiplication) operator, in combination with hash slices. e.g.

```my (%hash, @array);

@array = ('one', 'two', 'three');

@hash{@array} = (1) x @array;

if (exists \$hash{'one'}) { .... }

To demonstrate just how efficient this method is, I have written a small test case using Benchmark.pm. I've tried to include all the methods which have been shown in reply so far. I'm not guaranteeing that Perl hasn't optimised away some of the operations (which could skew the results), but I hope that it provides a fairly accurate measure of the different approaches. If anyone can see any glaring errors or ommisions, please reply.

```use Benchmark qw(timethese);

my \$iterations = 300_000;
my @array = ( "best", 1..49, "average", 51..99, "worst" );

for my \$search (qw( best average worst )) {
print "Timing '\$search' case:\n";

timethese(\$iterations, {
'map' => sub {
my %hash = ();
%hash = map { \$_=>1 } @array;
if (\$hash{\$search}) { 1 }
},
'xop' => sub {
my %hash = ();
@hash{@array} = (1) x @array;
if (\$hash{\$search}) { 1 }
},
'for1' => sub {
my %hash = ();
for (@array) { \$hash{\$_} = 1 }
if (\$hash{\$search}) { 1 }
},
'for2' => sub {
for (@array) { if (\$_ eq \$search) { 1; last } }
},
'for3' => sub {
if (do {
my \$f=0;
\$_ eq \$search and (\$f=1),last for @array;
\$f
}) { 1 }
},
'grep' => sub {
if (grep \$search eq \$_, @array) { 1 }
},
});
}

__END__

Timing 'best' case:

Benchmark: timing 300000 iterations of for1, for2, for3, grep, map, xo
+p...
for1: 116 wallclock secs (117.60 usr +  0.00 sys = 117.60 CPU) @
+ 2551.02/s (n=300000)
for2:  1 wallclock secs ( 1.21 usr +  0.00 sys =  1.21 CPU) @ 24
+7933.88/s (n=300000)
for3:  3 wallclock secs ( 2.36 usr +  0.00 sys =  2.36 CPU) @ 12
+7118.64/s (n=300000)
grep: 27 wallclock secs (26.20 usr +  0.00 sys = 26.20 CPU) @ 11
+450.38/s (n=300000)
map: 236 wallclock secs (234.47 usr +  0.00 sys = 234.47 CPU) @
+ 1279.48/s (n=300000)
xop: 93 wallclock secs (92.93 usr +  0.00 sys = 92.93 CPU) @ 32
+28.24/s (n=300000)

Timing 'average' case:

Benchmark: timing 300000 iterations of for1, for2, for3, grep, map, xo
+p...
for1: 111 wallclock secs (110.18 usr +  0.00 sys = 110.18 CPU) @
+ 2722.82/s (n=300000)
for2: 20 wallclock secs (18.35 usr +  0.00 sys = 18.35 CPU) @ 16
+348.77/s (n=300000)
for3: 20 wallclock secs (18.29 usr +  0.00 sys = 18.29 CPU) @ 16
+402.41/s (n=300000)
grep: 26 wallclock secs (26.09 usr +  0.00 sys = 26.09 CPU) @ 11
+498.66/s (n=300000)
map: 220 wallclock secs (219.22 usr +  0.00 sys = 219.22 CPU) @
+ 1368.49/s (n=300000)
xop: 93 wallclock secs (92.83 usr +  0.00 sys = 92.83 CPU) @ 32
+31.71/s (n=300000)

Timing 'worst' case:

Benchmark: timing 300000 iterations of for1, for2, for3, grep, map, xo
+p...
for1: 109 wallclock secs (109.24 usr +  0.00 sys = 109.24 CPU) @
+ 2746.25/s (n=300000)
for2: 35 wallclock secs (35.65 usr +  0.00 sys = 35.65 CPU) @ 84
+15.15/s (n=300000)
for3: 34 wallclock secs (34.06 usr +  0.00 sys = 34.06 CPU) @ 88
+07.99/s (n=300000)
grep: 26 wallclock secs (26.19 usr +  0.00 sys = 26.19 CPU) @ 11
+454.75/s (n=300000)
map: 222 wallclock secs (222.51 usr +  0.00 sys = 222.51 CPU) @
+ 1348.25/s (n=300000)
xop: 93 wallclock secs (93.05 usr +  0.00 sys = 93.05 CPU) @ 32
+24.07/s

Cheers,

-- Dave :-)

Re: Set Operators
by greenback (Initiate) on Oct 27, 2002 at 15:21 UTC
Adapting DaveH's nice solution to use another Quantum module:

```use Quantum::Entanglement;

my (\$foo, \$bar, @foo);

@foo = qw(one two three);
\$foo = entangle((1) x @foo);
sub foo {my \$state = \$_[0]; return \${\$_[1]}[\$state]}
\$bar = p_func('foo', \$foo, \@foo);

if (\$bar eq 'one') { .... }
Re: Set Operators
by greenback (Initiate) on Oct 27, 2002 at 21:53 UTC
Try:

```my (\$found, @set);

@set = qw(one two three);
\$found = map { 1 if \$set[\$_] eq \$element } 0..\$#set;

if (\$found) { ... }

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://207798]
Approved by fglock
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (2)
As of 2023-06-04 00:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
How often do you go to conferences?

Results (17 votes). Check out past polls.

Notices?