Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re^3: Using hashes for set operations...

by jaredor (Deacon)
on May 24, 2011 at 12:55 UTC ( #906491=note: print w/ replies, xml ) Need Help??


in reply to Re^2: Using hashes for set operations...
in thread Using hashes for set operations...

Sir, to me "elegant" is equivalent to "mathematical"!

(Even though I am a failed mathematician, it does sting a little that you felt the need to give me a link to De Morgan's laws.)

True, I did treat this more as a little puzzle than a real world answer, but then what do you expect with only allowing the set operations, "union" and "subtraction"? As for that, you only get the union property as a side effect from hash key creation (which (I think) would not give even a squeak of protest if there were two key/value pairs with identical keys and different values, but by construction we avoid this issue).

Although I've never used this behavior, apparently the delete function does return a list of keys values it's deleted, so delete(@A{keys B}) does return the intersection of A and B ... as well as an undef for every key value that was in B and not in A; so you're back to square 1 with ugly undefs in your answer. (Hence I still haven't used this behavior.) BTW, it just occurred to me that for the symmetric differences %sd1 and %sd2 don't have to be copies of the union, they only need to be the hash equivalents of @array1 and @array2, respectively.

I love talking like a pedant, but in addition to this vice, I have to admit to a hypocrisy: I love using map and grep more, even to the point of using them for their side effects, which means a significant portion of my time in writing up my (few) posts on perlmonks is translating to while and for constructs. So I'm sincere when I say that your original post was elegant enough for me: Handling undefs in perl is just a fact of life, and if you can do it with just one swipe of the knife, then you are about as good as you're going to get without reformulating your approach.

My last gratuitous squeak of protest is that print statements suffice if they work correctly for a code that is an illustration of a solution; in other words, it is not a data structure at that point so much as a formatted solution! (That being said, you've shamed me enough that I will try to start using Data::Dumper in my posts.)

I've just gone through and removed all the smiley's from this post. So now I look smarter, but more crotchety. Anyone who reads this please note that I enjoy these kinds of "pure perl" puzzles and have enjoyed everyone's contribution to this query of LanX's. I spent years as a coder looking for good coders to critique my work with only modest success. Here at perlmonks you get high quality feedback quickly--and you still learn from the low quality feedback because you now have context in which to evaluate it. Good stuff, and thank you LanX.


Comment on Re^3: Using hashes for set operations...
Re^4: Using hashes for set operations...
by LanX (Canon) on May 24, 2011 at 14:09 UTC
    > apparently the delete function does return a list of keys it's deleted, so delete(@A{keys B}) does return the intersection of A and B ...

    Good observation! I forgot about this.... so one gets the difference and the intersection with one operation, thats brilliant.

    So a simple upgrade of delete with an extra switch to ignore undefs would give Perl a native primitive to handle set operations...

    And I enjoyed your post too, I'm still meditating about how to make use of your approach...

    > (Even though I am a failed mathematician, it does sting a little that you felt the need to give me a link to De Morgan's laws.)

    The link wasn't meant for you, believe me. :)

    I'm mathematician by training but not a native English speaker, so I need to elaborate my posts longer and stuff them with extra informations to avoid misunderstandings. (And the monastery is full of obsessed nitpickers...)

    Cheers Rolf

      Such a kind response deserves a little extra effort. Below is my "cleanest" attempt at solving your puzzle. It was motivated by two facts 1) I was wrong about the behavior of delete in that it returns the deleted values, not the deleted keys, so the "value equal key" array is necessary and 2) I've read a few responses on perlmonks that say hash elements that have undef as a value actually point to the same undef and thus potentially save some space. (I cannot find this stated in the documentation, but it probably is that I just don't know where to look; plus all the keywords that spring to mind are ubiquitous ;-)

      • Union is dead easy.
      • Intersection uses a temporary hash as well as a grep. I chose this way as being more succinct than efficient, but admit it violates your rules.
      • Symmetric Difference is about the same effort as two intersections; however, there is no grep!

      All three constructs were written to be independent of the others, but obviously that need not be a constraint; thus symmetric difference could be as easy as taking a copy of the union and deleting the intersection (as in your approach). I decided to make intersection "easier" than symmetric difference because I think intersections are more common, at least in the type of code I write. Lastly, I have decided to have all the result hashes "look the same" in that every element value is undef--which may or may not be more efficient!

      That's all I can do. Thanks, this has been fun.

      #!/usr/bin/env perl use strict; use warnings; use Data::Dump qw(pp); my @A = qw( 1 2 3 4 5 ); my @B = qw( 3 4 5 6 7 ); my %unionAB; @unionAB{@A,@B} = undef; my %interAB; { my (%tmpA); @tmpA{@A} = @A; @interAB{(grep {defined} delete @tmpA{@B})} = undef; } my %sdiffAB; { my (%tmpA, %tmpB); @tmpA{@A} = @A; @tmpB{@B} = @B; delete @tmpA{@B}; delete @tmpB{@A}; @sdiffAB{keys %tmpA, keys %tmpB} = undef; } # View results print 'unionAB: ', pp(\%unionAB), "\n"; print 'interAB: ', pp(\%interAB), "\n"; print 'sdiffAB: ', pp(\%sdiffAB), "\n";

      (Sadly, I feel your point about adding detail. Most of my time not coding was spent trying to guard against the charge that I'm an idiot who doesn't know set theory. Of course if my beloved Dr. Kaiser were on perlmonks and posted such, I would have to agree, but since he loves PROLOG, I think I'm safe.)

        Thanks! :)

        > Lastly, I have decided to have all the result hashes "look the same" in that every element value is undef--which may or may not be more efficient!

        unfortunately this collides with my requirement to support all kind of sets:

        From the OP

        This is fundamentally wrong because keys are stringifications and any reference type will not be reproduced (just think about arrays of objects or a AoH or ...)

        Cheers Rolf

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://906491]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (4)
As of 2014-09-20 22:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (163 votes), past polls