how do I efficiently remove one hash from another?

perltux has asked for the wisdom of the Perl Monks concerning the following question:

Re: how do I efficiently remove one hash from another? by CountZero (Bishop) on Nov 27, 2012 at 07:05 UTC
Use a hash slice: `delete @hash1{keys %hash2};` Update: deleted an unmatched ")". Thanks frozenwithjoy. CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
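For anyone who wants to see the slice in action, here is a minimal sketch. The sample keys and values are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the thread.
my %hash1 = (apple => 1, banana => 2, cherry => 3, date => 4);
my %hash2 = (banana => 'x', date => 'y');

# Remove from %hash1 every key that also appears in %hash2.
# Only the keys of %hash2 matter; its values are ignored.
delete @hash1{ keys %hash2 };

print join(', ', sort keys %hash1), "\n";   # prints: apple, cherry
```

%hash2 itself is left untouched; only %hash1 is modified in place.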
Re^2: how do I efficiently remove one hash from another? by perltux (Monk) on Nov 27, 2012 at 07:17 UTC
Thanks, that's just what I was looking for!
Re: how do I efficiently remove one hash from another? by davido (Cardinal) on Nov 27, 2012 at 07:11 UTC
Whether the loop is explicit or implicit, there's a loop. But one cool way to do it is: `delete @hash1{ keys %hash2 };` ...which is pretty much the same thing as... `delete $hash1{$_} for keys %hash2;` ...but with an implicit loop (via the hash slice) rather than the explicit 'for' loop. Note that in either case there's no need to check `exists` first: `delete` doesn't complain if the element doesn't exist. Dave
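A quick way to convince yourself that the two forms agree, and that keys missing from the target hash are harmless, might look like this. The hash names and contents are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical data: 'zzz' does not exist in either copy below.
my %remove = (b => 1, c => 1, zzz => 1);

my %by_slice = (a => 1, b => 2, c => 3);
my %by_loop  = %by_slice;                  # independent copy

delete @by_slice{ keys %remove };          # implicit loop via the hash slice
delete $by_loop{$_} for keys %remove;      # explicit 'for' loop

# Both hashes should now hold the same keys, and no warning is
# emitted for 'zzz' even under "use warnings".
my $same = join(',', sort keys %by_slice) eq join(',', sort keys %by_loop);
print "remaining: @{[ sort keys %by_slice ]}\n";        # remaining: a
print "same result: ", ($same ? "yes" : "no"), "\n";    # same result: yes
```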
Re^2: how do I efficiently remove one hash from another? by tobyink (Canon) on Nov 27, 2012 at 10:26 UTC
The former compiles to a much smaller op tree than the latter. Compare `perl -MO=Concise -e'delete @hash1{keys %hash2}'` with `perl -MO=Concise -e'delete $hash1{$_} for keys %hash2'`. On my machine, the slice performs about 20% faster than the loop. `perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'`
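If you want to measure this on your own machine, a rough sketch using the core Benchmark module could look like the following. The hash sizes are arbitrary assumptions, and because each run has to copy the source hash first (the delete is destructive), the copy cost dilutes the difference; treat the numbers as indicative only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 10,000 keys, half of which get removed.
my %source = map { $_ => 1 } 1 .. 10_000;
my %remove = map { $_ => 1 } grep { $_ % 2 } 1 .. 10_000;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    slice => sub { my %h = %source; delete @h{ keys %remove }; return },
    loop  => sub { my %h = %source; delete $h{$_} for keys %remove; return },
});
```

Results will vary with Perl version, hash size, and hardware, so the 20% figure above should not be expected to reproduce exactly.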
Re^3: how do I efficiently remove one hash from another? by ColonelPanic (Friar) on Nov 28, 2012 at 14:25 UTC
The slice option only calls `delete` once. It also doesn't have to go to the trouble of assigning `$_` for each element. So that makes sense. When's the last time you used duct tape on a duct? --Larry Wall
Re^2: how do I efficiently remove one hash from another? by perltux (Monk) on Nov 27, 2012 at 07:18 UTC
Thanks for the detailed explanation.
Re: how do I efficiently remove one hash from another? by rjt (Curate) on Nov 27, 2012 at 09:28 UTC
If you are concerned about looping through the keys (which every way of doing what you've asked has to do), is there any other way you can build up or maintain the data in the first place? Whenever I find an expensive operation in one of my programs, it's always worth going back to the data representation itself to see if there is a more efficient way of maintaining the data.

For instance, if the hash keys do not change often, and you are more concerned about CPU than memory, separately maintaining `%hash1_minus_hash2` might be better. Your "write" operation takes twice as long (still O(1), just with a higher constant), but your O(n) loop is now O(1), which can be a huge win.

Also, if `%hash1` is not more than twice as big as `%hash2`, and you're going to loop through `%hash1` for the next step of your algorithm anyway (and you do this at most once per delete cycle), a construct like this might actually be more efficient: `for (keys %hash1) { next if exists $hash2{$_}; ... }`, with your main loop code in place of the `...`.

There are of course many additional strategies that may be a better fit, but perhaps these two will spark an interest in taking the problem up a level. And it's entirely possible there's nothing else you can do in your case. If that's true, cheers anyway! :-)
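Both suggestions can be sketched roughly as follows. The helper names `set_hash1` and `set_hash2` are hypothetical, and the sketch assumes all writes go through them and that keys are never removed from `%hash2` (handling that case would need extra bookkeeping).

```perl
use strict;
use warnings;

my (%hash1, %hash2, %hash1_minus_hash2);

# Hypothetical write helpers: every update goes through these so the
# difference hash stays current without a later O(n) pass.
sub set_hash1 {
    my ($key, $value) = @_;
    $hash1{$key} = $value;
    $hash1_minus_hash2{$key} = $value unless exists $hash2{$key};
}

sub set_hash2 {
    my ($key, $value) = @_;
    $hash2{$key} = $value;
    delete $hash1_minus_hash2{$key};   # harmless if the key is absent
}

set_hash1(a => 1);
set_hash1(b => 2);
set_hash2(b => 'x');
set_hash1(c => 3);

print join(', ', sort keys %hash1_minus_hash2), "\n";   # prints: a, c

# Second idea: skip the delete entirely and filter during the main loop.
my @kept;
for (sort keys %hash1) {
    next if exists $hash2{$_};
    push @kept, $_;                    # stand-in for the main loop code
}
print "@kept\n";                       # prints: a c
```

The first approach trades memory and slightly slower writes for a difference that is always ready; the second leaves `%hash1` intact and pays the `exists` check once per key.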
Re: how do I efficiently remove one hash from another? by Anonymous Monk on Nov 27, 2012 at 21:48 UTC
When something needs to be in more than one structure at a time, references make this very easy to do. There's only one item of data, with n references to it, and the data won't disappear from memory until the reference count drops to zero. (But avoid "self-referencing" data objects; see "weakened" references.)
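As a rough illustration of that point, here is a small sketch with one record shared between two hashes, followed by a self-reference weakened with `weaken` from the core Scalar::Util module. The record contents and hash names are invented for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# One hypothetical record, stored by reference in two structures.
my $record   = { name => 'widget', count => 3 };
my %by_name  = (widget => $record);
my %in_stock = (widget => $record);

# A change made through one structure is visible through the other,
# because both hold references to the same underlying hash.
$by_name{widget}{count}++;
print $in_stock{widget}{count}, "\n";            # prints: 4

# Dropping one reference does not free the data while others remain.
delete $by_name{widget};
print exists $in_stock{widget} ? "still reachable\n" : "gone\n";   # still reachable

# A structure that points at itself never reaches a reference count of
# zero unless the self-reference is weakened.
my $node = { name => 'root' };
$node->{self} = $node;
weaken($node->{self});
print isweak($node->{self}) ? "weak\n" : "strong\n";               # weak
```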

