japhy claimed to have looked at several implementations but to have been disappointed with each of them. I think he said that each was either too esoteric for him to understand easily or did too much work generating permutations that had to be skipped.
I felt that I had a rather straightforward approach that wouldn't backtrack much at all. It is very much like Algorithm::Loops::NestedLoops(), except I attempt to build the list of values to loop over next (the offsets not currently selected) more efficiently by keeping track as I go. But I think I can do this more efficiently still.
So the code just moves along selecting the next item (actually its offset) from the list of items not selected earlier in the list and not at the same offset (and not previously selected for this slot during the current 'round').
This approach occasionally has to 'backtrack', but (I believe) this only happens when it gets to the last slot, and it does that at most once per derangement returned. So trying to look ahead to prevent this tiny amount of backtracking would actually be slower than the 'naive' approach.
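To make the slot-by-slot selection concrete, here is a minimal sketch in Python (not my Perl code). The name derangements and the bookkeeping arrays chosen and used are made up for illustration, and it simply scans for the next free offset instead of maintaining the list of unselected offsets as it goes, so treat it as an outline of the idea rather than the tuned implementation.

```python
def derangements(items):
    """Yield each derangement of items as a list (hypothetical helper)."""
    n = len(items)
    if n == 0:
        yield []                 # the empty list is its own (only) derangement
        return
    chosen = [-1] * n            # chosen[slot] = offset currently placed in that slot
    used = [False] * n           # used[offset] is True while an earlier slot holds it
    slot = 0
    while slot >= 0:
        # Release whatever offset this slot held during the current round.
        prev = chosen[slot]
        if prev >= 0:
            used[prev] = False
        # Next offset that is not taken by an earlier slot, is not this
        # slot's own offset, and was not already tried here this round.
        nxt = prev + 1
        while nxt < n and (used[nxt] or nxt == slot):
            nxt += 1
        if nxt == n:
            chosen[slot] = -1    # nothing left for this slot: backtrack
            slot -= 1
        else:
            chosen[slot] = nxt
            used[nxt] = True
            if slot == n - 1:
                yield [items[i] for i in chosen]
            else:
                slot += 1
```

Calling list(derangements("abc")) walks the slots left to right and yields the two arrangements in which no letter stays at its own offset.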
I looked at the code for Algorithm::Combinatorics and saw that it was using the lexical-order permutation algorithm [1] modified to try to skip non-derangements somewhat efficiently. I had rejected this approach as a first choice because it contains a step where you reverse a part of your list and that can place one or more items back into their original positions in such a way that it would be tricky to quickly jump to the next permutation that is a derangement. And the comments implied that it did have to skip many permutations because of this.
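For reference, the reversal problem can be seen in a small Python sketch of the textbook lexical-order step. This is not the Algorithm::Combinatorics code; the function name next_permutation is chosen here for illustration, and the sketch assumes all values are distinct.

```python
def next_permutation(seq):
    """Advance seq in place to its lexical successor; return False at the end.

    Textbook version, assuming distinct values (illustrative helper).
    """
    # Find the rightmost position whose value is smaller than its successor.
    i = len(seq) - 2
    while i >= 0 and seq[i] > seq[i + 1]:
        i -= 1
    if i < 0:
        return False             # seq was already the last permutation
    # Find the rightmost value larger than seq[i] and swap the two.
    j = len(seq) - 1
    while seq[j] < seq[i]:
        j -= 1
    seq[i], seq[j] = seq[j], seq[i]
    # The reversal step: the tail is descending, so reverse it to ascending.
    seq[i + 1:] = reversed(seq[i + 1:])
    return True


perm = [1, 0, 3, 2]              # a derangement of offsets 0..3
next_permutation(perm)           # perm is now [1, 2, 0, 3]
```

After the swap the list is [1, 2, 3, 0], which is still a derangement, but reversing the tail gives [1, 2, 0, 3] and puts 3 back at offset 3. That is the kind of fixed point that then has to be skipped.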
So, based on japhy's assessment I didn't look at other implementations. Thanks for pointing those out.
[1] The classical lexical-order permutation algorithm is very similar to Algorithm::Loops::NextPermute() except for not dealing with duplicate values, something that I have yet to see done outside of my Algorithm::Loops.