Do you mean Hamming distance
Crazy idea. Since one can compute the Hamming distance between the initial words, and this is a mininum length of any possible solution path, you can eliminate any paths of shorter length as solutions. Of course, you'd need to know the paths to know the lengths, so this may not be a realizable optimization, but at least you could avoid testing for the solution until you're past the minimum distance. Just trade space for speed and precompute every path.
You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.