IIRC these structures are called multisets because some elements are repeated in one of your examples.
If I understand your requirements correctly, you can use your approach in a pragmatic way, because any "neighboring" multisets must have at least 8 digits in common. So
At the end you'll only need 9 hash lookups to drastically narrow down the potential candidates.
NB: That's a pragmatic approach; a detailed survey might show more efficient algorithms.
HTH :)
PS: this problem reminds me of the Hamming distance used in error-correcting codes, but I doubt you can easily apply it here.
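A minimal sketch of how the 9 lookups could work, assuming each multiset holds exactly 9 single digits and that "neighboring" means sharing at least 8 of them. The helper names (`canon`, `sub_keys`, `build_index`, `neighbors`) and the sample data are made up for illustration. This is one possible reading of the idea, not necessarily the layout you already have.

```perl
use strict;
use warnings;

# Canonical key for a multiset: its digits sorted and joined.
sub canon { return join '', sort @_ }

# All distinct 8-digit keys obtained by dropping exactly one digit
# (at most 9 of them for a 9-digit multiset).
sub sub_keys {
    my @digits = sort @_;
    my %seen;
    for my $i (0 .. $#digits) {
        my @rest = @digits;
        splice @rest, $i, 1;          # remove the digit at position $i
        $seen{ join '', @rest } = 1;
    }
    return sort keys %seen;
}

# Build a HoH: $index{sub_key}{full_key} = 1
sub build_index {
    my %index;
    for my $set (@_) {
        my $full = canon(@$set);
        $index{$_}{$full} = 1 for sub_keys(@$set);
    }
    return \%index;
}

# Stored multisets sharing at least 8 digits with the query,
# found with at most 9 hash lookups. The query itself is excluded.
sub neighbors {
    my ($index, @query) = @_;
    my %found;
    for my $key (sub_keys(@query)) {
        next unless $index->{$key};
        $found{$_} = 1 for keys %{ $index->{$key} };
    }
    delete $found{ canon(@query) };
    return sort keys %found;
}

# Hypothetical sample data
my $index = build_index(
    [1,2,3,4,5,6,7,8,9],
    [1,2,3,4,5,6,7,8,8],
    [1,1,3,4,5,6,7,9,9],
);
my @near = neighbors($index, 1,2,3,4,5,6,7,8,1);
print "@near\n";    # prints "123456788 123456789"
```

The lookups only narrow the field. Every candidate returned is already one replacement away from the query, so if several come back you still have to decide how to treat the tie.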
Cheers Rolf
Update: I just realized that you already sketched that approach in Re^2: Finding Nearly Identical Sets. Not sure why you say it's ugly, because a HoH should be quite fast, and you'd need to check anyway whether your input is equidistant to multiple neighbors.
In reply to Re: Finding Nearly Identical Sets
by LanX