Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

First task to do is to prepare yourself a big bowl of lime blossom tea (tilia). Let it brew for 10 minutes, take care that it doesn't become cold - a thermos flask would keep it hot. Add plenty of honey, a pinch of salt, and lemon juice. Prepare your bed with some towels on the bed sheet, have more towels to cover yourself; get into bed, cover yourself with towels then blanket(s), and drink the tea really really hot, just below the temperature which would hurt you. Keep yourself covered snug and warm, close your eyes and sweat the crap out. --

Next, the identifier thing. I have the gut feeling that a solution involving binary trees and partial match could provide a nice solution. Whilst you sweat along, I'm going to evolve that thought as I am typing.

Since you say that you accept Any change to the order (1,2,3,4,5,6,7,8,9 instead of 9,8,7,6,5,4,3,2,1) you may as well store keys with an exact length as a split/sort/concatenate key. Then again, you could also store the reversed sorted key in the B-Tree database.

If you do a lookup of the split/sort/concatenate candidate in the database, you could get a full match - done. If you get a partial match, look up the reversed sorted candidate. The difference between the sum of both partial key lengths and the required key length should be 1, no matter whether insertion, deletion or transformation of 1 digit is the cause of difference. If it is any longer, more than 1 transformation has been done to the expected key.

See DB_File's BTREE section, and maybe IP prefixes: match IP against "best" network.


Ahem. The difference between the sum of both partial match keys to the expected length should be 1 for insertion/deletion, and 2 for transformation, I guess...not. - I'm too tired right now to code the algorithm...

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

In reply to Re: Finding Nearly Identical Sets by shmem
in thread Finding Nearly Identical Sets by Limbic~Region

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2021-06-24 12:10 GMT
Find Nodes?
    Voting Booth?
    What does the "s" stand for in "perls"? (Whence perls)

    Results (127 votes). Check out past polls.