Re: Better Hash Tables?

by Jenda (Abbot)
on Feb 18, 2025 at 21:39 UTC [id://11164023]


in reply to Better Hash Tables?

Fullness can be described in terms of an overall percentage — this table is 50% full, that one’s 90% — but researchers often deal with much fuller tables. So instead, they may use a whole number, denoted by x, to specify how close the hash table is to 100% full. If x is 100, then the table is 99% full. If x is 1,000, the table is 99.9% full. This measure of fullness offers a convenient way to evaluate how long it should take to perform actions like queries or insertions.
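To make that measure concrete: a table with parameter x is (1 - 1/x) * 100 percent full, i.e. x = 1/(1 - load). A minimal Perl sketch of the conversion (mine, not from the article):

    use strict;
    use warnings;

    # The article's fullness measure: x = 1/(1 - load), so a table with
    # parameter x is (1 - 1/x) * 100 percent full.
    sub percent_full {
        my ($x) = @_;
        return 100 * (1 - 1 / $x);
    }

    printf "x = %6d  ->  %.3f%% full\n", $_, percent_full($_)
        for 10, 100, 1_000, 10_000;
    # x =     10  ->  90.000% full
    # x =    100  ->  99.000% full
    # x =   1000  ->  99.900% full
    # x =  10000  ->  99.990% full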

Seems once again researchers are researching exactly the wrong thing. I mean, if the table is 99% full, you should have increased its size a long, long time ago. If it's even fuller than that, you are an ... well, you are clearly not interested in doing anything practical, but rather seem to be working towards a degree in theoretical something-or-other.

While it's possible that there's something of practical use in that paper (so far I have only read the article), I would not hold my breath. The topic being already pathologically overstuffed hashes, there might still be something in it for the sanely proportioned ones too, but...

Jenda
1984 was supposed to be a warning,
not a manual!

Replies are listed 'Best First'.
Re^2: Better Hash Tables?
by jo37 (Curate) on Feb 19, 2025 at 12:12 UTC
    Seems once again researchers are researching exactly the wrong thing.

    You haven't read the linked paper, have you?

    Greetings,
    🐻

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

      If you read my comment, you can see that I stated openly that I was only reacting to the article, not to the paper. :-)

      I'm not through the paper yet, and I admit I would prefer it if they stated the algorithm (also) in (pseudo)code rather than (just) in a mix of English and maths, but so far I really do not think we will see this in Perl any time soon. The algorithm looks fairly complicated, and while I have no reason to dispute the asymptotic behavior, I would expect the real-world performance for hashes that are not excessively full to be rather bad. I do believe that for real world applications it will be better to allocate more memory for the table than to complicate the code.
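      For the "allocate more memory" route, plain Perl already lets you pre-size a hash's bucket array, so a bulk load does not trigger repeated grow-and-rehash steps. A minimal sketch (just an illustration, nothing from the paper):

      use strict;
      use warnings;

      my %lookup;

      # Used as an lvalue, keys() raises the number of buckets allocated
      # for the hash up front (Perl rounds the hint up to a power of two),
      # so filling the hash does not force it to grow and rehash repeatedly.
      keys %lookup = 1_000_000;

      $lookup{"key$_"} = $_ for 1 .. 1_000_000;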

      Jenda
      1984 was supposed to be a warning,
      not a manual!

        I do believe that for real world applications it will be better to allocate more memory for the table than to complicate the code.

        You are missing the point. In a hash table that is filled to a certain degree, there is some effort required to add a new element (or to search for an existing one). For decades there seems to have been a consensus that a specific algorithm was optimal (though that had never been proved), and that is what the paper disproves.

        A non-optimal algorithm, even on an enlarged table, is slower than the optimal one. Just because there is a measure that can easily express a 99.999% full table does not mean you should use such a fill level. The "asymptotic behavior" is asymptotic in the table's size for any given fullness.
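        As a rough illustration (the classical textbook estimate for open addressing with uniform probing, not the new algorithm from the paper): the expected work per insertion grows roughly like x = 1/(1 - load), regardless of the table's size, which is exactly why the fill level is the interesting parameter:

        use strict;
        use warnings;

        # Classical estimate for open addressing with uniform probing:
        # an insertion (or unsuccessful lookup) is expected to probe about
        # 1/(1 - load) slots -- i.e. about x -- regardless of table size.
        for my $load (0.50, 0.90, 0.99, 0.999) {
            printf "%.1f%% full  ->  ~%.0f expected probes per insert\n",
                100 * $load, 1 / (1 - $load);
        }
        # 50.0% full  ->  ~2 expected probes per insert
        # 90.0% full  ->  ~10 expected probes per insert
        # 99.0% full  ->  ~100 expected probes per insert
        # 99.9% full  ->  ~1000 expected probes per insert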

        Would you say there is no point in making an engine more efficient just because you can enlarge the fuel tank?

        Greetings,
        🐻

        $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
