in reply to Re: "Just use a hash": An overworked mantra? in thread "Just use a hash": An overworked mantra?
Hash is not O(1)
Are you sure?
- "Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at constant average (indeed, amortized) cost per operation." -- http://en.wikipedia.org/wiki/Hash_table
- http://en.wikipedia.org/wiki/Big_O_notation
- "Thus, everyone knows that hash table queries run in amortized constant time. That is, as the number of keys increases, the average time necessary to recover a key-value pair does not increase." -- http://lemire.me/blog/archives/2009/08/18/do-hash-tables-work-in-constant-time/
- "Hash tables, sometimes also known as associative arrays, or scatter tables, are a data structure that offers fast lookup, insertion, and deletion of (key, value) pairs. In a well-designed implementation of hash tables, all of these operations have a time complexity of O(1), rather than the O(n) that linked lists, association lists, and vectors would require for some of them." -- http://www.math.utah.edu/~beebe/software/hash/hash.html
- "Perl hashes provide constant-time lookups. They are implemented as true hash tables (with automatic retuning as necessary), not associative arrays." -- http://stackoverflow.com/questions/3774904/what-is-the-time-complexity-for-looking-up-an-item-in-a-perl-hash
- http://www.sysarch.com/Perl/sort_paper.html
I'll grant that an individual lookup might consume greater than O(1) time, but in the aggregate it's fairly well established and accepted that hash operations are O(1) operations, where 'n' is the number of elements in the hash.
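As a rough illustration of the amortized-O(1) claim, one can time lookups at different hash sizes. This is a sketch, not a rigorous benchmark; absolute numbers vary by machine and perl build, but per-lookup time should stay roughly flat as the hash grows:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# If lookups are amortized O(1), the per-lookup cost should be about
# the same for a 1_000-key hash and a 100_000-key hash.
for my $n (1_000, 100_000) {
    my %h = map { ( "key$_" => $_ ) } 1 .. $n;
    die "sanity check failed" unless $h{"key$n"} == $n;

    my $lookups = 50_000;
    my $t0      = time;
    my $sum     = 0;
    $sum += $h{ "key" . ( 1 + int rand $n ) } for 1 .. $lookups;
    printf "n=%7d  %.3f us/lookup\n", $n, ( time - $t0 ) / $lookups * 1e6;
}
```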
Re^3: "Just use a hash": An overworked mantra?
by vkon (Curate) on Dec 23, 2011 at 20:52 UTC
Yes, I am sure.
First, the third link you pointed to reads:
Thus—formally—hash tables often run in time O(log n).
Next, if the question is "is it possible to create an algorithm that, given arbitrary strings of finite length, does associative store/lookup in O(1)?", then the answer is "yes", but such algorithms are unrealistic.
But, back to reality, real-life algorithms are O(log N) at best, and I am afraid Perl's is worse than that. Also, have you ever heard of the so-called "hash complexity attack" that was security-fixed by seed randomization in 5.8.1? To inform you: earlier versions looked like O(1) but were actually O(n) (or maybe even worse in the worst-case scenario).
Next, your "stackoverflow.com" link contains simply wrong information; IOW, I do not see any proof there that convinces me.
And your last link does not even mention hashes.
Would you care to explain what it is doing here? TIA!
And finally, do you see something strange in your phrase "hash operations are O(1) operations, where 'n' is the number of elements in the hash"?
I may be joking a bit, but I do not trust phrases like that.
You explain a parameter 'n' that is not present in your formula; what other mistakes/typos have you made?
:)
kewl.
Thanks for the pointer to the table; I overlooked it. Indeed, it takes special care to find that spot: it is the only mention, given without any argumentation, out of line with the overall text, and easy to overlook. And still it does not convince me: where is the argumentation?
It is a bit of a pity that "O(1) for hashes" does not actually reflect what is going on inside perl.
For proof of my point, look into the perl implementation of hashes: just read the comments and you will see it.
Convincing?
Well, the URLs provided to me are even less convincing... :o
Yes, they are O(n log n).
Here is how hashes work.
OK: first the hash value is computed (some say this is O(log n); I do not know, but this time I trust the links davido provided). Then, based on the computed hash value, there are two cases:
- either "luck" (this is what usually happens): the hash value is different from all previous values;
- or the hash value equals that of earlier keys (rare in practice, but this is what a hash complexity attack exploits).
In that latter case the insertion/search (depending on the exact operation) walks the collision chain, which is O(n).
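That collision-chain walk can be sketched with a toy chained hash table. This is a hypothetical illustration, not Perl's actual hv.c implementation; the bucket count and hash function here are made up for the example:

```perl
use strict;
use warnings;

# Toy chained hash table: a lookup hashes the key to a bucket, then
# walks that bucket's list comparing keys.  With the keys spread
# evenly the chains stay short (amortized O(1) if the bucket count
# grows with n); if every key lands in one bucket, lookup is O(n).
my $BUCKETS = 8;
my @table = map { [] } 1 .. $BUCKETS;

sub bucket_of {
    my ($k) = @_;
    my $h = 0;
    # Simple multiplicative string hash (djb2-style), made up for the sketch.
    $h = ( $h * 33 + ord ) % $BUCKETS for split //, $k;
    return $h;
}

sub ht_insert {    # no duplicate-key handling; this is only a sketch
    my ( $k, $v ) = @_;
    push @{ $table[ bucket_of($k) ] }, [ $k, $v ];
}

sub ht_lookup {
    my ($k) = @_;
    for my $pair ( @{ $table[ bucket_of($k) ] } ) {    # O(chain length)
        return $pair->[1] if $pair->[0] eq $k;
    }
    return undef;                                      # miss
}

ht_insert( "foo", 1 );
ht_insert( "bar", 2 );
print ht_lookup("foo"), "\n";    # prints 1
```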
Otherwise, what would this hunk of hv.c code be for:
 * The "hash seed" feature was added in Perl 5.8.1 to perturb the results
 * to avoid "algorithmic complexity attacks".
 *
 * If USE_HASH_SEED is defined, hash randomisation is done by default
 * If USE_HASH_SEED_EXPLICIT is defined, hash randomisation is done
 * only if the environment variable PERL_HASH_SEED is set.
 * For maximal control, one can define PERL_HASH_SEED.
 * (see also perl.c:perl_parse()).
 */
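For what it's worth, the seed's effect is observable from Perl itself. A sketch, assuming a perl binary on PATH and perl >= 5.18 (where PERL_HASH_SEED takes a hex string and PERL_PERTURB_KEYS=0 disables the extra per-process key-order perturbation):

```perl
use strict;
use warnings;

# Hash iteration order depends on the per-process hash seed.  The seed
# is read at interpreter startup, so we spawn a child perl with the
# environment variables set.  With a fixed seed the order is
# reproducible; different seeds usually give different orders.
my $code = 'my %h = map { $_ => 1 } "a".."j"; print join(",", keys %h);';

sub key_order {
    my ($seed) = @_;
    local $ENV{PERL_HASH_SEED}    = $seed;   # honored at startup only
    local $ENV{PERL_PERTURB_KEYS} = "0";     # no extra shuffling
    return scalar qx{perl -e '$code'};
}

print key_order("0x0") eq key_order("0x0")
    ? "same seed: same order\n"
    : "same seed: different order\n";
```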