Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re: Perl Internals: Hashes

by kvale (Monsignor)
on Apr 14, 2004 at 21:20 UTC ( #345208=note: print w/ replies, xml ) Need Help??


in reply to Perl Internals: Hashes

First the easy question: hashes are a kind of associative array.

The hash code can be found in hv.c in the perl distribution. The routine

HV * Perl_newHV(pTHX) { register HV *hv; register XPVHV* xhv; hv = (HV*)NEWSV(502,0); sv_upgrade((SV *)hv, SVt_PVHV); xhv = (XPVHV*)SvANY(hv); SvPOK_off(hv); SvNOK_off(hv); #ifndef NODEFAULT_SHAREKEYS HvSHAREKEYS_on(hv); /* key-sharing on by default */ #endif xhv->xhv_max = 7; /* HvMAX(hv) = 7 (start with 8 buckets +) */ xhv->xhv_fill = 0; /* HvFILL(hv) = 0 */ xhv->xhv_pmroot = 0; /* HvPMROOT(hv) = 0 */ (void)hv_iterinit(hv); /* so each() will start off right */ return hv; }
creates a new hash. We can see that the basic memory needed is 502 bytes and intially, 8 buckets are created.

For growing a hash as elements are added, the hashing algorithm is a one-at-a-time method. To grow the hash, I think the relevant routine is

STATIC void S_more_he(pTHX) { register HE* he; register HE* heend; XPV *ptr; New(54, ptr, 1008/sizeof(XPV), XPV); ptr->xpv_pv = (char*)PL_he_arenaroot; PL_he_arenaroot = ptr; he = (HE*)ptr; heend = &he[1008 / sizeof(HE) - 1]; PL_he_root = ++he; while (he < heend) { HeNEXT(he) = (HE*)(he + 1); he++; } HeNEXT(he) = 0; }
which looks like a linear algorithm to me.

-Mark


Comment on Re: Perl Internals: Hashes
Select or Download Code
Replies are listed 'Best First'.
Re: Re: Perl Internals: Hashes
by hv (Parson) on Apr 14, 2004 at 22:11 UTC

    ... hv = (HV*)NEWSV(502,0); ...

    ... creates a new hash. We can see that the basic memory needed is 502 bytes

    That's not correct: the first parameter to NEWSV is an identifier, used only for malloc debugging. It is the second parameter that provides the size, and a value of zero means "give me a new empty SV - I'll fill it in myself".

    It is the following call to sv_upgrade((SV *)hv, SVt_PVHV) that actually turns this into something capable of storing a hash. In particular it allocates the struct xpvhv, defined in hv.h:

    struct xpvhv { char * xhv_array; /* pointer to malloced string */ STRLEN xhv_fill; /* how full xhv_array currently is */ STRLEN xhv_max; /* subscript of last element of xhv_array */ IV xhv_keys; /* how many elements in the array */ NV xnv_nv; /* numeric value, if any */ #define xhv_placeholders xnv_nv MAGIC* xmg_magic; /* magic for scalar array */ HV* xmg_stash; /* class package */ I32 xhv_riter; /* current root of iterator */ HE *xhv_eiter; /* current entry of iterator */ PMOP *xhv_pmroot; /* list of pm's for this package */ char *xhv_name; /* name, if a symbol table */ };

    Also the number of buckets does increase in powers of 2, as shown by the example code in demerphq's response. S_more_he() is not relevant to this - it is (I think) fetching a new structure from a malloc arena. The routine you want is S_hsplit() which quite early on does a nice clear:

    register I32 newsize = oldsize * 2;

    Hugo

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://345208]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (20)
As of 2015-07-07 22:19 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (93 votes), past polls