Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
I thought of the overhead of building the list, but the difference was more than I would think that that takes into account. But I did not measure it. However I just used Devel::Size and it reported 32083260 as the size of the hash and 44083260 as the total_size. (Regardless of how I built that hash.) My guess is that the rest is extra buffering on the part of various memory requests. Part of which is going to be the OS buffering Perl's mallocs, which is likely to be highly OS dependent, and possibly order of request dependent as well.

On the implementation, note that the performance can be improved both by moving from tied data structures to OO access and again (probably more) by using documented tuning parameters. Additionally with a dataset this small, it may be acceptable to switch back to a hash stored in memory (since BerkeleyDB's hashing takes less memory than Perl's). Furthermore the fact that it is general-purpose makes me more willing to take that approach rather than constructing a special-purpose solution that is possibly fragile.

And on caching. I agree heartily that optimizing for perfection is only going to be a niche exercise. I disagree that optimizing for better use of multi-level memory is unimportant though. I predict that you will see programmers over time being pushed towards data structures that are readily tuned to local memory configurations, dynamically if possible. You could view the popularity of databases as a step in this direction. A far bigger step is Judy arrays. Sure, it is very complicated to implement, but you only need to implement once. (Well, Perl would need to reimplement for licensing reasons...) And from the end programmer perspective, it is just a question of replacing the black box of a hash by the black box of another associative array implementation and voila! Things run faster!

A few more years of increasing discrepancies between access times, and the difference will become bigger still. Eventually it will be big enough that common scripting languges will make the switch. Most programmers will never need to understand what actually happened, any more than we really understand most of the ongoing changes that make Moore's law tick in the first place.


In reply to Re: Re: Re: A (memory) poor man's hash by tilly
in thread A (memory) poor man's <strike>hash</strike> lookup table. by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (4)
As of 2024-03-28 20:23 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found