PerlMonks  

Re: Data structures in Perl. A C programmer's perspective.

by sundialsvc4 (Monsignor)
on Sep 09, 2013 at 14:50 UTC ( #1053025=note )


in reply to Data structures in Perl. A C programmer's perspective.

When people talk about linked lists “storing data all over the place,” what they are really concerned about is locality of reference, i.e. minimizing the probability of a virtual-memory page fault ... which involves stopping the process in its tracks, doing a physical I/O to retrieve the page from a spinning disk, then restarting the process where it left off.   If you only do that “occasionally,” as the operating-system strives to enable you to do, then all is well.   But if the number of page faults becomes, and stays, excessive, then the system is “thrashing.”   (The shape of the throughput-curve in a thrash situation is elbow-shaped:   beyond the thrash point, performance for pretty much the entire system degrades exponentially.   As they put it in the movie Ghostbusters, the result is “real Wrath of God stuff.”)

I will admit that I really don’t know how Perl stores a vector, or for that matter, a list or a hash.   Concerns about cache-line misses are beyond my control since they are wholly dependent upon the particular hardware upon which my software might find itself running.   But locality-of-reference is something that I can readily keep in mind, as being something that will very-directly affect the execution of my code far more than the low-level perlguts of Perl’s very highly optimized implementation.

You can build any sort of data-structure you may require in Perl, or, more likely, discover that CPAN has already done it for you.   Red-black trees?   Sure:   Tree::RedBlack or Tree::RB, to name two.

Nearly all of the time, when I stumble upon a failed program, the manifest symptom of the problem is either a timing hole or (more likely) thrashing.   The solution is to choose a different algorithm to solve the problem.   By the time I first see it, the code has been “diddled” to the point that it is frankly no longer even readable, and none of that diddling helped.


Re^2: Data structures in Perl. A C programmer's perspective.
by BrowserUk (Pope) on Sep 09, 2013 at 15:55 UTC
    When people talk about linked lists “storing data all over the place,” what they are really concerned about is locality of reference, i.e. minimizing the probability of a virtual-memory page fault ... which involves stopping the process in its tracks, doing a physical I/O to retrieve the page from a spinning disk,

    Downvoted: Locality of reference has exactly zero to do with page faults; zero to do with physical IO; and zero to do with spinning disk.

    Your whole post is total misinformation.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Data structures in Perl. A C programmer's perspective.
by code-ninja (Scribe) on Sep 09, 2013 at 17:33 UTC
      reading the article on Locality of Reference on wiki... I guess sundialsvc4 has a point.

      No, he doesn't.

      In the context of the discussion in this thread -- the effects of locality of reference on the performance of arrays or vectors versus linked-lists -- the salient part of the wiki article is:

      Typical memory hierarchy (access times and cache sizes are approximations of typical values used as of 2013 for the purpose of discussion; actual values and actual numbers of levels in the hierarchy vary):

    • CPU registers (8-256 registers) – immediate access, with the speed of the innermost core of the processor
    • L1 CPU caches (32 KiB to 512 KiB) – fast access, with the speed of the innermost memory bus owned exclusively by each core
    • L2 CPU caches (128 KiB to 24 MiB) – slightly slower access, with the speed of the memory bus shared between pairs of cores
    • L3 CPU caches (2 MiB to 32 MiB) – even slower access, with the speed of the memory bus shared between even more cores of the same processor
    • Main physical memory (RAM) (256 MiB to 64 GiB) – slow access, the speed of which is limited by the spatial distances and general hardware interfaces between the processor and the memory modules on the motherboard

      Since people don't appear to have bothered to watch the full video I linked above, here is the salient part of it (7:46). It'd be worth 8 minutes of anyone's time to watch it.


Re^2: Data structures in Perl. A C programmer's perspective.
by eyepopslikeamosquito (Canon) on Sep 09, 2013 at 18:59 UTC

    When people talk about linked lists “storing data all over the place,” what they are really concerned about is locality of reference, i.e. minimizing the probability of a virtual-memory page fault
    No. The linked list performance problem discussed in this thread was not caused by virtual-memory page faults, but by CPU cache misses.
