PerlMonks  

Re^3: Sharing XS object?

by ELISHEVA (Prior)
on Mar 09, 2011 at 23:42 UTC (#892311)


in reply to Re^2: Sharing XS object?
in thread Sharing XS object?

I'm not at all clear about the source of your numbers. Did you benchmark code? Do a big-O analysis? I find it hard to believe that my code even unoptimized has a 4-fold increase in total memory consumption over shared variables when only one thread has a copy of $oData and all the other threads request individual bits of data on an as-needed basis.

The OP did not specify the usage pattern of data in his application, or at least I did not read the post that way. There is nothing there saying that he has a very large number of individual items that have to be simultaneously shared between M threads.

Based on his concern about a tree with an unspecified large number of nodes and a sample client appearing to do a search for a particular node named "foo", I made the assumption that he had in mind a quite different scenario. He has a very large data structure, perhaps 1G of data (before Perl overhead). He has threads that need to select bits and pieces from that data structure, e.g. query for a particular node in his tree. At any one time, in any one scope, each thread maybe needs no more than a handful of items out of that huge data structure, let's say 10. Assuming that those 10 items consume 100 bytes each, we are talking about no more than a KB of data required by each client thread. Even without optimizations, I can't possibly see how deep copying 1G of data to each thread (10G total) would be better than 1G held by a server thread and 1K held by 10 client threads (1G+10K total). Even if you argued that all that marshalling meant 4x the amount of memory per data item, you still would only have 1G+40K total. That isn't anywhere near 10G, let alone 40G. What am I missing?
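To make the arithmetic above explicit, here is a small sketch that just recomputes the totals. Every figure in it is one of the assumptions from the paragraph (1G of data, 10 threads, 10 items of 100 bytes each, a 4x marshalling penalty), not a measurement, and the variable names are mine.

```perl
use strict;
use warnings;

# All figures are the assumptions stated in the post, not measured values.
my $GB       = 1024**3;
my $threads  = 10;          # number of client threads
my $data     = 1 * $GB;     # size of the tree before Perl overhead
my $items    = 10;          # items a client holds at any one time
my $per_item = 100;         # bytes per item

my $client_need = $items * $per_item;                    # about 1 KB per client
my $deep_copy   = $threads * $data;                      # every thread owns a full copy
my $server      = $data + $threads * $client_need;       # one server copy plus client working sets
my $server_4x   = $data + $threads * 4 * $client_need;   # same, with a 4x penalty per marshalled item

printf "deep copy per thread : %d bytes\n", $deep_copy;
printf "server + clients     : %d bytes\n", $server;
printf "server + clients, 4x : %d bytes\n", $server_4x;
```

This only restates the guess in code; it says nothing about what either approach actually costs at run time.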

Usually, if you actually did benchmarking, you post your results in some detail. Here you did not. Or did you mean me to read your numbers in a rhetorical light - if code is 100-fold slower, if code has 4x the memory.... It is unclear to me.

If your actual point was "Don't be so cavalier about memory-processing time trade-offs because some just aren't worth it.", I agree entirely. It is totally silly to take two weeks to do something, when memory constraints could be solved by buying a few more GB of RAM at $10-120 a GB depending on quality. However, in many applications, even a 100-fold increase in per-op time is of minimal concern if that op is only a small part of the larger code. Neither of us knows what percentage of time the OP is spending querying his tree object relative to other processing he does with whatever data he retrieves.

My point about the marshalling was not to say that you should tolerate it because that is the price you pay. Rather I meant just the opposite: be really sure memory is a real problem because the software solutions to memory constraints are going to cost you.

Update: fixed some typos in numbers.

Update: removed first paragraph - reread my post and BrowserUk's and realized he wasn't complaining that my code failed to optimize itself but rather that the whole idea of marshalling was not a tradeoff of memory consumption at the price of CPU.


Re^4: Sharing XS object?
by BrowserUk (Pope) on Mar 10, 2011 at 00:22 UTC
    I'm not at all clear about the source of your numbers.

    I benchmarked.

    Your code:

    Result: Memory usage: 64.3MB CPU:4:54 minutes.

    threads::shared:

    Results: Memory usage: 21.1MB CPU usage: 7.1 seconds.

    I find it hard to believe that my code even unoptimized has a 4-fold increase in total memory consumption over shared variables when only one thread has a copy of $oData and all the other threads request individual bits of data on an as-needed basis.

    Don't believe; measure.

    ... I can't possibly see how deep copying 1G of data to each thread (10G total) would be better than 1G held by a server thread and 1K held by 10 client threads (1G+10K total). ...

    You are just guessing. I benchmarked.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      You are just guessing. I benchmarked.

      I hardly call making a reasoned guess "just guessing". When a reasoned guess and experimental evidence conflict, one of three things can be happening: the reasoned guess was logically flawed, the reasoned guess failed to include key information and so was wrong no matter how perfect the logic, or the experimental design is wrong. Those are important questions and they can't just be swept under the rug.

      No matter what the answer, there is something to learn. True learning in my experience comes from regularly checking experience with logic and logic with experience. Inductive and deductive reasoning need to work in tandem. I hardly need to tell you that.

      Secondly, with all due respect, your "benchmark" is comparing apples and oranges. The OP said he wanted to share an object and not just any object but a pre-initialized tree. This is exactly the kind of object threads::shared says it can't handle well. According to the perl docs, sharing a pre-initialized object will wipe out all of the data (see 5.10 and 5.12 bugs on threads::shared). Additionally, share can only do shallow sharing. If you have a complex deeply nested data structure you have to share each component bottom up and of course you have to reinitialize them all as well because sharing will wipe out their data.
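
      A minimal sketch of the two behaviours described above, using a toy tree rather than the OP's XS object. The names %plain, %tree, @kids and %foo are made up for illustration. It shows share() emptying a populated hash, then the same structure assembled by hand from pieces that are each shared before being linked in.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical toy data standing in for a pre-initialized tree.
my %plain = (name => 'root', kids => [ { name => 'foo' } ]);

# Documented limitation: share() on a populated aggregate discards its contents.
share(%plain);
my $left = keys %plain;
print "keys left after share(): $left\n";

# Building the shared version by hand: each level is shared before it is linked in.
my %foo  :shared;
my @kids :shared;
my %tree :shared;
$foo{name}  = 'foo';
push @kids, \%foo;
$tree{name} = 'root';
$tree{kids} = \@kids;

# A worker thread works on the same nodes, not on a private copy.
threads->create(sub { $tree{kids}[0]{visited} = 1 })->join;
print "visited flag seen by parent: $tree{kids}[0]{visited}\n";
```

      This requires a perl built with ithreads support.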

      In short, given the current status of threads::shared a true benchmark is simply not possible. Either you level the playing field by applying a server/client solution to a situation where it is clearly overkill. Alternatively, you apply the client/server solution to a complex pre-initialized object, but then you wouldn't be able to use shared.

      Where I will agree with you is that threads::shared is clearly not nearly as inefficient as I first believed. If they ever do clear up the initialization wipe-out bug, it will be a very powerful thing indeed.

      The very first line of my post (reply 1) made an assertion that is almost certainly false along with a disclaimer that I'm not a threads expert (implication I may be wrong). Given that you are an expert, why didn't you just come out and say:

      threads::shared may not be applicable to the kind of object the OP wants to share, but it isn't nearly as inefficient as you think. In fact in some cases it even uses less memory than a single thread. Here's why ...

      It would have been a lot more helpful to me and likely to others as well. Further, your answer would have been right up there in reply 2 where it would be much more likely to be read and get exposure.

      I was intrigued by your performance observations and I threw together a few additional benchmarks of my own comparing scripts using three different scenarios: single thread, multi-thread w/o share and shared.

      What is really clear from the results below is that there is simply no way that threads::shared is keeping per-thread copies of data - i.e. unshared threads with a lot of behind-the-scenes automated copying. The unshared multi-threaded script used up so much memory it froze my whole system. Furthermore, the shared multi-threaded script used less memory than a single threaded script. Care to explain what is going on? It certainly isn't what I thought.

                     hash    memory            runtime
                     keys    sz      vsz       wall     user     sys
      ThreadShared   1K      22.6    90.1      0:02.6   0:01.6   0:00.1
      ThreadCopy     1K      23.2    93.0      0:02:9   0:01.4   0:00.8
      No Threads     1K       0.9     3.5      0:00.8   0:00.7   0:00.01
      ThreadShared   1M      41.5   166.3      0:12.2   0:10.7   0:00.9
      ThreadCopy     1M      untestable - froze up my system
      No Threads     1M      45.0   180.0      0:06.9   0:04.8   0:02.1

      *these numbers and their relationships are stable through multiple runs so they are unlikely to be artifacts of some sort of transient system state.

        I hardly call making a reasoned guess "just guessing"

        What is the difference? Esp when you don't preface your guesses as such? And they're frequently wrong?

        I hardly call making a reasoned guess "just guessing".

        Sorry. But when the "reasoning" is nothing more than unfounded, unverified, 'pluck it out of thin air' speculation, there is no difference.

        Care to explain what is going on? It certainly isn't what I thought.

        No need. You just did.

        • You guessed at what the problem might be. You made no attempt to verify your guess.
        • You wrote a mountain of horribly complicated and convoluted code, with a frankly horrible interface.
        • You used an architecture that could never address the stated problem, and that could never be implemented efficiently.

          Your current implementation uses 100% of one core when doing absolutely nothing at all.

        • And then you threw it out there as a "solution", without having made even the most basic of tests that it met its stated goals.

        And now you're going to make a big issue of the way I choose to bring these matters to your attention to divert from the real problem.

        Oh! And BTW. There is no "initialization wipe-out bug". There is a clearly documented breach of new user expectations, that is easy to understand, and that has a documented solution. See shared_clone().

        That said, once you've written a few real applications, you'll find that it is rarely necessary to clone a non-shared data structure into a shared copy.
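
        For readers who have not used it, a minimal sketch of shared_clone() on a toy nested structure. The data and the variable names ($plain, $shared, $found) are invented for illustration and this is plain Perl data, not the OP's XS object. shared_clone() returns a deep, shared copy and leaves the original untouched.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical pre-initialized, non-shared structure.
my $plain = {
    name => 'root',
    kids => [ { name => 'foo' }, { name => 'bar' } ],
};

# Deep copy into shared memory; nested hashes and arrays are shared too.
my $shared = shared_clone($plain);

# A worker searches the shared tree and marks the node it found.
my $found = threads->create(sub {
    my ($hit) = grep { $_->{name} eq 'foo' } @{ $shared->{kids} };
    $hit->{seen} = 1;       # written to the shared node, so the parent sees it
    return $hit->{name};
})->join;

print "found: $found\n";
print "seen flag in parent: $shared->{kids}[0]{seen}\n";
```

        The copy is made once, up front, so the original $plain and the shared copy evolve independently afterwards.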


