Re^6: removing all threads..

by BrowserUk (Pope)
on Dec 10, 2012 at 23:30 UTC (#1008185)


in reply to Re^5: removing all threads..
in thread removing all threads..

I keenly (and painfully) remember getting segfault after segfault with threads::shared on a distributed cluster of RHEL 5 servers I had to work with about 6 months back. I eventually gave up and just used stock threads. The application was about 2K lines and I wasn't about to try to find out why threads::shared wouldn't work...everything looked fine, and I was following all the rules.

Could you show me the faulting code; here or privately?


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

RIP Neil Armstrong


Re^7: removing all threads..
by Tommy (Chaplain) on Dec 10, 2012 at 23:58 UTC

    I'd love to! Unfortunately, it's proprietary and confidential. I would have loved to get peer review of the codebase, but that was forbidden also. Secret sauce.

    Maybe I can cobble together a separate example that fails in a similar way...

    ...You know, actually I can't. I could never reproduce the problem on anything but those old RHEL machines, and I don't have access to them anymore (that project is done). The threads::shared code worked on all my own servers.

    What I can do is share a mock Storable file containing a data structure that in all likelihood would have made the original code/server puke. Passing structs like the one in that dummy storable is what made the segfaults happen.

    I'll need a few hours before I can get that. It might be as long as tomorrow.

    Maybe there's something to be learned here, which I would enjoy very much!

    --
    Tommy
    $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'
Re^7: removing all threads..
by Tommy (Chaplain) on Dec 11, 2012 at 14:47 UTC

    Alrighty. I'll give you the URL to download the mock storable file. You can check it out at the cmd line like so:

    perl -MData::Dumper -MStorable -e '$Data::Dumper::Indent = 1; $a = Storable::lock_retrieve("example.storable"); delete $a->{ $_ }->{data} for keys %$a; print Dumper $a;'

    ...Of course you'll want to NOT delete the data keys permanently because they contain binary data that are pertinent to the original functionality.
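
    A non-destructive variant of that inspection can be sketched as follows. It is only an illustration under assumptions: the file it reads is a tiny mock built on the spot, the record keys other than `data` (`img_a`, `img_b`, `name`) are hypothetical, and `summarise` is a made-up helper name. Instead of deleting the `data` keys, it works on a shallow copy and replaces each binary value with its length, so the retrieved structure stays intact.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(lock_store lock_retrieve);
use File::Temp qw(tempdir);
use Data::Dumper;

# Build a small mock file so the sketch is self-contained.
# The keys and values here are hypothetical, not the real file's contents.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/example.storable";
lock_store(
    {
        img_a => { name => 'a.png', data => "\x89PNG" . ( "\0" x 100 ) },
        img_b => { name => 'b.png', data => "\x89PNG" . ( "\0" x 250 ) },
    },
    $file
);

# Return a summary in which every binary 'data' value is replaced by a
# "<N bytes>" marker. The input structure itself is not modified.
sub summarise {
    my ($struct) = @_;
    my %summary;
    for my $key ( keys %$struct ) {
        my %copy = %{ $struct->{$key} };    # shallow copy of one record
        $copy{data} = sprintf '<%d bytes>', length $copy{data}
            if exists $copy{data};
        $summary{$key} = \%copy;
    }
    return \%summary;
}

my $struct = lock_retrieve($file);

local $Data::Dumper::Indent   = 1;
local $Data::Dumper::Sortkeys = 1;
print Dumper( summarise($struct) );
```

    Because only the copy is changed, the same `$struct` can still be handed to whatever code needs the binary payloads afterwards.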

    The rest I'll explain in CB, as I'd prefer the contents of the file and its structure to remain private.

    Many Thanks

    --
    Tommy
    $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'

      On my system, this loads the (n)storable file (reconstituted from the Dumper() .txt file you linked), into a shared hash structure and then uses four threads to reconstruct the images and output the size and bounds of those images:

      #! perl -slw
      use strict;
      use Data::Dump qw[ pp ];
      use GD;
      use threads;
      use threads::shared;
      use Thread::Queue;
      use Storable qw[ retrieve ];

      my $sem :shared;

      my $d = retrieve 'ex.nstored';
      my %data :shared = %{ shared_clone( $d ) };

      my $Q = new Thread::Queue;

      my @threads = map{
          async {
              my $tid = threads->tid;
              while( my $key = $Q->dequeue ) {
                  lock %{ $data{ $key } };
                  my $im = GD::Image->new( $data{ $key }{ 'data' } ) or die;
                  lock $sem;
                  printf "[$tid] size: %u x: %u y:%u\n", length( $data{ $key }{ 'data' } ), $im->getBounds;
              }
          }
      } 1 .. 4;

      $Q->enqueue( keys %data );
      $Q->enqueue( (undef) x 4 );
      $_->join for @threads;

      __END__
      C:\test\tommy>test
      [2] size: 26859 x: 345 y:400
      [3] size: 31463 x: 341 y:400
      [4] size: 35596 x: 341 y:400
      [1] size: 36991 x: 345 y:400
      [2] size: 33427 x: 345 y:400
      [4] size: 32584 x: 341 y:400
      [3] size: 34196 x: 345 y:400
      [2] size: 35245 x: 345 y:400
      [1] size: 36410 x: 345 y:400
      [4] size: 29900 x: 341 y:400
      [3] size: 30204 x: 345 y:400
      [2] size: 34803 x: 345 y:400
      [4] size: 35809 x: 345 y:400
      [1] size: 24890 x: 345 y:400
      [3] size: 37982 x: 345 y:400
      [2] size: 27071 x: 345 y:400
      [1] size: 29397 x: 341 y:400
      [4] size: 35311 x: 345 y:400
      [3] size: 39712 x: 345 y:400
      [2] size: 36052 x: 345 y:400
      [1] size: 35227 x: 345 y:400
      [4] size: 29317 x: 345 y:400
      [2] size: 35901 x: 345 y:400
      [3] size: 39274 x: 345 y:400
      [1] size: 37797 x: 345 y:400
      [4] size: 41571 x: 345 y:400
      [2] size: 36599 x: 345 y:400
      [3] size: 39924 x: 345 y:400
      [1] size: 33219 x: 345 y:400
      [4] size: 32058 x: 345 y:400
      [2] size: 34070 x: 341 y:400
      [1] size: 36155 x: 345 y:400
      [3] size: 34242 x: 345 y:400
      [4] size: 30914 x: 341 y:400
      [2] size: 35022 x: 345 y:400
      [1] size: 35794 x: 345 y:400
      [3] size: 31943 x: 344 y:400
      [4] size: 37375 x: 344 y:400
      [2] size: 39989 x: 345 y:400
      [1] size: 31005 x: 345 y:400
      [3] size: 27808 x: 345 y:400
      [4] size: 35477 x: 341 y:400
      [2] size: 42335 x: 344 y:400
      [1] size: 28935 x: 345 y:400
      [3] size: 36160 x: 345 y:400
      [4] size: 35398 x: 341 y:400
      [1] size: 39711 x: 345 y:400
      [3] size: 34969 x: 345 y:400
      [2] size: 33664 x: 345 y:400
      [4] size: 35818 x: 345 y:400

        ....And that is why it should come as no surprise to anyone that you have 130911 XP at perlmonks, probably more once I hit the "submit" button; because you are awesome. Your code speaks for itself.

        ...And with no disrespect I have to say that the awesomeness of that code design is very similar in practice to the approach I used for my own assignment at $work, only I used the threads a little differently. There were two primary threads that shared a queue -- a downloader and a queue processor, and the downloader spawned five to ten threads to process downloads in parallel and return their downloaded payload as scalars to the primary downloader thread. The downloader then put together the datastructure that became the queue which was then consumed by the queue processor thread.
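
        The layout described above (a downloader thread that fans out to worker threads and feeds a queue, plus a separate queue processor thread) might be sketched roughly like this. It is not the original proprietary code: `fake_download`, `run_pipeline`, the record fields `url` and `data`, and the example URLs are all hypothetical, one worker is spawned per URL rather than a fixed pool of five to ten, and a threads-enabled perl with a Thread::Queue recent enough to enqueue hash references is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for a real download: returns the payload as a scalar.
sub fake_download {
    my ($url) = @_;
    return "payload for $url";
}

sub run_pipeline {
    my (@urls) = @_;
    my $work_q = Thread::Queue->new;    # downloader -> queue processor

    # Queue processor: consumes records until it sees the undef terminator,
    # then returns a hash of url => payload length.
    my $processor = threads->create(
        { context => 'scalar' },
        sub {
            my %sizes;
            while ( defined( my $rec = $work_q->dequeue ) ) {
                $sizes{ $rec->{url} } = length $rec->{data};
            }
            return \%sizes;
        }
    );

    # Downloader: spawns one worker per URL, collects each worker's scalar
    # payload via join, builds a record, and enqueues it for the processor.
    my $downloader = threads->create(
        sub {
            my @workers = map {
                my $url = $_;
                threads->create( { context => 'scalar' },
                    sub { fake_download($url) } );
            } @urls;
            for my $i ( 0 .. $#urls ) {
                my $payload = $workers[$i]->join;
                $work_q->enqueue( { url => $urls[$i], data => $payload } );
            }
            $work_q->enqueue(undef);    # tell the processor to stop
            return;
        }
    );

    $downloader->join;
    return $processor->join;
}

my $sizes = run_pipeline( map { "http://example.invalid/item$_" } 1 .. 5 );
printf "%s => %d bytes\n", $_, $sizes->{$_} for sort keys %$sizes;
```

        Only the queue is shared between the two primary threads here; the workers hand their results back through `join`, so no explicit `lock` calls are needed in this sketch.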

        Everything worked fine on my RHEL 6 machines and my Debian servers. It even worked on the Ubuntu workstation I was developing on at the time... But it puked on those proprietary custom-built RHEL 5 beasties. I was heartbroken, because it was clear that I had to abandon the code while it was so pure and elegant. It became much more cumbersome than one might think once I had to start bolting on Storable and working out my own locking mechanisms. Sure, I had the core files and I might have been able to dig deeper into the problem with those, but I didn't even want to start poking at them for fear that the approach was ultimately toxic. Maybe that was a mistake, but I had a looming deliverable and I couldn't take the risk of burning through development time trying to work out an already problematic methodology (at least in that environment). In the end, it was my name on the code when I handed it over, and naturally it had to function without flaw.

        At the end of the day, upper management didn't give an END {} block how the code was written; they cared that it met client requirements, that it required very little RAM, that it ran bullet-proof for months at a time, and that everybody looked good for getting it done on time.

        I always lamented what I saw as a lost opportunity, but took it as a lesson learned to -- at least for me -- beware of threads::shared on old perls.

        Thank you for taking the time to put together that code sample

        --
        Tommy
        $ perl -MMIME::Base64 -e 'print decode_base64 "YWNlQHRvbW15YnV0bGVyLm1lCg=="'
