Re^4: Program Design Around Threads

by aeaton1843 (Acolyte)
on Mar 06, 2013 at 21:21 UTC


in reply to Re^3: Program Design Around Threads
in thread Program Design Around Threads

That's just it though. This isn't true: "Each thread is using a different file, so no conflicts or ordering problems arise. No need for locking or semaphores or synchronisation." There are as many threads running as there are machines and commands. For example:

tid1 -> machine1 -> "show running-config" takes 30 seconds to get reply.

tid2 -> machine2 -> "show running-config" takes 30 seconds to get reply

tid3 -> machine1 -> "show vlan" takes 5 seconds to get reply.

tid4 -> machine2 -> "show vlan" takes 5 seconds to get reply.

Since tids 3 and 4 finish first, my two output files now have show vlan at the top instead of show running-config, because those threads finished before tids 1 and 2 and wrote their output into machine1.txt and machine2.txt first. Maybe I am missing something?
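The ordering problem described above can be reproduced without any network access. The following is a minimal simulation, not real threaded code: it assumes each thread appends to its machine's file at the moment its reply arrives, so sorting the tasks by reply time gives the order of the writes. The task list and the `%file` hash are illustrative stand-ins for the real threads and output files.

```perl
use strict;
use warnings;

# Hypothetical task list mirroring the example: [tid, machine, command, seconds to reply].
my @tasks = (
    [ 1, 'machine1', 'show running-config', 30 ],
    [ 2, 'machine2', 'show running-config', 30 ],
    [ 3, 'machine1', 'show vlan',            5 ],
    [ 4, 'machine2', 'show vlan',            5 ],
);

# Simulated output files: machine name => commands in the order they were written.
my %file;

# Assumption: a thread writes as soon as its reply arrives, so the
# fastest replies are appended first, whatever order the threads started in.
for my $task ( sort { $a->[3] <=> $b->[3] } @tasks ) {
    my ( undef, $machine, $command ) = @$task;
    push @{ $file{$machine} }, $command;
}

for my $machine ( sort keys %file ) {
    print "$machine.txt: ", join( ' | ', @{ $file{$machine} } ), "\n";
}
```

Both simulated files end up with show vlan ahead of show running-config, which is the symptom reported.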


Re^5: Program Design Around Threads
by BrowserUk (Pope) on Mar 06, 2013 at 22:03 UTC
    There are as many threads running as there are machines and commands

    Only if you run the different commands for each machine from different threads. That's the wrong way to do it.

    If you look at the code snippets I posted, one thread connects to one machine, opens one file, and runs all the commands for that machine, sequentially. That way, there is no possibility of ordering problems or file conflicts.
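The snippets referred to are not part of this node, so what follows is only a rough sketch of the one-thread-per-machine shape, not the code originally posted. The machine and command lists, the temporary output directory, and `run_command` (a stand-in for whatever Telnet or SSH call actually talks to the device) are all assumptions made for illustration. It needs a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use File::Temp qw(tempdir);

# Hypothetical inputs; real code would take these from its own configuration.
my @machines = qw(machine1 machine2);
my @commands = ( 'show running-config', 'show vlan' );
my $outdir   = tempdir( CLEANUP => 1 );

# Hypothetical stand-in for the real session call to the device.
sub run_command {
    my ( $machine, $command ) = @_;
    return "output of '$command' from $machine";
}

# One worker per machine: it opens one file and runs every command in list order.
sub worker {
    my ($machine) = @_;
    my $path = "$outdir/$machine.txt";
    open my $fh, '>', $path or die "Cannot open $path: $!";
    for my $command (@commands) {
        print {$fh} "### $command\n", run_command( $machine, $command ), "\n";
    }
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Each file has exactly one writer, so no locks or semaphores are involved.
my @threads = map { threads->create( \&worker, $_ ) } @machines;
my @paths   = map { $_->join } @threads;
print "wrote $_\n" for @paths;
```

Because only one thread ever touches a given file, the commands appear in each file in the order of `@commands`, however long each reply takes.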

    Maybe I am missing something?

    Please read my last post slowly and thoroughly. Making multiple connections to each machine from different threads to run single commands is both inefficient and the source of your problems.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Please don't take this the wrong way. You are hitting me with the clue stick and I want to make sure I understand everything here.

      As I understand your code, it gives each machine in the machines list a thread and then runs through each of the commands in the commands list sequentially.

      What I don't understand is why your statement, "Making multiple connections to each machine from different threads to run single commands is both inefficient and the source of your problems", is true, except for the last part, on which we both agree. Say I have three commands per device, each of which takes on average 15 seconds to retrieve. If I create a thread for each command, I wait 15 seconds to finish all three commands for that machine, given a high enough limit on running threads. If I do it your way, I no longer have the sync issues, but I now wait an extra 30 seconds to finish each machine. Granted, you have 10 machine threads running at once, each working through its commands sequentially.

      We agree about the source of my problems. I am trying to further wrap my head around why saving a possible 30 seconds per device in this scenario was a less than optimal approach. That is other than the fact that it causes me a lot of synchronization issues. So be it if that is the answer. At least I found out the right way to go about the problem even if it takes a bit longer to get all the output.

      I do appreciate the comments.

        I am trying to further wrap my head around why saving a possible 30 seconds per device in this scenario was a less than optimal approach. That is other than the fact that it causes me a lot of synchronization issues.

        Okay. Using your numbers: 100 machines; 3 commands; 15 seconds per command; and 10 concurrent threads.

        • Your way:

          You process 10 commands (3 1/3 machines) every 15 seconds: 100 / 3.333 * 15 / 60 = 7.5 minutes.

        • My way:

          I process 10 machines every 45 seconds: 100 / 10 * 45 / 60 = 7.5 minutes.

        But: I've spawned 100 threads and made 100 connections. No locking, nor waiting, nor syncing to slow things down.

        You've spawned 300 threads and made 300 connections. And you had to acquire locks and wait for them.

        Given the IO-bound nature of the problem, the locking might not slow you down too much -- assuming that you can get it right without creating deadlocks, livelocks, priority inversions et al. -- but you've definitely consumed 2 or 3 times as much CPU; caused 3 times as much network traffic; put 3 times the load on the remote machines; and consumed more memory; to achieve the same overall elapsed time.

        It just isn't worth the hassle.
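The arithmetic in this comparison can be checked with a few lines of Perl. This is only a sketch under the figures quoted above (100 machines, 3 commands, 15 seconds per command, 10 concurrent threads); the variable names are illustrative, and it ignores connection setup and locking overhead.

```perl
use strict;
use warnings;

# Figures taken from the post above.
my ( $machines, $commands, $seconds, $pool ) = ( 100, 3, 15, 10 );

# One thread per command: $pool commands complete every $seconds seconds.
my $per_command_minutes = ( $machines * $commands / $pool ) * $seconds / 60;

# One thread per machine: $pool machines complete every $commands * $seconds seconds.
my $per_machine_minutes = ( $machines / $pool ) * ( $commands * $seconds ) / 60;

printf "per command: %.1f min, %d threads and connections\n",
    $per_command_minutes, $machines * $commands;
printf "per machine: %.1f min, %d threads and connections\n",
    $per_machine_minutes, $machines;
```

Both approaches come out at 7.5 minutes of elapsed time; only the thread and connection counts differ (300 against 100).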


