
This happens because, when 10 threads are running parallely and i wait for a 2nd thread, suppose, to join. Meanwhile 3rd or 4th or 8th(anything till 10) might have finished running. Once 2nd joins and main thread tries to call join() on next thread object, in the array(either returned by threads->list or i keep thread object in a array), which no more exists, or no clue whether the thread is joinnable.

This is a red herring. When non-detached threads end, they wait until you call join on them before being cleaned up. You do not need to check anything before calling join. If the thread has ended before you call join, it will return immediately. If the thread is still running, it will block until the thread ends. This is how they are designed to work. Your problem lies elsewhere.
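A minimal sketch of that behaviour, assuming a perl built with ithreads. The delays and the @threads array are made up for illustration. The second and third threads finish while the main thread is still blocked on the first, yet joining them in creation order needs no checks at all:

```perl
use strict;
use warnings;
use threads;

# Thread 1 runs longest; threads 2 and 3 finish while we are still
# blocked in the first join.
my @delays = ( 2, 0, 1 );

my @threads;
for my $delay ( @delays ) {
    push @threads, threads->create( sub { sleep $delay; return "slept $delay" } );
}

# Join in creation order. A finished, non-detached thread hangs around
# until it is joined, so join() on it simply returns its result.
my @results = map { $_->join } @threads;

print "$_\n" for @results;
```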

You keep posting these snippets of code, but they are so dependent upon the rest of the program, which you are not posting, that it is impossible for anyone to run them in order to try and help. They are also full of lumps of commented-out code, rambling comments that wrap 3 times and, worst of all, all this insane "logger" crap which completely obscures the structure of the code. It is not surprising that you cannot get this to work, as you cannot see what it is that your own code is doing.

So, a lot of criticism which you may not like, so I'll try to show you that the criticism can help.

Here is your code above, with all the crap stripped away, a few extra spaces and blank lines etc.

sub _replicate{
    my $ref = shift;
    
    foreach my $sc ( @{ $ref } ) {
        next unless (defined $sc);

        mkdir( $LOG_FOLDER . "/" . $sc->{sc_name} );
        my @thr_arr = ();

        foreach my $robj( @{ $sc->{ rsync} } ){
            $robj->{thr} => 'running';
           
            my $th = threads->create( \&worker, $robj );
            push @thr_arr, $th->tid;
        }
        $_->join for @thr_arr;
    }
}

sub worker{
    my $robj = shift;
    
    my( $rsync, $server, $from, $to ) = @{ $robj->{ elements } };
    my $alt_server = $RSYNC_CONN_STR_2;

    for my $i ( 0 .. $MAX_REPL_ATTEMPT ){
        my $rsync_cmd = $rsync . $server . $from . $to;
        
        `$rsync_cmd`;

        if ($?){ 
            $rsync_cmd = $rsync . $server . $from . $to;
            $server = ( $i % 2 ) ? $RSYNC_CONN_STR_1 : $RSYNC_CONN_STR_2;
            sleep(120);
        }
        else{
            $robj->{status} = "Completed";
            last;
        }
    }
    $robj->{thr} = 'done';
}

Now the structure and essentials of the code are clear and easy to follow, and it is easy to pick out several problems:

You create your thread here: my $th = threads->create( \&worker, $robj );
but then you do: push @thr_arr, $th->tid;
which means that @thr_arr contains a list of thread ids, not thread objects!
So when you come to join your threads, you are calling the method join() on a number, and that obviously isn't going to work.
Now that should not segfault. You should be seeing an error message, (assuming you are using strict & warnings) along the lines of:
```
Can't call method "join" without a package or object reference at...
```
And you should have seen that error the very first time you ran this code, and every time you've run it since.
Instead of fixing the actual problem, you've guessed as to what the cause might be and basically wasted your time trying to fix a problem that doesn't exist.
Please note: I'm not saying your code will work once you've fixed that problem. I am saying that it will never work until you do.
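A sketch of the corrected bookkeeping, on the assumption that nothing else in the program needs the thread ids. The demo_worker sub and the job names are hypothetical stand-ins. The only point is that the array holds what threads->create returned:

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for the real worker: returns a status string.
sub demo_worker {
    my $job = shift;
    return "$job: Completed";
}

my @thr_arr;
for my $job ( qw( alpha beta gamma ) ) {
    my $th = threads->create( \&demo_worker, $job );
    push @thr_arr, $th;    # the thread object itself, not $th->tid
}

# join() is a method on the thread object, and it hands back whatever
# the worker returned.
my @status = map { scalar $_->join } @thr_arr;

print "$_\n" for @status;
```

If all you have to hand is a tid, threads->object( $tid ) will look the object up for a thread that has not yet been joined or detached, but keeping the object in the first place is simpler.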
You are calling rsync using backticks: `$rsync_cmd`;, but you are doing nothing with any output produced.
That means you are having the system build a pipe and collect the output, and then just throwing it all away.
Have you heard of system?
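For comparison, a minimal sketch of the system version. Since the real rsync command line is not shown, it runs the current perl ($^X) as a stand-in child process, and the exit status of 3 is arbitrary:

```perl
use strict;
use warnings;

# Stand-in for the rsync command: a child perl that exits with status 3.
my @cmd = ( $^X, '-e', 'exit 3' );

# The list form of system() bypasses the shell and captures nothing;
# the child's stdout and stderr go wherever ours go.
system( @cmd );

my $exit_code;
if ( $? == -1 ) {
    warn "could not run $cmd[0]: $!\n";
}
elsif ( $? & 127 ) {
    warn sprintf "child died with signal %d\n", $? & 127;
}
else {
    $exit_code = $? >> 8;    # the child's own exit status
    print "child exited with status $exit_code\n";
}
```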
And now for the biggest problem, the design of your code in _replicate().
You have 2 nested loops. Within the outer loop you run the inner loop, which creates a bunch of threads all trying to contact the same server.
You then block until they all finish, with several retries and 120-second waits, before starting another bunch of threads to contact the next server. This is fundamentally bad design.
If one server is slow, or broken, with all your threads trying to talk to the same server, you will basically be doing a lot of nothing, when you could be talking to one or more of the other servers in parallel.
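One possible way around that, sketched here with the core Thread::Queue module: flatten every job for every server into a single queue and let a fixed pool of workers drain it, so a slow server only ties up the workers that happen to be talking to it. The job strings, the pool size of 4 and the run_job() stand-in are all assumptions for illustration; a real worker would run rsync with its retry loop.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical job list: every transfer for every server, in one flat list.
my @jobs;
for my $server ( qw( serverA serverB serverC ) ) {
    push @jobs, "$server/dir$_" for 1 .. 3;
}

# Stand-in for the real rsync worker body.
sub run_job {
    my $job = shift;
    return "$job done";
}

my $POOL_SIZE = 4;                     # assumed; tune to taste
my $work      = Thread::Queue->new;    # jobs waiting to be run
my $done      = Thread::Queue->new;    # results coming back

my @pool;
for ( 1 .. $POOL_SIZE ) {
    push @pool, threads->create( sub {
        # dequeue() blocks until work arrives; undef means stop.
        while ( defined( my $job = $work->dequeue ) ) {
            $done->enqueue( run_job( $job ) );
        }
    } );
}

$work->enqueue( @jobs );
$work->enqueue( (undef) x $POOL_SIZE );    # one terminator per worker
$_->join for @pool;

# All workers have finished, so drain the results without blocking.
my @finished;
while ( defined( my $result = $done->dequeue_nb ) ) {
    push @finished, $result;
}
print "$_\n" for sort @finished;
```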

If you are going to be doing multi-processing, whether through threads or forks, the secret is to start simple. Write your worker subroutine in a standalone, single threaded program, and make it work.

Once you've made sure it is working that way, then try running two copies concurrently using threads or forks.

Once you've got that working reliably, only then try to scale it up!
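As a sketch of that first step, here is a retry loop that can be exercised entirely on its own, with no threads involved. The names retry_command, $max_attempts and $pause are hypothetical, and the command is passed in as a code ref so that a fake can stand in for rsync while testing:

```perl
use strict;
use warnings;

# Call $run->( $server ) until it returns a zero exit status, cycling
# through the given servers. Returns the number of attempts used, or 0
# if every attempt failed.
sub retry_command {
    my( $run, $servers, $max_attempts, $pause ) = @_;

    for my $attempt ( 1 .. $max_attempts ) {
        my $server = $servers->[ ( $attempt - 1 ) % scalar @$servers ];
        return $attempt if $run->( $server ) == 0;
        sleep $pause if $pause && $attempt < $max_attempts;
    }
    return 0;
}

# Fake command for testing: the first server always fails.
my @tried;
my $fake = sub {
    my $server = shift;
    push @tried, $server;
    return $server eq 'primary' ? 1 : 0;
};

my $attempts = retry_command( $fake, [ 'primary', 'backup' ], 5, 0 );
print "succeeded after $attempts attempts (@tried)\n";
```

In the real program the code ref would wrap the system call and return $?. Once this behaves on its own, that is the time to try running two of them in threads.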

You asked whether you should move to using forks. If you have a native fork on the platform you are working on, then there is nothing obvious from the code you have posted that requires threads, so you probably could use forks.

But, on the basis of the code you've posted, I think that you are likely to have just as many problems trying to work in that environment as you are having with threads.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re^3: Segmentation fault: problem with perl threads by BrowserUk
in thread Segmentation fault: problem with perl threads by katharnakh
