PerlMonks  

Re^5: Multiprocess - child process cannot be finished successfully

by bliako (Monsignor)
on Sep 09, 2022 at 21:50 UTC


in reply to Re^4: Multiprocess - child process cannot be finished successfully
in thread Multiprocess - child process cannot be finished successfully

It really depends on what kind of processing you are doing in the child and whether $filename should be renamed only outside the loop, because that's what you just did when you added that missing curly bracket. Is each child writing to $filename cumulatively, and you want to rename it only when all children have finished writing to it? To me that does not sound right. But I don't know what you are trying to do, and your code makes it difficult to guess. And the alarm call, setting alarm(60) and then canceling it (alarm(0), inside or outside the loop?), makes it even more difficult ... Now that you have the fork semantics working, try to isolate what you want to go inside the alarm in a new function, and ask yourself when files should be closed, alarms cancelled and files renamed. Inside or outside the loop may not matter to Perl, and it may not be a syntax error, but you may end up blowing up your cpu :) So, step back and think the problem through on paper without writing code for a while.

Re^6: Multiprocess - child process cannot be finished successfully
by wonderG (Novice) on Sep 10, 2022 at 02:04 UTC
    Thank you very much. I want to rename $filename after all child processes have finished writing. Each child process writes to $filename cumulatively. The purpose of the script is to test the network connection to every host. The "for" loop creates child processes that test the connections concurrently and, meanwhile, write to $filename. The alarm is for the network connection timeout: if the connection still cannot be set up after 60s, $SIG{ALRM} is invoked to print an error message.
    use strict;
    use Fcntl qw(:DEFAULT :flock);
    use FindBin qw($Bin);
    use lib "$Bin";
    use JSON;
    use POSIX;

    my $JSON_FILE = $ARGV[0];
    open my $file, '<', $JSON_FILE or die("$JSON_FILE Could not open file: $!\n");
    my $data;
    eval {
        $data = decode_json(<$file>);
        print $data;
    };
    if ( $@ ){
        print "Json file parsing failed.\n";
    }
    my $targets = $data;
    my $filename = "/var/log/a.tmp";
    my $newfilename = "/var/log/a.log";
    open (my $fh, ">", $filename) or die "Could not open file '$filename' $!\n";
    my $time = strftime('%d-%m-%Y %H:%M:%S',localtime);

    for (my $index=0; $index <= $#$targets; $index++){
        my $array = $targets->[$index]{label};
        my $hostaddress = $array->{ip};
        $time = strftime('%d-%m-%Y %H:%M:%S',localtime);
        defined(my $pid = fork) or die "fork failed: $!";
        unless( defined($pid) ) {
            flock $fh, LOCK_EX;
            print $fh "Can't execute check: can't fork(): $!\n";
            flock $fh, LOCK_UN;
        }
        unless($pid) { # child
            print "child: $$\n";
            my $var;
            eval {
                $var = #setup a network connection...;
                $SIG{ALRM} = sub {
                    flock $fh, LOCK_EX;
                    print $fh "\n";
                    flock $fh, LOCK_UN;
                };
                alarm(60);
            };
            if( $@ ) {
                flock $fh, LOCK_EX;
                print $fh "Connection failed $@\n";
                flock $fh, LOCK_UN;
            }
            unless( $var ) {
                flock $fh, LOCK_EX;
                print $fh "Connection failed";
                flock $fh, LOCK_UN;
            }
            else {
                flock $fh, LOCK_EX;
                print $fh "Connection failed\n";
                flock $fh, LOCK_UN;
            }
            close($fh);
            exit; # exit child process
        }
        waitpid($pid,0);
    }
    print "Parent Process";
    close($fh); # Close file handler $fh
    rename ($filename, $newfilename) or die "Error in renaming $!";
    alarm(0);
      I'm still not really sure what you are trying to do. It's been a long time since I played with any network code, so I wrote a little code to launch 5 children, wait for them all to finish, and then rename the output file. It's not clear whether you intend to have X processes running in parallel at all times. The code would need some mods for that.

      Your code here looks backwards. You need the alarm because presumably the network setup is a blocking operation? You need to set up the alarm and turn it on BEFORE you enter into code that potentially will "freeze".

      eval {
          $var = #setup a network connection...;
          $SIG{ALRM} = sub {
              flock $fh, LOCK_EX;
              print $fh "\n";
              flock $fh, LOCK_UN;
          };
          alarm(60);
      };
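A minimal sketch of the ordering described above: install the handler, start the alarm, and only then enter the blocking call. The helper name with_timeout is hypothetical, and the 4-argument select is only a stand-in for the real connection code, which the thread does not show. It assumes a Unix-like system where alarm interrupts blocking calls.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but give up after $seconds.
# Returns (1, $result) on success or (0, $error_text) on failure.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };  # 1. install the handler
        alarm($seconds);                              # 2. start the timer
        $result = $code->();                          # 3. the call that may block
        alarm(0);                                     # 4. finished in time: cancel
        1;
    };
    my $err = $@;
    alarm(0);    # also cancel if $code died on its own before step 4
    return $ok ? (1, $result) : (0, $err);
}

# Stand-in for a slow connection attempt: blocks for 3 seconds.
my ($ok, $value) = with_timeout(1, sub { select(undef, undef, undef, 3); 'connected' });
print $ok ? "ok: $value\n" : "failed: $value";
```

Putting the handler and the alarm inside the eval means the die from the handler lands in $@, so the caller can log the timeout in one place.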
      I would suggest you make a safe_print() like I did to cut down on the "noise".
      I don't understand what these targets are? my $array = $targets->[$index]{label};
      This simple test runs on Win10. Should be ok on Unix also.

      Update: Added a time stamp to the original code, which is preserved in readmore tags.

      use strict;
      use warnings;
      use Fcntl qw(:flock);
      use File::Copy 'move';
      use POSIX "sys_wait_h"; #for waitpid FLAGS
      use Time::Local;
      $|=1;

      my @child_sleeps = qw(10 15 12 5 12);
      my $start_epoch = time();

      # Fire off number of child processes equal to the
      # number of elements in @child_sleeps;
      # Then the parent who started these little guys,
      # goes into a blocking wait until they all finish
      # Each child can return a status code via exit($code_number).
      # However, this code doesn't use that and instead assumes
      # that children and the parent are all writing to a common
      # file shared via a cooperative flock

      $SIG{CHLD} = 'IGNORE';

      open(my $fh_log, '>>', "AlogFile.txt") or die "unable to open AlogFile.txt $!";
      #$fh_log->autoflush; #not needed this is automatic before locking or unlocking a file!

      foreach my $sleep_interval (@child_sleeps)
      {
          if(my $pid = fork)
          {
              # parent
              safe_print ($fh_log, "Spawned child $pid lasting $sleep_interval seconds\n");
          }
          elsif(defined $pid ) # pid==0
          {
              # child
              safe_print ($fh_log, "This is child pid $$. I will sleep for $sleep_interval seconds!\n");
              sleep($sleep_interval);
              safe_print ($fh_log, "Child $$ time is up!! I am gonna croak!\n");
              exit(0);
          }
          else
          {
              # fork failed pid undefined
              die "MASSIVE ERROR - FORK FAILED with $!";
          }
      }

      ### now wait for all children to finish, no matter who they are
      1 while wait != -1 ; # avoid zombies this is a blocking operation

      safe_print ($fh_log, "Parenting talking...all my children are dead! Hooray!\n");
      close $fh_log; #must close file before renaming it!
      unlink "AlogFile.back" if ( -e "AlogFile.back");
      move "AlogFile.txt", "AlogFile.back" or die "unable to rename log file! $!";
      print "A happy ending!\n";

      sub safe_print
      {
          my ($fh, @text) = @_;
          my $now_epoch = time();
          my $delta_secs = $now_epoch - $start_epoch;
          flock $fh, LOCK_EX or die "flock can't get lock $!";
          print $fh "$delta_secs secs: $_" foreach @text;
          print "$delta_secs secs: $_" foreach @text;
          flock $fh, LOCK_UN or die "flock can't release lock $!";
      }
      __END__
      Contents of AlogFile.back after a run:
      0 secs: Spawned child -10680 lasting 10 seconds
      0 secs: This is child pid -10680. I will sleep for 10 seconds!
      0 secs: Spawned child -23840 lasting 15 seconds
      0 secs: This is child pid -23840. I will sleep for 15 seconds!
      0 secs: Spawned child -14556 lasting 12 seconds
      0 secs: This is child pid -14556. I will sleep for 12 seconds!
      0 secs: Spawned child -13972 lasting 5 seconds
      0 secs: This is child pid -13972. I will sleep for 5 seconds!
      0 secs: Spawned child -18600 lasting 12 seconds
      0 secs: This is child pid -18600. I will sleep for 12 seconds!
      5 secs: Child -13972 time is up!! I am gonna croak!
      10 secs: Child -10680 time is up!! I am gonna croak!
      12 secs: Child -14556 time is up!! I am gonna croak!
      12 secs: Child -18600 time is up!! I am gonna croak!
      15 secs: Child -23840 time is up!! I am gonna croak!
      15 secs: Parenting talking...all my children are dead! Hooray!
      A happy ending!
      The code that you are using to test the network connection would be highly relevant here. There are timeout values for the typical socket connection request, etc. You may not need to use an ALRM.
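As an illustration of a socket-level timeout, here is a sketch using the Timeout option of the core IO::Socket::INET module. It assumes the check is a plain TCP connect, which the thread does not confirm; the helper name can_connect and the throwaway local listener are made up for the demo.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: returns 1 if a TCP connection to $host:$port
# can be opened within $timeout seconds, 0 otherwise.
sub can_connect {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,   # connect gives up after this many seconds
    );
    return 0 unless $sock;      # refused, unreachable, or timed out
    close $sock;
    return 1;
}

# Demo against a listener opened on a free local port, so the example
# does not depend on any outside host.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,             # let the OS choose a free port
    Proto     => 'tcp',
    Listen    => 1,
) or die "cannot listen: $!";
my $port = $listener->sockport;

print "port $port reachable: ", can_connect('127.0.0.1', $port, 5), "\n";
```

With the timeout handled by the socket constructor, the child needs neither a signal handler nor an alarm(0) to remember.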

      Also, I don't see much need to write to the log file when flock() or fork() fails. Perhaps you will do just as well with "die" for those super massive errors? Something is super wrong if the O/S cannot fork!

Re^6: Multiprocess - child process cannot be finished successfully
by wonderG (Novice) on Sep 10, 2022 at 08:08 UTC
    I also ran a test: if the network connection cannot be set up, the checks do not run concurrently. I pasted part of the log below; the lines were printed one at a time. But I want to get the connection status of all hosts concurrently...
    10-09-2022 06:24:10 1.1.1.1 Can't connect to 1.1.1.1
    10-09-2022 06:25:10 2.2.2.2 Can't connect to 2.2.2.2
    10-09-2022 06:26:10 3.3.3.3 Can't connect to 3.3.3.3
    10-09-2022 06:27:11 4.4.4.4 Can't connect to 4.4.4.4
    ...
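The one-minute spacing in this log is what the waitpid($pid,0) inside the "for" loop of the posted script would produce: the parent blocks on each child before forking the next one, so every 60-second timeout runs in turn. A minimal sketch of the alternative, forking every child first and waiting only afterwards, follows. The host list, check_host and run_concurrently are placeholders, since the real connection test is not shown in the thread.

```perl
use strict;
use warnings;

$| = 1;   # unbuffered STDOUT, so children do not re-print inherited output

# Placeholder targets; the real script reads them from a JSON file.
my @hosts = qw(1.1.1.1 2.2.2.2 3.3.3.3);

# Hypothetical stand-in for the real connection test.
# Its return value becomes the exit status of the child.
sub check_host {
    my ($host) = @_;
    sleep 2;      # pretend the attempt takes a while
    return 0;
}

# Fork one child per target, then wait for all of them.
# Returns a hash mapping each child pid to its exit status.
sub run_concurrently {
    my ($check, @targets) = @_;
    my @pids;
    for my $host (@targets) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {             # child: do the work and leave
            exit $check->($host);
        }
        push @pids, $pid;            # parent: remember the pid, do not wait yet
    }
    my %status;
    for my $pid (@pids) {            # all children are already running here
        waitpid($pid, 0);
        $status{$pid} = $? >> 8;
    }
    return %status;
}

my $start  = time;
my %status = run_concurrently(\&check_host, @hosts);
printf "%d children finished in about %d seconds\n",
    scalar(keys %status), time - $start;
```

Closing and renaming the log file would then go after the second loop, once every child has been reaped.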
