Re^2: Converting a parallel-serial shell script

Re^3: Converting a parallel-serial shell script by kubrat (Scribe) on Sep 22, 2008 at 16:37 UTC
I think you have missed the motivation behind my post. I just wanted to share my thoughts and experience in approaching this type of problem, which is why I didn't give a code example. It is probably my fault for the way I expressed myself, but Corion seems to have got it right. Your solution proves me wrong. It is neat and elegant and I really like it. But I still think that I make good points when considering the problem of parallelization in more general terms.

Finally, could you perhaps shed some light on how what I am talking about is not scalable? After all, you could fork as many processes as you need. Portable? I am not sure how portable fork and semaphores really are. fork() works for me on Windows with ActivePerl, although it appears to be using threads behind the scenes. Does that mean that you get the speed benefits of threads without the disadvantage of having to be careful with shared data? Efficient? I don't think there will be a noticeable difference between a forking and a threading implementation.
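On the shared-data question, here is a minimal sketch (not taken from any of the posts in this thread) showing that a forked child works on its own copy of the parent's variables. The variable name `$counter` is made up for illustration. As far as the perlfork documentation describes it, the Windows fork emulation also clones the interpreter's data for each pseudo-process, so the same isolation should be expected there, but that is worth verifying on your own platform.

```perl
use strict;
use warnings;

my $counter = 0;    # hypothetical piece of state, for illustration only

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child process: this assignment changes only the child's copy.
    $counter = 42;
    exit 0;
}

# Parent process: wait for the child, then look at our own copy.
waitpid( $pid, 0 );
print "parent still sees counter = $counter\n";
```

The parent prints 0, because nothing the child does to its variables is visible to the parent. Any result has to be passed back explicitly, for example through a pipe.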
Re^4: Converting a parallel-serial shell script by BrowserUk (Patriarch) on Sep 22, 2008 at 21:21 UTC
"Finally, you could perhaps shed some light on how what I am talking about is not scalable ... Portable? .... Efficient?"

I'm not for one moment going to suggest that a forking solution, where (native) forks are available, couldn't be just as scalable and efficient. Or even more so.

But until you have implemented such a solution, you will not be aware of how hard it can be to make it so. And you won't truly appreciate the simplicity of the threaded version until you see the complexity of the forked version.

And when you post your solution, we can try running them on *nix and Windows and compare them on those three criteria.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."
Re^5: Converting a parallel-serial shell script by kubrat (Scribe) on Sep 26, 2008 at 11:11 UTC
Here is your solution rewritten to use fork instead of threads:

    use strict;
    use warnings;
    use Proc::Fork;
    use IO::Pipe;
    use IO::Select;

    my $kids = 4;    # How many children we'll create

    $SIG{CHLD} = 'IGNORE';
    my $sel = IO::Select->new;

    # Deal the input files out to the children round-robin.
    my @payloadlist;
    for ( my $i = 0; $i < @ARGV; $i++ ) {
        my $idx = $i % $kids;
        push @{ $payloadlist[$idx] }, $ARGV[$i];
    }

    my $start = localtime();
    print "Started $start $$\n";

    foreach my $payload (@payloadlist) {
        my $pipe = IO::Pipe->new;
        child {
            my $p = $pipe->writer;
            foreach my $filename ( @{$payload} ) {
                # toTSV returns nothing on failure; skip that file.
                my $outFile = toTSV($filename) or next;
                my $now = localtime();
                print "$$ $now - Completed $filename\n";
                print $p $outFile . "\n";
            }
            exit;
        };
        my $r = $pipe->reader;
        $sel->add($r);
    }

    # One unit of work per child: done when every pipe has been drained.
    my $workdone = 0;
    while ( $workdone < @payloadlist ) {
        while ( my @ready = $sel->can_read ) {
            foreach my $fh (@ready) {
                while (<$fh>) {
                    chomp $_;
                    system("echo importing $_\n");
                    unlink $_;
                }
                $sel->remove($fh);
                $fh->close;
                $workdone++;
            }
        }
    }

    my $end = localtime();
    print "Completed $end\n";

    sub convert { $_[0]; }

    sub toTSV {
        my $filename = shift;
        my $outFile  = $filename . '.tsv';
        open my $fhIn, '<', $filename
            or do { warn "$filename : $!"; return };
        open my $fhOut, '>', $outFile
            or do { warn "$outFile : $!"; return };
        while (<$fhIn>) {
            my $tsv = convert($_);
            print $fhOut $tsv;
        }
        close $fhOut;
        close $fhIn;
        return $outFile;
    }

Sure, it is more complicated, but it has its advantages too. First, nothing is implicitly shared, and second, it makes you think about the best way to divide the workload. The more evenly the workload is divided between the nodes, the better.
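As an illustration of the point about dividing the workload evenly, here is a rough sketch of one alternative to the round-robin split (`$i % $kids`) used in the script above: hand out the largest files first, each to whichever child currently has the least work. This is a common greedy heuristic, not something proposed in this thread. The helper name `balance_by_size` and the file names and sizes are made up for the example. With real files the sizes could come from `-s $file`, assuming file size is a reasonable proxy for conversion time.

```perl
use strict;
use warnings;

# Hypothetical helper: split files into $kids buckets by size.
# $size_of is a hashref of file name => size in bytes.
sub balance_by_size {
    my ( $kids, $size_of ) = @_;

    my @buckets = map { +{ total => 0, files => [] } } 1 .. $kids;

    # Largest files first; ties are broken by name so the result is repeatable.
    my @by_size = sort { $size_of->{$b} <=> $size_of->{$a} or $a cmp $b }
        keys %$size_of;

    for my $file (@by_size) {
        # Pick the first bucket with the smallest running total.
        my $lightest = $buckets[0];
        for my $bucket (@buckets) {
            $lightest = $bucket if $bucket->{total} < $lightest->{total};
        }
        push @{ $lightest->{files} }, $file;
        $lightest->{total} += $size_of->{$file};
    }
    return @buckets;
}

# Made-up sizes, for illustration only.
my %size_of = (
    'a.log' => 900,
    'b.log' => 100,
    'c.log' => 500,
    'd.log' => 400,
    'e.log' => 100,
);

my @buckets = balance_by_size( 2, \%size_of );
for my $n ( 0 .. $#buckets ) {
    printf "child %d: %d bytes (%s)\n",
        $n, $buckets[$n]{total}, join( ' ', @{ $buckets[$n]{files} } );
}
```

With these made-up sizes both children end up with 1000 bytes each, whereas dealing the files out round-robin in the order a, b, c, d, e would give one child 1500 bytes and the other 500. The `files` lists could then take the place of `@payloadlist` in the script.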
Re^6: Converting a parallel-serial shell script by BrowserUk (Patriarch) on Sep 26, 2008 at 12:07 UTC
Re^7: Converting a parallel-serial shell script by kubrat (Scribe) on Sep 29, 2008 at 10:55 UTC


	PerlMonks