
Hi BrowserUk,

There is not much shared data between the sub tasks, and I can avoid most of it. The only thing they need to share (both read and write) through a file is how many processes of each type are currently running: just a number, e.g. 5 or 6 active sub tasks of one category. Based on the maximum concurrency limits for the overall tasks and for each sub task type, a process can start a new sub task and increment that number in the shared file. The read and write operations on the shared file will be over in a fraction of a second. To avoid any deadlock, I can wait and retry for a few seconds (a while loop with an exit condition) when a process cannot read or write the shared file at the moment it wants to. For read-only operations on shared files, I feel there is no need to worry about synchronization between sub tasks; a simple read retry would be sufficient.
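As a rough sketch of the shared-counter idea: the helper name `adjust_count`, its arguments and the one-second retry interval are only assumptions for illustration, not code from the actual application. It uses an exclusive `flock` so that the read, the limit check and the write happen as one step, and it gives up after a bounded number of attempts instead of blocking forever.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);

# Hypothetical helper: add $delta (+1 to claim a slot, -1 to release one)
# to the number stored in $file, never going below 0 or above $max.
# Returns 1 if the counter was updated, 0 if the limit would be violated,
# and undef if the lock could not be obtained within $tries attempts.
sub adjust_count {
    my ($file, $delta, $max, $tries) = @_;

    # Open read/write without truncating; create the file if it is missing.
    sysopen(my $fh, $file, O_RDWR | O_CREAT) or die "Cannot open $file: $!";

    # Bounded retry loop: try a non-blocking exclusive lock, sleep, try again.
    my $locked = 0;
    for my $attempt (1 .. $tries) {
        if (flock($fh, LOCK_EX | LOCK_NB)) { $locked = 1; last; }
        sleep 1;
    }
    unless ($locked) {
        close $fh;
        return;
    }

    # Read the current value; an empty or new file counts as 0.
    my $line  = <$fh>;
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;

    my $new = $count + $delta;
    my $ok  = ($new >= 0 && $new <= $max) ? 1 : 0;

    if ($ok) {
        # Rewrite the file in place while still holding the lock.
        seek($fh, 0, SEEK_SET) or die "Cannot seek in $file: $!";
        truncate($fh, 0)       or die "Cannot truncate $file: $!";
        print {$fh} "$new\n";
    }

    # close() flushes the write and releases the lock.
    close($fh) or die "Cannot close $file: $!";
    return $ok;
}
```

A sub task type would call something like `adjust_count($file, +1, $max, 10)` before starting and `adjust_count($file, -1, $max, 10)` when finished. One caveat with this approach: if a sub task dies before releasing its slot, the number in the file stays too high until something corrects it.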

Based on the detailed explanation you have given, I feel that multiprocessing using fork() would be more appropriate than threads. I thought of using threads for only one reason: it would have avoided significant code changes while still giving me the benefit of parallel processing of sub tasks.

To answer the query about the system: it is a high-end Sun server with 40+ CPUs and 48 GB RAM running Solaris 10. The Perl modules use the APIs of an enterprise product to perform various product-related operations (Perl handles both the automation and the complex business logic), using an input feed from a CSV file.

Currently it is a mostly sequential approach, with parallel processing only at the product API level using fork(). I have to change it to end-to-end parallel processing (it is possible to group the sub tasks logically) to reduce the processing time, which depends heavily on the enterprise product. I see parallel processing giving around a 40-50% (10 hours) reduction in overall processing time, hence this question.
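For the case where a single parent process launches a whole group of sub tasks, a minimal sketch of fork() with a concurrency limit could look like the following. Here the parent itself keeps track of how many children are running, so no shared file is needed for that group. The name `run_limited` and the convention that each task is a code ref returning true on success are assumptions made only for this example.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run each code ref in @tasks in its own child process,
# with at most $max children alive at any time.
# Returns a hash ref mapping task index => child exit status.
sub run_limited {
    my ($max, @tasks) = @_;
    my (%running, %status);    # %running maps pid => task index

    # Wait for any one child and record its exit status.
    my $reap = sub {
        my $pid = wait();
        return 0 if $pid == -1;                # no children left
        my $code = $? >> 8;
        if (exists $running{$pid}) {
            my $idx = delete $running{$pid};
            $status{$idx} = $code;
        }
        return 1;
    };

    for my $idx (0 .. $#tasks) {
        # At the limit: block until a slot becomes free.
        while (keys(%running) >= $max) {
            $reap->() or last;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: run the task and leave immediately. POSIX::_exit skips
            # the parent's END blocks; it also skips flushing stdio buffers.
            my $ok = eval { $tasks[$idx]->() };
            POSIX::_exit($ok ? 0 : 1);
        }
        $running{$pid} = $idx;
    }

    # Collect the remaining children.
    while (keys %running) {
        $reap->() or last;
    }
    return \%status;
}
```

Calling `run_limited(5, @subs)` would then keep at most five sub tasks of that group running at once, and the returned statuses show which ones failed.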

I have to confess, I learned some really deep things from the answers given to my question! And now I feel that fork() would be the better option in this case, with the only overhead being that I have to write a lot more code to enable it :-).

Thanks a lot to you and the others for the valuable suggestions and insight into these parallel processing options using threads and fork().

Best regards, Pawan

In reply to Re^2: ithreads or fork() what you recommend? by pawan68923
in thread ithreads or fork() what you recommend? by pawan68923



	PerlMonks