PerlMonks  

while reading a file, transfer the same data to two different processes.

by avanta (Beadle)
on May 20, 2010 at 15:01 UTC ( #840957=perlquestion )
avanta has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I am trying to read a log file one transaction (block of lines) at a time. Now I wish to deliver this data to two separate processes by forking. Following is the code I tried to come up with.
    my $flag = 0;
    my $pid;
    while ($record = $fp->getline()) {
        if ($flag == 0) {
            # first transaction: fork once, both processes continue the loop
            $flag = 1;
            $pid  = fork();
        }
        else {
            if ($pid == 0) {
                function1($record);    # child
            }
            else {
                function2($record);    # parent
            }
        }
    }

In the code, "$fp->getline()" reads the transactions and stores each one in "$record" in a loop, and by conditional forking I pass "$record" to two different functions in different processes. But as you can see, the first transaction is read from the log file only once, while every subsequent transaction is read twice, because there are two processes at that point.

This is where my problem starts. I wish to read the log file only once and send the data to two different processes for processing. The other limitation is that I cannot read all the transactions in the log file at once and send them for processing; I have to read the file transaction by transaction.

An example of log file content can be:
    Date: 2010-05-01
    location: NZ
    Date: 2010-05-02
    location: AU
    Date: 2010-05-03
    location: IN
So now $record will have

Date: 2010-05-01
location: NZ

as first transaction and so on.

It would be great if anyone can tell me how I can achieve my goal.
Thanks
AvantA

Re: while reading a file, transfer the same data to two different processes.
by JavaFan (Canon) on May 20, 2010 at 15:34 UTC
    Why fork? What's wrong with:
        while ($record = $fp->getline()) {
            function1($record);
            function2($record);
        }

      I need to "fork" because I don't want the two functions to interfere with each other, i.e. if "function1" crashes it should not affect "function2". To put it this way: "function1" is in the current script from which I am running the process, and "function2" is a collection of different processing steps which may be huge. If "function1" dies for some reason, the whole process will crash and "function2" will die along with it. I don't want that to happen; I want "function2" to continue.
        But the other way is fine? That is, function1 may crash and take with it the entire process?

        What you could do is for each line, fork twice. First child calls function1, second child calls function2. Both children exit afterwards. Parent waits till children are done before reading the next batch. Then the process will not be stopped if either function crashes on a batch.
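
A minimal sketch of the fork-twice-per-record idea described above. `function1` and `function2` are hypothetical stand-ins for the poster's real functions, `run_in_child` is a helper name invented for this example, and the two literal records stand in for `$fp->getline()`.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-ins for the poster's real functions.
sub function1 { my ($record) = @_; die "cannot handle: $record" if $record =~ /bad/; return }
sub function2 { my ($record) = @_; return length $record }

# Fork a child, run $code->($record) there, and return the child's PID.
sub run_in_child {
    my ($code, $record) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;                      # parent: back to the read loop
    my $ok = eval { $code->($record); 1 };    # child: do the work
    POSIX::_exit($ok ? 0 : 1);                # child must never re-enter the read loop
}

my @exit_status;    # one [function1, function2] pair of raw $? values per record
for my $record ("good record\n", "bad record\n") {    # stands in for $fp->getline()
    my @pids = (
        run_in_child(\&function1, $record),
        run_in_child(\&function2, $record),
    );
    my @status;
    for my $pid (@pids) {
        waitpid($pid, 0);       # wait before reading the next record
        push @status, $?;       # 0 means the child finished cleanly
    }
    push @exit_status, \@status;
}
```

Only the parent ever reads the file, so each record is read once. The cost is two forks per record, which may matter for a large log.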

Re: while reading a file, transfer the same data to two different processes.
by superfrink (Curate) on May 20, 2010 at 16:15 UTC
    You may want to create a couple pipes and then fork. The processes can then communicate back and forth. Have a look at the perldoc for pipe.
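
One way the pipe-then-fork suggestion could look, as a sketch only. It assumes each record fits on one line (multi-line transactions would need a delimiter or a length prefix), and `spawn_worker`, `make_handler` and the output file names are invented for this example.

```perl
use strict;
use warnings;
use IO::Handle;
use POSIX ();
use File::Temp qw(tempdir);

$SIG{PIPE} = 'IGNORE';    # a dead worker must not kill the reader with SIGPIPE

# Fork a worker that reads one record per line from a pipe and passes each to
# $handler. @siblings are write handles the child inherited but must not keep.
sub spawn_worker {
    my ($handler, @siblings) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                    # child
        close $writer;
        close $_ for @siblings;
        while (my $record = <$reader>) {
            $handler->($record);
        }
        POSIX::_exit(0);
    }
    close $reader;                      # parent only writes
    $writer->autoflush(1);
    return ($pid, $writer);
}

# Hypothetical handlers: each appends what it received to its own file.
sub make_handler {
    my ($name, $path) = @_;
    return sub {
        my ($record) = @_;
        open my $out, '>>', $path or die "open $path: $!";
        print {$out} "$name got $record";
        close $out;
    };
}

my $dir = tempdir(CLEANUP => 1);
my ($out1, $out2) = ("$dir/function1.out", "$dir/function2.out");
my ($pid1, $to1) = spawn_worker(make_handler(function1 => $out1));
my ($pid2, $to2) = spawn_worker(make_handler(function2 => $out2), $to1);

for my $record ("rec A\n", "rec B\n") {    # stands in for $fp->getline()
    print {$_} $record for $to1, $to2;     # read once, deliver twice
}
close $_ for $to1, $to2;                   # EOF tells the workers to finish
waitpid($_, 0) for $pid1, $pid2;
```

The second worker closes its inherited copy of the first worker's write handle; otherwise the first worker would not see end-of-file until the second one exits.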
Re: while reading a file, transfer the same data to two different processes.
by weismat (Friar) on May 20, 2010 at 16:46 UTC
    I would work with threads and use two shared queues to send the data to the threads.
    Every fork has a rather big overhead.

      Well, every thread has some overhead too:

      use strict;
      use warnings;
      use threads;

      my $str = 'x' x 32_000_000;

      sub fork_wait {
          my $pid = fork;
          if ($pid) {
              wait;
          }
          else {
              exit 0;
          }
      }

      sub create_thread_join {
          threads->create( sub { } )->join;
      }

      use Benchmark qw( cmpthese );
      cmpthese -3, {
          Forks   => \&fork_wait,
          Threads => \&create_thread_join,
      };

      __END__
                Rate Threads   Forks
      Threads 27.7/s      --    -73%
      Forks    103/s    272%      --

      Even without this 32M string, threads do not win. This is Linux; perhaps on Windows you will get a different result.

        perhaps on Windows you will get a different result.

        On Windows, those two pieces of code are all but identical (except the fork version leaks scalars). Hence, there is little difference between them. I'd be interested to see the results without the 32M scalar if you've a moment?

      His reason for using fork isn't parallelisation, it's crash-resistance. Threads won't help at all.
        Shouldn't he be using eval in that case?
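
A short sketch of what the eval approach might look like in a single process. `function1` and `function2` are hypothetical stand-ins, and the two literal records stand in for `$fp->getline()`. Note that eval only traps Perl exceptions (die); it does not protect against a segfault, a call to exit, or one function exhausting memory, which separate processes do.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the poster's real functions.
sub function1 { my ($record) = @_; die "function1 choked on: $record" if $record =~ /bad/; return }
sub function2 { my ($record) = @_; return }

my (@completed, @errors);
for my $record ("good record\n", "bad record\n") {    # stands in for $fp->getline()
    for my $task ([function1 => \&function1], [function2 => \&function2]) {
        my ($name, $code) = @$task;
        # eval returns 1 only if the call did not die
        if (eval { $code->($record); 1 }) {
            push @completed, $name;
        }
        else {
            push @errors, "$name: $@";    # record the failure and carry on
        }
    }
}
```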
Re: while reading a file, transfer the same data to two different processes.
by almut (Canon) on May 20, 2010 at 16:57 UTC

    Maybe something like this?

    open my $func1, "|-", q{perl -ne 'print "function1 (PID $$): processing $_"; sleep 1'};
    open my $func2, "|-", q{perl -ne 'print "function2 (PID $$): processing $_"; sleep 3'};

    use IO::Handle;
    $func1->autoflush();
    $func2->autoflush();

    while (my $record = <DATA>) {
        print $func1 $record;
        print $func2 $record;
    }

    __DATA__
    foo
    bar
    baz

    Output:

    $ ./840957.pl
    function1 (PID 21667): processing foo
    function2 (PID 21668): processing foo
    function1 (PID 21667): processing bar
    function1 (PID 21667): processing baz
    function2 (PID 21668): processing bar
    function2 (PID 21668): processing baz
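
If function2 lives in the same script rather than in an external command, the same "|-" mode can be used without a command, which makes open do an implicit fork. The following is a sketch under that assumption: the parent keeps running function1 itself (guarded by eval) and feeds one child that runs function2. The function bodies, the output file and the one-record-per-line framing are all invented for illustration.

```perl
use strict;
use warnings;
use IO::Handle;
use POSIX ();
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $out_path = "$dir/function2.out";    # hypothetical place for the child's results

# Hypothetical stand-ins for the poster's real functions.
sub function1 { my ($record) = @_; return }
sub function2 { my ($record, $out) = @_; print {$out} "function2 got $record" }

$SIG{PIPE} = 'IGNORE';    # do not let a dead child kill the reader

# "|-" with no command forks: the parent gets the child's PID and a write
# handle; the child gets 0 and finds the read end of the pipe on STDIN.
my $pid = open(my $to_child, '|-');
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {                                   # child: runs function2 only
    open my $out, '>', $out_path or die "open $out_path: $!";
    while (my $record = <STDIN>) {
        function2($record, $out);
    }
    close $out;
    POSIX::_exit(0);
}

$to_child->autoflush(1);
for my $record ("rec A\n", "rec B\n") {            # stands in for $fp->getline()
    print {$to_child} $record;                     # hand the record to function2
    eval { function1($record); 1 } or warn "function1 failed: $@";
}
close $to_child;             # sends EOF and waits for the child
my $child_status = $?;       # 0 if the child finished cleanly
```

If the parent dies part-way through, the child still sees end-of-file on STDIN and finishes the records it has already been sent.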
Re: while reading a file, transfer the same data to two different processes.
by nagalenoj (Friar) on May 22, 2010 at 06:19 UTC
    Can't you use some sort of IPC (sockets, FIFOs, pipes, etc.)? That way you can send the data you read to both processes.

Node Type: perlquestion [id://840957]
Approved by superfrink
Front-paged by kyle