Program Hangs

by anbarasans85 (Initiate)
on Dec 04, 2010 at 23:00 UTC (#875424=perlquestion)
anbarasans85 has asked for the wisdom of the Perl Monks concerning the following question:

Hello All, I am Anbarasan and I am new to perlmonks.

Problem: My script hangs during execution.

Objective: I read a file (the trace file of a simulation), parse it, and calculate values. This trace file is the input to my script.

Scenario where no problem occurs: I have a small trace file (800 lines) and run my script on it. The script produces the desired output.

Scenario where the problem occurs: The trace file is big (approximately 100MB) and the program stops at some point. If I use another trace file of around 80MB, the program runs for somewhat longer than with the 100MB file and then freezes. If I use another trace file of approximately 107MB, the program freezes very quickly.

I am using SUSE Linux with System Monitor. Before I run the program, CPU utilization is fine. When the program freezes, utilization reaches 100% on one of the two CPUs (I see CPU0 and CPU1). The CPU is normal.

Kindly help me to make my program work. If needed I can share the code with you.

Replies are listed 'Best First'.
Re: Program Hangs
by Ratazong (Monsignor) on Dec 04, 2010 at 23:09 UTC

    Without seeing the code, the advice can only be general:

    You should investigate on which line of the 100MB file your script fails. Using 100% of the CPU is an indication that the program has entered an "endless" loop. Once you know the last line the script processed normally and, more importantly, the line where your script hangs, you can investigate why it hangs and implement countermeasures.
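    One way to find that line is to report progress while reading. The following is only a minimal sketch: the helper name scan_with_progress, the reporting interval, and the sample records are made up for illustration and are not taken from the poster's script.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle line by line and report progress
# on STDERR, so the last line number reached before a stall is visible.
sub scan_with_progress {
    my ($fh, $every, $handler) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
        # $. holds the current line number of the filehandle just read
        print STDERR "reached line $.\n" if $count % $every == 0;
        $handler->($line);    # the real per-line processing goes here
    }
    return $count;
}

# Demo on an in-memory "file" with invented records;
# a real run would open the trace file instead.
my $sample = "s 0.1 _0_ AGT --- 0\nr 0.2 _1_ AGT --- 0\ns 0.3 _0_ AGT --- 1\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $seen = scan_with_progress($fh, 2, sub { });
close $fh;
print "processed $seen lines\n";
```

    If the progress messages stop while the CPU stays busy, the last number printed brackets the input line worth inspecting.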

    If you want the help of us monks, this information is essential: which line is not processed, and which code you use to process that file.

    All the best, Rata
Re: Program Hangs
by BrowserUk (Pope) on Dec 05, 2010 at 11:09 UTC

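    Your algorithm is fundamentally flawed. Assume for a moment that the input file consisted of the following seven records, one per line (records 2 and 4 are S-records, record 6 is an R-record, and the bare numbers stand in for unrelated records):

        1
        S AGT
        3
        S AGT
        5
        R AGT
        7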

    Your algorithm would proceed as follows:

    1. The outer loop scans until it reads record 2;
    2. The inner loop then scans on until it reads record 6;
    3. Then you reset the pointer to the start of record 3; and exit to the outer loop;
    4. The outer loop scans on to record 4;
    5. The inner loop scans on to record 6;
    6. Then you reset the pointer to the start of record 5; and exit to the outer loop;
    7. The outer loop scans on to EOF.

    Note that the above has matched both S records with the same R record.

    Now imagine that your 100MB file has (say) 1000 S-records near the top of the file, and a single R-record 10 million lines further on. Then each of those 1000 S-records would get matched against that single R-record; but you would have to re-read the 10 million intervening records over and over to do so.

    1e3 * 10e6 == 1e10 record reads == a very long time. It might well look like it had hung.
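    The cost can be seen on a scaled-down model. The sketch below is an illustration under stated assumptions, not the poster's script: records are reduced to the single letters 'S', 'R' and 'x' (anything else), the file is an in-memory list, and rescan_reads is a made-up name. It counts how many records get read when every S-record triggers a forward scan for the next R-record and the outer scan then resumes just after that S-record.

```perl
use strict;
use warnings;

# Count record reads for the rescanning approach: for every S-record,
# scan forward to the next R-record, then go back and continue after the S.
sub rescan_reads {
    my @recs  = @_;
    my $reads = 0;
    for my $i (0 .. $#recs) {
        $reads++;                      # outer loop reads record $i
        next unless $recs[$i] eq 'S';
        for my $j ($i + 1 .. $#recs) {
            $reads++;                  # inner loop re-reads later records
            last if $recs[$j] eq 'R';
        }
    }
    return $reads;
}

# Scaled-down data: 10 S-records, 1000 unrelated records, then 1 R-record.
my @recs = (('S') x 10, ('x') x 1000, 'R');
printf "records: %d, reads with rescanning: %d\n",
    scalar @recs, rescan_reads(@recs);
```

    A single pass would read each record exactly once, so the gap between the two numbers grows with the product of the S-record count and the distance to the matching R-record.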

    If the S-record/R-record pairs should appear sequentially:

    ... S AGT ... R AGT ... S AGT ... R AGT ...

    Then you should not be resetting the pointer after you've found the matching R-record.

    If however, the S-record/R-record pairs can be interleaved:

    ... S AGT #1 ... S AGT #2 ... R AGT #1 ... R AGT #2 ...

    Then you would have to be maintaining two pointers: one telling you where to start looking for the next S-record; and one telling you where to start looking for the next R-record. Whilst this could be made to work, there are other, simpler approaches to this latter problem.

    For example: a simple, single loop down the file looking for both types of record. When an S-record is found, you push it onto an array; when an R-record is found, you shift off the least recently found S-record from the array and do your calculations.

        my @Srecs;
        while( <FILE> ) {
            if( /^S AGT/ ) {
                ## Remember each S-record until its R-record turns up
                push @Srecs, $_;
            }
            elsif( /^R AGT/ ) {
                die "R-record with no corresponding S-record" unless @Srecs;
                ## Pair this R-record with the oldest unmatched S-record
                my @sRecBits = split ' ', shift @Srecs;
                my @rRecBits = split ' ', $_;
                ## Calculate stuff
            }
        }
        if( @Srecs ) {
            printf "There were %d unmatched S-records\n", scalar @Srecs;
        }
        ## Other stuff

    This single pass algorithm is far more efficient, less error prone and detects mismatches.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Hi All,

      The program is working fine. I modified it to use a Perl hash. It is fast and I am happy.

      This is a very late reply, but I thought I could share what I did.

      The logic is:
      1. Look for AGT events. Make a hash key by combining the event type ('s' or 'r') with the packet id (hash_key = s<pktid> or r<pktid>) and store the corresponding time as the hash value.
      2. Step 1 runs through all lines of the trace file and builds the hash table.
      3. Then calculate the delay for every packet id.

          while(<DATA>)
          {
              @x=split(' ');
              if($x[3] eq 'AGT')
              {
                  # Key is the event type ('s' or 'r') followed by the packet id
                  $hash_key = $x[0].$x[5];
                  $hash_value = $x[1];
                  $hash_table{"$hash_key"} = $hash_value;
                  # Track the highest packet id seen, comparing numerically
                  if($x[5] > $last_packet_id)
                  {
                      $last_packet_id = $x[5];
                  }
              }
          }
          # @x still holds the fields of the last line read
          $simulation_time = $x[1];

          for($packet_id = 0;$packet_id<=$last_packet_id;$packet_id++)
          {
              $hash_key = "s".$packet_id;
              $enqueue_time = $hash_table{"$hash_key"};
              $total_enqueue_count++;
              $hash_key = "r".$packet_id;
              if(exists($hash_table{$hash_key}))
              {
                  $receive_time = $hash_table{"$hash_key"};
              }
              else
              {
                  # No 'r' event for this packet id: count it as dropped
                  $total_drop_count++;
                  next;
              }
              $total_receive_count++;
              $delay = $receive_time - $enqueue_time;
              $sum_of_delay = $sum_of_delay + $delay;
              # $delay = $delay * 1000;
              # print("\nDelay:$delay");
          }

      Thanks a lot for your help.

      Hello BrowserUk and all,

      Thanks for your comments. As you said, the program is not hung. It is searching for the 'r' receive event that corresponds to an 's' send event. Since it does not find any corresponding 'r' (or finds it only much further down), it looks like the program is hung.

      Yes, I am aware that the program is inefficient. In spite of that, I wanted this particular program to work first, and then to switch to a more efficient method such as the one pointed out by BrowserUk, or to use a Perl hash.

      The scenario for the simulation is very simple, and I thought it would not produce packet loss events, but it did. Since packets were lost, the corresponding 'r' events are missing. This "no loss" assumption caused the confusion about the program hanging.

      Anyhow, thanks for your help. I will post my program here when I have implemented a good algorithm for this delay calculation.

Re: Program Hangs
by liverpole (Monsignor) on Dec 05, 2010 at 02:05 UTC
    Hi anbarasans85,

    Seconded what ww said.

    You might be able to narrow down the point where the program is stuck by trapping the INT (interrupt) signal, then typing ^C when the CPU goes to 100%, and looking at the stack trace.  (See caller for more details).

    For example:

        use strict;
        use warnings;

        $SIG{INT} = sub {
            my (@stack, $level);
            while (1) {
                my ($pkg, $fn, $ln, $sub) = caller($level++);
                if (!($pkg or $fn or $ln or $sub)) {
                    for (my $i = 0; $i < @stack; $i++) {
                        print " " x $i, $stack[$i], "\n";
                    }
                    exit;
                }
                unshift @stack, "-> $pkg: $fn (line $ln) sub $sub";
            }
        };

        # Your program here ....

      Hi All, Thanks for the reply. I read the comment about minimum sample code to be presented. But I am posting all here. (I am not able to edit). The comments will be useful in understanding the code.

          #!/usr/bin/perl
          #
          #Program to calculate end to end delay.
          #
          #1. Program reads the trace file, finds for combination of 's' source event and 'AGT' packet type.
          #2. For this combination, the program takes the packet id and timestamp.
          #3. Within the same file, the program searches for a combination of 'r' receive event and 'AGT' packet type.
          #4. For this combination, the program takes the packet id and compares with previous packet id. If they are same, timestamp is noted.
          #5. Difference between step 4 timestamp and step 3 timestamp gives the end to end delay for that packet.
          #6. Delays are aggregated and average delay is found.
          #
          #use strict;
          use warnings;

          $SIG{INT} = sub {
              my (@stack, $level);
              while(1) {
                  my ($pkg, $fn, $ln, $sub) = caller($level++);
                  if (!($pkg or $fn or $ln or $sub)) {
                      for (my $i = 0; $i < @stack; $i++) {
                          print " " x $i, $stack[$i], "\n";
                      }
                      exit;
                  }
                  unshift @stack, "-> $pkg: $fn (line $ln) sub $sub";
              }
          };

          #Input trace file
          my($infile) =$ARGV[0];

          #Keep track of variables
          my($enqueue_time) = 0;
          my($receive_time) = 0;
          my($packet_id) = 0;
          my($delay) = 0;
          my($total_receive_count) = 0;
          my($sum_of_delay) = 0;
          my($average_delay) = 0;
          my($simulation_time) = 0;
          my($file_position) = 0;
          my (@x);

          open(DATA,"<","$infile" ) || die "could't open $infile$!";

          while(<DATA>)
          {
              @x=split(' ');
              if(($x[0] eq 's') && ($x[3] eq 'AGT'))
              {
                  $file_position = tell(DATA);
                  $enqueue_time = $x[1];
                  $packet_id = $x[5];
                  while(<DATA>)
                  {
                      @x=split(' ');    # ===LINE 58
                      if(($x[0] eq 'r') && ($x[3] eq 'AGT'))
                      {
                          if(($x[5] == $packet_id))
                          {
                              $receive_time = $x[1];
                              $total_receive_count++;
                              $delay = $receive_time - $enqueue_time;
                              $sum_of_delay = $sum_of_delay + $delay;
                              #Following is for debug.
                              $delay = $delay * 1000;
                              print("\nDelay:$delay");
                              last;
                          }
                      }
                  }
                  #Continue to search for next 's' event from where the previous 's' was found.
                  #So move to the same line where previous 's' event was found.
                  seek(DATA,$file_position,SEEK_SET);    # ====LINE 78
              }
              #While(<DATA>) takes care of moving to the next line.
          }

          $simulation_time = $x[1];
          print("\n Simulation Time = $simulation_time seconds");
          print("\n Total Receive Count = $total_receive_count");
          if($total_receive_count != 0 )
          {
              $average_delay = $sum_of_delay / $total_receive_count;
              $average_delay = $average_delay * 1000;
              print("\n Average End to End Delay = $average_delay milliseconds");
          }
          else
          {
              print("\n No packet received.");
          }
          print("\n");
          print("\n");
          close DATA;
          exit(0);

      About the statements like "very quickly" and "for some more time":
      1. When I use a trace file with 800 lines, the script completes successfully without any problem.
      2. When I use the 80MB trace file, it hangs. It stops within 12.456 seconds of invoking the script, at line 58. Stack message: main: (line 58) sub main::__ANON__
      3. When I use the 100MB trace file, it hangs. It stops within 1.982 seconds of invoking the script, at line 58. Same stack message.
      4. When I use the 107MB trace file, it hangs. It stops within 0.4 seconds of invoking the script, at line 58. Same stack message.

      One comment: If the program works for a small trace file, why does it not work for a larger file?
      Using "use strict" throws the error "Bareword "SEEK_SET" not allowed while "strict subs" in use at line 78."

      I have marked the line numbers in the code as LINE <num>.
        Hi anbarasans85,

        It's not easy to tell without access to your specific data.

        However, one thing that strikes me as dangerous is:

            while(<DATA>)
            {
                #...
                if(($x[0] eq 's') && ($x[3] eq 'AGT'))
                {
                    $file_position = tell(DATA);
                    #...
                    while(<DATA>)
                    {
                        #...
                    }
                    seek(DATA,$file_position,SEEK_SET);
                }
            }

        Which just has "infinite loop" written all over it.  As soon as you're done reading from <DATA>, you jump back to the marked $file_position which you previously saved.

        Try putting a print statement in the outer loop, and see if that isn't the problem.

        Using "use strict" command throws error "Bareword "SEEK_SET" not allowed ...

        SEEK_SET is a constant imported from Fcntl (see seek, first paragraph). Either of these lines imports it:
            use Fcntl qw(SEEK_SET);
            use Fcntl qw(:seek);

        Import it and restore use strict;

        Update: So, WHENCE are you seeking if SEEK_SET is not defined in your program? My guess is it's the same as using 0, but that's not tested.
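        A small sketch of the import in use, under strict. The helper peek_next_line and the in-memory sample text are made up for illustration. It also prints the value of the SEEK_SET constant, which relates to the guess above: without strict, an undeclared bareword is treated as the string "SEEK_SET", and that string evaluates to 0 in numeric context.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);    # imports SEEK_SET, SEEK_CUR and SEEK_END

# Hypothetical helper: remember the current position with tell,
# read one line ahead, then return to the saved position with seek.
sub peek_next_line {
    my ($fh) = @_;
    my $pos  = tell($fh);
    my $line = <$fh>;
    seek($fh, $pos, SEEK_SET) or die "seek failed: $!";
    return $line;
}

# Demo on an in-memory "file"; a real script would open the trace file.
my $text = "first\nsecond\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my $peeked = peek_next_line($fh);
my $read   = <$fh>;    # reads the same line again after the seek
close $fh;

print "SEEK_SET is ", SEEK_SET, "\n";
print "peeked and read the same line\n" if $peeked eq $read;
```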

Re: Program Hangs
by ww (Bishop) on Dec 05, 2010 at 01:26 UTC

    "If needed I can share the code with you." Yes, some code is needed, but probably not your entire script. Rather, edit the script down to a minimal sample that demonstrates the same problem. "Show us the code" is the best advice at the moment; as Ratazong says, we can't offer much more than vague generalities without code and some sample data.

    Also note that descriptions like "very quickly," "for some more time," and "stops at some point" are only vague generalities and thus not all that helpful in our efforts to help you. After all, is "very quickly" on the order of microseconds or days?
