Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

Time::HiRes strange behavior

by golden.radish (Initiate)
on Feb 05, 2010 at 22:21 UTC ( #821657=perlquestion: print w/replies, xml ) Need Help??

golden.radish has asked for the wisdom of the Perl Monks concerning the following question:

I'm having a problem with Time::HiRes, specifically the gettimeofday function. Below I've given a code sample and instruction on how to duplicate the problem. It will only take a few seconds to confirm/refute the problem, if you have the time. Please and thank you in advance.

Context: The purpose of this code is simply to prepend a timestamp (with ~microsecond accuracy) to any lines, entries, stdout, whatever you'd care to feed the script.

#!/usr/bin/perl ### use Time::HiRes qw(gettimeofday); $|=1; while (<>) { $timestamp = sprintf("%d.%06d", gettimeofday()); print "$timestamp $_"; } exit; --- touch testfile tail -f testfile | _in another window_ echo 'test' >> testfile (wait a while) echo 'test' >> testfile (wait a few seconds) echo 'test' >> testfile (wait a little while) echo 'test' >> testfile
Observe the microsecond value on the output (it will be something like this, not exactly):
1265405350.420517 test 1265405354.420852 test 1265405356.421037 test 1265405362.421530 test

The microsecond value from 'gettimeofday()' increments by a small amount each time a new line is added, and entirely inaccurately. As you add more output to the file, you'll see that the microsecond value increments for every entry, but entirely not what it should be.

Now contrast this output with the behavior when you simply press enter a few times if you run the script by itself:

:~/bin/ 1265405711.519744 1265405711.784186 1265405711.939817 1265405712.144596 1265405712.355313 1265405712.648788

Which is working as expected. The values increment, but not predictably, and certainly not by very small values as observed above.

Tested & verified this behavior on:
perl, v5.10.0 built for i486-linux-gnu-thread-multi
perl, v5.8.8 built for i486-linux-gnu-thread-multi
perl, v5.8.4 built for sun4-solaris-64int

Any thoughts would be greatly appreciated. What is causing this strange behavior when piping the output from a tail -f into the script?


Replies are listed 'Best First'.
Re: Time::HiRes strange behavior
by markkawika (Monk) on Feb 05, 2010 at 22:35 UTC

    This is because of how tail -f works.

    The way it works is that it reads to the end of a file, outputs that to standard output, and then does a sleep(1);. Then it looks at the file again to see if it's grown. If it has, it reads what's new, outputs it to standard output, then sleeps again. If there's nothing new, it just sleeps.

    Either way, it is waking up exactly at 1-second intervals to see if there's new data. That's why the microsecond number is slowly incrementing; the small increment represents the amount of time it took to wake up, read (or not read) data, and then go back to sleep.

    tail does this because there is no standard mechanism in Unix for a process to know when a specific file has been changed. There's no signalling mechanism. Some OSes do support this (For example, FreeBSD and OS X have the kqueue system, which will work this way), but it is not standard.

      Thanks markkawika. What you've described does indeed appear to be largely related to, if not the entire source of, the problem.
      Of course, now I have to figure out how to satisfy the original goal anyway, but that's another matter entirely. :)
      Thanks for the prompt reply.
Re: Time::HiRes strange behavior
by BrowserUk (Pope) on Feb 05, 2010 at 23:04 UTC

    When you write to the pipe, the other program won't get anything until the pipe's buffer (4096 bytes) fills.

    Many of your slow, short line inputs will be read into a buffer as a single unit, and will then be timestamped as they are read (as lines) from the buffer. Since reading lines from a buffer is very fast, the timestamps will be very close together for each buffer load of lines.

    The effect is that the perl script gets bursts of input all very close together, but then a long gap until the buffer fills again. To demonstrate this, I fed a file with timestamps once a second lke this:

    perl -MTime::HiRes=time -E"$|++;say scalar localtime while sleep 1" | tee junk.log Fri Feb 5 22:42:58 2010 Fri Feb 5 22:42:59 2010 Fri Feb 5 22:43:00 2010 Fri Feb 5 22:43:01 2010 Fri Feb 5 22:43:02 2010 ...

    And then tailed the file and added a second hires timestamp like this:

    tail -f junk.log |perl -MTime::HiRes=gettimeofday -E"$|++;say gettimeofday().' '.$_ +while <>" 1265410130.569 Fri Feb 5 22:46:07 2010 1265410130.56927 Fri Feb 5 22:46:08 2010 1265410130.5694 Fri Feb 5 22:46:09 2010 1265410130.5695 Fri Feb 5 22:46:10 2010 1265410130.56959 Fri Feb 5 22:46:11 2010 ... 1265410131.50094 Fri Feb 5 22:48:46 2010 1265410131.50917 Fri Feb 5 22:48:47 2010 1265410131.51604 Fri Feb 5 22:48:48 2010 1265410131.52273 Fri Feb 5 22:48:49 2010 * End of first buffer full 1265410294.569 Fri Feb 5 22:48:50 2010 *** Start of second buffer ful +l 163 seconds later 1265410322.694 Fri Feb 5 22:48:51 2010 1265410322.69411 Fri Feb 5 22:48:52 2010 1265410322.69423 Fri Feb 5 22:48:53 2010 1265410322.69432 Fri Feb 5 22:48:54 2010 1265410322.6944 Fri Feb 5 22:48:55 2010 ... 1265410322.97337 Fri Feb 5 22:49:39 2010 1265410322.98039 Fri Feb 5 22:49:40 2010 1265410322.98775 Fri Feb 5 22:49:41 2010 1265410427.993 Fri Feb 5 22:49:42 2010 Terminating on signal SIGINT(2) 1265410427.998 Fri Feb 5 22:49:43 2010

    As you can see, the data written to the file over a period of 3 minutes, actually arrives at the perl script in just a couple of seconds--but delayed by nearly 3 minutes. Then there's another 3 minute delay before another burst arrives in just a couple of seconds. It's all down to the buffering done by the pipe.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      One more thought on this thread:
      It's all down to the buffering done by the pipe.
      As far as I know how pipes work this is not entirely correct.

      Consider a stripped down version of your pipeline:

      perl -MTime::HiRes=gettimeofday -e '$|++; print gettimeofday() . "\n" +while sleep 1' |\ perl -MTime::HiRes=gettimeofday -e '$|++; print gettimeofday() . " " . + $_ while <>'

      Using this stripped pipeline you won't see any serious buffering related delay. Here's the why: If the process at the write end writes any amount (less then the pipe buffer) to the pipe, then the process at the read end will be able to read it irrespectively the state of the pipe buffer. If you omit the commands with possibly buffered output in your pipeline, you won't get the buffer filling delay. I think the result you showed us is caused by your version of tee (or less likely tail), by using buffered output. (I cannot see the same results with the current GNU versions of tee and tail.)

      Of course the pipe buffer can cause delays: if the buffer is empty then the reads will block, if the buffer is full then the writes will block. But reads will not block just because the buffer is not full.

      So in my comprehension your delayed results are all down to output buffering but not to pipe buffering.

        Indeed! You are correct. And it does seem to be tee that's doing the buffering.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Time::HiRes strange behavior
by rubasov (Friar) on Feb 05, 2010 at 23:48 UTC
    tail from GNU coreutils has an option for controlling the sleep time between reads:
    -s, --sleep-interval=S with -f, sleep for approximately S seconds (default 1.0) between ite +rations
    And if you really want to follow a rapidly growing file (for instance some timing sensitive log without proper timestamps) with a lot finer time resolution than probably you appreciate the File-Tail module from CPAN. This modules adapts its sleep interval to the line-update frequency measured before. (Of course if you don't have control on the buffering at output side, this still won't be better.) An example:

    # generate test log file in bash $ while true ; do echo $(( ++i )) ; sleep .1 ; done > test

    # little File::Tail demo use Time::HiRes qw( gettimeofday ); use File::Tail; $| = 1; my $tail = File::Tail->new( name => './test', maxinterval => 0.1 ); my $usec; my $sec_since_epoch; my $line; while ( defined( $line = $tail->read() ) ) { ( $sec_since_epoch, $usec ) = gettimeofday(); printf "%d.%d\n", $sec_since_epoch, $usec; }

      This is a great suggestion, rubasov, and it works well except for one thing.
      When I use this on a CIFS or SMB mounted filesystem, it reloads (and re-"tails") the entire file each time it changes. On a local filesystem, it works fine. So it looks like I have to port the whole thing over to run locally under Windows. :(
      However, I do appreciate the response, it's definitely helped.
      I've also tried using tail -s with a value less than 1.0, and while it appears to accept the value, it doesn't actually change the behavior in any way I can see.

        Hmm, I never tried to use it on CIFS shares, however this definitely seems to me a bug in SMBFS/CIFS. Have you really tried with both of them? (AFAIK they do not share much of code lines, so it can be that in one of them the bug is not present.)

        After a little googling I found this thread, where someone claims to workaround a similar problem by mounting the CIFS share with the directio mount option. Though I don't know whether this can be recommended in a production system (how stable it is, how much it hurts performance). Anyway you may give it a try.

        Hope that helps.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://821657]
Approved by redgreen
Front-paged by redgreen
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (6)
As of 2020-01-20 08:57 GMT
Find Nodes?
    Voting Booth?