PerlMonks  

Splitting up a file in hourly increments

by hallikpapa (Scribe)
on Dec 12, 2007 at 20:07 UTC [id://656702]

hallikpapa has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that tails a file and makes DB inserts based on its parsing of that file. One file is generated per day, but I would like to continue this same process while also writing the raw data out in hourly chunks. What is a good way to tell the script that, when it starts, it should open a file labeled (currentTime).curr, then at the end of the hour rename it to just (previousHour), open a new (currentTime).curr, and so on? This is what I am doing so far. db_record handles the inserts, so I figure right before that is where I should handle opening and closing the files as well as the writing.
    if ($connected) {
        $timestamp = time;
        if ( $timestamp < $midnight ) {
            log_notice("Client : Restarting $0\n");
            log_error("End Processing $src_cdr_file\n");
            exec "/home/$0" or log_warn("Client : Could not exec $0\n");
            exit 6;    # Something is wrong if this exit is taken
        } # End if $timestamp
        log_notice("Client : Normal Termination\n");
        log_error("End Processing $src_cdr_file\n\n");
        exec '/usr/bin/perl', "/home/$0"
            or log_warn("Client : Could not exec $0\n");
        exit 7;        # Something is wrong if this exit is taken
    } # End if $connected
}; # End anonymous sub

print $socket "tail\n";    # Rock n Roll
open( STDOUT, "> /dev/null" );

# Begin processing records
while ( defined( my $line = <$socket> ) ) {
    if ($recover_mode) {
        $rec_num++;
    }
    db_record( $sth, $line, $recover_from );
}
Something I was thinking about, though, is that the check may not land exactly on HH:MM:00, so the files might not be cut right at the hour boundary. Thanks for any suggestions!
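One way to sketch the rename-on-rollover scheme described above (the file-name pattern and the log_hour() helper are my own assumptions, not a tested implementation; note the cut happens on the first write after the hour changes, not exactly at HH:00:00):

```perl
#!/usr/bin/perl
# Sketch: write to (hour).curr, rename it to just (hour) when the hour
# rolls over, then start a fresh (hour).curr. Names are hypothetical.
use strict;
use warnings;
use POSIX qw(strftime);

# Hour label for a given epoch time, e.g. "2007121220".
sub log_hour { strftime("%Y%m%d%H", localtime(shift)) }

my $cur_hour = log_hour(time);
my $cur_path = "$cur_hour.curr";
open(my $out, '>>', $cur_path) or die "open $cur_path: $!";

sub write_line {
    my ($line) = @_;
    my $hour = log_hour(time);
    if ($hour ne $cur_hour) {                   # hour rolled over
        close($out);
        rename("$cur_hour.curr", $cur_hour)     # drop the .curr suffix
            or warn "rename $cur_hour.curr: $!";
        ($cur_hour, $cur_path) = ($hour, "$hour.curr");
        open($out, '>>', $cur_path) or die "open $cur_path: $!";
    }
    print $out $line;
}
```

Because the check runs on every write, a quiet stream can leave a stale .curr file past the hour; an alarm() or a timed select loop could force the rename if that matters.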

Replies are listed 'Best First'.
Re: Splitting up a file in hourly increments
by tuxz0r (Pilgrim) on Dec 12, 2007 at 20:20 UTC
    Does this script run continuously in the background, like a daemon, or is it run via cron or manually? If it runs continuously, then before writing the data out you just need to compare the current hour value with the hour value you used during your last write, and use that to decide when to create the new data file. If it is run via cron or manually, you'll need to store the last hour value from the previous run, and if the current hour differs, open a new file for writing using the current timestamp. How you run the program determines how you keep track of the hour value used to decide when a new data file is created.

    ---
    s;;:<).>|\;\;_>?\\^0<|=!]=,|{\$/.'>|<?.|/"&?=#!>%\$|#/\$%{};;y;,'} -/:-@[-`{-};,'}`-{/" -;;s;;$_;see;
    Warning: Any code posted by tuxz0r is untested, unless otherwise stated, and is used at your own risk.
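    For the cron/manual case described above, a sketch of persisting the last hour seen in a small state file (the state-file and data-file names are hypothetical, not from the post):

```perl
#!/usr/bin/perl
# Sketch: on each run, compare the current hour with the hour saved
# from the previous run; if it changed, record it and a new hourly
# data file is used. File names are assumptions for illustration.
use strict;
use warnings;
use POSIX qw(strftime);

my $state_file = "last_hour.state";             # hypothetical
my $this_hour  = strftime("%Y%m%d%H", localtime);

# Read the hour recorded by the previous run, if any.
my $last_hour = "";
if (open(my $st, '<', $state_file)) {
    chomp($last_hour = <$st> // "");
    close($st);
}

if ($this_hour ne $last_hour) {
    # Hour changed since the last run: remember the new one.
    open(my $st, '>', $state_file) or die "write $state_file: $!";
    print $st "$this_hour\n";
    close($st);
}

# Appending to a per-hour file means the "new file" falls out of the
# naming scheme automatically.
my $data_file = "data.$this_hour";              # hypothetical
open(my $out, '>>', $data_file) or die "open $data_file: $!";
print $out "record at " . localtime() . "\n";
close($out);
```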

      Oh man now that I think about it, this is kind of a dumb question, lol. It was a continuous script, and I have been just thinking in seconds and milliseconds so much lately, anything bigger just doesn't compute.
Re: Splitting up a file in hourly increments
by pc88mxer (Vicar) on Dec 12, 2007 at 20:52 UTC
    I'm going to take a stab at what I think you are doing - correct me if I don't have it right.

    It seems that you are monitoring a log file, and you want to split the data into hourly log files (and perhaps perform some processing on them, too.)

    Assuming this is what you want to do, here's my approach:

    use IO::Handle;   # for the flush method

    my ($t0, $path0, $out);
    $t0 = time;
    $path0 = path_for_time($t0);
    open($out, ">>", $path0);      # note below
    while (defined(my $line = readline())) {
        my $t = time;
        my $path1 = path_for_time($t);
        if ($path1 ne $path0) {
            close($out);
            $path0 = $path1;
            open($out, ">>", $path0);
        }
        print $out $line;
        $out->flush;               # note below
        # ... perform other processing on $line ...
    }
    All you need to supply is the path_for_time() subroutine.
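    For concreteness, a minimal path_for_time() might look like this (the logs/ directory and the name pattern are assumptions, not part of the post):

```perl
#!/usr/bin/perl
# Hypothetical path_for_time(): one file per hour, named after the
# local date and hour of the supplied epoch time.
use strict;
use warnings;
use POSIX qw(strftime);

sub path_for_time {
    my ($t) = @_;
    return strftime("logs/%Y-%m-%d-%H.log", localtime($t));
}
```

    Any function works as long as it returns the same path for all times within one hour and a different path once the hour changes.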

    The routine readline() can just read from a pipe from tail, as it seems you are doing, or you can use File::Tail to move that functionality into the Perl script itself. Also, I open the output files in append mode and call flush() in case you ever make this script restartable someday.

    Is this what you want to do?

Re: Splitting up a file in hourly increments
by andreas1234567 (Vicar) on Dec 13, 2007 at 09:16 UTC

Node Type: perlquestion [id://656702]
Approved by moritz