PerlMonks  

tail -f multiple nfs files

by Anonymous Monk
on Feb 02, 2002 at 00:13 UTC (#142818=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to tail -f RADIUS accounting files that live on several NFS-mounted servers, then parse each record and insert it into a database. I already have a script that handles one file (I run one instance per file), and for the most part it works. However, at the end of every hour I run a reconciliation report and find that anywhere from 5-30% of the records weren't inserted into the database. I also get a lot of DBI errors on insert saying that a duplicate record already exists, which means the script processed the same line more than once. Any ideas?

Re: tail -f multiple nfs files
by trs80 (Priest) on Feb 02, 2002 at 00:27 UTC
    Some code snippets will really help.
      #!/usr/bin/perl
      use DBI;
      use Time::Local;
      use File::Basename;
      use CGI::Carp qw(cluck);

      ###########################
      # SET UP GLOBALS, ETC ETC
      ###########################
      $/ = "\cM\cJ";    #FILES ARE ON AN NFS MOUNTED WINDOWS SERVER
      my $CURRENT = 'Starting the radius detail thingy--no record yet';    #HOLDS ERROR INFO
      my $DETAIL_FILE = shift || die "Need file name w/ radius recs\n";    #FILE FROM WHICH TO START READING RECs
      my $NEXT_FILE = get_next_file_name($DETAIL_FILE);                    #NAME OF DETAIL AFTER MIDNIGHT
      my $DATABASE_HOST = shift || die "Need host w/ database\n";          #HOST WITH DATABASE
      my $DATABASE = 'database=radius;' . "host=$DATABASE_HOST";
      my $DEBUG = "false";
      my $sleepcount = 0;

      #########################
      # INITIALIZE
      #########################
      #CONNECT TO THE DATABASE
      my $DBH = DBI->connect("dbi:Pg:dbname=radius;host=$DATABASE_HOST", "xxxxxx", "xxxxxx",
          { RaiseError => 1, AutoCommit => 1, InactiveDestroy => 1} )
          or croak "Could not connect to database";
      $count = 0;
      $SQL;
      open(ACCT, "$DETAIL_FILE");
      #SEEK TO EOF
      seek(ACCT, -1024, 2);
      #THROW OUT FIRST LINE (MIGHT BE INCOMPLETE)
      $_ = <ACCT>;

      LOOP: for (;;) {
          for ($curpos = tell(ACCT); $_ = <ACCT>; $curpos = tell(ACCT)) {
              my @rec_array = split(/,/, $_);
              #GET RID OF USELESS TRASH
              foreach $_ (@rec_array){
                  $_ =~ s/"//g;
                  $_ =~ s/\s//g;
              }
              #TEMP TO FIX USERNAMES WITH "Hex:"
              $rec_array[7] =~ s/Hex\:.*//;
              my $ff = substr($rec_array[8], -4);
              eval {
                  if ($rec_array[3] eq "Start") {
                      $SQL = qq/INSERT INTO online VALUES (/.
                          qq/'$rec_array[0] $rec_array[1]',/.    #timestamp
                          qq/'$rec_array[13]',/.                 #acct_session_id
                          qq/'$rec_array[6]',/.                  #nas_ip_address
                          qq/'$rec_array[7]',/.                  #user_name
                          qq/'$rec_array[11]',/.                 #framed_ip_address
                          qq/'$rec_array[9]',/.                  #calling_station_id
                          qq/'$rec_array[8]',/.                  #called_station_id
                          qq/'$ff',/.                            #final_four
                          qq/'$rec_array[23]'/.                  #nas_port
                          qq/);/;
                      #$count++;
                      #print "$SQL\n";
                      $DBH->do($SQL) or croak "Could not Insert: ", caller(), ":", $DBH->errstr, "\n$SQL\n";
                  } elsif ($rec_array[3] eq "Stop") {
                      my $TABLE = "acct".$ff;
                      $SQL = "DELETE FROM online WHERE acct_session_id='$rec_array[13]' AND nas_ip_address='$rec_array[6]'";
                      #print "$SQL\n";
                      $DBH->do($SQL) or croak "Could not delete: ", caller(), ":", $DBH->errstr, "\n";
                      $SQL = qq/INSERT INTO $TABLE VALUES (/.
                          qq/'$rec_array[0] $rec_array[1]',/.    #timestamp
                          qq/'$rec_array[7]',/.                  #user_name
                          qq/'$rec_array[11]',/.                 #framed_ip_address
                          qq/'$rec_array[12]',/.                 #framed_protocol
                          qq/'$rec_array[6]',/.                  #nas_ip_address
                          qq/'$rec_array[13]',/.                 #acct_session_id
                          qq/'$rec_array[20]',/.                 #acct_session_time
                          qq/'$rec_array[21]',/.                 #acct_input_packets
                          qq/'$rec_array[22]',/.                 #acct_output_packets
                          qq/'$rec_array[14]',/.                 #ascend_disconnect_cause
                          qq/'$rec_array[15]',/.                 #ascend_connect_progress
                          qq/'$rec_array[16]',/.                 #ascend_xmit_rate
                          qq/'$rec_array[17]',/.                 #ascend_data_rate
                          qq/'$rec_array[19]',/.                 #ascend_modem_portno
                          qq/'$rec_array[18]',/.                 #ascend_modem_slotno
                          qq/'$rec_array[9]',/.                  #calling_station_id
                          qq/'$rec_array[8]'/.                   #called_station_id
                          qq/);/;
                      #print "$SQL\n";
                      $DBH->do($SQL) or croak "Could not Insert: ", caller(), ":", $DBH->errstr, "\n$SQL\n";
                      #$count++;
                  }
              };
              #TRY TO RECONNECT IF ERROR USING reopen_db()
              if ($@) {
                  print "Eval Error: $@ $SQL";
                  next LOOP;
              }
          }
          if (-e $NEXT_FILE){
              sleep 3; #SLEEP TO ALLOW OS TO SYNC UP
              close(ACCT);
              $DETAIL_FILE = $NEXT_FILE;
              open (ACCT, "$DETAIL_FILE") || die "couldn't open file: $!\n";
              $NEXT_FILE = get_next_file_name($DETAIL_FILE);
              next;
          #NEXT FILE DOESN'T EXISTS, SO SLEEP AND TRY AGAIN
          } else {
              #IF NO CHANGE IN FILE FOR 60 SECS, RE-OPEN FILE
              if ($sleepcount > 60) {
                  #RE-OPEN FILE THAT HAS RADIUS INFORMATION
                  open (ACCT, "$DETAIL_FILE") || die "Couldn't open $DETAIL_FILE: $!\n";
                  #GO ALMOST TO THE END
                  seek(ACCT, -1024, 2);
                  #THROW OUT FIRST LINE (MIGHT BE INCOMPLETE)
                  $_ = <ACCT>;
                  print "Reopened file. Position: ", tell(ACCT) ,".FIRST LINE: $_\n";
                  $sleepcount = 0;
                  next;
              }
              sleep 1;
              $sleepcount++;
              seek(ACCT, $curpos, 0);
          }
      }

      #####################
      # EXIT AND CLEAN UP
      #####################
      $DBH->disconnect();
      close(ACCT);
      #print "Inserted $count";

      ########################
      # get_next_file_name()
      # DETERMINES THE NAME OF THE NEXT FILE TO TAIL FROM WHEN NEW NAME AT MIDNIGHT
      # NAME OF FILE SHOULD BE LIKE `/mnt/rad-1/YYYYMMDD.act'
      ########################
      sub get_next_file_name{
          #FIND OUT WHERE FILE IS MOUNTED
          my $path = shift;
          #PARSE NAME OF FILE
          my ($base, $dir, $ext) = fileparse($path);
          #CONFIRM THAT FILE MATCHES THE NUMBER FORMAT
          $base =~ m/(\d\d\d\d)(\d\d)(\d\d)/ || die "Bad Filename Format: $!\n";
          #GENERATE NAME:
          my ($year, $month, $day) = ($1, $2, $3);
          #FINDS THE EPOCH SECONDS OF TODAY AT NOON
          my $curr_day = timelocal(0, 0, 12, $day, $month - 1, $year - 1900 );
          #USE THAT NUMBER PLUS A DAYS WORTH OF SECONDS (86400) TO GENERATE TOMMORROW'S DATE NUMBERS
          my ($new_year, $new_month, $new_day) = (localtime($curr_day + 86399))[5, 4, 3];
          #print "Opened new file: ".dirname($path).($new_year + 1900).($new_month + 1).$new_day."\n";
          return sprintf("%s/%4.4d%2.2d%2.2d.act", dirname($path) , $new_year + 1900, $new_month + 1, $new_day );
      }
Re: tail -f multiple nfs files
by trs80 (Priest) on Feb 02, 2002 at 01:38 UTC
    I am not sure the for loop you have is doing exactly what it should, and I can't say for certain because I can't run it to reproduce what you are seeing. I did look on PerlMonks and found this. Have you tried it with that module?
    To be honest, I am slightly confused by the opening of a new file during the processing, or at least that is what it looks like to me. That may affect the loop as well.
        if (-e $NEXT_FILE){
            sleep 3; #SLEEP TO ALLOW OS TO SYNC UP
            close(ACCT);
            $DETAIL_FILE = $NEXT_FILE;
            open (ACCT, "$DETAIL_FILE") || die "couldn't open file: $!\n";
            $NEXT_FILE = get_next_file_name($DETAIL_FILE);
            next;
            #NEXT FILE DOESN'T EXISTS, SO SLEEP AND TRY AGAIN
        }
    That is the block that worries/confuses me. I notice that for the first file, which you open before the for loop, you throw away the first line because it may be incomplete. I don't see any code validating the other lines coming into the script. That is, there is no test for the correct number of array elements. Could there be bad lines in the data feed that aren't accounted for?
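The field-count test raised in this reply could look something like the minimal sketch below. The helper name `parse_record` is hypothetical, and the minimum of 24 fields is an assumption derived from the highest index (`$rec_array[23]`) used in the posted script, not from any knowledge of the actual accounting file format.

```perl
use strict;
use warnings;

# Assumption: a usable record has at least 24 comma-separated fields,
# because the posted script reads indices 0 through 23.
my $MIN_FIELDS = 24;

# Hypothetical helper: split one accounting line, clean each field the same
# way the posted script does, and return an array reference only if the
# line looks complete. Returns undef (in scalar context) otherwise.
sub parse_record {
    my ($line) = @_;
    return unless defined $line;

    $line =~ s/\r?\n\z//;                 # strip the line terminator
    my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields

    for my $field (@fields) {
        $field =~ s/"//g;                 # drop quotes
        $field =~ s/\s//g;                # drop whitespace
    }

    return if @fields < $MIN_FIELDS;      # truncated or partial line
    return unless $fields[3] eq 'Start' || $fields[3] eq 'Stop';
    return \@fields;
}
```

Inside the read loop this might be used as `my $rec = parse_record($_) or next;`, so that a short or half-written line is skipped rather than turned into an INSERT.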
      Thanks for the response. I know this code works, and it works well when only one instance (one file) is running. However, when I run two instances against two NFS-mounted servers, I get the record loss. The block you pointed out is a daily rollover feature. When the script hits EOF while tailing, it checks whether a file for the next day exists, and if so starts tailing that one. This way, the script follows logs that rotate daily.
        Move your for loop into a subroutine. Test for a file change and, if the condition is true, call the sub again from inside.
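One possible variation on that suggestion, offered only as a sketch: put the read loop into a subroutine that takes a file name and a byte offset, hands back only complete records, and returns the offset to resume from. It assumes CRLF-terminated records as in the posted script; `read_new_records` and its handler callback are hypothetical names, not part of the original code. Because the caller keeps the offset, re-opening the file does not replay lines that were already handled, and a record that is still being written is left for the next pass.

```perl
use strict;
use warnings;

# Hypothetical helper: read every complete CRLF-terminated record in $path
# starting at byte $offset, pass each one to $handler, and return the
# offset just past the last complete record.
sub read_new_records {
    my ($path, $offset, $handler) = @_;

    open(my $fh, '<', $path) or die "Couldn't open $path: $!\n";
    binmode($fh);                        # keep \r\n intact so offsets are in bytes
    seek($fh, $offset, 0) or die "Couldn't seek in $path: $!\n";

    local $/ = "\cM\cJ";                 # records end in CRLF, as in the posted script
    while (defined(my $line = <$fh>)) {
        # A line without its terminator is still being written:
        # stop here and pick it up on the next call.
        last unless $line =~ /\cM\cJ\z/;
        $offset = tell($fh);
        chomp $line;
        $handler->($line);
    }
    close($fh);
    return $offset;
}
```

A caller could invoke this once per second with the saved offset and, once `$NEXT_FILE` exists, call it one last time on the old file before starting the new file at offset 0, so that records written just before the rollover are not skipped.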
