Resizing MRTG (RRDTool) logs en-masse

MRTG is a very widely-used and popular tool for graphing data. Most typically, it's used for graphing bandwidth utilisation, but it can be (and is) used to graph just about anything.

When used together with RRDTool, MRTG will by default create rrd files giving approximately 2 days of 5 minute data, 1 week of 30 minute data, 2 months of 2 hour data and 2 years of one day data.

At my $workplace, we decided that we needed to change these defaults. Fortunately, recent versions of MRTG provide a suite of RRDRowCount configuration options for this purpose (they take effect when new rrd files are created), and RRDTool provides a resize command for resizing existing rrds.
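
For anyone working out their own row counts, the arithmetic is just the retention period divided by the time each row covers. Here is a minimal sketch, assuming MRTG's usual 300 second base step; `rows_for` is a made-up helper name, not part of MRTG or RRDTool. It reproduces the figures used in the %wanted table of the script further down.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the base step is 300 seconds (a 5 minute polling interval)
my $STEP = 300;

# rows_for() is a hypothetical helper: the number of rows an archive needs
# to hold $days of data when each row consolidates $pdp_per_row base steps
sub rows_for {
    my ($days, $pdp_per_row) = @_;
    return int($days * 86400 / ($STEP * $pdp_per_row));
}

# [days of retention, pdp_per_row]
for my $spec ([30, 1], [365, 6], [1095, 24], [3650, 288]) {
    my ($days, $pdp) = @$spec;
    printf "%-3d => %d rows (%d days)\n", $pdp, rows_for($days, $pdp), $days;
}
```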

My issue was that we had more than 2000 existing rrds, and some of them had already been resized at an earlier stage. So I needed to iterate through the lot, examine them one by one, and resize accordingly.

Perl to the rescue :-)
The script below ran through the whole lot in just a few minutes.

Disclaimer: If you decide to use the script below on your own rrds, I strongly recommend a dry run on a copy of your files first. You have been warned :-p
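
One way to set up such a dry run is to copy the rrd files into a scratch directory and point $logsdir at the copy. This is only a sketch: `copy_rrds` and the directory names in the comment are hypothetical, and it uses nothing beyond the core File::Copy and File::Path modules.

```perl
use strict;
use warnings;
use File::Copy qw/copy/;
use File::Path qw/make_path/;

# copy_rrds() is a hypothetical helper: copy every *.rrd file from $src
# into $dst (created if needed) and return the number of files copied
sub copy_rrds {
    my ($src, $dst) = @_;
    make_path($dst);
    opendir(my $dh, $src) or die "Cannot open $src: $!\n";
    my @rrds = grep { /\.rrd\z/ && -f "$src/$_" } readdir $dh;
    closedir $dh;
    for my $rrd (@rrds) {
        copy("$src/$rrd", "$dst/$rrd") or die "Cannot copy $rrd: $!\n";
    }
    return scalar @rrds;
}

# Example (directory names are assumptions):
#   my $copied = copy_rrds('logs', 'logs-dryrun');
#   then set $logsdir = 'logs-dryrun' in the resize script
```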

Update: Made a couple of small changes as per comments from jwkrahn.

Update 2012-01-29: Now on GitHub.

Cheers,
Darren

#!/usr/bin/perl
use strict;
use warnings;
use RRDs;
use Time::HiRes qw/time/;

my $rrdtool = '/usr/bin/rrdtool';
my $logsdir = 'logs';  # Where all the rrd files live

# The number of archives (RRAs) in each rrd file, which the loop below
# refers to as data sources. Typically, for mrtg-generated rrds this will be 8
my $datasources = 8;

my %wanted = (
    1   => 8640,    # 30 days of 5 minute data
    6   => 17520,   # 365 days of 30 min data
    24  => 13140,   # 3 years of 2 hour data
    288 => 3650,    # 10 years of 1 day data
    );

opendir(DIR, $logsdir) or die "Cannot open $logsdir:$!\n";
my @rrds = grep { /\.rrd\z/ && -f "$logsdir/$_" } readdir DIR;
closedir DIR;
my $numfiles = scalar @rrds;
print "Starting, found $numfiles rrd files\n\n";
my $start = time;

for my $rrd (sort @rrds) {
    print "\nProcessing $rrd\n";
    my $info = RRDs::info "$logsdir/$rrd";
    # Check to ensure we actually have a valid rrd file
    unless ($info->{filename}) {
        print qq|"$logsdir/$rrd" doesn't appear to be a valid rrd log, skipping\n|;
        next;
    }
    for (0 .. $datasources -1) {
        my $cmd = qq|$rrdtool resize $logsdir/$rrd |;
        my $pdp = $info->{"rra[$_].pdp_per_row"};
        my $rows = $info->{"rra[$_].rows"};
        my $cf = $info->{"rra[$_].cf"};
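        # Skip archives whose resolution has no entry in %wanted
        next unless defined $pdp && defined $wanted{$pdp};
        my $diff = $rows - $wanted{$pdp};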
        printf("\tCurrent DS => PDP per row:%.f Rows:%.f CF:%s\n", $pdp, $rows, $cf);
        if ($diff < 0) {
            $diff = abs($diff);
            $cmd .= qq|$_ GROW $diff|;
        }
        elsif ($diff > 0) {
            $cmd .= qq|$_ SHRINK $diff|;
        }
        else {
            print "\tNo change to this DS\n\n";
            next;
        }

        print "\tResizing to $wanted{$pdp} rows, executing $cmd\n";
        system($cmd) == 0 or die "Could not execute $cmd:$!\n";
        print "\tRenaming resized file\n";
        rename 'resize.rrd', "$logsdir/$rrd"
            or die "Could not rename resize.rrd:$!\n";
        print "\tDone.\n";
    }
}

my $end = time;
my $dur = sprintf("%.2f", $end - $start);
print "Finished, processed $numfiles files in $dur seconds\n\n";
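
The GROW/SHRINK decision in the loop above can also be written as a pure function, which makes it easy to check without touching any files. This is a rough sketch only: `plan_resize` is a hypothetical name, and the hash it takes is assumed to be shaped like the one RRDs::info returns (keys such as rra[0].rows and rra[0].pdp_per_row). It finds the archive indices from the keys rather than counting to a fixed number.

```perl
use strict;
use warnings;

# plan_resize() is a hypothetical helper. $info is a hash reference shaped
# like the result of RRDs::info; $wanted maps pdp_per_row to the desired
# row count. Returns a list of [rra index, 'GROW' or 'SHRINK', row delta].
sub plan_resize {
    my ($info, $wanted) = @_;
    my @plan;
    # Discover the archive indices from the keys
    my @rras = sort { $a <=> $b }
               map  { /^rra\[(\d+)\]\.rows$/ ? $1 : () } keys %$info;
    for my $rra (@rras) {
        my $pdp  = $info->{"rra[$rra].pdp_per_row"};
        my $rows = $info->{"rra[$rra].rows"};
        next unless defined $pdp && defined $wanted->{$pdp};  # no target
        my $diff = $wanted->{$pdp} - $rows;
        next if $diff == 0;                                   # right size
        push @plan, [ $rra, ($diff > 0 ? 'GROW' : 'SHRINK'), abs($diff) ];
    }
    return @plan;
}

# Made-up sample data, just to show the call
my %sample = (
    'rra[0].pdp_per_row' => 1, 'rra[0].rows' => 600,
    'rra[1].pdp_per_row' => 6, 'rra[1].rows' => 17520,
);
print join(' ', @$_), "\n" for plan_resize(\%sample, { 1 => 8640, 6 => 17520 });
```

Each returned triple maps directly onto the arguments of the rrdtool resize command used in the script.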

Replies are listed 'Best First'.
Re: Resizing MRTG (RRDTool) logs en-masse by jwkrahn (Abbot) on Nov 29, 2010 at 23:19 UTC
`my @rrds = grep { /.rrd$/ && -f "$logsdir/$_" } readdir DIR;`

Why match zero or more characters in `/.rrd$/` when `/rrd$/` would match the same thing with less work? Perhaps you meant `/\.rrd$/`?

`$cmd = qq|$mv resize.rrd $logsdir/$rrd|;`
`print "\tRenaming resized file, executing $cmd\n";`
`system($cmd) == 0 or die "Could not execute $cmd:$!\n";`

Why not just use Perl's built-in rename function?

`print "\tRenaming resized file, executing $cmd\n";`
`rename 'resize.rrd', "$logsdir/$rrd" or die "Could not rename resize.rrd:$!\n";`
Re^2: Resizing MRTG (RRDTool) logs en-masse by McDarren (Abbot) on Nov 30, 2010 at 01:33 UTC
Thanks for the feedback, both valid points. heh... I completely forgot about the rename function ;-)
Re^3: Resizing MRTG (RRDTool) logs en-masse by jwkrahn (Abbot) on Nov 30, 2010 at 02:23 UTC
Also, `/.rrd$/` will match both of the strings `"rrd"` and `"rrd\n"` so perhaps you should use `/.rrd\z/` instead. Or perhaps even: `'rrd' eq substr( $_, -3 )`
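
A quick way to see how these anchors differ is to run a few patterns against sample names. The sketch below is illustrative only: `ends_in_rrd` and the file names are made up, and the third pattern combines the escaped dot with `\z`, which is an assumption about where the two suggestions lead rather than something quoted from the thread.

```perl
use strict;
use warnings;

# ends_in_rrd() is a hypothetical helper that reports, for one file name,
# which of three candidate patterns match it
sub ends_in_rrd {
    my ($name) = @_;
    return (
        any_char => ($name =~ /.rrd$/   ? 1 : 0),  # "." is any one character
        dot      => ($name =~ /\.rrd$/  ? 1 : 0),  # literal dot; "$" tolerates a trailing newline
        strict   => ($name =~ /\.rrd\z/ ? 1 : 0),  # literal dot; "\z" is the true end of string
    );
}

for my $name ('router.rrd', 'routerrrd', "router.rrd\n") {
    my %m = ends_in_rrd($name);
    (my $shown = $name) =~ s/\n/\\n/;   # make the newline visible
    print "$shown: any_char=$m{any_char} dot=$m{dot} strict=$m{strict}\n";
}
```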
Re^4: Resizing MRTG (RRDTool) logs en-masse by McDarren (Abbot) on Nov 30, 2010 at 04:03 UTC
Re: Resizing MRTG (RRDTool) logs en-masse by droid385902 (Initiate) on Jan 27, 2012 at 21:07 UTC
If you're interested, I've tweaked the code so that it moves all of the options to the command-line, and adds a couple of extra features:

- All DS values are examined (it's not hard-coded to 8)
- rows can be re-mapped on a per-DS basis
- rows can be re-mapped on a per-PDP basis
- the rrd files to be edited are now specified on the command-line, instead of on a per-directory basis (works better with tools like find/xargs)

Attached is a "diff -u" against the current code:

--- rrdtoolresize.pl.orig 2012-01-27 13:22:08.448124100 -0700
+++ rrdtoolresize.pl 2012-01-27 14:06:09.298551700 -0700
@@ -1,66 +1,175 @@
 #!/usr/bin/perl
+#
+my $USAGE = "#
+# Usage: $0 [ -v verbosity ] [ -f ] \
+          [ -R rranum:rows[;rranum:rows]* | -P pdp:rows[;pdp:rows]* ] RRDs(s)
+# Where:
+#   -v verbosity   Specify the verbosity level (default = 10)
+#   -f             Fake (dry) run (assumed if -R not specified)
+#   -R X:Y[;X:Y]*  Resize rra X to have Y rows
+#   -M X:Y[;X:Y]*  Remap every RRA with X pdps to Y rows
+#
+# File(s)
+#   These are RRD files that need to be 're-shaped'
+#
+# If neither -M nor -P is specified, then info about the RRD
+# will be printed
+# If both -M and -R are specified, then -R takes precedence
+#
+";
+#
+
 use strict;
 use warnings;
+
 use RRDs;
+use Getopt::Std;
+use Data::Dumper;
 use Time::HiRes qw/time/;

-my $rrdtool = '/usr/bin/rrdtool';
-my $logsdir = 'logs';  # Where all the rrd files live
+my $rrdtool = $ENV{'RRDTOOL'} || 'rrdtool';
+
+my %opt;
+getopts('fP:R:v:', \%opt) || die $USAGE;
+
+my $verbosity = $opt{'v'} || 10;
+my $rrastr = $opt{'R'};
+my $pdpstr = $opt{'P'};
+my $dryrun = (defined $opt{'f'}) ||
+    ((!defined $rrastr) && (!defined $pdpstr)) ? 1 : 0;
+my $dumpinfo = ($verbosity >= 20) ||
+    (!defined $pdpstr && !defined $rrastr) ? 1 : 0;
+
+my %rramap;
+if (defined $rrastr) {
+    my @rows = split(/\s*;\s*/, $rrastr);
+    foreach my $redo (@rows) {
+        my @info = split(/\s*:\s*/, $redo);
+        if ($#info != 1) {
+            die "Bad rra resize specification ($redo) in -R $rrastr! Died";
+        }
+        elsif ($info[1] < 1) {
+            die "Invalid rra row count ($info[1]) in -R $rrastr! Died";
+        }
+        elsif ($info[0] !~ /^\d+$/) {
+            die "Invalid rra number ($info[0]) in -R $rrastr! Died";
+        }
+        else {
+            $rramap{$info[0]} = int($info[1]);
+        }
+    }
+}
+
+my %pdpmap;
+if (defined $pdpstr) {
+    my @rows = split(/\s*;\s*/, $pdpstr);
+    foreach my $redo (@rows) {
+        my @info = split(/\s*:\s*/, $redo);
+        if ($#info != 1) {
+            die "Bad rra resize specification ($redo) in -P $pdpstr! Died";
+        }
+        elsif ($info[1] < 1) {
+            die "Invalid rra pdp count ($info[1]) in -P $pdpstr! Died";
+        }
+        elsif ($info[0] !~ /^\d+$/) {
+            die "Invalid rra number ($info[0]) in -P $pdpstr! Died";
+        }
+        else {
+            $pdpmap{$info[0]} = int($info[1]);
+        }
+    }
+}

-# The number of data sources in each rrd file
-# Typically, for mtrg-generated rrds this will be 8
-my $datasources = 8;
-
-my %wanted = (
-    1   => 8640,    # 30 days of 5 minute data
-    6   => 17520,   # 365 days of 30 min data
-    24  => 13140,   # 3 years of 2 hour data
-    288 => 3650,    # 10 years of 1 day data
-    );
-
-opendir(DIR, $logsdir) or die "Cannot open $logsdir:$!\n";
-my @rrds = grep { /.rrd$/ && -f "$logsdir/$_" } readdir DIR;
-closedir DIR;
-my $numfiles = scalar @rrds;
-print "Starting, found $numfiles rrd files\n\n";
+
+my @rrds = @ARGV;
 my $start = time;
+my $numfiles = 0;

 for my $rrd (sort @rrds) {
     print "\nProcessing $rrd\n";
-    my $info = RRDs::info "$logsdir/$rrd";
+    my $info = RRDs::info $rrd;
     # Check to ensure we actually have a valid rrd file
-    unless ($info->{filename}) {
-        print qq|"$logsdir/$rrd" doesn't appear to be a valid rrd log, skipping\n|;
+    if ($info->{filename}) {
+        printf "DEBUG: RRD %s info: %s\n",
+            $rrd, join("\n", sort split(/\n/, Dumper($info)))
+            if ($dumpinfo);
+    }
+    else {
+        print "$rrd isn't a valid rrd log, skipping\n";
         next;
     }
-    for (0 .. $datasources -1) {
-        my $cmd = qq|$rrdtool resize $logsdir/$rrd |;
-        my $pdp = $info->{"rra[$_].pdp_per_row"};
-        my $rows = $info->{"rra[$_].rows"};
-        my $cf = $info->{"rra[$_].cf"};
-        my $diff = $rows - $wanted{$pdp};
-        printf("\tCurrent DS => PDP per row:%.f Rows:%.f CF:%s\n", $pdp, $rows, $cf);
+
+    $numfiles++;
+    my @rras = sort map { substr($_, 4, index($_, ']', 4)-4) }
+        grep { /rra\[\d+\].pdp_per_row/ } keys %{$info};
+
+    ## Debug:
+    # printf "Found:\n %s\n", join("\n ",
+    #     grep { /rra\[\d+\].pdp_per_row/ } keys %{$info});
+    # printf "RRAs: %s\n", join(" ", @rras);
+
+    foreach my $rra (sort { $a <=> $b } @rras) {
+        my $cmd = qq|$rrdtool resize $rrd |;
+        my $rows = $info->{"rra[$rra].rows"};
+        my $cf = $info->{"rra[$rra].cf"};
+        my $pdp = $info->{"rra[$rra].pdp_per_row"};
+        printf "\tDS %s => PDP per row:%.f Rows:%.f CF:%s\n",
+            $rra, $pdp, $rows, $cf;
+        my $wanted = (defined $rramap{$rra}) ? $rramap{$rra} :
+            (defined $pdpmap{$pdp}) ? $pdpmap{$pdp} : -1;
+        if ($wanted <= 0)
+        {
+            printf "DEBUG: Skipping RRA %s (no map found)\n", $rra
+                if ($verbosity >= 15);
+            next;
+        }
+
+        my $diff = $rows - $wanted;
         if ($diff < 0) {
             $diff = abs($diff);
-            $cmd .= qq|$_ GROW $diff|;
+            $cmd .= qq|$rra GROW $diff|;
         }
         elsif ($diff > 0) {
-            $cmd .= qq|$_ SHRINK $diff|;
+            $cmd .= qq|$rra SHRINK $diff|;
         }
         else {
-            print "\tNo change to this DS\n\n";
+            print "\tNo change to DS $rra\n\n";
             next;
         }

-        print "\tResizing to $wanted{$pdp} rows, executing $cmd\n";
-        system($cmd) == 0 or die "Could not execute $cmd:$!\n";
-        print "\tRenaming resized file\n";
-        rename 'resize.rrd', "$logsdir/$rrd";
-        print "\tDone.\n";
+        print "\tResizing to $wanted rows, executing $cmd\n";
+        if (!$dryrun) {
+            system($cmd) == 0 or die "\tCould not execute $cmd: $!";
+            print "\tRenaming resized file\n";
+
+            # We jump through a number of hoops because the RRD may not
+            # be in the current directory (but the created "resize.rrd"
+            # IS in the current directory!)
+            unlink $rrd.'.bk';
+
+            # Do this in case one of the steps below fails
+            rename $rrd, $rrd.'.bk' ||
+                die "\tUnable to move the old $rrd way! Stopping";
+
+            if (!link('resize.rrd', $rrd)) {
+                print "\tNOTICE: link(resize.rrd, $rrd) failed.".
+                    " Trying 'mv' instead!\n";
+                if (system("mv resize.rrd $rrd")) {
+                    # Try to put the original RRD back
+                    rename $rrd.'.bk', $rrd;
+                    die "\tFailed to link/move resize.rrd to $rrd! Died";
+                }
+            }
+            else {
+                unlink 'resize.rrd';
+                unlink $rrd.'.bk';
+            }
+            print "\tDone.\n";
+        }
     }
 }

 my $end = time;
-my $dur = sprintf("%.2f", $end - $start);
+my $dur = sprintf('%.2f', $end - $start);
 print "Finished, processed $numfiles files in $dur seconds\n\n";
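
The X:Y[;X:Y]* format that the -R and -P options in the patch accept can be parsed in one small routine. What follows is a sketch rather than the patch's own code: `parse_map` is a hypothetical name, and it assumes pairs are separated by `;` with optional surrounding whitespace.

```perl
use strict;
use warnings;

# parse_map() is a hypothetical helper: turn a specification such as
# "1:8640;6:17520" into a hash of key => row count, dying on bad entries
sub parse_map {
    my ($spec) = @_;
    my %map;
    for my $pair (split /\s*;\s*/, $spec) {
        my @field = split /\s*:\s*/, $pair;
        die "Bad specification ($pair) in $spec\n"
            unless @field == 2
                && $field[0] =~ /^\d+\z/
                && $field[1] =~ /^\d+\z/
                && $field[1] >= 1;
        $map{ $field[0] } = $field[1];
    }
    return %map;
}

my %pdpmap = parse_map('1:8640;6:17520;24:13140;288:3650');
print "$_ => $pdpmap{$_}\n" for sort { $a <=> $b } keys %pdpmap;
```

Keeping the parsing in one function means the same checks apply to both options, instead of being repeated per option.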
Re^2: Resizing MRTG (RRDTool) logs en-masse by McDarren (Abbot) on Jan 29, 2012 at 06:52 UTC
Hey, thanks for that :-) I've applied your patch and thrown this on GitHub. Cheers, Darren

