Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Statistical Graphs with gnuplot and mod_perl

by projekt21 (Friar)
on Sep 11, 2003 at 09:19 UTC ( #290624=sourcecode: print w/ replies, xml ) Need Help??

Category: Web Stuff
Author/Contact Info Alex Pleiner <alex@zeitform.de>
Description:

When my company's sysadmin emailed me some log files from sysstat (http://perso.wanadoo.fr/sebastien.godard/) with the note "Can you make some graphs out of those?", I started to:

  • write a converter from sysstat logs into a gnuplot data file (basically a row to col converter)
  • write a form based web-frame to select files, columns and options, generate queries for a gnuplot backend and display the resulting graphs
  • finally write a gnuplot backend that outputs corresponding graphs

The gnuplot backend is a module currently called Apache::Gnuplot that can be used for nearly any statistical data files in gnuplot format.

The converter and web-frame are not published yet, but you may contact me if you like to have a look at it. See http://alex.zeitform.de/unsorted/saplot.png for a screenshot.

Fellow monks, I would like to read your comments on possible errors or improvements. And please let me know if this is of any interest to put it on CPAN. Thanks.

Update: code cleanup for readability

Update2: added input validation thanks to Abigail-II

# Apache::Gnuplot
# (c) 2003 zeitform Internet Dienste - zeitform@zeitform.de
# This program is free software; you can redistribute it and/or
# modify it under the same terms as Perl itself.
#
# Plots data via gnuplot.
#
# [Note: we currently use it as plotting backend in a system for displ
+aying
# sysstat logs (see: http://perso.wanadoo.fr/sebastien.godard/) after 
+piping
# those logs though a simple row to col converter. It can be used to p
+lot nearly
# any preprocessed data files.]
#
# httpd.conf:
#-----------------------------
# <Location /gnuplot>
#    SetHandler  perl-script
#    PerlHandler Apache::Gnuplot
#    Options +ExecCGI
#    PerlAddVar DataDir       /usr/local/apache/gnuplot/data
#    PerlAddVar Gnuplot       "/usr/local/bin/gnuplot - 2>/dev/null"
#    PerlAddVar IgnoreInvalid 1
#    PerlSendHeader On
# </Location>
#
# Path_Info:
#-----------------------------
# - directory below "DataDir" that contains the data files
#
#
# GET-Parameters:
#----------------------------
# Parameters marked as "multiple" can be given multiple times.
#
# - f (multiple): filename containing the data in the following format
#
#       # format col1-name col2-name ...
#       10:30    0.45      0.56      ...
#       10:40    0.63      0.23      ...
#       ...
#
#       where "format" can be "time" or "date" followed by an optional
+ strftime
#       compliant format string
#       Examples: "time[%H:%M]", time[%y-%m-%d], date[%H%M], etc
#
# - c (multiple) : column name to plot
#
# - ymin  : start of y range
# - ymax  : end of y range
# - scale : scale factor (a scale of 1 generates images of size 640x48
+0)
# - sum   : include calculated sum of multiple columns into the plot
#
# Example URLs
#-----------------------------
#
# http://host.com/gnuplot/subdir?f=datafile1&c=4&c=5&c=6
#   prints the columns 4 to 6 from the data file "datafile1"
#
# http://host.com/gnuplot/subdir?f=datafile1&f=datafile2&c=4&c=5&c=6
#   prints the columns 4 to 6 from the data files "datafile1" and "dat
+afile2"
#
# http://host.com/gnuplot/subdir?f=datafile1&c=4&c=5&c=6&scale=1.5
#   same as example 1 but scales the image by 1.5
#
# http://host.com/gnuplot/subdir?f=datafile1&c=4&c=5&c=6&sum=1
#   same as example 1 but add the sum of the colums into the plot
#

package Apache::Gnuplot;

use strict;
use Apache::Constants qw(:common :response);
use CGI;

use vars qw($VERSION);

$VERSION = "0.4";

## variables (edit if you need to -- and understand them :-)
my $combined_title = "combined"; # title for plot if more than 1 file 
+is plotted
my $sum_title      = "sum";      # title for column sums if plotted
my $n_a_title      = "n/a";      # title for non-existent columns if p
+lotted

############################################################
sub handler
############################################################
  {

    my $r = shift;

    ## get gnuplot path gnuplot
    my $gnuplot   = $r->dir_config('Gnuplot') 
        || "/usr/local/bin/gnuplot - 2>/dev/null";

    ## get data file path and add path_info
    my $data_dir  = $r->dir_config('DataDir') || return SERVER_ERROR;
    my $plot_path = join("/", $data_dir, $r->path_info());
    return NOT_FOUND unless -d $plot_path;

    ## get setting to ignore invalid values
    my $ignore_invalid = $r->dir_config('IgnoreInvalid') || 0;

    ## cgi object
    use CGI;
    my $q = CGI->new;

    ## params (columns and files)
    ## instead of "return BAD_REQUEST" you may "print_empty_gif($r)"
    ## I added input validation thanks to a note from Abigail-II
    ## I guess only @columns and @files were vulnerable.
    my @columns = sort map { $_ =~ s/[^\w.%\/-]//g } grep { $_ } $q->p
+aram("c") or return BAD_REQUEST;
    my @files   = sort map { $_ =~ s/[^\w.-]//g } grep { $_ } $q->para
+m("f") or return BAD_REQUEST;

    ## other options
    my $ymin   = $q->param("ymin");  $ymin += 0;
    my $ymax   = $q->param("ymax");  $ymax += 0;
    my $scale  = $q->param("scale"); $scale += 0;
    my $sum    = $q->param("sum");

    ## read data file for field names from newest data file
    my @plotrule;
    my $time_format = ""; ## this is not very good (see below)

    foreach my $file (@files)
      {
        open DAT, "$plot_path/$file" or return NOT_FOUND;

        ## @col_name is an array containing the column names listed in
+ the 
        ## first row.
        my @col_name = split(/\s+/, <DAT>);

        close DAT;

        ## if the first row does not contain a comment sign, there see
+ms
        ## to be no line with column names, so we create one with dumm
+y
        ## values. The first "column name" can be "#" or "#format". 
        ## We check this and correct it if necessary.
        if ($col_name[0] !~ /^#/)       { $col_name[$_] = "column_$_" 
+foreach (0 .. @col_name); }
        if ($col_name[0] =~ /^#+[^#]+/) { $col_name[0] =~ s/^#+//; uns
+hift @col_name, "#"; }

        ## we can use @col_name to get a column name for the column nu
+mber, to 
        ## get a  column number for a given name we need a hash (rever
+se lookup)
        my %col_number = map { $col_name[$_] => $_ } (0 .. @col_name);

        ## check for time format - this should be done once for all fi
+les, but how?
        ## I mean, what if the formats are mixed in different files?
        if    ($col_name[1] =~ /(time|date)\[(.+)\]/) {  $time_format 
+= $2; }
        elsif ($col_name[1] =~ /(time|date)/)         {  $time_format 
+= "%H:%M"; } ## default

        ## create a plot rule for this files columns
        ## plot all columns but ignore invalid
        if ($ignore_invalid)
          {
            push @plotrule,
              join(",",
                map {
                  sprintf("\"%s\" using 1:%d title \"%s%s\" with lines
+",
                     "$plot_path/$file",              ## path to data 
+file
                     $col_number{$_},                 ## column number
                     (@files > 1 ? "$file-" : ""),    ## filename in l
+egend?
                     $_                               ## column name f
+or legend
                  )
                }
                grep { $col_number{$_} } @columns     ## only existing
+ columns
              );
      }
        ## plot all columns and mark non-existing with "n/a"
        else
          {
             push @plotrule,
               join(",",
                 map {
                   ($col_number{$_}               ## do we have that c
+olumn?
                      ? sprintf("\"%s\" using 1:%d title \"%s%s\" with
+ lines",
                          "$plot_path/$file",           ## path to dat
+a file
                          $col_number{$_},              ## column numb
+er
                          (@files > 1 ? "$file-" : ""), ## filename in
+ legend?
                          $_                            ## column name
                        )
                      : sprintf("0 title \"%s%s\"",
                          (@files > 1 ? "$file-" : ""), ## filename in
+ legend?
                          $n_a_title                    ## column name
+ ("n/a")
                        )
                    )
                  } @columns                            ## all columns
                );

          }

     ## do we plot a sum?
     if ($sum && @columns > 1)
       {
          push @plotrule,
            sprintf("\"%s\" using 1:%s title \"%s%s\" with lines",
                      "$plot_path/$file",              ## path to data
+ file
                        sprintf("(%s)",                  ## sum ($1+$2
++$3...)
                          join("+",
                            map { '$' . $col_number{$_} } #'keep syn-h
+ilight
                            grep { $col_number{$_} } @columns
                          )
                        ),
                        (@files > 1 ? "$file-" : ""),    ## filename i
+n legend 
                        $sum_title                       ## column nam
+e ("sum")
                );
        }
      }

    ## do we have something to plot
    if (@plotrule)
      {
        ## gnuplot config file
        ## see the gnuplot manual if you need to change things here.
        my $plot = "";
        $plot .= "set xdata time\n"                if $time_format;
        $plot .= "set timefmt \"$time_format\"\n"  if $time_format;
        $plot .= "set format x \"$time_format\"\n" if $time_format;
        $plot .= "set key below\n";
        $plot .= "set output\n";
        $plot .= "set terminal png\n";
    
        $plot .= "set size $scale $scale\n"        if $scale;
        $plot .= "set title \"" . 
          (@files > 1 ? $combined_title : join(" ", @files)) . "\"\n";
        $plot .= "plot [:] " . (($ymax - $ymin) ? "[$ymin:$ymax]" : ""
+) . 
          join(",", @plotrule);

        ## output image
        $r->content_type("image/png");
        $r->send_http_header;

        open PLOT, "echo \'$plot\' | $gnuplot |" or return SERVER_ERRO
+R;
        print while sysread(PLOT, $_, 1024);
        close PLOT;
      }
    else
      {
        ## no image
        print_empty_gif($r);
      }

    return OK;

  }

############################################################
sub print_empty_gif
############################################################
  {
    my $r = shift;
    ## print empty 1x1 pixel gif
    $r->content_type("image/gif");
    $r->send_http_header;

    print pack("H*", $_)
      foreach (qw/ 47 49 46 38  39 61 01 00  01 00 80 00  00 ff ff ff
               00 00 00 21  f9 04 01 14  00 00 00 2c  00 00 00 00
               01 00 01 00  00 02 02 44  01 00 3b                  /);
  }

############################################################
1;

Comment on Statistical Graphs with gnuplot and mod_perl
Download Code
Re: Statistical Graphs with gnuplot and mod_perl
by Abigail-II (Bishop) on Sep 11, 2003 at 09:31 UTC
    open PLOT, "echo \'$plot\' | $gnuplot |" or return SERVER_ERROR;

    Considering that $plot partially consists of unfiltered user input, this looks like a very, very bad idea to me. Not to mention that you do the same with open().

    Abigail

      Yes, you are totally right. I will add the necessary input checking.

      Thanks++ for reminding.

      As a partial defense: We run this in a private network, with high restrictions. So only our admin has access to that.

      alex pleiner <alex@zeitform.de>
      zeitform Internet Dienste

Back to Code Catacombs

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://290624]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (4)
As of 2015-07-05 03:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (60 votes), past polls