http://www.perlmonks.org?node_id=1033634

Jackiejarvis has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I'm new in perl and I'm a student in ICT. I need to write some scripts for my internship. Over 3 weeks there is a jury who judge over the internship and they will see my scripts. But it could be that some of the people are experts in perl or scripting. I would like to ask if some of you will check my script and give some suggestions and feedback. The script is to make a dump of all the databases from remote. The scripts works fine normally. Many thanks for the suggestions!

#!/usr/bin/perl # (c) Odeurs Dieter # Script to back-up the databases use strict; use warnings; use Getopt::Long; use Net::Ping; use Time::Local; # Variables my $host; my $engine; my $user; my $password; my $port; my $cmd; my @args = (); my $ping; my $all; my $timestamp; my $filename; my $help; my $error; my $result = ""; my @result; # Read Options GetOptions( "host=s" => \$host, "engine=s" => \$engine, "password=s" => \$password, "user=s" => \$user, "port=s" => \$port, "all" => \$all, "help" => \$help ); if ($help){ showUsage(); } #Creating timestamp my ($sec, $min, $hour, $day, $month, $year, $wday, $yday, $isdst) = gm +time(time); $timestamp = sprintf("%4d-%02d-%02d_%02d-%02d", $year + 1900, $month + + 1, $day, $hour, $min); #------Check options----- # Check if host is defined and alive if (!defined $host){ print "\nNo host specified\n\n"; showUsage(); } else { print "\nChecking if $host is alive.. \n"; $ping = Net::Ping->new(); if ($ping->ping($host)){ print "$host is alive\n\n"; }else { print "$host is unreachable\n\n"; exit; } $ping->close(); } #If a password is set we need to set PGPASSWORD environment variable if ($password && $password ne ''){ if ($engine eq "postgres" ){ $ENV{'PGPASSWORD'} = $password; } else { push(@args, qq{--password=$password}); } } if ($host && $host ne ''){ push(@args, qq{--host $host}); } if ($port && $port ne ''){ push(@args, qq{--port $port}); } if ($user && $user ne ''){ if ($engine eq "postgres" ){ push(@args, qq{--username $user}); } else { push(@args, qq{--user $user}); } } #Check if database engine is defined and valid if (!defined $engine) { print "Please specify the database engine postgres|mysql\n"; showUsage(); } else { # When a engine is defined the trigt sub will be called if ($engine eq "postgres" ){ postgres(); } elsif ($engine eq "mysql" ){ mysql(); } else { print "Unknown database engine\n"; showUsage(); } } #------------------------------------------------ # Postgres #------------------------------------------------ sub postgres{ if (!$all) { #Find al the databases on the specified host print "Checking databases.. \n"; $cmd = "psql -At @args -c 'SELECT datname FROM pg_database'"; #Print the non-standard databases and store them in an array my @databases; @result = `$cmd`; if ($? !=0) { print "Some errors occured..\n\n"; exit; } print "Databases to back-up: \n"; foreach $result (@result) { chomp $result; if ($result !~ "template" && $result !~ "postgres"){ print " - " . $result . "\n"; push(@databases, $result); } } #Back-up each database foreach my $database (@databases){ #Generate a filename for each database $filename = "/home/dieter/dumps/postgres/" . $database . " +_" . $timestamp . ".sql"; $cmd = "pg_dump -C @args $database > $filename"; + print $cmd . "\n" ; @result = `$cmd`; if ($? !=0) { print "Done but with errors\n\n"; exit; }else { print "DONE \n"; } } #-------All database in one file------ } else { $filename = "/home/dieter/dumps/postgres/" . "Fulldump" . "_" +. $timestamp . ".sql"; print "Taking dump of all databases...\n"; $cmd = "pg_dumpall @args > $filename"; @result = `$cmd`; if ($? !=0) { print "Done but with errors\n\n"; } else { print "Dump saved in $filename\n\n"; } } } #----------------------------------------------- # MySQL #----------------------------------------------- sub mysql{ #-------One database in one file------ if (!$all) { #list databases and exclude the default my @databases; print "Checking databases..\n"; $cmd = "mysql @args -Bse 'show databases'"; @result = `$cmd`; if ($? !=0) { print "Some errors occured..\n\n"; exit; } print "Databases to back-up: \n"; foreach $result (@result) { if ($result !~ "information_schema" && $result !~ "mysql") +{ print " - $result \n"; push (@databases, $result); } } push(@args, qq{--single-transaction}); #back-up of the selected databases foreach my $database (@databases){ chomp $database; $filename = "/home/dieter/dumps/mysql/" . $database . "_" +. $timestamp . ".sql"; print "Taking dump of $database ...\n"; $cmd = "mysqldump @args $database > $filename"; @result = `$cmd`; if ($? !=0) { print "Done but with errors\n\n"; exit; }else { print "Dump saved in $filename\n"; } } #-------All database in one file------ } else { $filename = "/home/dieter/dumps/mysql/" . "Fulldump_" . $times +tamp . ".sql"; print "Taking dump of all databases...\n"; $cmd = "mysqldump @args --all-databases > $filename"; @result = `$cmd`; if ($? !=0) { print "Done but with errors\n\n"; exit; }else { print print "Full dump saved in $filename\n"; } } print "Dump completed!\n\n"; } #----------------------------------------------- # Show usage #----------------------------------------------- sub showUsage{ print "\n------------------------------------------\n"; print "-- Usage for the database backup script --\n"; print "------------------------------------------\n"; print "\n--host \n"; print " Specify the host IP\n\n"; print "--engine \n"; print " Specify the database engine. This could be postgres o +r mysql. \n\n"; print "--password \n"; print " Give the password needed to connect with the database + server\n\n"; print "--user \n"; print " Specify user to connect to the database server.\n"; print " The user must have permissions to connect to the data +base(s)\n\n"; print "--port \n"; print " If another port than the standard must be used to con +nect you could \n"; print " specify this with the --port option\n\n"; exit; } exit;

Replies are listed 'Best First'.
Re: Review of my script
by Tux (Canon) on May 15, 2013 at 08:24 UTC

    A small note to start with:

    #------Check options----- # Check if host is defined and alive if (!defined $host){ print "\nNo host specified\n\n"; showUsage(); } else { print "\nChecking if $host is alive.. \n"; $ping = Net::Ping->new(); if ($ping->ping($host)){ print "$host is alive\n\n"; }else { print "$host is unreachable\n\n"; exit; } $ping->close(); }

    That reads awful for several reasons

    • Reverted if branches if (not expr) {} else {} => if (expr) {} else {}
    • showUsage () does an exit which is unclear from the code, as showUsage is at the bottom.

    And you should make clear difference between wanted output (STDOUT) and error/diagnostics (STDERR).

    I'd put showUsage () at the top of the script, which also functions as immediate documentation for someone starting to read the script, making that if/else easier, like:

    sub showUsage { my $err = shift and select STDERR; print "\n------------------------------------------\n"; print "-- Usage for the database backup script --\n"; print "------------------------------------------\n"; print "\n--host \n"; print " Specify the host IP\n\n"; print "--engine \n"; print " Specify the database engine. This could be postgres o +r mysql. \n\n"; print "--password \n"; print " Give the password needed to connect with the database + server\n\n"; print "--user \n"; print " Specify user to connect to the database server.\n"; print " The user must have permissions to connect to the data +base(s)\n\n"; print "--port \n"; print " If another port than the standard must be used to con +nect you could \n"; print " specify this with the --port option\n\n"; exit $err; } GetOptions ( "help|?" => sub { showUsage (0); }, "host=s" => \$host, "engine=s" => \$engine, "password=s" => \$password, "user=s" => \$user, "port=s" => \$port, "all" => \$all, ) or showUsage (1); # no need for a $help option here #------Check options----- # Check if host is defined and alive unless (defined $host) { warn "\nNo host specified\n\n"; showUsage (); } # That exits, so no need for an else print "\nChecking if $host is alive.. \n"; Net::Ping->new ()->ping ($host) or die "$host is unreachable\n"; print "$host is alive\n\n";

    Enjoy, Have FUN! H.Merijn

      Thank you! I appreciate any kind of feedback/suggestion. I would change my code!

      Other feedback is always welcome!

Re: Review of my script
by fisher (Priest) on May 15, 2013 at 08:46 UTC
    Why so many print statements out there? I'd rather do like this:
    sub Usage() { my $err = shift and select STDERR; print <<EOF; ------------------------------------------ -- Usage for the database backup script -- ------------------------------------------ --host Specify the host IP --engine Specify the database engine. This could be postgres or mysql. --password Give the password needed to connect with the database server --user Specify user to connect to the database server. The user must have permissions to connect to the database(s) --port If another port than the standard must be used to connect you co +uld specify this with the --port option EOF exit $err; }
    It is more readable to my eyes.

      Thanks I put that in the script, I had never used that method.

      FWIW, I never liked Usage messages on STDERR, and rarely find programs that do that

      exit code communicates error status well enough

      all printing Usage on STDERROR ever did for me is break paging, break  foo ... | pager

        When you type something like:

        find-junk | xargs rm -f

        ... and find-junk bails out with an error message, you don't want the error message piped to rm -f, do you?

        So a convention has arisen to provide usage instructions on STDERR if the program has been called incorrectly, but on STDOUT if the program has been called with an explicit --help.

        fisher's suggested code does precisely this. It only selects STDERR if $err is a true value.

        package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name

        Error messages should go to STDERR, Log to the log or STDERR if no log wanted/specified, and required/expected output to STDOUT.

        If you read that again, an expected usage message is something you want on STDOUT. e.g. when called with --help. My approach is always:

        sub usage { my $err = shift and select STDERR; print "usage: $0 [--options] ...\n"; exit $err; } # usage use Getopt::Long qw(:config bundling); my $opt_v = 0; GetOptions ( "help|?" => sub { usage (0); } # OK, STDOUT "v|verbose:1" => \$opt_v, "x|xfsgdft!" => \my $opt_x, ) or usage (1); # FAIL: STDERR

        As not all programmers are clean in that distinction, when I want help through a pager, I pipe STDOUT and STDERR together, which - in (t)csh is just a single extra character:

        > some_app --help |& less

        To return to the first point, when some script/app deals with lot of data, you absolutely want that separated! You want to see the analysis, and not store it between your data

        $ perl foo.pl big_file.txt >excerpt.csv Opening file … Analyzing columns … Validating data … Generating CSV … 1 2 3 4 5 Done! $

        Imagine the validity of the generated file if the diagnostics were sent to STDOUT. And imagine the extra time it takes to wade through all the lines to find those messages.


        Enjoy, Have FUN! H.Merijn
Re: Review of my script
by fisher (Priest) on May 15, 2013 at 08:53 UTC
    Your list of variables. Each variable on it's own line, like you did, IMHO make sense only when you documenting it,
    # Variables my $host; # hostname to connect to my $engine; # can be 'foo' or 'bar' my $user; # username for authorization my $password; # password for auth my $port; # tcp port to connect
    but if you have so many variables with self-explanatory names you can do this:
    my ($host, $engine, $user, $password, $port);
Re: Review of my script
by space_monk (Chaplain) on May 15, 2013 at 13:01 UTC

    Firstly congratulations on being a neat and organised developer.

    Apart from the issues mentioned in other comments, if you are developing a module or even a script, consider tacking a POD description at the end of it. Its the easiest way of providing documentation on what your script does and how it works.

    Please try to limit the use of "global" parameters. Your MySQL and Postgres routines should be passed all the data they need to work as parameters instead of picking up globals like @args.yes i know variables declared with my have local scope in the enclosing block, but in this case there is no enclosing block

    You should perhaps have a hash mapping the routine to the engine

    Something like:

    my %engineMap = ( 'mysql' => \&mysql, 'postgres' => \&postgres ); if ($engineMap{$engine}) { $engineMap{$engine}->(@args); } else { print "Unknown database engine\n"; showUsage(); }
    If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)
Re: Review of my script
by TomDLux (Vicar) on May 15, 2013 at 20:10 UTC

    There are a number of details I would nit-pick about, but, after all, you're applying to be an intern, not the company guru. It looks mostly correct, I'll assume you've verified it works, and does what it's supposed to.

    Two things I would suggest, however.

    • use autodie ':all'; then you don't need to check open(), close(), system calls for success or failure. The call will die automatically, with appropriate error message, if anything goes wrong. And if you DO want to trap errors, use autodie, and then use Try::Tiny to implement a try{}catch{} block.
    • I was gonna suggest something about DBI, but it looks like only one actually query is used.

    As Occam said: Entia non sunt multiplicanda praeter necessitatem.

Re: Review of my script
by NetWallah (Canon) on May 16, 2013 at 04:06 UTC
    Great code for an intern ! (++)

    If you want to put some effort into making it more professional looking, I'd suggest data-driven parameter processing. This would minimize repeated code, standardize the look, make it extensible, and many other "good" buzzwords.

    To do that, you would need to come up with a structure like:

    my %params=( host => (TYPE=>"s", REQUIRED=>1, VALUE=>undef, VALIDATE=>sub { # Tis could be a subref to an external, if +it gets too big my $host=shift; print "\nChecking if $host is alive.. \n"; $ping = Net::Ping->new(); if ($ping->ping($host)){ print "$host is alive\n\n"; }else { print "$host is unreachable\n\n"; exit; } $ping->close(); }, ), # End of host engine => (TYPE=>"s", REQUIRED=>1, VALUE=>undef,), user => (TYPE=>"s", REQUIRED=>1, VALUE=>undef), password => (TYPE=>"s", REQUIRED=>1, VALUE=>undef), );
    Then you can process it like this:
    GetOptions( map { "$_=$params{$_}{TYPE}" => \$params{$_}{VALUE} } keys %params ); for (keys %params){ showUsage("Missing required parameter $_") if $params{$_}{REQUIRED} +&& ! $params{$_}{VALUE}; $params{$_}{VALIDATE}->($params{$_}{VALUE}) if $params{$_}{VALIDA +TE}; }

                 "I'm fairly sure if they took porn off the Internet, there'd only be one website left, and it'd be called 'Bring Back the Porn!'"
            -- Dr. Cox, Scrubs

Re: Review of my script ( s/...$year + 1900/POSIX::strftime/)
by Anonymous Monk on May 16, 2013 at 09:08 UTC

    ... sprintf ... $year + 1900 ...

    Don't do that, use POSIX::strftime( )

    use POSIX(); $timestamp = POSIX::strftime('%Y-%m-%d-%H-%M-%S', gmtime );
Re: Review of my script
by graff (Chancellor) on May 17, 2013 at 02:09 UTC
    Just a couple things not mentioned yet by others:

    The string "/home/dieter/dumps" appears at four different places in the script, making the code rather non-portable and a bit harder to maintain. The string should be defined once as a variable, and should probably be among the values that can be manipulated by a command line option. (And this would also entail taking steps to create the "mysql" or "postgres" subdirectories if necessary, reporting suitable error messages when that fails, etc.)

    Since this appears to be intended for a unix/linux system, it might raise concerns about protecting database security. When you run the script, and include the "host", "user" and "password" options on the command line, the values of those options become accessible to anyone else logged in on the same machine where your shell is running. All they have to do is use "ps" or "top" with appropriate options while your script is running, and they'll see the credentials you're using to connect to your database.

    (Those values will also be preserved in the command history file for whatever shell you're using; the history file is typically readable only by the owner, but if your login is compromised, whoever can pretend to be you will also find the database credentials in your command history.)

    Typical ways of avoiding exposure of credentials is to use a config file in the user's home directory (though this might still pose a risk if your login account gets compromised), and/or to use a module like Term::ReadPassword to prompt for the password after the script has started (without echoing user input to the terminal).

    UPDATE: One other thing: Every time you do this:

    $cmd = "... @args ..."; ... @result = `$cmd`;
    you are leaving yourself wide open to unlimited grief, because you are not making any effort to protect the values in @args from being misinterpreted when that backtick command line runs. Imagine what sort of mayhem you'd get if the password string for the database connection included an ampersand or any kind of bracket character...
      Thank you for the suggestion. But I and one other admin is the only one who can access the server where scripts will be running. So i don't need to be worried about the passwords in history. But still a good suggestion for scripts i write in the future.