Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

looking for feedback on my first script

by thirtySeven (Acolyte)
on Jan 08, 2021 at 15:48 UTC ( #11126607=perlquestion: print w/replies, xml ) Need Help??

thirtySeven has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I just finished my first perl script. I am having a hard time understanding a lot of the code, but it works so that's great. I am making a btrfs snapshot manager, this script is going to be just one component of it. Anyways I just want to improve my skills and would like some feedback.

#!/usr/bin/env perl use strict; use warnings; use 5.010; =pod =head1 DESCRIPTION This script is used for taking and deleting a single snapshot. Snapshot names look something like this: 'day=01_07_2021,time=15:30'. Exactly six arguments must be given: 1: subvolume to be snapshotted 2: root snapshot directory (typically /.snapshots) 3: the yabsm name of the subvolume being snapshotted (ex: 'root' for ' +/') 4: the timeframe for the snapshot (hourly, daily, etc). 5: the desired date format. Either 'mm/dd/yyyy' or 'dd/mm/yyyy' 6: the number of snapshots the user wants to keep for this subvolume/t +imeframe. This script should not be used by the end user. =cut #################################### # GRAB INPUT PARAMETERS # #################################### my $subvol_to_snapshot_arg = $ARGV[0]; my $snap_root_dir_arg = $ARGV[1]; my $yabsm_subvol_name_arg = $ARGV[2]; my $timeframe_arg = $ARGV[3]; my $date_format_arg = $ARGV[4]; my $snaps_to_keep_arg = $ARGV[5]; #################################### # MAIN # #################################### delete_earliest_snapshot(); take_new_snapshot();
#################################### # SNAPSHOT CREATION # #################################### sub take_new_snapshot { my $snap_name = create_snapshot_name(); system('btrfs subvolume snapshot ' . $subvol_to_snapshot_arg . ' ' . $snap_root_dir_arg . '/' . $yabsm_subvol_name_arg . '/' . $timeframe_arg . '/' . $snap_name); } sub create_snapshot_name { my ($min, $hr, $day, $mon, $yr) = (localtime())[1..5]; $mon++; # month count starts at zero. $yr += 1900; # year represents years since 1900. if ($date_format_arg eq 'mm/dd/yyyy') { return 'day='.pad($mon).'_'.pad($day).'_'.$yr.',time='.pad($hr).':'.pad +($min); } elsif ($date_format_arg eq 'dd/mm/yyyy') { return 'day='.pad($day).'_'.pad($mon).'_'.$yr.',time='.pad($hr).':'.pad +($min); } else { die "$date_format_arg is not a valid date format"; } } sub pad { if ($_[0] < 10) { return '0' . $_[0]; } else { return $_[0]; } } #################################### # SNAPSHOT DELETION # #################################### sub delete_earliest_snapshot { opendir(DIR,"${snap_root_dir_arg}/${yabsm_subvol_name_arg}/$timefram +e_arg"); my @snaps = grep(/^[^\.]/, readdir DIR); # exclude dot files closedir DIR; return if scalar(@snaps) < $snaps_to_keep_arg; my $earliest_snap = earliest_snapshot(\@snaps); system('btrfs subvolume delete ' . $snap_root_dir_arg . '/' . $yabsm_subvol_name_arg . '/' . $timeframe_arg . '/' . $earliest_snap); } sub earliest_snapshot { my @snaps = @{$_[0]}; my $earliest_snap = $snaps[0]; foreach my $snap (@snaps) { $earliest_snap = $snap if snapshot_lt($snap,$earliest_snap); } return $earliest_snap; } sub snapshot_lt { return lexically_lt(snap_name_to_lex_ord_nums($_[0]), snap_name_to_lex_ord_nums($_[1])); } sub lexically_lt { my ($head1, @tail1) = @{$_[0]}; my ($head2, @tail2) = @{$_[1]}; if ($head1 > $head2) { return 0; } elsif ($head1 < $head2) { return 1; } elsif (@tail1 == 0) { return 1; } elsif (@tail2 == 0) { return 0; } else { return lexically_lt(\@tail1,\@tail2); } } sub snap_name_to_lex_ord_nums { my ($yr, $mon, $day, $hr, $min); if ($date_format_arg eq 'mm/dd/yyyy') { ($mon,$day,$yr,$hr,$min) = snap_name_nums($_[0]); return [$yr,$mon,$day,$hr,$min]; } elsif ($date_format_arg eq 'dd/mm/yyyy') { ($day,$mon,$yr,$hr,$min) = snap_name_nums($_[0]); return [$yr,$mon,$day,$hr,$min]; } else { die "$date_format_arg is not a valid date format"; } } sub snap_name_nums { return $_[0] =~ m/([0-9]+)/g; }

Replies are listed 'Best First'.
Re: looking for feedback on my first script
by haukex (Bishop) on Jan 08, 2021 at 16:10 UTC

    My overall first impression is that this is pretty good code - subdivided into subroutines, nicely formatted, etc. It could be compacted / made mode "Perlish" in a couple places, but because TIMTOWTDI and the code is still clean and pretty easy to follow, I think it's fine the way it is. Just a few other issues / comments:

    • Most importantly, you're using the system function with a single argument and with user input (command-line arguments), which is vulnerable to injections, and you're not checking the calls for errors. See Calling External Commands More Safely for alternatives.
    • opendir also needs to be checked for errors, and it's better to use lexical file/dir handles, as in opendir(my $dh, $path) or die "$path: $!";.
    • Instead of your sub pad, you can use sprintf, as in sprintf("%02d", $num).
    • I would suggest you take the time to study Time::Piece and/or DateTime instead of the basic localtime stuff you're doing here, I think you'll be much happier in the long run.
    • You may also want to consider looking into Getopt::Long to pass arguments to the script instead of a plain list of arguments. IMHO it makes for much more readable script invocations.
    • Unless you're certain this will only ever be run on *NIX, you probably want to consider using a module for your filename manipulations. Personally I like Path::Class.
    • Now we're getting very nitpicky, but do you really need to provide two different date formats? For things like this I'd always stick to YYYY-MM-DD.
      I would suggest you take the time to study Time::Piece and/or DateTime instead of the basic localtime stuff you're doing here

      Completely agree. I'd also point out that of the two Time::Piece is always my first port of call as it is small and in core and is good enough for 95% of the date/time work I need to do. By contrast, DateTime is a bit of a monster but it does cater pretty well for the remaining 5% of cases.


      🦛

      thanks for the feedback. I will be taking a lot of this advice. It will take me a few hours to work through everything you have said. I did not consider that it was unsafe to use system with user inputs. I am off to look into the various modules you suggested.
Re: looking for feedback on my first script
by Discipulus (Abbot) on Jan 08, 2021 at 16:09 UTC
    Hello thirtySeven,

    at first glance it seems ok. About arguments of the script: no one rembemer the correct order of them after 2 minutes. See The Dynamic Duo --or-- Holy Getopt::Long, Pod::UsageMan! for more info. I always use the following format:

    unless ( GetOptions ( "example=s" => \$par_example, # string needed # more options a +nd switches.. "help" => \$par_help, # switch )) { my_show_help(); die "Error in command lin +e arguments: $!"} if (defined $par_help){my_show_help();exit;}

    Then you can use File::Spec to a more robust path creation and look for a date module to see if it helps managing date formats.

    Always check the return value of open and opendir like in: .. or die "Unable to open directory [$dir]"

    You can use eval inside a block to mask the scope of $@ and check $^E (last operating system error) too.

    When shelling out with system or whatever you can check it better with Capture::Tiny

    my ($stdout, $stderr, $exit) = capture { system( $cmd, @cmdargs ); };

    L*

    PS my last suggestion is to follow haukex's sugestions :)

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: looking for feedback on my first script
by shmem (Chancellor) on Jan 08, 2021 at 16:19 UTC

    Looks good. If only my first script was so orderly...

    Some things I'd do different:

    my $Usage = "usage: volume root_dir subvol timeframe dateformat keep\n +"; @ARGV == 6 or die $Usage; # we need six arguments my ( $subvol_to_snapshot_arg, # this $snap_root_dir_arg, # assigns all variables $yabsm_subvol_name_arg, # in one go $timeframe_arg, # and leaves room $date_format_arg, # to comment $snaps_to_keep_arg, # what they are about ) = @ARGV;

    That's a matter of style (and lazyness :-)

    I'd use sprintf

    sub create_snapshot_name { my ($min, $hr, $day, $mon, $yr) = (localtime())[1..5]; $mon++; # month count starts at zero. $yr += 1900; # year represents years since 1900. my @fields = $date_format_arg eq 'mm/dd/yyyy' ? ($mon, $day) : $date_format_arg eq 'dd/mm/yyyy' ? ($day,$mon) : die "$date_format_arg is not a valid date format"; sprintf "day=%02d_%02d_%04d,time=%02d:%02d",@fields,$yr,$hr,$min; }
    which obsoletes the pad() subroutine. Expressing the if/else/elsif as a construct of ternaries (COND ? IF_TRUE : IF_FALSE) is, again, a matter of style. The sprintf being the last expression eliminates the need for an explicit return.

    edit:

    Since the documentation is contained in the script, and it states

    This script should not be used by the end user.

    you could stick the following directive into it up front

    exec "pod2man $0" if -t; # display manual page if invoked from termina +l

    which can be overridden redirecting /dev/null into it as STDIN.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
      the usage tip will make those variable interpolations much more readable. I am indeed gonna switch to sprintf.
      A minor quibble with "The sprintf() being the last expression eliminates the need for an explicit return.
      I always put an explicit return except for sort subs. Here the code is going to return the return value of sprintf(), presumably "1" which really doesn't mean anything in the context of this code. Ending with return; statement would return undef. I guess this is a style thing, YMMV.

      BTW, I always go with a YYYY-MM-DD format. For (a) consistency and (b) this produces a natural ASCI sort order that does "what you expect".

        Here the code is going to return the return value of sprintf(), presumably "1" which really doesn't mean anything in the context of this code.

        From the docs (sprintf):

        sprintf FORMAT, LIST
        Returns a string formatted by the usual "printf" conventions of the C library function "sprintf".

        Please elaborate how sprintf might return "1", other than as sprintf "1", as sprintf "%d", 1 or sprintf "%s",1; and how that might be the case given the incriminated statement.

        perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: looking for feedback on my first script
by eyepopslikeamosquito (Bishop) on Jan 09, 2021 at 07:01 UTC

    For cheap thrills, I reviewed your script by going through the principles in On Coding Standards and Code Reviews and making a few notes every time I pulled a face:

    1. Separate user versus maintainer documentation. For a simple script like yours, I'd put the user doco in a usage subroutine (just a big here-doc with an exit 1 at the end) and call this sub if the user enters -h or --help or an invalid command line; suggest the standard Getopt::Long to parse command line arguments ... which should make your commands easy to learn and use because they behave just like system commands. I suggest using simple Perl comments (not POD) for the maintainer documentation.
    2. Handle all errors (e.g. don't ignore error returns). Fail securely. Apart from not handling command line errors (mentioned in the previous point), you are not checking system and opendir for errors (BTW, you should use a lexical variable (not DIR) when calling opendir). For now, just add an or die to the opendir while giving some thought to the next point.
    3. Establish a rational error handling policy and follow it strictly. Just FYI for now. For your script, failing by writing a clear error message to stderr and returning non-zero exit code seems appropriate.
    4. Any unexpected result from a file operation or API call or external command should be logged. This sort of thing is invaluable when supporting a product on customer sites - probably not appropriate for your scripts at the moment.
    5. Include a comment block on every non-trivial method describing its purpose. Just one or two comment lines describing each of your functions seems appropriate.
    6. Minimize the scope of variables, pragmas, etc. The six variables used for the command line arguments are all global. Do they need to be? Keeping state in global variables is a nightmare in large programs. You might like to think about how to reduce the number of global variables.
    7. Short names for short scopes, longer names for longer scopes. You are already doing this quite well in that your six global variables all have long descriptive names.
    8. Design interfaces that are: consistent; easy to use correctly; hard to use incorrectly; easy to read, maintain and extend; clearly documented; appropriate to your audience. You are already doing this quite well - using the standard command line parsing mentioned in the first point will further improve.
    9. Write the test cases before the code. You can probably get away with manual testing for now. To improve your skills for the future, think about how to test your code, especially how you might automate testing (see, for example, Test::More and Effective Automated Testing).

    You are already off to a great start with your first script. Well done!

Re: looking for feedback on my first script
by BillKSmith (Prior) on Jan 08, 2021 at 21:32 UTC

    Congratulations on a great job thirtySeven.

    Your documentation should specify the intended user. If that is another perl script, you also need a 'synopsis' which shows how you expect this file to be used.

    A search of cpan finds a module Filesys::Btrfs. You may be able to use it to avoid the use of system.

    Do you plan on implementing your 'btrfs snapshot manager' in perl? If so, convention suggests that you implement the components as modules (perlmod) rather than programs.

    Bill
Re: looking for feedback on my first script
by karlgoethebier (Abbot) on Jan 08, 2021 at 20:40 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://11126607]
Approved by Discipulus
Front-paged by shmem
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (4)
As of 2021-06-13 15:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    What does the "s" stand for in "perls"? (Whence perls)












    Results (57 votes). Check out past polls.

    Notices?