PerlMonks  

Inconsistent behavior of Getopt::Std

by likbez (Sexton)
on Aug 16, 2020 at 05:42 UTC ( [id://11120805]=perlquestion )

likbez has asked for the wisdom of the Perl Monks concerning the following question:

Who can propose a modification to Getopt::Std that fixes the following behavior? If you do not supply a value to an option that expects one (like "b:" below), the behavior of this module varies depending on whether the parameter is the last in the line or not. If it is not the last, then the next option or argument is "eaten" as the value, which is clearly undesirable. BTW, it eats the "final options separator" ('--') with the same appetite as the other options.

NOTE: During testing I discovered that this module correctly processes setting the value of an option via repetition of the option letter, like -ddd.

Here is the test program that I used:

[0] # cat parameter_check.pl
use v5.10;
use warnings;
use strict 'subs';
use feature 'state';
use Getopt::Std;

getopts("b:cd:",\%options);

if( exists $options{'b'} ){
   if( $options{'b'} ){
      say "option b $options{'b'}";
   }else{
      say "b is zero length string or equivalent";
   }
}else{
   say 'key b does not exist';
}
say "option c:", $options{'c'};
say "option d: ", $options{'d'};

In short, a test run with the parameters -b -ddd:
parameter_check.pl -b -ddd
produces a non-intuitive result, where -ddd is eaten by -b and becomes the value of option -b. Please note that if the next option is an option without a parameter, as in "-b -c -ddd", then option -c will simply not be set (it will become the value of -b), but option -ddd will be processed correctly, so such an error will probably not be detected :-)

But if the parameter without the value is the last, like in:

parameter_check.pl -ddd -b
the result is as expected: options('b') is set to zero length string.
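
The "eating" cases described above can be reproduced without a shell by localizing @ARGV before each call. This is only a minimal sketch: the helper name parse_with is made up for illustration, and the values in the comments are what the getopts source quoted further down leads one to expect.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper (not part of Getopt::Std): run getopts on a private
# copy of the argument list, return the parsed options and what is left over.
sub parse_with {
    my @args = @_;
    local @ARGV = @args;          # getopts always reads @ARGV
    my %opt;
    getopts('b:cd:', \%opt);
    return (\%opt, [@ARGV]);
}

# -b followed by another option: that option becomes the value of -b.
my ($opt1) = parse_with(qw(-b -ddd));
print "b='$opt1->{b}', d is ", (exists $opt1->{d} ? 'set' : 'not set'), "\n";
# b='-ddd', d is not set

# The "--" separator is swallowed in the same way.
my ($opt2, $rest2) = parse_with(qw(-c -b -- file.txt));
print "b='$opt2->{b}', left over: @$rest2\n";
# b='--', left over: file.txt

# -c is silently lost, while -ddd is still parsed as -d with the value "dd".
my ($opt3) = parse_with(qw(-b -c -ddd));
print "b='$opt3->{b}', c is ", (exists $opt3->{c} ? 'set' : 'not set'),
      ", d='$opt3->{d}'\n";
# b='-c', c is not set, d='dd'
```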

Here are my test runs:

[0] # perl parameter_check.pl -ddd -c -b
b is zero length string or equivalent
option c:1
option d: dd

[0] # perl parameter_check.pl -ddd -b -c
option b: -c
Use of uninitialized value in say at parameter_check.pl line 18.
option c:
option d: dd

[0] # perl parameter_check.pl -ddd -b -c
option b -c
Use of uninitialized value in say at parameter_check.pl line 18.
option c:
option d: dd

[0] # perl parameter_check.pl -ddd -c -b --
option b --
option c:1
option d: dd

[0] # perl parameter_check.pl -c -b -ddd
option b -ddd
option c:1
Use of uninitialized value in say at parameter_check.pl line 19.
option d:

The main loop in this module is just 60 lines of rather compact and terse Perl. I hope that one of the Perl gurus here can easily find an elegant solution to this problem:

sub getopts ($;$) {
    my ($argumentative, $hash) = @_;
    my (@args,$first,$rest,$exit);
    my $errs = 0;
    local $_;
    local @EXPORT;

    @args = split( / */, $argumentative );
    while(@ARGV && ($_ = $ARGV[0]) =~ /^-(.)(.*)/s) {
        ($first,$rest) = ($1,$2);
        if (/^--$/) {    # early exit if --
            shift @ARGV;
            last;
        }
        my $pos = index($argumentative,$first);
        if ($pos >= 0) {
            if (defined($args[$pos+1]) and ($args[$pos+1] eq ':')) {
                shift(@ARGV);
                if ($rest eq '') {
                    ++$errs unless @ARGV;
                    $rest = shift(@ARGV);
                }
                if (ref $hash) {
                    $$hash{$first} = $rest;
                }
                else {
                    ${"opt_$first"} = $rest;
                    push( @EXPORT, "\$opt_$first" );
                }
            }
            else {
                if (ref $hash) {
                    $$hash{$first} = 1;
                }
                else {
                    ${"opt_$first"} = 1;
                    push( @EXPORT, "\$opt_$first" );
                }
                if ($rest eq '') {
                    shift(@ARGV);
                }
                else {
                    $ARGV[0] = "-$rest";
                }
            }
        }
        else {
            if ($first eq '-' and $rest eq 'help') {
                version_mess($argumentative, 'main');
                help_mess($argumentative, 'main');
                try_exit();
                shift(@ARGV);
                next;
            }
            elsif ($first eq '-' and $rest eq 'version') {
                version_mess($argumentative, 'main');
                try_exit();
                shift(@ARGV);
                next;
            }
            warn "Unknown option: $first\n";
            ++$errs;
            if ($rest ne '') {
                $ARGV[0] = "-$rest";
            }
            else {
                shift(@ARGV);
            }
        }

Thank you in advance for any help.

2020-08-16 Athanasius changed <pre> to <c> tags.

Replies are listed 'Best First'.
Re: Inconsistent behavior of Getopt::Std
by haukex (Archbishop) on Aug 16, 2020 at 08:58 UTC
    parameter_check.pl -b -ddd produces non-intuitive result, where -ddd is eaten by -b and becomes the value of option -b.

    This would be a mistake on the part of the user invoking the command. You've told the module that option -b takes a value, so the module is simply doing what you asked: taking the next value in @ARGV as the value of the option. Note that other command-line tools act the same way, e.g. grep -e -i causes -i to be taken as the value for the -e option. This does not look like a bug in the library to me.

    parameter_check.pl -ddd -b the result is as expected: options('b') is set to zero length string.

    This is actually an invalid command: option -b requires a value, and it's not getting one. You're not checking the return value of getopts, which would have told you something is wrong. A correct command line to specify the empty string for the option -b would have been parameter_check.pl -ddd -b ''.
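
As a rough illustration of the return-value check mentioned above (a sketch only; the wrapper name parse_ok is hypothetical and exists just to make the two cases easy to compare):

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical wrapper: parse an argument list and report whether getopts
# considered it valid (getopts returns false if it counted any error).
sub parse_ok {
    my @args = @_;
    local @ARGV = @args;
    my %opt;
    my $ok = getopts('b:cd:', \%opt);
    return ($ok ? 1 : 0, \%opt);
}

# -b is last and has no value: getopts reports failure.
my ($ok_missing, $opt_missing) = parse_ok(qw(-ddd -b));
print "valid: $ok_missing\n";                              # valid: 0

# An explicit empty string is a legitimate value for -b.
my ($ok_empty, $opt_empty) = parse_ok('-ddd', '-b', '');
print "valid: $ok_empty, b is '$opt_empty->{b}'\n";        # valid: 1, b is ''
```

In a real script the same check is usually a one-liner along the lines of getopts('b:cd:', \my %options) or die "usage message\n";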

    the behavior of this module varies depending on if the parameter is the last in the line or not

    Based on the above, this is not the correct conclusion to draw.

    Not all command-line tools agree on a standard way of processing options, so if you want different behavior, you may have to use a different library or implement it yourself - though I wouldn't advise the latter, since you'd just be adding yet another differing way to process options to the mix.

    During testing I discovered that this module correctly processes setting the value of an option via repetition of the option letter , like -ddd.

    Not really, it's simply taking dd as the value of the -d option. Getopt::Long actually handles this as a separate case - otherwise, its behavior is identical to Getopt::Std for the following test cases, plus it gives a helpful error message.
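
For reference, one way the repeated-letter case might be expressed with Getopt::Long is an incremental option ("d+") combined with the "bundling" configuration. This is a sketch under those assumptions, not the test code from the reply; the helper name parse_long is made up.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# "bundling" lets single-letter options be clustered, so -ddd is read as
# -d -d -d. Note that Configure changes the module's global settings.
Getopt::Long::Configure('bundling');

# Hypothetical helper: b takes a string, c is a flag, d is a counter.
sub parse_long {
    my @args = @_;
    my %opt;
    my $ok = GetOptionsFromArray(\@args, \%opt, 'b=s', 'c', 'd+');
    return ($ok ? 1 : 0, \%opt);
}

my ($ok_good, $opt_good) = parse_long(qw(-c -ddd -b value));
print "ok=$ok_good c=$opt_good->{c} d=$opt_good->{d} b=$opt_good->{b}\n";
# ok=1 c=1 d=3 b=value

# A trailing -b without a value is reported on STDERR and the call fails.
my ($ok_bad, $opt_bad) = parse_long(qw(-ddd -b));
print "ok=$ok_bad d=$opt_bad->{d}\n";
# ok=0 d=3
```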

      Not really, it's simply taking dd as the value of the -d option.
      True. The code is pretty convoluted and is also very old. But the key logic can easily be improved by adding fewer than a dozen lines. Legacy code, like the built-in help screen via the --help option that nobody needs, can be removed. The value of the @EXPORT array is unclear and it can probably be removed too (archaeological Perl). All useful functionality can be implemented in 40 lines or less. Here is my variant, which can probably be improved considerably, as the idea of extracting $first and $rest via a regex is open to review: they are split at a fixed position.
      #!/usr/bin/perl
      use v5.10;
         use warnings;
         use strict 'subs';
         use feature 'state';
         # use Getopt::Std;   # not loaded here: getopts() is redefined below
      
         getopts('b:cd:',\%options);
         foreach $opt (keys(%options)) {
            if($options{$opt} eq '' ){
               say "$opt is set to ''";
            }else{
               say "$opt is set to $options{$opt}";
            }         
         }
         exit 0;
      sub getopts
      {
      my ($argumentative,$hash)=@_;
      my (@args,$first,$rest,$pos);
         @args = split( //, $argumentative );
         while(@ARGV && ($_ = $ARGV[0]) =~ /^-(.)(.*)$/s ){
            ($first,$rest) = ($1,$2);
            if (/^--$/) {	# early exit if --
               shift @ARGV;
               last;
            }
            $pos = index($argumentative,$first);
            if( $pos==-1) {
               warn("Undefined option -$first skipped without processing\n");
               shift(@ARGV);
               next;
            }
            if (defined($args[$pos+1]) and ($args[$pos+1] eq ':')) {
               # option with parameters
               if( $rest eq ''){
                  shift(@ARGV); # consume the option itself before inspecting what follows
                  unless( @ARGV ){
                     warn("End of line reached for option -$first which requires argument\n");
                     $$hash{$first}='';
                     last;
                 }
                 if ( $ARGV[0] =~/^-/ ) {
                     warn("Option -$first requires argument\n");
                     $$hash{$first} = '';
                 }else{
                     $$hash{$first}=$ARGV[0];
                     shift(@ARGV); # get next chunk
                 }
               } else {
                  if( ($first x length($rest)) eq $rest ){
                     $$hash{$first} = length($rest)+1;
                  }else{
                     $$hash{$first}=$rest;
                  }
                  shift(@ARGV);
               }
            }else {
               $$hash{$first} = 1; # set the option
               if ($rest eq '') {
                  shift(@ARGV);
               } else {
                  $ARGV[0] = "-$rest"; # there can be other options without arguments after the first
               }
            }
         }
      }
      

      Here are two test runs:

      [0]  # perl  Std2.pl  -c -b -ddd
      Option -b requires argument
      d is set to 3
      c is set to 1
      b is set to ''
       
      [0]  # perl   Std2.pl  -c -ddd -b
      End of line reached for option -b which requires argument
      d is set to 3
      c is set to 1
      b is set to ''
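
A less invasive alternative to replacing getopts might be to keep the stock Getopt::Std and validate its result afterwards. The sketch below is hypothetical (the name checked_getopts is made up) and assumes that no legitimate option value starts with a dash, which would rule out values such as negative numbers. It only detects the problem; it does not recover the swallowed option.

```perl
use strict;
use warnings;
use Getopt::Std ();

# Hypothetical wrapper: run the stock getopts, then flag value-taking
# options whose value is missing or looks like another option.
sub checked_getopts {
    my ($spec, $opt) = @_;
    my $ok = Getopt::Std::getopts($spec, $opt);
    for my $name (sort keys %$opt) {
        next unless $spec =~ /\Q$name\E:/;    # only options declared as "x:"
        my $value = $opt->{$name};
        if (!defined $value) {
            warn "Option -$name requires an argument\n";
            $ok = 0;
        }
        elsif ($value =~ /^-/) {
            warn "Option -$name requires an argument, but got '$value'\n";
            $ok = 0;
        }
    }
    return $ok ? 1 : 0;
}

my %opt;
my $ok;
{
    local @ARGV = qw(-c -b -ddd);
    $ok = checked_getopts('b:cd:', \%opt);
}
print "ok=$ok\n";    # prints ok=0; the warning about -b goes to STDERR
```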
      
        >True. The code is pretty convoluted and is also very old. But the key logic can easily be improved by adding fewer than a dozen lines. Legacy code, like the built-in help screen via the --help option that nobody needs, can be removed.

        Moving the goal post?

        >archaeological Perl

        Do you mean vestigial?

        >is open to review

        Might want to take that to GH. I wouldn't call comments you get here a code review.

Re: Inconsistent behavior of Getopt::Std
by jeffenstein (Hermit) on Aug 16, 2020 at 08:48 UTC

    Getopt::Std is using the conventions traditionally used by Unix commands. For someone who's been using Unix/Linux for a long time, it's consistent with the other (non-GNU) commands' option parsing, with all of its warts.

    Getopt::Long follows the newer GNU conventions and may be closer to what you're expecting.

      It depends on the utility. For example, vi processes options more intelligently:
      +num For the first file the cursor will be positioned on line "num". If "num" is missing, the cursor will be positioned on the last line.

        So, do you want your option processor to handle an option in the format +[num] (I'm guessing that's what you intended to write)?
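
To make the question concrete, handling a leading vi-style +[num] argument could look roughly like the following. This is a sketch only; the helper name take_plus_num and the choice of the string 'last' as a marker for "no number given" are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pull a leading "+[num]" argument off a list.
# Returns the line number, the string 'last' when "+" is given without a
# number, or undef when the first argument is not of that form.
sub take_plus_num {
    my ($args) = @_;
    return undef unless @$args && $args->[0] =~ /^\+(\d*)$/;
    my $num = $1;
    shift @$args;
    return length($num) ? $num + 0 : 'last';
}

my @with_num = ('+12', 'notes.txt');
my $line_num = take_plus_num(\@with_num);
print "line=$line_num files=@with_num\n";     # line=12 files=notes.txt

my @bare_plus = ('+', 'notes.txt');
my $line_last = take_plus_num(\@bare_plus);
print "line=$line_last files=@bare_plus\n";   # line=last files=notes.txt
```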


        Give a man a fish:  <%-{-{-{-<
