PerlMonks  

Ordered Comparison of Generalized Version Strings.

by John M. Dlugosz (Monsignor)
on Jul 26, 2001 at 01:51 UTC (#99825, snippet)
Description: A common version number contains sequences of digits, sequences of letters, and punctuation marks.

The string is viewed as a sequence of components. The components are compared with corresponding components, from left-to-right.

A component is one of the following:

  • A series of digits. They are compared as numbers (“11” > “2”).
  • A series of ASCII letters. They are compared as strings, case-insensitive.
  • A series of punctuation marks, loosely defined as everything else. They are compared as match/no-match only. A no-match is considered an exception, and prevents greater/less comparisons from working at all.

The parts are compared from left to right. If the parts match up to the point where one string runs out of parts, the one with parts left over is larger. For example, “1.23.45” < “1.23.45b”. The left string has 5 parts (three series of digits, two series of punctuation marks); the right string has 6 parts.
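To make the component breakdown concrete, here is a minimal sketch that splits a version string with the same pattern the snippet uses and labels each part. The helper name classify_parts and the num/alpha/punct labels are made up for illustration; they are not part of the snippet.

```perl
use strict;
use warnings;

# Same pattern as the snippet's $splitter: digits, ASCII letters, or anything else.
my $splitter = qr/\d+|[A-Za-z]+|[^0-9A-Za-z]+/;

# Hypothetical helper (not in the original post): label each component by type.
sub classify_parts {
    my ($version) = @_;
    my @labeled;
    for my $part ($version =~ /$splitter/g) {
        my $type = $part =~ /^\d+$/       ? 'num'
                 : $part =~ /^[A-Za-z]+$/ ? 'alpha'
                 :                          'punct';
        push @labeled, "$type:$part";
    }
    return @labeled;
}

print join(' ', classify_parts('1.23.45')),  "\n";   # 5 parts
print join(' ', classify_parts('1.23.45b')), "\n";   # 6 parts, the last one is alpha:b
```

The second string has the same first five parts as the first plus a trailing letter part, which is why it is treated as the larger one.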

Strings that don’t work with this algorithm are those that use words, dates, or non-left-to-right ordering of parts.

This was written before Perl provided the "v-string" literal, but it is not limited to just numbers and dots. It handles almost any reasonable version naming system, including other delimiters and letters as well as numbers.

{
my $splitter= qr/\d+|[A-Za-z]+|[^0-9A-Za-z]+/;
sub compare_version_string ($$)
# returns <0, 0, >0 to indicate less, eq, or greater-than.
# dies (exception) if no relation exists.
 {
 my ($left, $right)= @_;
 # split each string into digit, letter, and punctuation components
 my @left= $left =~ /$splitter/g;
 my @right= $right =~ /$splitter/g;
 my $r;
 while (@left && @right) {
    $left= shift @left;
    $right= shift @right;
    if ($left =~ /^\d+$/ && $right =~ /^\d+$/) {
       # compare as numbers
       $r= $left <=> $right;
       return $r  if $r;  # or keep going if zero.
       }
    elsif ($left =~ /^[A-Za-z]+$/ && $right =~ /^[A-Za-z]+$/) {
       # compare as strings, case-insensitive (as the description promises)
       $r= lc($left) cmp lc($right);
       return $r  if $r;  # or keep going if zero.
       }
    elsif ($left =~ /^[^0-9A-Za-z]+$/  &&  $right =~ /^[^0-9A-Za-z]+$/) {
       # delimiter or "other", must match exactly.
       die "version strings are not compatible.\n"  unless $left eq $right;
       }
    else {
       # the parts are not of the same type
       die "version strings are not compatible.\n";
       }
    }
 # one of the strings ran out of parts.
 return scalar(@left) <=> scalar(@right);
 }
}
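Because the function dies when no ordering exists, callers will usually want to trap that. The following is a sketch only: it restates the comparison in condensed form so that it runs on its own (with the case-insensitive letter comparison the description calls for), and adds a hypothetical wrapper, try_compare, that turns the exception into undef.

```perl
use strict;
use warnings;

# Condensed restatement of the snippet, so this example is self-contained.
sub compare_version_string {
    my ($left, $right) = @_;
    my $splitter = qr/\d+|[A-Za-z]+|[^0-9A-Za-z]+/;
    my @left  = $left  =~ /$splitter/g;
    my @right = $right =~ /$splitter/g;
    while (@left && @right) {
        my ($l, $r) = (shift @left, shift @right);
        my $cmp;
        if    ($l =~ /^\d+$/       && $r =~ /^\d+$/)       { $cmp = $l <=> $r }
        elsif ($l =~ /^[A-Za-z]+$/ && $r =~ /^[A-Za-z]+$/) { $cmp = lc($l) cmp lc($r) }
        elsif ($l =~ /^[^0-9A-Za-z]+$/ && $r =~ /^[^0-9A-Za-z]+$/ && $l eq $r) { $cmp = 0 }
        else  { die "version strings are not compatible.\n" }
        return $cmp if $cmp;
    }
    # One side ran out of parts: the side with parts left over is larger.
    return @left <=> @right;
}

# Hypothetical wrapper (not in the original): undef means "no ordering exists".
sub try_compare {
    my ($left, $right) = @_;
    my $result = eval { compare_version_string($left, $right) };
    return $result;    # undef if compare_version_string died
}

for my $pair (['1.2', '1.10'], ['1.23.45', '1.23.45b'], ['1.3.4', '1.3a.2']) {
    my $r = try_compare(@$pair);
    printf "%-8s vs %-8s => %s\n", @$pair, defined $r ? $r : 'incomparable';
}
```

The last pair is reported as incomparable because a "." part lines up against an "a" part.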
Replies are listed 'Best First'.
(tye)Re: Ordered Comparison of Generalized Version Strings.
by tye (Sage) on Jul 26, 2001 at 02:10 UTC

    I prefer a trick I usually use when sorting:

    sub compare_versions { my( $left, $right )= @_; for( $left, $right ) { s/(\d+)/pack "N",$1/ge; } return $left cmp $right; }

            - tye (but my friends call me "Tye")
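A short sketch of why the pack trick works: pack "N" turns each digit run into a fixed-width, big-endian 32-bit integer, so a plain byte-wise cmp orders the numbers numerically. The function below is tye's one-liner spread over several lines; the hex dump and the contrast with a plain cmp are added purely for illustration.

```perl
use strict;
use warnings;

# tye's comparator, reformatted: each digit run becomes a 4-byte big-endian integer.
sub compare_versions {
    my ($left, $right) = @_;
    for ($left, $right) {
        s/(\d+)/pack "N", $1/ge;
    }
    return $left cmp $right;
}

# Show the bytes that replace the digit runs "2" and "10".
for my $n (2, 10) {
    printf "%-2s => %s\n", $n, unpack("H*", pack("N", $n));
}

my $packed = compare_versions('1.2', '1.10');   # numeric-aware comparison
my $plain  = '1.2' cmp '1.10';                  # plain string comparison, for contrast
print "packed: $packed, plain: $plain\n";
```

With the packed form, 2 sorts before 10; the plain string comparison puts "1.10" first because it compares the characters "2" and "1".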
      Cute. I don't expect any numbers larger than 4G in a version string! But, it's not case-insensitive, and doesn't validate that left and right are in the same format (you get a bogus result if they are not).

      I don't know if I can use a trick like this to simplify it and still keep the error check. The check pretty much demands that each part be inspected so it can be categorized.

        Well, it is trivial to make mine case-insensitive. And I think mine makes better sense. Your routine considers 1.3 < 1.3a < 1.3a.2 but refuses to compare 1.3.4 to 1.3a.2. That seems inconsistent to me. I think that mine agrees with yours in all cases where yours will make a comparison.

        Mine works for lots of types of strings but not strings that contain control characters, negative numbers, digit strings larger than 0.5G, numbers with embedded commas, numbers in scientific notation, fractions, or numbers with different numbers of digits after the decimal point.

        There are quite a few variations on this "theme" that I've used at different times. If you need to support numbers of around 8000 digits, then you can use this one (that also ignores case):

        sub compare_versions { my( $left, $right )= map lc, @_; for( $left, $right ) { s/(\d+)/pack("n",length$1).$1/ge; } return $left cmp $right; }

                - tye (but my friends call me "Tye")
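The length-prefixed variant can be exercised the same way. This sketch reuses tye's code under the hypothetical name compare_versions_long; the sample strings are invented for illustration. Each digit run is prefixed with its length as a 16-bit big-endian value, so a longer run always sorts after a shorter one and equal-length runs fall back to ordinary string order.

```perl
use strict;
use warnings;

# tye's second comparator, reformatted and renamed for this example.
sub compare_versions_long {
    my ($left, $right) = map lc, @_;
    for ($left, $right) {
        # Prefix every digit run with its length, packed as an unsigned 16-bit value.
        s/(\d+)/pack("n", length $1) . $1/ge;
    }
    return $left cmp $right;
}

my $big    = '9' x 40;          # a 40-digit number
my $bigger = '1' . '0' x 40;    # a 41-digit number

print compare_versions_long('1.2a',   '1.10A'),     "\n";   # -1
print compare_versions_long('1.0RC1', '1.0rc1'),    "\n";   # 0
print compare_versions_long("1.$big", "1.$bigger"), "\n";   # -1
```

One consequence of comparing by length first: digit runs with leading zeros are not normalized, so "1.02" sorts after "1.2" with this variant.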
Re: Ordered Comparison of Generalized Version Strings.
by jmcnamara (Monsignor) on Jul 26, 2001 at 12:37 UTC

      Unfortunately, this module fails for some simple cases:

      use Sort::Versions; print join(" ",sort versions qw(1.2a 1.10a)),"\n";
      produces     1.10a 1.2a

      Note that this matches the behavior described in the module's documentation, but I think it isn't what most people would want.

      I've been considering patching the module to make it more useful...

      [Mostly a repeat of Re: Re: Sorting on Section Numbers (to save clicks).]

              - tye (but my friends call me "Tye")
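For contrast with the Sort::Versions output above, here is a sketch of sorting a list with the pack-based key from earlier in the thread, computing each key once rather than on every comparison. This does not use Sort::Versions; the helper name version_key and the sample list are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a byte-wise sortable key using the pack "N" trick.
sub version_key {
    my ($version) = @_;
    (my $key = lc $version) =~ s/(\d+)/pack "N", $1/ge;
    return $key;
}

my @versions = qw(1.10a 1.2a 1.10 1.9b 1.2);

# Decorate each string with its key, sort on the key, then strip the key again.
my @sorted = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ version_key($_), $_ ] } @versions;

print join(' ', @sorted), "\n";   # 1.2 1.2a 1.9b 1.10 1.10a
```

Here 1.2a sorts before 1.10a, which is the ordering the reply argues most people would want.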