PerlMonks  

number comparison with a twist

by anotherguest (Novice)
on Mar 02, 2020 at 13:53 UTC ( [id://11113628]=perlquestion )

anotherguest has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Monks!

I have an API that returns prices as floats in textual form and a database that contains prices in integer cents.
None of this can be changed.

I need to compare whether the prices retrieved through the API conform to the ones stored in the database.
Prototype:

#!/usr/bin/perl -w
#
use strict;
use Data::Dumper;

my $num1 = 1990;
my $num2 = '19.90';
#my $num1 = 465;
#my $num2 = '4.65';

$num2 += 0;
$num2 *= 100;

print Dumper($num1);
print Dumper($num2);
if ($num1 == $num2) {
    print "-> equal\n";
} else {
    print "-> different\n";
}

$num2 = int($num2);

print Dumper($num1);
print Dumper($num2);
if ($num1 == $num2) {
    print "-> equal\n";
} else {
    print "-> different\n";
}

This fails due to the way floats are stored in memory:

$VAR1 = 1990;
$VAR1 = '1990';
-> different
$VAR1 = 1990;
$VAR1 = 1989;
-> different

For other values this sort of works but still requires the use of int():

$VAR1 = 465;
$VAR1 = '465';
-> different
$VAR1 = 465;
$VAR1 = 465;
-> equal

What is the correct way of doing this within the given constraints?

Replies are listed 'Best First'.
Re: number comparison with a twist
by pryrt (Abbot) on Mar 02, 2020 at 14:12 UTC
    I am assuming you know What Every Computer Scientist Should Know About Floating-Point Arithmetic, since you say "this fails due to the way floats are stored in memory". On the off-chance you (or a future reader) need a refresher, I have included the link.

    Since the prices are coming as floats-in-a-string, you have it easy: you should just be able to strip out the decimal point. Assuming it's always got exactly two digits following, $price_string =~ s/\.(\d{2})$/$1/;. After this, it will be in string and numeric form as integer cents. parv already posted a regex that just strips the decimal point from anywhere in the string.

    Alternatively, don't use int, which truncates; instead, use a round-to-nearest function, such as round from POSIX.
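
A minimal sketch of the round-to-nearest suggestion above, assuming non-negative prices and a Perl new enough to export round from POSIX (5.22 or later). The helper name price_to_cents is hypothetical and not part of the original post.

```perl
use strict;
use warnings;
use POSIX qw(round);    # assumption: Perl 5.22+, where POSIX exports round()

# Hypothetical helper: convert a price string such as '19.90' to integer
# cents, rounding to the nearest cent instead of truncating with int().
sub price_to_cents {
    my ($price) = @_;
    return round($price * 100);
}

for my $price ('19.90', '4.65', '123.2') {
    printf "%s => %d\n", $price, price_to_cents($price);
}
```

The multiplication still happens in floating point; rounding only hides the tiny representation error, so this does not decide how genuine sub-cent digits should be treated.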

      I had not considered this solution at all. Unfortunately, that also means the prototype I posted was too simplified.

      The API can return the price with a variable number of leading and trailing digits, so 19.900000 is a valid result, as is 123.2. A bit of an extension is therefore needed:

      $price_string .= '0';
      $price_string =~ s,^(\d+)\.(\d{2})0*,${1}${2},;

        > The API can return the price with variable leading and trailing digits, so 19.900000 is a valid result as is 123.2

        In this case my last reply should be perfect. :)

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

Re: number comparison with a twist
by tobyink (Canon) on Mar 02, 2020 at 15:37 UTC

    This should work?

    my $fmt = '%.02f';
    if ( sprintf($fmt, $num1) eq sprintf($fmt, $num2) ) {
        print "they're the same, to two decimal places\n";
    }

    Another common way is to decide upon a maximum allowable difference between them, for example, 0.005, and compare them like this:

    my $diff = 0.005;
    if ( abs($num1-$num2) < $diff ) {
        print "they're close enough\n";
    }
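
Both snippets above compare two numbers in the same unit. A hedged sketch of how the sprintf variant might be applied to the original problem, assuming the database value is integer cents and the API value is a decimal string in whole currency units; the helper name same_price is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the database value (integer cents) and the
# API value (decimal string) agree once both are formatted to two decimals.
sub same_price {
    my ($cents, $api_price) = @_;
    my $fmt = '%.2f';
    # Bring the cents into the same unit as the API price before formatting.
    return sprintf($fmt, $cents / 100) eq sprintf($fmt, $api_price);
}

print same_price(1990, '19.90')    ? "equal\n" : "different\n";
print same_price(465,  '4.650000') ? "equal\n" : "different\n";
print same_price(1990, '19.91')    ? "equal\n" : "different\n";
```

Note that sprintf rounds, so an API value carrying sub-cent digits would be rounded to the nearest cent rather than rejected.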
Re: number comparison with a twist
by hippo (Bishop) on Mar 02, 2020 at 14:08 UTC

    I'd go the other way - converting the int to a string.

    use strict;
    use warnings;
    use Data::Dumper;
    use Test::More tests => 1;

    my $num1 = 1990;
    my $num2 = '19.90';

    $num1 = sprintf ("%.2f", $num1 * 0.01);
    is $num1, $num2;
    diag Dumper ([$num1, $num2]);
Re: number comparison with a twist
by LanX (Saint) on Mar 02, 2020 at 14:13 UTC
    Supposing your float always has exactly two decimal places

    I'd convert the textual float to textual cents by eliminating the dot .

    Like this you'll only deal with integers.

    I might even try to only use eq for comparison then, but this depends on the accuracy of your format (leading zeroes and so on)

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      I like this answer.
      The input is NOT a float, it is a string of decimal digits which could potentially be represented by a binary float.
      I would move the decimal 2 places to the right, 12.=>1200, 12.3=>1230 and see if any non-zero digits remain. If so, there are going to be problems when comparing numbers. I would not convert "12.30" to a float unless you have to.

      This API is giving the OP a decimal character string instead of a much more compact, efficient binary float. I suspect that there are reasons behind that the OP hasn't told us. This string could have come from some BCD (Binary Coded Decimal) calculation or whatever. I have never written any BCD code, but yes there are math operations that modern processors can do on arrays of BCD's (every 4 bits only goes 0-9 instead of 0-F).

Re: number comparison with a twist
by BillKSmith (Monsignor) on Mar 02, 2020 at 15:35 UTC
    It is almost never a good idea to test floating point numbers for equality. We can assume that they are 'equal' if their difference is 'sufficiently' small. The definition of 'sufficient' depends on the application. In your case, .01 cent is probably sufficient. This is far larger than any error introduced by floating point. For your current question, this method would use far more of your existing code than any of the exact methods proposed by other monks.
    Bill
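
A minimal sketch of the tolerance approach described above, assuming the comparison is done in cents and using the 0.01 cent tolerance mentioned in the post as a default. The helper name prices_match and its optional third parameter are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: treat the two prices as equal when they differ by
# less than a tolerance expressed in cents (default: 0.01 cent).
sub prices_match {
    my ($cents, $api_price, $tolerance) = @_;
    $tolerance //= 0.01;
    return abs($cents - $api_price * 100) < $tolerance;
}

print prices_match(1990, '19.90') ? "equal\n" : "different\n";
print prices_match(1990, '19.91') ? "equal\n" : "different\n";
```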
Re: number comparison with a twist
by parv (Parson) on Mar 02, 2020 at 14:00 UTC

    Below does not apply to other example data posted later.


    What about doing the comparison after converting the string to integer AFTER removing the decimal point ($x =~ s,\.,, ; $x = int( $x ) ; $x == $y ? ... : ... ;)? Won't that work?

Re: number comparison with a twist (stringification rounding)
by LanX (Saint) on Mar 02, 2020 at 17:45 UTC
    There is a little hacky way of doing it.

    I'm expanding Veltro's suggestion to use the string representation, which smooths away the rounding errors from floats.

    DB<42> $num1 = 1990; $num2 = '19.90'
    DB<43> p $num1 == ("" . $num2*100)   # number->to_string->to_number
    1
    DB<44>

    This test shows it works reliably even with tenths of a cent

    DB<41> say join "\n", grep { sprintf ("%03d",$_) != ("".($_/1000)*1000) } 0..99999

    DB<42>

    Please note that while this seems dirty, a numeric comparison with == will always do what you want even if the formats are not as expected, such as having leading zeroes or not exactly 2 decimal places.

    DB<44> $num1 = '001990'; $num2 = '19.9'
    DB<45> p $num1 == ("" . $num2*100)
    1
    DB<46> $num2 = '19.900'
    DB<47> p $num1 == ("" . $num2*100)
    1
    DB<60> $num1 = '199e1'
    DB<61> p $num1 == ("" . $num2*100)
    1

    update

    an easier notation might be

    DB<63> $num2*=100
    DB<64> p $num1 == "$num2"
    1
    DB<65>
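
The debugger session above can be sketched as a standalone script. This assumes a Perl built with ordinary double-precision numbers, where converting a number to a string keeps roughly 15 significant digits; the helper name cents_equal is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: scale the API price to cents, force it through its
# string form (which drops the tiny float error), then compare numerically.
sub cents_equal {
    my ($cents, $api_price) = @_;
    my $scaled = $api_price * 100;    # may be slightly off, e.g. just below 1990
    return $cents == "$scaled";       # string form is then turned back into a number
}

print cents_equal(1990,     '19.90')  ? "equal\n" : "different\n";
print cents_equal('001990', '19.900') ? "equal\n" : "different\n";
print cents_equal(1990,     '19.91')  ? "equal\n" : "different\n";
```

Because this relies on how Perl formats numbers as strings, it is better treated as a convenience than as a guarantee for every possible input.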

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: number comparison with a twist
by Veltro (Hermit) on Mar 02, 2020 at 14:44 UTC

    You can just change the == to eq, because $num2 is not a floating point number.

      > Because $num2 is not a floating point

      $num2 is a float with a rounding error already before the multiplication.

      Only the string representation ignores the rounding error, hence eq will work here.

      Is that reliable for all cases? I don't dare say.

      DB<4> $num2 ="19.90"
      DB<5> printf "%.20f", $num2*100
      1989.99999999999980000000
      DB<6> p 1990 == $num2 *100

      DB<7> p 1990 eq $num2 *100
      1
      DB<8> say ">". $num2 *100 ."<"
      >1990<
      DB<9>

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        > $num2 is a float with rounding error after the multiplication.

        I am not so sure about that.

        print "1989.9999999999998" + 0 ; # 1990

        The 'target form' is decimal string.

        From perlnumber Target form: If the source number can be represented in the target form, that representation is used.

        Also: When a numeric value is passed as an argument to such an operator, it will be converted to the format understood by the operator. My assumption is that this applies to + - / * etc. when used on the decimal string, and eventually also to eq.

Re: number comparison with a twist
by Marshall (Canon) on Mar 02, 2020 at 23:39 UTC
    The input number is a string, not a binary float. I would convert that string to "cents" using string operations and use the resulting integer for comparison with the DB. If this input string can describe fractions of a cent, then more thought is needed about how to round to integers (round up, or perhaps "round to even"), and about what that would mean for the overall result.

    #!/usr/bin/perl
    use strict;
    use warnings;

    foreach my $y (qw(19.990 19.9 19. 19.559 19.00 19.34776454540000))
    {
        my $x =$y."00";   # at least 2 digits past the decimal
        $x =~ s/(\d+)(\.)(\d{2})(\d+)?/$1$3/;
        print "result: $y => $x\n";
    }
    __END__
    result: 19.990 => 1999
    result: 19.9 => 1990
    result: 19. => 1900
    result: 19.559 => 1955
    result: 19.00 => 1900
    result: 19.34776454540000 => 1934

      I've added one more step to also catch the API returning an integer. This is now my implementation:

      $from_api .= '.' unless $from_api =~ /\./;
      $from_api .= '00';
      $from_api =~ s,^(\d+)\.(\d{2}).*$,${1}${2},;
      $from_api += 0;

      This catches all corner cases I can think of.

        Truncation is still wrong.

        use warnings;
        use strict;
        use POSIX qw/round/;
        use Test::More tests => 2;

        # somebody used a generating algorithm that used 32-bit single-precision floats, entered 1.13, but thought it was double-precision so printed it into your database as %.15f; so now your api returns '1.129999995231628' for a number intended to be exactly 1.13
        # 1.12999999523162841796875   # exact 32-bit float representation of 1.13
        sub get_from_api { '1.129999995231628' }    # sprintf '%.15f', 1.12999999523162841796875;

        my $from_api = get_from_api();
        print "string from_api = '$from_api' straight from api\n";
        $from_api .= '.' unless $from_api =~ /\./;
        $from_api .= '00';
        $from_api =~ s,^(\d+)\.(\d{2}).*$,${1}${2},;
        print "string from_api = '$from_api' after text manipulation\n";
        $from_api += 0;
        print "bad rounding = ", $from_api, "cents\n";
        is $from_api, 113, "should be 113 cents";

        #### redo, with proper rounding
        $from_api = get_from_api();
        print "string from_api = '$from_api' straight from api\n";
        $from_api .= '.' unless $from_api =~ /\./;
        $from_api .= '00';
        $from_api =~ s,^(\d+)\.(\d{2})(\d*).*$,${1}${2}.${3},;
        print "string from_api = '$from_api' after text manipulation\n";
        $from_api = round($from_api);
        print "good rounding = ", $from_api, "cents\n";
        is $from_api, 113, "should be 113 cents";

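
As an alternative sketch of the same idea, the rounding can also be done without ever converting the fractional part to a float. This assumes non-negative prices and picks round-half-up on the third fractional digit, which is only one possible rounding policy; the helper name string_to_cents is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a non-negative decimal string into integer
# cents using only string and integer operations, rounding half up on the
# third fractional digit. Returns undef (empty list in list context) for
# input that does not look like a plain decimal number.
sub string_to_cents {
    my ($price) = @_;
    my ($whole, $frac) = $price =~ /^(\d+)(?:\.(\d*))?$/ or return;
    $frac = '' unless defined $frac;
    $frac .= '000';                              # pad so three digits always exist
    my $cents = $whole * 100 + substr($frac, 0, 2);
    $cents++ if substr($frac, 2, 1) >= 5;        # round half up
    return $cents;
}

for my $price ('1.129999995231628', '19.9', '19.', '19') {
    printf "%s => %s\n", $price, string_to_cents($price);
}
```

Whether rounding is the right behaviour at all depends on the application; an API value with real sub-cent digits might instead be a reason to flag the price as not matching.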
        BTW, aside from logic issues, I don't know how you came across the idea of using a comma (",") as the separator?
        Yes, this is "legal and allowed", but the difference between a . and a , can be hard to see.

        I would use the default of "/" unless there is a reason not to.
        My second choice would be vertical bar.
        My 3rd choice would be curly braces - almost never.
        I have not ever been tempted to use "," for the separator.

        $from_api =~ s,^(\d+)\.(\d{2}).*$,${1}${2},;
        $from_api =~ s/^(\d+)\.(\d{2}).*$/$1$2/;
        $from_api =~ s|^(\d+)\.(\d{2}).*$|$1$2|;
        $from_api =~ s{^(\d+)\.(\d{2}).*$}{$1$2};
Re: number comparison with a twist
by leszekdubiel (Scribe) on Mar 02, 2020 at 23:22 UTC
    Try
    if (abs($num1 - $num2) < 1e-6) { ... }
Node Type: perlquestion [id://11113628]
Approved by marto
Front-paged by haukex