Perl: the Markov chain saw PerlMonks

### Numification of strings

by LanX (Bishop) on Aug 02, 2010 at 00:35 UTC
LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

What exactly is the application/motivation/benefit of the way Perl numifies strings?

```
  DB<9> $x="abc"

  DB<10> p 5 +$x
5
  DB<11> $x="2abc3"

  DB<12> p 5 +$x
7
```

BTW: This behaviour seems only to be documented for unary - in perlop

```
  DB<18> $x="ab2c3"

  DB<19> p -$x
-ab2c3
  DB<20> $x="6ab2c3"

  DB<21> p -$x
-6
```

Isn't this behaviour mainly hiding ugly bugs?

Cheers Rolf

Replies are listed 'Best First'.
Re: Numification of strings
by ww (Archbishop) on Aug 02, 2010 at 01:24 UTC

All your examples except the first involve mixed alphanumerics, which, when numified via automatic conversion, are converted by discarding "(t)railing non-number stuff and leading whitespace"¹ (and use warnings will call that to your attention) while treating the leading numerals (after the whitespace is discarded) as a number.

What's more, if $x = "LanX" (no numerals, as in your first line, "DB<9>"), numification will treat $x as 0, whether you add, subtract, multiply or perform any other arithmetic operation...

```
perl -e "$x="LanX"; print ($x * 3);"
0
```

I don't know enough about perlguts to be sure morgon's supposition about atoi is correct, but it certainly sounds plausible. On the other hand, I can't see how a well-defined and well-documented behaviour is "hiding ugly bugs."
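
The rule quoted above (leading whitespace skipped, a leading numeric prefix used, trailing junk discarded, 0 when there is no numeric prefix) can be seen in a minimal sketch. The helper name `numify` is made up for illustration, and the `numeric` warning category is switched off on purpose so that only the values show:

```perl
use strict;
use warnings;

# Hypothetical helper: force numeric context and return the result.
# The 'numeric' warnings are disabled locally so the
# "isn't numeric" message does not clutter the output.
sub numify {
    my ($str) = @_;
    no warnings 'numeric';
    return $str + 0;
}

for my $s ('abc', '2abc3', '  42xyz', '3.5kg', 'LanX') {
    printf "%-10s => %s\n", "'$s'", numify($s);
}
# 'abc'      => 0
# '2abc3'    => 2
# '  42xyz'  => 42
# '3.5kg'    => 3.5
# 'LanX'     => 0
```

With warnings left on, each of these conversions would also report the offending argument.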

1   Learning Perl, 3rd Ed (paper), Schwartz & Phoenix, p26.

> Learning Perl, 3rd Ed (paper), Schwartz & Phoenix, p26.

No offense against Merlyn or O'Reilly ... but this should be clearly documented in the perldocs for all affected operators, and not only in the section for "unary -".

And yes, I missed the warnings ... I was recently playing around with the shells of 4 other dynamic languages, and the Perl debugger surprised me by not showing warnings by default.

Cheers Rolf

Actually, there's a fairly explicit explanation in perldoc perlnumber:

```
Arithmetic operators
The binary operators "+" "-" "*" "/" "%" "==" "!=" ">" "<" ">=" "<="
and the unary operators "-" "abs" and "--" will attempt to convert
arguments to integers. If both conversions are possible without loss
of precision, and the operation can be performed without loss of
precision then the integer result is used. Otherwise arguments are
converted to floating point format and the floating point result is
used. The caching of conversions (as described above) means that the
integer conversion does not throw away fractional parts on floating
point numbers.

++  "++" behaves as the other operators above, except that if it is a
string matching the format "/^[a-zA-Z]*[0-9]*\z/" the string
increment described in perlop is used.

Arithmetic operators during "use integer"
In scopes where "use integer;" is in force, nearly all the operators
listed above will force their argument(s) into integer format, and
return an integer result. The exceptions, "abs", "++" and "--", do
not change their behavior with "use integer;"
```

Note the exception for "++" (which several of us, esp. /me, forgot to mention) and the additional exceptions under use integer.
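
As a small sketch of the "use integer" paragraph in the perlnumber excerpt, the two subs below divide the same operands with and without the pragma in force. The sub names `float_div` and `int_div` are hypothetical; the pragma is lexically scoped, so it only affects the block it appears in:

```perl
use strict;
use warnings;

# Hypothetical helper: ordinary division, floating point when needed.
sub float_div {
    my ($x, $y) = @_;
    return $x / $y;
}

# Hypothetical helper: same division, but with "use integer" in force
# for the rest of this sub body.
sub int_div {
    my ($x, $y) = @_;
    use integer;
    return $x / $y;
}

print float_div("7", "2"), "\n";   # 3.5
print int_div("7", "2"), "\n";     # 3
```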

BTW, there's some related matter (much less clear, to me, anyway) in perlvar... and perhaps (probably?) in other docs? How say you, Monks?

Re: Numification of strings
by ikegami (Pope) on Aug 02, 2010 at 03:36 UTC
When you ask for the numeric value of something that isn't a number, it does the best it can and gives a warning.

> ... and gives a warning.

ah, thank you!

I missed that there are warnings while testing in the debugger¹!

However, this gives me the possibility of changing $SIG{__WARN__} to deal with this kind of error ... e.g. by dying.
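
A minimal sketch of that idea, assuming the only warning to be promoted is the "isn't numeric" one. The helper name `strict_num` is hypothetical; the handler is installed with `local`, so it applies only while the sub runs:

```perl
use strict;
use warnings;

# Hypothetical helper: numify a string, but turn the "isn't numeric"
# warning into an exception through a local __WARN__ handler.
sub strict_num {
    my ($str) = @_;
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        die $msg if $msg =~ /isn't numeric/;
        warn $msg;    # pass any other warning through unchanged
    };
    return $str + 0;
}

for my $s ('42', '6gg') {
    my $n = eval { strict_num($s) };
    print defined $n ? "'$s' => $n\n" : "'$s' rejected: $@";
}
```

An alternative that needs no handler is `use warnings FATAL => 'numeric';` in the scope where the conversion happens.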

Cheers Rolf

1)... hmm but why?

UPDATE:

OK starting the debugger with -w switch helps.

```
:~$ perl -w -de0
...
  DB<1>  $x="6gg"

  DB<2> p -$x
Argument "6gg" isn't numeric in negation (-) at (eval 6)[/usr/share/perl/5.10/perl5db.pl:638] line 2.
...
```

Re: Numification of strings
by morgon (Curate) on Aug 02, 2010 at 00:49 UTC
I think the conversion from a string to a numeric value is simply done with atoi.

So any application/motivation/benefit would be that of the people that invented the C-library.

The atoi() function is a truly ancient thing. People were hunting dinner with spears when this thing first appeared, ...way before C. In C if you "abuse" this, there aren't any warnings. It is the user's responsibility to check the input values before calling atoi(). Some cases and normal ways to check for this are shown below.

```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *digits = "0123456789";

    /* case a) atoi() silently stops at the first non-digit */
    char x[] = "2abc";
    int result = atoi(x);
    printf("case a) string %s is %d as decimal\n", x, result);

    /* case b) reject empty strings and strings containing non-digits */
    if (strlen(x) == 0 || strspn(x, digits) != strlen(x))
        printf("case b) %s is NOT a simple positive integer!\n"
               "        there are either no digits or non-digits\n", x);

    /* case c) an empty string converts to 0 without complaint */
    char y[] = "";
    int result_y = atoi(y);
    printf("case c) a null string results in %d\n", result_y);
    return 0;
}
/* prints:
case a) string 2abc is 2 as decimal
case b) 2abc is NOT a simple positive integer!
        there are either no digits or non-digits
case c) a null string results in 0
*/
```
Perl is more generous and gives warnings if they are enabled. This will give a warning:
```
#!/usr/bin/perl -w
use strict;

my $a = "23B";
$a += 0;
print $a;
```
You can check for yourself whether the string contains a \D character or is the null string. Otherwise Perl will do the "best that it can".
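
Such a check could look like the sketch below, which mirrors the C test with strspn() shown earlier. The helper name `is_simple_uint` is made up for illustration, and it deliberately accepts only plain ASCII digits (no sign, no decimal point):

```perl
use strict;
use warnings;

# Hypothetical helper: true only for a non-empty string made up
# entirely of ASCII digits, i.e. no \D anywhere and not the null string.
sub is_simple_uint {
    my ($str) = @_;
    return defined $str && $str =~ /\A[0-9]+\z/ ? 1 : 0;
}

for my $s ('23', '23B', '', '00012') {
    printf "%-8s %s\n", "'$s'", is_simple_uint($s) ? 'ok' : 'rejected';
}
# '23'     ok
# '23B'    rejected
# ''       rejected
# '00012'  ok
```

For a broader notion of "numeric" (signs, decimals, exponents), `Scalar::Util::looks_like_number` is the usual choice.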

Also of note is that in Perl, everything is a string; a string that looks like a number is still a string until used in a numeric context. Consider this:

```
my $x = "00012";
print "$x\n";
$x += 0;
print "$x\n";

##prints
#00012
#12
```
This is a "trick" that I sometimes use to delete leading zeroes.

> in Perl, everything is a string until used in a numeric context

I don't think that is true.

In your example you put quotation-marks around the value, so little surprise that you end up with a string...

Consider this:

```
use Devel::Peek;

my $a = "1";
my $b = 2;

print Dump($a);
print Dump($b);
```
This produces:

```
SV = PV(0x98c9700) at 0x98ea3f8
REFCNT = 1
PV = 0x98e6368 "1"\0
CUR = 1
LEN = 4
SV = IV(0x98ea474) at 0x98ea478
REFCNT = 1
IV = 2
```
As you can see, $a is a string ("PV"), while $b is an int ("IV"), even though it was never used in a numeric context.
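
The same distinction can be checked without reading a full dump. The following is only a sketch: it uses the core B module to look at a scalar's public flags, and the helper name `flags_of` is hypothetical. It also shows that a string scalar picks up an integer slot once it has been used numerically:

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which public "OK" flags a scalar carries.
sub flags_of {
    my ($ref) = @_;
    my $f = B::svref_2object($ref)->FLAGS;
    my @names;
    push @names, 'IOK' if $f & B::SVf_IOK();   # holds a valid integer
    push @names, 'NOK' if $f & B::SVf_NOK();   # holds a valid float
    push @names, 'POK' if $f & B::SVf_POK();   # holds a valid string
    return join ',', @names;
}

my $s = "1";
my $n = 2;
print flags_of(\$s), "\n";   # POK
print flags_of(\$n), "\n";   # IOK

my $sum = $s + 0;            # numeric use caches the integer value in $s
print flags_of(\$s), "\n";   # IOK,POK
```
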
Re: Numification of strings
by BrowserUk (Pope) on Aug 02, 2010 at 05:39 UTC
> Isn't this behaviour mainly hiding ugly bugs?

What would you have Perl do?

Unlike Perl and PHP, neither Python nor Ruby lets you convert strings implicitly to numbers like this; both demand that you convert them explicitly (by calling int() in Python and the to_i() method in Ruby). Curiously, Ruby's to_i() method, a la Perl and C's atoi(), happily converts "2abc" to the integer 2, while Python's stricter int() function instead raises a run-time error. I know about all this from playing too much multi-language golf. :-) In golf, these explicit conversions are certainly a chronic pest. In normal programming, however, I don't feel strongly -- though I suspect demanding explicit conversion may help the programmer avoid some data conversion boo-boos.

What's implicit about using numeric operators on strings? If you use string operators on strings, you get string behavior. If you use numeric operators (which you must do explicitly), you get numeric behavior. What isn't explicit about that?

> Curiously, Ruby's to_i() method, a la Perl

Not that surprising, Ruby is semantically a Perl clone with new syntax and Smalltalk OOP.

It's a kind of newspeak... (I have a friend who discovered the flip-flop operator just after learning Ruby ... after working in a Perl project for 2 years. :)

The only semantic resemblance to python I could find so far is the way new variables are bound and scoped.

Cheers Rolf

Re: Numification of strings
by biohisham (Priest) on Aug 02, 2010 at 06:10 UTC
While a variable is a string, if it has neither been assigned a value nor numified, its value will be ''. I am not sure about this, but I think '', undef and 0 can be interchangeable depending on the context the null scalar is used in (i.e. when a null scalar is numified, its value becomes 0).

> Isn't this behaviour mainly hiding ugly bugs?

This behavior seems consistent and indeed of use. For instance, consider quickly wanting to start a counter:

```
C:\Documents and Settings>perl -
print "$hash{count} \n";
print -$hash{count},"\n";
print --$hash{count},"\n";
print ++$hash{count},"\n";
__END__

0
-1
0
C:\Documents and Settings>
```

Update: did s/' '/''/, thanks Jenda for noticing the unintended typo as '' and ' ' are not the same thing. Added technical corrections too..

Excellence is an Endeavor of Persistence. A Year-Old Monk :D .

No, it will not be ' ', it will be ''. That's not the same thing.

And actually, if a variable has not been assigned, then its value is undef, which is not a string; its stringified value is '' and its numified value is 0. There is no such thing as an "undefined string" or a "string variable"! It is a scalar variable whose value may be undef and/or string and/or integer and/or float and/or reference.
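
To make the difference between undef, '' and 0 concrete, here is a small sketch. The helper name `describe` is made up, and the `uninitialized` and `numeric` warnings are silenced on purpose so that only the results are printed:

```perl
use strict;
use warnings;

# Hypothetical helper: return (defined or undef, string form, numeric form).
sub describe {
    my ($value) = @_;
    no warnings qw(uninitialized numeric);   # keep the demo output quiet
    return (defined $value ? 'defined' : 'undef', "$value", $value + 0);
}

my $never_assigned;
for my $case ([ 'unassigned', $never_assigned ], [ 'empty string', '' ], [ 'zero', 0 ]) {
    my ($label, $value) = @$case;
    my ($state, $str, $num) = describe($value);
    print "$label: $state, string form '$str', numeric form $num\n";
}
# unassigned: undef, string form '', numeric form 0
# empty string: defined, string form '', numeric form 0
# zero: defined, string form '0', numeric form 0
```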

Jenda
Enoch was right!
Enjoy the last years of Rome.

well always? :)

```
dualvar NUM, STRING
    Returns a scalar that has the value NUM in a numeric context and
    the value STRING in a string context.

    $foo = dualvar 10, "Hello";
    $num = $foo + 2;                    # 12
    $str = $foo . " world";             # Hello world
```

Cheers Rolf

>This behavior seems consistent and indeed in/of use, for instance, consider when quickly wanting to start a counter:

Sure that's useful, but it's only about converting undef or (maybe sometimes) the empty string ''.

Cheers Rolf

"but it's only about converting undef"

Lest you or anyone else read that as generally true, I repeat, for emphasis re warnings (or for brevity here, -w) and re 'consistent':

```
perl -we "$x="LanX"; print ($x * 3);"
Argument "LanX" isn't numeric in multiplication (*) at -e line 1.
0
perl -we "$x="LanX"; print ($x % 3);"
Argument "LanX" isn't numeric in modulus (%) at -e line 1.
0
perl -we "$x="LanX"; print (3^$x);"
Argument "LanX" isn't numeric in bitwise xor (^) at -e line 1.
3
perl -we "$x="LanX"; print (3**$x);"
Argument "LanX" isn't numeric in exponentiation (**) at -e line 1.
1
```

but

```
perl -we "$x='7LanX'; print (3^$x);"
Argument "7LanX" isn't numeric in bitwise xor (^) at -e line 1.
4
perl -we "$x='7LanX'; print ($x*3);"
Argument "7LanX" isn't numeric in multiplication (*) at -e line 1.
21
perl -we "$x='7LanX'; print (3**$x);"
Argument "7LanX" isn't numeric in exponentiation (**) at -e line 1.
2187
```

and where it's useful -- using a string as a number because it's being used in an operation which is arithmetic:

```
perl -we "$x=\"5\"; print ($x*3);"  # 2
15
perl -we "$x='5'; print ($x*3);"
15
```

2   Interesting (or not enough coffee yet?): doze balked at doublequotes without escapes here, but passed 'em cheerfully above. WTF?

wow incrementing with ++ is not orthogonal to +1!

```
  DB<33> my $y;$y++

  DB<34> my $y;$y+1
Use of uninitialized value $y in addition (+) at (eval 38)[/usr/share/perl/5.10/perl5db.pl:638] line 2.
```

from perlop

```
The auto-increment operator has a little extra builtin magic to it.

...(special magic for incrementing strings, i.e. ++($x='x9') eq 'y0')...

"undef" is always treated as numeric, and in particular is changed to 0
before incrementing (so that a post-increment of an undef value will
return 0 rather than "undef").
```

Cheers Rolf
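
The three behaviours of ++ mentioned in this thread (undef treated as 0, string increment for values matching /^[a-zA-Z]*[0-9]*\z/, plain numification otherwise) can be put side by side in a short sketch. The helper name `incr` is hypothetical, and the `numeric` warning is silenced only so that the '6gg' case prints cleanly:

```perl
use strict;
use warnings;

# Hypothetical helper: apply ++ to a copy of the argument and return it.
sub incr {
    my ($value) = @_;
    no warnings 'numeric';   # '6gg' would otherwise warn
    $value++;
    return $value;
}

my $count;                    # undef
my $before = $count++;        # no warning; post-increment yields 0
print "$before -> $count\n";  # 0 -> 1

print incr('x9'), "\n";       # y0 (string increment)
print incr('Az'), "\n";       # Ba (string increment)
print incr('6gg'), "\n";      # 7  (pattern does not match, so numeric)
```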

Node Type: perlquestion [id://852379]
Approved by ww
Front-paged by biohisham