PerlMonks  

Re^2: Regex result being defined when it shouldn't be(?)

by chenhonkhonk (Acolyte)
on Nov 14, 2017 at 16:18 UTC ( [id://1203396]=note )


in reply to Re: Regex result being defined when it shouldn't be(?)
in thread Regex result being defined when it shouldn't be(?)

I'm not doing something if it is defined, I'm doing it if it's NOT defined.
An annoyance to me (I come from a C background) is a variable failing to be defined does NOT return 0 or a 'FALSE' definition, it returns "". For safety reasons and explicitness, I program in the explicit results of tests, i.e. defined $var eq "" or defined $var ne "". Using simply 'defined $var' and '! defined $var' isn't as clear about what Perl is doing internally.

If I do print "$3" from a match on 'var = 10' I do not get the same as print "". Regex DO NOT return "" on failing to match, they return undef. After further testing, it appears the difference is where the quantifier comes in:
    use strict;
    use warnings;

    my $string = "string";

    if ( $string =~ m/([5]?)string/ ) {
        print "? inside group: $1\n";    # prints fine
    }
    if ( $string =~ m/([5])?string/ ) {
        print "? outside group: $1\n";   # Use of uninitialized value $1 in concatenation (.) or string...
    }
    exit 0;
P.s. the reason I'm doing this manually is because I'm making it as portable as possible and sensible to me. I'm running Perl on Windows 7/8/10, modern Linux, a Debian 2.6.32, etc. Production environment with too many distributions, internal/external network, all that jazz. I already had an issue where a CPAN module I would've liked had some Linux-only make commands.

Replies are listed 'Best First'.
Re^3: Regex result being defined when it shouldn't be(?)
by haukex (Archbishop) on Nov 14, 2017 at 16:40 UTC
    An annoyance to me (I come from a C background) is a variable failing to be defined does NOT return 0 or a 'FALSE' definition, it returns "".

    Actually, that's not exactly what is going on. Perl has a special "false" value that is 0 when used in numeric context and "" in a string context, so in Perl if (boolean) and if (!boolean) are actually "explicit" tests for truth and falsehood for functions that return "true" and "false" values (this applies to just about every builtin, of course there are some rare special cases). Have a look at Truth and Falsehood. Once you get used to this, I hope you'll find if (!defined(...)) (or any of its variants like if (not defined(...)) or unless (defined(...))) more natural. At least personally, I was initially confused when I read if ( defined($x = $1) eq "" ), and I thought you might accidentally be misapplying an idiom like if ( (my $x = $1) eq "foo" ) (which does the assignment and then the comparison).
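
    A minimal sketch of that special false value, assuming only core Perl; the variable names are illustrative and do not come from the original post:

    ```perl
    use strict;
    use warnings;

    my $undefined;                      # never assigned, so it is undef
    my $false = defined($undefined);    # Perl's special false value, not undef

    # The false value is itself defined, so no "uninitialized" warnings below.
    print "defined? ", ( defined($false) ? "yes" : "no" ), "\n";   # yes
    print "as number: ", $false + 0, "\n";                          # 0
    print "as string: [", "$false", "]\n";                          # []
    print "in a test: ", ( $false ? "true" : "false" ), "\n";       # false
    ```

    The same scalar reads as 0 or as "" depending on how it is used, which is why comparing it against one fixed literal tells only half the story.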

    If I do print "$3" from a match on 'var = 10' I do not get the same as print "". Regex DO NOT return "" on failing to match, they return undef. After further testing, it appears the difference is where the quantifier comes in:

    Right, which is why I left your $3, that is (])?, out of my explanation, and explicitly referred to your $1 (([@%\$]?)), which you were asking about :-)
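
    To make the difference between the two groups concrete, here is a hedged sketch. The overall pattern is hypothetical (the full regex from the root node is not reproduced here); only the two groups ([@%\$]?) and (])? are taken from the discussion, and the // operator assumes Perl 5.10 or later:

    ```perl
    use strict;
    use warnings;

    my $line = 'var = 10';
    my ( $sigil, $name, $bracket );

    # Hypothetical pattern: the ? is inside group 1 but outside group 3.
    if ( $line =~ /^([@%\$]?)(\w+)(\])?/ ) {
        ( $sigil, $name, $bracket ) = ( $1, $2, $3 );
    }

    # Group 1 took part in the match and captured an empty string.
    print "sigil:   ", ( defined $sigil   ? "defined" : "undef" ), "\n";   # defined
    # Group 3 never took part in the match, so it is undef.
    print "bracket: ", ( defined $bracket ? "defined" : "undef" ), "\n";   # undef

    # Supplying a default avoids the "uninitialized value" warning.
    my $safe = $bracket // '';
    print "bracket with default: [$safe]\n";                                # []
    ```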

    ... portable ... I already had an issue where a CPAN module I would've liked had some Linux-only make commands.

    According to CPAN Testers, Config::Perl runs on Linux, MSWin32, Cygwin, Darwin (Mac OS X), and various *BSD, and from Perl versions 5.8.1 thru 5.26.1.

    Update 2019-08-17: Updated the link to "Truth and Falsehood".

Re^3: Regex result being defined when it shouldn't be(?)
by pryrt (Abbot) on Nov 14, 2017 at 16:42 UTC

    defined $var or equivalently defined($var) will return the integer 1 (which is a TRUE value; 1 is also TRUE in C, so this shouldn't confuse you) if the variable is defined. It will return a FALSE value (see haukex's answer) if the variable is undefined. You then take that value, either 1 or the FALSE value, and stringify it. The integer 1 stringifies into "1". The FALSE value stringifies into "". If you don't want FALSE to become "", don't stringify. (The eq operator is forcing the stringification on both its arguments.)

    If you really just want a boolean that decides whether the $var is defined or not, just use the truthiness of the result of defined $var -- that is explicitly the boolean test for whether the $var is defined, and the defined $var and !defined $var syntax are explicitly saying "variable is defined" and "variable is not defined". This is similar to C: if you define a function int is_five(int x) { return (x==5); }, then the return values of is_five(var) and !is_five(var) are explicit ways of testing whether or not the variable is 5. From your claim, in C, I would have to write is_five(var)==1 to verify that var is 5, and is_five(var)==0 to verify that var is not 5, which I vehemently disagree with: that notation obfuscates what C is doing rather than clarifying what it's doing internally. Just trust that Perl will do the right thing with boolean expressions in a boolean context, just like you trust that C does the right thing with boolean results in a boolean context.

    If it's the lack of parentheses that is confusing you, then use the parentheses.
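
    On the parentheses question, a small sketch (variable names are illustrative): defined is a named unary operator and binds more tightly than eq, so the two spellings below are parsed the same way.

    ```perl
    use strict;
    use warnings;

    my $var;    # deliberately left undefined

    # Both lines compare the result of defined against "", with or without parentheses.
    my $without_parens = ( defined $var eq "" );
    my $with_parens    = ( defined($var) eq "" );
    print "same parse\n" if $without_parens && $with_parens;

    # The plain boolean forms ask the same question without stringifying anything.
    print "not defined (if !)\n"   if !defined($var);
    print "not defined (unless)\n" unless defined $var;
    ```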

    Aside: Urgh... I did one last refresh before hitting create, and saw that haukex beat me by a minute or two again. :-(. I went to all the trouble of writing this up, so I'll hit create anyway.

    update: I was wrong: defined($var) doesn't return undef or 1; it returns the special value, as haukex said.

    c:> perl -le "print defined($x)//'<undef>'; print defined($x)||'untrue'"

    untrue
    c:>

      defined $var ... will return undef (which is a FALSE value) if the variable is undefined.

      Sorry, that's not quite correct, compare the outputs of the following:

      $ perl -wMstrict -MDevel::Peek -le 'my $x = undef; Dump( $x )'
      $ perl -wMstrict -MDevel::Peek -le 'my $x = !1; Dump( $x )'
      $ perl -wMstrict -MDevel::Peek -le 'my $x = defined(undef); Dump( $x )'

      (Update: Whoops, just saw your update, you saw that yourself)

      The rest of your post is excellent though, so no worries about the duplicated efforts, TIMTOWTDI :-) I especially like the comparison with C, and it makes another important point: comparing a true/false value explicitly is brittle: If defined decided to return 0 as a false value instead, then defined(...) eq "" will break!
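
      A sketch of that brittleness, using a hypothetical predicate is_set() that reports false as 0 instead of "" (the name and behavior are assumptions made for illustration only):

      ```perl
      use strict;
      use warnings;

      # Hypothetical predicate that signals "false" with 0 rather than "".
      sub is_set {
          my ($value) = @_;
          return defined($value) ? 1 : 0;
      }

      my $missing;    # undefined on purpose

      # "0" eq "" is false, so the string comparison gives the wrong answer.
      my $by_string_compare = ( is_set($missing) eq "" ) ? "not set" : "set";
      # The boolean test does not care which false value came back.
      my $by_boolean_test = ( !is_set($missing) ) ? "not set" : "set";

      print "eq \"\" says: $by_string_compare\n";    # set
      print "!     says: $by_boolean_test\n";        # not set
      ```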

      In C and in Perl, the result of a true conditional is 1. EDIT: In Perl, false can be 0 or "" depending on context. Or not. I don't like this. /edit

      When I'm doing checks on a bunch of values, if I used the implicit return from a is_defined() function, I can only have a boolean in response. If I want multiple types of responses I must use an explicit equals. Even in the boolean context, nearly all my conditions have some sort of equality test - even in the case of less than or greater than. To not have that form is an exception when reviewing the code, I have to stop and say "wait, what is the function supposed to be returning? a number? a string? a reference/pointer?".

      You even state that saying is_five()==1 is somehow not intuitive, when that is literally what your code is doing: it is checking the truthfulness of whether that number is five.

        Just a couple of thoughts to try to get you into a Perl mindset :-)

        First, note that there is a conceptual gray area between what is "a function" and "an operator". In C, I could replace all a==b's with a function is_equals(a,b), and in Perl, often functions that accept a single argument are called "operators", see e.g. Named Unary Operators (Update: and Terms and List Operators (Leftward)). That's why I look at "return values" the same, no matter if it's a return value from an operator, or a return value from a function.

        In C and in Perl, the result of a true conditional is 1.

        Not quite, in Perl it could also be "0 but true" (e.g. sysseek), or in theory any other "true" value.
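
        A short sketch of why comparing against 1 is unsafe. The string is assigned directly here rather than obtained from a real sysseek call, which is an assumption made to keep the example self-contained:

        ```perl
        use strict;
        use warnings;

        my $result = "0 but true";    # what sysseek returns for position zero

        # A non-empty string other than "0" is true.
        print "truthy\n" if $result;

        # It numifies to 0, and Perl exempts this string from "isn't numeric" warnings.
        my $as_number = $result + 0;
        print "as number: $as_number\n";                              # 0

        # So a true result can still fail an == 1 check.
        print "equals 1? ", ( $result == 1 ? "yes" : "no" ), "\n";    # no
        ```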

        In Perl, false can be 0 or "" depending on context. Or not. I don't like this.

        Look at it this way: If in C some function that returns an int is documented to return "a true value" on success, then that does not mean it will return 1 on success, it will return a nonzero value. Just in Perl, the same abstraction also applies if a function is documented to return "a false value".

        You even state that saying is_five()==1 is somehow not intuitive

        Do you think that instead of if ( a==5 ), I should write if ( (a==5)==1 )? But then I have to write if ( ((a==5)==1)==1 ), and if ( (((a==5)==1)==1)==1 ), and ... ;-)

        Plus, if a function is documented to return "a negative integer" on failure, like many C functions, you don't check that with somefunc()==-1 either.

        I have to stop and say "wait, what is the function supposed to be returning? a number? a string? a reference/pointer?"

        That's what the documentation is for... but if you wanted to be explicit in Perl, you could use if ( !(...) ) and if ( !!(...) ) (although the latter is of course redundant) - just don't use eq or == to check boolean values.
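
        As an illustration of !!(...), here is a minimal sketch with an arbitrary list of sample values (the list is chosen for the example, not taken from the thread). It reduces each value to a plain true or false without any eq or == comparison:

        ```perl
        use strict;
        use warnings;

        my @values = ( 42, "hello", "0 but true", 0, "", undef );
        my @results;

        for my $value (@values) {
            my $label = defined $value ? "'$value'" : 'undef';
            # !! collapses any value to Perl's canonical true or false.
            push @results, ( !!$value ? "true" : "false" );
            print "$label => $results[-1]\n";
        }
        ```

        The first three values come out true and the last three false, whatever their type.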
