http://www.perlmonks.org?node_id=1037135


in reply to Re^3: Undefined vs empty string
in thread Undefined vs empty string

Speed very much depends on what is the most likely value to occur in the tests. length for sure is not always faster. I prefer using defined-or:

    $ cat test.pl
    use 5.014;
    use warnings;

    use Benchmark qw(cmpthese);

    foreach $a (undef, "", 0, 1, "x" x 60) {
        say "Testing for a = ", $a // "undef";
        cmpthese (-1, {
            def_eq  => sub { for (0..1000) { ! defined $a || $a eq ""    ? 1 : +0; }},
            def_len => sub { for (0..1000) { !(defined $a && length $a)  ? 1 : +0; }},
            dor_eq  => sub { for (0..1000) { ($a // "") eq ""            ? 1 : +0; }},
            });
        }
    $ perl test.pl
    Testing for a = undef
               Rate  dor_eq  def_eq def_len
    dor_eq   9135/s      --    -21%    -21%
    def_eq  11499/s     26%      --     -1%
    def_len 11605/s     27%      1%      --
    Testing for a =
               Rate def_len  def_eq  dor_eq
    def_len  7110/s      --     -8%    -25%
    def_eq   7729/s      9%      --    -19%
    dor_eq   9489/s     33%     23%      --
    Testing for a = 0
               Rate def_len  def_eq  dor_eq
    def_len  7518/s      --    -12%    -29%
    def_eq   8533/s     14%      --    -19%
    dor_eq  10577/s     41%     24%      --
    Testing for a = 1
               Rate def_len  def_eq  dor_eq
    def_len  7518/s      --    -12%    -29%
    def_eq   8533/s     14%      --    -19%
    dor_eq  10577/s     41%     24%      --
    Testing for a = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
               Rate def_len  def_eq  dor_eq
    def_len  7450/s      --    -13%    -30%
    def_eq   8533/s     15%      --    -20%
    dor_eq  10676/s     43%     25%      --
    $
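
The benchmark only compares speed, so it may help to confirm that the three idioms agree on the answer. A minimal sketch, assuming hypothetical wrapper subs (blank_def_eq, blank_def_len and blank_dor_eq are names made up here, not part of the benchmark above):

```perl
use strict;
use warnings;

# Three ways to ask "is this value undef or the empty string?".
# The sub names are hypothetical; the expressions mirror the benchmark.
sub blank_def_eq  { my ($v) = @_; return (!defined($v) || $v eq "")      ? 1 : 0 }
sub blank_def_len { my ($v) = @_; return (!(defined($v) && length($v))) ? 1 : 0 }
sub blank_dor_eq  { my ($v) = @_; return (($v // "") eq "")              ? 1 : 0 }

for my $v (undef, "", 0, 1, "x" x 60) {
    my $label = defined $v ? "'$v'" : "undef";
    # Print the verdict of each idiom side by side.
    print join(" ", $label, "=>",
        blank_def_eq($v), blank_def_len($v), blank_dor_eq($v)), "\n";
}
```

All three should report 1 for undef and "" and 0 for everything else, including the number 0, which is defined and stringifies to a one-character string.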

Enjoy, Have FUN! H.Merijn

Re^5: Undefined vs empty string
by BrowserUk (Patriarch) on Jun 05, 2013 at 11:16 UTC
    Speed very much depends on what is the most likely value to occur in the tests. length for sure is not always faster.

    Hm. You seem to be implying that the speed of length varies with the length of the string?

    Whilst that would make sense in C, it makes no sense at all with Perl's length, which doesn't need to scan the string to find its length.

    Which makes me a little suspicious of the benchmark.

    Update: Of course; the effect you are seeing is due to stringification of numeric values.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Wrong impression. If you look at the figures, there is no difference between '0', '1' and "xxxx…". I know the speed of length doesn't depend on the length of the PV (unless of course it is overloaded or magic). I was actually kind of hoping that the example showed just that.


      Enjoy, Have FUN! H.Merijn
        I was actually kind of hoping that the example showed just that.

        Here's an even mix of possible values including the empty string which you omitted, and the results are too close to call:

            #! perl -slw
            use strict;
            use Benchmark qw[ cmpthese ];

            our @tests = (
                (undef) x1000, ('') x1000, (chr(0)) x1000,
                (0) x1000, (1) x1000, ('fred') x1000,
            );

            cmpthese -1, {
                a => q[ !(defined && length() ) and 1 for @tests; ],
                b => q[ !defined || $_ eq '' and 1 for @tests; ],
            };

            __END__
            C:\test>junk
                Rate    b    a
            b 1029/s   -- -17%
            a 1239/s  20%   --

            C:\test>junk
                Rate   b   a
            b 1065/s  -- -1%
            a 1079/s  1%  --

        Much of a muchness.

        I guess you could get into varying the proportions of the mix commensurate with your application expectations, but ...
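
Varying the proportions could look something like the following. This is only a sketch: build_mix is a hypothetical helper, and the 10/10/80 split is an invented example, not a measured workload.

```perl
use strict;
use warnings;

# Hypothetical helper: expand [value, count] pairs into a flat test list,
# so the share of each kind of value can be tuned to the application.
sub build_mix {
    my @pairs = @_;
    return map { my ($value, $count) = @$_; ($value) x $count } @pairs;
}

# Assumed example mix: 10% undef, 10% empty string, 80% non-empty strings.
our @tests = build_mix(
    [ undef,  100 ],
    [ '',     100 ],
    [ 'fred', 800 ],
);

print scalar(@tests), " test values\n";
```

The list is declared with our so that string-form benchmark code, like the q[...] snippets above, can see it as a package variable.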



      Hm. You seem to be implying that the speed of length varies with the length of the string?

      It doesn't vary for strings in the UTF8=0 format, but it does vary for strings in the UTF8=1 format. The length is cached (in a magic annotation) once discovered, though.

          >perl -MDevel::Peek -e"utf8::upgrade( $x = "abc" ); Dump($x); length($x); Dump($x);"
          SV = PV(0x7b8d54) at 0x328554
            REFCNT = 1
            FLAGS = (POK,pPOK,UTF8)
            PV = 0x7b9fac "abc"\0 [UTF8 "abc"]
            CUR = 3
            LEN = 12
          SV = PVMG(0x31e8e4) at 0x328554
            REFCNT = 1
            FLAGS = (SMG,POK,pPOK,UTF8)
            IV = 0
            NV = 0
            PV = 0x7b9fac "abc"\0 [UTF8 "abc"]
            CUR = 3
            LEN = 12
            MAGIC = 0x31f17c
              MG_VIRTUAL = &PL_vtbl_utf8
              MG_TYPE = PERL_MAGIC_utf8(w)
              MG_LEN = 3

        How very strange.

        Upgrading to UTF-8 has required inspecting the bytes and converting them where necessary:

            C:\test>perl -MDevel::Peek -e"utf8::upgrade( $x = qq[abc\xee] ); Dump($x);"
            SV = PV(0xea240) at 0x275898
              REFCNT = 1
              FLAGS = (POK,pPOK,UTF8)
              PV = 0xef178 "abc\303\256"\0 [UTF8 "abc\x{ee}"]
              CUR = 5
              LEN = 6

        The original 4 bytes have been converted to 5 (CUR = 5), which implies (to me at least) that it could have recorded the character length at that point rather than having to rediscover it later.
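
The difference between the byte count (CUR) and the character count can also be seen without Devel::Peek. A minimal sketch, assuming the same "abc\xee" string; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $s = "abc\xee";               # 4 characters, stored as 4 bytes (UTF8 flag off)
my $chars_before = length $s;

utf8::upgrade($s);               # same 4 characters, now stored as 5 bytes

my $chars_after = length $s;                    # counts characters
my $bytes_after = do { use bytes; length $s };  # counts bytes in the internal buffer

print "chars before: $chars_before, chars after: $chars_after, ",
      "bytes after: $bytes_after\n";
# prints "chars before: 4, chars after: 4, bytes after: 5"
```

The bytes pragma is used here only to peek at the internal storage size; it is not something to reach for in ordinary string handling.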

        (Also, what shell are you using that allows double quotes embedded within double quotes unescaped?)

