http://www.perlmonks.org?node_id=860228


in reply to Re: length() returns wrong result - suspicious magic
in thread length() returns wrong result - suspicious magic

I thought DBI was missing a definition for SQL_WCHAR, but it turns out it is there. The code started out as code someone submitted to me to look into another issue, so I've only modified some parts of it.

Setting $txt = $txt at the end of the loop makes no difference, but setting $txt = '' fixes it. If this confirms your suspicion, could you explain why DBD::ODBC (which wrote to the scalar) might need to call SvSETMAGIC?

Re^3: length() returns wrong result - suspicious magic
by ikegami (Patriarch) on Sep 15, 2010 at 15:35 UTC

    why DBD::ODBC (which wrote to the scalar) might need to call SvSETMAGIC?

    To assign a value to a scalar, there are a few steps to follow.

    • Make sure the scalar has the slot (IV, PV, etc) you need by upgrading the structure if necessary.
    • Place the value in the appropriate slot.
    • Set the flag indicating there's a usable value in the slot you populated.
    • Call SvSETMAGIC to let magic respond to the new value if appropriate (if SMG=1). This will end up calling STORE for tied variables, for example. In this case, I expect that it will clear the precomputed length of the string.

    To obtain a value from a scalar, the same is done in reverse.

    • Call SvGETMAGIC to populate the scalar with a value. (This will end up calling FETCH for tied variables, for example.)
    • Coerce the scalar into the requested type if necessary. (This may require upgrading the scalar.)
    • Return the value in the appropriate slot.

    Some macros and functions do more than one of these steps for you.
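
    The set and get steps above can be observed from plain Perl with a tied scalar, where set magic ends up in STORE and get magic ends up in FETCH. The following is only a minimal sketch: the package name LoggedScalar and its @log array are hypothetical names made up for illustration, and it does not reproduce the DBD::ODBC problem itself, which needs C code that writes to the scalar without calling SvSETMAGIC.

```perl
use strict;
use warnings;

# Hypothetical tied-scalar class (not from the thread) that records
# every magic callback Perl invokes on the variable.
package LoggedScalar;

our @log;

sub TIESCALAR { my ($class) = @_; my $value; return bless \$value, $class }
sub FETCH     { my ($self) = @_; push @log, 'FETCH'; return $$self }
sub STORE     { my ($self, $new) = @_; push @log, 'STORE'; $$self = $new; return }

package main;

tie my $txt, 'LoggedScalar';

$txt = "K\x{e4}se";        # assignment: set magic runs, which calls STORE
my $len = length($txt);    # read: get magic runs, which calls FETCH

print "length: $len\n";
print "callbacks: @LoggedScalar::log\n";
```

    The log should show one STORE followed by one FETCH. Code that writes into the scalar's slots directly from C and skips SvSETMAGIC would never trigger the STORE step, which is the kind of omission being discussed here.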

    Setting $txt = $txt at the end of the loop makes no difference

    No, at the start of the loop. After the fetch, but before you use it.

    There's no get magic on the scalar (GMG=0), so the fact that the set magic wasn't called earlier won't matter when the value is read. But when the value is assigned back to the scalar, the assignment will properly handle the set magic.

    If I'm right, I can provide a cheaper workaround than copying the string (which defeats the purpose of binding).

Re^3: length() returns wrong result - suspicious magic
by ikegami (Patriarch) on Sep 15, 2010 at 15:10 UTC
    No, at the start of the loop. After the fetch, but before you use it.
      while ( $sth_sel->fetch ) {
          $txt = $txt;
          printf "%3u %3u %3u %s [%s] [%s]\n",
              ++$i,
              length($txt),
              bytes::length($txt),
              ( utf8::is_utf8($txt) ? ' utf8' : '!utf8' ),
              $txt,
              $xml;
      }

      made no difference.

        It is indeed a missing SvSETMAGIC.

        use strict;
        use warnings;

        use Devel::Peek qw( Dump );

        use Inline C => <<'__EOI__';

        void buggy_assign(SV* dsv, SV* ssv) {
            SvSetSV_nosteal(dsv, ssv);
            /* Should call SvSetMagicSV_nosteal(dsv, ssv); instead
             * or should call SvSETMAGIC(dsv); afterwards.
             */
        }

        void set_magic(SV* sv) {
            SvSETMAGIC(sv);
        }

        __EOI__

        my $txt_de = "K\N{U+00E4}se";
        my $txt_ru = "\N{U+041C}\N{U+043E}\N{U+0441}\N{U+043A}\N{U+0432}\N{U+0430}";

        # \N should have done this already.
        utf8::upgrade($_) for $txt_de, $txt_ru;

        my $txt;

        buggy_assign($txt, $txt_de);
        print(length($txt), "\n");
        Dump($txt);

        buggy_assign($txt, $txt_ru);
        print(length($txt), "\n");
        Dump($txt);

        set_magic($txt);
        print(length($txt), "\n");
        Dump($txt);

        4
        SV = PVMG(0x81af530) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
          IV = 0
          NV = 0
          PV = 0x8177b98 "K\303\244se"\0 [UTF8 "K\x{e4}se"]
          CUR = 5
          LEN = 8
          MAGIC = 0x82f0d98
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = 4
        4
        SV = PVMG(0x81af530) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
          IV = 0
          NV = 0
          PV = 0x844b718 "\320\234\320\276\321\201\320\272\320\262\320\260"\0 [UTF8 "\x{41c}\x{43e}\x{441}\x{43a}\x{432}\x{430}"]
          CUR = 12
          LEN = 16
          MAGIC = 0x82f0d98
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = 4
        6
        SV = PVMG(0x81af530) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
          IV = 0
          NV = 0
          PV = 0x844b718 "\320\234\320\276\321\201\320\272\320\262\320\260"\0 [UTF8 "\x{41c}\x{43e}\x{441}\x{43a}\x{432}\x{430}"]
          CUR = 12
          LEN = 16
          MAGIC = 0x82f0d98
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = 6

        The first one works because length hasn't placed the magic yet.

        SV = PV(0x816a0c0) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,POK,pPOK,UTF8)
          PV = 0x8177b98 "K\303\244se"\0 [UTF8 "K\x{e4}se"]
          CUR = 5
          LEN = 8

        Assigning to $txt — your workaround — works because assignment properly calls SvSETMAGIC, which clears the precomputed length of the string, thus forcing the next call to length to recalculate it.

        SV = PVMG(0x81af570) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK)
          IV = 0
          NV = 0
          PV = 0x8177b98 ""\0
          CUR = 0
          LEN = 8
          MAGIC = 0x831d6a0
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = -1

        I wonder if $txt=$txt; hits some kind of optimisation. I'll see if I can reproduce the problem without ODBC later today.