Re^2: length() returns wrong result - suspicious magic

by mje (Deacon)
on Sep 15, 2010 at 14:53 UTC (#860228)


in reply to Re: length() returns wrong result - suspicious magic
in thread length() returns wrong result - suspicious magic

I thought DBI was missing a definition for SQL_WCHAR, but it turns out it is there. The code started out as code someone submitted to me to look into another issue, so I've only modified some parts of it.

Setting $txt = $txt at the end of the loop makes no difference but setting $txt='' fixes it. If this confirms your suspicion could you explain why DBD::ODBC (which wrote to the scalar) might need to call SvSETMAGIC?


Re^3: length() returns wrong result - suspicious magic
by ikegami (Pope) on Sep 15, 2010 at 15:10 UTC
    No, at the start of the loop. After the fetch, but before you use it.
      while ( $sth_sel->fetch ) {
          $txt = $txt;
          printf "%3u %3u %3u %s [%s] [%s]\n",
              ++$i,
              length($txt),
              bytes::length($txt),
              (utf8::is_utf8($txt) ? ' utf8' : '!utf8'),
              $txt,
              $xml;
      }

      made no difference.

        I wonder if $txt=$txt; hits some kind of optimisation. I'll see if I can reproduce the problem without ODBC later today.

        It is indeed a missing SvSETMAGIC.

        use strict;
        use warnings;

        use Devel::Peek qw( Dump );

        use Inline C => <<'__EOI__';

        void buggy_assign(SV* dsv, SV* ssv) {
            SvSetSV_nosteal(dsv, ssv);
            /* Should call SvSetMagicSV_nosteal(dsv, ssv); instead
             * or should call SvSETMAGIC(dsv); afterwards.
             */
        }

        void set_magic(SV* sv) {
            SvSETMAGIC(sv);
        }

        __EOI__

        my $txt_de = "K\N{U+00E4}se";
        my $txt_ru = "\N{U+041C}\N{U+043E}\N{U+0441}\N{U+043A}\N{U+0432}\N{U+0430}";

        # \N should have done this already.
        utf8::upgrade($_) for $txt_de, $txt_ru;

        my $txt;

        buggy_assign($txt, $txt_de);
        print(length($txt), "\n");
        Dump($txt);

        buggy_assign($txt, $txt_ru);
        print(length($txt), "\n");
        Dump($txt);

        set_magic($txt);
        print(length($txt), "\n");
        Dump($txt);

        4
        SV = PVMG(0x81af530) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
          IV = 0
          NV = 0
          PV = 0x8177b98 "K\303\244se"\0 [UTF8 "K\x{e4}se"]
          CUR = 5
          LEN = 8
          MAGIC = 0x82f0d98
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = 4
        4
        SV = PVMG(0x81af530) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
          IV = 0
          NV = 0
          PV = 0x844b718 "\320\234\320\276\321\201\320\272\320\262\320\260"\0 [UTF8 "\x{41c}\x{43e}\x{441}\x{43a}\x{432}\x{430}"]
          CUR = 12
          LEN = 16
          MAGIC = 0x82f0d98
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = 4
        6
        SV = PVMG(0x81af530) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
          IV = 0
          NV = 0
          PV = 0x844b718 "\320\234\320\276\321\201\320\272\320\262\320\260"\0 [UTF8 "\x{41c}\x{43e}\x{441}\x{43a}\x{432}\x{430}"]
          CUR = 12
          LEN = 16
          MAGIC = 0x82f0d98
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = 6

        The first one works because length hasn't placed the magic yet.

        SV = PV(0x816a0c0) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,POK,pPOK,UTF8)
          PV = 0x8177b98 "K\303\244se"\0 [UTF8 "K\x{e4}se"]
          CUR = 5
          LEN = 8

        Assigning to $txt — your workaround — works because assignment properly calls SvSETMAGIC, which clears the precomputed length of the string, thus forcing the next call to length to recalculate it.

        SV = PVMG(0x81af570) at 0x817bca0
          REFCNT = 1
          FLAGS = (PADMY,SMG,POK,pPOK)
          IV = 0
          NV = 0
          PV = 0x8177b98 ""\0
          CUR = 0
          LEN = 8
          MAGIC = 0x831d6a0
            MG_VIRTUAL = &PL_vtbl_utf8
            MG_TYPE = PERL_MAGIC_utf8(w)
            MG_LEN = -1
Re^3: length() returns wrong result - suspicious magic
by ikegami (Pope) on Sep 15, 2010 at 15:35 UTC

    why DBD::ODBC (which wrote to the scalar) might need to call SvSETMAGIC?

    To assign a value to a scalar, there are a few steps to follow.

    • Make sure the scalar has the slot (IV, PV, etc) you need by upgrading the structure if necessary.
    • Place the value in the appropriate slot.
    • Set the flag indicating there's a usable value in the slot you populated.
    • Call SvSETMAGIC to let magic respond to the new value if appropriate (if SMG=1). This will end up calling STORE for tied variables, for example. In this case, I expect that it will clear the precomputed length of the string.

    To obtain a value from a scalar, the same is done in reverse.

    • Call SvGETMAGIC to populate the scalar with a value. (This will end up calling FETCH for tied variables, for example.)
    • Coerce the scalar into the requested type if necessary. (This may require upgrading the scalar.)
    • Return the value in the appropriate slot.

    Some macros and functions do more than one of these steps for you.
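
The two lists above can be watched from pure Perl with a tied scalar, since tie is the one place where set magic and get magic surface as ordinary method calls. This is only a sketch: the LoggedScalar class and the demo sub are made-up names for illustration, not anything from DBI or DBD::ODBC.

```perl
use strict;
use warnings;

# Hypothetical tie class (not from the thread) that logs each callback.
package LoggedScalar;

our @log;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value }, $class;
}

# Reached through get magic when the tied scalar is read.
sub FETCH {
    my ($self) = @_;
    push @log, 'FETCH';
    return $self->{value};
}

# Reached through set magic after a value is assigned to the tied scalar.
sub STORE {
    my ($self, $value) = @_;
    push @log, "STORE($value)";
    $self->{value} = $value;
    return;
}

package main;

# Assign once and read once, then return the copy plus the callback log.
sub demo {
    @LoggedScalar::log = ();
    tie my $scalar, 'LoggedScalar', 'initial';
    $scalar = 'new';        # assignment ends by running set magic
    my $copy = $scalar;     # the read starts by running get magic
    return ($copy, @LoggedScalar::log);
}

print join(', ', demo()), "\n";
```

I'd expect one STORE followed by one FETCH in the log. XS code that writes into the scalar's slots directly and skips SvSETMAGIC would never reach STORE here, which is the same class of bug as the stale length.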

    Setting $txt = $txt at the end of the loop makes no difference

    No, at the start of the loop. After the fetch, but before you use it.

    There's no get magic on the scalar (GMG=0), so the fact that the set magic wasn't called earlier won't matter when the value is read. But when the value is assigned back to the scalar, the assignment will properly handle the set magic.
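
If you'd rather check the GMG and SMG flags from Perl than read them off a Devel::Peek dump, something along these lines should do it. It's a sketch that assumes the core B module exports the SVs_GMG and SVs_SMG constants (I believe it does, as it exposes the flag constants from sv.h); magic_flags and PlainTie are hypothetical names used only for this example.

```perl
use strict;
use warnings;
use B qw( svref_2object SVs_GMG SVs_SMG );

# Hypothetical helper (not from the thread): report the magic flags of
# the scalar a reference points at, without reading or writing its value.
sub magic_flags {
    my ($ref) = @_;
    my $flags = svref_2object($ref)->FLAGS;
    return sprintf 'GMG=%d SMG=%d',
        ($flags & SVs_GMG) ? 1 : 0,
        ($flags & SVs_SMG) ? 1 : 0;
}

# Minimal tie class, so there is a scalar carrying both kinds of magic.
package PlainTie;
sub TIESCALAR { my ($class) = @_; my $value; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }
sub STORE     { my ($self, $value) = @_; $$self = $value; return }

package main;

my $plain = 'abc';
tie my $tied, 'PlainTie';

print 'plain: ', magic_flags(\$plain), "\n";
print 'tied:  ', magic_flags(\$tied),  "\n";
```

Calling magic_flags(\$txt) right after the fetch would show whether the bound scalar has picked up set magic (SMG=1) from an earlier length call.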

    If I'm right, I can provide a cheaper workaround than copying the string (which defeats the purpose of binding).
