Re: Carp::Heavy panic: malloc Unicode?

by Steve_p (Priest)
on Feb 07, 2007 at 20:25 UTC

in reply to Carp::Heavy panic: malloc Unicode?

That bit of code is broken, and the fix should be available in a newer version of Perl. The patch below resolved the problem.

  Change 25363 by rgs@bloom on 2005/09/07 11:09:10

      The formatting function of Carp::Heavy has problem with utf8 strings.
      Work around it.

  Affected files ...

  ... //depot/perl/lib/Carp/ edit

  Differences ...

  ==== //depot/perl/lib/Carp/ (text) ====

  @@ -116,9 +116,10 @@
       $arg = "'$arg'" unless $arg =~ /^-?[\d.]+\z/;
     # The following handling of "control chars" is direct from
  -  # the original code - I think it is broken on Unicode though.
  +  # the original code - it is broken on Unicode though.
     # Suggestions?
  -  $arg =~ s/([[:cntrl:]]|[[:^ascii:]])/sprintf("\\x{%x}",ord($1))/eg;
  +  utf8::is_utf8($arg)
  +    or $arg =~ s/([[:cntrl:]]|[[:^ascii:]])/sprintf("\\x{%x}",ord($1))/eg;
     return $arg;
   }
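
To see what the patched logic does in isolation, here is a minimal sketch. The helper name format_arg_patched is hypothetical and not part of Carp::Heavy; it only mirrors the two changed lines. Strings with the utf8 flag on are returned untouched, and all other strings get their control and non-ASCII characters rewritten as \x{..} escapes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Carp::Heavy: it mirrors only the
# two lines changed by the patch.
sub format_arg_patched {
    my ($arg) = @_;
    # Skip the substitution entirely when the utf8 flag is on;
    # otherwise escape control and non-ASCII characters as \x{..}.
    utf8::is_utf8($arg)
        or $arg =~ s/([[:cntrl:]]|[[:^ascii:]])/sprintf("\\x{%x}",ord($1))/eg;
    return $arg;
}

my $bytes = "tab\there";          # plain byte string, utf8 flag off
my $wide  = "snowman \x{2603}";   # code point above 255, so utf8 flag on

print format_arg_patched($bytes), "\n";   # the tab is escaped
print format_arg_patched($wide) eq $wide
    ? "utf8 string left unchanged\n"
    : "utf8 string was modified\n";
```

The second call shows the trade-off of the workaround: a flagged string passes through with its control and wide characters unescaped.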

Test your modules with bleadperl!

  rsync -avz rsync:// .
  ./Configure -des -Dusedevel -Dprefix=/path/to/test/perl
  make test
  make install

Now, please test your modules! If you have test failures that don't happen with Perl 5.8.8, send a simplified test case to

perlbug at

Re^2: Carp::Heavy panic: malloc Unicode?
on Feb 08, 2007 at 06:21 UTC
    Wow. That piece of enlightenment gives me hope that someday I might be able to use the perl debugger with Unicode data... I've had repeated failures when using "perl -d" on scripts that involve regex operations on utf8 strings, in 5.8.6 on Mac OS X and 5.8.7 on FreeBSD. A script would run okay by itself (except for bugs), but running it with "perl -d" leads to various reports of "out of memory" at the first Unicode-heavy regex. So, I'm guessing the debugger contains something similar to the logic you've cited in Carp::Heavy. (But curiously, I can run a one-liner applying that regex to a utf8 string on my Mac without difficulty. I'm confused.)

    Anyway, I can't say that I approve of the approach taken in that patch. All it's doing is: if the string has the utf8 flag on, avoid trying to make any non-visible or non-ascii characters "displayable" by converting them to hex notation.

    Rather than just give up completely on utf8 strings, wouldn't it be better to use a different approach? e.g.:

      $arg = join "", map { ($_ lt ' ' or $_ gt '~') ? sprintf("\\x{%x}",ord) : $_ } split //, $arg;

    I'm not sure, but I think that should avoid the malloc explosion caused by using [[:cntrl:]]|[[:^ascii:]] on a utf8 string.
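
For anyone who wants to try the per-character idea, here is a small self-contained sketch. The wrapper name escape_per_char is hypothetical; the body is the one-liner above. It escapes everything outside the range from space through tilde, regardless of the utf8 flag. Whether it actually sidesteps the malloc panic on affected Perl versions is not verified here.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-liner above: escape every
# character outside printable ASCII (space through tilde), whether or
# not the utf8 flag is set on the string.
sub escape_per_char {
    my ($arg) = @_;
    return join "", map {
        ($_ lt ' ' or $_ gt '~') ? sprintf("\\x{%x}", ord) : $_
    } split //, $arg;
}

print escape_per_char("tab\there"), "\n";          # tab\x{9}here
print escape_per_char("snowman \x{2603}"), "\n";   # snowman \x{2603}
```

Unlike the patched code, this version still produces readable escapes for utf8-flagged strings, at the cost of splitting the string into a list of single characters.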

