http://www.perlmonks.org?node_id=814027

creamygoodness has asked for the wisdom of the Perl Monks concerning the following question:

Greets,

I have often seen people badmouth the bytes pragma, but there's one thing I use it for: cheaply identifying empty strings with bytes::length() when the strings may be carrying the SVf_UTF8 flag. The length() function can be inefficient for such strings, because it must traverse the entire buffer counting characters:

    marvin@smokey:~/perltest $ perl compare_length_efficiency.pl
            Rate  utf8 bytes
    utf8  4.35/s    --  -98%
    bytes  185/s 4154%    --
    marvin@smokey:~/perltest $

    use strict;
    use warnings;
    use Benchmark qw( cmpthese );

    # Make bytes:: functions available, but use character semantics.
    use bytes;
    no bytes;

    cmpthese( 100, {
        bytes => sub {
            my $smileys = "\x{263a}" x 10_000;
            chop($smileys) while bytes::length($smileys);
        },
        utf8 => sub {
            my $smileys = "\x{263a}" x 10_000;
            chop($smileys) while length($smileys);
        },
    } );
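The trick works because a string has zero bytes exactly when it has zero characters, so the byte count can stand in for the character count when all I want to know is whether the string is empty. Here is a minimal sketch of that check; is_empty() is a hypothetical helper name made up for illustration and is not part of the benchmark script.

```perl
use strict;
use warnings;

# Make bytes:: functions available, but use character semantics.
use bytes;
no bytes;

# Hypothetical helper (name is an assumption, not from the benchmark):
# true if the string holds no data, checked via its byte count.
sub is_empty {
    my ($str) = @_;
    return bytes::length($str) == 0;
}

# U+263A is above 0xFF, so Perl stores this string with the UTF8 flag on.
my $smiley = "\x{263a}";

printf "chars: %d\n", length($smiley);           # 1 character
printf "bytes: %d\n", bytes::length($smiley);    # 3 bytes in the internal UTF-8 encoding

# Remove the only character; the byte count drops to zero along with it.
chop $smiley;
print "empty after chop\n" if is_empty($smiley);
```

The two counts disagree for any non-empty string containing wide characters, which is why I only use bytes::length() as a zero/non-zero test and never as a substitute for length().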

Is there an efficient alternative to bytes::length() for this use case elsewhere in core?