PerlMonks  

Re: Truncate string to limited length, throwing away unimportant characters first.

by merlyn (Sage)
on Mar 17, 2010 at 02:39 UTC (#829068)


in reply to Truncate string to limited length, throwing away unimportant characters first.

You guys are all trying too hard:

s/\s$// || s/^\s// || s/.$// while length > 256;

-- Randal L. Schwartz, Perl hacker

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
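
The one-liner above works on `$_` and removes one character per pass: trailing whitespace first, then leading whitespace, then whatever character is last. A minimal sketch of wrapping it in a reusable function, assuming a hypothetical name `truncate_ws` and a `$limit` parameter in place of the hard-coded 256 (neither is from the original post):

```perl
use strict;
use warnings;

# Hypothetical helper; the name and the $limit parameter are assumptions
# made for illustration, not part of the original post.
sub truncate_ws {
    my ($str, $limit) = @_;

    # Alias $_ to our private copy so the one-liner can be used unchanged
    # (apart from $limit replacing the literal 256).
    for ($str) {
        s/\s$// || s/^\s// || s/.$// while length > $limit;
    }

    return $str;
}

print truncate_ws("  hello world  ", 11), "\n";   # prints "hello world"
```

The caller's string is left untouched because the function edits a copy; a string already at or under the limit is returned as-is.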


Re^2: Truncate string to limited length, throwing away unimportant characters first.
by ikegami (Pope) on Mar 17, 2010 at 04:06 UTC

    I figured this code would be executed repeatedly, so performance would matter.

    For strings with very few surrounding spaces,

    length=300
               Rate  merlyn ikegami
    merlyn   8132/s      --    -80%
    ikegami 41078/s    405%      --

    length=500
               Rate  merlyn ikegami
    merlyn   1660/s      --    -91%
    ikegami 18224/s    998%      --

    length=1000
               Rate  merlyn ikegami
    merlyn    488/s      --    -92%
    ikegami  6095/s   1149%      --

    I wrote a more comprehensive benchmark, but I'm getting inconsistent results at the moment. I'll run it on a more stable machine tomorrow.

    use strict;
    use warnings;

    use Benchmark qw( cmpthese );

    my %tests = (
        ikegami11 => 's/^(.{256,}?)\s+\z/$1/s, s/^\s*(.{256}).*\z/$1/s if length > 256;',
        ikegami21 => 's/(?<=.{6})\s+\z//s, s/^\s*(.{256}).*\z/$1/s if length > 256;',
        ikegami31 => 's/.{6}\K\s+\z//s, s/^\s*(.{256}).*\z/$1/s if length > 256;',

        ikegami12 => 's/^(.{256,}?)\s+\z/$1/s, s/^\s+(?=.{6})//s, s/(?<=^.{6}).*\z//s if length > 256;',
        ikegami22 => 's/(?<=.{6})\s+\z//s, s/^\s+(?=.{6})//s, s/(?<=^.{6}).*\z//s if length > 256;',
        ikegami32 => 's/.{6}\K\s+\z//s, s/^\s+(?=.{6})//s, s/(?<=^.{6}).*\z//s if length > 256;',

        ikegami13 => 's/^(.{256,}?)\s+\z/$1/s, s/^\s+(?=.{6})//s, s/(?<=.{6}).*\z//s if length > 256;',
        ikegami23 => 's/(?<=.{6})\s+\z//s, s/^\s+(?=.{6})//s, s/(?<=.{6}).*\z//s if length > 256;',
        ikegami33 => 's/.{6}\K\s+\z//s, s/^\s+(?=.{6})//s, s/(?<=.{6}).*\z//s if length > 256;',

        ikegami14 => 's/^(.{256,}?)\s+\z/$1/s, s/^\s+(?=.{6})//s, s/^.{6}\K.*\z//s if length > 256;',
        ikegami24 => 's/(?<=.{6})\s+\z//s, s/^\s+(?=.{6})//s, s/^.{6}\K.*\z//s if length > 256;',
        ikegami34 => 's/.{6}\K\s+\z//s, s/^\s+(?=.{6})//s, s/^.{6}\K.*\z//s if length > 256;',

        merlyn    => 's/\s$// || s/^\s// || s/.$// while length > 256;',
    );

    # Give every snippet a fresh copy of the current input in $_.
    $_ = 'use strict; use warnings; local $_ = our $pat;' . $_
        for values(%tests);

    for my $len (256, 260, 300, 500) {
        for our $pat (
            ' ' x $len,
            (' ' x ($len/2)) . ('x' x ($len/2)),
            'x' x $len,
        ) {
            printf("length=%u, pat=\"%s...%s\"\n",
                $len, substr($pat, 0, 5), substr($pat, -5));
            cmpthese(-1, \%tests);
            print("\n");
        }
    }

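
Before comparing speeds, it can be worth confirming that the candidates return the same string for the benchmark inputs. A minimal sketch, assuming only two of the snippets above (copied verbatim) and two hypothetical helpers, `run_snippet` and `all_agree`, that are not part of the original post:

```perl
use strict;
use warnings;

# Two snippets copied verbatim from the benchmark; the rest are omitted
# here to keep the sketch short.
my %tests = (
    ikegami11 => 's/^(.{256,}?)\s+\z/$1/s, s/^\s*(.{256}).*\z/$1/s if length > 256;',
    merlyn    => 's/\s$// || s/^\s// || s/.$// while length > 256;',
);

# Hypothetical helper: run one snippet against a copy of $input held in $_
# and return the modified string.
sub run_snippet {
    my ($code, $input) = @_;
    local $_ = $input;
    eval "$code; 1" or die "snippet failed: $@";
    return $_;
}

# Hypothetical helper: true if every snippet yields the same result.
sub all_agree {
    my ($input) = @_;
    my %seen;
    for my $name (sort keys %tests) {
        $seen{ run_snippet($tests{$name}, $input) } = 1;
    }
    return keys(%seen) == 1;
}

for my $len (256, 260, 300, 500) {
    for my $pat (
        ' ' x $len,
        (' ' x ($len/2)) . ('x' x ($len/2)),
        'x' x $len,
    ) {
        printf("length=%u: %s\n", $len, all_agree($pat) ? "agree" : "DIFFER");
    }
}
```

Adding the remaining entries to `%tests` extends the check to the other variants.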
Re^2: Truncate string to limited length, throwing away unimportant characters first.
by ikegami (Pope) on Mar 17, 2010 at 16:55 UTC
    Results. They vary *a lot* based on the input pattern.
