PerlMonks  

Answer: How do I remove whitespace at the beginning or end of my string?

( #324502=categorized answer )

contributed by Roy Johnson

I wrote some benchmark code to test the popular answers, along with a method that uses just a pattern match and capture instead of s///. My pattern match did well, so I tried plugging its pattern into Daddio's solution (s-capture). That sped it up significantly (s-capture2), but it was still not as fast as just using m//.
use Benchmark 'cmpthese';
use Regexp::Common 'whitespace';

my $f = ' this is a string with spaces to remove ';

cmpthese(-3, {
    'Regexp-Common' => sub { $_=$f; s/$RE{ws}{crop}//g; },
    'two-s///'      => sub { $_=$f; s/^\s+//; s/\s+$//; },
    'one-s///'      => sub { $_=$f; s/^\s+|\s+$//g; },
    's-capture'     => sub { $_=$f; s/^\s*(.*?)\s*$/$1/; },
    's-capture2'    => sub { $_=$f; s/^\s*(\S+(?:\s+\S+)?)?\s*$/$1/; },
    'm-capture'     => sub { $_=$f; ($_) = /(\S+(?:\s+\S+)?)/; },
});
                 Rate Regexp-Common s-capture one-s/// s-capture2 m-capture two-s///
Regexp-Common   659/s            --      -91%     -93%       -95%      -96%     -97%
    s-capture  7002/s          963%        --     -21%       -45%      -59%     -73%
     one-s///  8857/s         1244%       26%       --       -30%      -48%     -66%
   s-capture2 12699/s         1827%       81%      43%         --      -26%     -51%
    m-capture 17179/s         2507%      145%      94%        35%        --     -34%
     two-s/// 25941/s         3837%      270%     193%       104%       51%       --
Not surprisingly, the straightforward method of calling s/// twice is the fastest. What is surprising is how slow Regexp-Common turns out to be, and I was pleasantly surprised at how well m-capture did. If, for some bizarre reason, you needed a single atomic expression to remove leading and trailing whitespace, that would seem to be the way to go.
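
For everyday use, the two-substitution approach that wins the benchmark can be wrapped in a small helper. This is only a minimal sketch: the sub name trim is hypothetical and not part of the benchmark above, and it returns a trimmed copy rather than modifying its argument in place.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for illustration, not from the post).
# Returns a copy of the argument with leading and trailing whitespace removed.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;    # strip leading whitespace
    $s =~ s/\s+$//;    # strip trailing whitespace (including a final newline)
    return $s;
}

my $f = ' this is a string with spaces to remove ';
print '[', trim($f), "]\n";
```

Because the helper copies its argument, the cost of the copy is included in every call, much like the assignment inside each benchmarked sub.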

Re: Answer: How do I remove whitespace at the beginning or end of my string?
by ysth (Canon) on Jan 27, 2004 at 22:17 UTC
    Why are you penalizing the one-regex methods by requiring a separate assignment to $_? You should perhaps do 2 separate benchmarks; one assuming $_ is to be changed, and the other assuming a lexical $f is to be changed.

    Update: I'm just saying that benchmarks for performing an operation on $_ won't necessarily carry over to benchmarks for performing an operation on a lexical. In fact, with a lexical, the two-regex method outshines the others even more:

    use Benchmark 'cmpthese';
    use Regexp::Common 'whitespace';

    my $f = ' this is a string with spaces to remove ';

    cmpthese(-3, {
        'Regexp-Common' => sub { my $g=$f; $g=~s/$RE{ws}{crop}//g; },
        'two-s///'      => sub { my $g=$f; $g=~s/^\s+//; $g=~s/\s+$//; },
        'one-s///'      => sub { my $g=$f; $g=~s/^\s+|\s+$//g; },
        's-capture'     => sub { my $g=$f; $g=~s/^\s*(.*?)\s*$/$1/; },
        's-capture2'    => sub { my $g=$f; $g=~s/^\s*(\S+(?:\s+\S+)?)?\s*$/$1/; },
        'm-capture'     => sub { my $g=$f; ($g) = $g=~/(\S+(?:\s+\S+)?)/; },
    });
    This shows "two-s///" gaining more than the others when working on lexicals.
      The test is to remove leading and trailing whitespace from a string. The assignment to $_ is an initialization condition of the test: it has to have the string value in it, and then have the whitespace removed.

      Update:
      Oh, now I understand what you're saying. It just didn't occur to me that it could make a significant difference what kind of variable was being operated on.


      The PerlMonk tr/// Advocate
Correction: How do I remove whitespace at the beginning or end of my string?
by Roy Johnson (Monsignor) on Jan 28, 2004 at 18:39 UTC
    Argh! The m-capture regex has a bug: the final ? should be a *, and so should the corresponding ? in s-capture2. Naturally, this slows things down a bit (and I can't update the original answer).
    Corrected code:
    's-capture2' => sub { $_=$f; s/^\s*(\S+(?:\s+\S+)*)?\s*$/$1/; },
    'm-capture'  => sub { $_=$f; ($_) = /(\S+(?:\s+\S+)*)/; },
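
To see what the bug does, the two versions of the pattern can be run side by side on the benchmark string. This is a small sketch, not part of the original benchmark; the variable names $buggy, $fixed, and $unchanged are chosen here for illustration.

```perl
use strict;
use warnings;

my $f = ' this is a string with spaces to remove ';

# Original m-capture pattern: the trailing group is optional (?), so the
# capture holds at most two words.
my ($buggy) = $f =~ /(\S+(?:\s+\S+)?)/;

# Corrected pattern: the trailing group repeats (*), so every word is kept.
my ($fixed) = $f =~ /(\S+(?:\s+\S+)*)/;

# Original s-capture2 pattern: anchored at both ends, so with more than two
# words it cannot match at all and the string is left untouched.
(my $unchanged = $f) =~ s/^\s*(\S+(?:\s+\S+)?)?\s*$/$1/;

print "buggy:     [$buggy]\n";      # buggy:     [this is]
print "fixed:     [$fixed]\n";      # fixed:     [this is a string with spaces to remove]
print "unchanged: [$unchanged]\n";  # still has its leading and trailing space
```

So in the original run, m-capture was truncating the string and s-capture2 was failing to match, which is why both get slower once they do the full job.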
    New results (on a different machine):
                      Rate Regexp-Common s-capture one-s/// s-capture2 m-capture two-s///
    Regexp-Common   6629/s            --      -89%     -90%       -91%      -94%     -97%
        s-capture  62950/s          850%        --      -1%       -18%      -41%     -71%
         one-s///  63621/s          860%        1%       --       -17%      -41%     -71%
       s-capture2  77016/s         1062%       22%      21%         --      -28%     -65%
        m-capture 107187/s         1517%       70%      68%        39%        --     -51%
         two-s/// 219320/s         3209%      248%     245%       185%      105%       --