PerlMonks  

Answer: How do I remove whitespace at the beginning or end of my string?

( #950578=categorized answer )

contributed by trizen (Q&A > regular expressions)

Some solutions that are usually fast can become really slow, depending on how much whitespace a string contains.

The slowdown shows up when a string contains a lot of whitespace.
Example: my $str = q{    } . q{a b c d e f g h i j} x 200 . q{    };

The MRE book (Mastering Regular Expressions) suggests this code:
$str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;

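I admit I was surprised by how fast it is compared with "s/^\s+//" and its sibling "s/\s+$//". With the above example they can't even compete in a benchmark; they are too slow. The cause is the second regex, which has to match at the end of the string: \s+ can start matching at every run of whitespace inside the string, and each of those attempts fails because the run is not followed by the end of the string. The more whitespace the string contains, the more failed attempts there are (see use re 'debug').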
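To get a feel for how many failed attempts that could mean, here is a minimal sketch that counts the whitespace runs in the example string. It assumes the engine really does try \s+$ at each run, as the re 'debug' remark suggests; newer perls may optimize some of this away, so treat the number as an upper bound on the wasted attempts rather than a measurement.

```perl
use strict;
use warnings;

# The example string from the post: 4 spaces, 200 repetitions, 4 spaces.
my $str = q{    } . q{a b c d e f g h i j} x 200 . q{    };

# Count every maximal run of whitespace (list assignment in scalar context
# yields the number of matches).
my $runs = () = $str =~ /\s+/g;

# Only the last run touches the end of the string, so every other run is a
# place where \s+$ can start matching and then has to give up.
my $dead_ends = $runs - 1;

printf "length: %d, whitespace runs: %d, dead ends for \\s+\$: %d\n",
    length($str), $runs, $dead_ends;
# prints: length: 3808, whitespace runs: 1802, dead ends for \s+$: 1801
```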

Another approach (I know it's silly, but it is faster in some cases):
$str =~ s/^\s+//;
$str = reverse($str);
$str =~ s/^\s+//;
$str = reverse($str);
Benchmark using the above example:
                 Rate 's_reverse' 'unpack_A' 'MRE_regx'
's_reverse'   42017/s          --       -12%       -48%
'unpack_A'    47847/s         14%         --       -41%
'MRE_regx'    80645/s         92%        69%         --
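A comparison like this can be reproduced with the core Benchmark module. The sketch below is not the original benchmark script: the sub names are made up for illustration, the plain two-regex version is added for reference, and unpack_A is left out because its code is not shown in the post. Each sub works on a copy so every iteration sees the same untrimmed input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The example string from the post: 4 spaces, 200 repetitions, 4 spaces.
my $orig = q{    } . q{a b c d e f g h i j} x 200 . q{    };

# Hypothetical wrapper: the classic pair of substitutions.
sub two_regex {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Hypothetical wrapper: trim the front, reverse, trim the front again.
sub s_reverse {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s = reverse $s;    # scalar assignment, so this reverses the string
    $s =~ s/^\s+//;
    $s = reverse $s;
    return $s;
}

# Hypothetical wrapper: the single regex quoted from the MRE book above.
sub mre_regx {
    my ($s) = @_;
    $s =~ s/^\s+((?:.+\S)?)\s+$/$1/s;
    return $s;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    two_regex => sub { two_regex($orig) },
    s_reverse => sub { s_reverse($orig) },
    MRE_regx  => sub { mre_regx($orig) },
} );
```

The rates will differ by machine and perl version, so the numbers above should not be expected to reproduce exactly.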

Replies are listed 'Best First'.
Re: Answer: How do I remove whitespace at the beginning or end of my string?
by repellent (Priest) on Jan 30, 2012 at 00:08 UTC
    MRE_regx does not trim whitespace as expected:
    $ perl -de 1

    Loading DB routines from perl5db.pl version 1.3
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(-e:1):   1
      DB<1> $str = ' x '; $str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;
      DB<2> x $str
    0  ' x'
        use Test::More;

        sub trim { my $s = $_[0]; $s =~ s/^\s+(\S?.*\S)\s+$/$1/s; $s }

        is( trim(' '), '' );
        is( trim('a '), 'a' );
        is( trim(' a'), 'a' );
        is( trim(' a '), 'a' );
        is( trim('ab '), 'ab' );
        is( trim(' ab'), 'ab' );
        is( trim(' ab '), 'ab' );
        is( trim('a bb c '), 'a bb c' );
        is( trim(' a bb c'), 'a bb c' );
        is( trim(' a bb c '), 'a bb c' );

        done_testing();

        __END__
        not ok 1
        #   Failed test at ./t.pl line 12.
        #          got: ' '
        #     expected: ''
        not ok 2
        #   Failed test at ./t.pl line 13.
        #          got: 'a '
        #     expected: 'a'
        not ok 3
        #   Failed test at ./t.pl line 14.
        #          got: ' a'
        #     expected: 'a'
        ok 4
        not ok 5
        #   Failed test at ./t.pl line 16.
        #          got: 'ab '
        #     expected: 'ab'
        not ok 6
        #   Failed test at ./t.pl line 17.
        #          got: ' ab'
        #     expected: 'ab'
        ok 7
        not ok 8
        #   Failed test at ./t.pl line 19.
        #          got: 'a bb c '
        #     expected: 'a bb c'
        not ok 9
        #   Failed test at ./t.pl line 20.
        #          got: ' a bb c'
        #     expected: 'a bb c'
        ok 10
        1..10
        # Looks like you failed 7 tests of 10.

        The best-benchmarking one I could find that also passes the tests is s/^\s*((?:.*\S)?)\s*$/$1/s;, which is essentially MRE_regx with each + replaced by * (perhaps trizen made a typo?)
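        As a quick check of that suggestion, here is a minimal sketch of a trim() built on the * variant, with a few extra inputs (empty string, tabs, newlines, nothing to trim) that are assumptions added for illustration and not part of the test list above.

```perl
use strict;
use warnings;
use Test::More;

# trim() using the regex suggested above: leading and trailing whitespace
# are both optional, and the middle part may be empty.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s*((?:.*\S)?)\s*$/$1/s;
    return $s;
}

is( trim(''),               '',       'empty string' );
is( trim(' '),              '',       'whitespace only' );
is( trim('a '),             'a',      'trailing only' );
is( trim(' a'),             'a',      'leading only' );
is( trim('  x  '),          'x',      'single character, several spaces' );
is( trim("\t a bb c \n"),   'a bb c', 'mixed whitespace' );
is( trim("a\nb"),           "a\nb",   'inner newline is kept' );
is( trim('abc'),            'abc',    'nothing to trim' );

done_testing();
```

        The /s modifier matters here: it lets . match a newline, so a string with line breaks in the middle is still captured as one piece.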