PerlMonks  

Answer: How do I remove whitespace at the beginning or end of my string?


contributed by trizen

Some solutions that are usually fast can become really slow, depending on how much whitespace a string contains.

Consider a string that contains a lot of whitespace, for example:
my $str = q{    } . q{a b c d e f g h i j} x 200 . q{    };

The MRE book (Mastering Regular Expressions) suggests this code:
$str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;

I admit, I was surprised how fast it is compared with "s/^\s+//" and its sibling "s/\s+$//". On the example above they can't even compete in a benchmark; they are too slow. That's because the second regex, which has to match at the end of the string, fails many times when the string contains a lot of whitespace (see use re 'debug').
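To see those failed attempts for yourself, here is a minimal sketch using the core re pragma. The short sample string is my own choice (not from the benchmark) so the trace stays readable; the trace goes to STDERR and its exact format varies between perl versions.

```perl
use strict;
use warnings;

# Short sample string (an assumption for illustration, not the benchmark string).
my $str = q{  a b c  };

{
    # 'debug' is lexically scoped in modern perls, so only the regex
    # compiled inside this block is traced. The trace is written to STDERR.
    use re 'debug';
    $str =~ s/\s+$//;
}

# Only the trailing whitespace has been removed.
print "[$str]\n";    # prints "[  a b c]"
```

The trace should show the engine starting a match at each run of whitespace, consuming it with \s+, and giving up when $ does not match, until it finally reaches the run at the end of the string.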

Another approach (I know it's silly, but it is faster in some cases):
$str =~ s/^\s+//; $str = reverse($str); $str =~ s/^\s+//; $str = reverse($str);
Benchmark using the above example:
               Rate 's_reverse' 'unpack_A' 'MRE_regx'
's_reverse' 42017/s          --       -12%       -48%
'unpack_A'  47847/s         14%         --       -41%
'MRE_regx'  80645/s         92%        69%         --
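A comparison along these lines can be sketched with the core Benchmark module. This is not the original benchmark script: the sub names and the two_step entry are my own labels, and unpack_A is left out because its code is not shown in the post. Rates will differ by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Test string from the post: whitespace at both ends, many interior spaces.
my $str = q{    } . q{a b c d e f g h i j} x 200 . q{    };

# Each candidate trims a private copy and returns it.
sub two_step {
    my $s = shift;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

sub mre_regx {
    my $s = shift;
    $s =~ s/^\s+((?:.+\S)?)\s+$/$1/s;    # regex exactly as posted
    return $s;
}

sub s_reverse {
    my $s = shift;
    $s =~ s/^\s+//;
    $s = scalar reverse $s;    # reverse so trailing whitespace becomes leading
    $s =~ s/^\s+//;
    return scalar reverse $s;  # restore the original order
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    two_step  => sub { two_step($str) },
    mre_regx  => sub { mre_regx($str) },
    s_reverse => sub { s_reverse($str) },
} );
```

A negative count tells cmpthese to run each sub for at least that many CPU seconds, which gives steadier numbers than a fixed iteration count on fast code.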

Re: Answer: How do I remove whitespace at the beginning or end of my string?
by repellent (Priest) on Jan 30, 2012 at 00:08 UTC
    MRE_regx does not trim whitespace as expected:
    $ perl -de 1

    Loading DB routines from perl5db.pl version 1.3
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(-e:1):   1
      DB<1> $str = ' x '; $str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;

      DB<2> x $str
    0  ' x'
        use Test::More;

        sub trim { my $s = $_[0]; $s =~ s/^\s+(\S?.*\S)\s+$/$1/s; $s }

        is( trim(' '), '' );
        is( trim('a '), 'a' );
        is( trim(' a'), 'a' );
        is( trim(' a '), 'a' );
        is( trim('ab '), 'ab' );
        is( trim(' ab'), 'ab' );
        is( trim(' ab '), 'ab' );
        is( trim('a bb c '), 'a bb c' );
        is( trim(' a bb c'), 'a bb c' );
        is( trim(' a bb c '), 'a bb c' );

        done_testing();

        __END__
        not ok 1
        #   Failed test at ./t.pl line 12.
        #          got: ' '
        #     expected: ''
        not ok 2
        #   Failed test at ./t.pl line 13.
        #          got: 'a '
        #     expected: 'a'
        not ok 3
        #   Failed test at ./t.pl line 14.
        #          got: ' a'
        #     expected: 'a'
        ok 4
        not ok 5
        #   Failed test at ./t.pl line 16.
        #          got: 'ab '
        #     expected: 'ab'
        not ok 6
        #   Failed test at ./t.pl line 17.
        #          got: ' ab'
        #     expected: 'ab'
        ok 7
        not ok 8
        #   Failed test at ./t.pl line 19.
        #          got: 'a bb c '
        #     expected: 'a bb c'
        not ok 9
        #   Failed test at ./t.pl line 20.
        #          got: ' a bb c'
        #     expected: 'a bb c'
        ok 10
        1..10
        # Looks like you failed 7 tests of 10.

        The one I could find that benchmarks best and still passes the tests is s/^\s*((?:.*\S)?)\s*$/$1/s, which is essentially MRE_regx with each + replaced by * (perhaps trizen made a typo?).
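For anyone who wants to check that claim, here is a small sketch that runs the * variant through the same cases as the test script above. The sub name trim_star and the extra empty-string case are my own additions.

```perl
use strict;
use warnings;
use Test::More;

# Trim with the '*' variant: every piece of the pattern is allowed to match
# nothing, so the regex also matches strings with whitespace on only one
# side, on neither side, or made up of whitespace only.
sub trim_star {
    my $s = $_[0];
    $s =~ s/^\s*((?:.*\S)?)\s*$/$1/s;
    return $s;
}

is( trim_star(''),         '',       'empty string' );
is( trim_star(' '),        '',       'whitespace only' );
is( trim_star('a '),       'a',      'trailing only' );
is( trim_star(' a'),       'a',      'leading only' );
is( trim_star(' a '),      'a',      'both sides' );
is( trim_star('ab '),      'ab',     'two chars, trailing' );
is( trim_star(' ab'),      'ab',     'two chars, leading' );
is( trim_star(' ab '),     'ab',     'two chars, both sides' );
is( trim_star('a bb c '),  'a bb c', 'interior spaces kept, trailing' );
is( trim_star(' a bb c'),  'a bb c', 'interior spaces kept, leading' );
is( trim_star(' a bb c '), 'a bb c', 'interior spaces kept, both sides' );

done_testing();
```

The + version requires at least one whitespace character on both ends before it matches at all, which is why it leaves one-sided input untouched.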