trim leading & trailing whitespace

ady has asked for the wisdom of the Perl Monks concerning the following question:

What is the coolest way of trimming leading and trailing whitespace from a multiline string?
(By 'coolest' I mean mean and lean. By 'mean' I mean the mean between clean and efficient. By 'clean' I mean "simple and streamlined in design".) Tnx -- allan
Update: 1: $c =~ s/^\s+//; $c =~ s/\s+$//; This does precisely what I initially asked for.

2: $c =~ s/^\s*(.*?)\s*$/$1/gm; As above, but also trims leading & trailing WS on the "intermediate lines"; that's also very useful to me.

3: $c =~s/\s*\n\s*/\n/g;
$c =~ s/^[\t\ ]+//gm; $c =~s/[\t\ ]+$//gm;

Same as 2, but doesn't trim leading & trailing \n.
That's not what I need for this specific solution.
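To make the difference between update 1 and the per-line trim in update 3 concrete, here is a minimal sketch. The sub names, the `show` helper and the sample string are made up for illustration and are not from any post in this thread.

```perl
use strict;
use warnings;

# Update 1: trim only the very start and very end of the whole string.
sub trim_ends {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Update 3 (second line): trim spaces and tabs on every line, keep all newlines.
sub trim_lines {
    my ($s) = @_;
    $s =~ s/^[\t ]+//gm;
    $s =~ s/[\t ]+$//gm;
    return $s;
}

# Hypothetical helper: make newlines and tabs visible when printing.
sub show {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    $s =~ s/\t/\\t/g;
    return "[$s]\n";
}

my $c = "  first \n\t second\t\n\n third  \n";

print show(trim_ends($c));   # [first \n\t second\t\n\n third]
print show(trim_lines($c));  # [first\nsecond\n\nthird\n]
```

The first form leaves the inner lines untouched and drops the final newline; the second cleans every line but keeps the empty line and the final newline.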

Thanks again, fellow monks, for fast & useful guidelines
allan

As the eternal tranquility of Truth reveals itself to us, this very place is the Land of Lotuses -- Hakuin Ekaku Zenji

20050330 Janitored by Corion: Closing code tags are spelled </code>


Replies are listed 'Best First'.
Re: trim leading & trailing whitespace by graff (Chancellor) on Mar 30, 2005 at 07:56 UTC
Personally, I don't see any need to improve on the old standard: `s/^\s+//; s/\s+$//;` Update: After seeing thinker's reply I realized I might have misunderstood the question. If, like he suggests, you mean to remove leading and trailing whitespace from each line of a multi-line string, then I'd change the second line above to: `s/\s*\n\s*/\n/g;`
Re^2: trim leading & trailing whitespace by ihb (Deacon) on Mar 30, 2005 at 10:09 UTC
`\s` matches newlines too, meaning that your pattern will remove empty lines, not just trim them. An alternative is `s/^[^\S\n]+//gm; s/[^\S\n]+$//gm;` `ihb` See perltoc if you don't know which perldoc to read!
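A small sketch of the point above, comparing the `\s*\n\s*` substitution with the `[^\S\n]` character class. The sub names and the sample text are invented for illustration.

```perl
use strict;
use warnings;

# Every run of whitespace that contains a newline collapses to one "\n",
# so empty lines disappear along with the padding.
sub squeeze_around_newlines {
    my ($s) = @_;
    $s =~ s/\s*\n\s*/\n/g;
    return $s;
}

# [^\S\n] reads as "not non-whitespace and not newline",
# i.e. any whitespace except the newline itself.
sub trim_each_line {
    my ($s) = @_;
    $s =~ s/^[^\S\n]+//gm;
    $s =~ s/[^\S\n]+$//gm;
    return $s;
}

my $text = "  one  \n\n\t two\t\n";

print squeeze_around_newlines($text);  # "  one\ntwo\n": the empty line is gone
print trim_each_line($text);           # "one\n\ntwo\n": the empty line survives
```

Note that the squeezing form on its own also leaves the leading spaces on the first line, which is why it was paired with `s/^\s+//` in the parent reply.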
Re^2: trim leading & trailing whitespace by ady (Deacon) on Mar 30, 2005 at 08:01 UTC
Probably as good as it gets... Much better than my first cut at the job. Thanks a heap! Allan
Re: trim leading & trailing whitespace by thinker (Parson) on Mar 30, 2005 at 08:22 UTC
If you want to trim the leading and trailing whitespace from every line in a multiline string, you can use `s/^\s*(.*?)\s*$/$1/gm` cheers thinker
Re^2: trim leading & trailing whitespace by ady (Deacon) on Mar 30, 2005 at 08:47 UTC
Yup, I may need that. Good to have on the shelf! Thanks, Allan
Re: trim leading & trailing whitespace by bart (Canon) on Mar 30, 2005 at 09:22 UTC
Multiline, huh? You mean you want to trim every line independently? `s/^[\t\ ]+//gm; s/[\t\ ]+$//gm;` I don't use `\s` because I want to leave the newlines alone.
Re^2: trim leading & trailing whitespace by thinker (Parson) on Mar 30, 2005 at 10:04 UTC
Hi bart, in the example I gave above, `s/^\s*(.*?)\s*$/$1/gm`, the combination of the `/m` modifier and the line anchors `^` and `$` will ensure the newlines are left alone. cheers thinker
Re^3: trim leading & trailing whitespace by bart (Canon) on Mar 30, 2005 at 10:08 UTC
`$_ = " no \n \n \n they \n \n won't \n\n\n"; s/^\s*(.*?)\s*$/$1/gm; print;` Result: no they won't. All blank lines are gone.
Re^4: trim leading & trailing whitespace by thinker (Parson) on Mar 30, 2005 at 10:33 UTC
Re^4: trim leading & trailing whitespace by tlm (Prior) on Mar 30, 2005 at 10:41 UTC
Re^5: trim leading & trailing whitespace by bart (Canon) on Mar 30, 2005 at 11:10 UTC
Re: trim leading & trailing whitespace by jmcnamara (Monsignor) on Mar 30, 2005 at 10:36 UTC
See perlfaq4, "How do I strip blank space from the beginning/end of a string?". It also says that the `s/^\s*(.*?)\s*$/$1/` construct is "unnecessarily slow and destructive". Benchmarks anyone? See also trim() magic -- John.
Re^2: trim leading & trailing whitespace (Benchmark) by Anonymous Monk on Mar 30, 2005 at 11:11 UTC
Here's a benchmark, using the mentioned single regex, the double regex from the FAQ (also mentioned), BrowserUk's solution, and the one from Regexp::Common:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark 'cmpthese';
    use Regexp::Common;
    use Test::More tests => 4;

    our @data = (
        " White space in front",
        "White space at back ",
        "Nowhitespaceatall",
        "White space, but not in front or at back",
        " White space in front, and at back "
    );
    our @expected = (
        "White space in front",
        "White space at back",
        "Nowhitespaceatall",
        "White space, but not in front or at back",
        "White space in front, and at back"
    );
    our (@s, @d, @b, @r);

    print '#';
    cmpthese -1, {
        '#single'    => '@s = @data; s/^\s*(.*?)\s*$/$1/s for @s',
        '#duo'       => '@d = @data; s/^\s+//, s/\s+$// for @d',
        '#browseruk' => '@b = @data; s/^\s*(.*)\b\s*$/$1/s for @b',
        '#R::C'      => '@r = map {$RE{ws}{crop}->subs($_)} @data',
    };

    is_deeply (\@s, \@expected, "single");
    is_deeply (\@d, \@expected, "duo");
    is_deeply (\@b, \@expected, "browseruk");
    is_deeply (\@r, \@expected, "R::C");

    __END__
    1..4
    #              Rate      #R::C    #single #browseruk       #duo
    #R::C        1914/s         --       -87%       -94%       -97%
    #single     15175/s       693%         --       -53%       -75%
    #browseruk  32288/s      1587%       113%         --       -46%
    #duo        59650/s      3017%       293%        85%         --
    ok 1 - single
    ok 2 - duo
    ok 3 - browseruk
    ok 4 - R::C
Re: trim leading & trailing whitespace by BrowserUk (Patriarch) on Mar 30, 2005 at 08:06 UTC
`s[^\s*(.*)\b\s*$][$1]` Update: Corion rightly points out that this does not work correctly for lines where the last non-whitespace character is a non-word character. E.g. `the quick brown fox ( \n`, where the last non-whitespace character is the `(`. So, use `s[^\s*(.*\S)\s*$][$1];` instead. It's slower, but still quicker than the FAQ method. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. Rule 1 has a caveat! -- Who broke the cabal?
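A short sketch of the failure Corion pointed out, under the assumption that the two patterns are the `\b` form and the `\S` form shown in this reply (written here with `/` delimiters). The sub names and test strings are hypothetical.

```perl
use strict;
use warnings;

# \b form: needs a word character directly before the trailing whitespace.
sub trim_wordboundary {
    my ($s) = @_;
    $s =~ s/^\s*(.*)\b\s*$/$1/;
    return $s;
}

# \S form: anchors on the last non-whitespace character instead.
sub trim_nonspace {
    my ($s) = @_;
    $s =~ s/^\s*(.*\S)\s*$/$1/;
    return $s;
}

for my $in ("  hello world  ", "  the quick brown fox (  ") {
    printf "in:[%s] \\b:[%s] \\S:[%s]\n",
        $in, trim_wordboundary($in), trim_nonspace($in);
}
```

On the second string the `\b` pattern finds no word boundary in front of the trailing whitespace, so the substitution does not match and the string comes back untrimmed. The `\S` form has its own limit: it needs at least one non-whitespace character, so a string of only whitespace is left unchanged.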
Re^2: trim leading & trailing whitespace by ady (Deacon) on Mar 30, 2005 at 08:46 UTC
Good one too (needs the gm modifiers to do the job tho') -- allan
Re: trim leading & trailing whitespace by brian_d_foy (Abbot) on Mar 31, 2005 at 19:37 UTC
I'm surprised that everyone (including the FAQ) missed the most obvious solution, which uses an alternation. Most other solutions seem to forget that the $ anchor is right before the trailing newline. :) `$string =~ s/^\s+|\s+$//gm;` If you want to preserve blank lines, it's a tiny bit longer. `s/^[\t\f ]+|[\t\f ]+$//mg;` Remember: in Perl, the common things are supposed to be easy, so if you're doing something common and it's not easy, you're probably missing something. :) I'm going to update the FAQ answer too. -- brian d foy <brian@stonehenge.com>
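For comparison, a minimal sketch of the two alternation forms side by side. The sub names, the `show` helper and the inputs are assumptions made for the example, not part of the reply.

```perl
use strict;
use warnings;

# Alternation with \s: one pass over the string, but \s can also eat newlines.
sub trim_alt_any {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//gm;
    return $s;
}

# Alternation limited to tab, form feed and space: newlines are never touched.
sub trim_alt_blank {
    my ($s) = @_;
    $s =~ s/^[\t\f ]+|[\t\f ]+$//mg;
    return $s;
}

# Hypothetical helper: make newlines visible when printing.
sub show {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    return "[$s]\n";
}

print show(trim_alt_any(" a \n b \n"));       # [a\nb]
print show(trim_alt_blank(" a \n b \n"));     # [a\nb\n]
print show(trim_alt_blank(" a \n\n b \n"));   # [a\n\nb\n]
```

With `\s` the final newline is stripped as trailing whitespace, while the narrower character class keeps both the final newline and any empty lines.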
