PerlMonks  

trim leading & trailing whitespace

by ady (Deacon)
on Mar 30, 2005 at 07:50 UTC ( #443363=perlquestion )
ady has asked for the wisdom of the Perl Monks concerning the following question:

What is the coolest way of trimming leading and trailing whitespace from a multiline string?
(by 'coolest' i mean mean and lean. By 'mean' i mean the mean between clean and efficient. By 'clean' i mean: "simple and streamlined in design"). Tnx -- allan
Update: 1: $c =~ s/^\s+//; $c =~ s/\s+$//; This does precisely what i initially asked for

2: $c =~ s/^\s*(.*?)\s*$/$1/gm; As above, but also trims "intermediate lines" for leading & trailing WS, -- that's also very useful to me.

3: $c =~s/\s*\n\s*/\n/g; $c =~ s/^[\t\ ]+//gm; $c =~s/[\t\ ]+$//gm;
Same as 2 but don't trim leading & trailing \n
That's not what i need for this specific solution.

Thanks again fellow monks, for fast & useful guidelines
allan
As the eternal tranquility of Truth reveals itself to us, this very place is the Land of Lotuses -- Hakuin Ekaku Zenji

20050330 Janitored by Corion: Closing code tags are spelled </code>
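For readers who want to compare the three variants from the update side by side, here is a minimal sketch. The sub names and the sample string are made up for illustration; only the substitutions themselves come from the post.

```perl
use strict;
use warnings;

# Hypothetical sample: a whitespace-only first line, two padded lines,
# and two trailing newlines.
my $text = "  \n  first line  \n\t second line \t\n\n";

# Variant 1: trim only the two ends of the whole string.
sub trim_ends {
    my ($c) = @_;
    $c =~ s/^\s+//;
    $c =~ s/\s+$//;
    return $c;
}

# Variant 2: trim every line. Because \s also matches "\n",
# blank lines are swallowed as well.
sub trim_each_line {
    my ($c) = @_;
    $c =~ s/^\s*(.*?)\s*$/$1/gm;
    return $c;
}

# Variant 3: trim every line, but leave one newline at each end.
sub trim_lines_keep_end_newlines {
    my ($c) = @_;
    $c =~ s/\s*\n\s*/\n/g;
    $c =~ s/^[\t\ ]+//gm;
    $c =~ s/[\t\ ]+$//gm;
    return $c;
}

# Print a result with newlines and tabs made visible.
sub show {
    my ($label, $s) = @_;
    $s =~ s/\n/\\n/g;
    $s =~ s/\t/\\t/g;
    print "$label: [$s]\n";
}

show('variant 1', trim_ends($text));
show('variant 2', trim_each_line($text));
show('variant 3', trim_lines_keep_end_newlines($text));
```

On this sample, variant 1 leaves the padding around the inner line break alone, variant 2 yields two clean lines with nothing around them, and variant 3 yields the same two lines wrapped in a single leading and trailing newline.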

Re: trim leading & trailing whitespace
by graff (Chancellor) on Mar 30, 2005 at 07:56 UTC
    Personally, I don't see any need to improve on the old standard:
    s/^\s+//;
    s/\s+$//;
    update: After seeing thinker's reply I realized I might have misunderstood the question. If, like he suggests, you mean to remove leading and trailing whitespace from each line of a multi-line string, then I'd change the second line above to:
    s/\s*\n\s*/\n/g;
      Probably as good as it gets... Much better than my first cut at the job. Thanks a heap! Allan

      \s matches newlines too, meaning that your pattern will remove empty lines, not just trim them. An alternative is

      s/^[^\S\n]+//gm; s/[^\S\n]+$//gm;

      ihb

      See perltoc if you don't know which perldoc to read!
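      A small sketch of the difference ihb describes, assuming a made-up sample string with one empty line in it. The sub names are hypothetical; the patterns are the ones quoted in this subthread.

```perl
use strict;
use warnings;

# Trim spaces and tabs on each line but never touch a newline,
# so empty lines survive. [^\S\n] reads as "whitespace that is not \n".
sub trim_lines_keep_blank {
    my ($s) = @_;
    $s =~ s/^[^\S\n]+//gm;
    $s =~ s/[^\S\n]+$//gm;
    return $s;
}

# The pattern being commented on: \s* on either side of \n can
# swallow neighbouring newlines, which deletes empty lines.
sub collapse_around_newlines {
    my ($s) = @_;
    $s =~ s/\s*\n\s*/\n/g;
    return $s;
}

# Hypothetical input: two padded lines separated by an empty line.
my $sample = " one \n\n\t two\t\n";

for my $result (trim_lines_keep_blank($sample), collapse_around_newlines($sample)) {
    (my $visible = $result) =~ s/\n/\\n/g;
    print "[$visible]\n";
}
```

      With this input the first sub keeps the empty line, while the second drops it. The second also leaves the space at the very start of the string, because nothing before it is a newline.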

Re: trim leading & trailing whitespace
by BrowserUk (Pope) on Mar 30, 2005 at 08:06 UTC

    s[^\s*(.*)\b\s*$][$1]

    Update: Corion rightly points out that this does not work correctly for lines where the last non-whitespace character is a non-word character, e.g.

    the quick brown fox ( \n
    #                   ^

    So, use s[^\s*(.*\S)\s*$][$1]; instead. It's slower, but still quicker than the FAQ method.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco.
    Rule 1 has a caveat! -- Who broke the cabal?
      Good one too (needs the gm modifiers to do the job tho') -- allan
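      To see the \b problem concretely, here is a short sketch. The sub names and test strings are hypothetical, and the \S version is written with $1 in the replacement so that the captured text is kept.

```perl
use strict;
use warnings;

# Word-boundary version: needs a \b somewhere before the trailing whitespace.
sub trim_wordboundary {
    my ($s) = @_;
    $s =~ s[^\s*(.*)\b\s*$][$1];
    return $s;
}

# \S version: the capture must end on any non-whitespace character.
sub trim_nonspace {
    my ($s) = @_;
    $s =~ s[^\s*(.*\S)\s*$][$1];
    return $s;
}

my $ends_in_word  = "  hello  ";
my $ends_in_paren = "  the quick brown fox (  ";

print "[", trim_wordboundary($ends_in_word),  "]\n";
print "[", trim_wordboundary($ends_in_paren), "]\n";
print "[", trim_nonspace($ends_in_paren),     "]\n";
```

      When the text ends in "(", the \b pattern cannot match at all: the only word boundaries are followed by more non-whitespace. The substitution therefore leaves the string untouched. The \S pattern trims it as intended. Note that a string containing only whitespace is also left unchanged by the \S version, since \S has nothing to match.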
Re: trim leading & trailing whitespace
by thinker (Parson) on Mar 30, 2005 at 08:22 UTC

    If you want to trim the leading and trailing whitespace from every line in a multiline string, you can use

    s/^\s*(.*?)\s*$/$1/gm

    cheers

    thinker

      yup, i may need that, Good to have on the shelf! Thanks, Allan
Re: trim leading & trailing whitespace
by bart (Canon) on Mar 30, 2005 at 09:22 UTC
    Multiline, huh? You mean you want to trim every line independently?
    s/^[\t\ ]+//gm; s/[\t\ ]+$//gm;
    I don't use \s because I want to leave the newlines alone.

      Hi bart

      In the example I gave above,

       s/^\s*(.*?)\s*$/$1/gm

      the combination of the /m modifier, and the line anchors ^ and $ will ensure the newlines are left alone

      cheers

      thinker

        $_ = " no \n \n \n they \n \n won't \n\n\n"; s/^\s*(.*?)\s*$/$1/gm; print;
        Result:
        no
        they
        won't
        

        All blank lines are gone.

Re: trim leading & trailing whitespace
by jmcnamara (Monsignor) on Mar 30, 2005 at 10:36 UTC
      Here's a benchmark, using the mentioned single regex, the double regex from the FAQ (also mentioned), BrowserUk's solution, and the one from Regexp::Common:
      #!/usr/bin/perl

      use strict;
      use warnings;

      use Benchmark 'cmpthese';
      use Regexp::Common;
      use Test::More tests => 4;

      our @data = (
          " White space in front",
          "White space at back ",
          "Nowhitespaceatall",
          "White space, but not in front or at back",
          " White space in front, and at back "
      );
      our @expected = (
          "White space in front",
          "White space at back",
          "Nowhitespaceatall",
          "White space, but not in front or at back",
          "White space in front, and at back"
      );

      our (@s, @d, @b, @r);

      print '#';
      cmpthese -1, {
          '#single'    => '@s = @data; s/^\s*(.*?)\s*$/$1/s for @s',
          '#duo'       => '@d = @data; s/^\s+//, s/\s+$// for @d',
          '#browseruk' => '@b = @data; s/^\s*(.*)\b\s*$/$1/s for @b',
          '#R::C'      => '@r = map {$RE{ws}{crop}->subs($_)} @data',
      };

      is_deeply (\@s, \@expected, "single");
      is_deeply (\@d, \@expected, "duo");
      is_deeply (\@b, \@expected, "browseruk");
      is_deeply (\@r, \@expected, "R::C");

      __END__
      1..4
      #              Rate      #R::C    #single #browseruk       #duo
      #R::C        1914/s         --       -87%       -94%       -97%
      #single     15175/s       693%         --       -53%       -75%
      #browseruk  32288/s      1587%       113%         --       -46%
      #duo        59650/s      3017%       293%        85%         --
      ok 1 - single
      ok 2 - duo
      ok 3 - browseruk
      ok 4 - R::C
Re: trim leading & trailing whitespace
by brian_d_foy (Abbot) on Mar 31, 2005 at 19:37 UTC

    I'm surprised that everyone (including the FAQ) missed the most obvious solution which uses an alternation. Most other solutions seem to forget that the $ anchor is right before the trailing newline. :)

    $string =~ s/^\s+|\s+$//gm;

    If you want to preserve blank lines, it's a tiny bit longer.

    s/^[\t\f ]+|[\t\f ]+$//mg;

    Remember: in Perl, the common things are supposed to be easy, so if you're doing something common and it's not easy, you're probably missing something. :)

    I'm going to update the FAQ answer too.

    --
    brian d foy <brian@stonehenge.com>
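    A quick sketch of the two alternation patterns above, run on a made-up string that contains a whitespace-only line. The sub names are hypothetical; the substitutions are copied from the post.

```perl
use strict;
use warnings;

# Alternation with \s: trims each line, but \s can also eat newlines.
sub trim_alt {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//gm;
    return $s;
}

# Alternation restricted to tab, form feed and space: newlines are never
# matched, so blank lines and the final newline stay in place.
sub trim_alt_keep_blank {
    my ($s) = @_;
    $s =~ s/^[\t\f ]+|[\t\f ]+$//mg;
    return $s;
}

# Hypothetical input: two padded lines with a whitespace-only line between.
my $sample = " a \n \n b \n";

for my $result (trim_alt($sample), trim_alt_keep_blank($sample)) {
    (my $visible = $result) =~ s/\n/\\n/g;
    print "[$visible]\n";
}
```

    On this input the \s version drops the whitespace-only line and also the newline at the very end of the string. The character-class version keeps both and only strips the spaces.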

Node Type: perlquestion [id://443363]
Approved by jbrugger
Front-paged by ww