How do I remove whitespace at the beginning or end of my string?

by vroom (His Eminence)
on Jan 21, 2000 at 00:44 UTC ( [id://2258]=perlquestion )

vroom has asked for the wisdom of the Perl Monks concerning the following question: (regular expressions)

How do I remove whitespace at the beginning or end of my string?

Originally posted as a Categorized Question.

Replies are listed 'Best First'.
Re: How do I remove whitespace at the beginning or end of my string?
by btrott (Parson) on Mar 03, 2000 at 23:57 UTC
    You could also do this in just one step. Use foreach's aliasing behavior to write this:
    for ($string) { s/^\s+//; s/\s+$//; }
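A minimal sketch of the aliasing trick above, extended to more than one variable. The variable names and sample values are hypothetical; the point is only that $_ aliases each listed variable in turn, so the substitutions edit the originals in place.

```perl
use strict;
use warnings;

# Hypothetical sample data; the names are for illustration only.
my $name  = "  Larry Wall \t";
my $email = "\n larry\@example.org  ";

# Inside the loop $_ is an alias for each variable in turn,
# so s/// modifies $name and $email themselves, not copies.
for ($name, $email) {
    s/^\s+//;   # strip leading whitespace
    s/\s+$//;   # strip trailing whitespace
}

print "<$name> <$email>\n";
```

The same loop works for a single variable, which is exactly the for ($string) form shown above.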
Re: How do I remove whitespace at the beginning or end of my string?
by vroom (His Eminence) on Jan 21, 2000 at 00:45 UTC
    Use substitution along with the special characters ^ and $, which match the beginning and end of the string respectively, together with \s, which matches whitespace:
    $string=~s/^\s+//; $string=~s/\s+$//;

    Edited by davido: Removed useless use of /g modifier.
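If the two substitutions are needed in several places, one option is to wrap them in a small helper. This is only a sketch: trim() is a hypothetical name, not a core Perl function, and it returns a trimmed copy rather than modifying its argument.

```perl
use strict;
use warnings;

# trim() is a hypothetical helper name, not part of core Perl.
sub trim {
    my ($s) = @_;       # work on a copy; the caller's string is untouched
    $s =~ s/^\s+//;     # leading whitespace
    $s =~ s/\s+$//;     # trailing whitespace, including a final newline
    return $s;
}

my $raw   = "  hello world \n";
my $clean = trim($raw);
print "[$clean]\n";
```

Neither substitution needs /g: each pattern is anchored, so it can match at most once.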
Re: How do I remove whitespace at the beginning or end of my string?
by infinityandbeyond (Sexton) on Apr 20, 2000 at 23:01 UTC
    One line is just as good.
    # replace 1 or more whitespace characters at the beginning
    # or 1 or more whitespace characters at the end
    # with nothing
    $string =~ s/^\s+|\s+$//g;
    Credit to perlmonkey for corrections to original post -- Ed.
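A short sketch of why the one-line form does need /g, unlike the two-step form. The sample string is made up. Without /g the substitution stops after the first alternative that matches, which is the leading whitespace.

```perl
use strict;
use warnings;

my $padded = "  both ends  ";   # hypothetical sample input

# Without /g only the first match (the leading run) is removed.
(my $without_g = $padded) =~ s/^\s+|\s+$//;

# With /g the regex is applied again and also removes the trailing run.
(my $with_g = $padded) =~ s/^\s+|\s+$//g;

print "[$without_g]\n";
print "[$with_g]\n";
```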
Re: How do I remove whitespace at the beginning or end of my string?
by Roy Johnson (Monsignor) on Jan 27, 2004 at 18:07 UTC
    I wrote some benchmark code to test the popular answers, and a method using just pattern match and capture instead of s///. I got good results from my pattern match, so I tried plugging my pattern into Daddio's solution (s-capture), and got a significant speedup of it (s-capture2), but still not as good as just using m//.
    use Benchmark 'cmpthese';
    use Regexp::Common 'whitespace';

    my $f = ' this is a string with spaces to remove ';

    cmpthese(-3, {
        'Regexp-Common' => sub { $_=$f; s/$RE{ws}{crop}//g; },
        'two-s///'      => sub { $_=$f; s/^\s+//; s/\s+$//; },
        'one-s///'      => sub { $_=$f; s/^\s+|\s+$//g; },
        's-capture'     => sub { $_=$f; s/^\s*(.*?)\s*$/$1/; },
        's-capture2'    => sub { $_=$f; s/^\s*(\S+(?:\s+\S+)?)?\s*$/$1/; },
        'm-capture'     => sub { $_=$f; ($_) = /(\S+(?:\s+\S+)?)/; },
    });

                     Rate Regexp-Common s-capture one-s/// s-capture2 m-capture two-s///
    Regexp-Common   659/s            --      -91%     -93%       -95%      -96%     -97%
    s-capture      7002/s          963%        --     -21%       -45%      -59%     -73%
    one-s///       8857/s         1244%       26%       --       -30%      -48%     -66%
    s-capture2    12699/s         1827%       81%      43%         --      -26%     -51%
    m-capture     17179/s         2507%      145%      94%        35%        --     -34%
    two-s///      25941/s         3837%      270%     193%       104%       51%       --
    Not surprisingly, the straightforward method of calling s/// twice is the fastest. What is surprising is how slow Regexp-Common turns out to be, and I was pleasantly surprised at how well m-capture did. If, for some bizarre reason, you needed a single atomic expression to remove leading and trailing whitespace, that would seem to be the way to go.
      Why are you penalizing the one-regex methods by requiring a separate assignment to $_? You should perhaps do 2 separate benchmarks; one assuming $_ is to be changed, and the other assuming a lexical $f is to be changed.

      Update: I'm just saying that benchmarks for performing an operation on $_ won't necessarily carry over to benchmarks for performing an operation on a lexical. In fact, with a lexical, the two-regex method outshines the others even more:

      use Benchmark 'cmpthese';
      use Regexp::Common 'whitespace';

      my $f = ' this is a string with spaces to remove ';

      cmpthese(-3, {
          'Regexp-Common' => sub { my $g=$f; $g=~s/$RE{ws}{crop}//g; },
          'two-s///'      => sub { my $g=$f; $g=~s/^\s+//; $g=~s/\s+$//; },
          'one-s///'      => sub { my $g=$f; $g=~s/^\s+|\s+$//g; },
          's-capture'     => sub { my $g=$f; $g=~s/^\s*(.*?)\s*$/$1/; },
          's-capture2'    => sub { my $g=$f; $g=~s/^\s*(\S+(?:\s+\S+)?)?\s*$/$1/; },
          'm-capture'     => sub { my $g=$f; ($g) = $g=~/(\S+(?:\s+\S+)?)/; },
      });
      shows "two-s" gaining more than the others when working on lexicals.
        The test is to remove leading and trailing whitespace from a string. The assignment to $_ is an initialization condition of the test: it has to have the string value in it, and then have the whitespace removed.

        Update:
        Oh, now I understand what you're saying. It just didn't occur to me that it could make a significant difference what kind of variable was being operated on.


        The PerlMonk tr/// Advocate
      Argh! The m-capture regex has a bug: the final ? should be a * (and the corresponding ? in s-capture2). Naturally, this slows things down a bit (and I can't update the original answer).
      Corrected code:
      's-capture2' => sub { $_=$f; s/^\s*(\S+(?:\s+\S+)*)?\s*$/$1/; },
      'm-capture'  => sub { $_=$f; ($_) = /(\S+(?:\s+\S+)*)/; },
      New results (on a different machine):
                        Rate Regexp-Common s-capture one-s/// s-capture2 m-capture two-s///
      Regexp-Common   6629/s            --      -89%     -90%       -91%      -94%     -97%
      s-capture      62950/s          850%        --      -1%       -18%      -41%     -71%
      one-s///       63621/s          860%        1%       --       -17%      -41%     -71%
      s-capture2     77016/s         1062%       22%      21%         --      -28%     -65%
      m-capture     107187/s         1517%       70%      68%        39%        --     -51%
      two-s///      219320/s         3209%      248%     245%       185%      105%       --
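To make the ? versus * bug concrete, here is a small sketch with a made-up three-word string. With ? the pattern can swallow at most one extra word, so everything after the second word is silently dropped. With * it keeps all the words.

```perl
use strict;
use warnings;

my $text = "  one two three  ";   # hypothetical sample input

# Buggy m-capture: (?:\s+\S+)? allows at most ONE further word.
my ($buggy) = $text =~ /(\S+(?:\s+\S+)?)/;

# Corrected m-capture: (?:\s+\S+)* allows any number of further words.
my ($fixed) = $text =~ /(\S+(?:\s+\S+)*)/;

# Caveat: if the string holds no non-whitespace at all, the match fails
# and the list assignment leaves the target undefined.
my ($none) = "   " =~ /(\S+(?:\s+\S+)*)/;

print "buggy: [$buggy]\n";
print "fixed: [$fixed]\n";
print "none is ", (defined $none ? "defined" : "undef"), "\n";
```

A quick check like this, run before timing anything, is a cheap way to confirm that all benchmark candidates really produce the same result.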

      The PerlMonk tr/// Advocate
Re: How do I remove whitespace at the beginning or end of my string?
by trizen (Hermit) on Nov 26, 2011 at 19:11 UTC
    Remove spaces at the beginning and end of your string. The fastest way, so far.
    my $l = length($string);
    $string = reverse unpack("A$l",reverse unpack("A$l",$string));

                  Rate two-s/// split m-capture unpack s-with-\G
    two-s///  239349/s       --   -6%      -38%   -40%      -41%
    split     255782/s       7%    --      -33%   -36%      -37%
    m-capture 384362/s      61%   50%        --    -4%       -5%
    unpack    399865/s      67%   56%        4%     --       -1%

      No need for the string interpolation:

      $string = reverse unpack( 'A*', reverse unpack( 'A*', $string ) );
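A step-by-step sketch of what the nested unpack/reverse does, using a made-up string padded with plain spaces. The intermediate variable names are hypothetical. As far as I know, perlfunc documents that the A template strips trailing whitespace and nulls when unpacking, so a string that legitimately begins or ends with NUL bytes would lose those too.

```perl
use strict;
use warnings;

my $string = "   padded text  ";   # hypothetical sample input

# Step 1: unpack 'A*' drops the trailing spaces.
my $no_tail = unpack('A*', $string);

# Step 2: reverse the string (scalar context), so the leading spaces
# become trailing ones, and strip them the same way.
my $flipped = unpack('A*', scalar reverse $no_tail);

# Step 3: reverse again to restore the original character order.
my $trimmed = reverse $flipped;

# The one-liner from the thread performs the same three steps.
my $one_liner = reverse unpack('A*', reverse unpack('A*', $string));

print "[$trimmed] [$one_liner]\n";
```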

        It probably also depends on what perl you use, but I'm amazed by the unpack speed. Here's perl-5.14.2 bench:

        use Benchmark "cmpthese";
        use Regexp::Common "whitespace";

        my $f = " this is a string with spaces to remove ";

        cmpthese (-3, {
            "Regexp-Common" => sub { my $g = $f; $g =~ s/$RE{ws}{crop}//g; },
            "two-s///"      => sub { my $g = $f; $g =~ s/^\s+//; $g =~ s/\s+$//; },
            "one-s///"      => sub { my $g = $f; $g =~ s/^\s+|\s+$//g; },
            "s-capture"     => sub { my $g = $f; $g =~ s/^\s*(.*?)\s*$/$1/; },
            "s-capture2"    => sub { my $g = $f; $g =~ s/^\s*(\S+(?:\s+\S+)*)?\s*$/$1/; },
            "m-capture"     => sub { my $g = $f; ($g) = ($g =~ m/(\S+(?:\s+\S+)*)/); },
            "unpack"        => sub { my $g = $f; $g = reverse unpack "A*", reverse unpack "A*", $g; },
            });

        And the runs ...

        5.15.5:
                        Rate one-s/// s-capture s-capture2 m-capture two-s/// unpack
        one-s///    146152/s       --      -30%       -50%      -67%     -78%   -88%
        s-capture   208100/s      42%        --       -30%      -53%     -69%   -83%
        s-capture2  295196/s     102%       42%         --      -34%     -56%   -76%
        m-capture   446788/s     206%      115%        51%        --     -34%   -64%
        two-s///    673459/s     361%      224%       128%       51%       --   -46%
        unpack     1239336/s     748%      496%       320%      177%      84%     --

        5.14.2:
                        Rate one-s/// s-capture s-capture2 m-capture two-s/// unpack
        one-s///    137451/s       --      -28%       -55%      -67%     -80%   -88%
        s-capture   191603/s      39%        --       -38%      -53%     -72%   -84%
        s-capture2  306586/s     123%       60%         --      -25%     -56%   -74%
        m-capture   410529/s     199%      114%        34%        --     -41%   -65%
        two-s///    692258/s     404%      261%       126%       69%       --   -41%
        unpack     1172943/s     753%      512%       283%      186%      69%     --

        5.14.1 (production perl with Regexp::Common):
                           Rate Regexp-Common one-s/// s-capture s-capture2 m-capture two-s/// unpack
        Regexp-Common   35655/s            --     -79%      -87%       -91%      -94%     -96%   -97%
        one-s///       171398/s          381%       --      -36%       -54%      -69%     -79%   -88%
        s-capture      268800/s          654%      57%        --       -29%      -52%     -67%   -81%
        s-capture2     376317/s          955%     120%       40%         --      -32%     -53%   -74%
        m-capture      556706/s         1461%     225%      107%        48%        --     -31%   -61%
        two-s///       808968/s         2169%     372%      201%       115%       45%       --   -43%
        unpack        1424354/s         3895%     731%      430%       278%      156%      76%     --

        5.12.4:
                        Rate one-s/// s-capture s-capture2 m-capture two-s/// unpack
        one-s///    136992/s       --      -32%       -52%      -69%     -80%   -88%
        s-capture   200658/s      46%        --       -30%      -54%     -71%   -82%
        s-capture2  288356/s     110%       44%         --      -34%     -58%   -75%
        m-capture   437898/s     220%      118%        52%        --     -37%   -62%
        two-s///    694505/s     407%      246%       141%       59%       --   -39%
        unpack     1143892/s     735%      470%       297%      161%      65%     --

        5.10.1:
                        Rate one-s/// s-capture s-capture2 m-capture two-s/// unpack
        one-s///    154239/s       --      -31%       -53%      -69%     -79%   -88%
        s-capture   221974/s      44%        --       -32%      -55%     -70%   -83%
        s-capture2  325661/s     111%       47%         --      -34%     -56%   -75%
        m-capture   490550/s     218%      121%        51%        --     -34%   -62%
        two-s///    740666/s     380%      234%       127%       51%       --   -42%
        unpack     1278065/s     729%      476%       292%      161%      73%     --

        5.8.9:
                        Rate s-capture one-s/// s-capture2 m-capture two-s/// unpack
        s-capture   230743/s        --     -12%       -29%      -50%     -76%   -83%
        one-s///    262166/s       14%       --       -19%      -43%     -73%   -81%
        s-capture2  323587/s       40%      23%         --      -30%     -67%   -76%
        m-capture   463883/s      101%      77%        43%        --     -53%   -66%
        two-s///    977938/s      324%     273%       202%      111%       --   -28%
        unpack     1354222/s      487%     417%       319%      192%      38%     --

        5.6.2:
                        Rate s-capture one-s/// s-capture2 m-capture two-s/// unpack
        s-capture   331911/s        --     -11%       -28%      -50%     -78%   -79%
        one-s///    372325/s       12%       --       -19%      -44%     -75%   -76%
        s-capture2  462448/s       39%      24%         --      -30%     -69%   -71%
        m-capture   664093/s      100%      78%        44%        --     -56%   -58%
        two-s///   1503714/s      353%     304%       225%      126%       --    -5%
        unpack     1574832/s      374%     323%       241%      137%       5%     --

        Enjoy, Have FUN! H.Merijn
Re: How do I remove whitespace at the beginning or end of my string?
by Daddio (Chaplain) on May 12, 2001 at 21:14 UTC

    Similarly to infinityandbeyond, you could substitute out the leading and trailing white space like this.

    $string =~ s/^\s*(.*?)\s*$/$1/;
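One caveat worth a sketch: in this pattern the dot does not match a newline, so a string with an embedded newline is left completely untouched unless the /s modifier is added. The sample string below is made up.

```perl
use strict;
use warnings;

my $multi = "  line one\nline two  ";   # hypothetical two-line input

# Without /s, (.*?) cannot cross the newline, the anchored match fails,
# and the string comes back unchanged.
(my $plain = $multi) =~ s/^\s*(.*?)\s*$/$1/;

# With /s, the dot also matches "\n" and both ends are trimmed.
(my $dotall = $multi) =~ s/^\s*(.*?)\s*$/$1/s;

print "plain unchanged: ", ($plain eq $multi ? "yes" : "no"), "\n";
print "dotall: [$dotall]\n";
```

The two-step s/^\s+//; s/\s+$//; form has no such issue, since it never needs to match the middle of the string.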
Re: How do I remove whitespace at the beginning or end of my string?
by The Mad Hatter (Priest) on Apr 16, 2003 at 14:44 UTC
Re: How do I remove whitespace at the beginning or end of my string?
by gridlock (Novice) on Feb 21, 2004 at 08:50 UTC
      ...or... more simply, you could do it this way as a one liner:
      s/^\s+|\s+$//g
Re: How do I remove whitespace at the beginning or end of my string?
by Bobinours (Acolyte) on May 23, 2002 at 01:52 UTC
    I read somewhere (probably in the Cookbook) that it's faster to execute this:
    $string=~s/^\s+//; $string=~s/\s+$//;
    than the one-liner:

    $string =~ s/^\s+|\s+$//g;

    Edited by davido: Removed useless use of /g modifier from the two-line (faster) example.
      Note that the /g modifiers in the two-line example are pointless. There's no need to repeatedly remove all whitespace at the front or end.

      Abigail

Re: How do I remove whitespace at the beginning or end of my string?
by trizen (Hermit) on Jan 17, 2012 at 07:38 UTC
    my $string = q{ this is a string };
    $string = reverse unpack('A*',reverse(unpack 'A*',$string));
    print "<$string>\n";
Re: How do I remove whitespace at the beginning or end of my string?
by trizen (Hermit) on Jan 29, 2012 at 14:33 UTC
    Some of the faster solutions can become really slow, depending on how much whitespace the string contains.

    Consider a string that contains a lot of whitespace.
    Example: my $str = q{    }. q{a b c d e f g h i j} x 200 . q{    };

    MRE book suggests this code:
    $str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;

    I admit, I was surprised how fast it is compared with "s/^\s+//" and its sibling "s/\s+$//". On the example above they can't even compete in a benchmark; they are too slow! That's because of the second regex, which matches at the end of the string: it fails many times if the string contains a lot of whitespace (see re 'debug').

    Another approach (I know it's silly, but it is faster in some cases):
    $str =~ s/^\s+//;
    $str = reverse($str);
    $str =~ s/^\s+//;
    $str = reverse($str);
    Benchmark using the above example:
    's_reverse' 42017/s    --  -12%  -48%
    'unpack_A'  47847/s   14%    --  -41%
    'MRE_regx'  80645/s   92%   69%    --
      MRE_regx does not trim whitespace as expected:
      $ perl -de 1
      Loading DB routines from perl5db.pl version 1.3
      Editor support available.
      Enter h or `h h' for help, or `man perldebug' for more help.
      main::(-e:1):   1
        DB<1> $str = ' x '; $str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;
        DB<2> x $str
      0  ' x'
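A sketch of why the posted pattern misbehaves. It insists on whitespace at both ends (\s+ on each side) and on at least two kept characters (.+\S), so for short strings it either leaves a leading space behind or does not match at all. The relaxed variant below, with \s* and .*, is my own assumption about what was intended and is not quoted from the book. Both sub names are hypothetical.

```perl
use strict;
use warnings;

# Pattern as posted in the thread: needs whitespace on BOTH ends.
sub trim_posted {
    my ($s) = @_;
    $s =~ s/^\s+((?:.+\S)?)\s+$/$1/s;
    return $s;
}

# Assumed relaxation (not quoted from the book): \s* and .* let either
# end be empty and let the kept text be a single character.
sub trim_relaxed {
    my ($s) = @_;
    $s =~ s/^\s*((?:.*\S)?)\s*$/$1/s;
    return $s;
}

for my $in ("  x  ", "x  ", "  x", "  a b  ") {
    printf "%-9s posted=[%s] relaxed=[%s]\n",
        "[$in]", trim_posted($in), trim_relaxed($in);
}
```

Whether the relaxed form keeps the speed advantage reported above would need its own benchmark.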
Re: How do I remove whitespace at the beginning or end of my string?
by Maestro_007 (Hermit) on Jun 20, 2001 at 20:23 UTC
    Try this, it's a little different, though it only works on leading whitespace:
    s/\G //g
    <paraphrase> It uses the \G anchor with the /g flag to start where the previous match left off, replacing spaces with nothing as it goes along. </paraphrase>

    It's from p. 245 of Effective Perl Programming by Joseph N. Hall and Randal Schwartz (merlyn). Randal, if there's a problem with me quoting this stuff, just let me know.
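A small sketch of the \G idiom on a made-up string. Each match must start exactly where the previous one ended, so the substitution eats the run of leading spaces and then stops at the first non-space character. Spaces later in the string are never reached.

```perl
use strict;
use warnings;

my $line = "    indented  text  ";   # hypothetical sample input

# \G pins every match to the end of the previous one (initially the
# start of the string), so only the leading spaces are removed.
(my $stripped = $line) =~ s/\G //g;

print "[$stripped]\n";
```

Note that the pattern as quoted matches a literal space only. Leading tabs or newlines would need \s in place of the space.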

RE: How do I remove whitespace at the beginning or end of my string?
by btrott (Parson) on Mar 03, 2000 at 23:56 UTC
    In order to get it to strip whitespace at the end of the string, you need to put in a non-greedy specifier, like this:
    $string =~ s/^\s*(.*?)\s*$/$1/;
    That said, though, it's better to just do this in two steps, as others have shown.
Re: How do I remove whitespace at the beginning or end of my string?
by Mago (Parson) on Jul 14, 2003 at 14:58 UTC
    Remove whitespace at the beginning and end of your string, as well as consecutive whitespace (more than one whitespace character in a row) throughout the string.
    $string = join(' ',split(' ',$string));
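A short sketch of what the split/join line does to a made-up string. split with the special pattern ' ' skips leading whitespace and splits on runs of whitespace, and join puts the pieces back together with single spaces, so tabs and newlines inside the string are replaced too.

```perl
use strict;
use warnings;

my $string = "   lots   of \t inner\n\n space  ";   # hypothetical input

# split ' ' discards leading whitespace and splits on whitespace runs;
# join then rebuilds the string with exactly one space between words.
my $squeezed = join(' ', split(' ', $string));

print "[$squeezed]\n";
```

This is only appropriate when collapsing the internal whitespace is actually wanted.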
      Beside removing whitespace at the beginning and end of the string, the above line will also remove any consecutive white space in the string.

      Abigail

      If you want to preserve internal whitespace, try
      s/           #replace
        ^[\s]*     #the start of the string followed by any number of spaces
        (.*)       #zero or more characters of anything, stuff into $1. This is the actual string we want.
        (?<!\s)    #look-behind assertion to get to the last non-whitespace char
        \s*$       #match whitespace until EOL
      /$1/x;       #replace with the (.*)

      Update: If you value your CPU time, use one of the other regular expressions instead. I got caught up in using the .*, leading to the need for the look-behind assertion. Better answers can be found here. For the interested,

      Benchmark: timing 100000 iterations of my_long_one, one_liner, two_liner...
      my_long_one:  6 wallclock secs ( 4.62 usr +  0.00 sys =  4.62 CPU) @ 21659.09/s (n=100000)
        one_liner:  3 wallclock secs ( 3.89 usr +  0.00 sys =  3.89 CPU) @ 25673.94/s (n=100000)
        two_liner:  1 wallclock secs ( 2.16 usr +  0.00 sys =  2.16 CPU) @ 46210.72/s (n=100000)

      Using
      my_long_one => s/^[\s]*(.*)(?<!\s)\s*$/$1/
      one_liner   => s/^\s+|\s+$//g
      two_liner   => s/^\s+//g; s/s+$//g
