PerlMonks  

Pulling white space off before/after string?

by ecuguru (Monk)
on Apr 17, 2007 at 04:26 UTC ( #610450=perlquestion )

ecuguru has asked for the wisdom of the Perl Monks concerning the following question:

Hi gang,

I'm trying to clean up some web page filtering, and I've got strings that have too much white space, like:
" New Hampshire "
"bob tom dick harry "

Strings that should have spaces inside them, but with spaces at the beginning and end that I need to get rid of. I'm guessing that I should make a while loop that starts from the front and if the character is a space, keep going, until I get to a non-space character, and copy the remainder of the string. Repeat from the end, backwards.

But I'm figuring there has to be a better way to do this. How can I trim whitespace off both ends (as little as one space, or as many as 30) without killing the legitimate spaces in between characters?

Replies are listed 'Best First'.
Re: Pulling white space off before/after string?
by bobf (Monsignor) on Apr 17, 2007 at 04:33 UTC

    See perlfaq4: "How do I strip blank space from the beginning/end of a string?".

    s/^\s+//;   # strip white space from the beginning
    s/\s+$//;   # strip white space from the end

      I just came back to post I found a working solution:
      $buff =~ s/ *$//;   # Remove trailing spaces
      $buff =~ s/^ *//;   # Remove leading spaces
      You've got the same idea I have, but I'm too much of a regex newbie to figure out the differences. Can you tell me if there are any differences between our regexes?
      Thanks!!

        / *$/ means 'match zero or more space characters anchored to the end of the string', which will always match since it can always match zero characters. It will also match only space characters - other white space (such as tabs) will not match.

        /\s+$/ means 'match one or more white space characters anchored to the end of the string'. The \s includes spaces, tabs, and other forms of white space.
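To make the difference concrete, here is a small sketch that runs both pairs of substitutions on the same string. The sample string (with a trailing tab) and the variable names are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample: one leading space, then a trailing space and a tab.
my $input = " New Hampshire \t";

# Spaces-only patterns, as in the follow-up above.
(my $spaces_only = $input) =~ s/ *$//;
$spaces_only =~ s/^ *//;

# \s patterns, as in the perlfaq4 answer.
(my $any_ws = $input) =~ s/^\s+//;
$any_ws =~ s/\s+$//;

# The tab sits between the trailing space and the end of the string,
# so / *$/ can only match the empty string at the very end and the
# trailing space and tab both survive. /\s+$/ removes them.
printf "spaces only:    length %d\n", length $spaces_only;
printf "any whitespace: length %d\n", length $any_ws;
```

If the data can only ever contain plain spaces, both versions give the same result; the \s version is simply the safer default for text scraped from web pages.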

        See perlre for more information. The Tutorials section also has several entries in Pattern Matching, Regular Expressions, and Parsing, which you may find helpful.

        You are matching only space characters. bobf's version matches any white space characters (which includes spaces, tabs and various other characters, including Unicode white space characters).


        DWIM is Perl's answer to Gödel
Re: Pulling white space off before/after string?
by GrandFather (Saint) on Apr 17, 2007 at 04:35 UTC

    Perhaps the biggest and most generally used tool in the Perl toolbox is the regex (see also perlretut). I strongly recommend that you take a look at the regular expression documentation; then, if you are still asking this question, come back with the regex substitution you are having trouble with.


    DWIM is Perl's answer to Gödel
Re: Pulling white space off before/after string?
by monkey_boy (Priest) on Apr 17, 2007 at 09:48 UTC
    Regexes are soooo last decade!
    Use String::Util:
    use String::Util qw(trim);
    my $s = trim(' here are lots of nasty spaces ');

    UPDATE: It seems I accidentally caused a small argument with my tongue-in-cheek comments. However, IMHO this is a nice little module, and trim() is one of many handy little snippets it provides that I use often; it's worth checking out! Also, just because it hasn't been updated in a year does not make it obsolete; it's more likely that there are no bugs to fix, as all the subs are quite small and simple.


    This is not a Signature...

      In what decade was it good practice to introduce a dependency on an external module in order to get the exact same two substitutions hidden in a sub:

      #------------------------------------------------------------------------------
      # trim
      #

      =head1 trim(string)

      Returns the string with all leading and trailing whitespace removed.
      Trim on undef returns undef.

      =cut

      sub trim{
          my ($val) = @_;

          if (defined $val) {
              $val =~ s|^\s+||s;
              $val =~ s|\s+$||s;
          };

          return $val;
      }
      #
      # trim
      #------------------------------------------------------------------------------

      If it really bothers you just add one line to your code and be done with it.

        In what decade was it good practice to introduce a dependency on an external module in order to get the exact same two substitutions hidden in a sub:

        The same decade where abstracting operations behind a meaningful name made programs easier to understand. Compare:

        # trim whitespace
        $string =~ s/^\s+//;
        $string =~ s/\s+$//;

        ... to:

        trim( $string );
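For readers who want the named abstraction without the external dependency, a minimal sketch follows. It assumes a locally defined trim() that returns a trimmed copy and passes undef through, in the spirit of the sub quoted earlier in this thread; the sample string and variable names are illustrative only.

```perl
use strict;
use warnings;

# Illustrative local helper: returns a trimmed copy of its argument
# and leaves the caller's variable untouched. undef stays undef.
sub trim {
    my ($val) = @_;
    return $val unless defined $val;
    $val =~ s/^\s+//;   # leading white space
    $val =~ s/\s+$//;   # trailing white space
    return $val;
}

my $string  = "  bob tom dick harry  ";
my $trimmed = trim($string);    # keep the return value

print "[$trimmed]\n";
```

Because this helper works on a copy, calling it in void context changes nothing; the caller has to assign the result, for example `$string = trim($string);`.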
Re: Pulling white space off before/after string?
by f00li5h (Chaplain) on Apr 17, 2007 at 04:36 UTC

    Check out perlfaq4; you may also want to have a look at perlre and perlretut.

    The size of the strings may change the way you do it, because capturing a 10Gb string just to remove spaces from the start and end may not be the best approach.

    Also, what have you tried so far?

    @_=qw; ask f00li5h to appear and remain for a moment of pretend better than a lifetime;;s;;@_[map hex,split'',B204316D8C2A4516DE];;y/05/os/&print;
    Much Later removed typos
Re: Pulling white space off before/after string?
by swampyankee (Parson) on Apr 17, 2007 at 04:43 UTC

    With a regex, of course.

    s/^\s+//;   # removes leading white space
    s/\s+$//;   # removes trailing white space

    emc

    Insisting on perfect safety is for people who don't have the balls to live in the real world.

    —Mary Shafer, NASA Dryden Flight Research Center
Re: Pulling white space off before/after string?
by swares (Monk) on Apr 17, 2007 at 06:05 UTC
    If you want to make the changes on the fly, you could use this with your regex. The nice part is that both operations are performed at once. You could also do it as above, by looping through the lines of the file. In the one-liner, the .bak specifies a backup file extension, so you keep an original copy; change it as you like, or remove it if you want no backup.
    Variable substitution:
    $my_data_line =~ s/^\s+|\s$+//g;
    Modify the files in place, making a backup copy at filename.html.bak, the easy way:
    perl -p -i.bak -e 's/^\s+|\s$+//g' *.html
      you probably mean:
      $my_data_line =~ s/^\s+|\s+$//g;
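As a rough sketch of the corrected single substitution in use, the loop below applies it to a few made-up lines standing in for file contents. The array, the sample strings, and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for the contents of a file.
my @lines = ("  New Hampshire \t", "\tbob tom dick harry   ", "no padding");

for my $line (@lines) {
    # /g lets both alternatives match: once at the start of the
    # string and once at the end. Inner spaces are left alone.
    $line =~ s/^\s+|\s+$//g;
}

print "[$_]\n" for @lines;

# \s also matches "\n", so a line that still carries its newline
# loses it as well.
my $with_newline = "  padded line  \n";
(my $stripped = $with_newline) =~ s/^\s+|\s+$//g;
```

The last two lines matter for the in-place one-liner: under -p each line still ends in its newline, so the substitution removes the line endings along with the padding. Adding the -l switch, which chomps each input line and puts the newline back on output, is one common way to avoid that.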
