PerlMonks  

Re^2: How do I quickly strip blank space from the beginning/end of a string?

by blahblahblah (Priest)
on Jul 21, 2009 at 22:25 UTC (#782109)


in reply to Re: How do I quickly strip blank space from the beginning/end of a string?
in thread How do I quickly strip blank space from the beginning/end of a string?

You're right, in my haste to get this question out there I made the jump from my original problem to japhy's post to talking about his benchmarks. One of my coworkers pointed out the same thing to me as I was headed out the door. Obviously I should be benchmarking the exact problem I want to solve, not some generally similar example. After I get the kids to bed I'll write a better benchmark and try your \K suggestion below too. Thanks.

Also, you made the point that none of my input ends with spaces. I think that's generally true in real life usage too. It's frustrating that we have this pervasive idiom in our code of "strip whitespace just in case", but I think most of the time the input is already just fine. In fact, I think much of the time the input is short and has no spaces at all. I wonder if I should be checking it with index() first to quickly rule out that case.
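
A minimal sketch of that pre-check idea, assuming the common case really is input with no leading or trailing whitespace. It swaps the index() check for a regex look at the first and last characters, and the name trim_if_needed is hypothetical. Whether the pre-check pays off depends on the data, so it would need to be benchmarked against a plain pair of substitutions.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading/trailing whitespace, but only run the
# substitutions when a cheap check says there is something to strip.
sub trim_if_needed {
    my ($s) = @_;
    return $s unless defined $s && length $s;

    # Cheap pre-check: is the first or the last character whitespace?
    if ($s =~ /\A\s/ || $s =~ /\s\z/) {
        $s =~ s/\A\s+//;   # leading whitespace
        $s =~ s/\s+\z//;   # trailing whitespace
    }
    return $s;
}

print '[', trim_if_needed("  padded value \n"), "]\n";   # prints [padded value]
print '[', trim_if_needed("clean"), "]\n";               # prints [clean]
```

Interior whitespace is left alone, and a string that is already clean is returned without being modified.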

update: added paragraph spacing


Re^3: How do I quickly strip blank space from the beginning/end of a string?
by ikegami (Pope) on Jul 21, 2009 at 23:06 UTC

    Also, you made the point that none of my input ends with spaces. I think that's generally true in real life usage too.

    Then shouldn't you be benchmarking space detection?

      Good point. I've tried the 3 methods below:
      if ($x =~ /\s/)
      and
      if (substr($x, -1) =~ /^\s/)
      and
      if (rindex($x," ") == 0 || rindex($x,"\r") == 0 || rindex($x,"\n") == 0 || rindex($x,"\t") == 0)
      The rindex method is much faster on the data that has spaces; otherwise the substr method is the fastest. Any other good ideas, keeping in mind that the data (I think) won't often contain any trailing spaces?

        $x =~ /\s/
        doesn't work: it matches whitespace anywhere in the string, not only at the end. It should be
        $x =~ /\s\z/
        Bonus: The speed of the fixed version won't depend on the length of the string like the broken one did.

        if (rindex($x," ") == 0 || rindex($x,"\r") == 0 || rindex($x,"\n") == 0 || rindex($x,"\t") == 0)
        doesn't work: rindex returns 0 only when the last occurrence is the first character of the string. It should be
        if (rindex($x," ") == length($x)-1 || rindex($x,"\n") == length($x)-1 || rindex($x,"\t") == length($x)-1 || rindex($x,"\r") == length($x)-1)
        Or just
        my $ch = substr($x, -1); $ch eq " " || $ch eq "\n" || $ch eq "\t" || $ch eq "\r"

        And then there's
        length($x) && index(" \n\t\r", substr($x, -1)) >= 0

        You should be concentrating on writing code that actually works before worrying about operations that take 0.00001 second.
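
In the spirit of that last point, here is a rough sketch of checking the trailing-whitespace tests against known inputs before timing them. The hash name, the test cases, and the RUN_BENCH environment variable are all assumptions made for illustration. The rindex(...) == length($x)-1 form is left out because, for an empty string, both sides are -1 and it would report a trailing space. Note also that \s matches form feed as well, which the character-list versions do not cover.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Candidate checks for "does the string end in whitespace?", adapted from the
# replies above. Each takes a string and returns 1 or 0.
my %ends_in_space = (
    regex  => sub { $_[0] =~ /\s\z/ ? 1 : 0 },
    substr => sub {
        return 0 unless length $_[0];          # guard the empty string
        my $ch = substr($_[0], -1);
        return ($ch eq " " || $ch eq "\n" || $ch eq "\t" || $ch eq "\r") ? 1 : 0;
    },
    index  => sub {
        return (length($_[0]) && index(" \n\t\r", substr($_[0], -1)) >= 0) ? 1 : 0;
    },
);

# Illustrative inputs paired with the expected answer.
my @cases = (
    [ "",      0 ],
    [ "abc",   0 ],
    [ " abc",  0 ],   # leading space only
    [ "a b c", 0 ],   # interior spaces only
    [ "abc ",  1 ],
    [ "abc\n", 1 ],
    [ "abc\t", 1 ],
    [ "abc\r", 1 ],
    [ " ",     1 ],
);

my $failures = 0;
for my $name (sort keys %ends_in_space) {
    for my $case (@cases) {
        my ($input, $expected) = @$case;
        next if $ends_in_space{$name}->($input) == $expected;
        $failures++;
        print "FAIL: $name, expected $expected\n";
    }
}
print $failures ? "$failures failure(s)\n" : "all candidates agree\n";

# Only time the candidates once they are correct; set RUN_BENCH=1 to enable.
if ($ENV{RUN_BENCH} && !$failures) {
    my @data = map { $_->[0] } @cases;
    my %timed;
    for my $name (keys %ends_in_space) {
        my $check = $ends_in_space{$name};
        $timed{$name} = sub { $check->($_) for @data };
    }
    cmpthese(-1, \%timed);   # run each candidate for about one CPU second
}
```

For a meaningful timing, @data should be replaced with a sample of the real input, since the relative speeds reported earlier in the thread depended on whether the data contained spaces.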
