Re^3: substr question

by stevenmay (Initiate)
on Jun 19, 2010 at 02:43 UTC (#845498)


in reply to Re^2: substr question
in thread substr question

Ah... the original post said 'around 100 characters', not that 100 was the maximum. But no matter.

Sigh. I suppose I did commit the cardinal sin of posting sloppy code.

And I should have been clear that what I posted was NOT a turnkey solution but a suggestion that a regex approach might make sense.

So... OK, below is the result of another few minutes' fiddling; this will work better, certainly.

my $string;
$string = 'lasdufaner%.alsdfi,' x 100;
# $string = 'freddy\'s wife wilma, ' x 100;
my $max = 100;
if ( $string and length $string > $max ){
    $string = substr( $string, 0, $max);
    my ($tmp) = $string =~ /(.+)\s.*?$/;        # last space if possible
    $tmp or ($tmp) = $string =~ /(.+)\W.*?$/;   # bust on last non-word
    $tmp and $string = $tmp;
    print $string
}


freddy's wife output:
freddy's wife wilma, freddy's wife wilma, freddy's wife wilma, freddy's wife wilma, freddy's wife

lasd... output:
lasdufaner%.alsdfi,lasdufaner%.alsdfi,lasdufaner%.alsdfi,lasdufaner%.alsdfi,lasdufaner%.alsdfi

The point being, I suppose, that this sort of thing might be easily handled by a regular expression in most cases.

Thanks for your comment though, it's always good to have a second set of eyes. :-)

\s

Replies are listed 'Best First'.
Re^4: substr question
by ikegami (Pope) on Jun 19, 2010 at 04:06 UTC

    > but a suggestion that a regex approach might make sense.

    That had already been done.

    > this will work better, certainly.

    That's a lot of code. Does it do anything different than the following?

    print $string =~ /^(.{0,100})(?!\S)/;
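For anyone following along, here is a minimal sketch of how that one-liner behaves on the two test strings from the earlier post. The sub name `truncate_at_space` and the `$max` parameter are made up for illustration; they are not from the thread. Note that, as far as I can tell, it differs from the longer code when the string contains no whitespace at all: the anchored match then fails outright instead of falling back to the last non-word character.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-liner (name and $max are assumptions).
sub truncate_at_space {
    my ($string, $max) = @_;
    # Greedy .{0,$max} grabs as much as it can, then backtracks until the
    # next character is whitespace or the end of the string.
    my ($short) = $string =~ /^(.{0,$max})(?!\S)/;
    return $short;    # undef when the match fails
}

my $out = truncate_at_space( "freddy's wife wilma, " x 100, 100 );
print length($out), ": $out\n";    # 97 characters, ending in "freddy's wife"

# No whitespace anywhere in the first 101 characters: no match at all.
my $none = truncate_at_space( 'lasdufaner%.alsdfi,' x 100, 100 );
print defined($none) ? "matched: $none\n" : "no match\n";    # no match
```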

      Well, I generally prefer to break things into more verbose code to make life easier 2 years down the road (or next week) when I have to change something.

      Too, experience said my approach should be faster since more but easier (for Perl) regexes are typically faster than one complex one.

      Just out of curiosity I ran a test and Benchmark confirmed my suspicion.

      I set up the test with each sub populating $string from a global, processing it and returning the results. With 10,000,000 iterations it's around 8.75 seconds for the one-liner, and around 4.75 seconds for my multi-line example on my machine.

      I fiddled with the code a bit to see if minor variations made any difference; the times changed little for either approach.
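A setup like the one described could look roughly like this with the core Benchmark module. This is only a sketch: the actual test code was not posted, so the sub names, the test string, and the use of `cmpthese` with a time-based count are all assumptions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data; the post does not say which string was used.
our $source = "freddy's wife wilma, " x 100;

# The multi-step approach from earlier in the thread, wrapped in a sub.
sub multi_step {
    my $string = $source;    # work on a copy so the global stays intact
    my $max    = 100;
    if ( length $string > $max ) {
        $string = substr( $string, 0, $max );
        my ($tmp) = $string =~ /(.+)\s.*?$/;
        $tmp or ($tmp) = $string =~ /(.+)\W.*?$/;
        $tmp and $string = $tmp;
    }
    return $string;
}

# The one-liner, wrapped the same way.
sub one_liner {
    my $string = $source;
    my ($short) = $string =~ /^(.{0,100})(?!\S)/;
    return $short;
}

# A negative count asks Benchmark to run each sub for at least that many
# CPU seconds instead of a fixed number of iterations.
cmpthese( -1, {
    multi_step => \&multi_step,
    one_liner  => \&one_liner,
} );
```

Copying the global into a lexical on every call keeps each iteration working on the full-length input, which matters for the comparison to be fair.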

      So, a little less than twice as fast in execution time...

      Not a big deal at all with any reasonable number of string matches, but a little here, a little there...

      Anyway, the OP was presented with several options. Life is good.

      \s

        > I generally prefer to break things into more verbose code to make life easier 2 years down the road (or next week)

        How is trying to understand 50 instructions easier than understanding 5? Your code is so complex it would take quite some time for me to understand it now, in weeks, in years.

        > experience said my approach should be faster since more but easier (for Perl) regexes are typically faster than one complex one.

        Both of your regexes are more complex than mine. (All three read linearly, but yours are longer.) Plus you have numerous additional Perl ops. It makes no sense for your code to be faster when the regexes are executed. And it's not. You probably didn't take into account that your solution modifies the input.
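One way modifying the input could skew a benchmark, sketched under the assumption that the timed sub works directly on the shared variable (the sub name and the counter are hypothetical): after the first call the string is already short, so every later call skips the regexes entirely.

```perl
use strict;
use warnings;

our $string = "freddy's wife wilma, " x 100;
my $max = 100;

# Counts how often the expensive branch actually runs.
my $slow_path_runs = 0;

sub truncate_in_place {
    if ( length $string > $max ) {
        $slow_path_runs++;
        $string = substr( $string, 0, $max );    # overwrites the shared input
        my ($tmp) = $string =~ /(.+)\s.*?$/;
        $tmp or ($tmp) = $string =~ /(.+)\W.*?$/;
        $tmp and $string = $tmp;
    }
    return $string;
}

truncate_in_place() for 1 .. 1000;
print "slow path ran $slow_path_runs time(s) in 1000 calls\n";    # 1 time(s)
```

Timing a loop like this mostly measures the cheap length check, not the regexes.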

        ('x' x 90).' '.('x' x 10)
                      Rate stevenmay  ikegami2  ikegami1
        stevenmay 256955/s        --      -35%      -48%
        ikegami2  394568/s       54%        --      -19%
        ikegami1  490119/s       91%       24%        --

        ('x' x 10).' '.('x' x 90)
                      Rate stevenmay  ikegami2  ikegami1
        stevenmay 102399/s        --      -13%      -19%
        ikegami2  118154/s       15%        --       -7%
        ikegami1  126866/s       24%        7%        --

        In the case where the string is shorter than 100, having the check makes it faster. You can always add that to mine. That's what ikegami2 is.

        ('x' x 99)
                       Rate  ikegami1 stevenmay  ikegami2
        ikegami1   771011/s        --      -81%      -83%
        stevenmay 4071762/s      428%        --      -12%
        ikegami2  4620312/s      499%       13%        --
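The length check mentioned above might look something like the following. This is a guess at the shape of ikegami2, not the code that was actually benchmarked; the sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical reconstruction: the one-liner with a length check up front.
sub with_length_check {
    my ($string) = @_;
    # Short strings are returned as-is, skipping the regex engine.
    return $string if length $string <= 100;
    my ($short) = $string =~ /^(.{0,100})(?!\S)/;
    return $short;
}

print length( with_length_check( 'x' x 99 ) ), "\n";                        # 99
print length( with_length_check( ('x' x 90) . ' ' . ('x' x 10) ) ), "\n";   # 90
print length( with_length_check( ('x' x 10) . ' ' . ('x' x 90) ) ), "\n";   # 10
```

For newline-free input of 100 characters or fewer the regex would return the whole string anyway, so the check changes only the cost, not the result.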

        Benchmark code:
