Re^2: substr question

by ikegami (Pope)
on Jun 19, 2010 at 00:57 UTC (#845486)

in reply to Re: substr question
in thread substr question

I don't see how that's ever useful. If the string has fewer than 100 characters, it returns nothing. If the string has 100 or more characters, it returns either the same as the OP's solution or too many characters. (I'm assuming 100 is a maximum, such as a screen width or a field width.)

You've also used a different definition of "word" than everyone else, one under which cutting "don't" into "don" and "'t" is acceptable, as is cutting it into "don'" and "t".
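The post being replied to is not reproduced in this thread, so its exact regex is unknown. As a minimal sketch, assuming it split on `\b` (or otherwise treated only `\w` characters as word characters), the following shows why "don't" ends up with two cut points. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $word = "don't";
my @cuts;

# Try every interior position and keep those where \b matches,
# i.e. where a \w character sits next to a \W character.
# The apostrophe is \W, so there is a boundary on each side of it.
for my $i ( 1 .. length($word) - 1 ) {
    next unless $word =~ /^.{$i}\b/;
    push @cuts, substr( $word, 0, $i ) . ' | ' . substr( $word, $i );
}

print "$_\n" for @cuts;
```

Splitting on whitespace instead, as in `(?!\S)`, never cuts inside "don't", because the apostrophe counts as a non-space character.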

Re^3: substr question
by stevenmay (Initiate) on Jun 19, 2010 at 02:43 UTC

    Ah... the original post said 'around 100 characters', not that 100 was the maximum. But no matter.

    Sigh. I suppose I did commit the cardinal sin of posting sloppy code.

    And I should have been clear that what I posted was NOT a turnkey solution but a suggestion that a regex approach might make sense.

    So... OK, below is the result of another few minutes' fiddling; this will work better, certainly.

    my $string;
    $string = 'lasdufaner%.alsdfi,' x 100;
    # $string = 'freddy\'s wife wilma, ' x 100;
    my $max = 100;
    if ( $string and length $string > $max ) {
        $string = substr( $string, 0, $max );
        my ($tmp) = $string =~ /(.+)\s.*?$/;         # last space if possible
        $tmp or ($tmp) = $string =~ /(.+)\W.*?$/;    # bust on last non-word
        $tmp and $string = $tmp;
        print $string;
    }

    freddy's wife output:
    freddy's wife wilma, freddy's wife wilma, freddy's wife wilma, freddy's wife wilma, freddy's wife

    lasd... output

    The point being, I suppose, that this sort of thing might be easily handled by a regular expression in most cases.

    Thanks for your comment though, it's always good to have a second set of eyes. :-)


      "but a suggestion that a regex approach might make sense."

      That had already been done.

      "this will work better, certainly."

      That's a lot of code. Does it do anything different than the following?

      print $string =~ /^(.{0,100})(?!\S)/;
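To make the one-liner's behavior easier to check, here is a small sketch that wraps it in a sub and runs it on the two test strings used earlier in the thread. The sub name `truncate_one_liner` and the `$max` parameter are assumptions added for illustration; the regex itself is the one above with 100 replaced by `$max`.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-liner; $max replaces the literal 100.
sub truncate_one_liner {
    my ( $string, $max ) = @_;

    # Greedy .{0,$max} grabs up to $max characters, then backtracks until
    # the next character is whitespace or the end of the string.
    my ($head) = $string =~ /^(.{0,$max})(?!\S)/;

    return $head;    # undef when no such position exists
}

my $with_spaces    = truncate_one_liner( "freddy's wife wilma, " x 100, 100 );
my $without_spaces = truncate_one_liner( 'lasdufaner%.alsdfi,' x 100,   100 );

print defined $with_spaces    ? "[$with_spaces]\n"    : "(no match)\n";
print defined $without_spaces ? "[$without_spaces]\n" : "(no match)\n";
```

On the string with spaces, this stops at the last whitespace within the limit, just as the longer code does. The two differ on the string without whitespace: the one-liner finds no acceptable cut point and matches nothing, whereas the longer code falls back to cutting at the last `\W` character.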

        Well, I generally prefer to break things into more verbose code to make life easier two years down the road (or next week) when I have to change something.

        Also, experience suggested that my approach should be faster, since several regexes that are easier for Perl to handle are typically faster than one complex one.

        Just out of curiosity I ran a test and Benchmark confirmed my suspicion.

        I set up the test with each sub populating $string from a global, processing it, and returning the result. With 10,000,000 iterations, it took around 8.75 seconds for the one-liner and around 4.75 seconds for my multi-line example on my machine.
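The benchmark code itself was not posted. A comparable setup might look like the sketch below, which uses the core Benchmark module's `cmpthese`. The sub names, the global `$INPUT`, and the choice of test string are all assumptions; the post does not say which input was timed, and results will vary with the input, the Perl version, and the machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Global input, as described in the post; this particular string is an assumption.
our $INPUT = "freddy's wife wilma, " x 100;
my $MAX = 100;

# The one-liner, returning its result instead of printing it.
sub one_liner {
    my $string = $INPUT;
    my ($head) = $string =~ /^(.{0,100})(?!\S)/;
    return $head;
}

# The multi-line version, returning its result instead of printing it.
sub multi_line {
    my $string = $INPUT;
    if ( $string and length($string) > $MAX ) {
        $string = substr( $string, 0, $MAX );
        my ($tmp) = $string =~ /(.+)\s.*?$/;         # last space if possible
        $tmp or ($tmp) = $string =~ /(.+)\W.*?$/;    # else last non-word character
        $tmp and $string = $tmp;
    }
    return $string;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds rather than a fixed number of iterations.
cmpthese( -1, { one_liner => \&one_liner, multi_line => \&multi_line } );
```

Because the two subs behave differently on input without whitespace, it is worth timing more than one kind of string before drawing conclusions.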

        I fiddled with the code a bit to see whether minor variations made any difference; the times changed little for either approach.

        So, a little less than twice as fast in execution time...

        Not a big deal at all with any reasonable number of string matches, but a little here, a little there...

        Anyway, the OP was presented with several options. Life is good.

