PerlMonks  

Re^5: Converting multiple spaces to nbsp

by Smylers (Pilgrim)
on Jun 17, 2005 at 14:11 UTC (#467730)


in reply to Re^4: Converting multiple spaces to nbsp
in thread Converting multiple spaces to nbsp

    'ikegami'     => sub {$working_var = $test_text; $working_var =~ s/(?<= )( )/'&nbsp;' x length($1)/eg}

That code is wrong: there's a + missing after the 2nd space, which means that $1 always has a length of one! When benchmarking code, first check that each of your variants yields the same answer as the others before timing them.
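
As a rough sketch of that kind of sanity check (the sample string and the %variants and %result names are made up for illustration, not taken from the benchmark in this thread), each variant can be run on the same input and its output and substitution count compared before anything is timed:

```perl
use strict;
use warnings;

# Illustrative input (an assumption, not the thread's $test_text).
my $test_text = "a  b   c";

# Each variant returns (result string, number of substitutions made).
my %variants = (
    # As quoted above, without the +, so $1 is always one character long.
    quoted => sub {
        my $s = shift;
        my $n = ($s =~ s/(?<= )( )/'&nbsp;' x length($1)/eg);
        return ($s, $n || 0);
    },
    # With the + restored, $1 holds the rest of the run of spaces.
    ikegami => sub {
        my $s = shift;
        my $n = ($s =~ s/(?<= )( +)/'&nbsp;' x length($1)/eg);
        return ($s, $n || 0);
    },
    GrandFather => sub {
        my $s = shift;
        my $n = ($s =~ s/ ( +)/' ' . ('&nbsp;' x length($1))/eg);
        return ($s, $n || 0);
    },
);

my %result;
for my $name (sort keys %variants) {
    my ($out, $count) = $variants{$name}->($test_text);
    $result{$name} = { out => $out, count => $count };
    print "$name: $count substitution(s) -> $out\n";
}

# Refuse to go on to timing if the variants disagree.
my %distinct;
$distinct{ $_->{out} } = 1 for values %result;
my $n_distinct = keys %distinct;
die "variants disagree\n" unless $n_distinct == 1;
print "all variants agree\n";
```

On this input the three outputs come out the same, but the quoted version performs one substitution per replaced space rather than one per run, so it is not the code it was meant to be.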

However, in this particular case it doesn't seem to make much difference to the timings.

Personally I'd go with GrandFather's solution even if it were the slower, on the grounds I think it'd be more readable ...

Personally I'd go with Ikegami's variant over GrandFather's, even though it is slower, because I think Ikegami's is more readable*! GrandFather's variant involves matching something that you don't intend replacing, then sticking it back in the substitution, which is a little messy. By using the lookbehind assertion Ikegami's way clearly documents that you wish to perform the substitution just after a space, but that the space itself isn't going to be replaced.
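
To make the difference concrete, here is a small sketch (the input string and the match_spans helper are hypothetical, purely for illustration) that lists where each pattern matches. The capture-and-put-back pattern consumes the leading space, whereas the lookbehind only asserts it:

```perl
use strict;
use warnings;

my $text = "one  two   three";   # illustrative input (an assumption)

# Hypothetical helper: [start offset, length] for every match of $re.
sub match_spans {
    my ($string, $re) = @_;
    my @spans;
    while ($string =~ /$re/g) {
        push @spans, [ $-[0], $+[0] - $-[0] ];
    }
    return @spans;
}

# Hypothetical helper: format a list of spans for printing.
sub describe {
    return join ', ', map { "offset $_->[0] length $_->[1]" } @_;
}

my @consuming  = match_spans($text, qr/ ( +)/);     # leading space is part of the match
my @lookbehind = match_spans($text, qr/(?<= ) +/);  # leading space is only asserted

print "consuming:  ", describe(@consuming),  "\n";
print "lookbehind: ", describe(@lookbehind), "\n";

# Both substitutions still produce the same string.
(my $put_back = $text) =~ s/ ( +)/' ' . ('&nbsp;' x length($1))/eg;
(my $asserted = $text) =~ s/(?<= )( +)/'&nbsp;' x length($1)/eg;
my $verdict = $put_back eq $asserted ? "same result" : "different results";
print "$verdict\n";
```

Each lookbehind match starts one character later and is one character shorter, which is the "space itself isn't going to be replaced" point expressed in offsets.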

... to more people.

That's probably true, in the sense that the people who know the lookbehind assertion are a subset of those who know about regexps. But I think I should write my production Perl code for a target audience of people who do know Perl, and not worry that people who aren't Perl coders might not understand it: I'm employed to write Perl programs, in Perl, and I don't think it'd be reasonable of my employer to expect a Java programmer to understand them unaided.

(In the same way, when writing documentation in English I want to be able to choose the best way of saying what I want to say in English, rather than intentionally writing it more sloppily on the grounds that when I write it precisely and accurately I may be using words that are unfamiliar to those who don't speak English: I'm employed to write English documentation, in English (in England, for other English people to read), and I don't think it'd be reasonable of my employer to expect a Brazilian to understand it unaided.)

* Actually, I'd probably go with my own variant (see above), which happens to be faster than either of these.

Smylers

Re^6: Converting multiple spaces to nbsp
by ikegami (Pope) on Jun 17, 2005 at 15:48 UTC
    Actually, I'd probably go with my own variant (see above), which happens to be faster than either of these.

    Yours isn't the fastest for me, although I've added use warnings; and use strict; and forced a scalar context onto the substitution:

    cmpthese(-3, {
        GrandFather => sub {
            local $_ = $test_text;
            scalar s/ ( +)/" " . ("&nbsp;" x length ($1))/ge
        },
        ikegami => sub {
            local $_ = $test_text;
            scalar s/(?<= )( +)/'&nbsp;' x length($1)/eg
        },
        Smylers => sub {
            local $_ = $test_text;
            scalar s/(?<= )( )/&nbsp;/g
        },
    });

    __END__
                   Rate     ikegami     Smylers GrandFather
    ikegami     22180/s          --        -25%        -31%
    Smylers     29772/s         34%          --         -8%
    GrandFather 32268/s         45%          8%          --

      Yours isn't the fastest for me,

      I can reproduce your results with your benchmark. So I compared it with mine and played 'spot the difference'. It seems that the significant difference was in the source text, which I'd copied from the original without thinking about it very much.

      As it happens, the way I'd copied the code and indented it in my editor I had $test_text wrapped on to 7 lines, and with a couple of spaces indenting the 2nd and subsequent lines. This gives several more places in the text where substitutions are required.

      So for comparison, with your benchmark above I get:

                     Rate     ikegami     Smylers GrandFather
      ikegami     17125/s          --        -28%        -34%
      Smylers     23635/s         38%          --         -9%
      GrandFather 25951/s         52%         10%          --

      Putting 2 spaces at the beginning of the lines gives:

                     Rate     ikegami GrandFather     Smylers
      ikegami     10990/s          --        -20%        -37%
      GrandFather 13676/s         24%          --        -22%
      Smylers     17494/s         59%         28%          --

      So vastly increasing the number of places where substitutions are required, by running s/e/e  e/g over the source text, gives:

                     Rate     ikegami GrandFather     Smylers
      ikegami      3615/s          --         -3%        -50%
      GrandFather  3711/s          3%          --        -48%
      Smylers      7181/s         99%         94%          --

      But, reverting those extra double-spaces back out of the source text and instead increasing its indentation to 10 spaces at the beginning of each line gives these results:

                     Rate     Smylers     ikegami GrandFather
      Smylers      6692/s          --        -33%        -43%
      ikegami      9917/s         48%          --        -16%
      GrandFather 11769/s         76%         19%          --

      Note that for the /e-based solutions there are the same number of places that need substitutions as when the indent was just by 2 characters, but for my solution the number of substitutions has greatly increased, cos each one is just of a single character.
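
      A quick way to see that effect is to count the substitutions directly, since s///g returns the number it made. This is only a sketch: the six-line sample text and the indented_text and substitution_counts helpers are invented for illustration and are not the text used in the timings above.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: six short lines, each indented by $width spaces.
      sub indented_text {
          my ($width) = @_;
          my $indent = ' ' x $width;
          return join '', map { "${indent}line $_\n" } 1 .. 6;
      }

      # Number of substitutions each approach performs on $text.
      sub substitution_counts {
          my ($text) = @_;
          my %count;
          my $s;
          $s = $text;
          $count{GrandFather} = ($s =~ s/ ( +)/' ' . ('&nbsp;' x length($1))/eg) || 0;
          $s = $text;
          $count{ikegami} = ($s =~ s/(?<= )( +)/'&nbsp;' x length($1)/eg) || 0;
          $s = $text;
          $count{Smylers} = ($s =~ s/(?<= )( )/&nbsp;/g) || 0;
          return %count;
      }

      for my $width (2, 10) {
          my %count = substitution_counts(indented_text($width));
          print "indent $width: ",
              join(', ', map { "$_ $count{$_}" } sort keys %count), "\n";
      }
      ```

      With this sample the /e-based counts stay at one per line for both widths, while the single-character count grows with the indentation.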

      It seems reasonable that a substitution using /e takes longer than one without, so avoiding /e has greater benefit the more substitutions are required. But your method and GrandFather's, which replace a sequence of spaces in one go, have greater benefit the longer the sequences of spaces are.

      So which method is faster depends on whether you expect there to be many places that need substituting, and whether you expect them only to be a couple of spaces long or much longer.
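
      For anyone wanting to try this on their own data, a minimal sketch along these lines might do. The two inputs, the %text and %variant names and the iteration count are all assumptions chosen for illustration; the figures it prints will vary by machine and Perl version, so none are shown here.

      ```perl
      use strict;
      use warnings;
      use Benchmark qw(cmpthese);

      # Two made-up inputs (assumptions): many short runs vs. a few long runs.
      my %text = (
          many_short => 'word  ' x 200,
          few_long   => ((' ' x 40) . "word\n") x 5,
      );

      # Each variant works on its own copy and returns the converted string.
      my %variant = (
          GrandFather => sub {
              my $s = shift;
              $s =~ s/ ( +)/' ' . ('&nbsp;' x length($1))/eg;
              return $s;
          },
          ikegami => sub {
              my $s = shift;
              $s =~ s/(?<= )( +)/'&nbsp;' x length($1)/eg;
              return $s;
          },
          Smylers => sub {
              my $s = shift;
              $s =~ s/(?<= )( )/&nbsp;/g;
              return $s;
          },
      );

      # Small fixed count so the sketch finishes quickly; raise it, or use a
      # negative number of CPU seconds as in the thread, for steadier figures.
      my $iterations = 3000;

      for my $label (sort keys %text) {
          my $input = $text{$label};
          my %timed;
          for my $name (keys %variant) {
              my $code = $variant{$name};
              $timed{$name} = sub { $code->($input) };
          }
          print "--- $label ---\n";
          cmpthese($iterations, \%timed);
      }
      ```

      Swapping in text that resembles the real input is the point of the exercise, since the ranking depends on how many runs of spaces there are and how long they are.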

      Does that make sense?

      Smylers

Re^6: Converting multiple spaces to nbsp
by GrandFather (Sage) on Jun 17, 2005 at 23:29 UTC

    Personally I'd go with Ikegami's variant ...

    Actually, I agree. Nice point well made (I did say I was still writing toddler code, didn't I?).


    Perl is Huffman encoded by design.
