http://www.perlmonks.org?node_id=125453


in reply to Re (tilly) 2: Efficiency in maintenance coding...
in thread Efficiency in maintenance coding...

tilly: Kudos to the first person to figure out what the bug is.

I'm guessing split /\W/: it splits on each individual non-word character, so if several \W characters occur together (a comma followed by a space, for instance) it also splits between them, creating a spurious "" word. The fix was to look for \w+ (although you might also say split /\W+/).

Update: The above split-based "solution" introduces spurious "" words if a line (say) begins (or ends) with a \W. Looks like m/(\w+)/g is the Right Thing in this case.

Update 2: Of course, split discards any empty trailing entries, so only the ones at the beginning of the line are a problem. (I'll get this eventually...)
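
To see the three approaches discussed above side by side, here is a minimal sketch. The sample line is made up for illustration (it begins and ends with a \W and has a comma followed by a space in the middle). The original code from the thread is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original code.
my $line = '"Hello, world!"';

# One split point per \W: an empty field appears between "," and " ".
my @single = split /\W/, $line;

# Runs of \W collapse into a single separator, but the leading
# empty field (from the opening quote) is still produced.
my @plus = split /\W+/, $line;

# Matching the words directly avoids empty entries altogether.
my @match = $line =~ /(\w+)/g;

print join('|', @single), "\n";   # |Hello||world
print join('|', @plus),   "\n";   # |Hello|world
print join('|', @match),  "\n";   # Hello|world
```

In both split versions the trailing empty fields are dropped, while the leading one stays. That is the asymmetry Update 2 refers to.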

--
:wq
  • Comment on Re(3): Efficiency in maintenance coding...

Re(tilly) 4: Efficiency in maintenance coding...
by tilly (Archbishop) on Nov 15, 2001 at 03:05 UTC
    Pretty good...

    But your proposed fix only handles 95% of the problem. Why didn't I try fixing things that way? (ie What does your fix miss?)

    BTW so far this code example is not making the case for Perl being maintainable look very good... :-(

    UPDATE
    Your update is half-right. The half that is wrong is the most common misunderstanding I have encountered about how split behaves...

    UPDATE 2
    Eventually seems to have come. Modulo difficult questions about how the definition of a word ain't what you would expect. Consider a kudo delivered. :-)

      95%? You expect to only have 20 words? ;) [Update: I was thinking of a "slurp" version -- and rechecking shows that the original Perl version had a much more serious bug...]

      BTW, I guess "ain't" ain't a word. (:

              - tye (but my friends call me "Tye")
        No, it ain't, either in Java or Perl.

        BTW you are making Perl look better, which none of the rest of us were doing! The way to demonstrate maintainability is to demonstrate how easy maintenance is. Changing the definition of "word" to include an optional 't or 's is easily enough done in the Perl version. In the Java version it is far harder to realize this important feature...
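
As a rough sketch of the kind of change described above, assuming the Perl version extracts words with a global match on \w+, the pattern could be extended with an optional 't or 's suffix. The variable names and the sample sentence are hypothetical. This is not the actual code from the thread.

```perl
use strict;
use warnings;

# Assumed word definition: a run of \w, optionally followed by 't or 's.
my $word_re = qr/\w+(?:'[ts])?/;

# Hypothetical sample input.
my $line = "It ain't Tye's fault, is it?";

# Collect every match of the extended word pattern.
my @words = $line =~ /($word_re)/g;

print join('|', @words), "\n";   # It|ain't|Tye's|fault|is|it
```

Only the pattern changes. The surrounding loop or counting logic would stay as it was.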