Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

I wrote some clever code - can the comment "defuse" it?

by Aristotle (Chancellor)
on Jun 17, 2003 at 21:14 UTC ( #266644=perlquestion: print w/replies, xml ) Need Help??

Aristotle has asked for the wisdom of the Perl Monks concerning the following question:

I have just come across the first time in a very long while that I feel a need to comment a piece of code. It's the solution for a relatively simple task that I find turned out very natural - if unorthodox. The latter is the reason I wonder whether I'll be able to decipher its meaning in a few months from now.

So first of all, I ask you to take a look at these four lines. All you need to know is that @letter contains symbolic letter names (the Postscript letter names, if anyone's curious) and that %property consists of pairs of all possible letter names and a value of either L, D or X. (Depending on whether the symbolic name represents a letter, a digit or something else as per its Unicode properties.)

Please don't look too hard at the code if you can't figure it out. Read the comment and then look again. My question is - how much help is the comment?

my @letter_seq = map( $curr->[STYLE].":".join(',', splice @letter, 0, length), join('', @property{@letter}) =~ /\G(X|D+|L+)/g, );
The comment:

# to group together letter characters, and digit characters,
# and leave anything else as single symbols, the regex engine
# is run against a map of letter properties put in a
# string. each match on the map is as long as a slice of
# consecutive characters with that property in the array

Makeshifts last the longest.

Replies are listed 'Best First'.
Re: I wrote some clever code - can the comment "defuse" it?
by dws (Chancellor) on Jun 17, 2003 at 22:04 UTC
    I have just come across the first time in a very long while that I feel a need to comment a piece of code.

    I wonder how much that need gets reduced after some minor reformatting. My preference is to use the the other form of map, add parenthesis to make the splice more readable, and make the implicit argument to length explicit.

    my @letter_seq = map { $curr->[STYLE] . ":" . join(',', splice(@letter, 0, length $_)) } join('', @property{@letter}) =~ s/\(G|D+|L+)/g;
    To my eye, that makes the intent a bit clearer.

    Then, when I look at what needs commenting, I'm drawn to two things: First, that @letter_seq doesn't contain the same thing as @letter (which I found misleading at first read). Next, the final join. The latter issue can be clarified by a using a carefully named temporary variable.

    my $flattened_properties = join('', @property{@letter}); my @letter_seq = map { $curr->[STYLE] . ":" . join('', splice(@letter, 0, length $_)) } $flattened_properties =~ /\G(X|D+|L+)/g;
    I find this more readable (YMMV), but still wonder if the algorithm is what you really intend. The properties you're extracting from the collapsed property string don't correspond with the start of the substring of @letter that you're extracting. That part puzzles me.

      I did originally have an intermediate $property_map in there.. They do correspond - remember that splice is destructive. Let me rephrase that code a bit. I think I know how to clear it up now.

      Makeshifts last the longest.

Re: I wrote some clever code - can the comment "defuse" it?
by halley (Prior) on Jun 17, 2003 at 22:12 UTC

    No, for me, the comments didn't help make this more maintainable.

    This is exactly why I often state the aphorism, strategy in comments, tactics in code. Write your comments to explain the intended overall goal of a snippet, not what each argument is or why you wrote it with that particular syntax. I nearly should be able to delete the CODE unseen and rewrite it to match the COMMENTS.

    To help understand the difference between strategy and tactics, think of the Lieutenant or Line Manager: they order you to secure a rock, or to set camp, or to close the deal, or to collect the report. Unless they're ineffective micromanagers or they're training you, they don't mention the tools that you, the grunt/peon, will need to accomplish the job.

    For code comments, I don't want to hear the word "array" or "regex," I already am familiar with those terms or you should fire me. Tell me that you want to generate the exhaustive set of printer model names, or the font's available glyph names, or the lexically scanned tokens, or whatever this is.

    If you want comments to succeed, it should be legible in English (or whatever your maintenance coders will agree to read). Say it out loud, and imagine hearing someone say it to you. If something seems clumsy or vague, rephrase. Consider capitalization and punctuation to aid the reader.

    Then reinforce your strategic comments with tactical code. Phrase the code so it reads naturally if possible, and reinforce the comments' value by sharing the key nouns as identifiers in the code.

    [ e d @ h a l l e y . c c ]

Re: I wrote some clever code - can the comment "defuse" it?
by kabel (Chaplain) on Jun 17, 2003 at 21:58 UTC
    i remembered a quotation of a former prof of mine:
    To paraphrase Samuel Johnson, we advise you to look over your code very carefully, and whenever you find a part that you think is particularly fine, strike it out!

    Charniak et al. 1980

    the code you posted is not the easiest one. plus, you are a skilled perler. plus, the language we speak about is perl (everything just is condensed). plus, you said yourself that you are unsure about the expresiveness of the code.

    i think, everyone reading that in a few months will get problems - it is then a matter of experience how long the remembering-phase will take to start ;-)

      Or to take one from the collection on my home node:
      Rule of thumb: if you think something is clever and sophisticated, beware - it is probably self-indulgence.
      — Donald Norman

      Makeshifts last the longest.

Re: I wrote some clever code - can the comment "defuse" it? (stab at restructuring)
by Aristotle (Chancellor) on Jun 17, 2003 at 22:13 UTC
    After a lot of discussion (much of which happened on the CB and is not going to be available for posterity, I'm afraid) I think the following is much clearer.
    my @sequence_length = map { length } join('', @property{@letter}) =~ /\G(X|D+|L+)/g, my @letter_seq = map { $curr->[STYLE] . ":" . join(',', splice @letter, 0, $_), } @sequence_length;
    Update: I got pretty positive feedback for this version from the monks involved in the discussion, so that's what I'm sticking with. Thanks everyone! :)

    Makeshifts last the longest.

      I think the following is much clearer.

      Saving the run lengths in a temporary does help readability. The only nit I have with the otherwise good updated version is that it destroys @letter from within an expression. In this case, it's what you want, but it risks a "Hey, what happened to my letters?!?" if someone else tries to reuse the snippet later.

        I don't like it either, but the alternative requires an index variable and some math in the index range for a slice, which I find awfully icky. I'm surprised by the number of people who were caught out by the splice though - I use it in similar fashion as I do here pretty frequently, so it's a familiar sight.

        Makeshifts last the longest.

      It just dawned on me that I can follow MarkM's advice from the chatterbox more directly than I thought. My %property is completely redundant considering I have %enc which maps from PS names to Unicode characters.
      my @sequence_length = map { length } join('', @enc{@letter}) =~ / \G( \p{Letter}+ | \p{DecimalNumber}+ | [\P{Letter}\P{DecimalNumber}] ) /gx;
      This is, IMO, hugely more self documenting than what I had before.

      Makeshifts last the longest.

      I didn't follow this node.. Just wanted to mention that the discussion can be made available if anyone would like, and no-one objects.. (Until next log-delete, that is.. hey, rolling logs would be good for im2, which module does that? :)


Re: I wrote some clever code - can the comment "defuse" it?
by hardburn (Abbot) on Jun 17, 2003 at 21:51 UTC

    I'd treat this bit of code as a black box. In other words, the comment should say what input is expected and what will come out again. The "how" part is left as an exercise to your future self :)

    Reasoning : you can't be sure that your comment will make things any more understandable at some point in the future, but with "black box code", you can at least be sure your future self will know how to rewrite that bit, should you feel the need to.

    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: I wrote some clever code - can the comment "defuse" it?
by djantzen (Priest) on Jun 17, 2003 at 21:44 UTC

    I'd be explicit in the length and drop the trailing comma after your regex. How about letting the documentation read in the order of execution?

    my @letter_seq = map( $curr->[STYLE].":".join(',', splice @letter, 0, length($_)), join('', @property{@letter}) =~ /\G(X|D+|L+)/g ); # Join the result of a hash slice taken from the letter-to-property # mapping. Perform a match against the joined string to group together # alphanumeric characters and leave all others distinct. Remove each m +atch # from the @letter array and join the character(s) to an output string + such # that the resulting format is i.e., "some_type:f4,!".

    "The dead do not recognize context" -- Kai, Lexx
Re: I wrote some clever code - can the comment "defuse" it? (explanations)
by Aristotle (Chancellor) on Jun 17, 2003 at 21:31 UTC

    After some CB conversation, I decided to add the following information.

    First of all - ignore the $curr->[STYLE].":". style bit which is from elsewhere in the code and not relevant here.

    Secondly, the following is how it works. Please try to work it out before reading this, otherwise you don't help me. :)

    I'm using a hash slice and a join to create a property map for the letters. If @letter is qw(F o o comma b a are) then after the hash slice and join I get "LLLXLLL". The regex splits this into qw(LLL X LLL), which is what map iterates over. The length of each of those matches controls the number of elements spliced off the front of @letter.

    Update: these are then joined together so @letter_seq ends up as ("F,o,o", "comma", "b,a,are").

    Makeshifts last the longest.

      Apart from that I think that using the block form of map would make it a whole lot more obvious which part of this is being iterated over, which I know you'll disagree with, possibly vehmently :), I think description of how it works in this post is infinitely more useful as a comment that the original.

      I tried (really hard) to understand the code without the original comment. I eventually gave up and read the comment, and frankly was none-the-wiser. Maybe in the context of the surrounding code it might have clarified more. Once I read your description here, I finally understood what the code was doing, and how it was doing it.

      On that basis, I would use the description from this post, including the update, in preference to the original.

      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Re: I wrote some clever code - can the comment "defuse" it?
by jepri (Parson) on Jun 18, 2003 at 05:58 UTC
    As others have kind of touched on, comments should never explain what the code actually does. If you ever feel the need to to explain code, then you do need to rewrite it to make it clearer.

    I think that your rewritten version is clearer, but I feel the problem in the original is the splice. I have developed a well earned loathing of tinkering with arrays like that. I prefer to map or push elements and work on them methodically.

    I think you could do it a little more lisp like with multiple maps taking the place of the join and regexp, but that's personal style, and that would require a temporary variable outside the scope of the map block. But temporary variables aren't always bad.

    I didn't believe in evil until I dated it.

      I need slices of varying length from the input array. There's currently no way to express this succintly - the splice solution is merely the least unsatisfactory given my biases. It's not good, no, but the alternative is even worse.

      The other approach would have been to loop over the letter array and manually compare previous and current property. Conceptually, that's a very basic form of pattern matching - so why not use the native pattern matching mechanism Perl already offers?

      The problem is that the regex engine only works against characters in a string - if it was possible to run against any ordered sequence of elements, regardless of representation (whether that be the traditional characters in a string or elements in an array or maybe something else completely like records in a database), I wouldn't need to go through mapping to a string first. Rather than get a series of strings out of the match, I'd get exactly the series of slices I'm building manually here.

      Since I can only have the former, I have to somehow construct the latter from that. And a series of slice lengths to take from an array is not elegantly expressible in current Perl idioms.

      Makeshifts last the longest.

Re: I wrote some clever code - can the comment "defuse" it?
by ChemBoy (Priest) on Jun 18, 2003 at 05:27 UTC

    I think the various viewpoints on your original question have all been covered adequately already, so I'm going to launch onto a slightly random tangent instead—is there a particular reason why  /\G(X|D+|L+)/g is preferable to the slightly less punctuated /X|D+|L+/g in this situation?

    I might also have chosen to use a strategically placed local $, = ""; and "@property{@letter}" =~ // rather than a join, but I'm not going to ask why you didn't do that, since I admit that it probably falls well over the line into "too clever by half" territory. :-)

    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

      I opted for this particular regex as a minor safety net. If any value in %property is ever something else than L, D, or X, the regex will fail at that point. That's just a smidgen better than letting the regex skip those silently, causing the splice to pull together the wrong things.

      The latter would be too clever by a half indeed; the join is very clear in my mind, and if I wanted to make it more explicit, I'd use another temporary scalar at that point instead.

      Makeshifts last the longest.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://266644]
Approved by Enlil
Front-paged by mattriff
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (4)
As of 2023-12-05 05:39 GMT
Find Nodes?
    Voting Booth?
    What's your preferred 'use VERSION' for new CPAN modules in 2023?

    Results (25 votes). Check out past polls.