
What is code readability?

by brian_d_foy (Abbot)
on Jan 02, 2007 at 19:23 UTC ( #592616=perlmeditation )

[ I just finished writing a chapter for an upcoming O'Reilly book about what makes code beautiful. I did a bit of research on what people have to say about readability and ultimately used none of it because I decided it was a dumb way to measure code beauty and it was flame-bait anyway. I don't want to waste all that effort though, so I'll add it to the monastery and get it off my desktop. :) Maybe others can expand on what I started. ]

Start a language war, and no matter what side you're really on, you'll probably try to claim that your language is more readable. Is Python really easier to read than Perl, or Java easier to read than Python? Anyone voting for APL or LISP? If you're Damian Conway it doesn't really matter because you turn any language into what you want anyway and it is more readable as long as you forget which language it started as.

Worse than that are the intra-language readability wars. Everyone thinks that their code is readable and almost everyone thinks that nobody else's code is even decipherable. Do you use K&R or ANSI kung-fu style? Two or four spaces? Or tabs? Do you even use whitespace? (I'm looking at you, Monks-Who-Post-In-Obfuscation.) I, like everyone else, have my own ideas about where things should go in code.

Maybe you get three people to agree on the right style. Now which built-in functions do you get to use? You get two of the three to agree not to use goto, but nobody can agree on unless except for a moment before someone asks "You mean just in statement modifiers" and the other person sucker punches him.

Indeed, the popularity of tools such as Perl::Tidy (or even Perl::Critic) demonstrates that not even people who agree on a language can decide.

Before I start on my own thoughts I figured I'd scour Google to see what people talk about when they mean "code readability". Here's a short list, roughly in the order I encountered them, with each requirement phrased in terms of enhanced readability. Curiously, the top Google hits were mostly blog entries rather than authoritative sources.

  1. Smaller line count
  2. Brace-defined blocks
  3. Short scopes with items declared close to their use
  4. Naming conventions
  5. Doesn't need syntax highlighting
  6. Consistency
  7. Familiar style
  8. Defined regions of code that go together as a logical unit
  9. Lack of clever-ass tricks
  10. Use words instead of symbols
  11. Make functions with good names to describe intent
  12. Use whitespace to separate related chunks of code

Now, before I get to my opinions, let me prepare the battlefield.

I don't care what language it is

Before I judge the readability of a language, I should know the language. I don't fault the programmer for using something that's perfectly valid just because I don't know that part of the language. For instance, most commonly in code reviews, I find that newcomers to Perl don't particularly like the use of map for whatever reason. That makes the code unreadable only in the same way that I have trouble reading Le Monde without a way to look up the French nouns I don't know. It's not about French's readability; it's just my deficient French vocabulary.

I particularly like Joel Spolsky's "Making Wrong Code Look Wrong" because he explains how newcomers don't even know what to look at when they first see a language. You actually have to know the language and use it for a bit before you can say anything about it.

I need to understand the task, too

Someone can write a perfectly readable bit of device driver code and I still won't understand it. I'll probably be able to figure out the syntax and the various operations but I'll be foggy on what it's doing and why it has to happen that way. That's not the fault of the code. I'm not a device driver sort of guy, and as complicated as those can be, I don't expect the code to make me understand the field.

I don't expect the documentation to explain the complete history of device drivers either. Documentation is good, but anything useful is going to assume quite a bit of basic knowledge about the domain. I've done a lot of web apps, but I don't document the cookie specification in my programs. I might note an exception that covers a particular browser bug and how the code deals with that, but I'm not writing an encyclopedia. Leaving code comments to knowledgeable developers isn't the same as teaching newbies.

Going the other way, even with "unreadable" code, if I understand the task I can probably figure it out pretty easily. I already know the steps involved, so I just have to connect that to the code.

I'd bet that some people who like to carp about unreadable code in some language looked at code for something they wouldn't understand anyway or were new to the subject and didn't give themselves enough of a trial period with the new subject. There's no reason that we should instantly understand anything (although that's not license to be coy with the code I write).

People are different

There is no measure of absolute readability just like there is no way to get people to all have the same favorite movie. Although the stereotype would like to paint this as the usual nerdy inability to see anything other than black-or-white, it's a universal condition.

Eric Raymond likes Python because he thinks it's more readable. Okay, that's fine. He can think that and not be wrong. However, I take my wife's advice on this: "Take what Roger Ebert says and think the opposite!" It's not that Roger Ebert is wrong but that she knows what he thinks about other things, what she thinks about other things, and how that affects any single thing Ebert might say and how she might agree. Apply that to Eric Raymond's other ideas.

That's not really a knock on Eric, though. If you think like he does in other things, you may find that you also will find Python to be more readable than other code. If you're a fan of Larry Wall's other ideas, you'll probably find Perl as readable as he does. It's not that Python is more readable, it's more readable for Eric. Why that might be is another, but off-topic, long post best left to a pub discussion.

Even within a language different people prefer different styles, so apply this same idea at the micro level.

High-level language source code is for people

We write in higher-level languages so people can understand what we're writing, as well as for machine portability and for encapsulating big ideas in keywords and idioms. If source code were for the computer, we'd just write it in machine language.

Most people write for themselves

Left alone, people come up with a model of their world that makes sense to them, ranging from where to put things in the kitchen to variable names in their code. Their scheme makes perfect sense to them.

Yesterday I ran across "Perlish Coding Style", which lacks both Perlishness and style in favor of an overriding fondness for semicolons. I'll come back to this later, but here's an example:

    ; sub foobar
       { my $value = shift
       ; my $c = 0
       ; if ( ... )
          { my $a = ...
          ; my $b = $a + 5
          ; my $c = compute($a)
          }
       ; return
       }

That makes perfect sense to DOMIZIO, and he even tried to explain it to everyone else. His style is a bit of an extreme example, but everyone has their little thing that makes sense to them.

Real Readability

Readability isn't a feature of a language—that comes from my first assumption. I'm going to skip the usual bits about good variable names, commenting, and so on. Even programmers who refuse to do that stuff secretly know they are good ideas. More fundamental than those things are a few principles that I can apply to any code in any language.

Real readability isn't strict adherence to a set of guidelines. You aren't necessarily going to find it in pod:perlstyle or Perl Best Practices. Those certainly give you techniques that make code look nice but you need some first principles.

The important bits stand out

No matter what I'm coding, the important part of the code should be more apparent than the fiddly bits that go around it. In the extreme example, that's where that Perlish Coding Style went wrong. It elevates the banal statement and elements separator to prime importance in every line because every line starts with them and the semicolon draws attention to itself. We read code left to right (mostly), so things on the left mean more to us. Things on the right mean less. The Perlish Coding Style is less readable in general because it makes things that shouldn't be important stand out more than they should.

Whitespace is one way to emphasize things. Indentation, blank lines, and lining things up in columns all work to make the structure of code more apparent. People can argue exactly how to do that, but at the end of the day it's only about what's best for that code. What's best, however, might not always be the same. Most often when people argue about whitespace, they aren't arguing about what is best for the code. They're defending their editor, how they cut-n-paste, or some other extra-code concerns.

Code hiding, a.k.a. subroutining, removes the banal stuff unrelated to the task but necessary to move around the data. Programs have a job to do. That's their narrative arc. To drive that plot, all sorts of little things have to happen. We want to move the little bits out of the way to clearly show the plot. The subroutine name groups the boring stuff together and describes it, all the while hiding the boring bits from view. They were just getting in the way anyway.
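A minimal Perl sketch of that idea, under made-up assumptions: the task (averaging the valid scores in a list) and the names valid_scores and average are hypothetical, chosen only for illustration. The last few statements are the "plot"; the subroutines hold the fiddly bits.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only defined, non-negative numbers.
sub valid_scores {
    return grep { defined($_) && /^\d+(?:\.\d+)?$/ } @_;
}

# Hypothetical helper: arithmetic mean, or 0 for an empty list.
sub average {
    my @n = @_;
    return 0 unless @n;
    my $sum = 0;
    $sum += $_ for @n;
    return $sum / @n;
}

# The plot: read, clean, summarize, report.
my @raw    = ( 90, 'n/a', 70, undef, 80 );
my @scores = valid_scores(@raw);
my $mean   = average(@scores);
print "average of ", scalar(@scores), " scores: $mean\n";
```

Someone reading the last four statements can follow the job the program does without first wading through the regular expression or the summing loop.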

It's easy to see what's related

Things that are related to each other should have some sort of connection, whether it's a variable name, proximity in the code base, or something else. Consider, for instance, two separate variables:

    x = 3
    y = 1

Outside of their context, those variables don't have much meaning (although the particular use of x and y suggests, at least to scientists, a 2D point; see "Use familiar or repeated idioms", coming up next). Are they separate things or do they go together? If they go together, make them go together by creating a pair.

point = [ 3, 1 ]

Most programmers have probably seen code that suffers from a lack of data structures, or at least reinvents the notion of a collection:

    x1 = 4
    x2 = 6
    x3 = 9
    x4 = 5

    first_name = 'Fred'
    last_name  = 'Flintstone'
    city       = 'Bedrock'
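One way to give those loose variables a connection, sketched in Perl. It assumes the four numbers are readings of the same kind and the three strings describe one person; the names @readings and %person are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical grouping: the four x values become one list ...
my @readings = ( 4, 6, 9, 5 );

# ... and the name and city become one record.
my %person = (
    first_name => 'Fred',
    last_name  => 'Flintstone',
    city       => 'Bedrock',
);

# Related things now travel together and can be processed as a unit.
my $total = 0;
$total += $_ for @readings;

print "$person{first_name} $person{last_name} of $person{city}: ",
      scalar(@readings), " readings, total $total\n";
```

Adding a fifth reading or a postal code is now a change to one data structure rather than yet another free-floating variable.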

Use familiar or repeated idioms

I don't really like the variable names the DBI documentation uses, but people know what $dbh and $sth are. If they read the DBI docs, for whatever reason, they'll know that the $sth in my code is the same sort of thing as the $sth in the documentation. Not only is the connection clear because they have the same name, but the reader doesn't have to remember an internal mapping of variable names and what they are. And, the network effect comes into play when everyone follows the DBI example. Look at just about any code, if DBI is involved $sth will probably mean something to do with DBI even if I don't see the code around it. This isn't cargo culting: you still need to understand everything.

Interesting resources

I ran across a number of interesting links, some only indirectly related to readability.
  1. Fun with Dead Languages
  2. Reading, Writing, and Code
  3. Using Regions to Improve Code Readability
  4. "Making Wrong Code Look Wrong"
  5. "Perlish Coding Style"
--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review

Re: What is code readability?
by GrandFather (Cardinal) on Jan 02, 2007 at 20:40 UTC

    My touchstone for assisting in deciding what is readable in code is "how (sensibly) close to prose is it?". My justification is that what most of us read most, and are therefore best at parsing, is prose. Applying similar criteria for use of white space and "flow" surely has some benefit.

    On the other hand a good friend of mine says that code is more like mathematics and that we should eschew (horizontal) white space wherever possible (although we agree on indentation style).

    Before I learned Perl, K&R indentation was complete anathema to me and I could see no justification for it whatsoever. For Perl, K&R seems to be a natural fit and that is what I have come to use. Although readability isn't generally a function of the language, some languages impose their own natural style, some even force a style.

    In the past when I was more often proof reading other people's code I would fiddle with the formatting - not because that was "the right way", but because it helped me focus on the structure of the code and more easily grok it. For some reason this seemed to upset some authors. I wonder why? ;)

    Some of the OP might be better described as a commentary on "maintainability", rather than "readability" (especially the variables/data structure section), although the two are rather tightly coupled.


    DWIM is Perl's answer to Gödel

      My touchstone for assisting in deciding what is readable in code is "how (sensibly) close to prose is it?"

      Yikes! There's probably way more unreadable prose than unreadable code.

      -derby
      On the other hand a good friend of mine says that code is more like mathematics and that we should eschew (horizontal) white space wherever possible (although we agree on indentation style).

      As a mathematically inclined person, I'm of the school of thought that code is often much like Mathematics too. Still I can't understand the alleged implication that "we should eschew (horizontal) white space", which is generally not the case except perhaps in some specialized areas like Universal Algebra or Formal Languages. But otherwise careful trimming of whitespace does play an important role in mathematical typesetting, which in turn is a refined art. (Made much more accessible by Dr. Knuth's work! - To whom we all owe so much!!)

Re: What is code readability?
by Herkum (Parson) on Jan 02, 2007 at 23:39 UTC

    I think one of the most important concepts to take from programming is replacing abstract code with a function name that describes the work it is supposed to do. For example, I came across this code,

        $fee = ($tfee / 100 + $sfee/100 + 0.0000001) * 100;
        $mfee = 200 if ($fee >= 100);
        $fee = $mfee if $mfee;

    I don't even know what this does or why it is doing it. If they had done something like this, I would have a basic understanding of what they are trying to accomplish.

        $fee = determine_maximum_fee();

        sub determine_maximum_fee {
            $fee = ($tfee / 100 + $sfee/100 + 0.0000001) * 100;
            $mfee = 200 if ($fee >= 100);
            $fee = $mfee if $mfee;

            return $fee;
        }

    To me, the difference between a good developer and a bad developer is being able to break code into functions that explain what the program is supposed to be doing. Something most beginning programmers do not understand.

    Update: Changed sub determine_total_fee to sub determine_maximum_fee(), thanks to the Anonymous Monk for pointing out the error

          $fee = determine_maximum_fee();
          sub determine_total_fee {

      Refactoring error? Is it determine_maximum_fee() or determine_total_fee? And that name is overly verbose.

          $fee = ($tfee / 100 + $sfee/100 + 0.0000001) * 100

      ...what in the world is this doing? If I had to guess, I'd say it is a hack to misuse floating point numbers, instead of using something like Math::Currency, although I'd expect to see some calls to int(). And it diddles with global variable $foo.

          $mfee = 200 if ($fee >= 100);

      ...refers to global variables $mfee and $fee. And $mfee is non-descript. And we should at least have a comment explaining the magic numbers 100 and 200.

          $fee = $mfee if $mfee;

      ...this seems like it is probably a horrible hack to avoid warnings about undef.

          return $fee;
          }

      I'd probably not refactor into a subroutine, instead using a proper data type and something like...

          use constant BREAK_POINT => 100;
          use constant BONUS_FEE   => 200;
          my $total_fee = $tfee+$rfee >= BREAK_POINT ? BONUS_FEE : $tfee+$rfee;

        You do bring up some good points about the code, however the problem is not with the code or its execution but its context. By moving the code into a function it should help declare a context of what is going to happen even if it hides what it is exactly doing.

      I don't see your sentiment (which I agreed with) reflected in your modifications to the example. To my mind, breaking it into understandable functions would end up more like:
          # with this level of modularity almost anything is readable!
          sub determine_maximum_fee {
              #
              # functional description goes here
              #
              $fee = Roundup( Sum( @_ ) );
              $fee < 100 ? $fee : 200;   # all 100+ fees become 200
          }

          sub Roundup {
              #
              # Increase just barely to avoid any rounding down problem
              #
              100 * ( ($_[0]/100) + 0.0000001 );
          }

          sub Sum {
              #
              # needs no description.
              #
              my $result=0;
              $result += shift() || return $result while (1);
          }
      __________________________________________________________________________________

      Free your mind!

Re: What is code readability?
by swampyankee (Parson) on Jan 03, 2007 at 03:39 UTC

    APL? A friend of mine (who was involved in building the engine monitoring software for the Rutan Voyager) described APL as a "write-once" language.

    I don't think coding for readability and coding for maintainability are that much different. In either case, the program has to be written for an audience, and the chances are the audience will be less skilled. This may mean eschewing perfectly valid Perl, such as multiple statements on one line, or leaving in syntactically meaningless white space, or (horror!) putting in use English;, or adding parentheses that are not strictly needed.

    I tend not to be too fussy about indentation (as long as it's reasonably consistent: 2, 4, 3, whatever is fine). I tend to dislike the practice of splitting long lines before operators, e.g.
        $x = $a
            +$b
            +$c;

    with "operator" including both arithmetic and logical operators.

    I tend to keep my copy of Kernighan & Plauger's The Elements of Programming Style near to hand. Despite the examples being mostly in PL/1 and FORTRAN-66 (pdf file), most of the basic concepts are still applicable.

    Probably the most extreme version of "readable code" is Donald Knuth's Literate Programming. I've never tried it (has anybody here tried it?)

    emc

    At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation.

    —Igor Sikorsky, reported in AOPA Pilot magazine February 2003.

      I tend to dislike the practice of splitting long lines before operators,

      Well, it depends on the situation; in your example it is not such a big deal. However, problems occur when lines tend to run long. Example:

          # Long line
          my $absolutely_very_long_value = $value_reference->{$first_record}{$second_key}{super_property} + $value_reference->{$second_record}{$second_key}{super_property};

          # Split Line
          my $absolutely_very_long_value
              = $value_reference->{$first_record }{$second_key}{super_property}
              + $value_reference->{$second_record}{$second_key}{super_property};

      I think the worst example for a long-line is something like this,

          my $print = qq~
          # HTML that goes on for 30+ lines
          # This some more text here.
          text text text text text text text~ if $ins{save_changes};

      I could have strung up the original programmer for this. With this sort of syntax you may not even know you are working on an if statement. You don't see it unless you page down or scroll all the way to the right.

        Well, it depends on the situation; in your example it is not such a big deal. However, problems occur when lines tend to run long.

        I just prefer to split lines after operators.

        In your example, my code would look like this:

            # split line
            my $absolutely_very_long_value =
                $value_reference -> {$first_record}{$second_key}{super_property} +
                $value_reference -> {$second_record}{$second_key}{super_property};

        emc


      I came across the technique while reading Hanson's C Interfaces and Implementations. By the way, the book is a very good read (it abstracts and amplifies the techniques used in the creation of lcc). At least for books it seems quite useful. As far as using it for real in code, I don't know; you would have to teach your editor to put a link to code, and to present whatever is linked in-place in a different font, say. Mean IDEs could even replace the code by a function! ;)

      cheers --stephan
Re: What is code readability?
by BrowserUk (Pope) on Jan 03, 2007 at 08:10 UTC

    For me, the single most important element of style affecting readability is "consistency".

    Whitespace

    When I first encountered the long lamented Abigail-II's code on PM, it looked strange to my eyes. And despite that there are quite a few of his guidelines that I do not follow, and disagree with his reasoning, I always find his code eminently readable. Even when that code is performing the often complex manipulations for which he is famous, the consistency of his code layout makes spending time exploring his code a joy.

    And of all the stylistic elements of his code that make that so, his liberal--too liberal in a few places for my tastes--and consistent use of horizontal whitespace ranks very high on the list of things that make it so readable. In large part, this is the inspiration behind my liberal use of horizontal whitespace. I attempted to code with consistency of layout long before I encountered Perl, but I've modified my coding style since using Perl to incorporate more horizontal whitespace and this has reflected back into my coding in other languages.

    Tokens

    As is demonstrated by this quote

    Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is that frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe.

    from Txet Maglning Glof, Ayobndy?, we don't parse writing, including code, letter by letter, but rather, token by token. So ensuring that the tokens in our code are clearly delineated is (IMO), the greatest single contribution to readability.

    Indentation

    The second area where consistency also applies is in indentation. Whilst I hate the significant whitespace aspect of languages that use it--because it means that the entire function of a piece of code can silently change through the accidental omission or deletion of an invisible character--I hate inconsistent indentation even more. Why anyone would code this

        PP(pp_padhv)
        {
            dSP; dTARGET;
            I32 gimme;

            XPUSHs(TARG);
            if( PL_op->op_private & OPpLVAL_INTRO ) {
                SAVECLEARSV( PAD_SVl( PL_op->op_targ ) );
            }
            if( PL_op->op_flags & OPf_REF ) {
                RETURN;
            }
            else if( LVRET ) {
                if( GIMME == G_SCALAR ) {
                    Perl_croak(aTHX_ "Can't return hash to lvalue scalar context");
                }
                RETURN;
            }
            gimme = GIMME_V;
            if( gimme == G_ARRAY ) {
                RETURNOP( do_kv() );
            }
            else if( gimme == G_SCALAR ) {
                SV* const sv = Perl_hv_scalar( aTHX_ (HV*)TARG );
                SETs( sv );
            }
            RETURN;
        }

    like this,

        PP(pp_padhv)
        {
            dSP; dTARGET;
            I32 gimme;

            XPUSHs(TARG);
            if (PL_op->op_private & OPpLVAL_INTRO)
                SAVECLEARSV(PAD_SVl(PL_op->op_targ));
            if (PL_op->op_flags & OPf_REF)
                RETURN;
            else if (LVRET) {
                if (GIMME == G_SCALAR)
                    Perl_croak(aTHX_ "Can't return hash to lvalue scalar context");
                RETURN;
            }
            gimme = GIMME_V;
            if (gimme == G_ARRAY) {
                RETURNOP(do_kv());
            }
            else if (gimme == G_SCALAR) {
                SV* const sv = Perl_hv_scalar(aTHX_ (HV*)TARG);
                SETs(sv);
            }
            RETURN;
        }

    is so far beyond my understanding that it's not even worth my trying. I tried to think of an appropriate analogy here, but everything I came up with would have offended somebody.

    Historical justifications

    This is also why I eschew many of the common coding practices and style guidelines. Unlike Abigail's, most of them do not come with justifications other than historical precedent. If history was such a great recommendation, we'd still write English in the style of Chaucer!

    Best practice changes over time. And sticking with ancient practices, "because that's how it's always been done", doesn't make sense. When I first started programming, squared coding sheets and manually assigned, widely spaced line numbers were de rigueur.

    When I first wrote code commercially, 64x20(or 23?) green-on-black vdus were just starting to become available. So pouring (or is that pawing) over huge stacks of green&white fanfold listings with a handful of coloured highlighters was a necessary part of my daily life.

    With the advent of bigger, color screens, and syntax highlighting editors, I find that I have rarely printed a piece of code out for the last 10 or more years.

    Things moved on and so did I.

    Justifictions

    Where they do come with justifications, these are often (IMO) wrongly argued. For example, the justifiction for preferring underscore_separated_variable_names to camelCaseVariableNames is that the former makes it easier to parse the individual words--which it probably does to some degree. But this is a wrongly argued justification. The problem is that it makes visually separate tokens of those individual words, which you don't want. As demonstrated above, the human brain/eyes recognises patterns/tokens not characters and words, so breaking singular tokens into multiple, visually separate elements is a bad thing, not a good one.

    By way of simplistic demonstration, how many parameters are there in the following?

    some_function (some_variable,some_other_variable,and_yet_another_variable,and_one_more_for,luck)

    someFunction( someVariable, someOtherVariable, andYetAnotherVariable, andOneMoreFor, luck )

    Did you catch it the first time?

    Huffman encoding

    Another justifiction is that for Huffman encoding of keywords. Huffman encoding does make sense, but I've variously seen this justified on the basis of being quicker to type, or quicker to read, but these miss the point.

    Coders do not type at 60 wpm. And if they do, they produce bad code. I remember a metric from a very long time ago that the average programmer codes around 10-12 lines of code per day. Of course, this isn't just the time it takes to type the lines, it reflects design, debugging, maintenance etc. over the life of the code. But even if you sit down to type in a piece of code, the function of which is clear in your mind and the algorithm for which is well known and a part of your mental lexicon, you still will rarely achieve anything approaching secretarial typing speeds. You will pause for some amount of time to decide what to name each variable. You'll pause to decide whether map or grep or for or redo is appropriate for this particular piece of iteration. Should you use print and interpolated variables or printf and a template?

    Likewise, once you become familiar with keywords and function names, even those in code you just picked up, it will make negligible difference to your parsing speed whether it is for or foreach, or map or applyThisBlockOfCodeToEachElementOfThisList. Once you know what the function/variable/keyword does, you will not parse the spelling of the token. You'll simply recognise it--the token, short or long--as doing whatever it does.

    So that leaves us with the question, what is the real value of Huffman encoding? (IMO), it is twofold:

    • Shorter is easier to remember--in context.

      My justifications for this are:

      • DANGER: KEEP OUT!

        Is more likely to have the desired effect than

        Within the area encircled by this metallic mesh barrier there exist localised high potential gradients that create the possibility for mis-endevour--including but not limited to burning, severe burning, maiming and termination of existence. You are accordingly advised that progressing inside the barrier could be hazardous.

      • We use acronyms in preference to their expansions--once we know what the acronym means.

        TIMTOWTDI!

        Of course, when you encounter an acronym or abbreviation for the first time, it doesn't make sense. But once you are familiar with them, and you are operating in the right context, not using them makes no sense. If you're a fan of Grey's Anatomy, ER or any of the many other medical soaps, then can you imagine the crash team screaming

        He's going into ventricular fibrillation, so would someone be so kind as to get me 30 cubic centiliters of D3, 1,25-dihydroxy-20-epi-Vitamin(*). Oh, and if it's not imposing upon you too much, could you do that as a matter of great priority please.

        Instead of

        He's going into v-fib, get me 30cc of epi, stat.

        *I have no idea if that's the correct expansion of "epi" in this context, but it lent itself to the point I am making and in truth, I don't need to know.

    • Brevity == clarity.

      The more frequently something is used, the shorter it should be, because it takes up less screen space.

      That means that atomic elements of code can more often be coded on a single line. And that allows more discrete steps (preferably all of them) of an overall algorithm to be visible on a single screen. This is a huge, huge aid to understanding, both for the original author and the future maintenance programmer.

    Long variable names

    Another common (IMO mis-)perception is that long, descriptive variable names make for clearer code. This is only ever true for the first few minutes before you know what the variable is used for. After that, once you have internalised the purpose of a variable, they are just tokens. And, provided that their scope is suitably confined, the ability to recognise the token quickly and easily when scanning the code is inversely proportional to its length.

    Contrast

    my @sorted = sort{ $a <=> $b } @names;
    my @list_of_numerically_sorted_names = sort { $first_element_to_be_compared <=> $second_element_to_be_compared } @list_of_unsorted_names;

    Of course, single character variable names are pretty useless if the life of the variable extends much beyond a few lines. But then most variables shouldn't have scopes that extend much beyond a few lines anyway--but that's a different discussion.
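    A small Perl sketch of the scope point, using invented data; the names @names, $n and %length_of are hypothetical. The one-letter names live for a line or two, while the variable that survives past the loop gets a name that still means something when it is used later.

```perl
use strict;
use warnings;

my @names = qw( 10 9 100 1 );

# $a and $b exist only inside the sort block, so one letter is plenty.
my @sorted = sort { $a <=> $b } @names;

# %length_of outlives the loop, so it carries a descriptive name;
# $n is confined to the loop body and stays short.
my %length_of;
for my $n (@sorted) {
    $length_of{$n} = length $n;
}

print "@sorted\n";
```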

    Whilst it is easy to argue that longer variable names allow the maintenance programmer (you, in a month or three's time), to quickly become familiar with the purpose of a variable when they first dive into a piece of code, (IMO) that is false economy. It (along with overly commented code), encourages a practice that I term 'hit & run' or 'guerrilla' maintenance. This is where the programmer receives a description of the problem, makes an assumption about the likely cause, dives into the middle of the code in question, reads a few comments or variable names and makes changes consistent with his earlier assumptions.

    The problem is, descriptive variable names, like descriptive comments, describe what the original programmer thought they were coding. But the reason the maintenance programmer is in there, is because the code doesn't do what the original programmer thought it was doing. Of course, there are other reasons for maintenance than bugs, but I still feel that whenever you sit down to change a piece of code, you should understand what it does, and how it does it, not just what someone thought it should do, before you start making changes.

    One of the ways I get to know a piece of code is to sit down and go through it changing the source-code layout to bring it to my preferred layout. I do this manually. I find that the process of inserting/adjusting whitespace, adjusting the indentation, and sometimes, even changing the variable names to fit my understanding of the code allows me to get a much clearer picture of the code at both the macro and micro levels.

    Of course, this can offend some programmers and doesn't fit well with some source control and maintenance techniques, which means that once I have made my changes, I have to go back to the original sources and re-make them in the style of the original code if I intend to supply a patch for example. That could be seen as a problem, but it also serves as a secondary validation of the changes.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      My take.

      Vertical white space rules!

      some_function (
          if_it_wasnt_for,
          bad_luck,
          I_wouldnt_have,
          any_luck_at_all,
      ) or die "handy place for your error msg\n";

      Short names, long names, I don't care. Just make them meaningful. A few moments spent choosing a good name can save hours of misunderstanding (even your own) code later.

      Code reuse is a good thing. I like variable name reuse too. :-)

      While on the subject of consistency, one should avoid having code that does extra work not directly related to why it was written. It should be like a conversation, with each part focusing upon the subject being discussed.

      Example: BrowserUk wrote this, my wife bought a new car today and I hate my co-workers, excellent article and I enjoyed reading it.

      This is tough to read, but it is a common problem among programmers.

      A more practical example would be a function that mixes SQL with business logic and HTML. Doing all three at the same time makes the code hard to debug and reduces its portability. You cannot:

      • Use the code in a program that does not need HTML.
      • Use the code outside of a database because you only need the business logic.
      • Use the code to just get data because the business logic is invalid in that situation.

      Small and focused code should help to provide consistency.
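
      As a rough sketch of that separation (not anyone's production code), the three concerns could each live in their own small sub. Every name here (fetch_orders, order_total, total_as_html) is made up for illustration, and the "database" is a canned hash so the example runs on its own:

```perl
use strict;
use warnings;

# Hypothetical data layer: a real program would run SQL here (for example
# through DBI); canned rows keep this sketch self-contained.
sub fetch_orders {
    my ($customer_id) = @_;
    my %orders_for = (
        42 => [
            { item => 'widget', price => 5,  qty => 3 },
            { item => 'gadget', price => 20, qty => 1 },
        ],
    );
    return @{ $orders_for{$customer_id} || [] };
}

# Business logic only: knows nothing about databases or HTML.
sub order_total {
    my @orders = @_;
    my $total  = 0;
    $total += $_->{price} * $_->{qty} for @orders;
    return $total;
}

# Presentation only: turns a number into markup.
sub total_as_html {
    my ($total) = @_;
    return sprintf '<p>Total: %.2f</p>', $total;
}

my @orders = fetch_orders(42);
print total_as_html( order_total(@orders) ), "\n";   # prints <p>Total: 35.00</p>
```

      With the pieces apart, a script that only needs the number can call order_total directly, and a report that only needs the rows can stop at fetch_orders.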

      It was pointed out to me by /msg that in my "How many arguments?" example above, I made two changes between the two examples, and it was postulated that only one of these, the additional whitespace, contributed to whatever extra clarity was evident.

      By way of further investigation, in which of the following is the number of parameters clearest to your eyes?

      • someFunction( someVariable, someOtherVariable, andYetAnotherVariable, andOneMoreFor, luck )
      • some_function (some_variable,some_other_variable,and_yet_another_variable,and_one_more_for,luck)
      • someFunction (someVariable,someOtherVariable,andYetAnotherVariable,andOneMoreFor,luck)
      • some_function( some_variable, some_other_variable, and_yet_another_variable, and_one_more_for, luck )

        The last one, definitely. :-)

        1st & 4th, my favourite one BTW, are much easier on the eyes.

        There is one more variation ...

        some_function ( some_variable , some_other_variable , and_yet_another_variable , and_one_more_for , luck )
Re: What is code readability?
by adrianh (Chancellor) on Jan 03, 2007 at 11:23 UTC

    Not so much a comment on what is readable code (I cuddle my elses - so what do I know :-) but a tip on figuring out when it isn't that I've found very useful over the last few years.

    Treat comments as a red flag.

    This doesn't mean "don't comment". However, every time you want to write a comment, think "Is this a sign that the code is unclear? Can I make it more intention-revealing without a comment?"
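
    A small illustration of the idea, assuming a made-up %user record and a hypothetical is_minor helper: the comment in the first version exists only because the condition does not say what it means, and naming the condition lets the comment go.

```perl
use strict;
use warnings;

my %user = ( age => 17 );

# Before: the comment has to explain what the condition means.
# user is too young to sign on their own
if ( $user{age} < 18 ) {
    print "needs a guardian\n";
}

# After: the sub name carries the intention, so no comment is needed.
sub is_minor {
    my ($user) = @_;
    return $user->{age} < 18;
}

print "needs a guardian\n" if is_minor( \%user );
```

    Whether the extra sub is worth it for a one-off test is a judgment call; the point is only that the urge to comment was the prompt to look.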

Re: What is code readability?
by Anonymous Monk on Jan 03, 2007 at 11:34 UTC
    Code readability is the property of a source code to confuse the innocent, and to exalt the guilty :-)
Re: What is code readability?
by Anonymous Monk on Jan 03, 2007 at 12:27 UTC
    Python readable, eh? One thing that currently stops me from further learning is missing some 'end' tag in various block definitions. Ruby has 'end', Perl has '}', but Python has nothing, except new indentation order. That kind of nihilism scares me :-)

      I'm in the group that finds Python less readable than Perl (or Ruby). I find Python's white space rules far too reminiscent of IBM's JCL. Perhaps oddly, I also find COBOL less than readable, mostly after being confused by nested block if's ending on a single period.

      emc

      At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation.

      —Igor Sikorsky, reported in AOPA Pilot magazine February 2003.
      This gets exactly to brian's point. Languages are not inherently more readable than others because of these design choices. I have not used Python, but I absolutely love Haskell, which has a similar layout rule to Python's. To me, knowing that my (or more importantly, someone else's) module won't even compile if a block is indented inconsistently is a great relief. For others, it may not be so significant.

        If only that were true...

        if DEFCON == 1:
            soundAlarm()
        launchMissiles() #oops

Re What is code readability -- Language readability
by starX (Chaplain) on Jan 04, 2007 at 18:46 UTC
    Being a pioneer in using Perl in a computer science department that was grounded heavily in C and Java (with the occasional foray into Lisp or assembler), I was hit with the "inherent sloppiness of Perl" thing a lot. I eventually started insisting on being able to reply to people via email. It went a little something like this.

    Your argument is wrong. My argument Is Right. Perl is as readable as poetry. You're Just Not A Poet.

    Crude, but it usually got the point across. Languages are never inherently readable; the writer of a document creates readability through their own personal writing style. Some people have a hard time understanding poetry, but should that mean English is inherently unreadable? A child might not be able to forge a grammatically accurate sentence; does that mean their language is difficult to understand?

    What it comes down to is this: there are programmers who know how to write code for other people to read, and programmers who don't. You could write complicated code in any language without ever creating a new line because (generally) the compiler doesn't care. Perl just extends that.

    Perl isn't an unreadable language, it just has a very understanding interpreter.

Re: What is code readability?
by rir (Vicar) on Jan 05, 2007 at 04:19 UTC
    My first and persisting question is: How did you get from beauty to readability? Is this by some etymological backflow from beautifiers or pretty-printers and so no change of topic at all? In code and people, I find beauty and readability different things.

    Be well,
    rir

Re: What is code readability?
by aplonis (Pilgrim) on Jan 14, 2007 at 06:29 UTC

    I used to work in a company that programmed entirely in Forth. Yes, it was a long time ago. In Forth there were not any types at all. And zero error checking. Forth tended also to be extremely terse. On account of that, everybody who wrote Forth code very soon developed their own private dialect...which no other Forth programmer anywhere could easily read.

    That situation doubtless contributes to very few people even having heard of Forth today. Yet I do remember a certain liveliness of freedom that lack of restrictions afforded. And I remember enjoying it. Perl's own inbuilt freedoms were a large measure of what lured me to it originally. I shouldn't enjoy seeing them trimmed too severely back. That would be saddening, I think.

    I did have a rather short affair with Perl Critic. It did not much inspire me to run out and buy the book from which it cites most of its rules. At first I tried to take all its advice. And the results to me looked hideous. Having started out in Forth, I find Perl's flexibility, broad as it is, refreshingly restrained. But I shouldn't want it any tighter. I just do not see the point of having to do "if (bar) {foo}" always in favor of "foo if bar" for no better reason than to prevent C++ folks from being shocked by it.

    My current fling is with a new book entitled Higher Order Perl which promotes the idea of functional programming in Perl. The first two chapters alone are worth the price of the book. The section on using dispatch tables has done more to make my code more readable than Perl Critic was able to do, I rather tend to think.
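
    For anyone who has not met the technique, here is a minimal sketch of a dispatch table. It is not taken from the book; the command names and the run_command helper are invented for illustration. The hash replaces what would otherwise be a growing if/elsif chain:

```perl
use strict;
use warnings;

# Hypothetical command handlers, keyed by command name.
my %dispatch = (
    add      => sub { $_[0] + $_[1] },
    subtract => sub { $_[0] - $_[1] },
    multiply => sub { $_[0] * $_[1] },
);

# Look the handler up by name and call it; unknown names fail loudly.
sub run_command {
    my ( $name, @args ) = @_;
    my $handler = $dispatch{$name}
        or die "Unknown command '$name'\n";
    return $handler->(@args);
}

print run_command( 'add',      2, 3 ), "\n";   # prints 5
print run_command( 'multiply', 4, 5 ), "\n";   # prints 20
```

    Adding a command then means adding one line to the hash, and the lookup code never changes.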

Re: What is code readability?
by Anonymous Monk on Apr 08, 2008 at 05:36 UTC
    I got into an argument with my friend about the readability of a simple boolean method. I think readability counts for a lot. I argue here for using good, simple, no-mental-parsing-required method names.

Node Type: perlmeditation [id://592616]
Approved by jettero
Front-paged by liverpole