Many programmers (myself included) have a habit of presenting programs in a very linear style: the program reads straight through from top to bottom. I call this linear programming. It is very easy to do, but programs should be written for ease of reading, not ease of writing. Consider the following example.
This program is not bad. The variable names are generally well chosen and properly scoped, and repeated non-changing elements are turned into constants. I even pulled out a "magic number" (the number of fields per record) and made that into a constant.
Further, note that there are no comments in the program. You might think that this program does not need comments as it's only 32 lines long. Now, ask yourself what this program does.
This is a very simple program. Your boss has come to you and told you that department expenses have been lumped into a CSV file and he wants a report summing total expenses per department. The above program is a quick, well-written tool that does exactly what your boss wants. However, it uses linear programming, and if it is going to be used repeatedly, it is going to have to be maintained. This is a problem. Consider this alternative way to write this program:
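A minimal sketch of what such a broken-out version could look like, not the original listing. The record layout (department, description, amount), the sample data, the name total_by_department, and the choice to pass filehandles rather than file names are all assumptions made for illustration. Only get_expense_data, write_report, and FIELDS_PER_RECORD are names taken from the discussion.

```perl
use strict;
use warnings;

use constant FIELDS_PER_RECORD => 3;    # assumed layout: department,description,amount

# Sample input held in a string so the sketch runs without an external file.
my $csv = "Sales,Travel,120.50\nIT,Hardware,300.00\nSales,Lunch,45.25\n";
open my $in, '<', \$csv or die "Cannot read sample data: $!";

my @records = get_expense_data($in);
my %totals  = total_by_department(@records);
write_report( \*STDOUT, \%totals );

# Read every well-formed record from a filehandle.
sub get_expense_data {
    my ($fh) = @_;
    my @records;
    while ( my $line = <$fh> ) {
        chomp $line;
        my @fields = split /,/, $line;    # naive split: no quoted commas
        next unless @fields == FIELDS_PER_RECORD;
        push @records, \@fields;
    }
    return @records;
}

# Sum the amount field for each department (hypothetical helper name).
sub total_by_department {
    my @records = @_;
    my %totals;
    for my $record (@records) {
        my ( $department, undef, $amount ) = @$record;
        $totals{$department} += $amount;
    }
    return %totals;
}

# Print one "name amount" line per key, sorted by name.
sub write_report {
    my ( $fh, $amounts ) = @_;
    for my $name ( sort keys %$amounts ) {
        printf {$fh} "%-20s %10.2f\n", $name, $amounts->{$name};
    }
    return;
}
```

Passing a filehandle keeps write_report indifferent to where the report ends up, which is one way to make it generic.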
Ugh! What the heck have I done? I took a simple, straightforward program, added three subroutines and several lines of code. Why the heck would I do something like that? What happened to laziness?
There are several benefits to breaking a program out like this. First of all, each function does precisely one thing and does it well. There is no ambiguity. If I realize that I am going to be doing a lot of work with multiple files like this, I could probably take my &get_expense_data and &write_report subroutines and stuff them into a module with absolutely no change (except perhaps the names). Further, rather than reading through the entire program to figure out what's going on, the maintenance programmer only needs to look at three lines of code:
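Both points can be sketched together, under assumptions: the two reusable subroutines sit in a hypothetical package called ExpenseTools (kept in the same file here so the example runs as-is), and the top level shrinks to three calls. The package name, the helper total_by_department, the record layout, and the sample data are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical module; in practice this would live in its own .pm file.
package ExpenseTools;

# Parse "department,description,amount" lines from a filehandle.
sub get_expense_data {
    my ($fh) = @_;
    my @records;
    while ( my $line = <$fh> ) {
        chomp $line;
        push @records, [ split /,/, $line ];
    }
    return @records;
}

# Print sorted "name amount" lines to a filehandle.
sub write_report {
    my ( $fh, $amounts ) = @_;
    printf {$fh} "%-20s %10.2f\n", $_, $amounts->{$_} for sort keys %$amounts;
    return;
}

package main;

# Sum the third field of each record, keyed by the first.
sub total_by_department {
    my %totals;
    $totals{ $_->[0] } += $_->[2] for @_;
    return %totals;
}

my $csv = "Sales,Travel,120.50\nIT,Hardware,300.00\nSales,Lunch,45.25\n";
open my $in, '<', \$csv or die "Cannot read sample data: $!";

# The three lines a maintainer reads first:
my @records = ExpenseTools::get_expense_data($in);
my %totals  = total_by_department(@records);
ExpenseTools::write_report( \*STDOUT, \%totals );
```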
Remember what I said earlier about the program not having any comments? Well, when you break code up into a series of small, single-purpose functions, replace "magic numbers" with named constants (such as FIELDS_PER_RECORD), and carefully choose your variable names, programs often don't need many comments. You can simply read what you need and skip the rest.
Now suppose you're no longer with the company, and the boss comes in and says to the new programmer, "Expenses are too high. I want you to change this program so that all expenses over $200.00 are not totalled, but reported to me in an error report for personal evaluation." Here's how the program might change:
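One way that change could look, as a sketch rather than the original listing. It assumes the same three-field records as before, introduces a single hypothetical constant, EXPENSE_LIMIT, and sends the error report to STDERR; the "department/description" key used for excessive items is also an assumption, as is total_by_department. It does not try to reproduce the original constants.

```perl
use strict;
use warnings;

use constant FIELDS_PER_RECORD => 3;      # assumed layout: department,description,amount
use constant EXPENSE_LIMIT     => 200;    # hypothetical name for the boss's cutoff

my $csv = "Sales,Travel,120.50\nIT,Hardware,300.00\nSales,Lunch,45.25\n";
open my $in, '<', \$csv or die "Cannot read sample data: $!";

my @records = get_expense_data($in);
my ( $normal, $excessive ) = split_normal_from_excessive(@records);
my %totals = total_by_department(@$normal);
write_report( \*STDOUT, \%totals );      # the usual per-department report
write_report( \*STDERR, $excessive );    # same routine, reused for the error report

# Separate records at or under the limit from those over it.
sub split_normal_from_excessive {
    my @records = @_;
    my ( @normal, %excessive );
    for my $record (@records) {
        my ( $department, $description, $amount ) = @$record;
        if ( $amount > EXPENSE_LIMIT ) {
            $excessive{"$department/$description"} += $amount;
        }
        else {
            push @normal, $record;
        }
    }
    return ( \@normal, \%excessive );
}

# Read every well-formed record from a filehandle.
sub get_expense_data {
    my ($fh) = @_;
    my @records;
    while ( my $line = <$fh> ) {
        chomp $line;
        my @fields = split /,/, $line;    # naive split: no quoted commas
        next unless @fields == FIELDS_PER_RECORD;
        push @records, \@fields;
    }
    return @records;
}

# Sum the amount field for each department (hypothetical helper name).
sub total_by_department {
    my %totals;
    $totals{ $_->[0] } += $_->[2] for @_;
    return %totals;
}

# Print one "name amount" line per key, sorted by name.
sub write_report {
    my ( $fh, $amounts ) = @_;
    printf {$fh} "%-20s %10.2f\n", $_, $amounts->{$_} for sort keys %$amounts;
    return;
}
```

An expense of exactly 200 stays in the normal totals here, since the request was for expenses "over" that amount.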
Note that in this change, we have added three constants and one function, &split_normal_from_excessive. How's that function implemented? Who cares? It's going to be fairly straightforward and, because everything has been modularized, we have less worry that it's going to impact anything else in the program. Further, we still have no comments, but it's easy to read. Also, because &write_report is a generic routine, it has been reused.
Linear programming is not always a bad thing. If you know you're doing a one-time data migration or need a quick data file analysis, it's not that big of a deal. If the program is going to be reused, though, it's better to take the time up front to break it out into a series of small, descriptive subroutines. The result is much easier to read and extend, and the poor maintenance programmer who comes behind you will breathe a sigh of relief.
Update: I think a big point to this node was kind of mentioned in passing: programs grow. I started programming in 1982 and have been programming for a living for about four years. If I had a dollar for every program that didn't experience "feature creep", I probably wouldn't have enough to buy my double tall half-caf flat tepid Irish cream latté :)