Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

Many programmers (myself included), have a habit of presenting programs in a very linear style. The program reads straight through from top to bottom. I call this linear programming. While this is very easy to do, programs should not be written for ease of writing, but ease of reading. Consider the following example. This program is not bad. The variable names are generally chosen well. They are scoped properly, and repeated non-changing elements are turned into constants. I even pulled out a "magic number" (the number of fields per record) and made that into a constant.

Further, note that there are no comments in the program. You might think that this program does not need comments as it's only 32 lines long. Now, ask yourself what this program does.

#!/usr/bin/perl -w use strict; use constant EXPENSE_FILE => 'expense.dat'; use constant EXPENSE_REPORT => 'final_expense.txt'; use constant FIELDS_PER_RECORD => 4; use constant REPORT_HEADER => "Department,Total\n"; open IN_FILE, "<", EXPENSE_FILE or die "Cannot open ".EXPENSE_FILE." for reading: $!"; my @expenses = <IN_FILE>; close IN_FILE; my %deptTotals; foreach my $expense ( @expenses ) { my %record; my @fields = qw/ dept empID amount desc /; @record{ @fields } = split /,/, $expense, FIELDS_PER_RECORD; if ( ! exists $deptTotals{ $record{ dept } } ) { $deptTotals{ $record{ dept } } = 0; } $deptTotals{ $record{ dept } } += $record{ amount }; } open OUT_FILE, ">", EXPENSE_REPORT or die "Cannot open ".EXPENSE_REPORT." for writing: $!"; print OUT_FILE REPORT_HEADER; foreach my $dept ( sort keys %deptTotals ) { print OUT_FILE "$dept,$deptTotals{ $dept }\n"; } close OUT_FILE;

This is a very simple program. Your boss has come to you and told you that department expenses have been lumped in a CSV file and he wants a report summing total expenses per department. The above program is a quick, well-written tool that does exactly what your boss wants. However, it uses linear programming and if this is going to be used repeatedly, it's going to have to be maintained. This is a problem. Consider this alternative way to write this program:

#!/usr/bin/perl -w use strict; use constant EXPENSE_FILE => 'expense.dat'; use constant EXPENSE_REPORT => 'final_expense.txt'; use constant FIELDS_PER_RECORD => 4; use constant REPORT_HEADER => "Department,Total\n"; my $expenses = get_expense_data( EXPENSE_FILE ); my $deptTotals = sum_expenses_by_department($expenses, FIELDS_PER_ +RECORD); write_report( $deptTotals, EXPENSE_REPORT, REPORT_HEADER ); sub get_expense_data { my $file = shift; open IN_FILE, "<", $file or die "Cannot open $file for reading +: $!"; my @expenses = <IN_FILE>; close IN_FILE; return \@expenses; } sub sum_expenses_by_department { my ( $expenses, $fields_per_record ) = @_; my %deptTotals; foreach my $expense ( @$expenses ) { my %record; my @fields = qw/ dept empID amount desc /; @record{ @fields } = split /,/, $expense, $fields_per_reco +rd; if ( ! exists $deptTotals{ $record{ dept } } ) { $deptTotals{ $record{ dept } } = 0; } $deptTotals{ $record{ dept } } += $record{ amount }; } return \%deptTotals; } sub write_report { my ( $report_data, $file, $header ) = @_; open OUT_FILE, ">", $file or die "Cannot open $file for writin +g: $!"; print OUT_FILE $header; foreach ( sort keys %$report_data ) { print OUT_FILE "$dept,$report_data->{ $_ }\n"; } close OUT_FILE; }

Ugh! What the heck have I done? I took a simple, straightforward program, added three subroutines and several lines of code. Why the heck would I do something like that? What happened to laziness?

There are several benefits to breaking a program out like this. First of all, each function does precisely one thing and does it well. There is no ambiguity. If I realize that I am going to be doing a lot of work with multiple files like this, I could probably take my &get_expense_data and &write_report subroutines and stuff them into a module with absolutely no change (except perhaps the names). Further, rather than reading through the entire program to figure out what's going on, the maintenance programmer only needs to look at three lines of code:

my $expenses = get_expense_data( EXPENSE_FILE ); my $deptTotals = sum_expenses_by_department( $expenses, FIELDS_PER +_RECORD ); write_report( $deptTotals, EXPENSE_REPORT, REPORT_HEADER );

Remember what I said earlier about the program not having any comments? Well, when you break code up into a series of small, single-purpose functions, avoid "magic variables" (FIELDS_PER_RECORD), and carefully choose your variable names, programs often don't need many comments. You can simply read what you need and skip the rest.

Time passes...

You're no longer with the company and the boss comes in and says to the new programmer "Expenses are too high. I want you to change this program so that all expenses over $200.00 are not totalled, but reported to me in an error report for personal evaluation. Here's how the program might change:

use constant EXPENSE_FILE => 'expense.dat'; use constant EXPENSE_REPORT => 'final_expense.txt'; use constant FIELDS_PER_RECORD => 4; use constant REPORT_HEADER => "Department,Total\n"; use constant ERROR_REPORT => 'expense_error.txt'; use constant ERROR_HEADER => "Department,Employee,Total,Descr +iption\n"; use constant EXCESSIVE => 200; my $raw_expenses = get_expense_data( EXPENSE_FILE +); my ( $expenses, $largeExpenses ) = split_normal_from_excessive( $raw_expenses, EXCESSIVE ); my $deptTotals = sum_expenses_by_department($expenses, FIELDS_PER_ +RECORD); write_report( $deptTotals, EXPENSE_REPORT, REPORT_HEADER ); write_report( $large_expenses, ERROR_REPORT, ERROR_HEADER );

Note that in this change, we have added three constants and one function, &split_normal_from_excessive. How's that function implemented? Who cares? It's going to be fairly straightforward and, because everything has been modularized, we have less worry that it's going to impact anything else in the program. Further, we still have no comments, but it's easy to read. Also, because &write_report is a generic routine, It's been reused.

Linear programming is not always a bad thing. If you know you're doing a one-time data migration or need a quick data file analysis, it's not that big of a deal. If the program is going to be reused, though, it's better to take the time up front to break it out into a series of small, descriptive subroutines. It's much easier to read and extend and the poor maintenance programmer who comes behind you will breathe a sigh of relief.

Cheers,
Ovid

Update: I think a big point to this node was kind of mentioned in passing: programs grow. I started programming in 1982 and have been programming for a living for about four years. If I had a dollar for every program that didn't experience "feature creep", I probably wouldn't have enough to buy my double tall half-caf flat tepid Irish cream latté :)

Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.


In reply to Linear programming is bad by Ovid

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (3)
As of 2024-04-19 21:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found