Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

Many programmers (myself included), have a habit of presenting programs in a very linear style. The program reads straight through from top to bottom. I call this linear programming. While this is very easy to do, programs should not be written for ease of writing, but ease of reading. Consider the following example. This program is not bad. The variable names are generally chosen well. They are scoped properly, and repeated non-changing elements are turned into constants. I even pulled out a "magic number" (the number of fields per record) and made that into a constant.

Further, note that there are no comments in the program. You might think that this program does not need comments as it's only 32 lines long. Now, ask yourself what this program does.

#!/usr/bin/perl -w use strict; use constant EXPENSE_FILE => 'expense.dat'; use constant EXPENSE_REPORT => 'final_expense.txt'; use constant FIELDS_PER_RECORD => 4; use constant REPORT_HEADER => "Department,Total\n"; open IN_FILE, "<", EXPENSE_FILE or die "Cannot open ".EXPENSE_FILE." for reading: $!"; my @expenses = <IN_FILE>; close IN_FILE; my %deptTotals; foreach my $expense ( @expenses ) { my %record; my @fields = qw/ dept empID amount desc /; @record{ @fields } = split /,/, $expense, FIELDS_PER_RECORD; if ( ! exists $deptTotals{ $record{ dept } } ) { $deptTotals{ $record{ dept } } = 0; } $deptTotals{ $record{ dept } } += $record{ amount }; } open OUT_FILE, ">", EXPENSE_REPORT or die "Cannot open ".EXPENSE_REPORT." for writing: $!"; print OUT_FILE REPORT_HEADER; foreach my $dept ( sort keys %deptTotals ) { print OUT_FILE "$dept,$deptTotals{ $dept }\n"; } close OUT_FILE;

This is a very simple program. Your boss has come to you and told you that department expenses have been lumped in a CSV file and he wants a report summing total expenses per department. The above program is a quick, well-written tool that does exactly what your boss wants. However, it uses linear programming and if this is going to be used repeatedly, it's going to have to be maintained. This is a problem. Consider this alternative way to write this program:

#!/usr/bin/perl -w use strict; use constant EXPENSE_FILE => 'expense.dat'; use constant EXPENSE_REPORT => 'final_expense.txt'; use constant FIELDS_PER_RECORD => 4; use constant REPORT_HEADER => "Department,Total\n"; my $expenses = get_expense_data( EXPENSE_FILE ); my $deptTotals = sum_expenses_by_department($expenses, FIELDS_PER_ +RECORD); write_report( $deptTotals, EXPENSE_REPORT, REPORT_HEADER ); sub get_expense_data { my $file = shift; open IN_FILE, "<", $file or die "Cannot open $file for reading +: $!"; my @expenses = <IN_FILE>; close IN_FILE; return \@expenses; } sub sum_expenses_by_department { my ( $expenses, $fields_per_record ) = @_; my %deptTotals; foreach my $expense ( @$expenses ) { my %record; my @fields = qw/ dept empID amount desc /; @record{ @fields } = split /,/, $expense, $fields_per_reco +rd; if ( ! exists $deptTotals{ $record{ dept } } ) { $deptTotals{ $record{ dept } } = 0; } $deptTotals{ $record{ dept } } += $record{ amount }; } return \%deptTotals; } sub write_report { my ( $report_data, $file, $header ) = @_; open OUT_FILE, ">", $file or die "Cannot open $file for writin +g: $!"; print OUT_FILE $header; foreach ( sort keys %$report_data ) { print OUT_FILE "$dept,$report_data->{ $_ }\n"; } close OUT_FILE; }

Ugh! What the heck have I done? I took a simple, straightforward program, added three subroutines and several lines of code. Why the heck would I do something like that? What happened to laziness?

There are several benefits to breaking a program out like this. First of all, each function does precisely one thing and does it well. There is no ambiguity. If I realize that I am going to be doing a lot of work with multiple files like this, I could probably take my &get_expense_data and &write_report subroutines and stuff them into a module with absolutely no change (except perhaps the names). Further, rather than reading through the entire program to figure out what's going on, the maintenance programmer only needs to look at three lines of code:

my $expenses = get_expense_data( EXPENSE_FILE ); my $deptTotals = sum_expenses_by_department( $expenses, FIELDS_PER +_RECORD ); write_report( $deptTotals, EXPENSE_REPORT, REPORT_HEADER );

Remember what I said earlier about the program not having any comments? Well, when you break code up into a series of small, single-purpose functions, avoid "magic variables" (FIELDS_PER_RECORD), and carefully choose your variable names, programs often don't need many comments. You can simply read what you need and skip the rest.

Time passes...

You're no longer with the company and the boss comes in and says to the new programmer "Expenses are too high. I want you to change this program so that all expenses over $200.00 are not totalled, but reported to me in an error report for personal evaluation. Here's how the program might change:

use constant EXPENSE_FILE => 'expense.dat'; use constant EXPENSE_REPORT => 'final_expense.txt'; use constant FIELDS_PER_RECORD => 4; use constant REPORT_HEADER => "Department,Total\n"; use constant ERROR_REPORT => 'expense_error.txt'; use constant ERROR_HEADER => "Department,Employee,Total,Descr +iption\n"; use constant EXCESSIVE => 200; my $raw_expenses = get_expense_data( EXPENSE_FILE +); my ( $expenses, $largeExpenses ) = split_normal_from_excessive( $raw_expenses, EXCESSIVE ); my $deptTotals = sum_expenses_by_department($expenses, FIELDS_PER_ +RECORD); write_report( $deptTotals, EXPENSE_REPORT, REPORT_HEADER ); write_report( $large_expenses, ERROR_REPORT, ERROR_HEADER );

Note that in this change, we have added three constants and one function, &split_normal_from_excessive. How's that function implemented? Who cares? It's going to be fairly straightforward and, because everything has been modularized, we have less worry that it's going to impact anything else in the program. Further, we still have no comments, but it's easy to read. Also, because &write_report is a generic routine, It's been reused.

Linear programming is not always a bad thing. If you know you're doing a one-time data migration or need a quick data file analysis, it's not that big of a deal. If the program is going to be reused, though, it's better to take the time up front to break it out into a series of small, descriptive subroutines. It's much easier to read and extend and the poor maintenance programmer who comes behind you will breathe a sigh of relief.


Update: I think a big point to this node was kind of mentioned in passing: programs grow. I started programming in 1982 and have been programming for a living for about four years. If I had a dollar for every program that didn't experience "feature creep", I probably wouldn't have enough to buy my double tall half-caf flat tepid Irish cream latté :)

Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.

In reply to Linear programming is bad by Ovid

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others drinking their drinks and smoking their pipes about the Monastery: (3)
    As of 2018-05-24 05:05 GMT
    Find Nodes?
      Voting Booth?