http://www.perlmonks.org?node_id=976888


in reply to Design elegance : How to best design this simple program ?

I think you are overthinking the problem. Don't just go and design a whole program before you have written a few simple tests to... well, test out your understanding of the problem.

Then, expand on what you have learned.

I know, that's not how it is taught in school. But so far (20+ years of software development), this approach has served me quite well. Of course, everyone has his/her own methods and processes for writing a program (no matter what company policy says).

Let's try it with a simple example. Maybe I can explain how I usually go about writing a program like this. I won't go too far into it, just the first few steps along the way.

Ok, say you have some CSV files with which you manage your family's central piggy bank (see note below). For every family member, you keep a separate file in which you record each deposit and withdrawal. Overdrawing is possible as long as there is money in the pig. You want to find out two things: first, the amount of money each family member has (or owes you), and second, the total amount of money left in the bank ("the Central PIG Money Storage Inc. (non-profit)").

There are four entries in the piggybank directory (three data files, plus test.csv, which is explained below):

    piggybank/brother.csv
    piggybank/sister.csv
    piggybank/mother.csv
    piggybank/test.csv

brother.csv reads:

    deposit;1
    beer;-30
    beer;-20
    won bet;200

sister.csv reads:

    deposit;1000
    buy coffee;-20
    buy lunch;-30
    new shoes;-500
    deposit;20

mother.csv reads:

    deposit;50
    deposit;30
    deposit;300
    small car accident;-550

test.csv is a directory that your brother made to break your program ;-)

The first step is, of course, finding the filenames and making sure they are in fact files (users can be accidentally-on-purpose very creative about these things). Just print out what we find and generate warnings about non-files:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # get a list of all CSV files in the piggybank directory
    my @fnames = glob('piggybank/*.csv');

    foreach my $fname (@fnames) {
        if(!-f $fname) {
            print STDERR "$fname is not a file!\n";
            next;
        }
        print "Found data file $fname\n";
    }

Ok, that works. For this example, we won't bother using a "real" parser module. You should for your program, but it would make this example too complex. As the next step, we just want to output the content of each file, so we add a subroutine and call it for every file. Here's the modified code:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # get a list of all CSV files in the piggybank directory
    my @fnames = glob('piggybank/*.csv');

    foreach my $fname (@fnames) {
        if(!-f $fname) {
            print STDERR "$fname is not a file!\n";
            next;
        }
        readAccount($fname);
    }

    sub readAccount {
        my ($fname) = @_;

        open(my $fh, "<", $fname) or die($!);
        foreach my $line (<$fh>) {
            chomp $line;
            print $fname, ': ', $line, "\n";
        }
        close $fh;
    }

The next part is doing the sum for each file and printing the result. We know that every valid line is in the format

sometext;value
where withdrawals are always negative numbers and deposits positive ones. So we keep it very simple: match the line with a regular expression and take everything after the semicolon. (A bit error prone, but enough for this example.) Treat that scalar as a number and just add it to the account's balance.
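To get a feeling for where that pattern is "a bit error prone", here is a small throwaway sketch that runs it against a few lines. The helper value_of() and the last three sample lines are made up for illustration; they are not part of the piggy bank script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of bankman.pl): returns whatever the
# pattern captures after the semicolon, or undef if the line does not match.
sub value_of {
    my ($line) = @_;
    return $line =~ /(.+)\;(.+)/ ? $2 : undef;
}

# The first three lines come from the sample files, the rest are invented
# to poke at the weak spots of the pattern.
my @samples = (
    'deposit;1',
    'beer;-30',
    'won bet;200',
    'no semicolon here',   # does not match, so it would be skipped
    'a;b;5',               # greedy match: only the text after the LAST semicolon
    'lunch;ten',           # matches, but the captured text is not a number
);

foreach my $line (@samples) {
    my $value = value_of($line);
    printf "%-22s => %s\n", "'$line'",
        defined $value ? "'$value'" : 'no match (skipped)';
}
```

Lines without a semicolon are silently ignored, and a line with several semicolons only contributes what follows the last one. A non-numeric value like 'ten' still matches; adding it to the balance counts as 0 and, with warnings enabled, makes perl complain about a non-numeric argument.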

Ok, here we go, we only have to modify readAccount() for this.

    sub readAccount {
        my ($fname) = @_;

        my $balance = 0;

        open(my $fh, "<", $fname) or die($!);
        foreach my $line (<$fh>) {
            chomp $line;
            if($line =~ /(.+)\;(.+)/) {
                $balance += $2;
            }
        }
        close $fh;

        print "$fname balance: $balance\n";
    }

Now we're nearly there. All that's left to do is the total balance of our piggy bank. We already have the balance for each individual account. We just have to modify readAccount() to return it, sum it all up in the main loop, and then print it out.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    # get a list of all CSV files in the piggybank directory
    my @fnames = glob('piggybank/*.csv');

    my $total = 0;

    foreach my $fname (@fnames) {
        if(!-f $fname) {
            print STDERR "$fname is not a file!\n";
            next;
        }
        $total += readAccount($fname);
    }

    print "Money left in the piggybank: $total\n";

    sub readAccount {
        my ($fname) = @_;

        my $balance = 0;

        open(my $fh, "<", $fname) or die($!);
        foreach my $line (<$fh>) {
            chomp $line;
            if($line =~ /(.+)\;(.+)/) {
                $balance += $2;
            }
        }
        close $fh;

        print "$fname balance: $balance\n";
        return $balance;
    }

This is the final output of our script (I called it bankman.pl):

    piggybank/brother.csv balance: 151
    piggybank/test.csv is not a file!
    piggybank/mother.csv balance: -170
    piggybank/sister.csv balance: 470
    Money left in the piggybank: 451

No need to write complicated designs (which won't work out exactly as planned most of the time anyway). Making an "overview" design sketch for bigger projects is a good thing. But don't get bogged down in the details.

Since you learn most about the problem at hand by actually solving it hands-on, the best time to draw out the theoretical design for the program is after you have written it. That's why many experienced coders write a quick and dirty proof-of-concept (and maybe some test cases) first, then tackle designing an elegant, optimized solution...
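As for the "maybe some test cases" part, a minimal sketch could look like the following. It assumes the summing logic gets pulled out into a helper that works on lines in memory; the name balance_of() is made up for this sketch and does not appear in bankman.pl. The data is the content of the three sample files from above, and Test::More ships with perl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical helper for this sketch: sums the values of "sometext;value"
# lines held in memory, using the same pattern as readAccount().
sub balance_of {
    my (@lines) = @_;
    my $balance = 0;
    foreach my $line (@lines) {
        if($line =~ /(.+)\;(.+)/) {
            $balance += $2;
        }
    }
    return $balance;
}

# Contents of the three sample files from this post
my %accounts = (
    brother => ['deposit;1', 'beer;-30', 'beer;-20', 'won bet;200'],
    sister  => ['deposit;1000', 'buy coffee;-20', 'buy lunch;-30',
                'new shoes;-500', 'deposit;20'],
    mother  => ['deposit;50', 'deposit;30', 'deposit;300',
                'small car accident;-550'],
);

is(balance_of(@{ $accounts{brother} }), 151,  'brother balance');
is(balance_of(@{ $accounts{sister} }),  470,  'sister balance');
is(balance_of(@{ $accounts{mother} }),  -170, 'mother balance');

# The piggy bank total is simply the sum of all account balances
my $total = 0;
$total += balance_of(@$_) for values %accounts;
is($total, 451, 'money left in the piggybank');
```

Keeping the file handling and the arithmetic apart like this is one possible choice, not the only one; it just makes the arithmetic testable without touching the disk.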

...that is, if it's still required. From my personal experience, the more often you do this, the more often you will come up with a quick-hacked proof of concept that is good and fast enough to also be the final solution.

Another thing you'll find: Most of the time you don't actually have to optimize and squeeze every bit of performance out of your code. One thousand million bytes seems like a huge amount of data to a human. To a computer capable of doing more than 3 billion operations every second, not so much.

While I was running the example program (which took about a second, including loading the perl binary, compiling/parsing/running the program, accessing the disks and printing out the results to a graphical terminal), I was also playing a YouTube video with audio through my USB headphones, which shuffles much more data around in memory than parsing a few megabytes of data files. My CPU was barely used at all.
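If you want to check this on your own machine instead of taking my word for it, a rough timing sketch like the one below will do. The workload (one million generated lines held in memory) is made up for illustration, and the number you get depends entirely on your hardware and perl version. Time::HiRes ships with perl.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Made-up workload: one million "sometext;value" lines held in memory.
# Odd entries deposit 5, even entries withdraw 3.
my @lines = map { "entry $_;" . ($_ % 2 ? 5 : -3) } 1 .. 1_000_000;

my $start   = time;
my $balance = 0;
foreach my $line (@lines) {
    # same pattern as in readAccount()
    if($line =~ /(.+)\;(.+)/) {
        $balance += $2;
    }
}
my $elapsed = time - $start;

printf "Summed %d lines (balance %d) in %.3f seconds\n",
    scalar(@lines), $balance, $elapsed;
```

This only measures the parsing and summing loop, not disk access or program startup, so treat the result as a ballpark figure.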

So, to conclude: Just try solving your problem step-by-step. Once you have found a working solution, you can still decide if it's worth a rewrite with a clean, simple and elegant design. Or if you want to work on the next, even more exciting problem that needs solving.

Note to self: Solving problems can be addictive. Remember to leave some for friends and coworkers.


Note on piggy banks (from Wikipedia): Piggy bank (sometimes penny bank or money box) is the traditional name of a coin accumulation and storage receptacle. Sorry, Wikipedia editors, could you describe that even less clearly?

"You have reached the Monastery. All our helpdesk monks are busy at the moment. Please press "1" to instantly donate 10 currency units for a good cause or press "2" to hang up. Or you can dial "12" to get connected directly to second level support."