http://www.perlmonks.org?node_id=426104

scooterm coding tips: The Principle of Parameter Parsimony

This coding tip is intended to demonstrate the Principle of Parameter Parsimony. It basically means this:

You should be as stingy and as sparing as possible when adding new parameters to your scripts, database schemas and application designs. The goal is to minimize the number of ways your application can go wrong.

To illustrate this principle, here are two straightforward examples:

Example1: Assign multiple scalars the same value

This example deals with the simple scenario of assigning the same value to more than one scalar. One node shows an interesting way to do it; another node explains why that interesting way can cause problems. The problem comes from violating the Principle of Parameter Parsimony.

    ### This is nice and compact, but what happens if you
    ### add new variables? You have to remember to keep count.
    ### You are essentially saying the same thing in two
    ### different places, and in two different ways.
    my ($a, $b, $c) = ('TRUE') x 3;

    ### A more parsimonious example
    my $a = my $b = my $c = 'TRUE';

We see from Example1 that we introduce problems when we aren't stingy with the way we arrange parameters in our code. The goal is to *avoid* saying the same thing in more than one place, and in more than one way.

Example2: Superfluous 'Status Flags'

Example2 deals with a more 'real world' scenario. Consider the following data structure that is based on a user database for a popular website.

    my $User = {};
    $User->{never_used_this_site_before} = '1';
    $User->{fname}    = '';
    $User->{lname}    = '';
    $User->{password} = '';
    $User->{age}      = '0';

The basic strategy behind this data structure is to automatically consider a user 'new' if she has never used the site before. New users start with blank information. Afterwards, the never_used_this_site_before flag is set to '0' when the user first signs on to set up an account on the "New User" page.

Seems harmless enough, right? Wrong! This is another example of non-parsimonious parameterization!

The problem is this. What happens if the user starts to sign on under the "New User" page, fills in the fname field, and then leaves for lunch with the rest of the information still blank? Suppose there is a session timeout. The never_used_this_site_before flag has been set to '0', but the user still has not filled in all the required information!

Based on this scenario, when the user comes back to log on to the site, it will tell her the password is blank and assume she is a new user who needs to create an account. When she goes to create an account, however, it will tell her that an account with that name *already exists* because fname and never_used_this_site_before have been initialized.

A more parsimonious way to do what we want is to *get rid* of the never_used_this_site_before flag, and simply evaluate whether all required fields are filled in (non-blank) before determining whether to send the user to the "Create New User" page or the "Edit Account Info" page.
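A minimal sketch of that check might look like the following. The sub name is_new_user and the @REQUIRED list are hypothetical and chosen only for illustration; the field names are taken from the $User structure above.

```perl
use strict;
use warnings;

# Hypothetical list of required fields, based on the $User structure above.
my @REQUIRED = qw(fname lname password);

# A user counts as 'new' until every required field is non-blank.
# Nothing is stored; the answer is derived from the data each time.
sub is_new_user {
    my ($user) = @_;
    for my $field (@REQUIRED) {
        my $value = $user->{$field};
        return 1 if !defined $value || $value eq '';
    }
    return 0;
}

my $half_done = { fname => 'Ann', lname => '',    password => '',       age => '0'  };
my $complete  = { fname => 'Ann', lname => 'Lee', password => 's3cret', age => '30' };

print is_new_user($half_done) ? "Create New User\n" : "Edit Account Info\n";
print is_new_user($complete)  ? "Create New User\n" : "Edit Account Info\n";
```

Because the decision is recomputed from the fields themselves, there is no separate flag that can drift out of step with them.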

NOTE: Obviously there are a lot of assumptions and serious design issues painted in this example, but it is included here because this *specific* case has happened often enough to motivate this point on parsimony. Numerous other issues beyond the scope of this note are excluded.

In Summary: Be Stingy with your Parameters and Variables

Unlike with humans, where it is good to paraphrase and say the same thing in multiple variations so as to get your point across, this practice of redundancy is not as generously rewarded in the programming realm. Keep the parameters to a minimum, say things as parsimoniously as possible, and you will minimize the number of ways things can go wrong in your code.

Replies are listed 'Best First'.
Re: Basic Coding Tips: Parsimonious Parameterization
by TimToady (Parson) on Jan 29, 2005 at 20:54 UTC
    A higher-level view of this is that the programmer must be aware of when a transaction is happening, and avoid installing any "holes" that allow the transaction to end up partially committed. The analytical/reductionist mindset of "break it down into steps" must be balanced by the synthetic/holistic flipside that says "this whole thing is just one step". Transactions are not just something that database programmers have to worry about. I think training in transactional thinking tends to fall through the paradigmatic cracks because transactions can't be classified as functions or events or objects. If anything, a transaction is most closely related to the logic programming paradigm, which is undertaught. A transaction can be viewed as a sort of hypothesis that can either be proven or disproven. A partially proven hypothesis is of little use.
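One way to picture the "just one step" view, as a sketch only: stage the changes on a copy and hand the copy back only if it ends up complete. The sub name commit_signup and the @REQUIRED list are hypothetical, and a real application would more likely lean on its database's transaction support.

```perl
use strict;
use warnings;

# Hypothetical required fields for a complete account.
my @REQUIRED = qw(fname lname password);

# Apply all changes or none: work on a copy of the record and
# return the copy only if every required field is filled in.
# Otherwise return the original record untouched.
sub commit_signup {
    my ($user, %changes) = @_;
    my %candidate = (%$user, %changes);
    for my $field (@REQUIRED) {
        my $value = $candidate{$field};
        return $user if !defined $value || $value eq '';
    }
    return \%candidate;
}

my $stored = { fname => '', lname => '', password => '' };

# Only one field supplied: the stored record comes back unchanged.
my $after_partial = commit_signup($stored, fname => 'Ann');
print $after_partial->{fname} eq '' ? "nothing committed\n" : "committed\n";

# All fields supplied: the new record is returned as a whole.
my $after_full = commit_signup($stored,
    fname => 'Ann', lname => 'Lee', password => 's3cret');
print $after_full->{fname} eq '' ? "nothing committed\n" : "committed\n";
```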
Re: Basic Coding Tips: Parsimonious Parameterization
by brian_d_foy (Abbot) on Jan 29, 2005 at 03:30 UTC

    I call this the "Use what you already know" concept. I once saw in a code review a bit of code from someone who didn't trust arrays to know how many elements were in them. It looked a lot like:

    my @array = ();
    for( my $i = 0; $i < 10; $i++) {
        $count++;
        push @array, $i;
    }
    print "There are $count elements\n";

    I've seen the same thing for hashes too, since some people don't read the perlfunc entry for keys().
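For comparison, here is a small sketch of asking the array and the hash directly. The variable names and sample data are made up for illustration.

```perl
use strict;
use warnings;

my @array = (0 .. 9);
my %hash  = (apple => 1, pear => 2, plum => 3);

# An array in scalar context yields its element count.
my $array_count = scalar @array;

# keys() in scalar context yields the number of keys in the hash.
my $hash_count = scalar keys %hash;

print "There are $array_count elements\n";
print "There are $hash_count keys\n";
```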

    I'm still waiting to see some Perl code on The Daily WTF?

    --
    brian d foy <bdfoy@cpan.org>
      Well, if you want a "WTF", I found this gem on cpan a while ago:
      #merges the elements of two or more arrays together so that the values of one
      # are appended to the end of the previous one
      sub array_merge{
          my(@array1, @array2) = @_;
          foreach my $element (@array2){
              @array1 = (@array1, $element);
          }#foreach
          return @array1;
      }#array_merge

        I have to admit - he couldn't do it much faster than that and still have a subroutine...

        (Anyone else notice that @array2 will always be empty since @array1 will get all of @_? Thus, the foreach will always be skipped!)
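For what it's worth, a sketch of how the merge could be made to work: pass the arrays as references so they stay distinct, or skip the subroutine and let list flattening do the job. This is one possible rewrite, not the CPAN author's code.

```perl
use strict;
use warnings;

# List assignment is greedy: the first array on the left takes everything,
# so the arrays have to arrive as references to stay distinct.
sub array_merge {
    my @array_refs = @_;
    return map { @$_ } @array_refs;
}

my @first  = (1, 2);
my @second = (3, 4);

my @merged = array_merge(\@first, \@second);
print "@merged\n";

# Without a subroutine, plain list flattening does the same thing.
my @also_merged = (@first, @second);
print "@also_merged\n";
```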

      There have been a number of Perl submissions there, though not as many as one might expect. Check out The Guy Who Invented Arrays for an example. I guess VB has captured the majority of the crop of casual programmers who are prone to pretzel logic, where ten years ago a much larger percentage of those would be using Perl.

      Makeshifts last the longest.

        First, thanks for the link, that's an interesting site. :-) But I have to quibble a little with this particular example. Obviously the code posted in that link appears hokey. Not using pack/substr or vec for the index string strikes me as a little odd, for instance. But the overall strategy represented there is not IMO that crazy. Consider that if you have large numbers of strings, such a strategy can greatly reduce the overhead of storing them in perl. Each SV in perl takes up a certain amount of space, for the sake of the discussion let's say 20 bytes, versus the 2 - 8 required by a representation akin to the one mentioned. So if we are storing large numbers of strings this overhead can be quite burdensome. Also, by using such an approach you can grab a large chunk of memory at a single go instead of doing a large number of small allocations. (It's interesting (if a little obvious) to consider that if you have 100k null-terminated strings you are using 100k simply to store the end-of-string point :-)

        Anyway, I didn't think such comments were worth posting on the site you linked to, but I do think it's worth mentioning here.

        ---
        demerphq

      No Perl? Oh but there is. Enjoy the hilarity as we observe a cunning ploy to avoid using length :-)

      -- vek --

      Seen recently in a production script distributed around the company I work for-

      my $var=1;
      while ($var == 1){
          # do a bunch of stuff and never touch $var
          # exit if (some condition);
      }

      while(1) must have been too difficult :)

      --
      Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

        Or, depending on structure, even

        do {
            # ...
        } until $condition;

        Makeshifts last the longest.

Re: Basic Coding Tips: Parsimonious Parameterization
by stvn (Monsignor) on Jan 28, 2005 at 21:58 UTC
    Very nice.
    Superfluous 'Status Flags'

    I cannot tell you how many times I have encountered this issue. And not only does it complicate things from the user side (as you described), I have found it many times makes code totally unreadable. Your example shows only one flag, but I have dealt with code and database tables which had 5 or more flags. The justification is usually that it "speeds things up", but this is a classic case of premature optimization.

    -stvn
      I'm not sure anybody will read this, since I'm a bit late. However, the term 'superfluous' status flags is a little bit nondescript. For instance, I find the status flags in Apache::Session to not be superfluous, but then again I'm sure they could be replaced... how do you define superfluous in this case? Are they status flags that lead to mistakes in code logic? Aren't all mistakes in code logic a problem? Which 'superfluous status flags' aren't simply mistakes in code logic, rather than failures of "Parsimonious Parameterization"?

        Superfluous status flags are those which can readily be inferred from the rest of an object's state. They denormalize the data structure (in relational database terms) and the additional code required to update them in sync with the actual data violates the Once And Only Once principle.

        Makeshifts last the longest.

Re: Basic Coding Tips: Parsimonious Parameterization
by Aristotle (Chancellor) on Jan 28, 2005 at 23:39 UTC

    Agreed. This is a variation on what's known as the “Once And Only Once” or “Don't Repeat Yourself” principle.

    Makeshifts last the longest.