MOPT-02 - substitution and formal systems

In MOPT-01, I said that computers manipulate symbols according to a predefined set of rules. I said that symbols represent human assumptions, and listed some assumptions that pop up fairly often in practice. The central theme of that whole meditation, though, was the concept of representation.

Representation is the first mystery of programming.

The second mystery of programming is the concept of substitution.

meditation #2: Substitution and formal systems:

Representation gives us a way to define symbols. Substitution gives us a way to define relationships between symbols. Of course, relationships are just another kind of assumption, so substitution is really just representation in disguise.

It's a useful disguise, though. It gives us the power to apply one relationship after another, which in turn gives us the 'iterated sequence of operations' model that distinguishes programming from everyday mathematics.

The steps of a mathematical proof are more like the trace of a single program variable than a program per se. You apply an operation to what you have, and write the result as the next line of your proof. The operation itself is usually never stated, and mathematicians learn to infer the operations from the 'before' and 'after' values they see on the page. Programs, OTOH, list the sequence of operations explicitly, and let the computer work out the intermediate values.

Programming is a branch of mathematics, though. The mathemetician's version of an iterated sequence of operations is called a formal system (1)(2).

(1) - For a fun, if surreal, treatment of formal systems, check out Godel, Escher, Bach: An Eternal Golden Braid by Douglas Hofsteader. CORRECTION: the author's name is actually spelled: "Hofstadter".. thanks gjb.. serves me right for being too lazy to go over and look at my bookshelf.

(2) - Formal systems are also known as Post systems, after Emil Post, the mathemetician who discovered them.

A formal system uses a set of symbols called an alphabet (3). A sequence of characters from that alphabet is called a string. A formal system also defines a transformation function, which maps one string to another. You apply transformation rules by replacing part of an existing string with whatever the transformation function maps it to.

(3) - An alphabet is a set of items that support the concept of identity, but not the concepts of order, distance, or origin. In other words, to refer back to MOPT-01, an alphabet is a nominal space.

Most formal systems also define a set of strings you can start with, called axioms. Any string you can derive by applying a sequence of transformations to some axiom is called a theorem. The set of all valid theorems (i.e.: the set of all strings you can derive from some axiom or other) is called the system's language.

At this point, you're probably thinking, "okay, lots of vocabulary.. more or less makes sense.. so what?"

So a language is another kind of information space. Formal systems generate languages, so you can use a formal system to generate whatever kind of information space you want. All you have to do is create an alphabet and a set of translation rules that represent the basic assumptions of the space you want to generate. As an example, let's look at a simple formal system I call the 'CIOP' system:

The system's alphabet consists of four symbols:

    { 'c', 'i', 'o', 'p' }
[download]

hence its name. The transformation function looks like so:

    S   ->  Sp
    op  ->  i
    ip  ->  c
    oc  ->  io
    ic  ->  co
    cS  ->  ioS
[download]

where 'S' means 'any string', and the rule 'S -> Sp' means 'you can replace any occurrence of string S with the string Sp'. In other words, you can append <t>'p' to any string.

The system also has one axiom: the string 'o'.

Now let's implement that system in Perl:

    $S = 'o';
    
    for (1..16) {
        $S .= 'p';                  # S  -> Sp
        @processing = ($S);
    
        $S =~ s/op/i/;              # op -> i
        $S =~ s/ip/c/;              # ip -> c
        push @processing, $S;
    
        while ($S =~ /c/) {
            $S =~ s/oc/io/;         # oc -> io
            $S =~ s/ic/co/;         # ic -> co
            $S =~ s/^c/io/;         # cS -> ioS
            push @processing, $S;
        }
        
        @out = map { sprintf ("%6s", $_) } @processing;
        printf ("%-44s == %6s\n", join (" ->", @out), $S);
    }
[download]

When we run that code, it produces the following output:

        op ->     i                              ==      i
        ip ->     c ->    io                     ==     io
       iop ->    ii                              ==     ii
       iip ->    ic ->   ioo                     ==    ioo
      ioop ->   ioi                              ==    ioi
      ioip ->   ioc ->   iio                     ==    iio
      iiop ->   iii                              ==    iii
      iiip ->   iic ->   ico ->  iooo            ==   iooo
     iooop ->  iooi                              ==   iooi
     iooip ->  iooc ->  ioio                     ==   ioio
     ioiop ->  ioii                              ==   ioii
     ioiip ->  ioic ->  ioco ->  iioo            ==   iioo
     iioop ->  iioi                              ==   iioi
     iioip ->  iioc ->  iiio                     ==   iiio
     iiiop ->  iiii                              ==   iiii
     iiiip ->  iiic ->  iico ->  icoo -> ioooo   ==  ioooo
[download]

And as you can see, we've created a binary counter. The symbols 'i' and 'o' stand for the binary values 'one' and 'zero', respectively. The symbol 'p' stands for 'plus one', and the symbol 'c' stands for 'carry one'.

Now let's shuffle things around a little. Instead of starting with the string 'o' and adding one 'p' per iteration, let's start with the string 'opppppppp' and see what happens. It's a legal theorem, after all, from repeated application of the rule 'S -> Sp':

    $S = 'opppppppp';               # (S  -> Sp) x 8
    
    while ($S =~ /p/) {
        @processing = ($S);
        
        $S =~ s/op/i/;              # op -> i
        push @process, $S;
        $S =~ s/ip/c/;              # ip -> c
        push @processing, $S;
    
        while ($S =~ /c/) {
            $S =~ s/oc/io/;         # oc -> io
            $S =~ s/ic/co/;         # ic -> co
            $S =~ s/^c/io/;         # cS -> ioS
            push @processing, $S;
        }
        
        @out = map { sprintf ("%9s", $_) } @processing;
        print join (" ->", @out), "\n";
    }
[download]

This time we get:

    opppppppp -> ippppppp ->  cpppppp -> iopppppp
     iopppppp ->  iippppp ->   icpppp ->  ioopppp
      ioopppp ->   ioippp ->    iocpp ->    iiopp
        iiopp ->     iiip ->      iic ->      ico ->     iooo
[download]

which once again fits our original meanings for the symbols.

The CIOP system's language is equivalent to the set of all positive integers, and it supports the basic operation of unit addition. If we wanted to simulate unit subtraction, we could flip the transformation rules backwards, and let any string composed of 'i's and 'o's be an axiom.

With more symbols and rules, we could create a decimal or hexadecimal counter. With different rules, we could simulate boolean algebra, or any of the logic circuits that we use to build digital computers. In fact, with an elaborate enough set of symbols and rules, we could build a formal system that does anything a computer can do. Post systems are computationally equivalent to Turing machines and the Lambda calculus, the two most popular theoretical models for computation.

Now since I've just used that word, the term computation refers to a specific sequence of transformations. When you apply that sequence of transformations to a given axiom, you compute a specific theorem. A theorem is computable if you can compute that theorem from one of the defined axioms.. in other words, if there's a sequence of transformations that will turn some axiom into that theorem.

A computation is vaguely like what programmers think of as a function: It takes an axiom as input, it produces a theorem as output, and it defines a sequence of operations that turn the input into the output. The difference is that most computations only work for a specific axiom. The substitutions that turn 'o' into 'ioo' wouldn't turn 'ii' into 'iii', for example. You'll have to wait for the next meditation to learn the mystery that lets us create actual functions.

But that brings us back to the mystery for this meditation: substitution.

Languages are information spaces, and formal systems generate languages. Substitution is the engine that makes formal systems go. Every computation, no matter what flavor of theoretical model you choose, boils down to a process of substituting one group of symbols for another. That's a very useful thing for programmers to know. In fact, it's the foundation of several best practices:

Black-box design: If you get the right output for the right input, it shouldn't matter how you compute it.
Testing: If your code is well organized, you should be able to break it down into a series of 'substitution gates', where a given input is replaced by a given output. Testing makes sure that the substitution happens as planned.
Dataflow diagrams: It's easiest to design programs by reverse-engineering your substitutions. Instead of thinking, "how do I get from this input to that output?" think "what kind of input do I need to produce the output I want?" A dataflow diagram helps you keep track of what inputs you need to perform a given substitution, and where each of those inputs comes from.
Elimination of side-effects: A substitution replaces one string with another, period. If anything else has changed when you finish your computation, you haven't performed one substitution. You've performed several. If you say you're only performing one substitution, but are actually performing several, you're lying to the next person who reads your code. Don't do that.. it's rude.

That's about it for this meditation, but before I go, I need to mention another kind of substitution that crops up very frequently in practice. For want of a better term, I'm going to call it unit-aggregate duality:

The CIOP system offers at least two ways to represent any integer. The second example above demonstrates that, by showing that we can compute the string 'iooo' from the string 'opppppppp'. That kind of equivalence occurs frequently in formal systems.

Now if we wanted to, we could add the mapping:

    8   ->  iooo
[download]

to the system's transformation function. That would give us yet another way to represent the value 'eight', but more importantly, it would give us a way to switch between a single symbol that means 'eight' and a collection of symbols that mean 'eight'.

If we apply the same kind of reasoning to a string like, "hello, world.\n", we can treat it as a unit (i.e.: a string), or as an aggregate of smaller units (i.e.: a sequence of characters).

High-level languages like Perl make it so easy to switch between those two kinds of representation that programmers can forget there's a difference. They think of strings as indeterminate entities which are both units and aggregates until tested, and which then collapse to whatever form works best for a given operation.

Down in the world of programming theory, distinctions like that can make a world of difference. Most of the theoretical structures we'll run into in the future operate in terms of reading or writing a single symbol. We'll also run into a lot of ordered sets, generically called tuples, in the future:

(0,1) is an ordered pair, aka: a 2-tuple
(0,1,2) is an ordered triple, aka: a 3-tuple
(0,1,2,3) is an ordered quadruple, aka: a 4-tuple

If you aren't clear on your unit-aggregate duality, you can lose time wondering whether the 3-tuple '(0,1,2)' is a single symbol, or a collection of three symbols. The correct answer is that it's neither. It's an idea that you can represent any of several different ways. If you end up using more than one representation, you'll need rules that let you substitute one representation for the other.

So.. go meditate on the nature of substitution. Look pieces of code that you could model by adding a bunch of symbols to a string, then replacing the final sequence with a new symbol. Think about unit-aggregate duality, and try to figure out where Perl switches the representations so you don't have to. In the meantime, I'll be writing next week's meditation, which will reveal the third and final mystery of programming theory: the concept of identification.

Comment on MOPT-02 - substitution and formal systems Select or Download Code

Replies are listed 'Best First'.
Re: MOPT-02 - substitution and formal systems by diotalevi (Canon) on Dec 16, 2002 at 22:46 UTC
And past reading of that book (An Eternal Golden Braid) has me immediately objecting to the assignment of meaning to your symbols. I think you may have just shortchanged the audience here by skipping right over the idea that just because a system produces results that have behaviour corresponding to a meaning that the system itself is supposed to mean that. I don't have your experience and wouldn't have been able to write your meditation on my own but I've really benefitted by being able to divorce an ascribed meaning from a system. In fact, that may be an important skill on it's own for use when reverse engineering an application. Part of the task involves observing a system in operation and attempting to assign meaning. The problem is that frequently your observed meaning is violated if you let the system generate more and more theorems. I had a practical application of that last year when reversing a COINSERV database and then the Xerox Metacode format. I had to continually revise my assumptions regarding the meaning of various things as I ran into more data. Eventually I got things right but it was a long, slow process. So it's useful to look at a system and have an "aha!" moment and describe it as a binary counter. It's also useful to step back and remember that it may have a different meaning which only partly intersects with a binary counter.	[reply]
Re2: MOPT-02 - substitution and formal systems by mstone (Deacon) on Dec 16, 2002 at 23:55 UTC
think you may have just shortchanged the audience here by skipping right over the idea that just because a system produces results that have behaviour corresponding to a meaning that the system itself is supposed to mean that. That's a fair point, and it raises the extremely important mathematical concept of abstraction. One of the great leaps of faith in mathematical thought is to forget that the symbols have any meaning whatsoever, and simply observe them as symbols in their own right. That's the boundary between applied mathematics and pure mathematics. In this case, I was trying to emphasize the "hey look, you can make these do something useful" aspect without being too much of a windbag. ;-)	[reply]
Re: MOPT-02 - substitution and formal systems by adrianh (Chancellor) on Dec 16, 2002 at 22:10 UTC
(1) - For a fun, if surreal, treatment of formal systems, check out Godel, Escher, Bach: An Eternal Golden Braid by Douglas Hofsteader. I second that suggestion. One of my all time favourite books. Great fun.	[reply]
What does MOPT stand for? by dmitri (Priest) on Dec 17, 2002 at 15:24 UTC
Several ideas: `$ /usr/games/wtf mopt mopt - Mail-Order Purple Turtle $ /usr/games/wtf mopt mopt - Meditations Of a Perl Tourist` Which one is it?	[reply]
Re: What does MOPT stand for? by IlyaM (Parson) on Dec 17, 2002 at 15:56 UTC
According MOPT-01 - assumptions and spaces node it stands for "Meditations on Programming Theory". -- Ilya Martynov, ilya@iponweb.net CTO IPonWEB (UK) Ltd Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net Personal website - http://martynov.org	[reply]


Perl: the Markov chain saw
	PerlMonks