|laziness, impatience, and hubris|
MOPT-02 - substitution and formal systemsby mstone (Deacon)
|on Dec 16, 2002 at 21:16 UTC||Need Help??|
In MOPT-01, I said that computers manipulate symbols according to a predefined set of rules. I said that symbols represent human assumptions, and listed some assumptions that pop up fairly often in practice. The central theme of that whole meditation, though, was the concept of representation.
Representation is the first mystery of programming.
The second mystery of programming is the concept of substitution.
meditation #2: Substitution and formal systems:
Representation gives us a way to define symbols. Substitution gives us a way to define relationships between symbols. Of course, relationships are just another kind of assumption, so substitution is really just representation in disguise.
It's a useful disguise, though. It gives us the power to apply one relationship after another, which in turn gives us the 'iterated sequence of operations' model that distinguishes programming from everyday mathematics.
The steps of a mathematical proof are more like the trace of a single program variable than a program per se. You apply an operation to what you have, and write the result as the next line of your proof. The operation itself is usually never stated, and mathematicians learn to infer the operations from the 'before' and 'after' values they see on the page. Programs, OTOH, list the sequence of operations explicitly, and let the computer work out the intermediate values.
Programming is a branch of mathematics, though. The mathemetician's version of an iterated sequence of operations is called a formal system (1)(2).
(1) - For a fun, if surreal, treatment of formal systems, check out Godel, Escher, Bach: An Eternal Golden Braid by Douglas Hofsteader. CORRECTION: the author's name is actually spelled: "Hofstadter".. thanks gjb.. serves me right for being too lazy to go over and look at my bookshelf.
(2) - Formal systems are also known as Post systems, after Emil Post, the mathemetician who discovered them.
A formal system uses a set of symbols called an alphabet (3). A sequence of characters from that alphabet is called a string. A formal system also defines a transformation function, which maps one string to another. You apply transformation rules by replacing part of an existing string with whatever the transformation function maps it to.
(3) - An alphabet is a set of items that support the concept of identity, but not the concepts of order, distance, or origin. In other words, to refer back to MOPT-01, an alphabet is a nominal space.
Most formal systems also define a set of strings you can start with, called axioms. Any string you can derive by applying a sequence of transformations to some axiom is called a theorem. The set of all valid theorems (i.e.: the set of all strings you can derive from some axiom or other) is called the system's language.
At this point, you're probably thinking, "okay, lots of vocabulary.. more or less makes sense.. so what?"
So a language is another kind of information space. Formal systems generate languages, so you can use a formal system to generate whatever kind of information space you want. All you have to do is create an alphabet and a set of translation rules that represent the basic assumptions of the space you want to generate. As an example, let's look at a simple formal system I call the 'CIOP' system:
The system's alphabet consists of four symbols:
hence its name. The transformation function looks like so:
where 'S' means 'any string', and the rule 'S -> Sp' means 'you can replace any occurrence of string S with the string Sp'. In other words, you can append <t>'p' to any string.
The system also has one axiom: the string 'o'.
Now let's implement that system in Perl:
When we run that code, it produces the following output:
And as you can see, we've created a binary counter. The symbols 'i' and 'o' stand for the binary values 'one' and 'zero', respectively. The symbol 'p' stands for 'plus one', and the symbol 'c' stands for 'carry one'.
Now let's shuffle things around a little. Instead of starting with the string 'o' and adding one 'p' per iteration, let's start with the string 'opppppppp' and see what happens. It's a legal theorem, after all, from repeated application of the rule 'S -> Sp':
This time we get:
which once again fits our original meanings for the symbols.
The CIOP system's language is equivalent to the set of all positive integers, and it supports the basic operation of unit addition. If we wanted to simulate unit subtraction, we could flip the transformation rules backwards, and let any string composed of 'i's and 'o's be an axiom.
With more symbols and rules, we could create a decimal or hexadecimal counter. With different rules, we could simulate boolean algebra, or any of the logic circuits that we use to build digital computers. In fact, with an elaborate enough set of symbols and rules, we could build a formal system that does anything a computer can do. Post systems are computationally equivalent to Turing machines and the Lambda calculus, the two most popular theoretical models for computation.
Now since I've just used that word, the term computation refers to a specific sequence of transformations. When you apply that sequence of transformations to a given axiom, you compute a specific theorem. A theorem is computable if you can compute that theorem from one of the defined axioms.. in other words, if there's a sequence of transformations that will turn some axiom into that theorem.
A computation is vaguely like what programmers think of as a function: It takes an axiom as input, it produces a theorem as output, and it defines a sequence of operations that turn the input into the output. The difference is that most computations only work for a specific axiom. The substitutions that turn 'o' into 'ioo' wouldn't turn 'ii' into 'iii', for example. You'll have to wait for the next meditation to learn the mystery that lets us create actual functions.
But that brings us back to the mystery for this meditation: substitution.
Languages are information spaces, and formal systems generate languages. Substitution is the engine that makes formal systems go. Every computation, no matter what flavor of theoretical model you choose, boils down to a process of substituting one group of symbols for another. That's a very useful thing for programmers to know. In fact, it's the foundation of several best practices:
That's about it for this meditation, but before I go, I need to mention another kind of substitution that crops up very frequently in practice. For want of a better term, I'm going to call it unit-aggregate duality:
The CIOP system offers at least two ways to represent any integer. The second example above demonstrates that, by showing that we can compute the string 'iooo' from the string 'opppppppp'. That kind of equivalence occurs frequently in formal systems.
Now if we wanted to, we could add the mapping:
to the system's transformation function. That would give us yet another way to represent the value 'eight', but more importantly, it would give us a way to switch between a single symbol that means 'eight' and a collection of symbols that mean 'eight'.
If we apply the same kind of reasoning to a string like, "hello, world.\n", we can treat it as a unit (i.e.: a string), or as an aggregate of smaller units (i.e.: a sequence of characters).
High-level languages like Perl make it so easy to switch between those two kinds of representation that programmers can forget there's a difference. They think of strings as indeterminate entities which are both units and aggregates until tested, and which then collapse to whatever form works best for a given operation.
Down in the world of programming theory, distinctions like that can make a world of difference. Most of the theoretical structures we'll run into in the future operate in terms of reading or writing a single symbol. We'll also run into a lot of ordered sets, generically called tuples, in the future:
If you aren't clear on your unit-aggregate duality, you can lose time wondering whether the 3-tuple '(0,1,2)' is a single symbol, or a collection of three symbols. The correct answer is that it's neither. It's an idea that you can represent any of several different ways. If you end up using more than one representation, you'll need rules that let you substitute one representation for the other.
So.. go meditate on the nature of substitution. Look pieces of code that you could model by adding a bunch of symbols to a string, then replacing the final sequence with a new symbol. Think about unit-aggregate duality, and try to figure out where Perl switches the representations so you don't have to. In the meantime, I'll be writing next week's meditation, which will reveal the third and final mystery of programming theory: the concept of identification.