As you may know if you follow my use.perl journal or my posts to the perl6 language lists, I'm working on a Scheme implementation in Perl.
As it turns out, a basic interpreter for a Scheme-like language is surprisingly easy to implement. The problem is that, in order to be able to call it Scheme, you have to do 'tail call optimization'. "What's that then?" I hear you ask.
Well, here's a simple example. Consider the following piece of tail-recursive Perl:
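A minimal sketch of such a function, assuming an accumulator argument ($acc is a name introduced here for illustration; only tail_factorial is named in the text):

```perl
use strict;
use warnings;

# Tail-recursive factorial: the running product is carried in $acc
# (a hypothetical parameter name), so nothing is left to do after
# the recursive call returns.
sub tail_factorial {
    my ($n, $acc) = @_;
    $acc = 1 unless defined $acc;
    if ($n <= 1) {
        return $acc;
    }
    else {
        # The tail call: its value is returned untouched.
        return tail_factorial($n - 1, $n * $acc);
    }
}

print tail_factorial(5), "\n";    # prints 120
```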
Notice that, in the 'else' branch of the code, we're simply returning the value of a further function call; we don't actually do anything with it in the current function. Wouldn't it be great if the 'if' branch of that conditional could just return directly to the 'original' caller of tail_factorial, instead of having the result passed up a tower of intermediate callers?
Well, that's what tail call optimization is all about. A full implementation would do the optimization even in cases where the tail call isn't a straightforward recursive call (and that's what is needed for a 'proper' Scheme implementation).
Note that you can sort of do tail call optimization in native Perl right now, but it's not exactly optimal. Here's that factorial function again:
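A sketch of how that might look using Perl's goto &NAME form, which leaves the current subroutine and calls the named one with the current @_ instead of nesting another call. That this is the technique meant here is an assumption, and $acc is again a name introduced for illustration:

```perl
use strict;
use warnings;

# Hand-rolled tail call: load the next call's arguments into @_ and
# jump to the sub with goto, so the call stack does not grow.
sub tail_factorial {
    my ($n, $acc) = @_;
    $acc = 1 unless defined $acc;
    return $acc if $n <= 1;
    @_ = ($n - 1, $n * $acc);    # arguments for the 'next' call
    goto &tail_factorial;        # replaces this call rather than nesting
}

print tail_factorial(5), "\n";    # prints 120
```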
But it's scary, not desperately quick, and requires programmer intervention. Ideally this sort of thing should happen automatically. Larry has already said that simple tail recursion in Perl 6 will be optimized, but how do I make my Scheme interpreter do the Right Thing in Perl 5?
Well, frankly, it's scary. I've had to roll my own virtual machine, complete with registers and a stack (there's no call stack, though eventually I'll implement a 'real' chain of partial continuations). The driver loop for this machine looks something like:
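A rough, self-contained sketch of a loop in that style. Only the EVAL_DISPATCH label, the 'expr' register, the 'return' continuation and the evaluate method come from the text; the Machine, Num and Add packages, the 'val' and 'cont' hash keys and the after_left/after_right helpers are all hypothetical, and registers are plain hash keys rather than accessor methods:

```perl
use strict;
use warnings;
no warnings 'exiting';    # redo deliberately jumps out of nested sub calls

package Machine;          # hypothetical name for the virtual machine
sub new { bless { expr => undef, val => undef, cont => 'return', stack => [] }, shift }

sub run {
    my ($self, $expr) = @_;
    @$self{qw(expr cont)} = ($expr, 'return');
    EVAL_DISPATCH: while (1) {
        $self->{expr}->evaluate($self);    # sets 'val', or redoes the loop
        while (1) {                        # hand off to the current continuation
            my $cont = $self->{cont};
            return $self->{val} if $cont eq 'return';
            $cont->($self);
        }
    }
}

package Num;              # a literal: evaluating it just sets 'val'
sub new { my ($class, $n) = @_; bless { n => $n }, $class }
sub evaluate { my ($self, $m) = @_; $m->{val} = $self->{n} }

package Add;              # addition of two subexpressions
sub new { my ($class, $l, $r) = @_; bless { l => $l, r => $r }, $class }
sub evaluate {
    my ($self, $m) = @_;
    # Save the caller's continuation and the pending right operand.
    push @{ $m->{stack} }, $m->{cont}, $self->{r};
    @$m{qw(cont expr)} = (\&after_left, $self->{l});
    redo EVAL_DISPATCH;   # unwind to the driver loop; no nested evaluate
}
sub after_left {          # continuation: left operand is now in 'val'
    my ($m) = @_;
    my $right = pop @{ $m->{stack} };
    push @{ $m->{stack} }, $m->{val};
    @$m{qw(cont expr)} = (\&after_right, $right);
    redo EVAL_DISPATCH;
}
sub after_right {         # continuation: both operands are available
    my ($m) = @_;
    my $left = pop @{ $m->{stack} };
    $m->{cont} = pop @{ $m->{stack} };    # restore caller's continuation
    $m->{val}  = $left + $m->{val};
}

package main;
my $expr = Add->new(Num->new(1), Add->new(Num->new(2), Num->new(3)));
print Machine->new->run($expr), "\n";     # prints 6
```

However deeply the expression nests, Perl's own call stack never gets more than a couple of frames deep here; the pending work lives on the machine's explicit stack instead.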
Notice that labeled while loop. The idea is that, instead of calling $self->expr->evaluate($self) directly (doing a classic OO double dispatch trick), functions set up a continuation (the function that will 'carry on' once things have been evaluated), set 'expr' to an appropriate value and do redo EVAL_DISPATCH, which unwinds the call stack and jumps to the start of the loop again. This avoids recursing desperately deeply into multiple calls to evaluate (my naive, non-tail-recursive implementation could easily get hundreds of levels deep).
The inner while loop is then responsible for handing control off to the current continuation, which will eventually, one hopes, get set to 'return' so the subroutine can exit.
It turns out that this simple loop, combined with a small number of registers and a stack, gives me enough that I can bootstrap Scheme. My current problem is that my simpleminded Scheme parser (implemented in Perl) recurses far deeper than my interpreter, so I probably need to reimplement it on the virtual machine (something I'm going to have to do when I replace the virtual machine with Parrot anyway). AFAICT it shouldn't be too hard to implement the full Scheme call-with-current-continuation on this machine, as well as 'compiled' functions (but compiled functions are some way off for now).
I'm afraid the CPAN release is some way off too. At the very least I need to write a test suite and get a few more of the standard Scheme things implemented and, who knows, maybe write some documentation.
And then there's the Parrot compiler to write. And something to compile to Perl 5 as well. And the port to Perl 6 (I only started the Perl 5 version to make sure that what I'd implemented in Perl 6 would actually work), and then there's Inline::Scheme to write, and...