http://www.perlmonks.org?node_id=174903


in reply to Clunky parsing problem, looking for simple solution

Interesting problem. A simple regexp-based solution that doesn't take lexemes into account will fail, at least on pathological cases like 1800 IF V$="THEN FOO ELSE IF BAR" THEN $K="FOO" I don't have the brain cells left tonight to work through a complete solution, though I have a dim recollection of having done something like this many years back using "fixups". 1800 IF V$="K" THEN A$="+K+" ELSE IF V$="R" THEN A$="?R?" ELSE IF V$="M" THEN A$="!M!":Z1=R1:Z2=R2 would translate first into
1800 IF V$="K" THEN GOTO {fixup:skip} 1800.1 GOTO {fixup:after-next-goto} 1800.2 A$="+K+" 1800.3 GOTO {fixup:end} 1800.4 IF V$="R" THEN GOTO {fixup:skip} 1800.5 GOTO {fixup:after-next-goto} 1800.6 A$="?R?" 1800.7 GOTO {fixup:end} 1800.8 IF V$="M" THEN GOTO {fixup:skip} 1800.9 GOTO {fixup:end} 1800.a A$="!M!" 1800.b Z1=R1 1800.c Z2=R2 1800.d REM
The second pass would peform the fixups.

By using a {fixup:skip} fixup (rather than calculating the target line number as you're generating the sequence, you allow for the possibility of doing peephole optimizations. In the sequence above,

1800.8 IF V$="M" THEN GOTO {fixup:skip} 1800.9 GOTO {fixup:end} 1800.a
could be optimized to
1800.8 IF V$<>"M" THEN GOTO {fixup:end} 1800.9
This "renumbers" lines within the sequence, but since target line number calculation/assignment has been deferred, not GOTOs are broken.