Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re: Clunky parsing problem, looking for simple solution

by talexb (Canon)
on Jun 17, 2002 at 03:56 UTC ( #175020=note: print w/ replies, xml ) Need Help??


in reply to Clunky parsing problem, looking for simple solution

This is a little late, but I had so much fun writing it I thought I'd post it anyway.

#!/usr/bin/perl -w # Something to decompose BASIC If-Then-Else statements into simpler # statements, expanding IF .. THEN .. ELSE and IF .. THEN statements # recursively as necessary, and also expanding sub-statments separate +d by # ':' characters. # # In response to node 174889 on PerlMonks. use strict; my @Block = (); # Block to hold output statments in reverse or +der. my $Depth = 0; # Variable to indicate out depth of recursion. # Read DATA in from in-line data section, parse line and output resul +t. while (<DATA>) { print $_ . "expands to:\n"; my ( $LineNumber, $LineSource ) = m/^(\d+) (.+)$/; $Depth = 0; ParseStatement ( $LineSource ); DumpSimpleEquivalent ( $LineNumber ); } # Parse the BASIC statment, recursively if necessary. sub ParseStatement { my ( $ThisLine ) = @_; # This variable will reflect any additional lines added to the bloc +k # of statements. my $StatementCount = 0; # Check for the more difficult case of an IF .. THEN .. ELSE if ( $ThisLine =~ /^\s*IF(.+?)THEN(.+?)ELSE(.+?)$/ ) { my ( @Tokens ) = ( $1, $2, $3 ); # print "3 part statement$: $Tokens[0]-$Tokens[1]-$Tokens[2].\n"; # Only if we're at the first level do we need to output an end bl +ock. if ( $Depth == 0 ) { push ( @Block, "REM End of If Then Else block\n\n" ); } # If there's another level of IF statement within, call self recu +rsively. if ( $3 =~ /IF/ ) { my $LocalCount = 4; $Depth++; $LocalCount += ParseStatement ( $Tokens[2] ); push ( @Block, "ELSE" ); push ( @Block, "GOTO +$LocalCount" ); # To be fixed up later } else # Otherwise, handle normally: Split statement on ':', add then, t +he ELSE # statement and the GOTO to go around this block. { my $LocalCount = 3; my @SubStatements = split ( /:/, $Tokens[2] ); $LocalCount += @SubStatements - 1; push ( @Block, @SubStatements ); push ( @Block, "ELSE" ); push ( @Block, "GOTO +$LocalCount" ); # To be fixed up later } push ( @Block, $Tokens[1] ); push ( @Block, "IF$Tokens[0]THEN" ); } # OK, it's not an IF .. THEN .. ELSE; try just an IF .. THEN. We do +n't need # to worry about recursion. elsif ( $ThisLine =~ /^\s*IF(.+?)THEN(.+?)$/ ) { my ( @Tokens ) = ( $1, $2 ); # print "2 part statement: $Tokens[0]-$Tokens[1].\n"; if ( $Depth == 0 ) { push ( @Block, "REM End of If Then Else block\n\n" ); } my @SubStatements = split ( /:/, $Tokens[1] ); $StatementCount = @SubStatements - 1; push ( @Block, @SubStatements ); push ( @Block, "IF$Tokens[0]THEN" ); } return ( $StatementCount ); } # Routine to dump the expanded (simplified) version of BASIC code. sub DumpSimpleEquivalent { my $LineNumber = shift; my @NewBlock = reverse @Block; my @SubLine = ( (0..9), ('a'..'z') ); my $Index = 0; foreach ( @NewBlock ) { # Update line numbering, delete leading space. $NewBlock[ $Index ] =~ s/\+(\d)/$LineNumber.".".$SubLine[ $Index+$ +1 ]/e; $NewBlock[ $Index ] =~ s/^\s+//; print "$LineNumber.$SubLine[ $Index ] $NewBlock[ $Index++ ]\n"; } @Block = (); } __DATA__ 7130 IF B3<>0 THEN PRINT "FROM E TO S":W1=B4:X=B5:GOTO 6920 9810 IF K$="Y" AND RND(1) <.5 THEN GOTO 9820 ELSE GOTO 9770 8020 IF SO=0 THEN SO=1 ELSE SO=0 4835 IF V$="K" THEN A$="+K+" ELSE IF V$="M" THEN A$="!M!" ELSE IF V$=" +R" THEN A$="?R?" 1850 IF K3=0 AND EX(Q1,Q2)=0 THEN GOTO 8500 ELSE GOSUB 6000 1800 IF V$="K" THEN A$="+K+" ELSE IF V$="R" THEN A$="?R?" ELSE IF V$=" +M" THEN A$="!M!":Z1=R1:Z2=R2

--t. alex

"Mud, mud, glorious mud. Nothing quite like it for cooling the blood!" --Michael Flanders and Donald Swann

Update: After reading the other posts (I couldn't bear to read anything else till I was done my solution) I acknowledge that there are shortcomings in my solution..

  • No optimizations like deleting the second of two GOTO statements
  • A colon inside a string will muck up my statement separation operation

Update 2: Well, of course it goes without saying, but jepri mentioned that Parse::RecDescent could be used for parsing your BASIC syntax .. but that could be likened to using a transport to carry a single ream of paper.

I could also have implemented a b-tree structure to store the IF .. THEN .. ELSE statement pieces, but I decided just to write a quick and dirty solution, an array, instead. My solution ain't complete, but it will most likely take care of 90% of the job, and that sounded like what you needed.


Comment on Re: Clunky parsing problem, looking for simple solution
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://175020]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (15)
As of 2015-07-01 16:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (12 votes), past polls