Variables and Scope: The battle begins

by Shadow-Master (Initiate)
on Dec 12, 2013 at 02:34 UTC ( #1066758=perlquestion )
Shadow-Master has asked for the wisdom of the Perl Monks concerning the following question:

So here's my problem. I wrote a script to do both inline and from-file ASM parsing for a project that I plan on expanding later on. It is still very much a work in progress, but fairly functional. Part of that is parsing the looping and jumping ASM commands (LOOP {label}, JZ, JNZ, ...). In the from-file functionality there is no problem: my script uses a two-pass method to store the commands and then parse them, so looping and jumping are handled without issue. In the inline parser, however, it gets a bit complicated. Due to the nature of the inline parser, I cannot use the same two-pass method until I know for sure that a looping command was entered. So I push every command that is entered into an array for storage and processing (in case a loop is found), search each one for a looping command, and if one is found, enter a subloop that deals with that little snippet and exit it once the loop is finished.
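For readers who have not met the two-pass idea, here is a minimal sketch of it. Everything in it is an assumption made for illustration: the tiny instruction set (MOV, ADD, DEC, JNZ), the %reg hash and the run_program name are hypothetical and are not taken from the script discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical two-pass runner: pass 1 records label positions,
# pass 2 executes and uses those positions to resolve jumps.
sub run_program {
    my @lines = @_;
    my (%label_at, @prog);

    # Pass 1: strip comments, remember where each label points.
    for my $line (@lines) {
        (my $text = $line) =~ s/;.*//;      # drop trailing comment
        $text =~ s/^\s+|\s+$//g;            # trim whitespace
        next if $text eq '';
        if ($text =~ /^(\w+):$/) {
            $label_at{$1} = scalar @prog;   # index of the next instruction
            next;
        }
        my ($op, @args) = split /[\s,]+/, $text;
        push @prog, [ uc $op, @args ];
    }

    # Pass 2: execute with a program counter.
    my %reg       = (eax => 0, ecx => 0);
    my $zero_flag = 0;
    my $pc        = 0;
    while ($pc < @prog) {
        my ($op, @args) = @{ $prog[$pc++] };
        if    ($op eq 'MOV') { $reg{ $args[0] } = $args[1] }
        elsif ($op eq 'ADD') { $reg{ $args[0] } += $args[1] }
        elsif ($op eq 'DEC') { $zero_flag = (--$reg{ $args[0] } == 0) ? 1 : 0 }
        elsif ($op eq 'JNZ') { $pc = $label_at{ $args[0] } unless $zero_flag }
        else                 { die "unknown instruction: $op\n" }
    }
    return \%reg;
}

my $regs = run_program(
    'mov ecx, 3',
    'mov eax, 0',
    'top:',
    'add eax, 2   ; body of the loop',
    'dec ecx',
    'jnz top',
);
print "eax=$regs->{eax} ecx=$regs->{ecx}\n";
```

Because each line is split into an opcode and arguments once, in the first pass, the second pass never has to re-run a regex over the text when a jump goes back to an earlier instruction.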

in pseudocode:

    #!/usr/bin/perl -w
    sub inline {
        my @asm;    # the ASM cmds buffer
        do {
            my $input = getinput();
            push(@asm, $input);
            if (isloop($input)) {
                dealwithloop();
            }
            else {
                dealwithregularcommand();
            }
        }    # repeated until the user quits
    }

Fine. In theory this is great. However, to control the loop my code has to search each command entered a few times, and check to see if that command is the jump or loop ASM instruction that is controlling the loop and decide whether it should run it again. Horribly inefficient.

our pseudocode for the loop now looks like this:

    sub inline {
        my @asm;    # the ASM cmds buffer
        do {
            my $input = getinput();
            push(@asm, $input);
            if (isloop($input)) {
                do {
                    if (isjmp($input)) {
                        $continue = testjmp();
                    }
                    elsif (isloop($input)) {    # loops and jumps are handled differently
                        $continue = testloop();
                    }
                    else {
                        dothing($input);
                    }
                } until ($continue == 0);
            }
            else {
                dealwithregularcommand();
            }
        }    # repeated until the user quits
    }

So this works, but is not pretty. My other idea was to use a global variable, change it in the looping/jumping subs, and just check it after the loop.

Something like this:

    # this is basically all pseudocode; the real sub is more complicated, but this is the gist
    sub jump {
        if ($condition) {
            $jmptaken = 1;
        }
        else {
            $jmptaken = 0;
        }
    }

    sub loopstuff {
        do {
            doallthethings();    # since the loop/jmp is pushed to the array anyway, it will be run in doallthethings
        } until ($jmptaken == 0);
    }

BUT! No matter what I tried (my $jmptaken, our $jmptaken), no matter what scope, no matter where it was placed, $jmptaken would ALWAYS evaluate to the value it was set with, 1 and the loop never stops, 0 and it only runs once, no matter that the debug output always showed it being set in the corresponding subs correctly AND that the value was being passed to the loop conditional correctly. So I'm stumped. What am I missing here, and what's the best way to fix this?

For reference, I'll include the actual code I'm using at the bottom, but since I am a terrible Perl novice at best, it's both messy and inefficient. Yes, I know I shouldn't be using & in front of subs, but I use that to differentiate between subs I wrote and native Perl functions. Also, &tellme is a debug/standard output sub used throughout my script: &tellme("message",indent,urgency)

the real code:

    sub interactive {
        my $current = 0;
        my %interactives;
        my @tmpASM;
        my $placeholder = 0;
        my $linesplit = '';
        my $iLINE = '';
        my $tmpstr = '';
        my $first = 0;
        &tellme("\e[31mInteractive mode is enabled.",1,0);
        &tellme("\e[31mType your commands as you were were putting them into a file.",1,0);
        &tellme("\e[31mType ;status to check status, and int 0x80 to save a frame.",1,0);
        &tellme("\e[31mType -v or +v at any time to raise or lower the verbosity level.",1,0);
        &tellme("\e[31mType current or cmd at any time to print out the current cmds stored in the main array.",1,0);
        &tellme("\e[31mType q or quit to leave.",2,0);
        &tellme("\e[31mType anything to continue.",2,0);
        &getinput;
        print "\n";
        &status;
        do {
            push(@tmpASM,$iLINE); #add it to our main parsing array
            $interactives{$iLINE} = $placeholder;
            &tellme("\e[34m".$iLINE."\e[33m pushed into main array. There have been \e[31m".$placeholder." \e[33m commands entered.",1,3) if $first > 0;
            $placeholder++;
            &parsecmd($iLINE);
            system 'clear' if $first > 0;
            &status if $first > 0;
            print "\e[37mPhant0m>";
            $first++;
            $tmpstr = &getinput;
            $tmpstr =~ m/(^[^;]*)/; # split at comments
            $iLINE = &trim($1) if (defined($1) and $1 ne '' and substr(&trim($1),0,1) ne ';'); # if we got something...
            $iLINE =~ m/(\w+)[ \t]+(\w+)[ \t]*,?[ \t]*(([^;]*))/i; #split that sucker up
            my $quickloop = &trim(uc($1));
            my $quicklabel = &trim($2) if ($iLINE ne 'CDQ' and defined($2));
            if (uc($iLINE) eq '+V') {
                $verbose++ ;
                &tellme("\e[31mVerbosity increased.",2,0);
                $iLINE = '';
            }
            elsif (uc($iLINE) eq '-V') {
                $verbose-- ;
                $verbose =0 if $verbose == -1;
                &tellme("\e[31mVerbosity decreased.",2,0);
                $iLINE = '';
            }
            elsif (uc($iLINE) eq 'CURRENT' or uc($iLINE) eq "CMD") {
                $iLINE = '';
                &tellme("Main Parsing Array",2,0);
                &tellme("------------------------------------",1,0);
                foreach (@tmpASM) {
                    &tellme($_,2,0);
                }
            }
            elsif ($quickloop eq 'JNE' or $quickloop eq 'JNS' or $quickloop eq 'JS'
                or $quickloop eq 'JE' or $quickloop eq 'JNE' or $quickloop eq 'JZ'
                or $quickloop eq 'JNZ' or $quickloop eq 'LOOP') {
                my $isjmptaken = 0;
                $current = $interactives{$quicklabel.":"};
                my $loopnumber = $current;
                do {
                    $tmpASM[$current] =~ m/(\w+)[ \t]+(\w+)[ \t]*,?[ \t]*(([^;]*))/i; #split that sucker up yet again...
                    $quickloop = &trim(uc($1));
                    $quicklabel = &trim($2) if ($iLINE ne 'CDQ' and defined($2));
                    if (uc($quickloop) eq 'LOOP') {
                        $isjmptaken = &LOOP($quickloop, $quicklabel);
                    }
                    elsif($quickloop eq 'JNE' or $quickloop eq 'JNS' or $quickloop eq 'JS'
                        or $quickloop eq 'JE' or $quickloop eq 'JNE' or $quickloop eq 'JZ'
                        or $quickloop eq 'JNZ') {
                        $isjmptaken = &JMP($quickloop, $quicklabel);
                    }
                    else {
                        &parsecmd($tmpASM[$current]);
                    }
                    $current++;
                    if (not defined $tmpASM[$current]) {
                        $current = $loopnumber;
                    }
                } while ($isjmptaken == 1);
                push(@tmpASM,$quickloop." ".$quicklabel); #add it to our main loop
                $iLINE = ''; #clear it out so the loop isnt run again at the top of our code :)
            }
        } until (uc($iLINE) eq 'QUIT' or uc($iLINE) eq 'Q' or uc($iLINE) eq 'E' or uc($iLINE) eq 'EXIT');
    }

Replies are listed 'Best First'.
Re: Variables and Scope: The battle begins
by GrandFather (Sage) on Dec 12, 2013 at 02:56 UTC

    Great story if we had all night around the camp fire and something to roast on sticks while we swapped tales, but there's no sample code that I can run to reproduce your result and too much required context for me to be able to tell where things are going pear shaped for you.

    I can make a suggestion or two though, but if they don't help you'll have to reduce your code to a small stand-alone sample script that reproduces the bug.

    1. I suspect you are doing this anyway, but just in case: always use strictures (use strict; use warnings;).
    2. Don't use the &subname calling style. Use subname() instead.
    3. If you are using subroutine prototypes, don't!
    True laziness is hard work
      Thank you for the fast reply.
      1) I always use strict and warnings. :)
      2) All my subs are called with arguments, so the ()'s are always there, but given that I just saw the difference clearly delineated for me, I will remove the &'s in my code and see if that changes anything. Thank you.
      3) I had to google what a subroutine prototype was. I can tell you with certainty that I have none of those. :)
        All my subs are called with arguments, so the ()'s are always there

        Actually, no they aren't ;). Your code includes &getinput; and &status calls without arguments at least. Just goes to show how easy it is for code not to be the way you think it is!

        If you haven't resolved the issue yet, you should try to strip out code until you either find the problem or have a sample script that still demonstrates the issue with a minimum of irrelevant code. You may find that the node "I know what I mean. Why don't you?" helps focus your effort.

        True laziness is hard work
      How do any of those things you mention help the OP?

        For the OP's immediate problem it's likely they don't. However the type of issue the OP is having leads me to think that the OP hasn't wide programming experience so pointing out areas that are a frequent cause of subtle problems may help. For example the & calling convention leads to

            doit(1, 2, 3);
            againSam(1, 2, 3);

            sub againSam {
                andAgain();
                andAgain();
            }

            sub doit {
                &andAgain;
                &andAgain;
            }

            sub andAgain {
                my (@values) = @_;
                print "<@values>\n";
            }

        Prints:

            <1 2 3>
            <1 2 3>
            <>
            <>

        which may be surprising if you don't appreciate what &andAgain is doing. Mixing prototypes in may also be surprising:

            sub andAgain ($);

            my @nums = (1, 2, 3);
            andAgain(@nums);
            &andAgain(@nums);

            sub andAgain ($) {
                my (@values) = @_;
                print "<@values>\n";
            }

        Prints:

            <3>
            <1 2 3>

        I doubt either "issue" is what is troubling the OP currently. However & call usage at least is present in the OP's code and where there is & calling there are likely prototypes. Pointing out potential problem areas in the OP's code, even if it's not what the OP is asking about, helps the OP. Until the OP shows us some code that reproduces the actual problem the best we can do is offer help with a few style issues.

        Note, the example code really isn't for educated_foo, who I'm sure knows all about this stuff. The examples are for those who are wondering what the fuss is about. Oh, and strictures are always useful, at the very least to pick up typos and brain farts.
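To make the point about strictures picking up typos concrete, here is a small sketch. The variable names are made up for illustration, and the snippet is compiled through a string eval only so that the compile-time error can be captured and printed instead of aborting the script.

```perl
use strict;
use warnings;

# Compile a snippet at run time so the compile error can be captured.
# The variable names are hypothetical.
my $ok = eval q{
    use strict;
    my $jmptaken = 1;
    $jmptakn = 0;        # typo: missing 'e'
    1;
};
my $error = $@;

print "compiled: ", ($ok ? "yes" : "no"), "\n";
print "error: $error" if !$ok;
```

Under strict the misspelled variable is rejected at compile time with a "Global symbol ... requires explicit package name" error. Without strict, Perl would quietly create a second, unrelated package variable, and the intended one would never change.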

        True laziness is hard work
Re: Variables and Scope: The battle begins
by tangent (Priest) on Dec 12, 2013 at 03:23 UTC
    $jmptaken would ALWAYS evaluate to the value it was set with, 1 and the loop never stops, 0 and it only runs once, no matter that the debug output always showed it being set in the corresponding subs correctly AND that the value was being passed to the loop conditional correctly.
    Your code seems to be a confusion of the two approaches you mention. You say that $jmptaken is set 'in' a corresponding sub (approach 2) but from what I can see in your code it is being set by the return value of the sub, i.e. the return value from &LOOP or from &JMP (approach 1). No matter what you set it to 'in' the subs it will always take on the value returned from those subs.
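A minimal sketch of that effect, using a hypothetical jump_like sub as a stand-in for JMP or LOOP (the name and the $flag variable are illustrative only):

```perl
use strict;
use warnings;

our $flag = 0;

# Hypothetical stand-in for a sub like JMP or LOOP: it sets the global
# to 0 internally, but its return value is 1.
sub jump_like {
    $flag = 0;
    return 1;
}

$flag = jump_like();     # the assignment happens after the sub returns
print "flag is $flag\n";
```

Whatever the sub stored in $flag is overwritten by the assignment of its return value, so debug output inside the sub can look correct while the loop condition still sees the returned value.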

    Edit: If you do set $jmptaken in the subs LOOP and JMP, rather than use a global variable you could pass it as a reference to those subs:

        if (uc($quickloop) eq 'LOOP') {
            LOOP($quickloop, $quicklabel, \$isjmptaken);
        }
        ...
        sub LOOP {
            my ($quickloop, $quicklabel, $isjmptaken_ref) = @_;
            $$isjmptaken_ref = 1;
        }
    Note the use of $$ to dereference.
      Thank you for the fast reply.
      I think there was a slight misunderstanding in my code. The code that I posted was after I had already made the switch to checking to see if LOOP or JMP was called, then running it. In that code, there is no $jmptaken global variable anymore, it was nixed in favor of an approach that (inefficiently) worked. I am aware that it is set by the value returned in the sub, and I'm happy for it, since it means that the control variable is controlling the loop.
      I did not know about passing it to the sub, and was actually looking for a way to pass a variable by reference instead of by value, so this will be tremendously useful, thank you.
      However in this particular case, it won't be, since to use this technique for this, I would have the worst of both worlds: global variables, and reparsing already parsed input. Until I can get variable scope to work for me without reparsing, I may just stick with what I have.
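On the scope question itself: the code that misbehaved was never posted, so what follows is only a guess at one common cause, not a diagnosis. A file-scoped lexical can be shared by several subs, but a second my inside a sub creates a new variable that shadows it. The sub names below are hypothetical.

```perl
use strict;
use warnings;

my $jmptaken = 1;            # file-scoped lexical, visible to the subs below

sub jump_shared {
    my ($condition) = @_;
    $jmptaken = $condition ? 1 : 0;       # assigns to the outer variable
    return;
}

sub jump_shadowed {
    my ($condition) = @_;
    my $jmptaken = $condition ? 1 : 0;    # 'my' creates a NEW variable here
    return;                               # the outer $jmptaken is untouched
}

jump_shadowed(0);
print "after jump_shadowed: $jmptaken\n";   # still the initial value, 1
jump_shared(0);
print "after jump_shared: $jmptaken\n";     # now 0
```

A debug print inside jump_shadowed would show the expected value, yet the loop condition outside would keep seeing the initial one, which is at least consistent with the symptom described in the question.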
Re: Variables and Scope: The battle begins
by Bloodnok (Vicar) on Dec 12, 2013 at 09:33 UTC
    If time is of the essence, why not utilise the depth of knowledge already in CPAN, e.g. Parse::RecDescent, or has this become a problem that you now have to solve for yourself?

    Just a(nother:-) thought ...

    A user level that continues to overstate my experience :-))
      To expand on this great suggestion from Bloodnok, I found a nice tutorial for the parser.


      Sample code from the tutorial.

          use strict;
          use warnings;
          use Parse::RecDescent;

          # Create and compile the source file
          my $parser = Parse::RecDescent->new(q(
              startrule : day month /\d+/
              day   : "Sat" | "Sun" | "Mon" | "Tue" | "Wed" | "Thu" | "Fri"
              month : "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun"
                    | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"
          ));

          # Test it on sample data
          print "Valid date\n" if $parser->startrule("Thu Mar 31");
          print "Invalid date\n" unless $parser->startrule("Jun 31 2000");
        I will check these out, thank you.
Re: Variables and Scope: The battle begins
by sundialsvc4 (Abbot) on Dec 12, 2013 at 12:58 UTC

    I totally agree. What you are doing here is creating a fairly brute-force, recursive-descent compiler/assembler ... an approach which makes a program that is every bit as complicated as, or more complicated than, the language that is to be processed. A far better approach is to use a parser, such as the one listed above, in order to permanently shove "the complexity of the language" into the grammar.

    Don't re-invent this wheel. Especially not this one.

      One of the first things I did when I started this project was to spend a few hours googling ASM parsers in Perl. I found none that did not directly affect registers and stack, instead of variables that acted as registers and stack, so I decided to write my own.
      It seems I was too narrow in my search. This parser you showed me is basically what I want, and had I found it when I started I would have used it, instead of my own.
      As it stands now, I would love to use it, and now need to sit down and figure out how to create grammar rules for ASM instructions (registers, stack, and flags all need to be touched.) and then most likely I will rewrite my whole code and drop several thousand lines from it. :)
      Thus, my new goal is to learn how this works. Of course, any general (or specific!) tips on either my code, or the CPAN parser would be amazing, since as noted above, obviously I don't really know what I'm doing. Thank you all for your help.
Re: Variables and Scope: The battle begins
by Anonymous Monk on Dec 12, 2013 at 17:56 UTC
    I'll just give a pointer to dominus's Higher-Order Perl (the book). It has sections that deal with dispatchers (which is what your code is doing). It's very much worth the read.
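For anyone who has not met the term, a dispatcher in this sense is usually built as a hash that maps a command name to a code reference. The sketch below is only an illustration of that general technique and is not taken from the book or from the script in this thread. The %reg hash, the execute sub and the three mnemonics are assumptions.

```perl
use strict;
use warnings;

# Hypothetical machine state; the names are illustrative only.
my %reg = (eax => 0, ecx => 0);

# Dispatch table: mnemonic => code reference.
my %dispatch = (
    MOV => sub { my ($dst, $val) = @_; $reg{$dst} = $val },
    ADD => sub { my ($dst, $val) = @_; $reg{$dst} += $val },
    DEC => sub { my ($dst) = @_; $reg{$dst}-- },
);

sub execute {
    my ($line) = @_;
    my ($op, @args) = split /[\s,]+/, $line;
    my $handler = $dispatch{ uc $op }
        or die "unknown instruction: $op\n";
    $handler->(@args);
    return;
}

execute($_) for 'mov eax, 5', 'add eax, 3', 'dec eax';
print "eax=$reg{eax}\n";
```

Adding an instruction then means adding one entry to the table, rather than another branch to a long if/elsif chain of string comparisons.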

Node Type: perlquestion [id://1066758]
Approved by davido
Front-paged by ww