PerlMonks  

Debug code out of production systems

by liz (Monsignor)
on Jan 24, 2004 at 22:08 UTC ( #323875=perlmeditation )

Today I finally bit the bullet. I created a pragma (lowercase name module) that allows you to add debug code to your programs that will only actually compile when you tell it to. And to boot, I wrote documentation and a test-suite for it and it's now on the way to a CPAN mirror near you (or available from my own CPAN modules site).

However, I'm in two minds about its name: begin. I chose "begin" because the pragma is based on the behaviour of the =begin pod directive. I'm also thinking that "debug" might be a good name, but that doesn't indicate how the magic is achieved.

Anyway, I wonder what my fellow monks would want to say about it.

Here is an excerpt from the pod:
NAME
begin - conditionally enable code within =begin pod sections

SYNOPSIS
export DEBUGGING=1
perl -Mbegin yourscript.pl
or:
perl -Mbegin=VERBOSE yourscript.pl
or:
perl -Mbegin=all yourscript.pl
with:
======= yourscript.pl ===================
# code that's always compiled and executed
=begin DEBUGGING
warn "Only compiled and executed when DEBUGGING or 'all' enabled\n"
=cut
# code that's always compiled and executed
=begin VERBOSE
warn "Only compiled and executed when VERBOSE or 'all' enabled\n"
=cut
# code that's always compiled and executed
========================================

DESCRIPTION
The "begin" pragma allows a developer to add sections of code that will be compiled and executed only when the "begin" pragma is specifically enabled. If the "begin" pragma is not enabled, then there is no overhead involved in either compilation or execution (other than the standard overhead of Perl skipping =pod sections).

To prevent interference with other pod handlers, the name of the pod handler must be in uppercase.

If a =begin pod section is considered for replacement, then a scope is created around that pod section so that there is no interference with any of the code around it. For example:

my $foo = 2;
=begin DEBUGGING
my $foo = 1;
warn "debug foo = $foo\n";
=cut
warn "normal foo = $foo\n";

is converted on the fly (before Perl compiles it) to:

my $foo = 2;
{
my $foo = 1;
warn "debug foo = $foo\n";
}
warn "normal foo = $foo\n";

But of course, this happens only if the "begin" pragma is loaded and the environment variable DEBUGGING is set.
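The conversion described above can be sketched as a small line-by-line rewrite. This is only an illustration under stated assumptions, not the pragma's actual implementation: the function name `activate_sections` and the hash of active names are hypothetical, and the real module works as a source filter rather than on a string.

```perl
use strict;
use warnings;

# Hypothetical helper (not the pragma's real code): turn every
# "=begin NAME" ... "=cut" section whose uppercase NAME is active
# into a bare { ... } block. One output line per input line, so
# line numbers stay the same.
sub activate_sections {
    my ($source, $active) = @_;
    my $in_active = 0;    # true while inside an activated section
    my @out;
    for my $line (split /\n/, $source, -1) {
        if (!$in_active
            && $line =~ /^=begin\s+([A-Z][A-Z0-9_]*)\s*$/
            && ($active->{$1} || $active->{all})) {
            $in_active = 1;
            push @out, '{';      # open a scope in place of the directive
        }
        elsif ($in_active && $line =~ /^=cut\b/) {
            $in_active = 0;
            push @out, '}';      # close the scope in place of =cut
        }
        else {
            push @out, $line;    # everything else is passed through
        }
    }
    return join "\n", @out;
}

# The example from the documentation, built from a list so that this
# file itself contains no pod directives at the start of a line.
my $code = join "\n",
    'my $foo = 2;',
    '=begin DEBUGGING',
    'my $foo = 1;',
    'warn "debug foo = $foo\n";',
    '=cut',
    'warn "normal foo = $foo\n";';

print activate_sections($code, { DEBUGGING => 1 }), "\n";
```

Sections whose name is not active are left untouched, so Perl skips them as ordinary pod.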

All other feedback is of course always appreciated!

Liz

Update:
I've decided to change the name of the pragma (as seen by the outside world) to "ifdef". I've also fixed the problems found by "Mr. Muskrat" and dbwiz and added an API for AUTOLOADing modules that may want to do the same source code conversion. For the impatient ones, also available from my own CPAN modules site.

Re: Debug code out of production systems
by PodMaster (Abbot) on Jan 24, 2004 at 22:54 UTC
    Why stake claim to all =begin UPPERCASE? I don't like that. You should pick 1 and stick with it (=begin begin sounds good).

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      Why stake claim to all =begin UPPERCASE?

      Hmmm... good question. To my knowledge there are no pod parsers that use uppercase names. I want to prevent a pod parser by accident adding debug code to the documentation of a module.

      Uppercase letters stand out. So in that sense you could say I'm staking this claim, but only visually!

      Please note that you can still use uppercase names for your own pod parsers. You just can't activate those sections with begin, because the documentation inside such a =begin section is not intended to be executed and will most likely cause compile errors.

      You should pick 1 and stick with it (=begin begin sounds good).

      As a developer, I want to be free in the choosing of my names. You might want to use DEBUGGING for real debug code, and VERBOSE for just making your script a bit more verbose than usual. Or FOO for some temporary stuff.

      You should note that the =begin sections are parsed for all modules once "begin" is loaded. Activating DEBUGGING will activate all of the =begin DEBUGGING pod sections of all modules. So I'm actually thinking of adding some namespace support for it, e.g.:

      perl -Mbegin=forks::TRACE myscriptusingforks.pl

      would only activate the =begin TRACE sections inside the forks.pm module, whereas:

      perl -Mbegin=TRACE myscriptusingforks.pl

      currently activates the =begin TRACE pod sections of all modules that have one.

      Liz

Re: Debug code out of production systems
by dbwiz (Curate) on Jan 24, 2004 at 23:38 UTC

    liz,

    Thanks for writing this pragma. I like the idea and I have started using it immediately.

    However, I don't like the name and I am sure that many others will object to it. While it is clear how it does the trick, it is really misleading about what the pragma does.

    I would rather have a debug pragma, and a quick search and replace in your module shows that it can be done, so that a sample code would look like the following:

    print "before\n";
    =debug MIDDLE
    print "inside\n";
    =cut
    print "after\n";

    And when I call it with this pragma, it works just fine.

    perl -Mdebug=MIDDLE test.pl

    You could also implement it as a pragma with two parameters, one for the tag and one for the label to activate. Possible names would be "podebug," "debugpod," "tagdebug," or "debpod." Not really suggesting, just brainstorming.

    Anyway, with this enhanced pragma, I would call my script as

    perl -Mdebpod=debug,MIDDLE test.pl
    # or, if I change my mind about the tag,
    perl -Mdebpod=begin,MIDDLE test.pl

    However, I should point out a possible problem. The following code contains valid POD but the embedded code doesn't get executed by the "begin" pragma. Not only that, but also the code after the POD block disappears.

    print "before\n";
    =pod
    Whatever I want to include here. Comments or code, it doesn't matter.
    =begin DEBUGGING
    print "inside\n";
    =cut
    print "after\n";

    Using the "begin" pragma, this code will only print "before\n". You should either fix it or amend the docs about this risk.

      While it is clear how it does the trick, it is really misleading about what the pragma does.
      I have some of that feeling as well. But please note that this pragma is only intended to be called from the command line (it actually emits a warning when you try to use it inside a script). So my thinking was: use "begin" as a pragma to activate certain =begin pod sections.

      I would rather have a debug pragma
      I've thought about using =debug as a pod delimiter. The thing is that many pod processors generate a lot of noise when they encounter an unknown pod directive. Whereas if they don't know how to handle a =begin section, they're supposed to ignore it.

      ...a quick search and replace in your module shows that it can be done...
      Indeed. That shouldn't be the problem. But causing headaches for maintainers of pod parsers may be a problem.

      ...as a pragma with two parameters, one for the tag and one for the label to activate.
      I think that would make it more confusing and obfuscating. I think the namespace idea that I had may be handier.

      ... the code after the POD block disappears.
      That's because:

      print "before\n";
      =pod
      Whatever I want to include here. Comments or code, it doesn't matter.
      =begin DEBUGGING
      print "inside\n";
      =cut
      print "after\n";

      becomes after conversion:

      print "before\n";
      =pod
      Whatever I want to include here. Comments or code, it doesn't matter.
      {
      print "inside\n";
      }
      print "after\n";

      which makes clear why the code disappears: the =cut has been replaced, so the =pod section never ends. I could check for that at the expense of a more complex filter. Or add another CAVEAT. Probably the first.
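One way the "more complex filter" could look is to remember whether the line-by-line scan is already inside ordinary pod, and to leave a nested =begin alone in that case. The sketch below is an assumption about how such a check might be written, not the fix that actually went into the module; `activate_sections_pod_aware` and its state names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical pod-aware variant: a "=begin NAME" only opens a code
# block when the scan is in plain code, never inside existing pod.
sub activate_sections_pod_aware {
    my ($source, $active) = @_;
    my $state = 'code';    # 'code', 'pod' (ordinary pod) or 'active'
    my @out;
    for my $line (split /\n/, $source, -1) {
        if ($state eq 'code'
            && $line =~ /^=begin\s+([A-Z][A-Z0-9_]*)\s*$/
            && ($active->{$1} || $active->{all})) {
            $state = 'active';
            push @out, '{';
        }
        elsif ($state eq 'code' && $line =~ /^=[a-zA-Z]/) {
            $state = 'pod';          # some other directive starts pod
            push @out, $line;
        }
        elsif ($state ne 'code' && $line =~ /^=cut\b/) {
            # =cut closes an activated block, or ends ordinary pod
            push @out, $state eq 'active' ? '}' : $line;
            $state = 'code';
        }
        else {
            push @out, $line;
        }
    }
    return join "\n", @out;
}

# dbwiz's example: the =begin sits inside a =pod section.
my $nested = join "\n",
    'print "before\n";',
    '=pod',
    'Whatever I want to include here.',
    '=begin DEBUGGING',
    'print "inside\n";',
    '=cut',
    'print "after\n";';

my $converted = activate_sections_pod_aware($nested, { DEBUGGING => 1 });
print $converted eq $nested ? "left untouched\n" : "rewritten\n";
```

With this approach the nested section stays documentation, the =cut still ends the pod, and the code after it is compiled again. Whether a =begin nested inside pod should instead be activated is a design choice the sketch does not try to settle.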

      Thanks for the feedback.

      Liz

        "I've thought about using =debug as a pod delimiter. The thing is that many pod processors generate a lot of noise when they encounter an unknown pod directive. Whereas if they don't know how to handle a =begin section, they're supposed to ignore it."

        Couldn't you use the =for directive =for debug? That does not emit warnings.


        -Lee
        "To be civilized is to deny one's nature."
Re: Debug code out of production systems
by BrowserUk (Pope) on Jan 25, 2004 at 01:40 UTC

    I think that this is probably the most 'legitimate' use of a source filter I've seen. I really like that it turns the usual practice on its head and only imposes additional overhead (minimal, compile-time) when enabled, whilst imposing none when disabled. Nice++

    However, I am also a little dubious about the use of =begin as the trigger, and as the name of the module.

    Whilst =begin DEBUGGING and =begin VERBOSE read quite nicely, and combining that with the requirement that the second word be all-uppercase makes it less likely that it will step on anyone's toes, using a fairly commonplace word like 'begin' as the trigger seems to invite the possibility that it will encounter a few modules containing a

    =begin HERE
    Some dramatically dangerous piece of (DON'T USE) example code:)
    =cut

    It's a stretch, but using a less common, combined or made-up word might lessen the chances. The only (pathetic, cutesy) possibility that comes to mind is

    =dBugin DEBUG

    Awful, but it demonstrates the idea that a made-up word that lends itself to immediate association with debugging is less likely to turn up by accident, and less likely to be overlooked by the unfamiliar.

    I also think that as a module name, 'begin' is most unlikely to attract my attention, or turn up on my keyword searches of CPAN for a debugging aid.

    I really do like the implementation though -- and I will be making use of it.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Timing (and a little luck) are everything!

      ...it will encounter a few modules containing a
      =begin HERE
      Some dramatically dangerous piece of (DON'T USE) example code:)
      =cut

      For code to be included this way, it would have to be:

      • code, not some type of documentation. If it was some type of documentation, it would cause compilation errors.
      • the HERE or "all" feature must be activated.
      I think that's a safe stretch (famous last words ;-).

      I also think that as a module name, 'begin' is most unlikely to attract my attention, or turn up on my keyword searches of CPAN...

      Would "ifdef" have worked?

      Liz

Re: Debug code out of production systems
by Mr. Muskrat (Abbot) on Jan 25, 2004 at 02:27 UTC

    Every =begin should have an =end!

    Does your pragma honor =end? According to perlpod, 'all text from "=begin" to a paragraph with a matching "=end" are treated as a particular format.'

Re: Debug code out of production systems
by hardburn (Abbot) on Jan 25, 2004 at 04:23 UTC

    This is really easy to achieve. perl will optimize away sections enclosed in if blocks with constant conditions:

    $ perl -MO=Deparse -le '
    > sub DEBUG () { 1 }
    > if(DEBUG) { print "Debug" }
    > print "Done";
    > '
    BEGIN { $/ = "\n"; $\ = "\n"; }
    do {
        print 'Debug'
    };
    print 'Done';
    -e syntax OK
    $ perl -MO=Deparse -le '
    > sub DEBUG () { 0 }
    > if(DEBUG) { print "Debug" }
    > print "Done";
    > '
    BEGIN { $/ = "\n"; $\ = "\n"; }
    '???';
    print 'Done';
    -e syntax OK

    So with a false constant, perl will replace the if with a statement B::Deparse doesn't like (but is probably a noop). With a true constant, it gets enclosed in a do block (in case there are any lexicals declared--perl doesn't look ahead far enough to know).

    (And I say 'perl', not 'Perl', because it's possible that a different implementation of Perl, should there ever be one (Ponie?), may apply different optimizations.)
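The folding described above can also be checked from inside a program rather than from the command line. The following is a minimal sketch, assuming the core B::Deparse module; the sub name `work` and the marker string are made up for the illustration, and the exact text B::Deparse prints for the folded statement differs between perl versions, so the check only looks for the absence of the debug statement.

```perl
use strict;
use warnings;
use B::Deparse;

use constant DEBUG => 0;    # false constant, known at compile time

# Hypothetical sub with a branch guarded by the constant.
sub work {
    if (DEBUG) { print "debug-only statement\n" }
    return "done";
}

# Turn the compiled sub back into source text and look for the branch.
my $text = B::Deparse->new->coderef2text(\&work);
print $text, "\n";
print "debug branch ",
    ($text =~ /debug-only/ ? "still present" : "folded away"), "\n";
```

Changing the constant to a true value and rerunning should make the statement reappear in the deparsed text.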

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

      If I understand correctly, this optimization is done by the optimizer, working on already compiled code. This has a number of drawbacks:
      • The code must be compilable
      • Effort is spent compiling, that is thrown away later
      • Any BEGIN type code in there will have executed, so if you have a use in there, that module _will_ have been loaded.
      To prove that last point:
      if (0) {
          use strict;
      }
      print "INC = @{[keys %INC]}\n";
      __END__
      INC = strict.pm

      Furthermore, to activate sections of code, no changes would need to be made to the program. It's all external with my solution. With your solution, you would have to export constant subs to all namespaces, a not so easy task and a definite pollution of namespace. Even though the optimizer takes away sections of code, there remains a coderef to the original constant in the package namespace. See this example:

      sub FALSE () { 0 }
      if (FALSE) {
          use strict;
      }
      print "INC = @{[keys %INC]}\n";
      print "FALSE exists\n" if exists &FALSE;
      __END__
      INC = strict.pm
      FALSE exists

      Hope this explains some of the reasons I had for doing it the way I did.

      Liz

        Effort is spent compiling, that is thrown away later
        Which is done in C. A source filter spends effort filtering, doing the work in Perl. Do you have any figures that show you have some form of "gain"?
        Any BEGIN type code in there will have executed, so if you have a use in there, that module _will_ have been loaded.
        Depends on how you write it.
        use if 0, My::Module;
        will not load My::Module.
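A small sketch that contrasts the two forms side by side. The two core modules are arbitrary stand-ins chosen for the illustration (assuming nothing else in the process has loaded them), and the module name is quoted here because the sketch runs under strict.

```perl
use strict;
use warnings;

# A plain "use" inside a dead branch still runs at compile time.
if (0) { use Text::ParseWords; }

# "use if" with a false condition never requires the module.
use if 0, 'Text::Wrap';

for my $file ('Text/ParseWords.pm', 'Text/Wrap.pm') {
    printf "%-20s %s\n", $file,
        exists $INC{$file} ? 'loaded' : 'not loaded';
}
```

The first module should be reported as loaded and the second as not loaded, which is the difference the two replies are arguing about.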

        Abigail

Re: Debug code out of production systems
by graff (Chancellor) on Jan 25, 2004 at 04:37 UTC
    This is pretty awesome -- I could see it being applied to a range of things other than debugging (and perhaps some holy wars breaking out about what might constitute overuse or misuse of the idea).

    Because of its potential, I think the tendency to give it a name related to debugging would be... well, constraining or inapt somehow -- it might limit people's perception about what it's really doing, and what it's able to do.

    Keeping it as "=begin" seems okay (especially if folks follow Mr. Muskrat's idea of always terminating the conditional code with "=end"). Another possibility would be to call it something like "=run_when" -- e.g.:

    =run_when ODDBALL_OS_VERSION
    use Local::Module::ForODDOSV; # nice to have a new way to do this!
    =cut
    ...
    while (<BLAH>) {
    ...
    =run_when DEBUGGING
    warn "reading BLAH and got: $_"
    =cut
    ...
    =run_when TESTING
    $expected = "What should happen here";
    warn "This isn't working\n" unless ( $expected eq some_test());
    =cut
    ...
    =run_when BORED
    print "Did you do something different with your hair today?\n";
    =cut

    If you feel that C-like pre-processor directives are really second nature to just about everybody who uses Perl, then you could try "=ifdefined" -- this wouldn't be any worse than "=begin", I think, but really, what's in a name? "bless" by any other name would be no less abstract...

    I guess the next thing to wonder about is whether folks might opt for having selected chunks of pod content be produced or not via perldoc, depending on what's in the environment. (This might be tricky -- how to flag the end of the =begin block within pod, without flagging the end of pod as well. Or does the use of "=begin" impose the end of pod processing already? I confess to not having studied pod grammar all that closely.)

      If you feel that C-like pre-processor directives are really second nature to just about everybody who uses Perl, then you could try "=ifdefined" -- this wouldn't be any worse than "=begin"
      Well, I admit being inspired by C preprocessor's directives. Ideally, I would like to use the =if, =ifdef, =elseif and =else and =endif family. However, any of these will cause severe noise with even the standard pod2html and pod2man utilities. And worse, it will include them inside the generated documentation!

      Maybe I should call the pragma "ifdef" and support all of the above pod markers and include =begin and =end (good point "Mr. Muskrat") as synonyms that can be used in the current code base without upsetting current pod parsers?

      Liz

Re: Debug code out of production systems
by rcaputo (Chaplain) on Jan 25, 2004 at 17:33 UTC

    A faster way to compile code out of a system is to hide it behind an if(CONSTANT) or unless(CONSTANT), as in

    use constant DEBUG => 0;
    if (DEBUG) {
        warn(
            "This code is only compiled into the program ",
            "when DEBUG is true.\n"
        );
    }
    The bonus is that you don't introduce more delay in the compile time, which a lot of people apparently dislike. I discovered this in POE, a project where I gained about 20% runtime performance with POE::Preprocessor by replacing small, commonly used functions with macros. A contrived example:
    macro num_max (x,y) { ((x) > (y) ? (x) : (y)) }

    This macro is then used, template-like, in the main body of source as:

    print "You owe: \$", {% num_max $total-$paid, 0 %}, "\n";

    Back to compile-time inclusion. POE::Preprocessor uses the common if/elsif/else syntax, tagged with an "# include" marker. That is, if you comment a construct with "# include", it will be evaluated at compile time (using the CONSTANT trick), and the code in the block will be included (or not) depending on the condition's outcome.

    unless ($expression) { # include
      ... lines of code ...
    } elsif ($expression) { # include
      ... lines of code ...
    } else { # include
      ... lines of code ...
    } # include
    Problems with macros and source filters in general:
    1. They alter your source's line numbers, which interferes with warnings and error messages. POE::Preprocessor takes great pains to insert "# line" directives that not only preserve your original line numbers but also indicate where in your macros the problem may really lie.
    2. They confound packagers, most notably perlapp and perl2exe. These Perl "compilers" do not evaluate source filters at runtime. They don't even evaluate them at "compile" time. Instead, the original, non-Perl syntax becomes an error when you try to run things.
    3. Source filtering is slow. I got no end of complaints about slow startup times, even though POE::Preprocessor attempts to be optimal Perl.
    4. Any non-Perl syntax, no matter how trivially like any number of template toolkits, is greeted with shock and confusion. (Heck, people still don't like @_[CONST1, CONST2], even though it is standard Perl syntax.)
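The "# line" directive mentioned in point 1 is standard Perl, so its effect is easy to try out. The sketch below only illustrates the mechanism and is not taken from POE::Preprocessor; the file name `original.pl` and the line number are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical generated code. The directive makes perl report the line
# that follows it as line 100 of "original.pl".
my $generated = join "\n",
    '# line 100 "original.pl"',
    'return __FILE__ . ":" . __LINE__;';

my $where = eval $generated;
die $@ if $@;
print "$where\n";    # prints original.pl:100
```

Warnings and die messages raised by the generated code pick up the same file name and line number, which is how a filter can point users back at their own source.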

    Liz's solution is much smarter than mine. It addresses all these problems. Very nice!

    -- Rocco Caputo - rcaputo@pobox.com - poe.perl.org

      ...It addresses all these problems. Very nice!

      Thank you for your kind words. But I'm afraid there are still some issues involved ;-(

      They alter your source's line numbers...

      I've chosen a very simple, line by line algorithm, that will lend itself for rewriting in C if ever necessary. No lines should be removed or added, so line numbers should always be correct (although pod lines that are not activated, are replaced by empty lines). I'm contemplating emptying out lines that start with "#" also, but I'm afraid the additional check (in Perl) would cost more CPU than adding the whole line to the source again and having the Perl parser get rid of such a line (in C).

      ... These Perl "compilers" do not evaluate source filters at runtime...

      Well, add mod_perl to that list. My nice little magic module doesn't do it in mod_perl. ;-( One of the reasons I started this in the first place. ;-(

      Anyway, I have added an API for other modules that would allow them to have an arbitrary piece of code stored in a variable to be processed in the same manner. For instance

      eval $source;
      could become:
      ifdef::process( $source ) if exists &ifdef::process;
      eval $source;

      Now to find out what magic mod_perl is performing when it loads its Perl modules and convince the mod_perl people to add the above extra line... ;-)

      Liz

      Update:
      Actually, I just realized the above could be done smarter:

      =begin MODPERL
      ifdef::process( $source );
      =cut
      eval $source;
      Under mod_perl, the environment variable MODPERL is always defined and not null. If ifdef is active, then the extra processing line becomes active automatically, making sure the $source will also be processed. If ifdef is not loaded, then the pod section will be skipped, directly evalling the source.
