http://www.perlmonks.org?node_id=744845

Update: The updated version of this Meditation is at Re: RFC: Basic debugging checklist (updated) and the Tutorial version is now at Basic debugging checklist. Here is the untouched original:


Before I post this as a Tutorial, please help me to improve this by offering constructive feedback.

Are you new to Perl? Is your program misbehaving? Not sure where or how to begin debugging? Well, here is a concise checklist of tips and techniques to get you started.

This list is meant for debugging some of the most common Perl programming problems; it assumes no prior working experience with the Perl debugger (perldebtut). Think of it as a First Aid kit, rather than a fully-staffed state-of-the-art operating room.

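These tips are meant to act as a guide to help you answer the following questions:

  • Are you sure your data is what you think it is?
  • Are you sure your code is what you think it is?
  • Are you inadvertently ignoring error and warning messages?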

  1. Add the "stricture" pragmas (use strict and warnings)
       use strict;
       use warnings;
       use diagnostics;
  2. Print the contents of variables
       print "$var\n";
       print "@things\n";    # array with spaces between elements
  3. Check for unexpected whitespace
    • chomp, then print with colon delimiters for visibility
       chomp $var;
       print ":$var:\n";
    • Check for unprintable characters and identify them by their ASCII codes using ord
       print ":$_:", ord($_), "\n" for (split //, $str);
  4. Dump arrays, hashes and arbitrarily complex data structures
       use Data::Dumper;
       print Dumper(\%hash);
       print Dumper($ref);
  5. If you were expecting a reference, make sure it is the right kind (ARRAY, HASH, etc.)
       print ref $ref, "\n";
  6. Check to see if your code is what you thought it was: B::Deparse
       $ perl -MO=Deparse program.pl
  7. Check the return (error) status of your commands
    • open with $!
       open my $fh, '<', 'foo.txt' or die "cannot open foo.txt: $!";
    • system and backticks (qx) with $?
       if (system $cmd) {
           print "Error: $? for command $cmd";
       }
       else {
           print "Command $cmd is OK";
       }
       $out = `$cmd`;
       print $? if $?;
    • eval with $@
       eval { do_something() };
       warn $@ if $@;
  8. Demystify regular expressions using the CPAN module YAPE::Regex::Explain
       # what the heck does /^\s+$/ mean?
       use YAPE::Regex::Explain;
       print YAPE::Regex::Explain->new(qr/^\s+$/)->explain();
  9. Checklist for debugging when using CPAN modules:
    • Check the Bug List by following the module's "View Bugs" link.
    • Is your installed version the latest version? If not, check the change log by following the "Changes" link.
    • If a module provides status methods, check them in your code as you would check return status of built-in functions:
       use WWW::Mechanize;
       if ($mech->success()) { ... }
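
The three error variables from the "return status" tip can be tried out together. The following is only a minimal sketch: the file name is hypothetical and chosen so that open fails, and the child process is the running perl itself ($^X), so no external command is assumed.

```perl
use strict;
use warnings;

# Hypothetical file name, chosen so that open() is expected to fail.
my $missing    = 'no_such_file_for_debug_demo.txt';
my $open_error = '';
if (open my $fh, '<', $missing) {
    close $fh;
}
else {
    $open_error = "$!";    # copy $! right away; later calls may overwrite it
    print "open failed: [$open_error]\n";
}

# system() returns the raw wait status (the same value is left in $?).
# The child's exit code sits in the high byte, hence the shift by 8.
my $status    = system($^X, '-e', 'exit 3');
my $exit_code = $status >> 8;
print "child exit code: [$exit_code]\n";

# eval traps the die; the message ends up in $@.
my $ok = eval { die "something broke\n"; 1 };
chomp(my $eval_error = $ok ? '' : $@);
print "eval failed: [$eval_error]\n" unless $ok;
```

Appending "1" as the last statement inside the eval block is one common way to tell "died" apart from "returned a false value"; checking $@ alone, as in the checklist, is usually enough for a first look.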

Re: RFC: Basic debugging checklist
by kyle (Abbot) on Feb 18, 2009 at 22:05 UTC

    Another good link for dumping a data structure is How can I visualize my complex data structure?

    I agree with the others that vanilla Data::Dumper isn't the best advice.

    I find a useful option to B::Deparse is -p to show where precedence rules have bitten me.
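
As an illustration of the -p option, here is a small sketch that calls B::Deparse from inside a script rather than from the command line. The sub being deparsed is a made-up example of a precedence trap: the || binds to $file, not to the whole open call.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical sub with a precedence trap: "||" applies to $file only,
# so the die can never report a failed open.
my $suspect = sub {
    my $file = shift;
    open my $fh, '<', $file || die "cannot open $file: $!";
    return $fh;
};

# "-p" asks Deparse to add parentheses wherever they are legal,
# which makes the actual grouping visible.
my $deparse = B::Deparse->new('-p');
my $text    = $deparse->coderef2text($suspect);
print $text, "\n";
```

The command-line equivalent would be perl -MO=Deparse,-p program.pl.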

    I would say use delimiters even when you're not checking for unexpected white space. Which delimiters you use doesn't really matter, but I've most often seen balanced ones.

    print "[$foo] >>>$bar<<< {{$etc}}\n";
Re: RFC: Basic debugging checklist
by GrandFather (Saint) on Feb 18, 2009 at 20:22 UTC

    If you use a good IDE then 2, 4 and 5 become either redundant or trivial without making any code changes.

    Running your code through PerlTidy can show up issues hidden by slack indentation or incorrectly paired tokens (quotes, brackets, ...).

    It often helps to comment out chunks of code or put temporary early exits into subs so you can focus on the problem area. That technique can be counterproductive if you forget to restore the code though!

    Use a revision control system so that you can easily back out of large changes made while tracking a problem down.


    True laziness is hard work
      Running your code through PerlTidy can show up issues hidden by slack indentation or incorrectly paired tokens (quotes, brackets, ...).
      But that's redundant or trivial when using a good IDE (which doesn't have to be more advanced than a modern vi clone).
      It often helps to comment out chunks of code or put temporary early exits into subs so you can focus on the problem area.
      Excellent point. I seem to remember a node dedicated to this topic here at the Monastery, but I do not have it bookmarked, and my search-fu is weak. I also seem to remember that you were the author. I will continue looking because I think it would be useful to link to it here.

        You may be remembering:

        Start with the code that shows the problem and consider the minimum input and output requirements to demonstrate the problem. Then reduce the code and data as much as possible while still showing the problem.

        from I know what I mean. Why don't you?.


        True laziness is hard work
Re: RFC: Basic debugging checklist
by jplindstrom (Monsignor) on Feb 19, 2009 at 00:26 UTC
    Always, always print variables with visible delimiters so it's crystal clear when the values contain white space.

    print "title: ($title)\n"; print "title: '$title'\n";

    This goes for log files too, or anything that isn't presented to end users, really.

    In general, a good mindset when composing error reports or debug info is to spend five extra seconds to consider what the programmer is going to need to diagnose or recover from the problem.

    /J

Re: RFC: Basic debugging checklist (Data Dumber)
by tye (Sage) on Feb 18, 2009 at 20:27 UTC

    The default settings for Data::Dumper are pretty ugly. In particular, one should turn on Useqq if one is debugging unexpected string results. Indent=1 and Sortkeys=1 are also big improvements.
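
A minimal sketch of those three settings, using a made-up record with a stray tab in one value:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # show control characters as escapes such as \t
$Data::Dumper::Indent   = 1;    # fixed, shallow indentation
$Data::Dumper::Sortkeys = 1;    # stable key order, easier to compare runs

# Hypothetical data; note the trailing tab in the name.
my %record = (
    name => "Alice\t",
    id   => 42,
    tags => [ 'x', 'y' ],
);

my $dump = Dumper(\%record);
print $dump;
```

With Useqq on, the trailing tab shows up as a visible \t escape instead of blending into the surrounding whitespace.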

    - tye        

      That's why I prefer to use YAML when dumping structures for debugging. I find its output more readable (and more compact) than Data::Dumper. Of course, that's just a personal preference.
Re: RFC: Basic debugging checklist
by Tanktalus (Canon) on Feb 18, 2009 at 23:32 UTC

    Regarding the hate-in with Data::Dumper, of which I've not found issue, my suggestion would be to simply add text as follows (or something to this effect):

    If you find the format of Data::Dumper to be too difficult, try installing and using Data::Dump, YAML, JSON, Data::DumpXML, among other possibilities, instead. They all fill the same need, with varying output.
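
To give a feel for one of the alternatives, here is a sketch using JSON::PP. That choice is an assumption made for the example: JSON::PP ships with recent perls, so it needs no installation, whereas the JSON module named above comes from CPAN. The data is made up.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical structure to dump.
my $data = {
    name  => 'widget',
    sizes => [ 1, 2, 3 ],
    meta  => { ok => 1 },
};

# pretty:    one item per line, indented
# canonical: hash keys sorted, so output is stable between runs
my $json = JSON::PP->new->pretty->canonical->encode($data);
print $json;
```

JSON can only represent plain data (hashes, arrays, strings, numbers), so this suits simple structures; objects and code references need one of the other dumpers.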
    In my mind, it's more important to get them debugging than it is which tool they're using. Offer choices, but get the concepts across. Preferably with something they can start using immediately, and let them advance into other options.

    In that way, I'd suggest that you point out when a tool you're suggesting to use may need installation, such as YAPE::Regex::Explain in #8.

    Also, for #3, eliminate the word "colon" from the text, merely suggest "delimiters". Then you can give examples, "such as colons." Personally, I prefer square bracket delimiters ([]) - even if there are brackets in the text I'm printing, I usually find that I'm not confused, whereas trying to follow colons can give me eye-strain. Thus, I think that'd make another good example. (Well, I think it makes a better example, but I'm not writing this checklist ;-) )

Re: RFC: Basic debugging checklist
by jplindstrom (Monsignor) on Feb 19, 2009 at 01:30 UTC
    Use the Carp module to show you the stack trace, i.e. each subroutine call in the source code from which you started the program to where you are printing the debug output. Carp comes with Perl.

    use Carp qw/ cluck confess longmess /;

    # die of errors with stack backtrace
    confess("title: ($title)");

    # warn of errors with stack backtrace
    cluck("Before escaping title ($title)");
    $title = $self->escape($title);
    cluck("After escaping title ($title)");

    # longmess - return the message that cluck and confess produce
    $self->log_error( "title not set: " . longmess() );

    Carp on search.cpan.org appears to identify perl itself and doesn't provide POD for some reason, but see: perldoc Carp.

    Another useful module is Carp::Always, which makes all dies and warns emit stack traces.

    Note that this will change exception strings, so any eval BLOCK that checks for specific exception matches against $@ may fail because of this.

    Enable Carp::Always with a regular "use" statement in the code, or temporarily from the command line using one of these:

    perl -MCarp::Always your_program.pl

    #or, depending on the shell/CMD
    export PERL5OPT=-MCarp::Always
    set PERL5OPT=-MCarp::Always

    Or from within Emacs using the M-x setenv function. This is useful if you run Perl programs using the *compilation* mode.

    /J

Re: RFC: Basic debugging checklist
by Your Mother (Archbishop) on Feb 18, 2009 at 20:54 UTC

    Regarding #4, it also seems there is a pretty wide consensus (someone will clobber me if I'm wrong) that Data::Dumper is at this point in time "considered harmful" and one should reach for Data::Dump (or even YAML or JSON) instead.

      Consider yourself clobbered. Now, I'm the first to admit that the output of Data::Dumper is harder to read than YAML, but considering that most of the time I see programmers reach for Data::Dumper instead of any of the alternatives, I'm not inclined to agree on the "pretty wide consensus".

        You hit me in the ear!

        Sorry, bad choice of words. I meant "growing consensus" and I meant it among those who don't cargo cult. I've never even used Data::Dump but I've seen Data::Dumper mentioned quite pejoratively in comparisons lately. Data::Dumper's defaults are terrible (::Terse and ::Indent are usually necessary/desired) and using eval to bring the code back to life is, using this chestnut again so soon, "considered harmful."

Re: RFC: Basic debugging checklist
by JavaFan (Canon) on Feb 19, 2009 at 00:06 UTC
    Check for unprintable characters and identify them by their ASCII codes using ord
    print ":$_:", ord($_), "\n" for (split //, $str)
    That, IMO, makes it hard to spot unexpected unprintable characters because you turn ALL characters into numbers. I often do:
    (my $copy = $str) =~ s/([^\x20-\x7E])/sprintf '\x{%02x}', ord $1/eg;
    print $copy, "\n";
    which leaves all printable ASCII characters as is, and turns all other characters into hex escapes.
Re: RFC: Basic debugging checklist
by rcaputo (Chaplain) on Feb 19, 2009 at 06:40 UTC

    warn() is better than print() unless you have set $|=1. Buffering gets in the way if you're tracking down the location of a problem. Especially if it's a coredump.
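
A small sketch of both options. The helper name debug_line is made up for this example; it only formats the message, with bracket delimiters around the value.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build one debug line with
# bracket delimiters so stray whitespace in the value is visible.
sub debug_line {
    my ($label, $value) = @_;
    return sprintf "%s: [%s]\n", $label, defined $value ? $value : '<undef>';
}

my $title = "Perl ";    # note the trailing space

# Option 1: warn writes to STDERR, which is unbuffered by default.
warn debug_line('title', $title);

# Option 2: keep using print, but turn on autoflush for STDOUT first.
{
    local $| = 1;
    print debug_line('title', $title);
}
```

Because the message ends in a newline, warn prints it as is, without appending the file name and line number.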

Re: RFC: Basic debugging checklist (updated)
by toolic (Bishop) on Feb 19, 2009 at 18:19 UTC
    What a terrific response! I'm glad I have enough votes today: ++ to all.

    Every suggestion and opinion is valuable. When I post this as a Tutorial, I will link back to this Meditation so that all these valid discussions are available. Since I think this checklist will be most effective if I keep it as terse as possible, I have not captured all the reasons for using certain techniques in the updated meditation below. Nor have I included some of the more advanced methods mentioned.

    I was most surprised at the amount of discussions surrounding Data::Dumper. I use it all the time because it always Does What I Want, probably because my programs are much simpler than those of more advanced coders. One advantage of Data::Dumper over its CPAN counterparts is that it is a core module, and therefore, no installation is required. Since this checklist will be geared for beginners, I will mention Data::Dumper and acknowledge the more advanced alternatives. And I will stop being so lazy and try some of them myself to see what I'm missing!

    Rather than cluttering the Monastery with individual replies to all your replies, I will toss bouquets of "Thank You"'s to all who spent their time to remind me of and teach me new techniques:

    my @righteous_monks = qw(
        GrandFather tye YourMother kyle Tanktalus JavaFan
        jplindstrom rcaputo roho bart ELISHEVA tilly gwadej
    );    # including monks from the CB

    Are you new to Perl? Is your program misbehaving? Not sure where or how to begin debugging? Well, here is a concise checklist of tips and techniques to get you started.

    This list is meant for debugging some of the most common Perl programming problems; it assumes no prior working experience with the Perl debugger (perldebtut). Think of it as a First Aid kit, rather than a fully-staffed state-of-the-art operating room.

    These tips are meant to act as a guide to help you answer the following questions:

    • Are you sure your data is what you think it is?
    • Are you sure your code is what you think it is?
    • Are you inadvertently ignoring error and warning messages?
Re: RFC: Basic debugging checklist
by jplindstrom (Monsignor) on Feb 19, 2009 at 00:41 UTC
    In addition to mentioning the command line debugger, you could mention the very nice support for perldb in Emacs. I blogged about this recently when I made PerlySense debugger-aware (the PerlySense docs contain a summary of the most common debugger operations).

    A while back I blogged about how to limit the output when dumping huge data structures with Data::Dumper and the debugger x command. Feel free to incorporate any of that in this post. In particular the syntax of the .perldb config was nontrivial to figure out, so that may be useful to someone.
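
One way to limit such output from a script, shown here only as a sketch and not necessarily the approach from the blog post, is Data::Dumper's Maxdepth setting. The nested structure is made up.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical deeply nested structure with a large leaf.
my $deep = { level1 => { level2 => { level3 => [ 1 .. 100 ] } } };

$Data::Dumper::Maxdepth = 2;    # do not descend below the second level
$Data::Dumper::Sortkeys = 1;

my $short = Dumper($deep);
print $short;
```

Anything below the cut-off is printed as a plain reference string such as HASH(0x...), which is enough to see the shape of the data without the bulk.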

    That concludes the shameless plugs for today :)

    /J

Re: RFC: Basic debugging checklist
by roho (Bishop) on Feb 19, 2009 at 09:43 UTC
    Don't forget the great GUI debugger Devel::ptkdb by Andrew E. Page.

    "Its not how hard you work, its how much you get done."

Re: RFC: Basic debugging checklist
by McDarren (Abbot) on Feb 23, 2009 at 16:52 UTC
    I don't think any Perl debugging guide could be considered complete without a reference to brian's Guide to Solving Any Perl Problem.

    Also, for dealing with more complex data structures, I'm personally quite attached to Data::Dumper::Simple
    (yes, I know it uses source-filtering and is therefore by definition _evil_, but as a debugging tool I love it)

    Cheers,
    Darren :)