RFC: Data::Dumper::Simple

by Ovid (Cardinal)
on Jul 31, 2004 at 04:18 UTC (#378865=perlmeditation)

I've had an itch I've wanted to scratch for a long time. So it's scratched. I'm tired of typing the following when I want named variables in Data::Dumper output:

warn Data::Dumper->Dump( [$foo, \%this, \@array, \%that], [qw/*foo *that *array *this/] );

Now I can just type:

use Data::Dumper::Simple;
warn Dumper($foo, %this, @array, %that);

Note that you don't even need to use references. The variables do not flatten into lists.

POD follows. This works as advertised, but I don't have a bundle uploaded yet. If you're interested in this or have suggestions, please let me know.

Update: Data::Dumper::Simple is now on the CPAN.


NAME

Data::Dumper::Simple - Perl extension for dumping variables


SYNOPSIS

  use Data::Dumper::Simple;
  warn Dumper($scalar, @array, %hash);


ABSTRACT

  This module allows the user to dump variables in a Data::Dumper format.
  Unlike the default behavior of Data::Dumper, the variables are named
  (instead of $VAR1, $VAR2, etc.)  Data::Dumper provides an extended 
  interface that allows the programmer to name the variables, but this
  interface requires a lot of typing and is prone to tyops (sic).  This 
  module fixes that.


DESCRIPTION

Data::Dumper::Simple is actually a source filter that replaces all instances of Dumper($some, @args) in your code with a call to Data::Dumper->Dump(). You can use the one function provided to make dumping variables for debugging a trivial task.

The Problem

Frequently, we use Data::Dumper to dump out some variables while debugging. When this happens, we often do this:

use Data::Dumper;
warn Dumper($foo, $bar, $baz);

And we get simple output like:

$VAR1 = 3;
$VAR2 = 2;
$VAR3 = 1;

While this is usually what we want, it can be confusing if we forget which $VARn in the output corresponds to which variable we passed in. To get around this, there is an extended interface to Data::Dumper:

warn Data::Dumper->Dump( [$foo, $bar, $baz], [qw/*foo *bar *baz/] );

This provides much more useful output.

$foo = 3;
$bar = 2;
$baz = 1;

(There's more control over the output than what I've shown.)

You can even use this to output more complex data structures:

warn Data::Dumper->Dump( [$foo, \@array], [qw/*foo *array/] );

And get something like this:

$foo = 3;
@array = ( 8, 'Ovid' );
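
For anyone who wants to try the extended interface directly, here is a minimal, self-contained sketch that uses only the core Data::Dumper module. The variable names and sample values are illustrative, and the exact whitespace of the output depends on Data::Dumper's Indent setting.

```perl
use strict;
use warnings;
use Data::Dumper;

my $foo   = 3;
my @array = ( 8, 'Ovid' );

# First array ref: the values (arrays and hashes are passed by reference).
# Second array ref: the names. A leading '*' asks Data::Dumper to pick the
# sigil from the value's type: '@' for an array ref, '$' for a plain scalar.
my $output = Data::Dumper->Dump( [ $foo, \@array ], [qw/*foo *array/] );
print $output;
```

The two array refs have to be kept in step by hand, which is exactly where mistakes creep in.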

Unfortunately, this can involve a lot of annoying typing.

warn Data::Dumper->Dump( [$foo, \%this, \@array, \%that], [qw/*foo *that *array *this/] );

You'll also notice a typo in the second array ref which can cause great confusion while debugging.

The Solution

With Data::Dumper::Simple you can do this instead:

use Data::Dumper::Simple;
warn Dumper($scalar, @array, %hash);

Note that there's no need to even take a reference to the variables. The output of the above resembles this (sample data, of course):

$scalar = 'Ovid';
@array = ( 'Data', 'Dumper', 'Simple', 'Rocks!' );
%hash = ( 'it' => 'does', 'I' => 'hope', 'at' => 'least' );

Taking a reference to an array or hash is effectively a no-op, but a scalar containing a reference works as expected:

my $foo = { hash => 'ref' };
my @foo = qw/foo bar baz/;
warn Dumper ($foo, \@foo);

Produces:

$foo = { 'hash' => 'ref' };
@foo = ( 'foo', 'bar', 'baz' );

This is to ensure that similarly named variables are properly disambiguated in the output.
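
To make the rewriting concrete, here is a rough sketch of the kind of textual transformation such a filter could perform. It is not the module's actual code: rewrite_dumper_calls and dump_call_for are hypothetical helpers, they work on a plain string rather than hooking into the source-filter machinery, and they only handle simple variable arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the argument text of one Dumper(...) call into
# the text of the equivalent Data::Dumper->Dump(...) call.
sub dump_call_for {
    my ($args) = @_;
    my (@values, @names);
    for my $var (split /,/, $args) {
        $var =~ s/^\s+//;
        $var =~ s/\s+$//;
        # Optional leading backslashes, a sigil, then a plain identifier.
        my ($refs, $sigil, $name) = $var =~ /^(\\*)([\$\@%])(\w+)$/
            or die "Cannot handle argument '$var'";
        # Arrays and hashes are passed by reference so they do not flatten;
        # an explicit reference is left as written.
        push @values, ($sigil ne '$' && $refs eq '') ? "\\$var" : $var;
        # Scalars keep a '$' name; arrays and hashes get a '*' name so that
        # Data::Dumper prints them with '@' or '%'.
        push @names, $sigil eq '$' ? "\$$name" : "*$name";
    }
    return 'Data::Dumper->Dump([' . join(', ', @values) . '], [qw/'
         . join(' ', @names) . '/])';
}

# Hypothetical helper: rewrite every simple Dumper(...) call in a string.
sub rewrite_dumper_calls {
    my ($source) = @_;
    $source =~ s/\bDumper\s*\(([^()]*)\)/dump_call_for($1)/ge;
    return $source;
}

print rewrite_dumper_calls('warn Dumper($foo, %this, @array);'), "\n";
```

Giving scalars a '$' name and arrays and hashes a '*' name is one way to have $foo and @foo labelled distinctly, as in the output above. The [^()]* in the pattern is also why a nested call such as some_sub() in the argument list is beyond a sketch like this.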

EXPORT

The only thing exported is the Dumper() function.

Well, actually that's not really true. Nothing is exported. However, a source filter is used to automatically rewrite any apparent calls to Dumper() so that it just Does The Right Thing.


SEE ALSO


BUGS AND CAVEATS

This module uses a source filter. If you don't like that, don't use this.

There are no known bugs but there probably are some as this is Alpha Code. As for limitations, do not try to call Dumper() with a subroutine in the argument list:

Dumper($foo, some_sub()); # Bad!

The filter gets confused by the parentheses. Your author was going to fix this but it became apparent that there was no way that Dumper() could figure out how to name the return values from the subroutines, thus ensuring further breakage. So don't do that.

Getting really crazy by using multiple enreferencing will confuse things (e.g., \\\\\\$foo). Don't do that, either. I might use Text::Balanced at some point to fix this if it's an issue.

Note that this is not a drop-in replacement for Data::Dumper. If you need the power of that module, use it.


AUTHOR

Curtis ``Ovid'' Poe, <eop_divo_sitruc@yahoo.com>

Reverse the name to email me.


COPYRIGHT AND LICENSE

Copyright 2004 by Curtis ``Ovid'' Poe

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Cheers,
Ovid

New address of my CGI Course.

Replies are listed 'Best First'.
Re: RFC: Data::Dumper::Simple
by BrowserUk (Pope) on Jul 31, 2004 at 20:34 UTC

    I've become a great fan of Devel::StealthDebug (except the name).

    use Devel::StealthDebug;
    ...
    #!dump( $var, \@foo, \%bas )!
    ...
    #!watch( $volatile )! ## Only traced when it changes.
    ...
    ----outputs----
    $ $var = 7;
    $\@foo = [ 123, 456, ];
    $\%bas = { 'a' => 1, 'b' => 2, };
    in Test::new at Test.pm line 31

    Comment out the use line and all the tracing disappears. It can also be enabled/disabled by use line parameter, environment variable or presence/absence of a filename.

    No run-time intrusions at all when disabled, but easily re-enabled. Watches are especially useful for cutting down the volumes of trace, though they can be a little temperamental. That's only scratched the surface of its capabilities; it also has #!assert( condition )!, #!when( varname, op, value )! and #!emit("sometext")! pragmas.

    Its dump() format is preferable to most of the dumpers-that-wannabe-serializers I've tried, and it doesn't exact the huge memory overhead that serialisers require for circular reference detection when dumping complex structures.

    It's also filter-based, but like you, I don't have a problem with that for debugging purposes.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

      Note that this strikes me as a much better use of a filter. It's not trying to find substitutable bits inside the source, just generating code from comments (though even being certain about what is and isn't a comment is far from trivial). I'd be even more appreciative if its directive syntax didn't look like code, particularly if it was clearly restricted to things the module can parse with 100% certainty.

      If I were to write such a module, I'd wrap the debug statements in POD, because while that is just as non-trivial as comments, pretty much all existing POD parsers misparse source in known, predictable ways.

      I don't think the debugging scenario makes brittle approaches excusable; the potential for subtle breakage introduced by filters would be doubly maddening if I'm already looking into another problem. I want to steer particularly clear of Heisenbugs in instrumentation code.

      Makeshifts last the longest.

        I think that the requirement for plings (!) at either end of each comment-embedded directive makes the parsing fairly unambiguous. The fact that it allows for 2 or more ##s to precede the directive pleases me, as I tend to use 2 most of the time. It enables me to slightly mis-define the comment card in my syntax highlighter, which reduces the chance of it mis-recognising $#array and similar as the start of a comment.

        I don't like the POD idea, but then I am not a fan of POD anyway. The need to use 5 lines of source-space to embed a single line of POD has always bugged me immensely.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      I'm still coming to a decision as to exactly how I want debug constructs in my code. Looks like I'll have to investigate adding this to my bag'o'tricks.

      Some of my code never sees the light of day.
      Some goes into highly visible (e.g. your phone bill) production systems.

      I am trying to establish my own set of guidelines on code 'noise' that doesn't differ by too much across those two disciplines. I know there will necessarily need to be a difference; I want to minimise that difference.

      • I want maximum debugging during development
      • I want minimum noise during production
      • making code changes to change this behaviour is unacceptable - sometimes (client sites especially) that is just not an option, so plan ahead - don't do it.

      Currently I mainly use STDERR->print(Dumper(...)); in 'private' code, and Log::Log4perl for everything else I think might ever be seen/used by anyone else.

      All our internally developed libraries use Log::Log4perl - if someone in the company uses one of our libraries, they need to configure a logger.

      Perhaps I am not properly separating the disciplines of logging and debugging - I feel that everything your code reports back is a debug statement - to somebody at some level - some are aimed at developers, some at sysadmins.

      use brain;

        I understand your dilemma. I've been vacillating between the 'log everything in case it goes wrong' and 'turn it all off for production, cos it slows everything down, causes disk-space maintenance problems, no one ever reads them and the thing you need to see is never in the log anyway' camps for 20 years.

        In the early days, processors were slow and disks were small, so logging was minimal by necessity. More recently, disk-space got cheap, compression got better and the processors faster. Extensive logging became attractive.

        I tried it on a couple of projects but came to the conclusion that for the most part extensive logging is pretty pointless. My reasoning goes like this:

        • Faster processors, bigger disks and better compression just mean that you can produce and store stuff that nobody will ever read at an even faster rate.
        • What you need is never in the logs.

          Unless you log every line, variable and every variable change, you always end up adding more logging or turning more of it on and trying to re-create the problem anyway. You might as well leave it all turned off and then turn it on when you need to.

        • In general, the more there is in the log, the harder it becomes to follow it.

          Stage 1) I want a coarse-grained "So far, so good. So far, so good" heartbeat level across the whole application until I can track down roughly where things are going belly-up.

          Stage 2) I want to be able to turn more detailed tracing on, bracketing either side of the suspected point of failure--but with everything else turned off. Otherwise you get the "can't see the wood for the trees" syndrome.

          Stage 3) I almost always want to add some extra watches or assertions to track down and then confirm the bug prior to changing the code, and to reconfirm after it is fixed.

          Historically, these additions get deleted and have to be put back when the bug (or a new one) reappears. Or they get left in, commented out, and have to be manually reenabled.

        I want to be able to switch tracing on and off across a span of lines, a subroutine or a package. When it's off, it should leave no artifacts in the code.

        Currently, D::SD doesn't easily allow this range-of-lines or subroutine enablement, but I think that its filter-based nature lends itself to this modification. The package-level mechanism had me foxed for a while, but thanks to theorbtwo's response to my recent question, I now think I've figured out how to do this. I may have a go at tweaking D::SD for lines and subs, and if it seems to work okay, I'll offer the mod back to the author.

        The only other thing missing is a debug level selector.

        1. Heartbeat tick throughout the code.
        2. Entry/exit point values.
        3. Major logic flow.
        4. Individual assertions and watches.

        More levels than that and it becomes a labour of love to categorise them--and nobody can agree on the categorisations anyway. Preferably, the first 3 levels should be installed automagically, with the fourth level evolving over time as required but remaining in situ in perpetuity.
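
As a rough illustration of a four-level selector, here is a small sketch in plain Perl. It is not Devel::StealthDebug's mechanism: the trace() function, the constant names and the TRACE_LEVEL environment variable are all assumptions made for the example, and unlike a filter-based approach the trace() calls stay in the compiled code even when tracing is off.

```perl
use strict;
use warnings;

# The four levels from the list above; the constant names are illustrative.
use constant {
    HEARTBEAT    => 1,
    ENTRY_EXIT   => 2,
    LOGIC_FLOW   => 3,
    ASSERT_WATCH => 4,
};

# TRACE_LEVEL is a hypothetical environment variable; unset or 0 disables tracing.
our $TRACE_LEVEL = $ENV{TRACE_LEVEL} || 0;

# Emit $message to STDERR only if $level is within the selected level.
# Returns the emitted line, or nothing when the message is suppressed.
sub trace {
    my ($level, $message) = @_;
    return if $level > $TRACE_LEVEL;
    my $line = "[trace $level] $message\n";
    print STDERR $line;
    return $line;
}

trace(HEARTBEAT,  'so far, so good');
trace(LOGIC_FLOW, 'taking the slow path');
```

Running with TRACE_LEVEL=1 would show only the heartbeat line; raising it to 3 would add the logic-flow line as well.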

        At least, that's where I think I stand on the subject. Tomorrow I may vacillate again :)


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: RFC: Data::Dumper::Simple
by Aristotle (Chancellor) on Jul 31, 2004 at 19:10 UTC

    To be honest I shy away from anything that relies on Filter::*.

    I don't see it as particularly necessary here either, as to me, the greatest annoyance has always seemed to be the fact that named variable output requires passing two parallel arrays — rather than, as would seem more natural, hash-like named parameters. Something like this would suffice for me:

    warn Data::Dumper->Dump(
        foo   => $foo,
        this  => \%this,
        array => \@array,
        that  => \%that,
    );

    This also makes the subtle typo you demonstrated much less likely.

    (Assume that you are allowed to say '$foo' => $foo where you need the control.)

    I am guessing this is all because whoever wrote Data::Dumper wanted the Dump() function to more-or-less DWIM, so that you wouldn't have to remember to call different functions for named or anonymous output. That could easily be achieved as well — Dump() could expect exactly one array- or hash-ref for anonymous or named output, respectively. This gives

    warn Dump { foo => $foo, this => \%this, array => \@array, that => \%that, };
    # vs
    warn Dump [ $foo, \%this, \@array, \%that ];

    The price is, of course, that you have to use the appropriate set of delimiters.
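
A minimal sketch of how such a function could be layered over the existing module is shown below. The Dump() defined here is a hypothetical wrapper in the calling package, not something Data::Dumper provides, and sorting the keys is an added assumption to keep the output order stable.

```perl
use strict;
use warnings;
use Data::Dumper ();    # import nothing; we define our own Dump() below

# Hypothetical wrapper: an array ref gives anonymous ($VAR1, $VAR2, ...)
# output, a hash ref gives named output.
sub Dump {
    my ($arg) = @_;
    if (ref $arg eq 'HASH') {
        # Hashes are unordered, so sort the keys for repeatable output.
        my @keys = sort keys %$arg;
        my @labels;
        for my $key (@keys) {
            # A '*' prefix lets Data::Dumper choose the sigil from the value's
            # type, unless the caller already supplied one (e.g. '$foo').
            push @labels, $key =~ /^[\$*]/ ? $key : "*$key";
        }
        return Data::Dumper->Dump([ @{$arg}{@keys} ], \@labels);
    }
    return Data::Dumper->Dump($arg);
}

my $foo   = 3;
my @array = ( 8, 'Ovid' );
print Dump({ foo => $foo, array => \@array });
print Dump([ $foo, \@array ]);
```

The calls use parentheses so that Perl does not have to guess whether the braces are a block or an anonymous hash.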

    Another thing that bugs me about Data::Dumper is the OO interface. The constructor takes no options, you have to set them all with one mutator call each. I'd much prefer the former:

    my $dumper = Data::Dumper->new( purity => 1, terse => 1, deepcopy => 1 );
    warn $dumper->Dump( [ $foo, \%this, \@array, \%that ] );

    Or something along these lines. There are numerous problems with my propositions here; I've never entirely thought this all through. All I'm really saying is that Data::Dumper really has a pretty awful interface, and that it should be possible to sanitize it and keep it DWIMmish without resorting to filters.
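
Something close to the constructor-with-options idea can be approximated today with a small helper. The sketch below is only an illustration: new_dumper is a hypothetical function, it maps each lower-case key to the existing Data::Dumper mutator of the same name (purity to Purity, terse to Terse, and so on), and it supplies the values afterwards through the module's Values method, since Dump ignores arguments when called on an object.

```perl
use strict;
use warnings;
use Data::Dumper ();

# Hypothetical helper: create a Data::Dumper object and apply all options
# in one call instead of one mutator call per option.
sub new_dumper {
    my (%opt) = @_;
    my $dumper = Data::Dumper->new([]);
    for my $key (sort keys %opt) {
        my $method = ucfirst lc $key;    # e.g. 'deepcopy' -> 'Deepcopy'
        die "Unknown Data::Dumper option '$key'" unless $dumper->can($method);
        $dumper->$method($opt{$key});
    }
    return $dumper;
}

my $dumper = new_dumper( purity => 1, terse => 1, deepcopy => 1 );
# Values() with an array ref replaces the data to dump and returns the object.
print $dumper->Values([ { a => 1 } ])->Dump;
```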

    Makeshifts last the longest.

      Generally I would agree with you regarding using a source filter. It certainly isn't my first choice. However, it works in this case because this code is merely a very useful aid in quick debugging. It is not intended to be used for any sort of production work. As such, your interface would be good for a new Data::Dumper interface, but it does not satisfy my "quick debugging" need.

      warn Dump { foo => $foo, this => \%this, array => \@array, that => \%that, };
      # versus
      Dumper($foo, %this, @array, %that);

      I certainly know which one I'd rather type :)

      Side note: it's on its way to the CPAN (with a version number that I need to fix -- darn it!), but in the interim, you can download it from my site and tell me how buggy it is.

      Cheers,
      Ovid

      New address of my CGI Course.

        Maybe there's a way to fudge it with PadWalker or something similar, instead.

        Makeshifts last the longest.

Re: RFC: Data::Dumper::Simple
by Dog and Pony (Priest) on Jul 31, 2004 at 14:58 UTC
    Seems to me this would be a worthy *::Simple module in that it provides an easy and intuitive "most common" subset of the Real Thing(TM). Quickly examining structures that don't seem to behave the right way is the only thing I use Data::Dumper for, and this module makes it a bit easier to type and easier on the eyes. I'd use it.

    You have moved into a dark place.
    It is pitch black. You are likely to be eaten by a grue.
Re: RFC: Data::Dumper::Simple
by NetWallah (Canon) on Jul 31, 2004 at 16:43 UTC
    Wonderful! - I always hated the 2 issues with Data::Dumper - it wants a REFERENCE, and it won't easily tell you the variable NAME.

    There was one additional nit - which perhaps you can pick - the need to say "print Dumper (\$blah)" - why keep repeating the "print" or "warn" .. wouldn't it be nice to have an OO interface that remembers what you want to do .. something like:

    use Data::Dumper::Simple;
    my $Dump = new Data::Dumper::Simple(
        Output => "warn",   # or "print"
                            # or: Output => \&CodeRef
        vars   => ($this, %that, @other),
    );
    $Dump->();   # Can't think of appropriate syntax here..
    # Line above would do the equivalent of print Dumper ($this, %that, @other);
    # If we no longer want to output %that, do
    $Dump->Removevars(%that);
    # or even
    $Dump->Addvars(@Something_Else);

        Earth first! (We'll rob the other planets later)

      There was one additional nit - which perhaps you can pick - the need to say "print Dumper (\$blah)" - why keep repeating the "print" or "warn" .. wouldn't it be nice to have an OO interface that remembers what you want to do.

      The latest version of Data::Dumper::Simple supports this.

      use Data::Dumper::Simple autowarn => 1;
      Dumper($scalar, @array, %hash);

      Or if you'd rather carp and you're already using a sub named Dumper:

      use Data::Dumper::Simple as => 'show', autowarn => 'carp';
      show($scalar, %hash);

      There's not much of a savings if you're doing this just once, but if you are doing a lot of debugging, it can save quite a bit of time.

      Cheers,
      Ovid

      New address of my CGI Course.

Re: RFC: Data::Dumper::Simple
by dfaure (Chaplain) on Jul 31, 2004 at 23:50 UTC

    And what about this:

    #!perl -w
    use strict;
    my $scalar = 'dfaure';
    my @array = ( 'Data', 'Dumper', 'Simple', 'Rocks!' );
    my %hash = ( 'it' => 'does', 'I' => 'hope', 'at' => 'least' );
    sub MyDumper {
        my @values;
        for (@values = @_) { s/^([%@])/\\$1/; $_ .= ','; }
        my @names;
        for (@names = @_) { s/^[%@]/*/; s/^\$//; }
        return eval "use Data::Dumper;Data::Dumper->Dump([@values],[qw/@names/])";
    }
    print MyDumper(qw/$scalar @array %hash/);

    ____
    HTH, Dominique
    My two favorites:
    If the only tool you have is a hammer, you will see every problem as a nail. --Abraham Maslow
    Bien faire, et le faire savoir...

      That fails because the string eval runs in MyDumper's lexical scope, not the caller's, so it cannot see lexicals declared where MyDumper is called; hence Aristotle's suggestion to use PadWalker.

      do {
          my @array = ( qw/foo bar/ );
          print MyDumper(qw/@array/);
      };
      sub MyDumper {
          my @values;
          for (@values = @_) { s/^([%@])/\\$1/; $_ .= ','; }
          my @names;
          for (@names = @_) { s/^[%@]/*/; s/^\$//; }
          return eval "use Data::Dumper;Data::Dumper->Dump([@values],[qw/@names/])";
      }

      Cheers,
      Ovid

      New address of my CGI Course.
