Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Data structure examiner

by dbw (Beadle)
on Sep 05, 2007 at 15:55 UTC ( #637187=perlquestion: print w/ replies, xml ) Need Help??
dbw has asked for the wisdom of the Perl Monks concerning the following question:

This is part of my continuing quest to grok references in Perl.

I've written a function that pretty-prints a data structure recursively, akin to the behavior of the debugger's x command and the package GraphViz::Data::Grapher. By "data structure", I mean a reference that may contain references to references of hashes of references to arrays of typeglobs (for example).

Here's the script, which contains a test data structure:

#!/usr/bin/perl # lets you examine data recursively, following references. use warnings; use strict; # control behavior: # never go deeper than MAXDEPTH (saves us from circular shenanigans) use constant MAXDEPTH => 16; # cut off long hash keys use constant MAXHASHKEYLEN => 16; # if deferencing through multiple references ($$$$node), collapse the # references? use constant MULTICOLLAPSE => 1; # if deferencing through multiple references ($$$$node), indent # accordingly? use constant MULTIINDENT => 0; # show the indices on array elts? use constant ARRAYIDX => 1; # control the output format: # for references to references: \$var, \\\%hash, etc. use constant DEFERENCE => '->'; # for truncating long or nasty hash keys use constant TRUNCATE => '|..'; # for printing hashes use constant HASHOPEN => '{'; use constant HASHCLOSE => '}'; use constant HASHKEY => '=>'; use constant HASHSEP => ','; # for printing arrays use constant ARRAYOPEN => '['; use constant ARRAYCLOSE => ']'; use constant ARRAYSEP => ','; # for printing array indices use constant ARIDXOPEN => '['; use constant ARIDXCLOSE => ']'; # for printing strings use constant STROPEN => '"'; use constant STRCLOSE => '"'; # for printing undefined scalars use constant UNDEF => 'undef'; # for extra horizontal spacing (recommended ' ') use constant EXTRASPACE => ' '; # for vertical spacing (recommended "\n") use constant NEWLINE => "\n"; # indentation string (set to null if NEWLINE != "\n") use constant INDENT => ' 'x4; my $depth = 0; sub indent () { print INDENT x $depth; } sub newline () { print NEWLINE; } # is it a number? Put it in its own sub to keep warnings on elsewhere sub numberp ($) { no warnings; my $s = shift; return ($s == 0 && $s ne '0'); } # quote non-numbers, hack around "" sub formatscalar ($) { my $str = shift; if (! defined($str)) { # return UNDEF; } elsif (numberp($str)) { # not a number return STROPEN."$str".STRCLOSE; } else { # a number return $str; } } # truncate degenerate hash keys sub formatkey ($) { my $key = shift; if ( # truncate at newline ($key =~ s/\n.*//m) or # truncate at nonprinting chars ($key =~ s/[^[:print:]].*//m) or # truncate if it's too long ($key =~ s/(.{@{[MAXHASHKEYLEN - length TRUNCATE]}}).+/$1/) or # this can only serve to confuse ($key =~ s/@{[HASHKEY]}.*//) ) { # then we truncated. $key .= TRUNCATE; } elsif ( length $key == 0 ) { $key = UNDEF; } return $key; } # recursive, so prototype first sub printnode ($); sub printnode ($) { my $node = shift; my $ref = ref $node; if ($depth > MAXDEPTH) { warn 'Depth exceeded '.MAXDEPTH.', possible circular reference +'; return; } if ($ref eq 'HASH') { print HASHOPEN; newline; for (sort keys %{ $node }) { ++$depth; indent; print formatkey($_),EXTRASPACE.HASHKEY.EXTRASPACE +; printnode(${ $node }{$_}); --$depth; print HASHSEP; newline; } indent; print HASHCLOSE; } elsif ($ref eq 'ARRAY') { print ARRAYOPEN; newline; my $index = 0; for (@{ $node }) { ++$depth; indent; print "[@{[$index++]}]".EXTRASPACE if ARRAYIDX; printnode($_); --$depth; print ARRAYSEP; newline; } indent; print ARRAYCLOSE; } elsif ( $ref eq '' ) { print formatscalar($node); } elsif ( $ref eq 'CODE' or $ref eq 'LVALUE' ) { print $node; } elsif ( $ref eq 'GLOB' ) { print $$node; } elsif ( $ref eq 'REF' or $ref eq 'SCALAR') { if (MULTICOLLAPSE) { my $refdepth=1; while ($ref eq 'REF') { $node = $$node; $ref = ref $node; $refdepth++; } if ($ref eq 'SCALAR') { $node = $$node; } else { $refdepth--; } $depth += $refdepth if MULTIINDENT; print DEFERENCE.(($refdepth>1)?"^$refdepth":'').EXTRASPACE +; newline if MULTIINDENT; indent if MULTIINDENT; printnode($node); $depth -= $refdepth if MULTIINDENT; } else { print DEFERENCE.EXTRASPACE; $depth++ if MULTIINDENT; printnode($$node); $depth-- if MULTIINDENT; } } else { die "ref returned unknown value $ref" } } sub examinedata($) { my $node = shift; printnode($node); newline; } my $variable; my $variable2 = 140; my $hashref; $hashref = { key => $hashref }; sub code () {} examinedata( { hash => $hashref, test => { test => 1, }, test2 => { test3 => 4, test4 => [(1..10)], test5 => \&code, }, test6 => [ \\\\$variable2, \\\\$variable, \\\$variable2, \\\$variable, \\$variable2, \\$variable, \$variable2, \$variable, $variable2, $variable, \\&code, \&code, \\\\\\\[1..2], \\\[1..2], \\[1..2], \[1..2], [1..2], ], test7 => [ "test", \\*GLOB, \*GLOB, *GLOB, ], test8 => \\@{ [ "test", \\*GLOB, \*GLOB, *GLOB, ] }, "This is a long key." => undef, "This\n is a stupid key." => undef, "\n This one's stupider still!" => undef, "ASCII 020\020 = Data Link Escape. WTF?" => undef, "LEET=>HAX" => undef, $variable => \$variable, 100 => \$variable, } );
It works quite well on (relatively) simple data structures such as {array => [1..10], hash => { key => \\\( "string" )}} and so forth. However, I've got some questions:
  1. What about my test structure is emitting the warning "Use of uninitialized value in anonymous hash ({}) at ./examinedata.pl line 205." ??
  2. The documentation for ref (for which my code is essentially a wrapper) mentions a possible return value of LVALUE. When does this happen?
  3. The docs also mention different behavior "If the referenced object has been blessed into a package". (I don't really have a question about this, other than that I don't understand it. Grokking object-oriented Perl is my next quest.)
  4. How can I create a circular reference? If I do, is there a sane way to represent it?
  5. Is there a sexier way to handle code references? In particular, can I get it to print the name of the subroutine (if there is one)?
  6. How can I descend into globs?
  7. Is there a way I can use the debugger to stop at a certain point and execute my code (if, say I prefer to use it over the debugger's x command)?
Thanks!

/usr/bin/perl '-nemap$.%$_||redo,2..$.++;print$.--'

Comment on Data structure examiner
Select or Download Code
Re: Data structure examiner
by ikegami (Pope) on Sep 05, 2007 at 16:06 UTC

    $variable is never assigned a value, so the stringification of $variable when it's used as a keyissues a warning.

    $variable => \$variable,

    It was very easy to debug by removing elements from the hash until the error disappeared.

    By the way, are you aware of Data::Dumper? It does the same thing as your code.

    use Data::Dumper; $Data::Dumper::Useqq = 1; my $x = { ... }; print Dumper $x;
Re: Data structure examiner
by Fletch (Chancellor) on Sep 05, 2007 at 16:16 UTC

    Tangentially of note: your use of prototypes (save the () on newline possibly) is extraneous and subject to the same advice from the second paragraph that I just wrote yesterday . . .

    Update: And the () prototype on indent combined with the action-at-a-distance dependency on the file-scope lexical $depth raises the hackles as well (even if you were using prototypes needfully).

Re: Data structure examiner
by FunkyMonk (Canon) on Sep 05, 2007 at 16:21 UTC
    The documentation for ref (for which my code is essentially a wrapper) mentions a possible return value of LVALUE. When does this happen?
    An LVALUE is something that can be assigned to (ie can appear on the left hand side of =. An easy way to see it returned from ref is by...
    my $str = "hello"; print ref \substr( $str, 0, 2 );

    The docs also mention different behavior "If the referenced object has been blessed into a package". (I don't really have a question about this, other than that I don't understand it. Grokking object-oriented Perl is my next quest.)
    Compare the output from these two print statements
    print ref {}; print ref bless {}, "MyObject";

    How can I create a circular reference? If I do, is there a sane way to represent it?
    See what Data::Dumper does with circular references? DD is usually quite sensible.
    Is there a sexier way to handle code references? In particular, can I get it to print the name of the subroutine (if there is one)?
    You could look at B::Deparse, which will decompile a coderef.

    Update: I don't think you can get the sub's name, though

    Is there a way I can use the debugger to stop at a certain point and execute my code (if, say I prefer to use it over the debugger's x command)?
    Yes. If you start the program running under the debugger, $DB::single = 1 will force entry into the debugger. You can then execute any Perl code you like. (It might be DB::Single, but I'm fairly sure it's lowercase 's'.)

    update: fixed a couple of typos & added deparse disclaimer

Re: Data structure examiner
by TGI (Vicar) on Sep 05, 2007 at 17:58 UTC

    You probably want to use Scalar::Util's reftype instead of ref. It will handle blessed references properly for you.

    For example:

    use Scalar::Util qw(reftype); my $foo = bless {}, 'Artichoke'; print join "\n", ref $foo, reftype $foo, ; __END__ # Output: Artichoke HASH


    TGI says moo

Re: Data structure examiner
by planetscape (Canon) on Sep 05, 2007 at 18:27 UTC
Re: Data structure examiner
by bsb (Priest) on Sep 06, 2007 at 03:49 UTC
Re: Data structure examiner
by pemungkah (Priest) on Sep 06, 2007 at 04:03 UTC
    And there's GraphViz::Data::Structure, if you want pictures instead of text. Integrate that into the debugger with Devel::Command::Viz. You might find the internals of that module interesting; I stole some code from dumpvar.pl to handle coderefs and globs.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://637187]
Approved by ww
Front-paged by grinder
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (12)
As of 2014-07-10 09:17 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    When choosing user names for websites, I prefer to use:








    Results (203 votes), past polls