Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Data structure examiner

by dbw (Beadle)
on Sep 05, 2007 at 15:55 UTC ( #637187=perlquestion: print w/ replies, xml ) Need Help??
dbw has asked for the wisdom of the Perl Monks concerning the following question:

This is part of my continuing quest to grok references in Perl.

I've written a function that pretty-prints a data structure recursively, akin to the behavior of the debugger's x command and the package GraphViz::Data::Grapher. By "data structure", I mean a reference that may contain references to references of hashes of references to arrays of typeglobs (for example).

Here's the script, which contains a test data structure:

#!/usr/bin/perl # lets you examine data recursively, following references. use warnings; use strict; # control behavior: # never go deeper than MAXDEPTH (saves us from circular shenanigans) use constant MAXDEPTH => 16; # cut off long hash keys use constant MAXHASHKEYLEN => 16; # if deferencing through multiple references ($$$$node), collapse the # references? use constant MULTICOLLAPSE => 1; # if deferencing through multiple references ($$$$node), indent # accordingly? use constant MULTIINDENT => 0; # show the indices on array elts? use constant ARRAYIDX => 1; # control the output format: # for references to references: \$var, \\\%hash, etc. use constant DEFERENCE => '->'; # for truncating long or nasty hash keys use constant TRUNCATE => '|..'; # for printing hashes use constant HASHOPEN => '{'; use constant HASHCLOSE => '}'; use constant HASHKEY => '=>'; use constant HASHSEP => ','; # for printing arrays use constant ARRAYOPEN => '['; use constant ARRAYCLOSE => ']'; use constant ARRAYSEP => ','; # for printing array indices use constant ARIDXOPEN => '['; use constant ARIDXCLOSE => ']'; # for printing strings use constant STROPEN => '"'; use constant STRCLOSE => '"'; # for printing undefined scalars use constant UNDEF => 'undef'; # for extra horizontal spacing (recommended ' ') use constant EXTRASPACE => ' '; # for vertical spacing (recommended "\n") use constant NEWLINE => "\n"; # indentation string (set to null if NEWLINE != "\n") use constant INDENT => ' 'x4; my $depth = 0; sub indent () { print INDENT x $depth; } sub newline () { print NEWLINE; } # is it a number? Put it in its own sub to keep warnings on elsewhere sub numberp ($) { no warnings; my $s = shift; return ($s == 0 && $s ne '0'); } # quote non-numbers, hack around "" sub formatscalar ($) { my $str = shift; if (! defined($str)) { # return UNDEF; } elsif (numberp($str)) { # not a number return STROPEN."$str".STRCLOSE; } else { # a number return $str; } } # truncate degenerate hash keys sub formatkey ($) { my $key = shift; if ( # truncate at newline ($key =~ s/\n.*//m) or # truncate at nonprinting chars ($key =~ s/[^[:print:]].*//m) or # truncate if it's too long ($key =~ s/(.{@{[MAXHASHKEYLEN - length TRUNCATE]}}).+/$1/) or # this can only serve to confuse ($key =~ s/@{[HASHKEY]}.*//) ) { # then we truncated. $key .= TRUNCATE; } elsif ( length $key == 0 ) { $key = UNDEF; } return $key; } # recursive, so prototype first sub printnode ($); sub printnode ($) { my $node = shift; my $ref = ref $node; if ($depth > MAXDEPTH) { warn 'Depth exceeded '.MAXDEPTH.', possible circular reference +'; return; } if ($ref eq 'HASH') { print HASHOPEN; newline; for (sort keys %{ $node }) { ++$depth; indent; print formatkey($_),EXTRASPACE.HASHKEY.EXTRASPACE +; printnode(${ $node }{$_}); --$depth; print HASHSEP; newline; } indent; print HASHCLOSE; } elsif ($ref eq 'ARRAY') { print ARRAYOPEN; newline; my $index = 0; for (@{ $node }) { ++$depth; indent; print "[@{[$index++]}]".EXTRASPACE if ARRAYIDX; printnode($_); --$depth; print ARRAYSEP; newline; } indent; print ARRAYCLOSE; } elsif ( $ref eq '' ) { print formatscalar($node); } elsif ( $ref eq 'CODE' or $ref eq 'LVALUE' ) { print $node; } elsif ( $ref eq 'GLOB' ) { print $$node; } elsif ( $ref eq 'REF' or $ref eq 'SCALAR') { if (MULTICOLLAPSE) { my $refdepth=1; while ($ref eq 'REF') { $node = $$node; $ref = ref $node; $refdepth++; } if ($ref eq 'SCALAR') { $node = $$node; } else { $refdepth--; } $depth += $refdepth if MULTIINDENT; print DEFERENCE.(($refdepth>1)?"^$refdepth":'').EXTRASPACE +; newline if MULTIINDENT; indent if MULTIINDENT; printnode($node); $depth -= $refdepth if MULTIINDENT; } else { print DEFERENCE.EXTRASPACE; $depth++ if MULTIINDENT; printnode($$node); $depth-- if MULTIINDENT; } } else { die "ref returned unknown value $ref" } } sub examinedata($) { my $node = shift; printnode($node); newline; } my $variable; my $variable2 = 140; my $hashref; $hashref = { key => $hashref }; sub code () {} examinedata( { hash => $hashref, test => { test => 1, }, test2 => { test3 => 4, test4 => [(1..10)], test5 => \&code, }, test6 => [ \\\\$variable2, \\\\$variable, \\\$variable2, \\\$variable, \\$variable2, \\$variable, \$variable2, \$variable, $variable2, $variable, \\&code, \&code, \\\\\\\[1..2], \\\[1..2], \\[1..2], \[1..2], [1..2], ], test7 => [ "test", \\*GLOB, \*GLOB, *GLOB, ], test8 => \\@{ [ "test", \\*GLOB, \*GLOB, *GLOB, ] }, "This is a long key." => undef, "This\n is a stupid key." => undef, "\n This one's stupider still!" => undef, "ASCII 020\020 = Data Link Escape. WTF?" => undef, "LEET=>HAX" => undef, $variable => \$variable, 100 => \$variable, } );
It works quite well on (relatively) simple data structures such as {array => [1..10], hash => { key => \\\( "string" )}} and so forth. However, I've got some questions:
  1. What about my test structure is emitting the warning "Use of uninitialized value in anonymous hash ({}) at ./examinedata.pl line 205." ??
  2. The documentation for ref (for which my code is essentially a wrapper) mentions a possible return value of LVALUE. When does this happen?
  3. The docs also mention different behavior "If the referenced object has been blessed into a package". (I don't really have a question about this, other than that I don't understand it. Grokking object-oriented Perl is my next quest.)
  4. How can I create a circular reference? If I do, is there a sane way to represent it?
  5. Is there a sexier way to handle code references? In particular, can I get it to print the name of the subroutine (if there is one)?
  6. How can I descend into globs?
  7. Is there a way I can use the debugger to stop at a certain point and execute my code (if, say I prefer to use it over the debugger's x command)?
Thanks!

/usr/bin/perl '-nemap$.%$_||redo,2..$.++;print$.--'

Comment on Data structure examiner
Select or Download Code
Re: Data structure examiner
by ikegami (Pope) on Sep 05, 2007 at 16:06 UTC

    $variable is never assigned a value, so the stringification of $variable when it's used as a keyissues a warning.

    $variable => \$variable,

    It was very easy to debug by removing elements from the hash until the error disappeared.

    By the way, are you aware of Data::Dumper? It does the same thing as your code.

    use Data::Dumper; $Data::Dumper::Useqq = 1; my $x = { ... }; print Dumper $x;
Re: Data structure examiner
by Fletch (Chancellor) on Sep 05, 2007 at 16:16 UTC

    Tangentially of note: your use of prototypes (save the () on newline possibly) is extraneous and subject to the same advice from the second paragraph that I just wrote yesterday . . .

    Update: And the () prototype on indent combined with the action-at-a-distance dependency on the file-scope lexical $depth raises the hackles as well (even if you were using prototypes needfully).

Re: Data structure examiner
by FunkyMonk (Canon) on Sep 05, 2007 at 16:21 UTC
    The documentation for ref (for which my code is essentially a wrapper) mentions a possible return value of LVALUE. When does this happen?
    An LVALUE is something that can be assigned to (ie can appear on the left hand side of =. An easy way to see it returned from ref is by...
    my $str = "hello"; print ref \substr( $str, 0, 2 );

    The docs also mention different behavior "If the referenced object has been blessed into a package". (I don't really have a question about this, other than that I don't understand it. Grokking object-oriented Perl is my next quest.)
    Compare the output from these two print statements
    print ref {}; print ref bless {}, "MyObject";

    How can I create a circular reference? If I do, is there a sane way to represent it?
    See what Data::Dumper does with circular references? DD is usually quite sensible.
    Is there a sexier way to handle code references? In particular, can I get it to print the name of the subroutine (if there is one)?
    You could look at B::Deparse, which will decompile a coderef.

    Update: I don't think you can get the sub's name, though

    Is there a way I can use the debugger to stop at a certain point and execute my code (if, say I prefer to use it over the debugger's x command)?
    Yes. If you start the program running under the debugger, $DB::single = 1 will force entry into the debugger. You can then execute any Perl code you like. (It might be DB::Single, but I'm fairly sure it's lowercase 's'.)

    update: fixed a couple of typos & added deparse disclaimer

Re: Data structure examiner
by TGI (Vicar) on Sep 05, 2007 at 17:58 UTC

    You probably want to use Scalar::Util's reftype instead of ref. It will handle blessed references properly for you.

    For example:

    use Scalar::Util qw(reftype); my $foo = bless {}, 'Artichoke'; print join "\n", ref $foo, reftype $foo, ; __END__ # Output: Artichoke HASH


    TGI says moo

Re: Data structure examiner
by planetscape (Canon) on Sep 05, 2007 at 18:27 UTC
Re: Data structure examiner
by bsb (Priest) on Sep 06, 2007 at 03:49 UTC
Re: Data structure examiner
by pemungkah (Priest) on Sep 06, 2007 at 04:03 UTC
    And there's GraphViz::Data::Structure, if you want pictures instead of text. Integrate that into the debugger with Devel::Command::Viz. You might find the internals of that module interesting; I stole some code from dumpvar.pl to handle coderefs and globs.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://637187]
Approved by ww
Front-paged by grinder
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (16)
As of 2014-07-31 14:51 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (249 votes), past polls