PerlMonks  

Re^2: In need of a Dumper that has no pretentions to being anything else.

by BrowserUk (Pope)
on Feb 23, 2005 at 02:11 UTC ( #433547=note )


in reply to Re: In need of a Dumper that has no pretentions to being anything else.
in thread In need of a Dumper that has no pretentions to being anything else.

if a structure contains a reference to itself ... otherwise it would loop forever.

That's my problem, if I give it a self-referential structure--I won't.

I wouldn't worry about the memory use of Data::Dumper.

Um. Er. But... it keeps crashing my program by exhausting all the memory! But that's okay, fergal says: "Don't worry about it!"

I assume that means you don't know a proper dumper module?


Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.


Re^3: In need of a Dumper that has no pretentions to being anything else.
by merlyn (Sage) on Feb 23, 2005 at 02:14 UTC
      but your request is self-conflicting, and you haven't said exactly what you don't like about Data::Dumper.

      Actually, I thought I had spelt out what I was looking for pretty carefully.

      I am dealing with a huge (> 500 MB) heavily nested data structure consisting of lots of small hashes and arrays. Data::Dumper

    • Consumes huge amounts of memory (pushing my machine into swapping) checking for circular references when I know there will be none.
    • Either dumps everything one element per line, indented, or totally flattened without any structure.
    • Produces this
          use Math::Pari qw[ :int factorint sqrtint divisors PARI ];
          $f = factorint 1000000;
          print Dumper $f;

          $VAR1 = bless( [
                           bless( [
                                    bless( do{\(my $o = 33884400)}, 'Math::Pari' ),
                                    bless( do{\(my $o = 33884376)}, 'Math::Pari' )
                                  ], 'Math::Pari' ),
                           bless( [
                                    bless( do{\(my $o = 33884388)}, 'Math::Pari' ),
                                    bless( do{\(my $o = 33884364)}, 'Math::Pari' )
                                  ], 'Math::Pari' )
                         ], 'Math::Pari' );
          Attempt to free unreferenced scalar: SV 0x19f41f4 at c:\Perl\bin\p1.pl line 14, <STDIN> line 3.
          Attempt to free unreferenced scalar: SV 0x19f42cc at c:\Perl\bin\p1.pl line 14, <STDIN> line 3.

      Or this

          $Data::Dumper::deepcopy=1;
          print Dumper $f;

          $VAR1 = bless( [
                           bless( [
                                    bless( do{\(my $o = 33884400)}, 'Math::Pari' ),
                                    bless( do{\(my $o = 33884376)}, 'Math::Pari' )
                                  ], 'Math::Pari' ),
                           bless( [
                                    bless( do{\(my $o = 33884388)}, 'Math::Pari' ),
                                    bless( do{\(my $o = 33884364)}, 'Math::Pari' )
                                  ], 'Math::Pari' )
                         ], 'Math::Pari' );
          Attempt to free unreferenced scalar: SV 0x19f43bc at c:\Perl\bin\p1.pl line 14, <STDIN> line 5.

      Or this

          $Data::Dumper::Indent=0;
          print Dumper $f;

          $VAR1 = bless( [bless( [bless( do{\(my $o = 33884400)}, 'Math::Pari' ),bless( do{\(my $o = 33884376)}, 'Math::Pari' )], 'Math::Pari' ),bless( [bless( do{\(my $o = 33884388)}, 'Math::Pari' ),bless( do{\(my $o = 33884364)}, 'Math::Pari' )], 'Math::Pari' )], 'Math::Pari' );
          Attempt to free unreferenced scalar: SV 0x19f37c8 at c:\Perl\bin\p1.pl line 14, <STDIN> line 7.

      When what I want is something more akin to this:

          print "[@$_]" for @$f;

          [2 5]
          [6 6]

      Or this

          print "[@{[ join', ', map{ \"[@$_]\" } @$f ]}]";

          [[2 5], [6 6]]

      Except that there are thousands of arrays at varying depths of nesting.

      I can write one myself, perhaps based around Data::Rmap or similar, but I thought I'd look and see if there is an existing one available. My search didn't turn up anything promising, but it seems a simple enough requirement that someone might know of one or have one already written?
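
      For illustration only, here is a minimal sketch of the kind of dumper being asked for: a plain recursive walk that keeps no record of visited references and emits one compact string. The name compact_dump is hypothetical, not an existing module function, and the sketch assumes the structure contains no circular references (it would recurse without end on one).

```perl
use strict;
use warnings;

# Hypothetical helper: render nested arrays/hashes compactly.
# Assumes NO circular references; nothing is tracked, so a
# self-referential structure would recurse until memory runs out.
sub compact_dump {
    my ($d) = @_;
    my $type = ref $d;

    # Plain scalars (and undef) are emitted as-is.
    return defined $d ? "$d" : 'undef' unless $type;

    return '[' . join( ', ', map { compact_dump($_) } @$d ) . ']'
        if $type eq 'ARRAY';

    return '{'
        . join( ', ', map { "$_ => " . compact_dump( $d->{$_} ) } sort keys %$d )
        . '}'
        if $type eq 'HASH';

    # Other refs and blessed objects: fall back to stringification.
    return "$d";
}

print compact_dump( [ [ 2, 5 ], [ 6, 6 ] ] ), "\n";    # prints [[2, 5], [6, 6]]
```

      Blessed objects are not descended into here; they are simply stringified, so a class with overloaded stringification decides for itself how it appears.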


        Apart from the "Attempt to free unreferenced scalar" stuff (which seems like a bug), what's wrong with this? Math::Pari objects are just a simple scalar blessed into a class as far as Perl is concerned. Nothing is going to dump them out as anything more sensible unless it specifically knows how to understand Math::Pari objects.

        You'll need to play with DD's Freezer stuff to get anywhere on that.

Re^3: In need of a Dumper that has no pretentions to being anything else.
by fergal (Chaplain) on Feb 23, 2005 at 02:19 UTC
    That sounds like a serious bug in DD. In order to catch circularity it only needs to keep a 4 byte hash key and a short string for every reference in the structure. Unless your structure is full of almost empty arrays, hashes and scalar refs, this should take less memory than your structure.

    Update: Just dug into DD and I see it's storing a 2-element arrayref for each ref it finds. That's still not very much and, unless you have an unusual structure, it should be negligible. The only other thing is that it stores a copy of the hash key in that 2-element array. If your keys are very big then that could be a problem; however, you'd still only be at most doubling things.
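
    As a rough illustration of the bookkeeping described above, the sketch below walks a structure and records each reference it visits in a "seen" table keyed by address. It is not Data::Dumper's code: count_seen is a hypothetical name, and it stores only a counter per reference rather than DD's 2-element arrayref. It does show why the table grows by one entry per reference in the structure.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Illustrative only: count how many entries a "seen" table needs,
# one per distinct reference visited.
sub count_seen {
    my ( $d, $seen ) = @_;
    return 0 unless ref $d;

    my $id = refaddr $d;
    return 0 if $seen->{$id}++;    # already visited: shared or circular ref

    my $n = 1;
    if ( ref $d eq 'ARRAY' ) {
        $n += count_seen( $_, $seen ) for @$d;
    }
    elsif ( ref $d eq 'HASH' ) {
        $n += count_seen( $_, $seen ) for values %$d;
    }
    return $n;
}

my %h = map { $_ => [ 1 .. 10 ] } 'aa' .. 'zz';
print count_seen( \%h, {} ), "\n";    # 1 hash + 676 arrays = 677
```

    A structure made of many small arrays therefore needs roughly as many table entries as it has containers, which is where the overhead becomes noticeable.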

    Update again: It was a DD bug, see below.

      Okay. Try this:

          #! perl -slw
          use strict;
          use Data::Dumper;

          my %h;
          $h{ $_ } = [ 1 .. 10 ] for 'aaaa' .. 'zzzz';
          print Dumper \%h;

      Add whatever Dumper options you like. Prior to the Dump, this hash with somewhat under 500,000 keys and a smallish array for each value consumes ~ 177 MB of ram.

      Attempting to dump it pushes that memory consumption (transiently on Win32) to well over 800 MB (and consumption is still climbing after 3/4 of an hour!).

      My real hash has close to a million keys and nested arrays. It consumes over 500 MB to start with. Trying to dump it blows 2 GB of virtual memory before it crashes Perl--and the time taken even before swapping starts is measured in the half-lives of Plutonium. I'd like to avoid both. I just need to be able to dump the structure to a file. Preferably in a reasonably compact format.
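
      One way to keep memory flat while dumping to a file is to print each piece as it is reached instead of building the whole output string first. The following is a sketch under that assumption; stream_dump is a hypothetical helper, it assumes no circular references, and the demo writes to an in-memory filehandle only so that it is self-contained (a real run would open a file instead).

```perl
use strict;
use warnings;

# Hypothetical helper: write the structure straight to a filehandle,
# piece by piece, so no full-size output string is ever built.
# Assumes NO circular references.
sub stream_dump {
    my ( $fh, $d ) = @_;
    my $type = ref $d;

    if ( $type eq 'ARRAY' ) {
        print {$fh} '[';
        my $sep = '';
        for my $e (@$d) {
            print {$fh} $sep;
            stream_dump( $fh, $e );
            $sep = ', ';
        }
        print {$fh} ']';
    }
    elsif ( $type eq 'HASH' ) {
        print {$fh} '{';
        my $sep = '';
        # each() avoids building a list of all keys up front.
        while ( my ( $k, $v ) = each %$d ) {
            print {$fh} "$sep$k => ";
            stream_dump( $fh, $v );
            $sep = ', ';
        }
        print {$fh} '}';
    }
    else {
        print {$fh} defined $d ? "$d" : 'undef';
    }
}

# Demo: an in-memory filehandle stands in for a real output file.
open my $out, '>', \my $buf or die "open: $!";
stream_dump( $out, { k => [ 1, [ 2, 3 ] ] } );
close $out;
print "$buf\n";    # prints {k => [1, [2, 3]]}
```

      Hash keys come out in whatever order each() yields them; sorting them would mean holding the full key list in memory, which is the cost this sketch tries to avoid.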


        Try the following patch to Data/Dumper.pm. Also, turn off Deepcopy as in my first post. It should make a huge difference. I'll file a bug and submit the patch.

            --- ./ok/Data/Dumper.pm.orig 2005-02-22 20:17:13.000000000 -0800
            +++ ./ok/Data/Dumper.pm 2005-02-22 20:16:47.000000000 -0800
            @@ -405,7 +405,7 @@
                 my $ref = \$_[1];
                 # first, catalog the scalar
            -    if ($name ne '') {
            +    if ($s->{deepcopy} and ($name ne '')) {
                   ($id) = ("$ref" =~ /\(([^\(]*)\)$/);
                   if (exists $s->{seen}{$id}) {
                     if ($s->{seen}{$id}[2]) {

        Updated: And do $Data::Dumper::Useperl = 1;
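
        For reference, the settings mentioned in this reply can be set as package globals before calling Dumper. This is only a sketch of the configuration. The patch edits the Perl source of Dumper.pm, so presumably it only matters when the pure-Perl code path is in use, which is what Useperl selects; the small hash here is a stand-in for the real data.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useperl  = 1;    # use the pure-Perl implementation, not the XS one
$Data::Dumper::Deepcopy = 0;    # the default; set here only to make it explicit
$Data::Dumper::Indent   = 1;    # fixed indentation, less whitespace than the default

# Stand-in data; the real structure in this thread is far larger.
my %h = ( aaaa => [ 1 .. 3 ] );
print Dumper \%h;
```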
