Perl object memory overhead

by Ineffectual (Scribe)
on Mar 27, 2014 at 22:46 UTC ( [id://1080016]=perlquestion )

Ineffectual has asked for the wisdom of the Perl Monks concerning the following question:

Hi all!

If I have 3 million small hashes and change them into 3 million objects (of the same type), how much would performance and memory be impacted?

Thanks!

Replies are listed 'Best First'.
Re: Perl object memory overhead
by shmem (Chancellor) on Mar 27, 2014 at 23:21 UTC

    There's no difference, since an allocated hash is already set up for being blessed.

    push @l,{} for 1..3e7;        # allocate 3 mill. anon hashes
    system "ps -o vsz= -p $$";    # get memory usage of process
    $_ = bless $_ for @l;         # bless each hash into 'main'
    system "ps -o vsz= -p $$";    # again get memory usage
    __END__
    2650968
    2650968

    which shows: making an anonymous hash into an object means blessing it into a namespace. That operation adds no overhead as far as memory is concerned, since the namespace bits are already allocated in the first place.

    update: added comments

    update: AnomalousMonk correctly noted that 3e7 for 3 millions is wrong by an order of magnitude. Who else noticed? ;)
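
A minimal sketch of the same point using only core modules. It assumes nothing beyond `Scalar::Util`; the class name `Small` is made up for illustration. It only shows that `bless` marks the existing hash in place instead of creating a new one; it does not measure process memory the way the `ps` calls above do.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype blessed);

my $h = { a => 1, b => 2, c => 3 };
my $addr_before = refaddr($h);      # address of the underlying hash

bless $h, 'Small';                  # 'Small' is a hypothetical class name

my $addr_after = refaddr($h);       # same hash, now tagged with a package

print "same hash after bless: ",
    ( $addr_before == $addr_after ? "yes" : "no" ), "\n";
print "class: ", blessed($h), ", underlying type: ", reftype($h), "\n";
print "keys: ", join( ",", sort keys %$h ), "\n";
```

The hash contents are untouched; only the association with a package is added.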

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
      > system "ps -o vsz= -p $$";

      do you happen to know how to best replicate this on windows?

      FWIW: I was experimenting with something like

      PS D:\tmp\pm> perl -e'print `powershell (ps -id $$).pm`'
      1564672

      not sure if there is a better way (or if it's fully equivalent).

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        Sadly (or not :-) no. I don't do Windows if I can avoid it, only on specific points if needed. In those cases Perl almost always provides me longbow and sword to attack the enemy's bugs, e.g. patching binaries in place or such :P

Re: Perl object memory overhead
by BrowserUk (Patriarch) on Mar 28, 2014 at 02:00 UTC

    If you have 3 million hashes, then you need to be able to refer to them, so, you've probably got them, or rather, references to them, stored in an Array (an @AoH):

    @AoH = map{ a=>1, b=>2, c=>3 }, 1 .. 3e6;;
    say total_size( \@AoH );;
    936000209

    To turn a hashref into an object, you need to bless it. The effect on memory is:

    bless $_, 'OO' for @AoH;;
    say total_size( \@AoH );;
    936000209

    Zilch!

    However, then you need to manipulate the contents of those 'objects'.

    Manipulate those 3e6 objects via the inconvenience of object notation and accessors:

    sub OO::a{ $_[0]->{a}=$_[1] if defined $_[1]; $_[0]->{a} };;
    $t = time; $_->a( $_->a() +1 ) for @AoH; say time - $t;;;
    4.60931897163391

    Now perform that same manipulation on those very same "objects" the easy way:

    $t = time; ++$_->{a} for @AoH; say time() - $t;;
    0.931627988815308

    Of course, that wasn't 'proper OO' above because my methods manipulated the object properties directly, so let's fix that:

    sub OO::get_a{ $_[0]->{a} }; sub OO::set_a{ $_[0]->{a} = $_[1] };;
    sub OO::get_b{ $_[0]->{b} }; sub OO::set_b{ $_[0]->{b} = $_[1] };;
    sub OO::get_c{ $_[0]->{c} }; sub OO::set_c{ $_[0]->{c} = $_[1] };;
    sub OO::adjust_a{ $_[0]->set_a( $_[0]->get_a() + 1 ) };;
    $t = time; $_->adjust_a() for @AoH; say time() - $t;;
    4.95501685142517

    But you're still not 'doing it properly' because you didn't name your arguments:

    sub OO::get_a{ my( $o ) = @_; $o->{a} }; sub OO::set_a{ my( $o, $v ) = @_; $o->{a} = $v };;
    sub OO::get_b{ my( $o ) = @_; $o->{b} }; sub OO::set_b{ my( $o, $v ) = @_; $o->{b} = $v };;
    sub OO::get_c{ my( $o ) = @_; $o->{c} }; sub OO::set_c{ my( $o, $v ) = @_; $o->{c} = $v };;
    sub OO::adjust_a{ my( $o ) = @_; $o->set_a( $o->get_a() + 1 ) };;
    $t = time; $_->adjust_a() for @AoH; say time() - $t;;
    6.20793199539185

    Still wrong! I didn't check my arguments:

    sub OO::get_a{ my( $o ) = @_; die unless ref( $o ) eq 'OO'; $o->{a} };
    sub OO::set_a{ my( $o, $v ) = @_; die unless ref( $o ) eq 'OO'; die unless looks_like_number( $v ); $o->{a} = $v };;
    sub OO::get_b{ my( $o ) = @_; die unless ref( $o ) eq 'OO'; $o->{b} };
    sub OO::set_b{ my( $o, $v ) = @_; die unless ref( $o ) eq 'OO'; die unless looks_like_number( $v ); $o->{b} = $v };;
    sub OO::get_c{ my( $o ) = @_; die unless ref( $o ) eq 'OO'; $o->{c} };
    sub OO::set_c{ my( $o, $v ) = @_; die unless ref( $o ) eq 'OO'; die unless looks_like_number( $v ); $o->{c} = $v };;
    $t = time; $_->adjust_a() for @AoH; say time() - $t;;
    8.23622608184814

    Of course, writing all those setters and getters manually is just so 'Legacy Perl'; there are packages that'll do that for me:

    use Moose; ...

    Best we don't look. The price of modern convenience!
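
For anyone who wants to repeat the comparison as a standalone script, here is a rough sketch using the core `Benchmark` module. It is not the session above: it builds its own objects through a made-up `new` constructor, uses 100_000 objects rather than 3e6 to keep the run short, and covers only the direct, accessor, and nested-accessor cases. Absolute numbers will depend on the machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

{
    package OO;
    # new() is a hypothetical constructor, added for this sketch only
    sub new      { my ($class, %args) = @_; return bless {%args}, $class }
    sub get_a    { $_[0]->{a} }
    sub set_a    { $_[0]->{a} = $_[1] }
    sub adjust_a { $_[0]->set_a( $_[0]->get_a() + 1 ) }
}

my $n    = 100_000;   # assumption: much smaller than 3e6, for a quick run
my @objs = map { OO->new( a => 1, b => 2, c => 3 ) } 1 .. $n;

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    direct   => sub { ++$_->{a} for @objs },                      # plain hash access
    accessor => sub { $_->set_a( $_->get_a() + 1 ) for @objs },   # two method calls
    nested   => sub { $_->adjust_a() for @objs },                 # three method calls
} );
```

`cmpthese` reports rates and relative percentages, which makes the cost of each extra layer of method calls easy to read off.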


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Indeed; writing all those getters and setters manually is inconvenient.

              Rate    OO MOOSE
      OO    3.47/s    --  -47%
      MOOSE 6.56/s   89%    --
      use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

        Ooh! Do I spy a benchmark special case like the ones that compiler makers used to add to get good scores? :)

        isa => 'Int', traits => ['Counter'], handles => { adjust_a => 'inc' },

        What if you changed the increment to +33?

        Also, it would be really useful to see the timing of adjust_a() run on the 3e6 objects, along with memory consumption?


Re: Perl object memory overhead
by davido (Cardinal) on Mar 27, 2014 at 23:18 UTC

    Not enough information to provide a useful answer. In fact, even if you provided more information, it would probably still be necessary to profile, and benchmark alternatives.

    Minimally, we would need to know a lot more about the design: will the objects have methods called on them which are currently handled by plain old subroutines? What object framework will you use? Have you determined (before making the switch) where performance bottlenecks are most significant in your existing code?

    Converting a script that passes references to hashes as parameters to subroutines, into a script that calls object methods on objects that hold references to hashes will have some non-zero performance impact. The impact doesn't come from the fact that there's a blessed hash involved, but rather, that method calls have to go through a layer of lookups to decide what method to invoke out of the inheritance chain. But anyone who is eager to quantify the impact without methodical testing, profiling and benchmarking is just guessing.
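
To make the difference concrete, here is a small sketch of the two calling styles. The `Animal` and `Dog` packages and the `speak_plain` sub are invented for illustration. Perl caches method lookups internally, so the sketch shows only where the lookup happens, not how much it costs; that part still needs benchmarking.

```perl
use strict;
use warnings;

{
    package Animal;                  # hypothetical base class
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub speak { my ($self) = @_; return "$self->{name} makes a sound" }

    package Dog;                     # hypothetical subclass with no methods of its own
    our @ISA = ('Animal');           # speak() is found by walking @ISA
}

# Plain subroutine taking a hash reference: no method lookup involved.
sub speak_plain { my ($h) = @_; return "$h->{name} makes a sound" }

my $dog = Dog->new( name => 'Rex' );

my $via_method = $dog->speak;             # looked up in Dog, then in Animal
my $via_sub    = speak_plain($dog);       # called directly
my $code       = $dog->can('speak');      # resolve once, keep the code reference
my $via_cached = $code->($dog);           # call without another lookup

print "$_\n" for $via_method, $via_sub, $via_cached;
```

All three calls return the same string; what differs is how Perl finds the code to run.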


    Dave

Re: Perl object memory overhead
by Anonymous Monk on Mar 27, 2014 at 23:19 UTC
    not much
    $ perl -MDevel::Peek -e " %f = qw/ AAA BBB /; $q = bless{ qw/ CCC DDD / }, q/TheQ/; Dump(\%f); Dump($q); "
    SV = IV(0x3f9bc8) at 0x3f9bcc
      REFCNT = 1
      FLAGS = (TEMP,ROK)
      RV = 0x99b9fc
      SV = PVHV(0x3ff39c) at 0x99b9fc
        REFCNT = 2
        FLAGS = (SHAREKEYS)
        ARRAY = 0x994124  (0:7, 1:1)
        hash quality = 100.0%
        KEYS = 1
        FILL = 1
        MAX = 7
        RITER = -1
        EITER = 0x0
        Elt "AAA" HASH = 0x320d3b3d
        SV = PV(0x3f7a4c) at 0x3f9a9c
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x994304 "BBB"\0
          CUR = 3
          LEN = 12
    SV = IV(0x3f9b98) at 0x3f9b9c
      REFCNT = 1
      FLAGS = (ROK)
      RV = 0x3f9b7c
      SV = PVHV(0x3fecec) at 0x3f9b7c
        REFCNT = 1
        FLAGS = (OBJECT,SHAREKEYS)
        STASH = 0x3f9c1c    "TheQ"
        ARRAY = 0xa78284  (0:7, 1:1)
        hash quality = 100.0%
        KEYS = 1
        FILL = 1
        MAX = 7
        RITER = -1
        EITER = 0x0
        Elt "CCC" HASH = 0x78f11fea
        SV = PV(0x3f7a74) at 0x3f9b8c
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x9a0c3c "DDD"\0
          CUR = 3
          LEN = 12
    $ perl -MDevel::Size=total_size -l -
    %f = qw/ AAA BBB /;
    $q = bless{ qw/ CCC DDD / }, q/TheQ/;
    print for total_size(\%f ), total_size($q);
    __END__
    137
    137
Re: Perl object memory overhead
by ww (Archbishop) on Mar 27, 2014 at 23:27 UTC

    TITS for your answer


    Questions containing the words "doesn't work" (or their moral equivalent) will usually get a downvote from me unless accompanied by:
    1. code
    2. verbatim error and/or warning messages
    3. a coherent explanation of what "doesn't work" actually means.
Re: Perl object memory overhead
by sundialsvc4 (Abbot) on Mar 27, 2014 at 23:45 UTC

    The biggest “performance impact,” obviously, is going to arise from there being millions of just-about-anythings, all in memory, all at the same time.   Simple page-faults are probably going to eat up both your breakfast and your lunch.   You can get away with this sort of thing if you’ve got gobs and gobs of real RAM, and a 64-bit environment, but the design here is IMHO inferior.   If you can possibly reduce that memory-footprint, I think you’ll be very glad that you did.

    bless(), by itself, has very little impact, as has been shown.   But what’s kinda scarin’ me is the thought of all those method calls.   If you really do intend to have three million things in-memory at once and for those things to be objects that are all “really, highly-active,” this is a design that’s really going to be fighting some uphill battles in terms of performance.   Can this really not be simplified in some way?
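
One possible way to shrink the footprint, sketched under a strong assumption: every record has the same three fields and each value fits a signed 32-bit integer. In that case all records can live in a single packed string instead of millions of hashes. The helper names `get_field` and `set_field` are made up for this example, and whether the trade-off is worth it depends on the real data.

```perl
use strict;
use warnings;

my $FIELDS   = 3;                  # fields a, b, c at indexes 0, 1, 2
my $REC_SIZE = 4 * $FIELDS;        # 'l' packs a signed 32-bit integer (4 bytes)
my $n        = 1000;               # small count for the sketch

# All records in one string: 12 bytes each, initial values a=1, b=2, c=3.
my $store = pack( 'l3', 1, 2, 3 ) x $n;

# Read field $f of record $i from the packed string.
sub get_field {
    my ( $sref, $i, $f ) = @_;
    return unpack 'l', substr( $$sref, $i * $REC_SIZE + 4 * $f, 4 );
}

# Overwrite field $f of record $i in place (4-argument substr).
sub set_field {
    my ( $sref, $i, $f, $v ) = @_;
    substr( $$sref, $i * $REC_SIZE + 4 * $f, 4, pack( 'l', $v ) );
    return;
}

# Increment field 'a' of record 10.
set_field( \$store, 10, 0, get_field( \$store, 10, 0 ) + 1 );

print "record 10, a: ", get_field( \$store, 10, 0 ), "\n";
print "bytes used:   ", length($store), "\n";
```

Access becomes more awkward than `$h->{a}`, so this only pays off when memory really is the bottleneck.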

      this is a design that’s really going to be fighting some uphill battles in terms of performance. Can this really not be simplified in some way?

      Well, the best way to avoid battles is: keeping peace.

      SCNR.


        Well, the best way to avoid battles is: keeping peace.

        It's easy to do with carrots

Node Type: perlquestion [id://1080016]
Approved by shmem