PerlMonks  

The costs of packages

by BrowserUk (Pope)
on Sep 16, 2003 at 01:36 UTC ( #291700=perlquestion )
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on something which will use lots of different 'types', literally hundreds. To avoid having to load all of the code for all of the types, it would be convenient to put each type in its own package.

Each of the packages would consist of an AUTOLOAD sub, which would actually resolve back to a single AUTOLOAD sub in a parent class from which each of the others would be subclassed.

The superclass AUTOLOAD would then take care of generating the actual class upon demand. This would allow the programmer to use the superclass in his code and then just use the types as required and have the code to support them generated upon demand.

#! perl -slw
use strict;
use My::Types;

my $typeA = new TypeA;

In the above, use My::Types; would load (actually, generate) a package for each of the types that would notionally look something like

package TypeA;
*AUTOLOAD = \&My::Types::AUTOLOAD;

Which causes my $typeA = new TypeA; to end up calling My::Types::AUTOLOAD with the name of the type being used and that will generate the rest of the package required to support the type before passing control to the constructor (new) for that type and then go on normally from there.
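A minimal sketch of the mechanism described above, not the poster's actual module. The stub loop, the `%generated` bookkeeping hash and the bodies of the generated `new` and `type` methods are assumptions made for illustration; only the aliasing of each stub's AUTOLOAD to `My::Types::AUTOLOAD` comes from the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package My::Types;

our $AUTOLOAD;
our %generated;    # hypothetical bookkeeping: classes that have been built

# One shared AUTOLOAD. Every stub package aliases its own AUTOLOAD to it.
sub AUTOLOAD {
    # Perl sets $AUTOLOAD in the package this sub was compiled in.
    my ($class, $method) = $AUTOLOAD =~ /^(.*)::([^:]+)$/;
    return if $method eq 'DESTROY';    # never generate code for DESTROY

    # Generate the real class the first time any of its methods is wanted.
    unless ($generated{$class}++) {
        eval qq{
            package $class;
            sub new  { my \$c = shift; return bless { \@_ }, \$c }
            sub type { return ref shift }
            1;
        } or die "Could not generate $class: $@";
    }

    my $code = $class->can($method)
        or die "No method '$method' in generated class $class";
    goto &$code;    # hand over to the real method with @_ intact
}

# The minimal stub packages: nothing but an aliased AUTOLOAD each.
for my $type (qw(TypeA TypeB TypeC)) {
    no strict 'refs';
    *{"${type}::AUTOLOAD"} = \&My::Types::AUTOLOAD;
}

package main;

my $typeA = TypeA->new(size => 4);
print $typeA->type, "\n";
```

Once `new` exists in `TypeA`, later calls resolve directly and never reach AUTOLOAD again; the stubs for `TypeB` and `TypeC` stay one glob each until they are used.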

The question is, what is the penalty of having hundreds of minimal packages floating around? In terms of both memory usage and the performance of looking up individual routines within the program?

I don't fully understand how the glob lookup works at runtime. I've played around dumping both the contents of the main package space and the individual package spaces, and there appear to be many typeglob hashes created for each package, which I find confusing.

Does anyone with a better understanding have a feel for the effect that having close to 1000 of these minimal packages floating around, mostly unused, would have upon the program?

The alternative I am considering is to require the programmer to use

...
use My::Types;
my $typeA = new My::Types::TypeA;

Looking around in the debugger, I see keys created in %:: that contain (?) hashes for each package loaded. I guess my fear is that by creating all these little packages I might be creating hundreds of hashes each containing just one key/value pair, most of which would never be used. Would the second form conserve any resources or would it amount to much the same thing?
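One way to look at this from inside Perl is to create the stubs and count the symbol-table hashes (stashes) that result. The sketch below is only that: the package names `FlatN` and `My::Types::NestedN` and the `shared_autoload` stand-in are invented for the test, and it counts stashes rather than measuring bytes (a CPAN module such as Devel::Size would be needed for real memory figures).

```perl
use strict;
use warnings;

sub shared_autoload { return }    # hypothetical stand-in for the shared AUTOLOAD

my $count = 1000;
{
    no strict 'refs';
    for my $n (1 .. $count) {
        *{"Flat${n}::AUTOLOAD"}              = \&shared_autoload;   # first form
        *{"My::Types::Nested${n}::AUTOLOAD"} = \&shared_autoload;   # second form
    }
}

# Each package is one stash, stored under the key "Name::" in its parent stash.
my $flat    = grep { /^Flat\d+::$/ } keys %main::;
my $nested  = grep { /^Nested\d+::$/ } keys %My::Types::;
my @symbols = do { no strict 'refs'; keys %{"Flat1::"} };

print "stashes hanging off main::      : $flat\n";
print "stashes hanging off My::Types:: : $nested\n";
print "symbols in Flat1::              : @symbols\n";
```

Either form creates one stash per stub package; the second form only changes which parent stash the entries hang off.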

Thanks for any insights you can offer.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.

Re: The costs of packages
by dragonchild (Archbishop) on Sep 16, 2003 at 02:24 UTC
    This sounds an awful lot like a factory. Why not have a Types::Factory class, create an instance of it, then do something like my $typeA = $factory->new(type => 'TypeA', ...);? That is not only simple, it keeps you from making AUTOLOAD mistakes (the first being using the damn thing), and it allows for options like:

    my %types;
    foreach my $type (map { "type$_" } qw(A B C D E)) {
        $types{$type} = $factory->new(type => $type, ...);
    }

    Keep It Simple, Stupid

    Second-best piece of advice my Dad ever gave me.

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      > Second-best piece of advice my Dad ever gave me.

      What was the first-best?
        Best advice my Dad ever gave me was "Find a rock to build your life upon." Now, I know he meant to use Jesus for this, but I went my own way and chose a slightly different path. *grins*

        He didn't talk to me for four years cause of it. *shrugs*


Re: The costs of packages
by perrin (Chancellor) on Sep 16, 2003 at 05:10 UTC
    I don't know exactly what you're trying to do, but I'll bet you there's a way to do it with a hash and a single package that would be much more efficient.

      What I am trying to do (and succeeding in doing) is encapsulate several hundred datatypes (think C-style structs and unions) in such a way that they can be imported into a program on demand, without requiring mass pre-declaration as would be the case with

      use My::Types qw[typeA typeB typeZZY];

      or

      use My::Type::typeA;
      use My::Type::typeB;
      use My::Type::typeC;

      or

      use My::Types::Factory;

      my $typeA   = My::Types::Factory->new( 'TypeA' );
      my $typeB   = My::Types::Factory->new( 'TypeB' );
      my $typeZZY = My::Types::Factory->new( 'TypeZZY' );

      my $varTypeA   = $typeA->new();
      my $varTypeB   = $typeB->new();
      my $varTypeZZY = $typeZZY->new();

      I also don't wish to have every program carry the weight of several hundred unused types because it uses 1 of them, hence the need to use the autoload. My inspiration comes from the POSIX package.

      I need the types to each be in a separate package space because each type will have the same set of methods.

      So, whilst I am quite happy with the technique I outlined from the use and implementation point of view, I am a little concerned that the first variation may consume more "glob space" than is necessary. If the second variation would save any substantial amount of memory and/or prevent or reduce any performance hit that (may or may not) result from the first, then I would accept the slightly more verbose syntax of the second over the first.

      However, if there is little or no difference in the overheads of the two variations, then I will go with the former's reduced syntax.

      Now, if you can show me an efficient way of doing this with "a hash and a single package", I'm all ears.



        Actually, the factory approach does not require mass pre-declaration and doesn't need to load anything that you don't use, so it's the best approach if you want to stick with the multiple packages approach. The factory method here would figure out the package name dynamically and then do a require for it, instantiate it, and return it.
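The factory method described here could look roughly like the following. This is a sketch under assumptions: the `My::Types::` prefix, the `type => ...` argument (borrowed from the earlier reply) and the in-file stand-in for `My/Types/TypeA.pm` are illustrative, not part of anyone's posted code.

```perl
use strict;
use warnings;

package My::Types::Factory;

# Hypothetical factory: maps a short type name to a package under My::Types::,
# loads that package on first use, and returns an instance of it.
sub new {
    my ($factory, %args) = @_;
    my $type = delete $args{type} or die "No type given";
    $type =~ /^\w+$/ or die "Bad type name '$type'";

    my $class = "My::Types::$type";
    unless ($class->can('new')) {
        # Not loaded yet: turn My::Types::TypeA into My/Types/TypeA.pm.
        (my $file = "$class.pm") =~ s{::}{/}g;
        require $file;    # dies if the file is not found on @INC
    }
    return $class->new(%args);
}

# Stand-in for a type that would normally live in My/Types/TypeA.pm.
package My::Types::TypeA;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

my $typeA = My::Types::Factory->new(type => 'TypeA', size => 4);
print ref($typeA), "\n";
```

Nothing is declared up front: a type that is never requested is never loaded, and no stub package exists for it.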

        I still think you can do it without multiple packages though, if they all share the same methods. You only need multiple packages if every package has different methods.
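A rough sketch of the single-package idea, assuming each type can be described by data alone. The `%LAYOUT` table, the `POINT` and `RECT` entries and the method names are invented for illustration; the only real API used is Perl's built-in `pack`.

```perl
use strict;
use warnings;

package My::Types;

# Hypothetical layout table: type name => pack template and field names.
my %LAYOUT = (
    POINT => { template => 'l l',     fields => [qw(x y)] },
    RECT  => { template => 'l l l l', fields => [qw(left top right bottom)] },
);

sub new {
    my ($class, $type, %values) = @_;
    my $layout = $LAYOUT{$type} or die "Unknown type '$type'";
    return bless { type => $type, layout => $layout, values => {%values} }, $class;
}

# The same methods serve every type; only the table entry differs.
sub to_bytes {
    my $self = shift;
    return pack $self->{layout}{template},
        map { $self->{values}{$_} // 0 } @{ $self->{layout}{fields} };
}

sub size { return length $_[0]->to_bytes }

package main;

my $pt = My::Types->new(POINT => (x => 3, y => 4));
print $pt->size, "\n";    # two 32-bit signed values: prints 8
```

An unused type costs one hash entry here rather than a package, at the price of giving up per-type classes and `ref`-based type checks.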

        > So, whilst I am quite happy with the technique I outlined from the use and implementation point of view, I am a little concerned that the first variation may consume more "glob space" than is necessary, and if the second variation would save any substantial amount of memory and/or prevent or reduce any performance hit that (may or may not) result from the first, then I would accept the slightly more verbose syntax of the second over the first.

        I don't think that you need to worry about this. Certainly not as long as you are in the hundreds and not hundreds of thousands. Incidentally, worrying about this smells like premature optimization to me... :-) Are there actually symptoms of a problem, or are you just considering best practice?

        A last point regarding AUTOLOAD: you may find the biggest drawback of this approach is speed. I have heard it said that AUTOLOADing a sub spoils the method cache (I don't know if that's only for one package or for all or what), so if speed is an issue you might look at that.


        ---
        demerphq

        <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: The costs of packages
by Abigail-II (Bishop) on Sep 16, 2003 at 08:45 UTC
    Why have minimal packages at all? Would something like this work for you?
    #
    # This is the file My/Types.pm
    #
    package UNIVERSAL;

    use strict;
    use warnings;

    # Note: in the real file the heredoc terminator "--" must start in column 0.
    sub create {
        my $class = $_ [0];
        print "Creating class $class\n";
        eval <<"--";
            package $class;
            sub create {
                my \$class = shift;
                print "Now inside \${class}::create\n";
                bless [] => \$class;
            }
    --
        no strict 'refs';
        my $sub = "${class}::create";
        goto &$sub;
    }

    1;

    __END__

    #!/usr/bin/perl

    use strict;
    use warnings;

    use My::Types;

    my $typeA1 = create TypeA;
    my $typeA2 = create TypeA;
    my $typeB1 = create TypeB;

    __END__
    Creating class TypeA
    Now inside TypeA::create
    Now inside TypeA::create
    Creating class TypeB
    Now inside TypeB::create

    Abigail

      Thanks Abigail. That works for me.



      That's an extremely cool approach! It's actually quite obvious, once it's demonstrated. (As all neat and simple things usually are ...)

      Now, are there maintainability issues here? Why would you put create() in UNIVERSAL instead of exporting it, ala use My::Types qw(create);?

      Also, I would assume that, since you're eval'ing the package, you could have it do inheritance and the like, right? It would be interesting to see a non-trivial example of this in action ...


        > Now, are there maintainability issues here? Why would you put create() in UNIVERSAL instead of exporting it, ala use My::Types qw(create);?

        If you export the create sub, then each time you call create, the create function in My/Types.pm is called, because it's the create function in the current package. With the UNIVERSAL trick, the create function in My/Types.pm is only called once for each class; a second call to create with the same package name as argument is handled directly by the create sub generated in that package. The example program I gave shows this. This is because UNIVERSAL is searched *last*, while the current package is searched *first*, and that's the crucial difference between exporting and using UNIVERSAL.
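The lookup order can be made visible with a counter. This is a cut-down variant of the example given earlier in the thread, not a replacement for it; the `%fallback_calls` hash and the body of the generated `create` are assumptions added for the demonstration.

```perl
use strict;
use warnings;

my %fallback_calls;    # hypothetical counter: how often the fallback ran per class

# Fallback constructor. Method lookup only reaches UNIVERSAL when the class
# itself (and anything in its @ISA) has no create() of its own.
sub UNIVERSAL::create {
    my $class = $_[0];
    $fallback_calls{$class}++;

    # Give the class its own create(), which later lookups will find first.
    eval qq{
        package $class;
        sub create { my \$c = shift; return bless [], \$c }
        1;
    } or die "Could not generate $class: $@";

    goto &{ $class->can('create') };
}

my @objects = map { TypeA->create } 1 .. 3;
print "fallback ran $fallback_calls{TypeA} time(s) for ",
    scalar(@objects), " objects\n";
```

Three objects are built, but the fallback in UNIVERSAL should run only for the first; the other two calls resolve to `TypeA::create` before UNIVERSAL is consulted.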

        > Also, I would assume that, since you're eval'ing the package, you could have it do inheritance and the like, right?

        Uhm, no. As I understood it, the problem BrowserUk was having is that his program typically would use a few classes, but those classes would be picked from potentially hundreds, and the memory of those hundreds of "skeletons" was a concern. The "eval" trick (which might as well have been a 'require') ensures that only packages that are used consume memory. I didn't get the impression that BrowserUk's original approach considered inherited constructors, and I certainly didn't consider them either.

        Abigail

Re: The costs of packages
by Anonymous Monk on Sep 17, 2003 at 06:25 UTC
    > lots of different 'types', literally hundreds

    What is that? Why do you need hundreds of types? Could you give an example type? Just curious, Murat

      As Anonymonk you probably didn't get shown this post, which explains a little more.

      The "types" in question are C structures and unions. Look through the header files for your OS and, depending on which one you use, you'll see just how many there are.



Re: The costs of packages
by jmcnamara (Monsignor) on Sep 17, 2003 at 09:41 UTC

    > What I am trying to do is encapsulate several hundred datatypes (think C-style structs and unions)

    This is secondary to your main question, but the Convert::Binary::C module is very useful for manipulating structs, unions, enums and typedefs in C source and header files.

    --
    John.

      Thanks for the pointer :) I hadn't encountered that module in my perusal of CPAN.

      For various reasons, the header files I am using are actually assembler header files (.inc) rather than C header files, which kyboshes it somewhat. Also, from my quick look at the module, it would require the header files to be available on the target system (which could be a legal problem unless they got them from a C compiler distribution) and (I think) it would export everything in the header rather than just those things used, which doesn't suit my purpose.



Re: The costs of packages
by Beechbone (Pilgrim) on Sep 18, 2003 at 20:31 UTC
    Is there any real need to use classes at all? You could just use one class and store the information which C-class it is for inside.

    example:

    package My::Types;

    sub new {
        my $pclass = shift;
        my $cclass = shift;
        my $cself  = $pclass->readFromData($cclass);
        my $pself  = { cself => $cself, cclass => $cclass };
        return bless $pself, ref($pclass)||$pclass;
    }

    sub isSame {
        my $pself = shift;    # the invocant is an object here
        my $other = shift;
        return undef unless $other->CORE::isa(__PACKAGE__);
        return $other->{cclass} eq $pself->{cclass};
    }

    sub getNumberOfParams {
        my $pself = shift;
        my $cself = $pself->{cself};
        return $cself->{noOfParm};
    }

    ...

    package main;

    $a = My::Types->new('printf');
    print $a->getNumberOfParams();
    (untested code)

    Update: Oops, I coded an isSame() method, not an isa(). Renamed it to avoid misunderstandings...

Node Type: perlquestion [id://291700]
Approved by ybiC
Front-paged by broquaint