BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:
I'm working on something which will use lots of different 'types', literally hundreds. To avoid having to load all of the code for all of the types, it would be convenient to put each type in its own package.
Each of the packages would consist of an AUTOLOAD sub, which would actually resolve back to a single AUTOLOAD sub in a parent class from which each of the others would be subclassed.
The superclass AUTOLOAD would then take care of generating the actual class upon demand. This would allow the programmer to use the superclass in his code and then just use the types as required and have the code to support them generated upon demand.
#! perl -slw
use strict;
use My::Types;
my $typeA = new TypeA;
In the above, use My::Types; would load (actually, generate) a package for each of the types that would notionally look something like
package TypeA;
*AUTOLOAD = \&My::Types::AUTOLOAD;
Which causes my $typeA = new TypeA; to end up calling My::Types::AUTOLOAD with the name of the type being used and that will generate the rest of the package required to support the type before passing control to the constructor (new) for that type and then go on normally from there.
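A minimal sketch of the arrangement described above. The list of type names, the %generated bookkeeping hash and the stand-in generator (which installs only new and a size accessor) are assumptions for illustration, not the real implementation.

```perl
use strict;
use warnings;

package My::Types;

our $AUTOLOAD;
our %generated;    # hypothetical bookkeeping: which types have been built

# Hypothetical list of type names; the real list would come from elsewhere.
my @TYPES = qw(TypeA TypeB TypeC);

# Give every type a one-entry package: just an alias to the shared AUTOLOAD.
for my $type (@TYPES) {
    no strict 'refs';
    *{"${type}::AUTOLOAD"} = \&My::Types::AUTOLOAD;
}

sub AUTOLOAD {
    # $AUTOLOAD holds the fully qualified name, e.g. "TypeA::new".
    my ($class, $method) = $AUTOLOAD =~ /^(.*)::(\w+)$/;
    return if $method eq 'DESTROY';    # nothing to generate for destructors

    no strict 'refs';
    unless ($generated{$class}++) {
        # Stand-in for the real generator: a constructor and one accessor.
        *{"${class}::new"}  = sub { my $c = shift; return bless {@_}, $c };
        *{"${class}::size"} = sub { return $_[0]{size} };
    }

    my $name = "${class}::$method";
    die "No method $method in generated class $class\n"
        unless defined &$name;
    goto &$name;    # re-dispatch with the original @_
}

package main;

my $typeA = TypeA->new(size => 8);
print ref($typeA), " size ", $typeA->size, "\n";
```

Only TypeA is built here; TypeB and TypeC remain one-entry stubs until something calls a method on them.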
The question is, what is the penalty of having hundreds of minimal packages floating around? In terms of both memory usage and the performance of looking up individual routines within the program?
I don't fully understand how the glob lookup works at runtime. I've played around dumping both the contents of the main package space and the individual package spaces, and there appear to be many typeglob hashes created for each package, which I find confusing.
Does anyone with a better understanding have a feel for the effect that close to 1000 of these minimal packages, floating around mostly unused, would have upon the program?
The alternative I am considering is to require the programmer to use
...
use My::Types;
my $typeA = new My::Types::TypeA;
Looking around in the debugger, I see keys created in %:: that contain (?) hashes for each package loaded. I guess my fear is that by creating all these little packages I might be creating hundreds of hashes each containing just one key/value pair, most of which would never be used. Would the second form conserve any resources or would it amount to much the same thing?
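One way to look at this directly is to create the stubs and inspect the symbol tables. The sketch below is only an illustration: the Stub1 .. Stub1000 names and the shared_autoload placeholder are made up, and it counts entries rather than measuring bytes (a CPAN module such as Devel::Size would be needed for that).

```perl
use strict;
use warnings;

# Hypothetical placeholder for the shared AUTOLOAD.
sub shared_autoload { return }

# Create 1000 stub packages, each holding a single AUTOLOAD alias.
for my $n (1 .. 1000) {
    no strict 'refs';
    *{"Stub${n}::AUTOLOAD"} = \&shared_autoload;
}

# Each package is a hash (a "stash") reachable from %:: under the key "Name::".
my @stubs = grep { /^Stub\d+::$/ } keys %::;
printf "%d stub stashes in %%::\n", scalar @stubs;

# List what one of the stub stashes contains.
my @entries = do { no strict 'refs'; keys %{"Stub1::"} };
print "Stub1:: contains: @entries\n";
```

A package named My::Types::TypeA also gets a stash of its own; it is simply stored under %My::Types:: instead of directly under %::.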
Thanks for any insights you can offer.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.
Re: The costs of packages
by Abigail-II (Bishop) on Sep 16, 2003 at 08:45 UTC
Why have minimal packages at all? Would something like this
work for you?
#
# This is the file My/Types.pm
#
package UNIVERSAL;
use strict;
use warnings;
sub create {
    my $class = $_[0];
    print "Creating class $class\n";
    eval <<"--";
        package $class;
        sub create {
            my \$class = shift;
            print "Now inside \${class}::create\n";
            bless [] => \$class;
        }
--
    no strict 'refs';
    my $sub = "${class}::create";
    goto &$sub;
}
1;
__END__
#!/usr/bin/perl
use strict;
use warnings;
use My::Types;
my $typeA1 = create TypeA;
my $typeA2 = create TypeA;
my $typeB1 = create TypeB;
__END__
Creating class TypeA
Now inside TypeA::create
Now inside TypeA::create
Creating class TypeB
Now inside TypeB::create
Abigail
That's an extremely cool approach! It's actually quite obvious, once it's demonstrated. (As all neat and simple things usually are ...)
Now, are there maintainability issues here? Why would you put create() in UNIVERSAL instead of exporting it, ala use My::Types qw(create);?
Also, I would assume that, since you're eval'ing the package, you could have it do inheritance and the like, right? It would be interesting to see a non-trivial example of this in action ...
------ We are the carpenters and bricklayers of the Information Age. The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6 Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
Now, are there maintainability issues here? Why would you put create() in UNIVERSAL instead of exporting it, ala use My::Types qw(create);?
If you export the create sub, it would mean that each
time you call create, the create function in My/Types.pm is
called, because it's the create function in the current
package. With the UNIVERSAL trick, the create function in
My/Types.pm is only called once for each class - a second
call to create with the same package name as argument is
handled by that package's own create directly. The example program
I gave shows this. This is because UNIVERSAL is searched
*last*, while the current package is searched *first*, and
that's the crucial difference between exporting and using
UNIVERSAL.
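The lookup order can be checked with a counter. This is a sketch only: it swaps the string eval for a glob assignment, and the $fallback_calls counter and the TypeA/TypeB names are just for illustration.

```perl
use strict;
use warnings;

my $fallback_calls = 0;    # how often the UNIVERSAL fallback runs

sub UNIVERSAL::create {
    my $class = $_[0];
    $fallback_calls++;
    no strict 'refs';
    # Install a real constructor in the requested package ...
    *{"${class}::create"} = sub { my $c = shift; return bless [], $c };
    # ... and hand over to it with the original arguments.
    goto &{"${class}::create"};
}

my $a1 = TypeA->create;    # not found in TypeA, falls through to UNIVERSAL
my $a2 = TypeA->create;    # found in TypeA, UNIVERSAL is never consulted
my $b1 = TypeB->create;    # new class, falls through to UNIVERSAL again

print "fallback ran $fallback_calls times for 3 objects\n";
```

Three objects are created but the fallback runs only twice, once per class.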
Also, I would assume that, since you're eval'ing the package, you could have it do inheritance and the like, right?
Uhm, no. As I understood it, the problem BrowserUk was having
is that his program typically would use a few classes, but
those classes would be picked from potentially hundreds.
And the memory of those hundreds of "skeletons" was a concern.
The "eval" trick (which might as well have been a 'require')
ensures that only packages that are used consume memory.
I didn't get the impression that BrowserUk's original approach
considered inherited constructors, and I certainly didn't
consider it either.
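For the 'require' variant mentioned above, a sketch might look like this. The throwaway library directory and the TypeA.pm file written into it exist only to make the example self-contained; in practice the modules would already be on disk.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Set up a throwaway library directory holding one hypothetical type module.
my $lib = tempdir(CLEANUP => 1);
open my $fh, '>', File::Spec->catfile($lib, 'TypeA.pm') or die "open: $!";
print {$fh} <<'END_MODULE';
package TypeA;
sub create { my $class = shift; return bless [], $class }
1;
END_MODULE
close $fh or die "close: $!";
unshift @INC, $lib;

sub UNIVERSAL::create {
    my $class = $_[0];
    (my $file = "$class.pm") =~ s{::}{/}g;    # My::Type -> My/Type.pm
    require $file;                            # dies if the file is missing
    no strict 'refs';
    die "$file did not define ${class}::create\n"
        unless defined &{"${class}::create"};
    goto &{"${class}::create"};
}

my $obj = TypeA->create;
print ref($obj), " loaded from $INC{'TypeA.pm'}\n";
```

The explicit defined check matters: without it, a module that forgot to define create would send the call straight back to UNIVERSAL::create.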
Abigail
Re: The costs of packages
by dragonchild (Archbishop) on Sep 16, 2003 at 02:24 UTC
This sounds an awful lot like a factory. Why not have a Types::Factory class, create an instance of it, then do something like my $typeA = $factory->new(type => 'TypeA', ...);? That is not only simple, it also keeps you from making AUTOLOAD mistakes (the first being using the damn thing), and allows for options like:
my %types;
foreach my $type (map { "type$_" } qw(A B C D E))
{
    $types{$type} = $factory->new(type => $type, ...);
}
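A sketch of what such a factory could look like. Everything inside it is an assumption: the %FIELDS registry, the field names, and the choice to build each type's accessors the first time that type is requested.

```perl
use strict;
use warnings;

package Types::Factory;

# Hypothetical registry: type name => list of field names.
my %FIELDS = (
    typeA => [qw(x y)],
    typeB => [qw(name)],
);
my %BUILT;    # types whose packages have already been generated

# Called on the class it returns a factory; called on a factory it returns
# an object of the requested type, building that type's package on first use.
sub new {
    my ($invocant, %args) = @_;
    return bless {}, $invocant unless ref $invocant;

    my $type   = delete $args{type} or die "type is required\n";
    my $fields = $FIELDS{$type}     or die "unknown type '$type'\n";

    unless ($BUILT{$type}++) {
        no strict 'refs';
        for my $field (@$fields) {
            *{"${type}::$field"} = sub { return $_[0]{$field} };    # accessor
        }
    }
    return bless {%args}, $type;
}

package main;

my $factory = Types::Factory->new;
my %types;
for my $type (qw(typeA typeB)) {
    $types{$type} = $factory->new(type => $type, x => 1, name => 'n');
}
print ref($types{typeA}), " x=", $types{typeA}->x, "\n";
```

Types that are never requested cost one entry in %FIELDS and nothing in the symbol table.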
Keep
It
Simple,
Stupid
Second-best piece of advice my Dad ever gave me.
------ We are the carpenters and bricklayers of the Information Age. The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6 Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
>Second-best piece of advice my Dad ever gave me.
What was the first-best?
Best advice my Dad ever gave me was "Find a rock to build your life upon." Now, I know he meant to use Jesus for this, but I went my own way and chose a slightly different path. *grins*
He didn't talk to me for four years cause of it. *shrugs*
------ We are the carpenters and bricklayers of the Information Age. The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6 Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
Re: The costs of packages
by perrin (Chancellor) on Sep 16, 2003 at 05:10 UTC
I don't know exactly what you're trying to do, but I'll bet you there's a way to do it with a hash and a single package that would be much more efficient.
use My::Types qw[typeA typeB typeZZY];
or
use My::Type::typeA;
use My::Type::typeB;
use My::Type::typeC;
or
use My::Types::Factory;
my $typeA = My::Types::Factory->new( 'TypeA' );
my $typeB = My::Types::Factory->new( 'TypeB' );
my $typeZZY = My::Types::Factory->new( 'TypeZZY' );
my $varTypeA = $typeA->new();
my $varTypeB = $typeB->new();
my $varTypeZZY = $typeZZY->new();
I also don't wish to have every program carry the weight of several hundred unused types because it uses 1 of them, hence the need to use the autoload. My inspiration comes from the POSIX package.
I need each of the types to be in a separate package space because each type will have the same set of methods.
So, whilst I am quite happy with the technique I outlined from the use and implementation point of view, I am a little concerned that the first variation may consume more "glob space" than is necessary. If the second variation would save any substantial amount of memory and/or prevent or reduce any performance hit that (may or may not) result from the first, then I would accept the slightly more verbose syntax of the second over the first.
However, if there is little or no difference in the overheads of the two variations, then I will go with the former's reduced syntax.
Now, if you can show me an efficient way of doing this with "a hash and a single package", I'm all ears.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.
So, whilst I am quite happy with the technique I outlined from the use and implementation point of view, I am a little concerned that the first variation may consume more "glob space" than is necessary, and if the second variation would save any substantial amount of memory and/or prevent or reduce any performance hit that (may or may not) result from the first, then I would accept the slightly more verbose syntax of the second over the first.
I don't think that you need to worry about this. Certainly not as long as you are in the hundreds and not hundreds of thousands. Incidentally, worrying about this smells like premature optimization to me... :-) Are there actually symptoms of a problem, or are you just considering best practice?
A last point regarding AUTOLOAD: you may find the biggest drawback of this approach is speed. I have heard it said that AUTOLOADing a sub spoils the method cache (I don't know if that's only for one package, or for all, or what), so if speed is an issue you might look at that.
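One common way to keep the AUTOLOAD cost down is to have it install the real sub the first time, so later calls never reach AUTOLOAD at all. The sketch below uses a made-up Lazy package and a trivial generated method; it does not measure the method-cache effect mentioned above, it only shows AUTOLOAD being paid for once per method.

```perl
use strict;
use warnings;

package Lazy;

our $AUTOLOAD;
our $autoload_calls = 0;    # counts real AUTOLOAD invocations

sub new { return bless {}, shift }

sub AUTOLOAD {
    my $name = $AUTOLOAD;                # e.g. "Lazy::hello"
    (my $method = $name) =~ s/.*:://;
    return if $method eq 'DESTROY';
    $autoload_calls++;

    # Hypothetical generated method: it just returns its own name.
    my $code = sub { return $method };
    no strict 'refs';
    *$name = $code;    # later calls find the real sub and skip AUTOLOAD
    goto &$code;
}

package main;

my $obj = Lazy->new;
$obj->hello for 1 .. 1000;
print "AUTOLOAD ran $Lazy::autoload_calls time(s)\n";
```

Whether this matters for a given program is something only a benchmark of the real code can tell.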
---
demerphq
<Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: The costs of packages
by jmcnamara (Monsignor) on Sep 17, 2003 at 09:41 UTC
What I am trying to do is encapsulate several hundred datatypes (think C-style structs and unions)
This is secondary to your main question, but the Convert::Binary::C module is very useful for manipulating structs, unions, enums and typedefs in C source and header files.
--
John.
Thanks for the pointer :) I hadn't encountered that module in my perusal of CPAN.
For various reasons, the header files I am using are actually assembler header files (.inc) rather than C header files, which kyboshes it somewhat. Also, from my quick look at the module, it would require the header files to be available on the target system (which could be a legal problem unless they got them from a C compiler distribution) and (I think) it would export everything in the header rather than just those things used, which doesn't suit my purpose.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.
Re: The costs of packages
by Anonymous Monk on Sep 17, 2003 at 06:25 UTC
lots of different 'types', literally hundreds
What is that ?
Why do you need hundreds of types ?
Could you give an example type ?
Just curious,
Murat
As Anonymonk you probably didn't get shown this post, which explains a little more.
The "types" in question are C structures and unions. Look through the header files for your OS and, depending on which one you use, you'll see just how many there are.
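To give a feel for what one such type might look like as a package, here is a sketch built on the core pack and unpack functions. The POINT struct, its fields and the method names are hypothetical, and alignment or padding between members is not handled.

```perl
use strict;
use warnings;

# Hypothetical C struct:  struct POINT { long x; long y; };
package POINT;

my $TEMPLATE = 'l! l!';    # two native C longs
my @FIELDS   = qw(x y);

sub new {
    my ($class, %init) = @_;
    my %self = map { $_ => (exists $init{$_} ? $init{$_} : 0) } @FIELDS;
    return bless \%self, $class;
}

# Serialise the fields into the binary layout of the struct.
sub to_bytes {
    my $self = shift;
    return pack($TEMPLATE, @{$self}{@FIELDS});
}

# Build an object from a binary buffer.
sub from_bytes {
    my ($class, $bytes) = @_;
    my %self;
    @self{@FIELDS} = unpack($TEMPLATE, $bytes);
    return bless \%self, $class;
}

sub sizeof { return length(pack($TEMPLATE, (0) x @FIELDS)) }

package main;

my $p    = POINT->new(x => 3, y => -4);
my $copy = POINT->from_bytes($p->to_bytes);
print "x=$copy->{x} y=$copy->{y} size=", POINT->sizeof, "\n";
```

Every type would carry the same small set of methods, which is why each one needs its own package space.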
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.
Re: The costs of packages
by Beechbone (Friar) on Sep 18, 2003 at 20:31 UTC
Is there any real need to use classes at all? You could just use one class and store inside each object which C-class it represents.
example:
package My::Types;
sub new {
    my $pclass = shift;
    my $cclass = shift;
    my $cself = $pclass->readFromData($cclass);
    my $pself = { cself => $cself, cclass => $cclass };
    return bless $pself, ref($pclass)||$pclass;
}
sub isSame {
    my $pself = shift;
    my $other = shift;
    # UNIVERSAL::isa called as a function also copes with a non-object $other
    return undef unless UNIVERSAL::isa($other, __PACKAGE__);
    return $other->{cclass} eq $pself->{cclass};
}
sub getNumberOfParams {
    my $pself = shift;
    my $cself = $pself->{cself};
    return $cself->{noOfParm};
}
...
package main;
$a = My::Types->new('printf');
print $a->getNumberOfParams();
(untested code)
Update: Oops, I coded an isSame() method, not an isa(). Renamed it to avoid misunderstandings...