http://www.perlmonks.org?node_id=744049

Ovid has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a module named Class::Sniff. While it's primarily for finding "code smells" in object-oriented hierarchies, it also lets you graph those hierarchies. For example, here's the graph of the B:: modules (Perl's backend modules).

The problem I'm trying to solve is that requiring a package via string eval creates a symbol table entry for that package, even if the require fails. Thus, I can't tell if the package is "real" or not (e.g., if it's likely to get called and should be added to my graph).

Jesse Vincent sent a bug report about Class::Sniff detecting non-existent packages.

Seems Jesse has a lot of code like this:

eval "require RT::Ticket_Overlay";
if ($@ && $@ !~ qr{^Can't locate RT/Ticket_Overlay.pm}) {
    die $@;
};

Well, that seems quite reasonable. Except for this:

#!/usr/bin/env perl -l
use strict;
use warnings;

print $::{'Foo::'} || 'Not found';
eval "use Foo";
print $::{'Foo::'} || 'Not found';
eval "require Foo";
print $::{'Foo::'} || 'Not found';

__END__
Not found
Not found
*main::Foo::

That's right. Attempting to require a non-existent module via a string eval creates a symbol-table entry. Aristotle told me he was astonished that no one had caught this before. Frankly, I just think that not enough people are trying to do introspection in Perl.

This one will be tricky to work around. I thought "if the module doesn't actually exist, can I check to see if @ISA is there?" It gets automatically created for every package, but since the module representing that package doesn't exist, maybe it won't? No such luck:

print defined *NoSuchModule::ISA{ARRAY} ? 'Yes' : 'No';
print defined *NoSuchModule::xxx{ARRAY} ? 'Yes' : 'No';

That always prints "Yes" and then "No". @ISA is always created for every package if you try to access it. Darn.

I thought I could check for the module's existence in %INC, but inlined packages don't show up there, either (unless the author explicitly puts them there).

The only thing I can think of is this curious line:

print scalar keys %Foo::;

If you do that with a non-existent package which nonetheless has a symbol table entry, it still has no keys in its symbol table. However, if you do that with a module which exists but failed to compile, you will probably have a few symbol table entries. This still doesn't quite solve the problem.

So how do I detect if a module in a symbol table failed to load? I'm not sure if I can. If I simply check to see if there are any keys in the symbol table, that should be enough, right? If someone evals "require $badmodule" and that require fails due to compilation errors, they'll exit or die, right? (too optimistic, I know)

Of course, even this is problematic. As Rafael Garcia-Suarez pointed out, a nested stash will create its parent stash (e.g., CGI::Application will create a symbol table for CGI, even if the latter is not loaded).
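
A minimal sketch of how the two observations above (an empty stash for a package that was never really defined, and parent stashes that exist only because a nested package does) might be combined. The helper name own_symbols is hypothetical and not part of Class::Sniff. It walks down from %main:: using exists, so asking about a package does not itself create a symbol table entry, and it ignores keys ending in "::" because those are nested stashes rather than symbols of the package itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Class::Sniff): list the names in
# $package's symbol table that are not nested stashes, without creating
# any symbol table entries as a side effect of looking.
sub own_symbols {
    my ($package) = @_;
    my $stash = \%main::;
    for my $part ( split /::/, $package ) {
        my $key = "$part\::";
        return unless exists $stash->{$key};      # never mentioned at all
        my $entry = $stash->{$key};
        return unless ref(\$entry) eq 'GLOB';     # stash entries are globs
        $stash = *{$entry}{HASH} or return;
    }
    # Keys ending in "::" are child stashes, not symbols of this package.
    return grep { !/::\z/ } sort keys %$stash;
}

{
    package Outer::Inner;    # declaring this also creates a stash for Outer
    sub hello { 'hi' }
}

print "Outer::Inner: ", join( ', ', own_symbols('Outer::Inner') ), "\n";
print "Outer: ", scalar( () = own_symbols('Outer') ), " symbols of its own\n";
```

A package with no symbols of its own is then a candidate for "not real", though a module that died halfway through compilation can still leave symbols behind, so this remains a heuristic.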

I can't just check %INC because inlined modules won't be there unless the author remembers to add them manually. Could I hook require or add a coderef to @INC? It's a strange edge case I'm dealing with, but for larger, more complex applications, it's a problem.
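
A rough sketch of the @INC coderef idea, under stated assumptions: the hook is appended to the end of @INC, so Perl only reaches it after every earlier entry has failed to supply the file, and it simply records the file name and declines. The %not_found registry and the was_not_found function are hypothetical names for illustration. Code that pushes further entries onto @INC after the hook is installed would run after it, so this is not airtight.

```perl
use strict;
use warnings;

my %not_found;    # file names (e.g. "Foo/Bar.pm") that no @INC entry supplied

# Appended last, so this hook is only reached once every other @INC
# entry has failed to provide the requested file.
push @INC, sub {
    my ( $self, $file ) = @_;
    $not_found{$file} = 1;
    return;       # decline, so require still dies with "Can't locate ..."
};

# Hypothetical query: was a load of $package attempted and not found?
sub was_not_found {
    my ($package) = @_;
    ( my $file = "$package.pm" ) =~ s{::}{/}g;
    return $not_found{$file} ? 1 : 0;
}

eval "require No::Such::Module::Here";
print was_not_found('No::Such::Module::Here')
    ? "recorded as not found\n"
    : "found, or never requested\n";
```

Packages recorded this way could be skipped when graphing even though the failed require left a symbol table entry behind. Inlined packages never pass through require, so they are never marked as missing.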

Replies are listed 'Best First'.
Re: How Can I Know If a Package is "Real"?
by Corion (Patriarch) on Feb 16, 2009 at 12:48 UTC

    I think you mostly want that so you can check whether something can influence how a class behaves, which seems to point mostly to @ISA to me (barring weirdo UNIVERSAL or MRO trickery). If you want to check whether @Some::Package::ISA exists, you can look into the namespace hash:

    #!perl -w
    use Data::Dumper;

    sub isa_exists {
        my ($package) = @_;
        exists ${"$package\::"}{ISA}
            and defined *{${"$package\::"}{ISA}}{ARRAY};
    };

    print "Now you don't\n";
    print isa_exists('Some::Package'),"\n";
    eval q(package Some::Package;@ISA='foo';);
    print "Now you see it\n";
    print isa_exists('Some::Package'),"\n";
    print "Now you don't\n";
    print isa_exists('Some::Other::Package'),"\n";
    print "Now you don't\n";
    print isa_exists('Some::Other::Package'),"\n";
    eval q(require A::Package::That::Doesn't::Exist);
    print "A::Package::That::Doesn't::Exist ($@)\n";
    print isa_exists("A::Package::That::Doesn't::Exist");

Re: How Can I Know If a Package is "Real"?
by Bloodnok (Vicar) on Feb 16, 2009 at 13:20 UTC
    Ovid += infinity :D))

    I've been chasing my tail over something extremely similar myself of late - to the point of drafting a SoPW node. Ovid has, unknown to him 'til now, saved me the trouble of the posting and provided the clarity/insight I was seeking...

    My algorithm did the simplest test (eval { require ...}) first, going on to test for a local file scoped package after that - so a non-existent package always exists if a test is made for the existence of the stash.

    It never even occurred to me that, as Ovid points out, the require causes vivification of a stash for the package - whether or not it exists ... doh !!!

    Many thanx again Ovid.

    .oO(Wonder whether my sig should be changed to read: "A user level that overstates my experience by an order of magnitude" ?)

    Update:

    Nearly, but not quite - some more extensive testing (appears to) show that a stash is always created - but would appear to be empty i.e. no keys, if the package is non-existent. So whereas require creates the stash and (sometimes) partially populates it, the test for the stash also appears to create a stash - but seemingly always empty ... or more accurately, an existent package always has the standard packages & classes as its keys.

    Any road up, it's close enough for jazz at this joint :D))

    A user level that continues to overstate my experience :-))
Re: How Can I Know If a Package is "Real"?
by tilly (Archbishop) on Feb 16, 2009 at 19:04 UTC
    I just tried it. You can't override use directly. If you override require, then use will know to pay attention to it only if you are in the package that is overridden. Which means that if you override CORE::require, you don't override use globally.

    You could add a coderef to @INC, but no matter how you go about it, that will take a lot of work. First you need to implement the full handling of @INC, including coderefs that pass back filehandles and coderefs that pass back objects (then the INC method gets called). At this point there are three ways to go.

    The simplest is to say that if you did not find the module, you are going to mark it as not loaded. This is sufficient for Jesse Vincent's use case.

    Somewhat more complex is to add a __DIE__ handler so that if that module fails to compile you might be told about it.

    The hardest is to manually try to detect whether the module successfully loaded, which would mean returning a filehandle that will track whether it was successfully read up to the end, or __END__ or __DATA__. (It is up to you whether to worry about the possibility of arriving at those tokens within strings.)

    Even after all of this work, are you done? No! Because people sometimes create small helper classes within a file, and these are real. To handle those I'd suggest that any package that is found which has subroutines in it that you never saw anyone trying to load are to be considered successfully loaded. That also will cover modules that were loaded before Class::Sniff was.

    I would say that the fact that overriding CORE::require doesn't override use should be documented as a bug. And I would stick with the simplest solution, which is that you only ignore modules that your coderef said were not found.

Re: How Can I Know If a Package is "Real"?
by Herkum (Parson) on Feb 16, 2009 at 15:58 UTC

    It would seem like an internal bug in the way symbol tables are handled. If symbol tables create entries automatically without checking, that looks similar to autovivification in nested Perl hashes: merely looking something up creates it (at least for some things).

    my %hash;
    print "Ovid exists '" . (exists $hash{'Ovid'}) . "'\n";
    if ($hash{'Ovid'}{'tall'}) { print "Ovid is tall\n" }; # I hate this behavior!
    print "Ovid exists '" . (exists $hash{'Ovid'}) . "'\n";

    The question would be, is that a bug that should be fixed or something that should be documented?

    All of that being said, I don't have any answer for your problem, Ovid. You have covered every idea I would have come up with, and a few I did not know about.

Re: How Can I Know If a Package is "Real"?
by Tanktalus (Canon) on Feb 16, 2009 at 19:44 UTC

    Isn't this approximately the same issue that base tries to deal with when it sets $VERSION to "-1, set by base.pm" if it's not already set, based solely on whether require returns true or not? And the same issue promptly ignored by advocates of parent? Those advocating parent seem to be saying (apologies if I misrepresent them - my goal is not a straw-man) that you Really Shouldn't Do That, So Please Stop It.

    For example, if you try to:

    eval "use Foo; 1"

    you should be able to check the eval return code - if it's false, Foo failed to load. (The trailing "; 1" matters: a bare use statement leaves nothing for the eval to return at run time, so the eval would be false even on success.) You pretty much can delete its namespace (but not anything beneath it).

    if (eval "use Foo; 1") {
        # success
    } else {
        # failure (not found, failed to compile, who cares?)
    }

    If it's a pre-loaded module, you pretty much have to assume compilation succeeded. If someone maliciously does an eval "use Foo;" and continues to ask you for information about Foo even though it failed to compile, well, there's bupkis you can do about it, I think. (This is tye's eval "$text; 1" idiom; the module itself also has to return true or it's considered to have failed compilation anyway.)

    Introspection in perl is fine ... as long as you don't push the edge cases. Then it gets hard. I'm guessing Perl 6 will make this easier, but you'd know better than I :-)

Re: How Can I Know If a Package is "Real"?
by DrHyde (Prior) on Feb 17, 2009 at 10:28 UTC

    How about over-riding eval so that you can compare the environment before and after executing the eval?

    eval isn't directly over-rideable, but with PPI and a small source filter (eeuuww!) maybe you can replace all the evals with my_evals. my_eval would have a prototype and take a string parameter or a sub-ref, so that it should "just work" with both string- and block-evals.
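
    One way the before-and-after comparison might look, leaving the PPI and source filter rewriting aside. This is only a sketch under assumptions: all_packages and the structure returned by my_eval are hypothetical, no prototype is used, and a string passed in is evaluated inside my_eval's own scope (package main, with none of the caller's lexicals), which a real replacement would have to address.

    ```perl
    use strict;
    use warnings;
    use Scalar::Util qw(refaddr);

    # Collect the name of every package reachable from %main:: without
    # creating any symbol table entries along the way.
    sub all_packages {
        my $root    = \%main::;
        my %visited = ( refaddr($root) => 1 );   # $main::{'main::'} points back here
        my %found;
        my @queue = ( [ $root, '' ] );
        while ( my $item = shift @queue ) {
            my ( $stash, $prefix ) = @$item;
            for my $key ( keys %$stash ) {
                next unless $key =~ /\A(.+)::\z/s;
                my $name  = $prefix . $1;
                my $entry = $stash->{$key};
                next unless ref(\$entry) eq 'GLOB';
                my $child = *{$entry}{HASH} or next;
                next if $visited{ refaddr($child) }++;
                $found{$name} = 1;
                push @queue, [ $child, "$name\::" ];
            }
        }
        return \%found;
    }

    # Hypothetical stand-in for eval: accepts a string or a code reference
    # and reports which packages appeared while it ran.
    sub my_eval {
        my ($code) = @_;
        my $before = all_packages();
        my $ok;
        if ( ref($code) eq 'CODE' ) {
            $ok = eval { $code->(); 1 };
        }
        else {
            $ok = eval "$code\n; 1";
        }
        my $error = $ok ? '' : $@;
        my $after = all_packages();
        my @new   = grep { !$before->{$_} } sort keys %$after;
        return { ok => $ok ? 1 : 0, error => $error, new_packages => \@new };
    }

    my $report = my_eval('require Surely::Not::Installed743');
    print "ok: $report->{ok}\n";
    print "new packages: @{ $report->{new_packages} }\n";
    ```

    Any package that shows up in new_packages after a failed eval is a candidate for exclusion from the graph.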