PerlMonks  

Testing for valid package names

by boftx (Deacon)
on Nov 25, 2013 at 20:51 UTC ( [id://1064292]=perlquestion )

boftx has asked for the wisdom of the Perl Monks concerning the following question:

I need to test whether a string would be a valid name for a package suitable for submission to CPAN. Currently, I am taking a rather draconian approach and allowing only the characters that are matched by \w.

My questions are these:

  1. Is my character set too restrictive, or is it reasonable in light of the majority of use-cases and portability concerns?
  2. Should I remove the "_" character from the list?
  3. What improvements can be made to the test cases below to better test edge cases?

Here is the test code:

#!/usr/bin/perl -T
use 5.008_008;
use strict;
use warnings FATAL => 'all';
use Test::More;

my @good_pnames = ( qw( foo foo::bar foo_bar foo::bar_baz ) );
my @bad_pnames = ( qw( foo.pm foo! foo: foo:: foo::! foo:bar foo::bar! foo::bar:baz ) );
push( @bad_pnames, 'foo bar' );

for (@good_pnames) {
    ok( valid_pname($_), "$_ is valid" );
}
for (@bad_pnames) {
    ok( !valid_pname($_), "$_ is not valid" )
        or BAIL_OUT("Invalid package was accepted! $_");
}
done_testing();
exit;

sub valid_pname {
    my $pname = shift;
    return !!($pname =~ /^\w+(?:::\w+)*$/);
}
__END__

Update: Added foo.pm to list of bad names.

Update 2: Questions 1 and 2 are answered directly by the code from Module::Runtime supplied below by tobyink

It helps to remember that the primary goal is to drain the swamp even when you are hip-deep in alligators.

Replies are listed 'Best First'.
Re: Testing for valid package names
by tobyink (Canon) on Nov 25, 2013 at 21:56 UTC

    Module::Runtime is probably what you want to use for this. Here are the relevant parts...

    sub _is_string($) {
        my($arg) = @_;
        return defined($arg) && ref(\$arg) eq "SCALAR";
    }

    our $module_name_rx = qr/[A-Z_a-z][0-9A-Z_a-z]*(?:::[0-9A-Z_a-z]+)*/;

    sub is_module_name($) { _is_string($_[0]) && $_[0] =~ /\A$module_name_rx\z/o; }

    This is more restrictive than your current check - in particular it excludes all non-ASCII word characters. This is because Unicode file names are handled pretty inconsistently across different file systems and Perl versions.
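
To make the difference concrete, here is a minimal sketch that runs both patterns over a few edge cases. The helper names loose_ok and strict_ok are made up for this illustration; the first pattern is copied from the original post and the second from the Module::Runtime excerpt above. Besides non-ASCII word characters, the two should also disagree on a leading digit and on a trailing newline, because `$` can match just before a final newline while `\z` cannot.

```perl
use strict;
use warnings;

# Pattern from the original post (hypothetical helper name).
sub loose_ok { return $_[0] =~ /^\w+(?:::\w+)*$/ ? 1 : 0 }

# Pattern quoted above from Module::Runtime, anchored with \A and \z
# (hypothetical helper name).
my $module_name_rx = qr/[A-Z_a-z][0-9A-Z_a-z]*(?:::[0-9A-Z_a-z]+)*/;
sub strict_ok { return $_[0] =~ /\A$module_name_rx\z/ ? 1 : 0 }

# Each case is [ label, candidate name ]. Labels are printed instead of
# the names themselves so no wide characters reach STDOUT.
my @cases = (
    [ 'plain name',        "Foo::Bar" ],
    [ 'trailing newline',  "Foo::Bar\n" ],
    [ 'leading digit',     "1Foo" ],
    [ 'non-ASCII letters', "\x{3b1}\x{3b2}::\x{3b3}" ],
    [ 'empty string',      "" ],
);

for my $case (@cases) {
    my ( $label, $name ) = @$case;
    printf "%-18s loose=%d strict=%d\n",
        $label, loose_ok($name), strict_ok($name);
}
```

The two checks should agree on the plain name and the empty string and differ on the other three rows, which also suggests candidates for the "bad names" list in the test script.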

    Note that this is really a module name check; not a package name check. Modules are files on the filesystem; packages are namespaces. Package names are a lot more relaxed than module names; for example, the space character can be used as a package name:

    perl -E'*{" ::foo"} = sub {42}; say " "->foo'
    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name
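
The module/package distinction can be sketched without any CPAN dependency. The helper notional_filename below is a hypothetical name for this illustration (it only mimics the usual Foo::Bar to Foo/Bar.pm mapping that require performs), and Some::Namespace is an invented package that exists purely in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: the relative file name that `require Foo::Bar`
# would look for along @INC.
sub notional_filename {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return $file;
}

# A package is only a namespace; it needs no file of its own.
{
    package Some::Namespace;
    sub answer { 42 }
}

print notional_filename('Foo::Bar'), "\n";
print Some::Namespace->answer, "\n";

# %INC records modules loaded from files; the inline package is absent.
print exists $INC{ notional_filename('Some::Namespace') }
    ? "loaded from a file\n"
    : "no file involved\n";
```

A name check aimed at CPAN submission is really about the first half of this: whether the name maps cleanly to a file path on every supported filesystem.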

      Thanks! That was the regex I was coming up with, but seeing it in use is even better. And I hadn't considered the Unicode aspect. :)

Re: Testing for valid package names
by toolic (Bishop) on Nov 25, 2013 at 21:15 UTC

      Thanks!

      I thought about numbers as the first char, but that is legal for filenames so far as I know. I am still mulling that over.

      I wasn't aware of that Perl::Critic rule (not surprising) but the POD doesn't mention anything about the legal character set, only that a single-quote "'" should not be used even though it is a valid substitute for the double-colon '::' separator.

        IIRC, no (non-special) identifier is allowed to start with a number.

        Though I'm sure one can use tricks to enter them as key/value pairs in the stash.

        Cheers Rolf

        ( addicted to the Perl Programming Language)
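
A minimal sketch of that kind of trick, assuming a symbolic glob assignment is what is meant: the name "1foo" is never seen by the parser as an identifier, so the usual rule does not apply. The package name 1foo and the sub name answer are invented for this illustration.

```perl
use strict;
use warnings;

{
    no strict 'refs';
    # Symbolic glob assignment: the string is used as-is, so the
    # parser's identifier rules are never consulted.
    *{"1foo::answer"} = sub { 42 };
}

# The new stash shows up as the key "1foo::" in %main::.
print exists $main::{"1foo::"} ? "stash exists\n" : "no stash\n";

# Call the sub back through a symbolic reference.
my $result = do { no strict 'refs'; &{"1foo::answer"}() };
print "$result\n";

# By contrast, writing the same name in source code is left to the parser.
my $compiled = eval "package 1foo; 1";
print $compiled ? "package 1foo compiled\n" : "package 1foo did not compile\n";
```

The sub is called through a symbolic reference rather than as a class method on the string, which sidesteps any version-specific rules about what strings may be used as a method invocant.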

Re: Testing for valid package names
by taint (Chaplain) on Nov 25, 2013 at 21:20 UTC
    Greetings boftx.

    There's already a Module for that. I just saw it a bit ago. Might save you some unnecessary trouble, unless of course you're doing this for the "experience". :)
    I'll see if I can find it again, and post back with a link. If you want to try yourself, I seem to remember it being in the CPAN namespace.

    --Chris

    #!/usr/bin/perl -Tw
    use Perl::Always or die;
    my $perl_version = (5.12.5);
    print $perl_version;

      Hi Chris, that wouldn't surprise me at all.

      I'm doing this partly to keep my hand in with some things that I don't typically use, and partly to try to (re-)invent a better mousetrap, since other modules don't let me do things as easily as I think should be possible. More to follow on that once the package is fleshed out more. :)

        COMPLETELY understood.

        I'm bad that way myself. :)
        While it doesn't look complete, CPAN-Index might provide some helpful bits for your needs (project).

        Good luck, and have fun.

        --Chris

        This might give you a nice index object to run your search against:
        CPAN-Index-API.

        --Chris


Node Type: perlquestion [id://1064292]
Approved by toolic
Front-paged by toolic