<?xml version="1.0" encoding="windows-1252"?>
<node id="795418" title="Find installed Perl modules matching a regular expression" created="2009-09-15 12:01:49" updated="2009-09-15 12:01:49">
<type id="1042">
CUFP</type>
<author id="622051">
toolic</author>
<data>
<field name="doctext">
Here is a handy command-line tool to quickly view installed
Perl modules whose name matches a specified regular expression.

&lt;h2&gt;Features&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Perl regular expression syntax, with separate case-sensitive switch.
&lt;li&gt;Optional initialization file for faster look-ups.
&lt;li&gt;Option to print the module name or the full directory path to the module file.
&lt;li&gt;Option to display duplicate modules and other statistics.
&lt;li&gt;Uses only core modules.
&lt;/ul&gt;

&lt;h2&gt;Other well-known methods&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;From [doc://perlfaq3]: [doc://perlfaq3#How-do-I-find-which-modules-are-installed-on-my-system?]
&lt;li&gt;The [mod://pminst] script from the [dist://pmtools] CPAN bundle
&lt;li&gt;The CPAN module: [mod://HTML::Perlinfo]
&lt;/ul&gt;

&lt;h2&gt;So... why another way to do it?&lt;/h2&gt;
Simply put: I could not easily convince these other tools to Do What I Want,
as quickly as I want,
in (what I consider) a bug-free manner. Obviously, this is not a new idea;
it is merely a different implementation. 

There are many threads here at the Monastery, as a [href://?node_id=3989;HIT=find%20install%20module;re=N|Super Search] would reveal, and I have probably read every node of every thread on the topic.


I believe [mod://HTML::Perlinfo]
does everything this script does (and much, much more), except that I could
not easily figure out how to generate output as simple text, rather than HTML.
I consider [mod://HTML::Perlinfo] to be a valuable companion to this script. 
I run a daily cronjob to dump out recent versions of both HTML and text.

&lt;h3&gt;Impatience&lt;/h3&gt;
In my opinion, the biggest advantage here is the fast look-up capability.
No matter how you slice it, you have to search through the &lt;c&gt;@INC&lt;/c&gt;
directories via some variant of &lt;c&gt;find&lt;/c&gt;, which can take 
&lt;b&gt;a whole minute&lt;/b&gt; or so -- I just do not have the patience 
to wait that long!  Maintaining the initialization file avoids that nonsense.

&lt;h2&gt;The code&lt;/h2&gt;
&lt;readmore&gt;
&lt;c&gt;
use warnings;
use strict;
use Getopt::Long;
use Pod::Usage;
use File::Find;

my $print_path;
my $report;
my $re;

parse_args();

# Clean up @INC
my @dirs;
for my $dirname (@INC) {
    if (-d $dirname) {
        next if $dirname eq '.';
        $dirname =~ s{/+}{/}g;
        $dirname =~ s{/$}{};
        push @dirs, $dirname;
    }
}
@dirs = uniq(@dirs);

# For quicker operation, use init file, if it exists
my @files;
my $use_find = 1;
my $message;
my $init_file = exists $ENV{HOME} ? "$ENV{HOME}/.findpm" : '';
if (-e $init_file) {
    if (open my $fh, '&lt;', $init_file) {
        @files = &lt;$fh&gt;;
        close $fh;
        chomp @files;
        my $days = 1;
        if (-M $init_file &gt; $days) {
            $message = "Warning: $init_file is older than $days day\n";
        }
        die "Error: $init_file is empty" if -z $init_file;
        $use_find = 0;
    }
    else {
        $message = "Warning: $init_file exists, but can not be opened: $!";
    }
}

# Otherwise, use the slower find command
if ($use_find) {
    # Find all .pm files under @INC dirs
    my @find_dirs = reduce_dirs(@dirs);
    find(
        {
            wanted =&gt; sub { push @files, $_ if -f $_ and /\.pm$/ },
            no_chdir =&gt; 1,
        },
        @find_dirs
    );
    @files = uniq(@files);
}

# Print those modules/files which match the regex
my %mods;
for my $file (@files) {

    my @ds;
    for my $dir (@dirs) {
        if (index($file, $dir) == 0) {
            #print "$d2 is a substring of $d1, starting at pos 0\n"
            push @ds, $dir;
        }
    }
    my $d = (sort {length($b) &lt;=&gt; length($a)} @ds)[0];
    my $rel = substr($file, (length($d)+1));
    my $name = $rel;
    $name =~ s/\.pm$//;

    next unless $name =~ /$re/;

    push @{ $mods{$rel} }, $d;
    if ($print_path) {
        print "$file\n";
    }
    else {
        $rel =~ s/\.pm$//;
        $rel =~ s{/}{::}g;
        print "$rel\n";
    }
}

if ($report) {
    my $num_dups = 0;
    for (keys %mods) {
        $num_dups++ if (scalar(@{$mods{$_}}) &gt; 1);
    }

    if ($num_dups) {
        print "\nDUPLICATES\n";
        for my $rel (keys %mods) {
            if (scalar(@{$mods{$rel}}) &gt; 1) {
                print "$rel\n";
                for my $dir (@{$mods{$rel}}) {
                    print "    $dir/$rel\n";
                }
            }
        }
    }

    print "\nSUMMARY\n";
    print "    regex = $re\n";
    print "    Used '$init_file' init file instead of 'find'\n" unless $use_find;
    print "    INC dirs:\n";
    print "        $_\n" for @dirs;
    print '    Total              ".pm" files = ', scalar @files, "\n";
    print '    Matching unique    ".pm" files = ', scalar keys %mods, "\n";
    print '    Matching duplicate ".pm" files = ', $num_dups, "\n";
}

warn $message if $message;
exit;

sub reduce_dirs {
    # Reduce a list of directory names by eliminating
    # names which contain other names.  For example,
    # if the input array contains (/a/b/c/d /a/b/c /a/b),
    # return an array containing (/a/b).
    my @dirs = @_;
    my %substring_count = map { $_ =&gt; 0 } @dirs;

    for my $x (@dirs) {
        for my $y (@dirs) {
            next if $x eq $y;
            if (index($x, $y) == 0) {
                # if y is substring of x, starting at position 0
                $substring_count{$x}++;
            }
        }
    }

    my @dsubs;
    for (keys %substring_count) {
        push @dsubs, $_ if $substring_count{$_} == 0;
    }
    return @dsubs;
}

sub uniq {
    # From List::MoreUtils, $VERSION = '0.22'
    my %h;
    map { $h{$_}++ == 0 ? $_ : () } @_;
}

sub parse_args {
    my ($help, $sens);
    GetOptions(
        'sens'      =&gt; \$sens,
        'path'      =&gt; \$print_path,
        'report'    =&gt; \$report,
        'help'      =&gt; \$help
    ) or pod2usage();

    $help and pod2usage(-verbose =&gt; 2);

    my $pat = (@ARGV) ? shift @ARGV : '.';
    $pat =~ s{::}{/}g;
    $re = ($sens) ? qr/$pat/ : qr/$pat/i;
    #print "pat=$pat\n";
    #print "re=$re\n";#exit;

    @ARGV and pod2usage("Error: unexpected args: @ARGV");
}


=head1 NAME

B&lt;findpm&gt; - Find installed Perl modules

=head1 SYNOPSIS

findpm [options] [regex]

    Options:
    -help       verbose help
    -path       print out full directory paths also
    -report     print out detailed report
    -sens       case-sensitive [default is case-insensitive]

=head1 DESCRIPTION

Search through the directories in the Perl C&lt;@INC&gt; variable
for Perl module files (all files with a C&lt;.pm&gt; extension) matching
a specified regular expression.
The names of all the modules which match will be printed to STDOUT.

Any directories listed in C&lt;@INC&gt; which do not exist will be silently ignored.
Excludes the current directory (.).

If you are impatient (like I am) you can optionally use an initialization
file instead of letting the script search through all the C&lt;@INC&gt;
directories every time you run the script.  The file must be in your home
directory and must be named C&lt;.findpm&gt;.  You must create this file yourself
(see EXAMPLES below), and you should keep it up to date.  Since you will
get a warning if the init file is more than a day old, I recommend
creating the file using a cron job that runs once a day.  If the init file
does not exist, the script will proceed to search C&lt;@INC&gt;.

=head1 ARGUMENTS

=over 4

=item regex

An optional regular expression may be given.  The regex may be a simple
string, such as C&lt;foo&gt;, or it may be a more complicated expression, such as
C&lt;^foo.*bar\d&gt;. The regex syntax is Perl; it should not be confused
with shell wilcard syntax or the syntax for other common Unix utilities,
such as I&lt;sed&gt; or I&lt;grep&gt;. It is best to quote the regex to prevent
interaction with the shell. Do not include the C&lt;.pm&gt; extension as part of the
regex.  If no regex is given, find all modules.

=back

=head1 OPTIONS

All options can be abbreviated.

=over 4

=item sens

By default, the regular expression is case-insensitive. So, if the input
regex is C&lt;foo&gt;, it will match C&lt;foo&gt; as well as C&lt;FOO&gt; and C&lt;Foo&gt;, etc.
To use case-sensitive, use the C&lt;-sens&gt; option.

    findpm -sens foo

=item path

By default, only the module name is printed. To instead print the full
directory path to the module file, use the C&lt;-path&gt; option.

    findpm -path foo

=item report

To print out additional statistics, use the C&lt;-report&gt; option.
This will show the total number of matching modules, duplicate modules, etc.

    findpm -report

=item help

Show verbose usage information.

=back

=head1 EXAMPLES

Find xml modules:

    findpm xml

Find modules with case-sensitive "Ext":

    findpm -sens Ext

Find modules like File::Find.  The following are equivalent because
C&lt;::&gt; will be converted to C&lt;/&gt; (similar to I&lt;perldoc&gt;):

    findpm 'file::find'
    findpm 'file/find'

Find all modules in all C&lt;@INC&gt; directories:

    findpm

Create init file:

    rm -f ~/.findpm; findpm -path &gt; /tmp/.findpm; mv /tmp/.findpm ~/.findpm

=head1 CONFIGURATION AND ENVIRONMENT

Searches for an optional initialization file in the directory specified
by the C&lt;HOME&gt; environment variable:

    ${HOME}/.findpm

=head1 LIMITATIONS

The initialization file is only supported for Unix-type operating systems.

=cut
&lt;/c&gt;
&lt;/readmore&gt;

&lt;p&gt;Constructive criticism, suggestions for improvements
and bug reports are welcome.

&lt;p&gt;&lt;small&gt;Update: Now only uses core modules.&lt;/small&gt;
&lt;br&gt;&lt;small&gt;Update: Avoid potential warning; small change to POD.&lt;/small&gt;&lt;br&gt;&lt;small&gt;Update: &lt;c&gt;find&lt;/c&gt; is more portable.&lt;/small&gt;
</field>
</data>
</node>
