Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??
Here is a handy command-line tool to quickly view installed Perl modules whose name matches a specified regular expression.

Features

  • Perl regular expression syntax, with separate case-sensitive switch.
  • Optional initialization file for faster look-ups.
  • Option to print the module name or the full directory path to the module file.
  • Option to display duplicate modules and other statistics.
  • Uses only core modules.

Other well-known methods

So... why another way to do it?

Simply put: I could not easily convince these other tools to Do What I Want, as quickly as I want, in (what I consider) a bug-free manner. Obviously, this is not a new idea; it is merely a different implementation. There are many threads here at the Monastery, as a Super Search would reveal, and I have probably read every node of every thread on the topic. I believe HTML::Perlinfo does everything this script does (and much, much more), except that I could not easily figure out how to generate output as simple text, rather than HTML. I consider HTML::Perlinfo to be a valuable companion to this script. I run a daily cronjob to dump out recent versions of both HTML and text.

Impatience

In my opinion, the biggest advantage here is the fast look-up capability. No matter how you slice it, you have to search through the @INC directories via some variant of find, which can take a whole minute or so -- I just do not have the patience to wait that long! Maintaining the initialization file avoids that nonsense.

The code

use warnings; use strict; use Getopt::Long; use Pod::Usage; use File::Find; my $print_path; my $report; my $re; parse_args(); # Clean up @INC my @dirs; for my $dirname (@INC) { if (-d $dirname) { next if $dirname eq '.'; $dirname =~ s{/+}{/}g; $dirname =~ s{/$}{}; push @dirs, $dirname; } } @dirs = uniq(@dirs); # For quicker operation, use init file, if it exists my @files; my $use_find = 1; my $message; my $init_file = exists $ENV{HOME} ? "$ENV{HOME}/.findpm" : ''; if (-e $init_file) { if (open my $fh, '<', $init_file) { @files = <$fh>; close $fh; chomp @files; my $days = 1; if (-M $init_file > $days) { $message = "Warning: $init_file is older than $days day\n" +; } die "Error: $init_file is empty" if -z $init_file; $use_find = 0; } else { $message = "Warning: $init_file exists, but can not be opened: + $!"; } } # Otherwise, use the slower find command if ($use_find) { # Find all .pm files under @INC dirs my @find_dirs = reduce_dirs(@dirs); find( { wanted => sub { push @files, $_ if -f $_ and /\.pm$/ }, no_chdir => 1, }, @find_dirs ); @files = uniq(@files); } # Print those modules/files which match the regex my %mods; for my $file (@files) { my @ds; for my $dir (@dirs) { if (index($file, $dir) == 0) { #print "$d2 is a substring of $d1, starting at pos 0\n" push @ds, $dir; } } my $d = (sort {length($b) <=> length($a)} @ds)[0]; my $rel = substr($file, (length($d)+1)); my $name = $rel; $name =~ s/\.pm$//; next unless $name =~ /$re/; push @{ $mods{$rel} }, $d; if ($print_path) { print "$file\n"; } else { $rel =~ s/\.pm$//; $rel =~ s{/}{::}g; print "$rel\n"; } } if ($report) { my $num_dups = 0; for (keys %mods) { $num_dups++ if (scalar(@{$mods{$_}}) > 1); } if ($num_dups) { print "\nDUPLICATES\n"; for my $rel (keys %mods) { if (scalar(@{$mods{$rel}}) > 1) { print "$rel\n"; for my $dir (@{$mods{$rel}}) { print " $dir/$rel\n"; } } } } print "\nSUMMARY\n"; print " regex = $re\n"; print " Used '$init_file' init file instead of 'find'\n" unless + $use_find; print " INC dirs:\n"; print " $_\n" for @dirs; print ' Total ".pm" files = ', scalar @files, "\n" +; print ' Matching unique ".pm" files = ', scalar keys %mods, +"\n"; print ' Matching duplicate ".pm" files = ', $num_dups, "\n"; } warn $message if $message; exit; sub reduce_dirs { # Reduce a list of directory names by eliminating # names which contain other names. For example, # if the input array contains (/a/b/c/d /a/b/c /a/b), # return an array containing (/a/b). my @dirs = @_; my %substring_count = map { $_ => 0 } @dirs; for my $x (@dirs) { for my $y (@dirs) { next if $x eq $y; if (index($x, $y) == 0) { # if y is substring of x, starting at position 0 $substring_count{$x}++; } } } my @dsubs; for (keys %substring_count) { push @dsubs, $_ if $substring_count{$_} == 0; } return @dsubs; } sub uniq { # From List::MoreUtils, $VERSION = '0.22' my %h; map { $h{$_}++ == 0 ? $_ : () } @_; } sub parse_args { my ($help, $sens); GetOptions( 'sens' => \$sens, 'path' => \$print_path, 'report' => \$report, 'help' => \$help ) or pod2usage(); $help and pod2usage(-verbose => 2); my $pat = (@ARGV) ? shift @ARGV : '.'; $pat =~ s{::}{/}g; $re = ($sens) ? qr/$pat/ : qr/$pat/i; #print "pat=$pat\n"; #print "re=$re\n";#exit; @ARGV and pod2usage("Error: unexpected args: @ARGV"); } =head1 NAME B<findpm> - Find installed Perl modules =head1 SYNOPSIS findpm [options] [regex] Options: -help verbose help -path print out full directory paths also -report print out detailed report -sens case-sensitive [default is case-insensitive] =head1 DESCRIPTION Search through the directories in the Perl C<@INC> variable for Perl module files (all files with a C<.pm> extension) matching a specified regular expression. The names of all the modules which match will be printed to STDOUT. Any directories listed in C<@INC> which do not exist will be silently +ignored. Excludes the current directory (.). If you are impatient (like I am) you can optionally use an initializat +ion file instead of letting the script search through all the C<@INC> directories every time you run the script. The file must be in your h +ome directory and must be named C<.findpm>. You must create this file you +rself (see EXAMPLES below), and you should keep it up to date. Since you wi +ll get a warning if the init file is more than a day old, I recommend creating the file using a cron job that runs once a day. If the init +file does not exist, the script will proceed to search C<@INC>. =head1 ARGUMENTS =over 4 =item regex An optional regular expression may be given. The regex may be a simpl +e string, such as C<foo>, or it may be a more complicated expression, su +ch as C<^foo.*bar\d>. The regex syntax is Perl; it should not be confused with shell wilcard syntax or the syntax for other common Unix utilitie +s, such as I<sed> or I<grep>. It is best to quote the regex to prevent interaction with the shell. Do not include the C<.pm> extension as par +t of the regex. If no regex is given, find all modules. =back =head1 OPTIONS All options can be abbreviated. =over 4 =item sens By default, the regular expression is case-insensitive. So, if the inp +ut regex is C<foo>, it will match C<foo> as well as C<FOO> and C<Foo>, et +c. To use case-sensitive, use the C<-sens> option. findpm -sens foo =item path By default, only the module name is printed. To instead print the full directory path to the module file, use the C<-path> option. findpm -path foo =item report To print out additional statistics, use the C<-report> option. This will show the total number of matching modules, duplicate modules +, etc. findpm -report =item help Show verbose usage information. =back =head1 EXAMPLES Find xml modules: findpm xml Find modules with case-sensitive "Ext": findpm -sens Ext Find modules like File::Find. The following are equivalent because C<::> will be converted to C</> (similar to I<perldoc>): findpm 'file::find' findpm 'file/find' Find all modules in all C<@INC> directories: findpm Create init file: rm -f ~/.findpm; findpm -path > /tmp/.findpm; mv /tmp/.findpm ~/.f +indpm =head1 CONFIGURATION AND ENVIRONMENT Searches for an optional initialization file in the directory specifie +d by the C<HOME> environment variable: ${HOME}/.findpm =head1 LIMITATIONS The initialization file is only supported for Unix-type operating syst +ems. =cut

Constructive criticism, suggestions for improvements and bug reports are welcome.

Update: Now only uses core modules.
Update: Avoid potential warning; small change to POD.
Update: find is more portable.


In reply to Find installed Perl modules matching a regular expression by toolic

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others taking refuge in the Monastery: (9)
    As of 2014-10-01 21:36 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      What is your favourite meta-syntactic variable name?














      Results (38 votes), past polls