Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??
Here is a handy command-line tool to quickly view installed Perl modules whose name matches a specified regular expression.

Features

  • Perl regular expression syntax, with separate case-sensitive switch.
  • Optional initialization file for faster look-ups.
  • Option to print the module name or the full directory path to the module file.
  • Option to display duplicate modules and other statistics.
  • Uses only core modules.

Other well-known methods

So... why another way to do it?

Simply put: I could not easily convince these other tools to Do What I Want, as quickly as I want, in (what I consider) a bug-free manner. Obviously, this is not a new idea; it is merely a different implementation. There are many threads here at the Monastery, as a Super Search would reveal, and I have probably read every node of every thread on the topic. I believe HTML::Perlinfo does everything this script does (and much, much more), except that I could not easily figure out how to generate output as simple text, rather than HTML. I consider HTML::Perlinfo to be a valuable companion to this script. I run a daily cronjob to dump out recent versions of both HTML and text.

Impatience

In my opinion, the biggest advantage here is the fast look-up capability. No matter how you slice it, you have to search through the @INC directories via some variant of find, which can take a whole minute or so -- I just do not have the patience to wait that long! Maintaining the initialization file avoids that nonsense.

The code

use warnings; use strict; use Getopt::Long; use Pod::Usage; use File::Find; my $print_path; my $report; my $re; parse_args(); # Clean up @INC my @dirs; for my $dirname (@INC) { if (-d $dirname) { next if $dirname eq '.'; $dirname =~ s{/+}{/}g; $dirname =~ s{/$}{}; push @dirs, $dirname; } } @dirs = uniq(@dirs); # For quicker operation, use init file, if it exists my @files; my $use_find = 1; my $message; my $init_file = exists $ENV{HOME} ? "$ENV{HOME}/.findpm" : ''; if (-e $init_file) { if (open my $fh, '<', $init_file) { @files = <$fh>; close $fh; chomp @files; my $days = 1; if (-M $init_file > $days) { $message = "Warning: $init_file is older than $days day\n" +; } die "Error: $init_file is empty" if -z $init_file; $use_find = 0; } else { $message = "Warning: $init_file exists, but can not be opened: + $!"; } } # Otherwise, use the slower find command if ($use_find) { # Find all .pm files under @INC dirs my @find_dirs = reduce_dirs(@dirs); find( { wanted => sub { push @files, $_ if -f $_ and /\.pm$/ }, no_chdir => 1, }, @find_dirs ); @files = uniq(@files); } # Print those modules/files which match the regex my %mods; for my $file (@files) { my @ds; for my $dir (@dirs) { if (index($file, $dir) == 0) { #print "$d2 is a substring of $d1, starting at pos 0\n" push @ds, $dir; } } my $d = (sort {length($b) <=> length($a)} @ds)[0]; my $rel = substr($file, (length($d)+1)); my $name = $rel; $name =~ s/\.pm$//; next unless $name =~ /$re/; push @{ $mods{$rel} }, $d; if ($print_path) { print "$file\n"; } else { $rel =~ s/\.pm$//; $rel =~ s{/}{::}g; print "$rel\n"; } } if ($report) { my $num_dups = 0; for (keys %mods) { $num_dups++ if (scalar(@{$mods{$_}}) > 1); } if ($num_dups) { print "\nDUPLICATES\n"; for my $rel (keys %mods) { if (scalar(@{$mods{$rel}}) > 1) { print "$rel\n"; for my $dir (@{$mods{$rel}}) { print " $dir/$rel\n"; } } } } print "\nSUMMARY\n"; print " regex = $re\n"; print " Used '$init_file' init file instead of 'find'\n" unless + $use_find; print " INC dirs:\n"; print " $_\n" for @dirs; print ' Total ".pm" files = ', scalar @files, "\n" +; print ' Matching unique ".pm" files = ', scalar keys %mods, +"\n"; print ' Matching duplicate ".pm" files = ', $num_dups, "\n"; } warn $message if $message; exit; sub reduce_dirs { # Reduce a list of directory names by eliminating # names which contain other names. For example, # if the input array contains (/a/b/c/d /a/b/c /a/b), # return an array containing (/a/b). my @dirs = @_; my %substring_count = map { $_ => 0 } @dirs; for my $x (@dirs) { for my $y (@dirs) { next if $x eq $y; if (index($x, $y) == 0) { # if y is substring of x, starting at position 0 $substring_count{$x}++; } } } my @dsubs; for (keys %substring_count) { push @dsubs, $_ if $substring_count{$_} == 0; } return @dsubs; } sub uniq { # From List::MoreUtils, $VERSION = '0.22' my %h; map { $h{$_}++ == 0 ? $_ : () } @_; } sub parse_args { my ($help, $sens); GetOptions( 'sens' => \$sens, 'path' => \$print_path, 'report' => \$report, 'help' => \$help ) or pod2usage(); $help and pod2usage(-verbose => 2); my $pat = (@ARGV) ? shift @ARGV : '.'; $pat =~ s{::}{/}g; $re = ($sens) ? qr/$pat/ : qr/$pat/i; #print "pat=$pat\n"; #print "re=$re\n";#exit; @ARGV and pod2usage("Error: unexpected args: @ARGV"); } =head1 NAME B<findpm> - Find installed Perl modules =head1 SYNOPSIS findpm [options] [regex] Options: -help verbose help -path print out full directory paths also -report print out detailed report -sens case-sensitive [default is case-insensitive] =head1 DESCRIPTION Search through the directories in the Perl C<@INC> variable for Perl module files (all files with a C<.pm> extension) matching a specified regular expression. The names of all the modules which match will be printed to STDOUT. Any directories listed in C<@INC> which do not exist will be silently +ignored. Excludes the current directory (.). If you are impatient (like I am) you can optionally use an initializat +ion file instead of letting the script search through all the C<@INC> directories every time you run the script. The file must be in your h +ome directory and must be named C<.findpm>. You must create this file you +rself (see EXAMPLES below), and you should keep it up to date. Since you wi +ll get a warning if the init file is more than a day old, I recommend creating the file using a cron job that runs once a day. If the init +file does not exist, the script will proceed to search C<@INC>. =head1 ARGUMENTS =over 4 =item regex An optional regular expression may be given. The regex may be a simpl +e string, such as C<foo>, or it may be a more complicated expression, su +ch as C<^foo.*bar\d>. The regex syntax is Perl; it should not be confused with shell wilcard syntax or the syntax for other common Unix utilitie +s, such as I<sed> or I<grep>. It is best to quote the regex to prevent interaction with the shell. Do not include the C<.pm> extension as par +t of the regex. If no regex is given, find all modules. =back =head1 OPTIONS All options can be abbreviated. =over 4 =item sens By default, the regular expression is case-insensitive. So, if the inp +ut regex is C<foo>, it will match C<foo> as well as C<FOO> and C<Foo>, et +c. To use case-sensitive, use the C<-sens> option. findpm -sens foo =item path By default, only the module name is printed. To instead print the full directory path to the module file, use the C<-path> option. findpm -path foo =item report To print out additional statistics, use the C<-report> option. This will show the total number of matching modules, duplicate modules +, etc. findpm -report =item help Show verbose usage information. =back =head1 EXAMPLES Find xml modules: findpm xml Find modules with case-sensitive "Ext": findpm -sens Ext Find modules like File::Find. The following are equivalent because C<::> will be converted to C</> (similar to I<perldoc>): findpm 'file::find' findpm 'file/find' Find all modules in all C<@INC> directories: findpm Create init file: rm -f ~/.findpm; findpm -path > /tmp/.findpm; mv /tmp/.findpm ~/.f +indpm =head1 CONFIGURATION AND ENVIRONMENT Searches for an optional initialization file in the directory specifie +d by the C<HOME> environment variable: ${HOME}/.findpm =head1 LIMITATIONS The initialization file is only supported for Unix-type operating syst +ems. =cut

Constructive criticism, suggestions for improvements and bug reports are welcome.

Update: Now only uses core modules.
Update: Avoid potential warning; small change to POD.
Update: find is more portable.


In reply to Find installed Perl modules matching a regular expression by toolic

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others taking refuge in the Monastery: (7)
    As of 2015-07-04 20:22 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









      Results (60 votes), past polls