PerlMonks  

file wildcards in Win32

by jjhorner (Hermit)
on Jun 16, 2000 at 19:07 UTC (#18475)

jjhorner has asked for the wisdom of the Perl Monks concerning the following question:

One of my Windows-using compatriots was lamenting the lack of a grep-like program for his Win32 workstation. He wanted to be able to check his .c files for strings.

I coded this up, and it works, but it has one drawback: he can't specify the extension of the file to search.

Any ideas? I'm not a Win32 programmer, but I looked through the Learning Perl on Win32 book and didn't find anything.

Here is my current code:

#!/usr/bin/perl -w
use strict;
use File::Find;

my ($string, @files);
($string, @files) = @ARGV;
@files = "." unless @files;

find sub {
    unless (-x) {
        if (open (FILE, "$_")) {
            while (my $line = <FILE>) {
                print "$_: $line" if $line =~ /$string/i;
            }
            close FILE;
        } else {
            warn "Can't open $_: $!";
            next;
        }
    }
}, @files;

Replies are listed 'Best First'.
Re: file wildcards in Win32
by httptech (Chaplain) on Jun 16, 2000 at 19:46 UTC
    Many unix commands have been entirely rewritten in Perl by the Perl Power Tools project. You can find a couple of their implementations of grep here. I've used tcgrep on Win32 and it works.

    BTW, your /$string/i might be faster as /$string/io since $string does not change during the duration of the program.
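    Another way to compile the pattern only once, without relying on /o, is qr//. The following is a minimal sketch, not code from this thread; the helper name matching_lines is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the lines that match
# a pattern supplied as a plain string, ignoring case.
sub matching_lines {
    my ($string, @lines) = @_;

    # Compile the pattern once, before looping over the lines.
    my $re = qr/$string/i;

    return grep { $_ =~ $re } @lines;
}

# Small demonstration with in-memory lines instead of a real file.
my @hits = matching_lines('foo', "Foo bar\n", "baz\n", "a food\n");
print @hits;
```

    The compiled $re can be passed around and reused in several matches, which /o cannot do.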

Re: file wildcards in Win32
by athomason (Curate) on Jun 16, 2000 at 20:09 UTC
    Check out the glob function, which expands filenames with wildcards. kiyohara put this to use in a Win32 prog in the Craft entry GlobArgv. Also take a look at the study function and its documentation. You may be able to speed up your query, and the doc has a grep-like code sample at the bottom: two for one! :-) I believe more than a few versions of grep have already been coded in Perl (with varying complexity, of course), so take a look around the web to avoid reinventing the wheel.
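    As a rough illustration of the glob suggestion: the Win32 command shell does not expand wildcards itself, so a script can expand them from @ARGV. This is only a sketch under that assumption; expand_args is a hypothetical helper name, not something from the thread or from GlobArgv.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: expand any argument that contains * or ? with the
# built-in glob, and pass every other argument through unchanged.
sub expand_args {
    my @args = @_;
    return map { /[*?]/ ? glob($_) : $_ } @args;
}

# First argument is the search string, the rest are files or wildcards,
# for example:  perl grep.pl findxyz *.c *.h
my ($string, @files) = @ARGV;
@files = expand_args(@files);
print "$_\n" for @files;
```

    One caveat: the built-in glob splits its argument on whitespace, so a pattern containing spaces (common in Windows paths) is treated as several patterns. bsd_glob from File::Glob treats it as one.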
Another alternative: Cygwin (was: cygwin)
by visnu (Sexton) on Jun 16, 2000 at 20:33 UTC
    there's also cygwin

    2001-03-13 Edit by Corion : Changed title

RE: file wildcards in Win32
by Ovid (Cardinal) on Jun 16, 2000 at 20:39 UTC
    Update: I didn't change the code, I just reformatted the comments for easier reading. All comments immediately precede the relevant code. I think the suggestions above are better, but if you really want to get this working, I think this might work:
    #!/usr/bin/perl -w
    use strict;
    use File::Find;

    my ($string, @files);

    #filename will match anything if we haven't specified an extension
    my $ext = ".";
    ($string, @files) = @ARGV;

    #if it begins with a dot or an asterisk dot, it's an extension
    if (@files && $files[0] =~ /^(\.|(\*\.))/) {
        # join all extensions in regex suitable string
        $ext = join ('|', @files);
        # remove the asterisks and escape the dots
        $ext =~ s/(\*|\.)/($1 eq "*")? "":"\\."/ge;
        @files = (".");
    }
    @files = "." unless @files;

    find sub {
        # skip if it doesn't end in $ext
        unless ((-x) || ($_ !~ /($ext)$/io)) {
            if (open (FILE, "$_")) {
                while (my $line = <FILE>) {
                    print "$_: $line\n" if $line =~ /$string/io;
                }
                close FILE;
            } else {
                warn "Can't open $_: $!";
                next;
            }
        }
    }, @files;
    It works just like before, but if you pass extensions instead of filenames, it will search only files with those extensions. It's an XOR situation: only extensions or filenames, not both.

    Extensions may or may not be preceded by an asterisk. Your choice. This is an example that I used:

    perl grep.pl findxyz .htm *.pl
    I know it's a hack, but it was fun!
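    The extension handling above can also be sketched with quotemeta in place of the s///e substitution. This is a variation for illustration, not Ovid's code, and extension_regex is a made-up helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn arguments such as ".htm" or "*.pl" into one
# regex that is anchored at the end of the name and ignores case.
sub extension_regex {
    my @exts = @_;
    my @escaped;
    for my $ext (@exts) {
        (my $bare = $ext) =~ s/^\*//;     # drop a leading asterisk, if any
        push @escaped, quotemeta $bare;   # escape the dot and other metacharacters
    }
    my $alternation = join '|', @escaped;
    return qr/(?:$alternation)$/i;
}

my $re = extension_regex('.htm', '*.pl');
for my $name ('index.htm', 'grep.pl', 'notes.txt', 'INDEX.HTM') {
    printf "%-10s %s\n", $name, ($name =~ $re ? 'searched' : 'skipped');
}
```

    Because quotemeta escapes every non-word character, an extension containing something like a plus sign would also be matched literally.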
RE: file wildcards in Win32
by barZion (Beadle) on Jun 16, 2000 at 21:42 UTC
    This should print out all the filenames with a .h or .c extension on Win32 in a specified subdirectory. Naturally, you would want to tweak it to suit your needs, but it finds the extensions quite nicely.

    use Win32::Process;

    print "Enter the subdirectory you'd like to parse: ";
    $path = <STDIN>;
    chomp( $path );
    $prefix = "C:\\Projects\\Source\\";
    $path = "$prefix$path";
    $path = "." unless $path;

    opendir( DIR, $path ) or die "Can't open $path: $!";
    while ( $entry = readdir( DIR ) ) {
        $type = ( -d "$path\\$entry" ) ? "dir" : "file"; # $path is crucial!
        if( $entry =~ /\.h|\.c$/ ){
            print "$entry\n";
        }
    }
    closedir( DIR );
      This will print every file whose name contains .h anywhere, plus those that end in .c, so it will also list .html files, .htm files, etc. Surround the \.h|\.c with parens so that the $ anchor applies to both alternatives. Change
      if ( $entry =~ /\.h|\.c$/ ) {
      to
      if ( $entry =~ /(\.h|\.c)$/ ) {
      It also won't catch files that end in .H or .C if for some reason those are in caps. You can adjust accordingly.
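      To see the difference the parens and a case-insensitive flag make, here is a small sketch run over a made-up list of file names (the names are assumptions for illustration only):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up file names for the comparison.
my @names = ('main.c', 'util.h', 'index.html', 'MAIN.C', 'a.hpp', 'notes.txt');

# Without parens the $ anchor binds only to the \.c alternative.
my @loose  = grep { /\.h|\.c$/ }    @names;

# With parens the anchor applies to both alternatives.
my @strict = grep { /(\.h|\.c)$/ }  @names;

# Adding /i also accepts upper-case extensions.
my @nocase = grep { /(\.h|\.c)$/i } @names;

print "unanchored:       @loose\n";
print "anchored:         @strict\n";
print "anchored, nocase: @nocase\n";
```

      The first pattern lets index.html and a.hpp through, the second keeps only main.c and util.h, and the third also picks up MAIN.C.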
Re: file wildcards in Win32
by buzzcutbuddha (Chaplain) on Jun 16, 2000 at 23:16 UTC
    Ok, not the cleanest code by any stretch, but it works.
    usage: greplike.pl C:\ .c /*
    will then look through all Source code on C for the beginnings of comments.
    One of the little features that I like is that if you have a file name with more than one
    "." in it ( ".htm.txt") it will grab only the part after the last dot for examination.
    Could be made better so it searches across multiple lines, recognizes and strips "" from the string
    you are searching for unless you escape them, and so on. Fun little program to make.

    Update
    Corrected my two errors that Ovid pointed out.
    Thanks Ovid! :)
    use File::Find;
    use strict;
    use integer;

    my $filecounter = 0;
    my $dir = $ARGV[0];
    # split the second argument into a name part and an extension part
    $ARGV[1] =~ m/(.*)(\.){1}(\w+)$/;
    my $filename = $1;
    my $filesoftype = $3;
    my $stringtosearchfor = $ARGV[2];

    print "Looking for files of type $filesoftype in $dir containing $stringtosearchfor\n";
    sleep 10;
    find(\&lookingfor, $dir);
    print "Found $filecounter files of type $filesoftype!\n";

    sub lookingfor {
        if (($_ =~ m/\.+($filesoftype)+$/io) && (-f $_)) {
            $filecounter++;
            # "or" binds more loosely than "||", so die fires when open fails
            open INPUT, "<$_" or die "Unable to open $_ for examination: $!\n";
            while (my $line = <INPUT>) {
                chomp $line;
                if ($line =~ m/$stringtosearchfor/io) {
                    print "Found match in $File::Find::name at line $.\.\n";
                }
            }
            close INPUT;
        }
    }
      You can skip the $linecounter variable and use "$." (that's a dollar sign followed by a dot). The "dollar dot" variable keeps track of the current input line number.
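      A minimal sketch of $. in action, reading from an in-memory filehandle so that it runs without any real files. The helper name numbered_matches and the sample text are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return "line N: text" for every line read from the
# handle that matches the pattern, using $. as the line counter.
sub numbered_matches {
    my ($fh, $pattern) = @_;
    my @found;
    while (my $line = <$fh>) {
        chomp $line;
        # $. holds the number of the line just read from $fh
        push @found, sprintf("line %d: %s", $., $line) if $line =~ /$pattern/i;
    }
    return @found;
}

# An in-memory filehandle stands in for a real source file.
my $text = "int main(void)\n{\n    /* entry point */\n    return 0;\n}\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
print "$_\n" for numbered_matches($fh, 'return');
close $fh;
```

      Closing the handle resets $., so the count starts again from 1 for the next file.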

      Also, don't forget to use the /o switch in a regex included in a loop if the regex never changes. This prevents the regex compiler from recompiling the regex every time it encounters it and will improve program performance.

      Nice code, btw.

        Two excellent points! I had forgotten about $. and I had used /o in the first regex
        but not in the second. You can tell it's Friday and my mind is elsewhere. Thanks for the insight.
