Hello Monks,
I wrote this simple script ‘findinfile.pl’ to search for arbitrary text in a file:
use strict;
use warnings;

print "\n";

scalar @ARGV == 2
    or die "USAGE: perl $0 <filename> <regex>\n";

open(my $fh, '<', $ARGV[0])
    or die "Unable to open file '$ARGV[0]' for reading: $!";

my $match = 0;
my $regex = qr/$ARGV[1]/;
my @lines = <$fh>;                      # read whole file

foreach (0 .. $#lines)
{
    if ($lines[$_] =~ /$regex/)
    {
        printf "Match found on line %d\n", ($_ + 1);
        $match = 1;
    }
}

print "No matches found\n" unless $match;
Example use: to find the main() function (if any) in file ‘run.c’, enter (I’m using a Windows command prompt):
>perl findinfile.pl run.c "int\s+main\s*\("
This works well (except that it doesn’t allow for embedded comments), provided the regex matches on a single line of text. However, some programmers code like this:
int
main(int argc, char** argv)
So, I can modify the script as follows:
...
my $text;
my $match = 0;
my $regex = qr/$ARGV[1]/;

{
    local $/;                           # enable "slurp" mode
    $text = <$fh>;                      # read whole file
}

while ($text =~ /$regex/gms)
{
    print "Match found\n";
    $match = 1;
}

print "No matches found\n" unless $match;
but now I’ve lost track of the line numbers.
I read somewhere that I could count occurrences of "\n" to calculate the line number of each match, but how would I identify the start- and end-points of each substring between successive matches? Or, is there a more straightforward approach that will retain line numbers while searching across multiple lines?
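For reference, here is a minimal sketch of the newline-counting idea, not a definitive implementation. It assumes the whole file is already in a single string, and it relies on Perl's built-in @- and @+ arrays, which hold the start and end offsets of the last successful match. The helper name match_line_numbers and the sample text are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns one
# [first_line, last_line] pair for each match of $regex in $text.
sub match_line_numbers {
    my ($text, $regex) = @_;
    my @found;
    my $line = 1;    # line number at offset $prev
    my $prev = 0;    # offset up to which newlines have been counted

    while ($text =~ /$regex/g) {
        my ($start, $end) = ($-[0], $+[0]);

        # Count only the newlines between the previous match start and
        # this one, so the text is not rescanned from the top each time.
        $line += substr($text, $prev, $start - $prev) =~ tr/\n//;
        $prev  = $start;

        # Newlines inside the match give the last line it touches.
        my $last = $line + (substr($text, $start, $end - $start) =~ tr/\n//);
        push @found, [$line, $last];
    }
    return @found;
}

# Usage with an assumed sample: "int" and "main(" are on separate lines.
my $sample = "/* demo */\nint\nmain(int argc, char** argv)\n{\n}\n";
for my $m (match_line_numbers($sample, qr/int\s+main\s*\(/)) {
    printf "Match found on lines %d-%d\n", @$m;    # prints "Match found on lines 2-3"
}
```

The count-only form of tr/\n// does not modify the string, so pos($text) is left alone and the /g loop continues normally. One caveat: flags such as /m and /s written on the outer match do not change a pattern that was already compiled with qr//, so if they are needed they have to go into the regex itself, for example as a leading (?ms).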
Thanks,
Athanasius <°(((>< contra mundum