Re: pattern match -vs- *ix grep
by Limbic~Region (Chancellor) on Apr 07, 2005 at 13:40 UTC
|
ministry,
You might want to take a look at PPT::Util's tcgrep. In general, a compiled command-line tool built for a specific task is going to be faster than Perl, which is designed to be flexible. Depending on your specific needs, it is possible to make Perl win. For instance, if you only care whether the search string is present in the file, you can abort as soon as it is found. You can also use a sliding window buffer so that less disk I/O is involved.
I did this very thing shortly after joining the Monastery nearly 3 years ago. My cow orkers were impressed and my code replaced the shell scripts in production.
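The early-exit idea mentioned above can be sketched roughly as follows. This is only an illustration, not the code that went into production: `file_contains` is a hypothetical helper name, and the temporary file exists only so the sketch runs on its own. It is similar in spirit to `grep -q`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the thread): returns 1 as soon as any line
# of the file matches, without reading the rest of the file.
sub file_contains {
    my ($path, $regex) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    while (my $line = <$fh>) {
        if ($line =~ $regex) {
            close $fh;
            return 1;    # stop at the first hit
        }
    }
    close $fh;
    return 0;
}

# Demo on a temporary file so the sketch is self-contained.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh;

print file_contains($tmp_path, qr/beta/)  ? "found\n" : "not found\n";
print file_contains($tmp_path, qr/delta/) ? "found\n" : "not found\n";
```

The saving depends entirely on where the first match sits: a hit near the top of a large file skips most of the I/O, while a file with no match is still read in full.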
Re: pattern match -vs- *ix grep
by Taulmarill (Deacon) on Apr 07, 2005 at 13:31 UTC
|
Use it directly from the command line:
perl -pe'$_="" unless /pattern/' file
but I don't think it will be faster than grep.
Re: pattern match -vs- *ix grep
by tlm (Prior) on Apr 07, 2005 at 14:01 UTC
|
#!/usr/bin/env perl
use strict;
use warnings;
die "Usage: blah blah\n" unless @ARGV;
my $regex = qr/@{[shift]}/;
/$regex/ && print while <>;
It certainly won't beat /bin/grep in speed, but you can give it cooler regexps. Note that this version takes any number of files as input. And it differs from /bin/grep in one important point: it returns a 0 status even when it finds no matches.
Also watch out for regexp characters that have special meaning to the shell.
Update: fixed stray -w in first line.
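The exit-status difference noted above could be narrowed by counting matches and deriving the status from the count. A minimal sketch, assuming the grep convention of status 0 when at least one line matched and 1 when none did; `print_matches` and `exit_status_for` are hypothetical names, and the in-memory filehandle stands in for real input files.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: prints matching lines from $in to $out and
# returns how many lines matched.
sub print_matches {
    my ($regex, $in, $out) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ $regex;
        print {$out} $line;
        $count++;
    }
    return $count;
}

# grep convention: 0 if at least one line matched, 1 if none did.
sub exit_status_for {
    my ($count) = @_;
    return $count > 0 ? 0 : 1;
}

# Demo on an in-memory filehandle so the sketch runs on its own.
my $text = "foo\nbar\nfoobar\n";
open my $in, '<', \$text or die "Cannot open in-memory handle: $!\n";
my $matched = print_matches(qr/foo/, $in, \*STDOUT);
close $in;
print "status would be ", exit_status_for($matched), "\n";
```

In a real script the last line would become `exit exit_status_for($matched);` so that callers such as shell `if` statements see the result.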
Re: pattern match -vs- *ix grep
by VSarkiss (Monsignor) on Apr 07, 2005 at 14:07 UTC
|
Can anyone tell me of a better way to do this?
Yes. Check your arguments. This little snippet could cause havoc if called with
$ snippet.pl '| rm -rf *' 'ouch'
Take a look at Two-arg open() considered dangerous for details.
My point is that specialized utilities that have been around for a long time, like grep, have more than speed going for them. In general, they'll handle edge cases better.
If you want to demonstrate "the power of Perl", remember that power can be used for good or evil. :-)
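The risk described above comes from two-argument open, where a "filename" beginning with `|` is treated as a command to pipe to. A minimal sketch of the usual remedy, three-argument open, follows. The original snippet is not reproduced here, so `grep_file` is a hypothetical stand-in for it, and the hostile argument uses a harmless `echo` rather than `rm`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: with three-argument open, $path is treated purely
# as a filename, so shell metacharacters in it have no special meaning.
sub grep_file {
    my ($regex, $path) = @_;
    open my $fh, '<', $path or do {
        warn "Cannot open '$path': $!\n";
        return;    # empty list: nothing matched, nothing executed
    };
    my @hits = grep { $_ =~ $regex } <$fh>;
    close $fh;
    return @hits;
}

# The hostile "filename" is simply a file that does not exist; no command runs.
my @hits = grep_file(qr/ouch/, '| echo this would have run');
print "hits: ", scalar(@hits), "\n";
```

This only closes the hole in code that opens files itself. The `<>` operator also uses two-argument semantics, so one-liners built on `-n` or `-p` deserve the same care with untrusted arguments.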
|
Silliness. Anyone typing in
$ snippet.pl '| rm -rf *' 'ouch'
could as well have typed
$ rm -rf *
No point in checking the arguments.
|
$ grep '| rm -rf *' 'ouch'
would not have done anything unexpected at all.
If you want to demonstrate that a simple Perl program is better or faster than a long-standing utility, you need to put more thought into what that utility does. Otherwise the comparison is unfair or meaningless.
Re: pattern match -vs- *ix grep
by Anonymous Monk on Apr 07, 2005 at 14:25 UTC
|
Actually, I would expect grep to be faster than Perl all the time. Grep is a special-purpose tool, accepting simpler regexes than Perl is able to handle. You can do more with Perl than you can with grep, but, IMO, a match between grep and Perl isn't going to be the best way to "win people over". The best you can hope for is that Perl isn't much slower.
Having said that, I would just write it as:
perl -ne 'BEGIN {$p = shift} print if /$p/' PATTERN files ...
But that's still significantly longer than:
grep PATTERN files ...
Re: pattern match -vs- *ix grep
by dave_the_m (Monsignor) on Apr 07, 2005 at 16:10 UTC
|
Adding an 'o' to the end of the regexp avoids the pattern being recompiled each time round the loop, i.e.
if (/$pattern/o) {
Dave.
|
Ah yes, silly me. It bypasses the calls to the gvsv and regcomp ops though, so there's still a marginal saving.
Dave.
Re: pattern match -vs- *ix grep
by cazz (Pilgrim) on Apr 07, 2005 at 14:21 UTC
|
If you are dead set on using Perl regular expressions, you might also want to take a look at pcregrep. Same syntax, supports most of the features you probably want out of a regex, but with a LOT less overhead.
Re: pattern match -vs- *ix grep
by QM (Parson) on Apr 07, 2005 at 16:01 UTC
|
Re: pattern match -vs- *ix grep
by tlm (Prior) on Apr 07, 2005 at 14:24 UTC
|
perl -wsne '/$r/ && print' -- -r='your regexp here' *.txt