http://www.perlmonks.org?node_id=345344


in reply to grepp -- Perl version of grep

Interesting, but unfortunately, a bit slow. grep rocks when it comes to speed:
#!/bin/sh

FILE=/home/abigail/Words/enable.lst
PATTERNS="
    abc
    sch[ae]
    .*a.*e.*i.*o.*u.*
    ^[[:xdigit:]]+$
    (a|b)(c|d)(e|f)
"
PROGRAMS=("/bin/grep -E" ./grepp)
OUTPUT=(/tmp/grep.out /tmp/grepp.out)

i=0
while [ $i -lt ${#PROGRAMS[*]} ]
do
    echo -n ${PROGRAMS[$i]}
    time for pat in $PATTERNS
    do
        ${PROGRAMS[$i]} $pat $FILE
    done > ${OUTPUT[$i]}
    i=$((i + 1))
done

i=1
while [ $i -lt ${#OUTPUT[*]} ]
do
    diff ${OUTPUT[0]} ${OUTPUT[$i]}
    i=$((i + 1))
done

rm -f ${OUTPUT[*]}

/bin/grep -E
real    0m0.070s
user    0m0.060s
sys     0m0.010s
./grepp
real    0m2.774s
user    0m2.610s
sys     0m0.030s

Abigail

Re: Re: grepp -- Perl version of grep
by jdporter (Paladin) on Apr 15, 2004 at 17:51 UTC
    Such a benchmark is only valid for tests which can be run by both implementations. Depending on what you're trying to do, slow is better than impossible.

    jdporter
    The 6th Rule of Perl Club is -- There is no Rule #6.

      Do you have common regular expressions you'd use 'grepp' for, and for which there's no 'grep' equivalent?

      Abigail

        Well, that bit in the pod about matching GB characters was based on personal experience. Access to Perl 5.8's character-set transliteration and Unicode-based character semantics for matches (not to mention the very handy "\p" Unicode-based character classes) was the main reason I had to write this tool in the first place. The other bells and whistles (handling compressed data, controlling the input record separator) were afterthoughts -- once I started using this thing on real (multi-language, multi-coded) text data, those other things were just handy and easy to add. But the Encode module was crucial.

        Maybe some people have tricks to search for a specific GB (or Big5, or Shift-JIS) character using plain-old "grep", but I couldn't figure out a way to make it trustworthy for such things, and doing it with Perl just made sense.
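
        One reason a byte-oriented search is hard to trust on double-byte encodings, as a minimal sketch: in GB2312 both bytes of a character fall in the same high range, so the trailing byte of one character followed by the leading byte of the next can look like a third character. The sample text and needle below are picked purely for illustration, and `byte_search` is a hypothetical helper, not part of grepp or grep.

```shell
#!/bin/sh
# Illustration only: sample bytes chosen by hand, helper name is hypothetical.

# byte_search NEEDLE TEXT: succeed if NEEDLE occurs in TEXT as a raw byte string.
byte_search() {
    printf '%s\n' "$2" | LC_ALL=C grep -F -q -e "$1"
}

# GB2312: the two-character text 啊啊 (each 啊 is bytes 0xB0 0xA1).
gb_text=$(printf '\260\241\260\241')
# GB2312: the character “ (bytes 0xA1 0xB0), which is NOT in the text.
gb_needle=$(printf '\241\260')

if byte_search "$gb_needle" "$gb_text"; then
    gb_result="false positive"   # bytes A1 B0 straddle the two characters
else
    gb_result="no match"
fi

# The same text and needle encoded as UTF-8 (啊 = E5 95 8A, “ = E2 80 9C).
utf8_text=$(printf '\345\225\212\345\225\212')
utf8_needle=$(printf '\342\200\234')

if byte_search "$utf8_needle" "$utf8_text"; then
    utf8_result="false positive"
else
    utf8_result="no match"
fi

echo "GB2312 bytes: $gb_result"
echo "UTF-8 bytes:  $utf8_result"
```

        The GB2312 search reports a hit for a character that never appears, while the UTF-8 search does not, because UTF-8 lead and continuation bytes cannot be confused with each other. Decoding to characters before matching, which is what Encode makes possible in Perl, avoids the problem regardless of the input encoding.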

Re: Re: grepp -- Perl version of grep
by Anonymous Monk on Apr 15, 2004 at 19:52 UTC
    Agreed. grep does not use regex when matching against simple patterns. Most grep implementations use a skip-search algorithm a la Boyer-Moore for simple patterns.
      "grep does not use regex when matching against simple patterns."
      And neither does Perl. Unless you want to call '(a|b)(c|d)(e|f)' "simple" - this is a pattern where grep wins big time over Perl.

      Abigail
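
      For readers who want to try the pattern under discussion: `(a|b)(c|d)(e|f)` only alternates between single characters, so it selects the same lines as the bracket-expression form `[ab][cd][ef]`. Below is a small sketch with a made-up word list (not taken from enable.lst). It checks equivalence only and says nothing about the relative speed of either form in grep or Perl.

```shell
#!/bin/sh
# Hypothetical sample words, one per line; not from the benchmark's word list.
words='ace
bdf
adf
xyz
abc
place'

# Lines matched by the alternation form used in the benchmark.
alt_matches=$(printf '%s\n' "$words" | grep -E '(a|b)(c|d)(e|f)')

# Lines matched by the equivalent bracket-expression form.
set_matches=$(printf '%s\n' "$words" | grep -E '[ab][cd][ef]')

if [ "$alt_matches" = "$set_matches" ]; then
    echo "both forms select the same lines:"
    printf '%s\n' "$alt_matches"
else
    echo "the two forms differ"
fi
```

      Swapping one form for the other in the PATTERNS list of the benchmark above would be one way to see whether the alternation itself accounts for the gap.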