PerlMonks  

(ichimunki) Re: All-Perl search engine having speed issues

by ichimunki (Priest)
on Nov 19, 2001 at 23:42 UTC


in reply to All-Perl search engine having speed issues

First, some suggestions unrelated to the benchmark issue:

Use strict. In any code longer than one line it will save you countless hours hunting little bugs. It also enforces more careful scoping, and it looks like you're having some "fun" with that piece, too: you shouldn't have to undef a list that is about to go out of scope.
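To illustrate the scoping point, here is a minimal sketch. The sub name count_words is made up for the example and is not taken from your script:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): counts the
# whitespace-separated words in a line of text.
sub count_words {
    my ($line) = @_;
    my @words = split ' ', $line;   # @words is lexical to this sub
    return scalar @words;
    # No "undef @words" is needed: the array is released when the sub returns.
}

print count_words("all perl search engine"), "\n";
```

With strict on, a misspelled variable name becomes a compile-time error instead of a silent new global.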

Use CGI.pm (it may be a little heavy, but it provides good form handling)

Use taint mode (you never know what's going to sneak in on a POST).
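As a rough sketch of what taint mode asks of you: under -T, form data has to pass through a regex capture before it can be used for anything risky. The helper name untaint_keyword and the 64-character limit are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the keyword if it consists only of word
# characters, or undef otherwise. Under "perl -T", the captured $1 is
# considered untainted, while the raw input stays tainted.
sub untaint_keyword {
    my ($raw) = @_;
    return unless defined $raw;
    if ($raw =~ /\A(\w{1,64})\z/) {   # the length limit is an arbitrary assumption
        return $1;
    }
    return;
}

for my $input ("perl", "rm -rf /") {
    my $clean = untaint_keyword($input);
    print defined $clean ? "ok: $clean\n" : "rejected: $input\n";
}
```

The pattern should be as strict as the search can tolerate; a catch-all like /(.*)/ would defeat the purpose.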

Your loops will be more readable if you iterate over an explicit list rather than using the C-like syntax. We can all figure it out, but that costs brain time on the reader's end, where it is least desirable.
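For comparison, here are the two loop styles side by side. The @keywords array is a stand-in, not a variable from your code:

```perl
use strict;
use warnings;

my @keywords = qw(perl search engine);   # stand-in data for the example

# C-style: the reader has to check the start value, the bound, and the step.
my @c_style;
for (my $i = 0; $i < @keywords; $i++) {
    push @c_style, uc $keywords[$i];
}

# List-style: the loop says what it walks over, with no index bookkeeping.
my @list_style;
for my $word (@keywords) {
    push @list_style, uc $word;
}

print "@list_style\n";
```

Both loops build the same list; the second simply leaves less room for off-by-one mistakes.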

Have you tested your split against strings containing more than one \W character in a row? Two spaces in between keywords is going to slow your search down for no apparent reason (and may even cause other problems).
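A quick way to check is something like the following, assuming the split pattern is a single \W (I am guessing at your pattern here, and the query string is invented):

```perl
use strict;
use warnings;

my $query = "perl  search, engine";   # two spaces, then a comma plus a space

# Splitting on a single \W yields an empty field between adjacent separators.
my @naive = split /\W/, $query;

# Splitting on runs of \W avoids that; grep drops the empty leading
# field that appears if the string starts with a separator.
my @safer = grep { length } split /\W+/, $query;

print scalar(@naive), " fields with /\\W/\n";
print scalar(@safer), " fields with /\\W+/\n";
```

Each empty field in the first version would be looked up as if it were a keyword, which is wasted work at best.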

Finally, with respect to your algorithm, you can't get away from the performance hit the way you have this built. sysseek may be very efficient, but doing it twice will take roughly twice as long (depending on where in the file the words fall). It looks like you're already using your filesystem to the best advantage by sorting/indexing the search files by initial letter groupings, etc., but again, if you do something twice it takes about twice as long.

Re: (ichimunki) Re: All-Perl search engine having speed issues
by tstock (Curate) on Nov 20, 2001 at 05:57 UTC
    The old "use strict, CGI.pm and taint" reply... you forgot "warnings" ichimunki :)

    Tiago
      Someone had to say it! ;)

      At least I commented on something other than just those old saws. And I did remember -w, but only a while after I made the post... thanks for reminding us!
