I'm trying to speed up a script which finds all lines in a large (20 MB) file that contain a certain string. Because partial matches are allowed, I can't use an inverted word list approach for this. I've sped things up by using index() instead of a regex, but it still takes too long.
So far, the only idea I've had is to use read() to pull in 4 KB chunks, test each chunk with index(), and only parse a chunk line by line if it contains the string.
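
Here is a rough sketch of that chunked idea, untested against the real data. The sub name grep_chunked and its parameters are made up for illustration, and it assumes the search string never contains a newline. Each chunk is extended to the next newline so that a line (and a possible match) is never split across two chunks.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line of $path that contains $needle.
# Assumes $needle contains no newline.
sub grep_chunked {
    my ($path, $needle, $chunk_size) = @_;
    $chunk_size ||= 4096;    # default to 4 KB chunks
    my @hits;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $chunk;
    while (read($fh, $chunk, $chunk_size)) {
        # Extend the chunk to the next newline so that no line is
        # split across two chunks.
        if (substr($chunk, -1) ne "\n") {
            my $rest = <$fh>;
            $chunk .= $rest if defined $rest;
        }

        # Cheap pre-filter: skip chunks that cannot contain a match.
        next if index($chunk, $needle) < 0;

        # Only qualifying chunks are examined line by line.
        # split /^/ keeps the trailing newline on each line.
        for my $line (split /^/, $chunk) {
            push @hits, $line if index($line, $needle) >= 0;
        }
    }
    close $fh;
    return @hits;
}
```

It would be called as something like my @lines = grep_chunked($file, $string, 4096). Whether this actually beats a plain line-by-line index() loop probably depends on how many chunks contain a match, so it would need benchmarking on the real file.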
Does anyone have any other ideas? FYI, this script will be running under mod_perl.