in reply to
Database searches with persistent results via CGI
Good afternoon, MrCromeDome. It seems you are having the same sort of problem as do a lot of search engines... needing to save cycles on many repeat queries. There are a lot of approaches to it, and it really is going to depend on the result set size, what hardware you have it on, etc. I'm going to assume that you have mySql as a database engine, as these tips will hold true for many many DBMs out there.
- First off, consider this, what is the upper limit size on your result set. Are you looking at 100s, 1000s, more? Because at a certain point, these caching mechanisms are not really going to help you. You'll have to do some performance tweaking to see what size category the bulk your results end up.
- Secondly, consider how often your data changes? This is essential in determining when your cache should drop dead, and the next search should store new data. If you have rapidly changing data (a good amount entered new each day), consider a cached time of perhaps 4 hours, or do it on off cycles, such as overnight.
In mySQL, whenever you request a large result set that has an "ORDER BY" statement in it, mySQL has to at least analyze the entire result set, then apply the limiter to it, to get the correct results. This can be quite tedious on the server, as you often pass over tons of results that you never pass back to the app. One way of caching is to get and hold all of the results that mySQL will come back to you with, and store them someplace fast to look them up in, such as another table.
A proposed table would take the form:
- result_id = primary key int not null
- dropdead = datetime, a future time in which to kill the result
- searchstring = char(255) of search results.
- resultcache = blob / text (or largeblob, etc)
Basically, you'd store the "SMITH JOHN" in the search string, and then form your result set in the resultcache, in some way that
Storage caching mechanisms can take two forms.
- Binary column type, Storable, and freeze / thaw This method compresses the best, uncompresses fast, but is binary, and may be harder to debug.
- A text data type (see the blob link above), Data::Dumper, and Data::Dumper->Dump / eval to store and then revive the data. This approach will store fewer results, is not as fast other approaches, but is easier to debug.
If you take a result set, and store it as an array, you can thaw (or eval), the entire result set and jump to a certain spot in the set. To look at this in psuedoperl:
foreach my $result in ($search_result)
push @list_of_cached_results, result;
my $binary_cache = nfreeze(\@list_of_cached_results);
#store lists of binary results
@list_of_cached_results = thaw($indatabase);
Basically, if you use the perl persistance modules, and a database (such as the one you're already reading from), you can pick up a good amount of speed from picking up a pre-packaged set of results (a match to your search string), and then display it to the page, saving you the hell of finding all the results again. As long as you can manage results and not have them go "stale", then you'll be all set.
Hope this made some sense to you. If I don't seem clear, I'd be happy to clarify anything in here.