http://www.perlmonks.org?node_id=79108


in reply to Using large result sets from a database

Tough problem.

You did not say which database you are using. Postgres will let you specify a limit and an offset to the query so you can reissue the query but get only the next set of results (assuming nothing has changed). I believe mysql has similar features. And you can do it in Oracle but it gets very ugly. So if you want to restrict yourself to standard SQL you are out of luck on the DB side.

On the server side you have more control, but it is still pretty ugly. Are you using any particular framework (mason, mod_perl, ...) on your server side? If you have persistent server side state you can get the whole dataset the first time, bung it into your state and then paginate it for the user when they ask for it. This avoids problems where the data changes underneath you since you will just page through a static view but your data can become stale. It is cheaper if you expect your users to look at most of the data, but if they are only going to look at a page or two, it gets expensive. Plus, you then have to clean up the state sometime, and deciding when is tricky.

What I do in this case is to reissue the query each time and loop through the results only storing the ones I care about:

my $first_row = 500; my $window_size = 50; my $location = 1; while (my ( $col1, $col2) = $sth->fetchrow_array() ) { next if $location < $first_row; last if $location >= $first_row + $window_size; # Do whatever I need to do with the data } continue { $location++; }
Which gets pricey if they plan on looking at all of the results, but I have hundreds of thousands to millions of results to show (in some cases). So I can't slurp them to my side and in most cases they only want to see a page or two.

-ben

Replies are listed 'Best First'.
Re: Re: Using large result sets from a database
by MrCromeDome (Deacon) on May 09, 2001 at 19:05 UTC
    I'm sorry for leaving out some of the important info:

    (Unfortunately) my DBMS is MS SQL Server 7.0. I don't like that, but unfortunately it's what I'm stuck with. It's hard being the UNIX guy (err, the voice of reason) in a Win32 shop. . . but anyway, I'm using DBI-ODBC to connect, and I have Apache 1.3.19 with mod_perl 1.25.

    Thanks for the info :)

      Something you can also try, but which is more cumbersome, is to track the key values for your table if you've got sequential or orderable keys, or alternatively add a counter in a subquery and select for the range within that subquery that you pass the server within the query.

      As you're using MS SQL Server, look into the SYBASE documentation since they are 90% the same product. My guess is that they have this already taken care of using LIMIT or an equivalent.

      --hackmare