Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic

Re: Help on selecting and deleting many records in a DB.

by erix (Parson)
on Apr 05, 2014 at 16:23 UTC ( #1081261=note: print w/replies, xml ) Need Help??

in reply to Help on selecting and deleting many records in a DB.

Perl will simply run SQL, via DBI; it will not be faster than running the SQL directly (it will even be (a very little) slower). And you can run cursors in straight SQL too (well, in plpgsql); cursors have nothing especially to do with perl.

Perhaps perl gives you some more flexibility.

Which version of postgresql?

How large is the table (in GB and in rowcount)? And how large the part to be deleted?

One approach to data removal is to setup the data as a partitioned table (partitioned by date/timestamptz) which then lets you regularly drop the oldest table. See the fine manual (on that page, read also the partition caveats; mainly: don't use more than a few hundred partitions).

  • Comment on Re: Help on selecting and deleting many records in a DB.

Replies are listed 'Best First'.
Re^2: Help on selecting and deleting many records in a DB.
by neilwatson (Priest) on Apr 05, 2014 at 16:58 UTC
    PG 8.4. Millions of rows. Several GB. I wish to delete a significant fraction of the table. How much depends on the size of application deployment.

    Neil Watson

      PG 8.4.

      I hope you realise that 8.4 goes out of support in 3 months. It's decidedly long in the tooth (to use another equine saying).

      What I would do is use a recent postgres version (9.3.4 and even 9.4devel (which is almost in feature-freeze anyway)), fix any database problems on those, and then (if your app really needs it) work back to fix it on legacy versions (i.e.: 9.2.8, 9.1.13, 9.0.17). It's time to drop support for 8.4 (I see you even still mention 8.3 on your pages...).

      And RichardK is right: get yourself acqainted with EXPLAIN ANALYZE output (at least to the point where you can tell whether an index is used, and useful).

      if you can't get it sorted out with an index, I still think partitioning is the best way to solve your problem. I think all the talk of stored proceduring is a red herring.

      IRC freenode #postgresql is often full of people who relish this kind of problem, even if it is somewhat FAQ-ish (sometimes you have to be patient).

      Do let us know how you solve this. I'm interested. (If it wasn't 8.4 I'd probably have setup a little trial example...)

      update: changed order of paragraphs: first try appropriate indexing, if that fails, partition).

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1081261]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (5)
As of 2018-06-22 03:44 GMT
Find Nodes?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?

    Results (121 votes). Check out past polls.