Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re^2: Help on selecting and deleting many records in a DB.

by neilwatson (Curate)
on Apr 05, 2014 at 17:02 UTC ( #1081264=note: print w/ replies, xml ) Need Help??


in reply to Re: Help on selecting and deleting many records in a DB.
in thread Help on selecting and deleting many records in a DB.

Since a unique record is based on a combination of most fields the current indexes do not work well for this operation. Would adding another field, that is a concatenation of the other fields, and indexing that be a good approach?

Neil Watson
watson-wilson.ca


Comment on Re^2: Help on selecting and deleting many records in a DB.
Replies are listed 'Best First'.
Re^3: Help on selecting and deleting many records in a DB.
by vincent_veyron (Sexton) on Apr 06, 2014 at 16:38 UTC

    No, you need to create an index that matches your query terms. See: http://www.postgresql.org/docs/9.1/static/indexes.html

    You can try this :

    delete from agent_log t1 where WHERE timestamp < now() - interval '7 days') and timespan < (select max(timestamp) from agent_log t2 where t2.class=t1.class and t2.hostname=t1.hostname and t2.id_address=t1.ip_address)

    I suppose you have an index on timestamp and the other fields? A surrogate key might help, or even using timespan as primary key, if possible.

    If you can create a history table, then you could:

    SELECT class, max(timestamp) as timestamp, hostname, ip_address, promise_handle, promiser, promisee, policy_server, promise_outcome FROM agent_log WHERE timestamp < now() - interval '7 days' into history_table ; delete from agent_log WHERE timestamp < now() - interval '7 days'

    That would be a good start towards partitioning your data, which you might need to consider.



    libremen.com : legal case, contract and insurance claim management software
      SELECT * FROM agent_log t1 WHERE timestamp < now() - interval '7 days' AND timestamp < ( SELECT max(timestamp) FROM agent_log t2 WHERE t2.class=t1.class AND t2.promiser=t1.promiser AND t2.hostname=t1.hostname AND t2.ip_address=t1.ip_address ) ORDER by class,timestamp;
      You almost had me there, but I don't think this will keep the maximum timestamp per day.

      Neil Watson
      watson-wilson.ca

        Well, it does if you run it every day and clean up afterwards, with a cron job for instance.

        If possible, it seems you could solve a big part of your problem using partitioning of your table, probably by day; see:
        http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html



        libremen.com : legal case, contract and insurance claim management software

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1081264]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (14)
As of 2015-07-28 13:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (254 votes), past polls