Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

Re: Help on selecting and deleting many records in a DB.

by sundialsvc4 (Abbot)
on Apr 06, 2014 at 11:47 UTC ( #1081302=note: print w/ replies, xml ) Need Help??

in reply to Help on selecting and deleting many records in a DB.

If you have “millions of rows,” the last thing you want to do is to send those millions of rows down a pipe to another computer, or even to another program in the same one.   You will want to do this in a stored procedure that is executed by the server, on the server.   Furthermore, you will probably want this stored procedure to execute a loop which runs through each record in a query such as SELECT class, ip_address, hostname, MAX(timestamp) .. GROUP BY .., and which then, for each record found (each one representing a single group and providing the timestamp that should be kept):   starts a transaction, deletes the qualifying records (using a single DELETE), and then commits.

However, to do that efficiently, you will need to create an index on (class ip_address, hostname), and I would do that first.   You may well find that, when you have done this, a pure-SQL solution using a NOT IN query will now yield acceptable performance.   (Although it may be too phat to run as a single transaction.)   The absence of such an index is probably why your existing approach is running too slow ... and, why every other one would run too slow, as well.

Comment on Re: Help on selecting and deleting many records in a DB.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1081302]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (6)
As of 2016-05-29 13:13 GMT
Find Nodes?
    Voting Booth?