Over the weekend I realized that deleting 80% of the records in a table with 87 million rows is not quite as easy as issuing a DELETE statement with a WHERE clause.

The main reason is that the transaction log grows quickly, because it has to keep track of every uncommitted row being deleted. In my case the data file was 60GB, and the log file ballooned from a mere 40MB to 32GB before I ran out of disk space and the database went into recovery mode.

Browsing through blogs, I narrowed it down to two options:

  1. Move the records you need to keep over to a temporary table and truncate the table you want to delete from (sketched below).
  2. Issue the deletes in batches so that the log file doesn’t fill up.
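
For reference, the first option boils down to something like the following. This is only a sketch of one variant: KeepRows is a made-up name for the holding table, it assumes the rows worth keeping are the ones where TradeDate is not null, and it glosses over the foreign keys you would have to drop and recreate (renaming the holding table into place instead of copying back is the other common variant).

-- Copy the rows worth keeping into a holding table
SELECT *
INTO   KeepRows
FROM   Transactions
WHERE  TradeDate IS NOT NULL

-- TRUNCATE is minimally logged, so the log barely grows
TRUNCATE TABLE Transactions

-- Move the rows back and clean up
INSERT INTO Transactions SELECT * FROM KeepRows
DROP TABLE KeepRows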

The first option seems to take less time to execute, but I didn't want to go through the effort of dropping and re-adding foreign keys and renaming tables, so I went with the second option. Here's the script I used for SQL Server, which deletes and commits 10,000 rows at a time.

DECLARE @continue INT
DECLARE @rowcount INT

SET @continue = 1
WHILE @continue = 1
BEGIN
    PRINT GETDATE()
    -- Limit the next DELETE to 10,000 rows
    SET ROWCOUNT 10000
    BEGIN TRANSACTION
    DELETE FROM Transactions WHERE TradeDate IS NULL
    SET @rowcount = @@ROWCOUNT
    COMMIT
    PRINT GETDATE()
    -- No rows left to delete, so stop looping
    IF @rowcount = 0
    BEGIN
        SET @continue = 0
    END
END
-- Put the session back to its default of unlimited rows
SET ROWCOUNT 0

The PRINT statements show how things are moving along:

(10000 row(s) affected)
Jan  1 2009 11:54PM
Jan  1 2009 11:54PM

(10000 row(s) affected)
Jan  1 2009 11:55PM
Jan  1 2009 11:55PM

Seven hours later the script is still running, with 39 million rows deleted so far and the log file currently at 700MB.
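
If you want to keep an eye on the log while a script like this runs, a quick check along these lines works (assuming SQL Server 2005 or later for sys.database_files):

-- Log size and percent used for every database on the instance
DBCC SQLPERF(LOGSPACE)

-- Data and log file sizes for the current database (size is in 8KB pages)
SELECT name, type_desc, size * 8.0 / 1024 AS size_mb
FROM sys.database_files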
