Django’s delete is racy and how to fix it

Author: Swen Kooij

Date: 2023-04-28

Django’s QuerySet.delete is inherently racy. The race condition is due to the fact that Django implements cascading deletes itself rather than letting the database deal with it.

Understanding the race

Take the following example:

class OtherModel(models.Model):
    pass

class MyModel(models.Model):
    other = models.ForeignKey(OtherModel, on_delete=models.CASCADE)

other = OtherModel.objects.create()
MyModel.objects.create(other=other)

OtherModel.objects.all().delete() # MyModel instance should be deleted

In the example above, we specified on_delete=models.CASCADE for the foreign key. When the row referenced by MyModel is deleted, we expect the MyModel row to be deleted as well.

When you call QuerySet.delete, Django first collects all related rows and deletes them, and only then deletes the rows you intended to delete. In between those two operations, another transaction might insert new related rows. Django never collected those rows, so they still reference the rows you intend to delete, and the final delete fails with a foreign key violation.

The error looks something like this (django.db.IntegrityError):

update or delete on table "myapp_othermodel" violates foreign key constraint "myapp_mymodel_fk" on table "myapp_mymodel"
DETAIL:  Key (id)=(1) is still referenced from table "myapp_mymodel".

Django builds a tree of operations to execute prior to deleting the rows you intend to delete. You can see how this works in django.db.models.deletion.Collector.
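The collect-then-delete sequence can be illustrated with a toy in-memory model. This is only a sketch of the interleaving, not Django's actual Collector: ToyDatabase, ForeignKeyViolation, two_step_delete and the between_steps hook are all hypothetical names invented for the illustration, and the hook stands in for a concurrent session inserting a row.

```python
class ForeignKeyViolation(Exception):
    """Stand-in for django.db.IntegrityError in this toy model."""


class ToyDatabase:
    """Two in-memory 'tables' with a foreign key from children to parents."""

    def __init__(self):
        self.parents = set()   # ids of OtherModel rows
        self.children = {}     # MyModel id -> referenced OtherModel id

    def insert_child(self, child_id, parent_id):
        if parent_id not in self.parents:
            raise ForeignKeyViolation(f"parent {parent_id} does not exist")
        self.children[child_id] = parent_id

    def delete_parent(self, parent_id):
        # Like a foreign key without ON DELETE CASCADE: refuse while referenced.
        if parent_id in self.children.values():
            raise ForeignKeyViolation(f"Key (id)=({parent_id}) is still referenced")
        self.parents.discard(parent_id)


def two_step_delete(db, parent_id, between_steps=None):
    # Step 1: collect the related rows that exist right now.
    collected = [cid for cid, pid in db.children.items() if pid == parent_id]
    # Step 2: delete the collected rows.
    for cid in collected:
        del db.children[cid]
    # The race window: another session may insert a new related row here.
    if between_steps is not None:
        between_steps()
    # Step 3: delete the row we actually wanted to delete.
    db.delete_parent(parent_id)


db = ToyDatabase()
db.parents.add(1)
db.insert_child(10, 1)

try:
    # Simulate a concurrent insert landing between step 2 and step 3.
    two_step_delete(db, 1, between_steps=lambda: db.insert_child(11, 1))
    outcome = "deleted"
except ForeignKeyViolation as exc:
    outcome = f"failed: {exc}"
print(outcome)
```

The row inserted during the window was never collected, so the final step fails. Calling two_step_delete(db, 1) again without a concurrent insert succeeds, which is why a plain retry usually works.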

Impact

In reality, this is rarely a big problem for applications.

Fixing it

Easy

Retry the delete. If your application doesn’t experience extraordinary levels of concurrency, it is often viable to just retry the transaction upon encountering the error. There’s a good chance it’ll succeed within a few retries.
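A retry can be as small as the helper below. This is a minimal sketch rather than an established Django API: the retry function and its parameters (attempts, delay) are hypothetical, and the Django usage in the trailing comment assumes the models from the example above.

```python
import time


def retry(operation, retry_on, attempts=3, delay=0.0):
    """Call operation(); if it raises retry_on, try again up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)  # optional pause before the next attempt


# Hypothetical usage with Django (not executed here):
#
#   from django.db import IntegrityError, transaction
#
#   def delete_all():
#       # A fresh atomic block per attempt, so a failed attempt is rolled back.
#       with transaction.atomic():
#           return OtherModel.objects.all().delete()
#
#   retry(delete_all, retry_on=IntegrityError)
```

Wrapping each attempt in its own transaction.atomic block matters: after an IntegrityError the surrounding transaction is unusable until it has been rolled back.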

Hard

Django implements cascading deletes because at the time it was implemented, not all databases supported it properly themselves. In 2023, all major databases supported by Django can do cascading deletes natively and correctly.

Letting the database handle the cascade has the advantage that we can rely on the database to acquire proper locks on the referenced tables to avoid a race condition. DELETE is an atomic operation in most databases and thus race-free.

If you are using PostgreSQL with Django, you could re-create your foreign keys with ON DELETE CASCADE (or ON DELETE SET NULL):

from django.db import migrations

class Migration(migrations.Migration):
    # The foreign key constraint lives on the referencing table
    # (myapp_mymodel), so that is the table to alter.
    operations = [
        # Drop the existing constraint, which has no ON DELETE action.
        migrations.RunSQL(
            sql="ALTER TABLE myapp_mymodel DROP CONSTRAINT myapp_mymodel_fk;",
            reverse_sql="ALTER TABLE myapp_mymodel ADD CONSTRAINT myapp_mymodel_fk FOREIGN KEY (other_id) REFERENCES myapp_othermodel (id) DEFERRABLE INITIALLY DEFERRED;",
        ),
        # Re-create it so that PostgreSQL performs the cascade itself.
        migrations.RunSQL(
            sql="ALTER TABLE myapp_mymodel ADD CONSTRAINT myapp_mymodel_fk FOREIGN KEY (other_id) REFERENCES myapp_othermodel (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED;",
            reverse_sql="ALTER TABLE myapp_mymodel DROP CONSTRAINT myapp_mymodel_fk;",
        ),
    ]
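If several foreign keys need the same treatment, the SQL could be generated instead of written by hand. The helper below is a hypothetical sketch (cascade_fk_sql is not part of Django). It assumes PostgreSQL syntax and trusted identifier names, since it interpolates them without quoting, and it returns lists of statements, which migrations.RunSQL accepts for both sql and reverse_sql.

```python
def cascade_fk_sql(table, column, ref_table, constraint, ref_column="id"):
    """Return (forward, reverse) lists of SQL statements for one foreign key.

    `table` is the referencing table that owns the constraint. Names are
    interpolated as-is, so only pass identifiers you control.
    """
    drop = f"ALTER TABLE {table} DROP CONSTRAINT {constraint};"

    def add(on_delete):
        return (
            f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
            f"FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column})"
            f"{on_delete} DEFERRABLE INITIALLY DEFERRED;"
        )

    forward = [drop, add(" ON DELETE CASCADE")]  # database-level cascade
    reverse = [drop, add("")]                    # back to a plain foreign key
    return forward, reverse


forward, reverse = cascade_fk_sql(
    table="myapp_mymodel",
    column="other_id",
    ref_table="myapp_othermodel",
    constraint="myapp_mymodel_fk",
)

# Hypothetical usage inside a migration (not executed here):
#   operations = [migrations.RunSQL(sql=forward, reverse_sql=reverse)]
```

The constraint name is passed in explicitly because the name used here is illustrative; check the actual constraint name in your database before running anything like this.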

Once implemented, even if the race occurs during QuerySet.delete, Postgres will natively delete any referencing rows that Django did not collect, so the delete no longer fails. A nice side effect of this solution is that you could gain a bit of speed by doing the delete in a single operation:

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("DELETE FROM myapp_othermodel WHERE id = %s", (1,))

This bypasses Django’s two-step cascading delete and lets the database handle the cascade natively.