Date: 2023-04-28
Django’s QuerySet.delete
is inherently racy. The race condition is due to the fact that Django
implements cascading deletes itself rather than letting the database
deal with it.
Take the following example:
    from django.db import models

    class OtherModel(models.Model):
        pass

    class MyModel(models.Model):
        other = models.ForeignKey(OtherModel, on_delete=models.CASCADE)

    other = OtherModel.objects.create()
    MyModel.objects.create(other=other)
    OtherModel.objects.all().delete()  # MyModel instance should be deleted

In the example above, we specified
on_delete=models.CASCADE for the foreign key. When the row
referenced by MyModel is deleted, we expect the
MyModel row to be deleted as well.
When you call QuerySet.delete,
Django first collects all related rows and deletes them before
deleting the rows you intend to delete. In between those two operations,
a concurrent transaction might create new related rows. The final delete
then fails with an error, since there are still rows referencing the
rows you intend to delete.
The error looks something like this (django.db.IntegrityError):
update or delete on table "myapp_othermodel" violates foreign key constraint "myapp_mymodel_fk" on table "myapp_mymodel"
DETAIL: Key (id)=(1) is still referenced from table "myapp_mymodel".
Django builds a tree of operations to execute prior to deleting the
rows you intend to delete. You can see how this works in django.db.models.deletion.Collector.
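The interleaving described above can be reproduced without Django. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database, with simplified table names (othermodel, mymodel) and a hypothetical function name. It is not what Django's Collector actually executes; it only mimics the two steps and places a "concurrent" insert between them on the same connection.

```python
import sqlite3


def simulate_two_step_delete():
    """Mimic collect-then-delete with an insert sneaking in between."""
    # isolation_level=None puts the connection in autocommit mode.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute("CREATE TABLE othermodel (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE mymodel ("
        "id INTEGER PRIMARY KEY, "
        "other_id INTEGER NOT NULL REFERENCES othermodel (id))"
    )
    conn.execute("INSERT INTO othermodel (id) VALUES (1)")
    conn.execute("INSERT INTO mymodel (other_id) VALUES (1)")

    # Step 1: collect the referencing rows and delete them first.
    related_ids = [
        row[0]
        for row in conn.execute("SELECT id FROM mymodel WHERE other_id = ?", (1,))
    ]
    conn.executemany(
        "DELETE FROM mymodel WHERE id = ?", [(pk,) for pk in related_ids]
    )

    # A concurrent writer creates a new referencing row at the wrong moment.
    conn.execute("INSERT INTO mymodel (other_id) VALUES (1)")

    # Step 2: delete the row we originally wanted to delete.
    try:
        conn.execute("DELETE FROM othermodel WHERE id = ?", (1,))
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


error = simulate_two_step_delete()
print(error)
```

The second step fails with an integrity error because the row inserted in between was never part of the collected set.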
In reality, this is rarely a big problem for applications:
Retry the delete. If your application doesn’t experience extraordinary levels of concurrency, it is often viable to just retry the transaction upon encountering the error. There’s a good chance it’ll succeed within a few retries.
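A retry can be as simple as a small helper. The sketch below is framework-agnostic so that it runs on its own; retry_on and its parameters are hypothetical names, not part of Django. In a Django project you would presumably pass django.db.IntegrityError as the exception type and a function that performs the QuerySet.delete inside transaction.atomic().

```python
import time


def retry_on(exc_types, func, attempts=3, delay=0.05):
    """Call func(), retrying when it raises one of exc_types.

    Re-raises the last exception once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exc_types:
            if attempt == attempts:
                raise
            # Back off a little longer after each failed attempt.
            time.sleep(delay * attempt)


# Usage with a stand-in operation that fails twice before succeeding.
calls = []


def flaky_delete():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("simulated integrity error")
    return "deleted"


outcome = retry_on(RuntimeError, flaky_delete, attempts=5, delay=0)
print(outcome, len(calls))
```

Catching only the specific exception type matters here: unrelated errors should still surface immediately instead of being retried.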
Django implements cascading deletes because at the time it was implemented, not all databases supported it properly themselves. In 2023, all major databases supported by Django can do cascading deletes natively and correctly.
Letting the database handle the cascade has the advantage that we can
rely on the database to acquire proper locks on the referenced tables to
avoid a race condition. DELETE is an atomic operation in
most databases and thus race-free.
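To illustrate what a database-level cascade looks like, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in, with simplified table names and a hypothetical function name. It says nothing about locking behaviour in PostgreSQL; it only shows that a single DELETE on the referenced table removes the referencing rows when the foreign key is declared with ON DELETE CASCADE.

```python
import sqlite3


def cascade_delete_demo():
    """Return the number of referencing rows left after one DELETE."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute("CREATE TABLE othermodel (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE mymodel ("
        "id INTEGER PRIMARY KEY, "
        "other_id INTEGER NOT NULL "
        "REFERENCES othermodel (id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO othermodel (id) VALUES (1)")
    conn.executemany("INSERT INTO mymodel (other_id) VALUES (?)", [(1,), (1,)])

    # One statement: the database removes the referencing rows itself.
    conn.execute("DELETE FROM othermodel WHERE id = ?", (1,))

    remaining = conn.execute("SELECT COUNT(*) FROM mymodel").fetchone()[0]
    conn.close()
    return remaining


print(cascade_delete_demo())
```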
If you are using PostgreSQL with Django, you could re-create your
foreign keys with ON DELETE CASCADE (or
ON DELETE SET NULL):
    from django.db import migrations

    class Migration(migrations.Migration):
        operations = [
            # Drop the existing constraint. It lives on the referencing
            # table (myapp_mymodel), not on the referenced one.
            migrations.RunSQL(
                sql="ALTER TABLE myapp_mymodel DROP CONSTRAINT myapp_mymodel_fk;",
                reverse_sql="ALTER TABLE myapp_mymodel ADD CONSTRAINT myapp_mymodel_fk FOREIGN KEY (other_id) REFERENCES myapp_othermodel (id);",
            ),
            # Re-create it so that the database performs the cascade.
            migrations.RunSQL(
                sql="ALTER TABLE myapp_mymodel ADD CONSTRAINT myapp_mymodel_fk FOREIGN KEY (other_id) REFERENCES myapp_othermodel (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED;",
                reverse_sql="ALTER TABLE myapp_mymodel DROP CONSTRAINT myapp_mymodel_fk;",
            ),
        ]

Once implemented, even if the race occurs during
QuerySet.delete, Postgres will natively delete the
referencing row, thus avoiding the race. A nice side effect of this
solution is that you could gain a bit of speed by doing the delete in a
single operation:
    from django.db import connection

    with connection.cursor() as cursor:
        cursor.execute("DELETE FROM myapp_othermodel WHERE id = %s", (1,))

This bypasses Django’s two-step cascading delete and lets the database handle the cascade natively.