Django 1.1.2, MySQL 5.1
    Blob.objects.filter(foo=foo) \
        .filter(status=Blob.PLEASE_DELETE) \
        .delete()
This snippet results in the ORM first generating a
    SELECT * from xxx_blob where ...
query, then doing a
    DELETE from xxx_blob where id in (BLAH);
where BLAH is a ridiculously long list of ids. Since I'm deleting a large number of blobs, this makes both me and the DB very unhappy.
Is there a reason for this? I don't see why the ORM can't convert the above snippet into a single DELETE query. Is there a way to optimize this without resorting to raw SQL?
Not without writing your own custom SQL or a custom manager; the Django developers are apparently working on it, though.
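For illustration, here is a rough sketch of the "custom SQL" route: one DELETE statement with the filter in its WHERE clause, so no id list is ever built. It uses the standard-library sqlite3 module as a stand-in for MySQL so it can run anywhere. The column names foo_id and status, the PLEASE_DELETE value, and the delete_blobs helper are assumptions made up for the example; only the table name xxx_blob comes from the question.

```python
import sqlite3

PLEASE_DELETE = 2  # hypothetical value standing in for Blob.PLEASE_DELETE


def delete_blobs(conn, foo_id):
    """Remove matching rows with one DELETE; return the number of rows deleted."""
    cur = conn.execute(
        "DELETE FROM xxx_blob WHERE foo_id = ? AND status = ?",
        (foo_id, PLEASE_DELETE),
    )
    conn.commit()
    return cur.rowcount


# Tiny in-memory table to exercise the helper (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xxx_blob (id INTEGER PRIMARY KEY, foo_id INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO xxx_blob (foo_id, status) VALUES (?, ?)",
    [(1, PLEASE_DELETE), (1, PLEASE_DELETE), (1, 0), (2, PLEASE_DELETE)],
)

deleted = delete_blobs(conn, 1)
remaining = conn.execute("SELECT COUNT(*) FROM xxx_blob").fetchone()[0]
```

Inside a Django project the same statement would go through a cursor from django.db.connection, with %s placeholders instead of ?. Keep in mind that a statement like this bypasses the ORM entirely, so nothing is cascaded and no signals are sent.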
For those who are still looking for an efficient way to bulk delete in Django, here's a possible solution:
The reason delete() may be so slow is twofold: 1) Django has to make sure cascading deletes work properly, so it looks for foreign key references to your models; 2) Django has to send the pre- and post-delete signals for your models.
If you know your models have no cascading deletes and no signals to handle, you can speed this up with the private API _raw_delete, calling it on the queryset with the database alias as its argument: queryset._raw_delete(queryset.db).
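A minimal sketch of that call, wrapped in a small helper. The name raw_bulk_delete is made up for this example; _raw_delete itself is a private, undocumented QuerySet method, so check that your Django release has it and expect that it may change without notice.

```python
def raw_bulk_delete(queryset):
    """Delete every row matched by `queryset` with a single DELETE statement.

    Skips cascade collection and the pre_delete/post_delete signals, so it is
    only safe when nothing depends on either.
    """
    # _raw_delete takes the database alias to run against; queryset.db is the
    # alias this queryset would use for its own query.
    return queryset._raw_delete(queryset.db)


# Usage with the model from the question (needs a configured Django project):
# raw_bulk_delete(Blob.objects.filter(foo=foo, status=Blob.PLEASE_DELETE))
```

The helper just passes back whatever _raw_delete returns, which has not been the same in every Django version, so don't rely on it as a row count without checking.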
More details here. Please note that Django already tries to handle these cases well, but the raw delete is, in many situations, much more efficient.