How do I force Django to ignore any caches and reload data?


I'm using the Django database models from a process that's not called from an HTTP request. The process is supposed to poll for new data every few seconds and do some processing on it. I have a loop that sleeps for a few seconds and then gets all unhandled data from the database.

What I'm seeing is that after the first fetch, the process never sees any new data. I ran a few tests and it looks like Django is caching results, even though I'm building new QuerySets every time. To verify this, I did this from a Python shell:

>>> MyModel.objects.count()
# (Here I added some more data from another process.)
>>> MyModel.objects.count()
>>> MyModel.objects.update()
>>> MyModel.objects.count()

As you can see, adding new data doesn't change the result count. However, calling the manager's update() method seems to fix the problem.

I can't find any documentation on that update() method and have no idea what other bad things it might do.

My question is, why am I seeing this caching behavior, which contradicts what Django docs say? And how do I prevent it from happening?

7/27/2010 5:17:43 PM

Having had this problem and found two definitive solutions for it I thought it worth posting another answer.

This is a problem with MySQL's default transaction mode. Django opens a transaction at the start, which means that by default you won't see changes made in the database.

Demonstrate like this

Run a django shell in terminal 1

>>> MyModel.objects.get(id=1).my_field

And another in terminal 2

>>> MyModel.objects.get(id=1).my_field
>>> a = MyModel.objects.get(id=1)
>>> a.my_field = "NEW"
>>> MyModel.objects.get(id=1).my_field

Back to terminal 1 to demonstrate the problem - we still read the old value from the database.

>>> MyModel.objects.get(id=1).my_field

Now in terminal 1 demonstrate the solution

>>> from django.db import transaction
>>> @transaction.commit_manually
... def flush_transaction():
...     transaction.commit()
>>> MyModel.objects.get(id=1).my_field
>>> flush_transaction()
>>> MyModel.objects.get(id=1).my_field

The new data is now read

Here is that code in an easy to paste block with docstring

from django.db import transaction

def flush_transaction():
    Flush the current transaction so we don't read stale data

    Use in long running processes to make sure fresh data is read from
    the database.  This is a problem with MySQL and the default
    transaction mode.  You can fix it by setting
    "transaction-isolation = READ-COMMITTED" in my.cnf or by calling
    this function at the appropriate moment

The alternative solution is to change my.cnf for MySQL to change the default transaction mode

transaction-isolation = READ-COMMITTED

Note that that is a relatively new feature for Mysql and has some consequences for binary logging / slaving. You could also put this in the django connection preamble if you wanted.

Update 3 years later

Now that Django 1.6 has turned on autocommit in MySQL this is no longer a problem. The example above now works fine without the flush_transaction() code whether your MySQL is in REPEATABLE-READ (the default) or READ-COMMITTED transaction isolation mode.

What was happening in previous versions of Django which ran in non autocommit mode was that the first select statement opened a transaction. Since MySQL's default mode is REPEATABLE-READ this means that no updates to the database will be read by subsequent select statements - hence the need for the flush_transaction() code above which stops the transaction and starts a new one.

There are still reasons why you might want to use READ-COMMITTED transaction isolation though. If you were to put terminal 1 in a transaction and you wanted to see the writes from the terminal 2 you would need READ-COMMITTED.

The flush_transaction() code now produces a deprecation warning in Django 1.6 so I recommend you remove it.

11/5/2014 11:19:50 AM

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow