I am trying to iterate through this loop:
for doc in coll.find()
I get the following error somewhere after the 100,000th record:
File "build\bdist.win32\egg\pymongo\cursor.py", line 703, in next
File "build\bdist.win32\egg\pymongo\cursor.py", line 679, in _refresh
File "build\bdist.win32\egg\pymongo\cursor.py", line 628, in __send_message
File "build\bdist.win32\egg\pymongo\helpers.py", line 95, in _unpack_response
pymongo.errors.OperationFailure: cursor id '1236484850793' not valid at server
What does this error mean?
Maybe your cursor timed out on the server. To see if this is the problem, try setting timeout=False:
for doc in coll.find(timeout=False)
If it was a timeout problem, one possible solution is to set the batch_size (see other answers).
timeout=False is dangerous and should never be used, because the cursor can remain open on the server for an unlimited time, which will affect system performance. The docs specifically reference the need to manually close the cursor.
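If you do end up disabling the timeout anyway, the manual close the docs ask for could look roughly like the sketch below. The names process_all and handle are made up for illustration, and the timeout=False keyword is the one used in this thread (as far as I know, PyMongo 3 and later spell it no_cursor_timeout=True instead).

```python
def process_all(coll, handle):
    """Iterate a collection with the cursor timeout disabled and always close the cursor.

    `coll` is assumed to be a pymongo collection; `handle` is a hypothetical
    per-document callback supplied by the caller.
    """
    # Keyword as used in this thread; newer PyMongo releases use
    # no_cursor_timeout=True for the same purpose.
    cursor = coll.find(timeout=False)
    try:
        for doc in cursor:
            handle(doc)
    finally:
        # Runs on normal completion and on errors, so the cursor is
        # released on the server instead of staying open indefinitely.
        cursor.close()
```

The try/finally is the important part: without it, an exception inside handle would leave the cursor open on the server.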
Setting batch_size to a small number will work, but creates a big latency issue, because we need to access the DB more often than needed.
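For reference, the batch_size approach would look something like this. It is only a sketch: iterate_in_batches and handle are hypothetical names, and the default of 100 is an arbitrary value you would need to tune. The idea is that each batch is consumed quickly, so the cursor does not sit idle on the server for long between fetches, at the price of more round trips.

```python
def iterate_in_batches(coll, handle, batch_size=100):
    """Process every document, fetching `batch_size` documents per round trip.

    `coll` is assumed to be a pymongo collection; `handle` is a hypothetical
    per-document callback. Returns the number of documents processed.
    """
    # Cursor.batch_size(n) limits how many documents the server sends in
    # one batch; a smaller n means more frequent trips to the DB.
    cursor = coll.find().batch_size(batch_size)
    count = 0
    for doc in cursor:
        handle(doc)
        count += 1
    return count
```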
In my solution it is mandatory to use sort on the cursor:
import pymongo

done = False
skip = 0
while not done:
    cursor = coll.find()
    cursor.sort(indexed_parameter)  # recommended to use time or other sequential parameter
    cursor.skip(skip)
    try:
        for doc in cursor:
            skip += 1
            do_something()
        done = True
    except pymongo.errors.OperationFailure as e:
        msg = str(e)
        # Swallow only the lost-cursor error; the loop then reopens the cursor at `skip`.
        if not (msg.startswith("cursor id") and msg.endswith("not valid at server")):
            raise
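If you need this in more than one place, the same retry loop can be wrapped in a generator. This is a rough sketch rather than a tested recipe: iter_with_resume and is_cursor_lost are names I made up, and the predicate is passed in so the function itself does not depend on pymongo. With a real collection you would pass something that checks for pymongo.errors.OperationFailure and the "cursor id ... not valid at server" message, as in the loop above.

```python
def iter_with_resume(coll, sort_key, is_cursor_lost):
    """Yield every document, reopening the cursor after a lost-cursor error.

    `coll` is assumed to be a pymongo collection, `sort_key` an indexed,
    sequential field, and `is_cursor_lost` a hypothetical predicate that
    returns True for the error that means the server dropped the cursor.
    """
    skip = 0  # number of documents already handed to the caller
    while True:
        # A stable sort order is what makes skipping `skip` documents
        # land on the first one that has not been yielded yet.
        cursor = coll.find().sort(sort_key).skip(skip)
        try:
            for doc in cursor:
                skip += 1
                yield doc
            return  # cursor exhausted without errors
        except Exception as exc:
            if not is_cursor_lost(exc):
                raise
            # Otherwise loop around and open a new cursor at `skip`.
```

Usage would be along the lines of for doc in iter_with_resume(coll, indexed_parameter, my_predicate). Keep in mind that skip only lines up if the sort order is stable, which is why a time-like or otherwise sequential field is recommended.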