I wrote a Python program that acts on a large input file to create a few million objects representing triangles. The algorithm is:
The requirement of OFF that I print out the complete list of vertices before I print out the triangles means that I have to hold the list of triangles in memory before I write the output to file. In the meanwhile I'm getting memory errors because of the sizes of the lists.
What is the best way to tell Python that I no longer need some of the data, and it can be freed?
According to Python Official Documentation, you can force the Garbage Collector to release unreferenced memory with
import gc gc.collect()
Unfortunately (depending on your version and release of Python) some types of objects use "free lists" which are a neat local optimization but may cause memory fragmentation, specifically by making more and more memory "earmarked" for only objects of a certain type and thereby unavailable to the "general fund".
The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates. Under such conditions, the operating system WILL do its job, and gladly recycle all the resources the subprocess may have gobbled up. Fortunately, the
multiprocessing module makes this kind of operation (which used to be rather a pain) not too bad in modern versions of Python.
In your use case, it seems that the best way for the subprocesses to accumulate some results and yet ensure those results are available to the main process is to use semi-temporary files (by semi-temporary I mean, NOT the kind of files that automatically go away when closed, just ordinary files that you explicitly delete when you're all done with them).