I have noticed that .pyc files are spontaneously generated when a .py file of the same name gets run. What is the difference between .py and .pyc files?
Also, I find that having .pyc files lying around clutters up space. Should one delete .pyc files? Or is there a benefit and/or necessity to having them around?
UPDATE: Here are two answered questions that are related to my question.
This question is not a duplicate.
Reason 1: Because I am asking what the difference between these two files is. The question S.Lott found, 'If Python is interpreted, what are .pyc files?', is not asking what the difference between .py and .pyc files is. It is asking what .pyc files are.
Reason 2: Because my secondary questions, 'Should one delete .pyc files? Or is there a benefit and/or necessity to having them around?', ask for even more information on .pyc files and how one should handle them.
Reason 3: Because when a beginner Python programmer like me wants to find out what the difference between .py and .pyc files is, they will have no problem finding the answer, as they will be guided directly to my question. This helps reduce search time since the question is right to the point.
When a module is loaded, the .py file is "byte compiled" to a .pyc file. The timestamp of the source is recorded in the .pyc file. This is done not to make the code run faster but to make it load faster. Hence, it makes sense to "byte compile" modules when you load them.
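A minimal sketch of that compilation step, done by hand with the standard-library `py_compile` module (the module name `demo_mod` and the temporary directory are illustrative only; Python performs the same step automatically on first import):

```python
# Byte-compile a tiny module explicitly and locate the cached .pyc file.
import pathlib
import py_compile
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
src = tmp / "demo_mod.py"
src.write_text("ANSWER = 42\n")

# py_compile.compile returns the path of the generated byte-code file.
# On Python 3 it lands in a __pycache__ subdirectory (the layout PEP 3147
# introduced), with the interpreter tag in the file name.
pyc_path = py_compile.compile(str(src))
print(pyc_path)  # e.g. .../__pycache__/demo_mod.cpython-312.pyc
```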
[Edit : To include notes, references]
From PEP 3147 on "Byte code compilation":
CPython compiles its source code into "byte code", and for performance reasons, it caches this byte code on the file system whenever the source file has changes. This makes loading of Python modules much faster because the compilation phase can be bypassed. When your source file is foo.py, CPython caches the byte code in a foo.pyc file right next to the source.
How byte code compiled files are tracked with respect to Python version and "py" file changes:
It also inserts a magic number in the compiled byte code ".pyc" files.
This changes whenever Python changes the byte code format, usually in major releases.
This ensures that pyc files built for previous versions of the VM won't cause problems. The timestamp is used to make sure that the pyc file matches the py file that was used to create it. When either the magic number or the timestamp does not match, the py file is recompiled and a new pyc file is written.
"pyc" files are not compatible across Python major releases. When Python finds a pyc file with a non-matching magic number, it falls back to the slower process of recompiling the source.
That is why, if you simply distribute the ".pyc" files compiled for one platform, they will stop working once the Python version changes.
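The magic number is easy to inspect yourself: the first four bytes of a .pyc file are exactly the running interpreter's `importlib.util.MAGIC_NUMBER`. A small sketch (file and directory names are made up for illustration):

```python
# Compare a .pyc file's magic number with the current interpreter's.
import importlib.util
import pathlib
import py_compile
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
src = tmp / "magic_demo.py"
src.write_text("x = 1\n")

pyc = pathlib.Path(py_compile.compile(str(src)))
header = pyc.read_bytes()[:4]  # first 4 bytes hold the magic number

print(header == importlib.util.MAGIC_NUMBER)  # True for this interpreter
```

A .pyc produced by a different Python release would carry a different magic number here, which is what triggers the recompilation described above.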
If there is a byte-compiled ".pyc" file and its timestamp indicates that it is current, it will be loaded; otherwise Python falls back on the slower approach of loading the ".py" file. The execution performance of the code is not affected, but loading a ".pyc" file is faster than loading a ".py" file.
Consider executing a.py, which imports b.py:
Typical total cost = loading time (a.py) + execution time (a.py) + loading time (b.py) + execution time (b.py). Since loading time (b.pyc) < loading time (b.py), you should see better performance by using the byte-compiled ".pyc" files.
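That cost breakdown can be sketched roughly in code: loading a .py pays for parsing and compiling, while loading a .pyc only deserializes an already-compiled code object, which is essentially what `marshal` does for .pyc contents. The module body below is made up for illustration, and the timings will vary by machine:

```python
# Rough comparison of "parse + compile" (a .py load) versus
# "deserialize a cached code object" (a .pyc load).
import marshal
import time

# A synthetic module body large enough to make compile time measurable.
source = "\n".join(f"def f{i}(x):\n    return x + {i}" for i in range(200))

t0 = time.perf_counter()
code = compile(source, "<demo>", "exec")   # what importing a fresh .py entails
compile_time = time.perf_counter() - t0

blob = marshal.dumps(code)                 # roughly what a .pyc file stores

t0 = time.perf_counter()
cached = marshal.loads(blob)               # roughly what loading a .pyc entails
load_time = time.perf_counter() - t0

# Typically load_time is much smaller, since compilation is skipped entirely.
print(compile_time, load_time)
```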
That said, if you have a large script file X.py, modularizing it by moving contents into other modules lets you take advantage of the lower load time of byte-compiled files.
Another inference is that modules tend to be more stable than the script or main file. Hence the main file is not byte compiled at all.
Compiling the main script would be annoying for scripts in, e.g., /usr/bin: the .pyc file would be generated in the same directory, polluting that public location.
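Regarding the secondary question of clutter: rather than deleting .pyc files by hand (they are safe to delete and will simply be regenerated), you can tell Python not to write them at all. A minimal sketch, using the standard `sys.dont_write_bytecode` flag:

```python
# Suppress .pyc generation for the current process; handy for throwaway
# scripts or shared locations you do not want to pollute.
import sys

# Equivalent to running `python -B` or setting PYTHONDONTWRITEBYTECODE=1.
sys.dont_write_bytecode = True
print(sys.dont_write_bytecode)
```

The trade-off is exactly the one described above: every import pays the compilation cost again on each run.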