Incompatibilities moving from Python 2 to Python 3
Unlike most languages, Python supports two major versions. Since 2008 when Python 3 was released, many have made the transition, while many have not. In order to understand both, this section covers the important differences between Python 2 and Python 3.
There are currently two supported versions of Python: 2.7 (Python 2) and 3.6 (Python 3). Additionally versions 3.3 and 3.4 receive security updates in source format.
Python 2.7 is backwards-compatible with most earlier versions of Python, and can run Python code from most 1.x and 2.x versions of Python unchanged. It is broadly available, with an extensive collection of packages. It is also considered deprecated by the CPython developers, and receives only security and bug-fix development. The CPython developers intend to abandon this version of the language in 2020.
According to Python Enhancement Proposal 373 there are no planned future releases of Python 2 after 25 June 2016, but bug fixes and security updates will be supported until 2020. (It doesn't specify what exact date in 2020 will be the sunset date of Python 2.)
Python 3 intentionally broke backwards-compatibility, to address concerns the language developers had with the core of the language. Python 3 receives new development and new features. It is the version of the language that the language developers intend to move forward with.
Over the time between the initial release of Python 3.0 and the current version, some features of Python 3 were back-ported into Python 2.6, and other parts of Python 3 were extended to have syntax compatible with Python 2. Therefore it is possible to write Python that will work on both Python 2 and Python 3, by using future imports and special modules (like six).
Future imports have to be at the beginning of your module:
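As a minimal sketch of this placement rule (the two source strings are made up for illustration), compiling a module whose __future__ import comes first succeeds, while the same import placed after another statement is rejected with a SyntaxError:

```python
# Hypothetical module sources, used only to illustrate the placement rule.
good_source = (
    '"""A module docstring may precede the import."""\n'
    "from __future__ import division\n"
    "result = 7 / 2\n"
)
bad_source = (
    "result = 7 / 2\n"
    "from __future__ import division\n"
)

namespace = {}
exec(compile(good_source, "<good>", "exec"), namespace)

try:
    compile(bad_source, "<bad>", "exec")
    late_import_rejected = False
except SyntaxError:
    late_import_rejected = True

print(namespace["result"])
print(late_import_rejected)
```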
For further information on the __future__ module, see the relevant page in the Python documentation.
The package six provides utilities for Python 2/3 compatibility:
- unified access to renamed libraries
- variables for string/unicode types
- functions for methods that were removed or renamed
A reference for differences between Python 2 and Python 3 can be found here.
.next() method on iterators renamed
In Python 2, an iterator can be traversed by calling a method named next on the iterator itself, for example g.next().
In Python 3 the .next method has been renamed to .__next__, acknowledging its “magic” role, so calling .next will raise an AttributeError. The correct way to access this functionality in both Python 2 and Python 3 is to call the next function with the iterator as an argument, as in next(g).
This code is portable across versions from 2.6 through to current releases.
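A short sketch of the portable spelling, run here on Python 3 (the variable names are illustrative only):

```python
numbers = iter([10, 20, 30])

first = next(numbers)            # portable: works on Python 2.6+ and 3.x
second = numbers.__next__()      # the method that next() calls on Python 3

# The Python 2 spelling numbers.next() no longer exists on Python 3.
has_old_method = hasattr(numbers, "next")

third = next(numbers)
sentinel = next(numbers, None)   # a default avoids StopIteration at the end
```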
Absolute/Relative Imports
As PEP 404 notes, imports work differently in Python 3 than in Python 2: implicit relative imports are no longer allowed in packages, and from ... import * imports are only allowed in module-level code.
To achieve Python 3 behavior in Python 2:
- the absolute imports feature can be enabled with from __future__ import absolute_import
- explicit relative imports are encouraged in place of implicit relative imports
For clarification, in Python 2 a module can import the contents of another module located in the same directory with a plain import foo statement.
Notice that the location of foo is ambiguous from the import statement alone. This type of implicit relative import is thus discouraged in favor of explicit relative imports, such as from . import foo.
The leading . allows an explicit declaration of the module location within the directory tree.
More on Relative Imports
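Consider a user-defined package called shapes whose directory contains __init__.py, circle.py, square.py, triangle.py and util.py side by side.
circle.py, square.py and triangle.py all import util.py as a module. To refer to a module at the same level, they write from . import util or from .util import *.
. is used for same-level relative imports.
Now consider an alternate layout of the shapes package, in which circle.py, square.py and triangle.py each live in their own sub-package (shapes/circle/circle.py and so on) while util.py stays at the top level of shapes.
To refer to util.py, these three modules now write from .. import util or from ..util import *.
.. is used for parent-level relative imports. Add one more . for each additional level between the child and the parent.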
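The two layouts can be tried out without touching a real project. The sketch below writes a throwaway package to a temporary directory and imports it; the package name shapes_demo, the helper sq and the value of PI are assumptions made for this illustration only.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical files: circle.py sits next to util.py, while
# square/square.py sits one level below it.
files = {
    "shapes_demo/__init__.py": "",
    "shapes_demo/util.py": "PI = 3.14159\ndef sq(x):\n    return x * x\n",
    "shapes_demo/circle.py": (
        "from . import util\n"            # same-level relative import
        "def area(r):\n"
        "    return util.PI * util.sq(r)\n"
    ),
    "shapes_demo/square/__init__.py": "",
    "shapes_demo/square/square.py": (
        "from .. import util\n"           # parent-level relative import
        "def area(side):\n"
        "    return util.sq(side)\n"
    ),
}

root = tempfile.mkdtemp()
for relative_path, source in files.items():
    full_path = os.path.join(root, *relative_path.split("/"))
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, "w") as handle:
        handle.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()
circle = importlib.import_module("shapes_demo.circle")
square = importlib.import_module("shapes_demo.square.square")

circle_area = circle.area(2)
square_area = square.area(3)
```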
All classes are "new-style classes" in Python 3.
In Python 3.x all classes are new-style classes; when defining a new class, Python implicitly makes it inherit from object. As such, specifying object in a class definition is completely optional: class X: pass and class Y(object): pass are equivalent.
Both of these classes contain object in their __mro__ (method resolution order).
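A minimal sketch of this on Python 3, using two throwaway classes named X and Y:

```python
class X:
    pass


class Y(object):
    pass


# On Python 3 both classes are new-style, so both resolve through object.
x_bases = X.__mro__
y_bases = Y.__mro__
print(x_bases)
print(y_bases)
```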
In Python 2.x classes are, by default, old-style classes; they do not implicitly inherit from object. This causes the semantics of classes to differ depending on whether object is explicitly added as a base class, as in class X: pass versus class Y(object): pass.
In this case, printing the __mro__ of Y gives output similar to the Python 3.x case.
This happens because Y was explicitly made to inherit from object when it was defined: class Y(object): pass. For class X, which does not inherit from object, the __mro__ attribute does not exist, and trying to access it results in an AttributeError.
In order to ensure compatibility between both versions of Python, classes can be defined with object as a base class, as in class Y(object): pass.
Alternatively, if the __metaclass__ variable is set to type at global scope, all subsequently defined classes in that module are implicitly new-style without needing to explicitly inherit from object.
Class Boolean Value
In Python 2, if you want to define the boolean value of instances of a class yourself, you need to implement the __nonzero__ method on the class. The value is True by default.
In Python 3, __bool__ is used instead of __nonzero__.
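One common way to support both versions, sketched here with a hypothetical Basket class, is to define __bool__ and alias __nonzero__ to it:

```python
class Basket(object):
    """Hypothetical container that is falsy when empty."""

    def __init__(self, items=()):
        self.items = list(items)

    def __bool__(self):            # consulted by Python 3
        return len(self.items) > 0

    __nonzero__ = __bool__         # consulted by Python 2


empty_is_true = bool(Basket())
full_is_true = bool(Basket(["apple"]))
```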
cmp function removed in Python 3
From the documentation:
The cmp() function should be treated as gone, and the __cmp__() special method is no longer supported. Use __lt__() for sorting, __eq__() with __hash__(), and other rich comparisons as needed. (If you really need the cmp() functionality, you could use the expression (a > b) - (a < b) as the equivalent for cmp(a, b).)
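Moreover, all built-in functions that accepted the cmp parameter now only accept the keyword-only key parameter.
The functools module also provides cmp_to_key(func), which converts a cmp-style function into a key function. Its documentation describes it as follows:
Transform an old-style comparison function to a key function. Used with tools that accept key functions (such as sorted(), min(), max(), heapq.nlargest(), heapq.nsmallest(), itertools.groupby()). This function is primarily used as a transition tool for programs being converted from Python 2 which supported the use of comparison functions.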
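A small sketch of that transition path. The cmp helper is the replacement expression quoted from the documentation; by_length_then_alpha is a made-up old-style comparison function.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Stand-in for the removed built-in, using the documented expression."""
    return (a > b) - (a < b)


def by_length_then_alpha(a, b):
    """Old-style comparison: shorter strings first, ties broken alphabetically."""
    return cmp(len(a), len(b)) or cmp(a, b)


words = ["pear", "fig", "apple", "kiwi"]
ordered = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(ordered)
```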
Comparison of different types
In Python 2, objects of different types can be compared. The results are arbitrary, but consistent. They are ordered such that None is less than anything else, numeric types are smaller than non-numeric types, and everything else is ordered lexicographically by type. Thus, an int is less than a str and a tuple is greater than a list.
This was originally done so a list of mixed types could be sorted and objects would be grouped together by type.
In Python 3, an exception (TypeError) is raised when comparing different (non-numeric) types.
To sort mixed lists in Python 3 by types and to achieve compatibility between versions, you have to provide a key to the sorted function, as in sorted(lst, key=str).
Using str as the key function temporarily converts each item to a string only for the purposes of comparison. It then compares the string representations, which start with a character such as [, { or a digit 0-9, and is able to sort those (and all the following characters).
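A brief Python 3 sketch of both behaviours, using a made-up mixed list:

```python
mixed = [2, "one", 1.5, "two"]

try:
    sorted(mixed)                # int and str cannot be ordered on Python 3
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Comparing the string form of each item works on both Python 2 and 3.
by_text = sorted(mixed, key=str)
print(by_text)
```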
Dictionary method changes
In Python 3, many of the dictionary methods are quite different in behaviour from Python 2, and many were removed as well: has_key, iter* and view* are gone. Instead of d.has_key(key), which had been long deprecated, one must now use key in d.
In Python 2, dictionary methods keys, values and items return lists. In Python 3 they return view objects instead; the view objects are not iterators, and they differ from them in two ways, namely:
- they have size (one can use the len function on them)
- they can be iterated over many times
Additionally, like with iterators, the changes in the dictionary are reflected in the view objects.
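These properties of view objects can be checked directly on Python 3 (a minimal sketch with a throwaway dictionary):

```python
d = {"a": 1, "b": 2}
keys = d.keys()

size_before = len(keys)          # views have a length
first_pass = sorted(keys)
second_pass = sorted(keys)       # and can be iterated more than once

d["c"] = 3                       # later changes show up in the existing view
size_after = len(keys)

is_iterator = hasattr(keys, "__next__")   # a view is iterable, not an iterator
```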
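Python 2.7 has backported these methods from Python 3; they're available as viewkeys, viewvalues and viewitems. To transform Python 2 code to Python 3 code, the corresponding forms are:
- d.keys(), d.values() and d.items() of Python 2 should be changed to list(d.keys()), list(d.values()) and list(d.items())
- d.iterkeys(), d.itervalues() and d.iteritems() should be changed to iter(d.keys()), or even better, iter(d); iter(d.values()) and iter(d.items()) respectively
- and finally Python 2.7 method calls d.viewkeys(), d.viewvalues() and d.viewitems() can be replaced with d.keys(), d.values() and d.items()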
Porting Python 2 code that iterates over dictionary keys, values or items while mutating it is sometimes tricky. Consider a loop of the form for key in d.keys(): whose body deletes some entries with del d[key].
The code looks as if it would work similarly in Python 3, but there the keys method returns a view object, not a list, and if the dictionary changes size while being iterated over, the Python 3 code will crash with RuntimeError: dictionary changed size during iteration. The solution is of course to properly write for key in list(d).
Similarly, view objects behave differently from iterators: one cannot use next() on them, and one cannot resume iteration; it would instead restart. If Python 2 code passes the return value of d.iterkeys(), d.itervalues() or d.iteritems() to a method that expects an iterator instead of an iterable, then that should be iter(d), iter(d.values()) or iter(d.items()) in Python 3.
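The pitfall and both fixes can be sketched as follows on Python 3 (the dictionary contents are illustrative):

```python
d = {"a": 0, "b": 1, "c": 2, "!": 3}

try:
    for key in d.keys():         # iterating over a live view
        if key.isalpha():
            del d[key]
    failed = False
except RuntimeError:
    failed = True

d = {"a": 0, "b": 1, "c": 2, "!": 3}
for key in list(d):              # snapshot of the keys, safe to mutate
    if key.isalpha():
        del d[key]

# Where Python 2 code passed d.iteritems() to something expecting an iterator:
items_iterator = iter({"x": 1}.items())
first_item = next(items_iterator)
```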
Differences between range and xrange functions
In Python 2, the range function returns a list while xrange creates a special xrange object, which is an immutable sequence that, unlike other built-in sequence types, doesn't support slicing and has neither index nor count methods.
In Python 3, xrange was expanded to the range sequence, which thus now creates a range object. There is no xrange type.
Additionally, since Python 3.2, range also supports slicing, index and count.
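The advantage of using a special sequence type instead of a list is that the interpreter does not have to allocate memory for a list and populate it. In Python 2, a call such as range(10**16) tries to build the whole list and typically fails with a MemoryError, whereas xrange creates a small object right away.
Since the latter behaviour is generally desired, the former was removed in Python 3.
If you still want to have a list in Python 3, you can simply use the list() constructor on a range object, as in list(range(1, 10)).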
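A compact sketch of the Python 3 range object (3.2 or later is assumed for slicing, index and count):

```python
r = range(0, 10, 2)

as_list = list(r)                # materialize only when a list is needed
sliced = r[1:3]                  # slicing gives another range object
position = r.index(4)
occurrences = r.count(6)

# Membership is computed arithmetically; no huge list is ever built.
contains_big = 10**9 in range(10**12)
```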
In order to maintain compatibility between both Python 2.x and Python 3.x versions, you can use the builtins module from the external package future to achieve both forward-compatibility and backward-compatibility, for example with from builtins import range.
The range in the future library supports slicing, index and count in all Python versions, just like the built-in type on Python 3.2+.
encode/decode to hex no longer available
In Python 2:
    "1deadbeef3".decode('hex')
    # Out: '\x1d\xea\xdb\xee\xf3'
    '\x1d\xea\xdb\xee\xf3'.encode('hex')
    # Out: 1deadbeef3
In Python 3:
    "1deadbeef3".decode('hex')
    # Traceback (most recent call last):
    #   File "<stdin>", line 1, in <module>
    # AttributeError: 'str' object has no attribute 'decode'
    b"1deadbeef3".decode('hex')
    # Traceback (most recent call last):
    #   File "<stdin>", line 1, in <module>
    # LookupError: 'hex' is not a text encoding; use codecs.decode() to handle arbitrary codecs
    '\x1d\xea\xdb\xee\xf3'.encode('hex')
    # Traceback (most recent call last):
    #   File "<stdin>", line 1, in <module>
    # LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs
    b'\x1d\xea\xdb\xee\xf3'.encode('hex')
    # Traceback (most recent call last):
    #   File "<stdin>", line 1, in <module>
    # AttributeError: 'bytes' object has no attribute 'encode'
However, as suggested by the error message, you can use the codecs module to achieve the same result:
    import codecs
    codecs.decode('1deadbeef4', 'hex')
    # Out: b'\x1d\xea\xdb\xee\xf4'
    codecs.encode(b'\x1d\xea\xdb\xee\xf4', 'hex')
    # Out: b'1deadbeef4'
Note that codecs.encode returns a bytes object. To obtain a str object, just decode to ASCII:
    codecs.encode(b'\x1d\xea\xdb\xee\xff', 'hex').decode('ascii')
    # Out: '1deadbeeff'
exec statement is a function in Python 3
In Python 2, exec is a statement, with special syntax: exec code [in globals[, locals]]. In Python 3 exec is now a function: exec(code[, globals[, locals]]), and the Python 2 syntax will raise a SyntaxError.
As print was changed from a statement into a function, a __future__ import was also added for it. However, there is no from __future__ import exec_function, as it is not needed: the exec statement in Python 2 can also be used with syntax that looks exactly like the exec function invocation in Python 3. Thus you can change the statements exec 'code', exec 'code' in global_vars and exec 'code' in global_vars, local_vars to the forms exec('code'), exec('code', global_vars) and exec('code', global_vars, local_vars), and the latter forms are guaranteed to work identically in both Python 2 and Python 3.
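A minimal sketch of the function-call form with explicit namespaces (the code string and dictionary names are illustrative):

```python
code = "total = a + b"

global_vars = {"a": 2, "b": 3}
exec(code, global_vars)                       # assignments land in global_vars

local_vars = {}
exec(code, {"a": 10, "b": 20}, local_vars)    # assignments land in local_vars

print(global_vars["total"], local_vars["total"])
```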
File I/O
file is no longer a builtin name in 3.x (open still works).
Internal details of file I/O have been moved to the standard library io module, which is also the new home of StringIO.
The file mode (text vs binary) now determines the type of data produced by reading a file (and the type required for writing): text mode works with str, binary mode with bytes.
The encoding for text files defaults to whatever is returned by locale.getpreferredencoding(False). To specify an encoding explicitly, use the encoding keyword parameter, as in open(filename, encoding='utf-8').
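The effect of the file mode can be seen by writing a small file and reading it back both ways. This is a sketch for Python 3; the file name and contents are made up.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w", encoding="utf-8") as handle:    # text mode expects str
    handle.write(u"caf\u00e9")

with open(path, "r", encoding="utf-8") as handle:
    text = handle.read()                              # str
with open(path, "rb") as handle:
    raw = handle.read()                               # bytes

buffer = io.StringIO(u"in-memory text")               # StringIO now lives in io
buffered_text = buffer.read()
```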
filter(), map() and zip() return iterators instead of sequences
In Python 2, filter, map and zip return lists; in Python 3 they return iterators.
Since Python 2's itertools.izip is the equivalent of Python 3's zip, izip has been removed in Python 3.
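A short Python 3 sketch of the iterator behaviour, including the fact that such an iterator can only be consumed once:

```python
pairs = zip([1, 2, 3], "abc")
evens = filter(lambda n: n % 2 == 0, range(6))

pairs_is_list = isinstance(pairs, list)
pair_list = list(pairs)          # consume the iterator to get a list
drained = list(pairs)            # a second pass yields nothing
even_list = list(evens)
```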
hasattr function bug in Python 2
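In Python 2, when a property raises an error, hasattr will ignore this property, returning False.
This bug is fixed in Python 3. So if you use Python 2, access the attribute inside a try block and catch AttributeError yourself, or use getattr with a default value instead.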
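The fixed behaviour can be observed on Python 3 with a hypothetical class whose property fails with something other than AttributeError:

```python
class Sensor(object):
    """Hypothetical class whose property raises a non-AttributeError."""

    @property
    def reading(self):
        raise IOError("device unavailable")


sensor = Sensor()

try:
    hasattr(sensor, "reading")   # Python 2 would quietly return False here
    propagated = False
except IOError:
    propagated = True            # Python 3 lets the real error surface

missing = hasattr(sensor, "no_such_attribute")
```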
Integer Division
The standard division symbol (/) operates differently in Python 3 and Python 2 when applied to integers.
When dividing an integer by another integer in Python 3, the division operation x / y represents a true division (uses the __truediv__ method) and produces a floating point result. Meanwhile, the same operation in Python 2 represents a classic division that rounds the result down toward negative infinity (also known as taking the floor).
|Code||Python 2 output||Python 3 output|
|3 / 2||1||1.5|
|2 / 3||0||0.6666666666666666|
|-3 / 2||-2||-1.5|
This classic division behavior for integers was deprecated in Python 2.2, but remains in Python 2.7 for the sake of backward compatibility; it was removed in Python 3.
Note: To get a float result in Python 2 (without floor rounding), write one of the operands with a decimal point. The above example of 2/3, which gives 0 in Python 2, should be written as 2 / 3.0 or 2.0 / 3 or 2.0/3.0 to get 0.6666666666666666.
|Code||Python 2 output||Python 3 output|
|3.0 / 2.0||1.5||1.5|
|2 / 3.0||0.6666666666666666||0.6666666666666666|
|-3.0 / 2||-1.5||-1.5|
There is also the floor division operator (//), which works the same way in both versions: it rounds down to the nearest integer (although a float is returned when used with floats). In both versions the // operator maps to __floordiv__.
|Code||Python 2 output||Python 3 output|
|3 // 2||1||1|
|2 // 3||0||0|
|-3 // 2||-2||-2|
|3.0 // 2.0||1.0||1.0|
|2.0 // 3||0.0||0.0|
|-3 // 2.0||-2.0||-2.0|
One can explicitly enforce true division or floor division using the truediv and floordiv functions in the operator module.
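The operators and their operator-module counterparts can be compared side by side. The results noted in the comments for / are those of Python 3.

```python
import operator

true_quotient = 3 / 2                        # 1.5 on Python 3
floor_quotient = 3 // 2                      # floor division, an int
negative_floor = -3 // 2                     # floors toward negative infinity
float_floor = 3.0 // 2.0                     # still floored, but a float

explicit_true = operator.truediv(3, 2)       # true division on Python 2 and 3
explicit_floor = operator.floordiv(-3, 2)    # floor division on Python 2 and 3
```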
While clear and explicit, using operator functions for every division can be tedious. Changing the behavior of the / operator will often be preferred. A common practice is to eliminate classic division behavior by adding from __future__ import division as the first statement in each module:
|Code||Python 2 output||Python 3 output|
|3 / 2||1.5||1.5|
|2 / 3||0.6666666666666666||0.6666666666666666|
|-3 / 2||-1.5||-1.5|
from __future__ import division guarantees that the / operator represents true division, and only within the modules that contain the __future__ import, so there are no compelling reasons for not enabling it in all new modules.
Note: Some other programming languages use rounding toward zero (truncation) rather than rounding down toward negative infinity as Python does (i.e. in those languages -3 / 2 == -1). This behavior may create confusion when porting or comparing code.
Note on float operands: As an alternative to from __future__ import division, one could use the usual division symbol / and ensure that at least one of the operands is a float: 3 / 2.0 == 1.5. However, this can be considered bad practice. It is just too easy to write average = sum(items) / len(items) and forget to cast one of the arguments to float. Moreover, such cases may frequently evade notice during testing, e.g., if you test on an array containing floats but receive an array of ints in production. Additionally, if the same code is used in Python 3, programs that expect 3 / 2 == 1 to be True will not work correctly.
See PEP 238 for more detailed rationale why the division operator was changed in Python 3 and why old-style division should be avoided.
See the Simple Math topic for more about division.
Leaked variables in list comprehension
Consider x = 'hello world!' followed by the list comprehension vowels = [x for x in 'AEIOU'] and then print(x). In Python 2 the value of x was leaked: it masked hello world! and printed out U, since this was the last value of x when the loop ended.
However, in Python 3 x prints the originally defined hello world!, since the local variable from the list comprehension does not mask variables from the surrounding scope.
Additionally, neither generator expressions (available in Python since 2.4) nor dictionary or set comprehensions (which were backported to Python 2.7 from Python 3) leak variables in Python 2.
Note that in both Python 2 and Python 3, variables will leak into the surrounding scope when using a for loop: after a plain for x in 'AEIOU' loop finishes, x is bound to 'U'.
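A small sketch that contrasts the two cases. The values noted in the comments are for Python 3.

```python
x = "hello world!"
vowels = [x for x in "AEIOU"]
after_comprehension = x          # unchanged on Python 3 ('U' on Python 2)

for x in "AEIOU":                # a plain for loop rebinds x on both versions
    pass
after_loop = x
```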
long vs. int
In Python 2, any integer larger than a C ssize_t would be converted into the long data type, indicated by an L suffix on the literal. For example, on a 32 bit build of Python, 2**31 evaluates to 2147483648L, while 2**30 is still the plain int 1073741824.
However, in Python 3, the long data type was removed; no matter how big the integer is, it will be an int.
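A quick check on Python 3 (variable names are illustrative):

```python
big = 2 ** 64                    # far beyond any C integer type
big_type_name = type(big).__name__
still_exact = big + 1 - big      # arbitrary precision, no overflow
```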
map()
map() is a builtin that is useful for applying a function to elements of an iterable. In Python 2, map returns a list. In Python 3, map returns a map object, which is an iterator.
In Python 2, you can pass None to serve as an identity function. This no longer works in Python 3.
Moreover, when passing more than one iterable as argument in Python 2, map pads the shorter iterables with None (similar to itertools.izip_longest). In Python 3, iteration stops after the shortest iterable.
In Python 2, map(None, [0, 1, 2], [3, 4]) returns [(0, 3), (1, 4), (2, None)].
In Python 3, list(map(lambda x, y: (x, y), [0, 1, 2], [3, 4])) returns [(0, 3), (1, 4)]; to pad with None as Python 2 does, use itertools.zip_longest.
Note: instead of map consider using list comprehensions, which are Python 2/3 compatible. For example, map(str, [1, 2, 3, 4, 5]) can be replaced with [str(i) for i in [1, 2, 3, 4, 5]].
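The Python 3 behaviour described in this section, sketched with illustrative inputs:

```python
from itertools import zip_longest

squares = map(lambda n: n * n, [1, 2, 3])
squares_is_list = isinstance(squares, list)
square_list = list(squares)

# With several iterables, Python 3 stops at the shortest one.
short = list(map(lambda a, b: (a, b), [0, 1, 2], [3, 4]))
# zip_longest reproduces the None padding of Python 2's map(None, ...).
padded = list(zip_longest([0, 1, 2], [3, 4]))

# A list comprehension behaves the same on Python 2 and 3.
as_strings = [str(i) for i in [1, 2, 3, 4, 5]]
```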
Octal Constants
In Python 2, an octal literal could be defined as 0755.
To ensure cross-compatibility, use 0o755.
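A brief illustration of the cross-compatible spelling, using a typical file-permission value as the example:

```python
mode = 0o755                     # accepted by Python 2.6+ and Python 3
as_decimal = int(mode)
round_trip = oct(mode)           # Python 3 prints the 0o prefix
```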
Print statement vs. Print function
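In Python 2, print is a statement, as in print "Hello World".
In Python 3, print() is a function, with keyword arguments for common uses.
The print function has the following parameters: print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False).
sep is what separates the objects you pass to print. For example, print('foo', 'bar', sep='~') prints foo~bar.
end is what the end of the print statement is followed by. For example, print('foo', 'bar', end='!') prints foo bar! without a trailing newline.
Printing again following a non-newline ending print statement will print to the same line.
Note: For future compatibility, the print function is also available in Python 2.6 onwards; however, it cannot be used unless parsing of the print statement is disabled with from __future__ import print_function.
This function has exactly the same format as Python 3's, except that it lacks the flush parameter.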
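The keyword arguments can be tried without writing to the terminal by pointing file at an in-memory stream (a Python 3 sketch; the stream name is illustrative):

```python
import io

stream = io.StringIO()
print("a", "b", "c", sep="-", file=stream)     # custom separator
print("no newline", end="", file=stream)       # custom line ending
print(" ...continued", file=stream)            # lands on the same line
captured = stream.getvalue()
```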
See PEP 3105 for rationale.
Raising and handling Exceptions
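This is the Python 2 syntax; note the commas , on the raise and except lines, as in raise IOError, "input/output error" and except IOError, exc.
In Python 3, the , syntax is dropped and replaced by parentheses and the as keyword, as in raise IOError("input/output error") and except IOError as exc.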
For backwards compatibility, the Python 3 syntax is also available in Python 2.6 onwards, so it should be used for all new code that does not need to be compatible with previous versions.
Python 3 also adds exception chaining, wherein you can signal that some other exception was the cause for this exception. For example, a handler for FileNotFoundError can execute raise DatabaseError('Cannot open database.db') from e, where e is the caught exception.
The exception raised in the except statement is of type DatabaseError, but the original exception is marked as the __cause__ attribute of that exception. When the traceback is displayed, the original exception will also be displayed in the traceback:
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    FileNotFoundError
    The above exception was the direct cause of the following exception:
    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
    DatabaseError('Cannot open database.db')
If you throw in an except block without explicit chaining, the traceback is:
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    FileNotFoundError
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
    DatabaseError('Cannot open database.db')
Neither one is supported in Python 2.x; the original exception and its traceback will be lost if another exception is raised in the except block. For compatibility, code that must also run on Python 2 can print the original traceback (for example with traceback.print_exc()) before raising the new exception.
To "forget" the previously thrown exception, use raise ... from None.
Now the traceback would simply be:
    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
    DatabaseError('Cannot open database.db')
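The three variants can be compared by inspecting the attributes that Python 3 sets on the raised exception. In this sketch DatabaseError, open_database and capture are hypothetical names defined only for the illustration.

```python
class DatabaseError(Exception):
    """Hypothetical application-level error."""


def open_database(mode):
    try:
        raise FileNotFoundError("database.db")
    except FileNotFoundError as error:
        if mode == "explicit":
            raise DatabaseError("Cannot open database.db") from error
        if mode == "suppressed":
            raise DatabaseError("Cannot open database.db") from None
        raise DatabaseError("Cannot open database.db")   # implicit chaining


def capture(mode):
    try:
        open_database(mode)
    except DatabaseError as error:
        return error


explicit = capture("explicit")
implicit = capture("implicit")
suppressed = capture("suppressed")
```

With explicit chaining the original error is available as __cause__; without it, it is still reachable as __context__; with from None the context is kept but flagged as suppressed, so it is not printed.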
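Or, in order to make it compatible with both Python 2 and 3, you may use the six package, whose six.raise_from(exc_value, exc_value_from) helper chains the exceptions on Python 3 and simply raises exc_value on Python 2.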
Reduce is no longer a built-in
In Python 2, reduce is available either as a built-in function or from the functools package (version 2.6 onwards), whereas in Python 3 reduce is available only from functools. However, the syntax for reduce in both Python 2 and Python 3 is the same: reduce(function_to_reduce, list_to_reduce).
As an example, let us consider reducing a list to a single value by dividing each of the adjacent numbers. Here we use the truediv function from the operator library.
In Python 2.x it is as simple as calling reduce(truediv, my_lst) directly.
In Python 3.x the example becomes a bit more complicated: after import functools, the call is functools.reduce(truediv, my_lst).
We can also use from functools import reduce to avoid calling reduce with the namespace name.
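A portable sketch of that example; the list of values is made up and chosen so that the arithmetic is exact.

```python
from functools import reduce     # also importable on Python 2.6+
from operator import truediv

values = [100, 10, 2]
quotient = reduce(truediv, values)            # (100 / 10) / 2
with_start = reduce(truediv, values, 1000)    # ((1000 / 100) / 10) / 2
```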
Removed operators <> and ``, synonymous with != and repr()
In Python 2, <> is a synonym for !=; likewise, `foo` is a synonym for repr(foo). Both spellings were removed in Python 3, where they raise a SyntaxError.
Renamed modules
A few modules in the standard library have been renamed:
|Old name||New name|
|Tkinter||tkinter|
|urllib / urllib2||urllib, urllib.parse, urllib.error, urllib.response, urllib.request, urllib.robotparser|
Some modules have even been converted from single files to packages. Take tkinter and urllib from above as an example.
When maintaining compatibility between both Python 2.x and 3.x versions, you can use the future external package to enable importing top-level standard library packages with Python 3.x names on Python 2.x versions.
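Without any external package, a commonly used alternative is to try the Python 3 location first and fall back to the Python 2 one. The sketch below does this for urlencode; the query dictionary is illustrative.

```python
try:
    from urllib.parse import urlencode    # Python 3 location
except ImportError:
    from urllib import urlencode          # Python 2 location

query = urlencode({"q": "python"})
print(query)
```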
Return value when writing to a file object
In Python 2, writing directly to a file handle returns None.
In Python 3, writing to a handle will return the number of characters written when writing text, and the number of bytes written when writing bytes.
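The difference between counting characters and counting bytes shows up as soon as the text contains a non-ASCII character (a Python 3 sketch using temporary files):

```python
import tempfile

word = u"h\u00e9llo"             # five characters, six bytes in UTF-8

with tempfile.TemporaryFile("w+", encoding="utf-8") as text_file:
    chars_written = text_file.write(word)

with tempfile.TemporaryFile("w+b") as binary_file:
    bytes_written = binary_file.write(word.encode("utf-8"))
```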
Strings: Bytes versus Unicode
In Python 2, an object of type str is always a byte sequence, but is commonly used for both text and binary data.
A string literal is interpreted as a byte string.
There are two exceptions: You can define a Unicode (text) literal explicitly by prefixing the literal with u, as in u'abc'.
Alternatively, you can specify that a whole module's string literals should create Unicode (text) literals by adding from __future__ import unicode_literals at the top of the module.
In order to check whether your variable is a string (either Unicode or a byte string), you can use isinstance(s, basestring).
In Python 3, the str type is a Unicode text type.
Additionally, Python 3 added a bytes object, suitable for binary "blobs" or writing to encoding-independent files. To create a bytes object, you can prefix b to a string literal or call the string's encode method.
To test whether a value is a string, use isinstance(s, str).
From Python 3.3 onwards it is also possible to prefix string literals with a u prefix to ease compatibility between Python 2 and Python 3 code bases. Since, in Python 3, all strings are Unicode by default, prepending a string literal with u has no effect: u'abc' == 'abc'.
Python 2’s raw Unicode string prefix ur is not supported, however: a literal such as ur'abc' is a SyntaxError.
You can use decode to ask a bytes object for what Unicode text it represents, as in b'abc'.decode('utf-8').
While the bytes type exists in both Python 2 and 3, the unicode type only exists in Python 2. To use Python 3's implicit Unicode strings in Python 2, add from __future__ import unicode_literals to the top of your code file.
Another important difference is that indexing bytes in Python 3 results in an int: b'abc'[0] == 97.
Whilst slicing in a size of one results in a length 1 bytes object: b'abc'[0:1] == b'a'.
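The str/bytes relationship in Python 3 can be summarized in a few lines (a sketch with an illustrative string):

```python
text = u"caf\u00e9"              # the u prefix is accepted on Python 3.3+
data = text.encode("utf-8")      # str -> bytes
decoded = data.decode("utf-8")   # bytes -> str

text_is_str = isinstance(text, str)
data_is_bytes = isinstance(data, bytes)

first_byte = b"abc"[0]           # indexing bytes yields an int
first_slice = b"abc"[0:1]        # slicing bytes yields bytes
```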
The round() function tie-breaking and return type
round() tie breaking
In Python 2, using round() on a number equally close to two integers will return the one furthest from 0. For example, round(1.5) is 2.0, round(0.5) is 1.0 and round(-0.5) is -1.0.
In Python 3 however, round() will return the even integer (aka bankers' rounding). For example, round(1.5) is 2, round(0.5) is 0 and round(-0.5) is 0.
The round() function follows the half to even rounding strategy that will round half-way numbers to the nearest even integer (for example, round(2.5) now returns 2 rather than 3.0).
As per reference in Wikipedia, this is also known as unbiased rounding, convergent rounding, statistician's rounding, Dutch rounding, Gaussian rounding, or odd-even rounding.
Half to even rounding is part of the IEEE 754 standard and it's also the default rounding mode in Microsoft's .NET.
This rounding strategy tends to reduce the total rounding error. Since on average the amount of numbers that are rounded up is the same as the amount of numbers that are rounded down, rounding errors cancel out. Other rounding methods instead tend to have an upwards or downwards bias in the average error.
round() return type
The round() function returns a float type in Python 2.7; for example, round(4.8) is 5.0.
Starting from Python 3.0, if the second argument (number of digits) is omitted, it returns an int; for example, round(4.8) is 5.
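Both changes can be verified on Python 3 in a few lines:

```python
# Ties go to the nearest even integer on Python 3.
ties = [round(0.5), round(1.5), round(2.5), round(-0.5)]

no_digits = round(4.8)           # int when the number of digits is omitted
with_digits = round(4.8, 0)      # float when the number of digits is given
```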
True, False and None
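In Python 2, True, False and None are built-in constants, which means it's possible to reassign them; for example, True, False = False, True is accepted.
You can't do this with None since Python 2.4.
In Python 3, True, False and None are now keywords, so assigning to any of them is a SyntaxError.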
Unpacking Iterables
In Python 3, you can unpack an iterable without knowing the exact number of items in it, and even have a variable hold the end of the iterable. For that, you provide a variable that may collect a list of values. This is done by placing an asterisk before the name. For example, unpacking a list with first, second, *tail, last = [1, 2, 3, 4, 5] leaves tail equal to [3, 4].
Note: When using the *variable syntax, the variable will always be a list, even if the original type wasn't a list. It may contain zero or more elements depending on the number of elements in the original iterable.
Similarly, unpacking a str with begin, *tail = "Hello" gives begin equal to 'H' and tail equal to ['e', 'l', 'l', 'o'].
The same syntax can unpack a date: with person = ('John', 'Doe', (10, 16, 2016)), the assignment *_, (*_, year_of_birth) = person keeps only the year. _ is used in this example as a throwaway variable (we are interested only in the year value).
It is worth mentioning that, since * eats up a variable number of items, you cannot have two *s for the same iterable in an assignment - it wouldn't know how many elements go into the first unpacking, and how many in the second; for example, *head, *tail = [1, 2] is a SyntaxError.
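A few starred assignments on Python 3, with illustrative values. The date is represented by a plain tuple as a stand-in, and the last part compiles a source string so that the expected SyntaxError can be caught.

```python
first, second, *tail, last = [1, 2, 3, 4, 5]
head, *rest = "abc"              # the starred name always receives a list
*_, year = (16, 10, 2016)        # keep only the last field
only, *empty = [42]              # the starred name may end up empty

try:
    compile("*a, *b = [1, 2, 3]", "<two-stars>", "exec")
    two_stars_rejected = False
except SyntaxError:
    two_stars_rejected = True
```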
So far we have discussed unpacking in assignments. * and ** were extended in Python 3.5. It's now possible to have several unpacking operations in one expression, as in [*range(3), *range(3, 6)].
It is also possible to unpack an iterable into function arguments, as in f(*args).
Unpacking a dictionary uses two adjacent stars ** (PEP 448), as in {**defaults, **overrides}.
This allows for both overriding old values and merging dictionaries.
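A short sketch of the extended unpacking forms, assuming Python 3.5 or later. The describe function and the dictionaries are made up for the example.

```python
def describe(a, b, c, d):
    return "{} {} {} {}".format(a, b, c, d)


combined = [*range(3), *"ab"]                # several * in one list display
called = describe(*[1, 2], *(3, 4))          # several * in one call

defaults = {"color": "red", "size": 1}
overrides = {"size": 2}
merged = {**defaults, **overrides}           # later values win
```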
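Python 3 removed tuple unpacking in function parameters. Hence a definition such as def func((a, b)): ... doesn't work in Python 3; accept a single parameter and unpack it inside the function body instead.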
See PEP 3113 for detailed rationale.
User Input
In Python 2, user input is accepted using the raw_input function.
While in Python 3 user input is accepted using the input function.
In Python 2, the input function will accept input and interpret it. While this can be useful, it has several security considerations and was removed in Python 3. To access the same functionality, eval(input()) can be used.
To keep a script portable across the two versions, you can rebind input to raw_input near the top of your Python script whenever raw_input exists.
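One way to write such a shim is sketched below. It binds a new name, read_line (chosen for this illustration), rather than overwriting input, so the rest of the script calls read_line on either version.

```python
try:
    read_line = raw_input        # Python 2: raw_input returns the typed text
except NameError:
    read_line = input            # Python 3: input already behaves that way
```

A call such as read_line("Your name: ") then returns the entered text as a string on both versions.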