*********************************
Porting Python 2 Code to Python 3
*********************************

With Python 3 being the future of Python while Python 2 is still in active
use, it is good to have your project available for both major releases of
Python. This guide is meant to help you choose which strategy works best
for your project to support both Python 2 & 3, along with how to execute
that strategy.

If you are looking to port an extension module instead of pure Python code,
please see :ref:`cporting-howto`.


Choosing a Strategy
===================

When a project decides that it is time to support both Python 2 & 3, it
also has to decide how to go about accomplishing that goal. The chosen
strategy will depend on how large the project's existing codebase is and how
much divergence you want between your Python 2 codebase and your Python 3 one
(e.g., starting a new version with Python 3).

If your project is brand-new or does not have a large codebase, then you may
want to consider :ref:`writing all of your code for Python 3 and using 3to2
<use_3to2>` to port your code to Python 2.

If your project has a pre-existing Python 2 codebase and you would like Python
3 support to start off a new branch or version of your project, then you will
most likely want to :ref:`port using 2to3 <use_2to3>`. This will allow you to
port your Python 2 code to Python 3 in a semi-automated fashion and begin to
maintain it separately from your Python 2 code. This approach can also work if
your codebase is small and/or simple enough for the translation to occur
quickly.

Finally, if you want to maintain Python 2 and Python 3 versions of your project
simultaneously and with no differences, then you can write :ref:`Python 2/3
source-compatible code <use_same_source>`. While the code is not quite as
idiomatic as code written just for Python 3 or produced by automating the port
from Python 2, it does make it easier to continue rapid development regardless
of which major version of Python you are developing against at the time.

Regardless of which approach you choose, porting is probably not as hard or
time-consuming as you might initially think. You can also tackle the problem
piecemeal, as a good portion of porting is simply updating your code to follow
current best practices in a Python 2/3 compatible way.


Universal Bits of Advice
------------------------

Regardless of what strategy you pick, there are a few things you should
consider.

One is to make sure you have a robust test suite. You need to make sure
everything continues to work, just like when you support a new minor version of
Python. This means making sure your test suite is thorough and is ported
properly between Python 2 & 3. You will also most likely want to use something
like tox_ to automate testing between both a Python 2 and Python 3 VM.

Two, once your project has Python 3 support, make sure to add the proper
classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3
compatible it must have the
`Python 3 classifier <http://pypi.python.org/pypi?:action=browse&c=533>`_
(from
http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/)::

   from setuptools import setup  # include_package_data is a setuptools option

   setup(
       name='yourlibrary',  # placeholder project name
       classifiers=[
           # make sure to use :: Python *and* :: Python :: 3 so
           # that pypi can list the package on the python 3 page
           'Programming Language :: Python',
           'Programming Language :: Python :: 3',
       ],
       packages=['yourlibrary'],
       # make sure to add custom_fixers to the MANIFEST.in
       include_package_data=True,
       # ...
   )

Doing so will cause your project to show up in the
`Python 3 packages list
<http://pypi.python.org/pypi?:action=browse&c=533&show=all>`_. You will know
you set the classifier properly as visiting your project page on the Cheeseshop
will show a Python 3 logo in the upper-left corner of the page.

Three, the six_ project provides a library which helps iron out differences
between Python 2 & 3. If you find there is a sticking point that keeps causing
trouble in your translation or maintenance of code, consider using a
source-compatible solution relying on six. If you have to create your own
Python 2/3 compatible solution, you can use ``sys.version_info[0] >= 3`` as a
guard.

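As a minimal sketch of a hand-written guard based on
``sys.version_info[0] >= 3`` (the names ``PY3``, ``text_type``,
``binary_type`` and ``to_text`` are made up for this illustration and do not
come from six or any other library), a small compatibility module might look
like this::

   import sys

   PY3 = sys.version_info[0] >= 3

   if PY3:
       text_type = str
       binary_type = bytes
   else:
       text_type = unicode  # this branch is only evaluated on Python 2
       binary_type = str


   def to_text(value, encoding='utf-8'):
       """Return value as text, decoding it first if it is bytes."""
       if isinstance(value, binary_type):
           return value.decode(encoding)
       return value

Keeping every version check in one module like this means the rest of the
code can import the aliases instead of repeating the guard.
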
Four, read all the approaches. Just because some bit of advice applies to one
approach more than another doesn't mean that it doesn't apply to other
strategies.

Five, drop support for older Python versions if possible. `Python 2.5`_
introduced a lot of useful syntax and libraries which have become idiomatic
in Python 3. `Python 2.6`_ introduced future statements such as
``print_function`` and ``unicode_literals`` which make compatibility much
easier if you are going from Python 2 to 3. `Python 2.7`_ continues the trend
in the stdlib. So choose the newest version of Python which you believe can be
your minimum supported version.


.. _tox: http://codespeak.net/tox/
.. _Cheeseshop:
.. _PyPI: http://pypi.python.org/
.. _six: http://packages.python.org/six
.. _Python 2.7: http://www.python.org/2.7.x
.. _Python 2.6: http://www.python.org/2.6.x
.. _Python 2.5: http://www.python.org/2.5.x
.. _Python 2.4: http://www.python.org/2.4.x
.. _Python 2.3: http://www.python.org/2.3.x
.. _Python 2.2: http://www.python.org/2.2.x


.. _use_3to2:

Python 3 and 3to2
=================

If you are starting a new project or your codebase is small enough, you may
want to consider writing your code for Python 3 and backporting to Python 2
using 3to2_. Thanks to Python 3 being more strict about things than Python 2
(e.g., bytes vs. strings), the source translation can be easier and more
straightforward than from Python 2 to 3. Plus it gives you more direct
experience developing in Python 3 which, since it is the future of Python, is a
good thing long-term.

A drawback of this approach is that 3to2 is a third-party project. This means
that the Python core developers (and thus this guide) can make no promises
about how well 3to2 works at any time. There is nothing to suggest, though,
that 3to2 is not a high-quality project.


.. _3to2: https://bitbucket.org/amentajo/lib3to2/overview


.. _use_2to3:

Python 2 and 2to3
=================

Included with Python since 2.6, the 2to3_ tool (and :mod:`lib2to3` module)
helps with porting Python 2 to Python 3 by performing various source
translations. This is a good fit for projects which wish to branch
their Python 3 code from their Python 2 codebase and maintain them as
independent codebases. You can even begin preparing to use this approach
today by writing future-compatible Python code which works cleanly in
Python 2 in conjunction with 2to3; all steps outlined below will work
with Python 2 code up to the point when the actual use of 2to3 occurs.

Use of 2to3 as an on-demand translation step at install time is also possible,
preventing the need to maintain a separate Python 3 codebase, but this approach
does come with some drawbacks. While users will only have to pay the
translation cost once at installation, you as a developer will need to pay the
cost regularly during development. If your codebase is sufficiently large,
then the translation step ends up acting like a compilation step,
robbing you of the rapid development process you are used to with Python.
The time required to translate a project will vary, so do an experimental
translation to see how long it takes, and use that to evaluate whether you
prefer this approach to using :ref:`use_same_source` or simply keeping
a separate Python 3 codebase.

Below are the typical steps taken by a project which uses a 2to3-based approach
to supporting Python 2 & 3.


Support Python 2.7
------------------

As a first step, make sure that your project is compatible with `Python 2.7`_.
This is worthwhile because Python 2.7 is the last release of Python 2 and thus
will be used for a rather long time. It also allows for use of the ``-3`` flag
to Python to help discover places in your code which 2to3 cannot handle but are
known to cause issues.


Try to Support `Python 2.6`_ and Newer Only
-------------------------------------------

While not possible for all projects, if you can support `Python 2.6`_ and newer
**only**, your life will be much easier. Various future statements, stdlib
additions, etc. exist only in Python 2.6 and later which greatly assist in
porting to Python 3. But if your project must keep support for `Python 2.5`_ (or
even `Python 2.4`_) then it is still possible to port to Python 3.

Below are the benefits you gain if you only have to support Python 2.6 and
newer. Some of these options are personal choice while others are
**strongly** recommended (the ones that are more for personal choice are
labeled as such). If you continue to support older versions of Python then you
at least need to watch out for situations that these solutions fix.


``from __future__ import print_function``
'''''''''''''''''''''''''''''''''''''''''

This is a personal choice. 2to3 handles the translation from the print
statement to the print function rather well so this is an optional step. This
future statement does help, though, with getting used to typing
``print('Hello, World')`` instead of ``print 'Hello, World'``.


``from __future__ import unicode_literals``
'''''''''''''''''''''''''''''''''''''''''''

Another personal choice. You can always mark what you want to be a (unicode)
string with a ``u`` prefix to get the same effect. But regardless of whether
you use this future statement or not, you **must** make sure you know exactly
which Python 2 strings you want to be bytes, and which are to be strings. This
means you should, **at minimum**, mark all strings that are meant to be text
strings with a ``u`` prefix if you do not use this future statement.


Bytes literals
''''''''''''''

This is a **very** important one. The ability to prefix Python 2 strings that
are meant to contain bytes with a ``b`` prefix helps to very clearly delineate
what is and is not a Python 3 string. When you run 2to3 on code, all Python 2
strings become Python 3 strings **unless** they are prefixed with ``b``.

There are some differences between byte literals in Python 2 and those in
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
Probably the biggest "gotcha" is that indexing results in different values. In
Python 2, the value of ``b'py'[1]`` is ``'y'``, while in Python 3 it's ``121``.
You can avoid this disparity by always slicing at the size of a single element:
``b'py'[1:2]`` is ``'y'`` in Python 2 and ``b'y'`` in Python 3 (i.e., close
enough).

You cannot concatenate bytes and strings in Python 3. But since Python 2 has
bytes aliased to ``str``, it will succeed there: ``b'a' + u'b'`` works in
Python 2, but ``b'a' + 'b'`` in Python 3 is a :exc:`TypeError`. A similar issue
also comes about when doing comparisons between bytes and strings.

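One way to stay clear of mixed concatenation is to decode bytes once, at the
point where they enter the program, and to encode again only where bytes are
required. The following is a small sketch of that habit; the variable names
and the choice of UTF-8 are assumptions made for the example::

   data = b'caf\xc3\xa9'    # bytes, e.g. read from a socket or a binary file
   suffix = u' au lait'     # text

   # Decode at the boundary, then work with text only.
   text = data.decode('utf-8') + suffix

   # Encode explicitly when bytes are needed again.
   encoded = text.encode('utf-8')

   try:
       data + suffix        # TypeError on Python 3
       mixed_ok = True
   except (TypeError, UnicodeDecodeError):
       # Python 2 attempts an implicit ASCII decode instead, which fails
       # with UnicodeDecodeError for this non-ASCII data.
       mixed_ok = False

The ``u`` prefix used here is accepted by Python 2 and by Python 3.3 and
later.
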

Supporting `Python 2.5`_ and Newer Only
---------------------------------------

If you are supporting `Python 2.5`_ and newer there are still some features of
Python that you can utilize.


``from __future__ import absolute_import``
'''''''''''''''''''''''''''''''''''''''''''

Implicit relative imports (e.g., importing ``spam.bacon`` from within
``spam.eggs`` with the statement ``import bacon``) do not work in Python 3.
This future statement moves away from that and allows the use of explicit
relative imports (e.g., ``from . import bacon``).

In `Python 2.5`_ you must use
the ``__future__`` statement to get to use explicit relative imports and
prevent implicit ones. In `Python 2.6`_ explicit relative imports are available
without the statement, but you still want the ``__future__`` statement to
prevent implicit relative imports. In `Python 2.7`_ implicit relative imports
are still the default, so the statement is needed there as well. In other
words, unless you support a version earlier than Python 2.5 (where the
statement is not available), use the ``__future__`` statement.


Handle Common "Gotchas"
-----------------------

There are a few things that consistently come up as sticking points for
people, which either 2to3 cannot handle automatically or which can easily be
done in Python 2 to help modernize your code.


``from __future__ import division``
'''''''''''''''''''''''''''''''''''

While the exact same outcome can be had by using the ``-Qnew`` argument to
Python, using this future statement lifts the requirement that your users use
the flag to get the expected behavior of division in Python 3
(e.g., ``1/2 == 0.5; 1//2 == 0``).

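A short sketch of how this looks in a module follows. The helper names
``mean`` and ``page_count`` are invented for the example; the point is that
``/`` always means true division once the future statement is present, and
``//`` is written wherever truncation is actually intended::

   from __future__ import division


   def mean(values):
       """Arithmetic mean; '/' is true division on Python 2 and 3 alike."""
       return sum(values) / len(values)


   def page_count(items, per_page):
       """Pages needed for items; '//' makes the floor division explicit."""
       return (items + per_page - 1) // per_page

The future statement has to be the first statement of the module (only the
docstring, comments and other future statements may precede it).
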

Specify when opening a file as binary
'''''''''''''''''''''''''''''''''''''

Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** decide whether a file will be used for
binary access (allowing you to read and/or write bytes data) or text access
(allowing you to read and/or write unicode data).


Text files
''''''''''

Text files created using ``open()`` under Python 2 return byte strings,
while under Python 3 they return unicode strings. Depending on your porting
strategy, this can be an issue.

If you want text files to return unicode strings in Python 2, you have two
possibilities:

* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open`
  is essentially the same function in both Python 2 and Python 3, it will
  help iron out any issues that might arise.

* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open`
  instead. This will make sure that you get back unicode strings in Python 2.

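As an illustration of the :func:`io.open` route, the sketch below writes and
reads a text file with an explicit encoding and then reads the same file in
binary mode. The temporary file path and the UTF-8 encoding are choices made
for the example only::

   import io
   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'example.txt')

   # Text mode with an explicit encoding: write and read unicode strings.
   with io.open(path, 'w', encoding='utf-8') as f:
       f.write(u'caf\xe9\n')

   with io.open(path, 'r', encoding='utf-8') as f:
       text = f.read()

   # Binary mode: the same file yields the encoded bytes.
   with io.open(path, 'rb') as f:
       raw = f.read()

Passing ``encoding`` explicitly avoids depending on the platform default,
which can differ between machines.
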

Subclass ``object``
'''''''''''''''''''

New-style classes have been around since `Python 2.2`_. You need to make sure
you are subclassing from ``object`` to avoid odd edge cases involving method
resolution order, etc. This continues to be totally valid in Python 3 (although
unneeded as all classes implicitly inherit from ``object``).


Deal With the Bytes/String Dichotomy
''''''''''''''''''''''''''''''''''''

One of the biggest issues people have when porting code to Python 3 is handling
the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold
textual data, people have over the years been rather loose in their delineation
of what ``str`` instances held text compared to bytes. In Python 3 you cannot
be so care-free anymore and need to properly handle the difference. The key
to handling this issue is to make sure that **every** string literal in your
Python 2 code is either syntactically or functionally marked as either bytes or
text data. After this is done you then need to make sure your APIs are designed
to either handle a specific type or made to be properly polymorphic.


Mark Up Python 2 String Literals
********************************

The first thing you must do is designate every single string literal in
Python 2 as either textual or bytes data. If you are only supporting Python 2.6
or newer, this can be accomplished by marking bytes literals with a ``b``
prefix and then designating textual data with a ``u`` prefix or using the
``unicode_literals`` future statement.

If your project supports versions of Python pre-dating 2.6, then you should use
the six_ project and its ``b()`` function to denote bytes literals. For text
literals you can either use six's ``u()`` function or use a ``u`` prefix.

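For a Python 2.6+ codebase, explicit marking might look like the sketch
below. The constants and the ``is_png`` helper are invented for the example;
note that the ``u`` prefix is accepted by Python 2 and by Python 3.3 and
later, but not by Python 3.0 through 3.2::

   MAGIC = b'\x89PNG\r\n\x1a\n'     # bytes: a file signature
   GREETING = u'na\xefve caf\xe9'   # text: meant for people to read


   def is_png(header):
       """Return True if the bytes in header start with the PNG signature."""
       return header.startswith(MAGIC)

With every literal marked, a reader (and 2to3) can tell at a glance which
values are binary data and which are text.
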

Decide what APIs Will Accept
****************************

In Python 2 it was very easy to accidentally create an API that accepted both
bytes and textual data. But in Python 3, thanks to the more strict handling of
disparate types, this loose usage of bytes and text together tends to fail.

Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. It creates the
dict ``{'a': 'text'}`` since ``b'a' == u'a'``: the second entry overwrites the
value of the first. But in Python 3 the equivalent dict creates
``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data. Similar
issues can crop up when transitioning Python 2 code to Python 3.

This means you need to choose what an API is going to accept and create, and
consistently stick to that API in both Python 2 and 3.

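One possible way to enforce such a decision is to reject the unwanted type at
the API boundary instead of relying on implicit coercion. The function below
is a hypothetical example of a text-only API, not something taken from an
existing library::

   def normalize_header_name(name):
       """Accept text only and return it in lower case.

       Bytes are rejected up front instead of being silently coerced, so
       the function behaves the same way on Python 2 and Python 3.
       """
       if isinstance(name, bytes):
           raise TypeError('name must be text, not bytes')
       return name.lower()

Because ``bytes`` is an alias of ``str`` in Python 2, this check also rejects
unmarked Python 2 string literals, which helps find call sites that still
pass the wrong type.
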

Bytes / Unicode Comparison
**************************

In Python 3, mixing bytes and unicode is forbidden in most situations; it
will raise a :class:`TypeError` where Python 2 would have attempted an implicit
coercion between types. However, there is one case where it doesn't and
it can be very misleading: the comparison ``b"" == ""`` quietly evaluates to
``False``.

This is because an equality comparison is required by the language to always
succeed (and return ``False`` for incompatible types). However, this also
means that code incorrectly ported to Python 3 can display buggy behaviour
if such comparisons are silently executed. To detect such situations,
Python 3 has a ``-b`` flag that will display a warning::

   >>> b"" == ""
   __main__:1: BytesWarning: Comparison between bytes and string
   False

To turn the warning into an exception, use the ``-bb`` flag instead::

   >>> b"" == ""
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   BytesWarning: Comparison between bytes and string

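The same check can be scripted, for instance as part of a test run. The
sketch below starts a child interpreter with ``-bb`` and inspects its exit
status; the one-line program passed to ``-c`` stands in for your real test
command, and the variable names are arbitrary::

   import os
   import subprocess
   import sys

   # Run a child interpreter with -bb; the comparison is the whole program.
   command = [sys.executable, '-bb', '-c', 'b"" == ""']
   with open(os.devnull, 'w') as devnull:
       returncode = subprocess.call(command, stderr=devnull)

   # On Python 3 the BytesWarning becomes an uncaught exception, so the
   # child exits with a non-zero status.
   comparison_flagged = returncode != 0
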

Indexing bytes objects
''''''''''''''''''''''

Another potentially surprising change is the indexing behaviour of bytes
objects in Python 3: ``b"xyz"[0]`` evaluates to the integer ``120``, not to
``b'x'``.

Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
are sequences of integers. But code converted from Python 2 will often
assume that indexing a bytestring produces another bytestring, not an
integer. To reconcile both behaviours, use slicing: ``b"xyz"[0:1]`` is
``b'x'``, and for an in-range index ``n`` the slice ``b"xyz"[n:n+1]`` is a
bytes object of length one.

The only remaining gotcha is that an out-of-bounds slice returns an empty
bytes object instead of raising ``IndexError``::

   >>> b"xyz"[3]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: index out of range
   >>> b"xyz"[3:4]
   b''

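If the ``IndexError`` matters to your code, the slice can be wrapped in a
small helper that restores it. ``byte_at`` below is a hypothetical name used
for illustration; it is not part of the standard library::

   def byte_at(data, index):
       """Return the byte at index as a bytes object of length one.

       Unlike a bare slice, this raises IndexError when index is out of
       range.
       """
       # "index + 1 or None" turns a stop of 0 (for index == -1) into None,
       # so that negative indexes work as well.
       item = data[index:index + 1 or None]
       if len(item) != 1:
           raise IndexError('index out of range')
       return item

The helper returns a length-one byte string on both Python 2 and Python 3,
for example ``byte_at(b'xyz', -1)`` gives the final byte.
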

``__str__()``/``__unicode__()``
'''''''''''''''''''''''''''''''

In Python 2, objects can specify both a string and unicode representation of
themselves. In Python 3, though, there is only a string representation. This
becomes an issue as people can inadvertently do things in their ``__str__()``
methods which have unpredictable results (e.g., infinite recursion if you
happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your
``__str__()`` method).

There are two ways to solve this issue. One is to use a custom 2to3 fixer. The
blog post at http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
specifies how to do this. That will allow 2to3 to change all instances of
``def __unicode__(self): ...`` to ``def __str__(self): ...``. This does require
that you define your ``__str__()`` method in Python 2 before your
``__unicode__()`` method.

The other option is to use a mixin class. This allows you to only define a
``__unicode__()`` method for your class and let the mixin derive
``__str__()`` for you (code from
http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/)::

   import sys

   class UnicodeMixin(object):

       """Mixin class to handle defining the proper __str__/__unicode__
       methods in Python 2 or 3."""

       if sys.version_info[0] >= 3:  # Python 3
           def __str__(self):
               return self.__unicode__()
       else:  # Python 2
           def __str__(self):
               return self.__unicode__().encode('utf8')


   class Spam(UnicodeMixin):

       def __unicode__(self):
           return u'spam-spam-bacon-spam'  # 2to3 will remove the 'u' prefix


Don't Index on Exceptions
'''''''''''''''''''''''''

In Python 2, the following worked::

   >>> exc = Exception(1, 2, 3)
   >>> exc[1]  # Python 2 only!
   2

But in Python 3, indexing directly on an exception is an error. You need to
make sure to only index on the :attr:`BaseException.args` attribute which is a
sequence containing all arguments passed to the :meth:`__init__` method.

Even better is to use the documented attributes the exception provides.

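Both options can be sketched with a custom exception. ``ParseError`` and its
``reason`` and ``lineno`` attributes are made up for this example::

   class ParseError(Exception):
       """Hypothetical exception exposing its details as attributes."""

       def __init__(self, reason, lineno):
           super(ParseError, self).__init__(reason, lineno)
           self.reason = reason
           self.lineno = lineno


   try:
       raise ParseError('unexpected token', 42)
   except ParseError as exc:   # the 'as' syntax needs Python 2.6 or newer
       by_position = exc.args[1]   # portable, but positional
       by_name = exc.lineno        # clearer: a named attribute

Named attributes keep working even if the order of the constructor arguments
changes later.
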

Don't use ``__getslice__`` & Friends
''''''''''''''''''''''''''''''''''''

``__getslice__()`` and its relatives have been deprecated for a while, and
Python 3 finally drops support for them. Move completely over to
:meth:`__getitem__` and friends.

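A :meth:`__getitem__` that supports slicing receives a ``slice`` object as
its argument. The class below is a hypothetical read-only wrapper, shown only
to illustrate the pattern::

   class Window(object):
       """Hypothetical read-only sequence wrapper used for illustration."""

       def __init__(self, items):
           self._items = list(items)

       def __len__(self):
           return len(self._items)

       def __getitem__(self, index):
           if isinstance(index, slice):
               # Slices arrive as slice objects; return the wrapper type.
               return Window(self._items[index])
           return self._items[index]

With this in place ``Window(range(5))[1:3]`` is another ``Window`` holding
two items, on Python 2 and Python 3 alike.
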

Updating doctests
'''''''''''''''''

2to3_ will attempt to generate fixes for doctests that it comes across. It's
not perfect, though. If you wrote a monolithic set of doctests (e.g., a single
docstring containing all of your doctests), you should at least consider
breaking the doctests up into smaller pieces to make it more manageable to fix.
Otherwise it might very well be worth your time and effort to port your tests
to :mod:`unittest`.


Eliminate ``-3`` Warnings
-------------------------

When you run your application's test suite, run it using the ``-3`` flag passed
to Python. This will cause various warnings to be raised during execution about
things that 2to3 cannot handle automatically (e.g., modules that have been
removed). Try to eliminate those warnings to make your code even more portable
to Python 3.


Run 2to3
--------

Once you have made your Python 2 code future-compatible with Python 3, it's
time to use 2to3_ to actually port your code.


Manually
''''''''

To manually convert source code using 2to3_, you use the ``2to3`` script that
is installed with Python 2.6 and later::

   2to3 <directory or file to convert>

This will cause 2to3 to write out a diff with all of the fixers applied for the
converted source code. If you would like 2to3 to go ahead and apply the changes
you can pass it the ``-w`` flag::

   2to3 -w <stuff to convert>

There are other flags available to control exactly which fixers are applied,
etc.


During Installation
'''''''''''''''''''

When a user installs your project for Python 3, you can have either
:mod:`distutils` or Distribute_ run 2to3_ on your behalf.
For distutils, use the following idiom::

   from distutils.core import setup

   try:  # Python 3
       from distutils.command.build_py import build_py_2to3 as build_py
   except ImportError:  # Python 2
       from distutils.command.build_py import build_py

   setup(cmdclass = {'build_py': build_py},
       # ...
   )

For Distribute, pass ``use_2to3=True`` to ``setup()`` instead.

This will allow you to not have to distribute a separate Python 3 version of
your project. It does require, though, that when you perform development
you at least build your project and use the built Python 3 source for testing.


Verify & Test
-------------

At this point you should (hopefully) have your project converted in such a way
that it works in Python 3. Verify it by running your unit tests and making sure
nothing has gone awry. If you missed something, figure out how to fix it in
Python 3, backport the fix to your Python 2 code, and run your code through
2to3 again to verify the fix transforms properly.


.. _2to3: http://docs.python.org/py3k/library/2to3.html
.. _Distribute: http://packages.python.org/distribute/


.. _use_same_source:

Python 2/3 Compatible Source
============================

While it may seem counter-intuitive, you can write Python code which is
source-compatible between Python 2 & 3. It does lead to code that is not
entirely idiomatic Python (e.g., having to extract the currently raised
exception from ``sys.exc_info()[1]``), but it can be run under Python 2
**and** Python 3 without using 2to3_ as a translation step. This allows you to
continue to have a rapid development process regardless of whether you are
developing under Python 2 or Python 3. Whether this approach or using
:ref:`use_2to3` works best for you will be a per-project decision.

To get a complete idea of what issues you will need to deal with, see
`What's New in Python 3.0`_. Others have reorganized the data in other formats
such as http://docs.pythonsprints.com/python3_porting/py-porting.html .

The following are some steps to take to try to support both Python 2 & 3 from
the same source code.


.. _What's New in Python 3.0: http://docs.python.org/release/3.0/whatsnew/3.0.html


Follow The Steps for Using 2to3_ (sans 2to3)
--------------------------------------------

All of the steps outlined in how to
:ref:`port Python 2 code with 2to3 <use_2to3>` apply
to creating a Python 2/3 codebase. This includes trying to support only
Python 2.6 or newer (the :mod:`__future__` statements work in Python 3 without
issue), eliminating warnings that are triggered by ``-3``, etc.

You should even consider running 2to3_ over your code (without committing the
changes). This will let you know where potential pain points are within your
code so that you can fix them properly before they become an issue.


Use six_
--------

The six_ project contains many things to help you write portable Python code.
You should make sure to read its documentation from beginning to end and use
any and all features it provides. That way you will minimize any mistakes you
might make in writing cross-version code.


Capturing the Currently Raised Exception
----------------------------------------

One change between Python 2 and 3 that will require changing how you code (if
you support `Python 2.5`_ and earlier) is
accessing the currently raised exception. In Python 2.5 and earlier the syntax
to access the current exception is::

   try:
       raise Exception()
   except Exception, exc:
       pass  # Current exception is 'exc'

This syntax changed in Python 3 (and was backported to `Python 2.6`_ and later)
to::

   try:
       raise Exception()
   except Exception as exc:
       # Current exception is 'exc'
       # In Python 3, 'exc' is restricted to the block; Python 2.6 will "leak"
       pass

Because of this syntax change, code that must run on both Python 2.5 and
Python 3 has to capture the current exception like this::

   import sys

   try:
       raise Exception()
   except Exception:
       exc = sys.exc_info()[1]  # Current exception is 'exc'

You can get more information about the raised exception from
:func:`sys.exc_info` than simply the current exception instance, but you most
likely don't need it.

In Python 3, the traceback is attached to the exception instance
through the ``__traceback__`` attribute. If the instance is saved in
a local variable that persists outside of the ``except`` block, the
traceback will create a reference cycle with the current frame and its
dictionary of local variables. This will delay reclaiming dead
resources until the next cyclic :term:`garbage collection` pass.

In Python 2, this problem only occurs if you save the traceback itself
(e.g. the third element of the tuple returned by :func:`sys.exc_info`)
in a variable.

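One way to avoid keeping such a cycle alive is to extract what you need
inside the handler and then drop the reference to the exception. The sketch
below uses invented function names and shows one possible arrangement, not
the only one::

   import sys


   def risky():
       raise ValueError('boom')


   def call_and_report():
       """Return a short description of the failure, not the exception."""
       description = None
       try:
           risky()
       except ValueError:
           exc = sys.exc_info()[1]
           try:
               description = '%s: %s' % (type(exc).__name__, exc)
           finally:
               # Drop the local reference so the exception, its traceback
               # and this frame are not kept alive as a cycle.
               del exc
       return description

Calling ``call_and_report()`` returns the string ``'ValueError: boom'`` while
leaving no local variable pointing at the exception.
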

Other Resources
===============

The authors of the following blog posts and wiki pages deserve special thanks
for making public their tips for porting Python 2 code to Python 3 (and thus
helping provide information for this document):

* http://docs.pythonsprints.com/python3_porting/py-porting.html
* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/
* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html
* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/
* http://wiki.python.org/moin/PortingPythonToPy3k

If you feel there is something missing from this document that should be added,
please email the python-porting_ mailing list.

.. _python-porting: http://mail.python.org/mailman/listinfo/python-porting