==================================================
PyPy - Goals and Architecture Overview
==================================================

This document gives an overview of the goals and architecture of PyPy.
See `getting started`_ for a practical introduction and starting points.

* a common translation and support framework for producing
  implementations of dynamic languages, emphasising a clean
  separation between language specification and implementation

* a compliant, flexible and fast implementation of the Python_ Language
  using the above framework to enable new advanced features without having
  to encode low level details into it.

By separating concerns in this way, we intend for our implementation
of Python - and other dynamic languages - to become robust against almost
all implementation decisions, including target platform, memory and
threading models, optimizations applied, up to the point of being able to
automatically *generate* Just-in-Time compilers for dynamic languages.

Conversely, our implementation techniques, including the JIT compiler
generator, should become robust against changes in the languages
implemented.

=============================

PyPy - the Translation Framework
-----------------------------------------------

Traditionally, language interpreters are written in a target platform language
like C/Posix, Java or C#. Each such implementation fundamentally provides
a mapping from application source code to the target environment. One of
the goals of the "all-encompassing" environments, like the .NET framework
and to some extent the Java virtual machine, is to provide standardized
and higher level functionalities in order to support language implementers
in writing language implementations.

PyPy is experimenting with a more ambitious approach. We are using a
subset of the high-level language Python, called RPython_, in which we
write languages as simple interpreters with few references to and
dependencies on lower level details. Our translation framework then
produces a concrete virtual machine for the platform of our choice by
inserting appropriate lower level aspects. The result can be customized
by selecting other feature and platform configurations.

Our goal is to provide a possible solution to the problem of language
implementers: having to write ``l * o * p`` interpreters for ``l``
dynamic languages and ``p`` platforms with ``o`` crucial design
decisions. PyPy aims at having any one of these parameters changeable
independently from each other:

* ``l``: the language that we analyze can be evolved or entirely replaced;

* ``o``: we can tweak and optimize the translation process to produce
  platform specific code based on different models and trade-offs;

* ``p``: we can write new translator back-ends to target different
  physical and virtual platforms.
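
To make the ``l * o * p`` count concrete, here is a small arithmetic sketch.
The language, design-decision and platform names are made-up placeholders,
and the additive count is an idealization of "each parameter changeable
independently", not a figure claimed by the PyPy project.

```python
from itertools import product

# Hypothetical example values; any three lists would do.
languages = ["Python", "Prolog", "JavaScript"]         # l = 3
design_decisions = ["refcounting", "generational-gc"]  # o = 2
platforms = ["C", "JVM", "CLI"]                        # p = 3


def handwritten_interpreters(languages, design_decisions, platforms):
    """One hand-written interpreter per (language, decision, platform)."""
    return list(product(languages, design_decisions, platforms))


def independent_components(languages, design_decisions, platforms):
    """Idealized count when each axis is written once and combined later:
    one interpreter per language, one translation aspect per decision,
    one back-end per platform."""
    return len(languages) + len(design_decisions) + len(platforms)


combinations = handwritten_interpreters(languages, design_decisions, platforms)
component_count = independent_components(languages, design_decisions, platforms)

print(len(combinations))  # 18 = 3 * 2 * 3
print(component_count)    # 8  = 3 + 2 + 3
```

Adding a fourth platform grows the first number by six but the second by
only one, which is the kind of independence the list above describes.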

By contrast, a standardized target environment - say .NET -
enforces ``p=1`` as far as it's concerned. This helps make ``o`` a
bit smaller by providing a higher-level base to build upon. Still,
we believe that enforcing the use of one common environment
is not necessary. PyPy's goal is to give weight to this claim - at least
as far as language implementation is concerned - showing an approach
to the ``l * o * p`` problem that does not rely on standardization.

The most ambitious part of this goal is to *generate Just-In-Time
Compilers* in a language-independent way, instead of only translating
the source interpreter into an interpreter for the target platform.
This is an area of language implementation that is commonly considered
very challenging because of the complexity involved.

PyPy - the Python Interpreter
--------------------------------------------

Our main motivation for developing the translation framework is to
provide a full featured, customizable, fast_ and `very compliant`_ Python
implementation, working on and interacting with a large variety of
platforms and allowing the quick introduction of new advanced language
features.

This Python implementation is written in RPython as a relatively simple
interpreter, in some respects easier to understand than CPython, the C
reference implementation of Python. We are using its high level of
abstraction and its flexibility to quickly experiment with features or
implementation techniques in ways that would, in a traditional approach,
require pervasive changes to the source code. For example, PyPy's Python
interpreter can optionally provide lazily computed objects - a small
extension that would require global changes in CPython. Another example
is the garbage collection technique: changing CPython to use a garbage
collector not based on reference counting would be a major undertaking,
whereas in PyPy it is an issue localized in the translation framework,
and fully orthogonal to the interpreter source code.

===========================

As you would expect from a project implemented using ideas from the world
of `Extreme Programming`_, the architecture of PyPy has evolved over time
and continues to evolve. Nevertheless, the high level architecture is
stable. As described above, there are two rather independent basic
subsystems: the `Python Interpreter`_ and the `Translation Framework`_.

.. _`translation framework`:

The Translation Framework
-------------------------

The job of the translation tool chain is to translate RPython_ programs
into an efficient version of that program for one of various target
platforms, generally one that is considerably lower-level than Python.

The approach we have taken is to reduce the level of abstraction of the
source RPython program in several steps, from the high level down to the
level of the target platform, whatever that may be. Currently we
support two broad flavours of target platforms: the ones that assume a
C-like memory model with structures and pointers, and the ones that
assume an object-oriented model with classes, instances and methods (as,
for example, the Java and .NET virtual machines do).

The translation tool chain never sees the RPython source code or syntax
trees, but rather starts with the *code objects* that define the
behaviour of the function objects one gives it as input. It can be
considered as "freezing" a pre-imported RPython program into an
executable form suitable for the target platform.
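
As a rough illustration of what a code object is, the following sketch
inspects one using standard CPython 3 introspection. It is not part of
PyPy's tool chain, and ``add_one`` is a made-up example function. On the
Python 2 interpreters of the time, the same attribute was also available
as ``func_code``.

```python
import dis


def add_one(x):
    """A made-up function, already imported and therefore already compiled."""
    return x + 1


# The function object carries its compiled code object; no source text or
# syntax tree is needed to get at the bytecode.
code = add_one.__code__

print(code.co_name)      # name of the function the code belongs to
print(code.co_varnames)  # names of arguments and local variables

# The bytecode instructions, decoded by the standard ``dis`` module.
# Opcode names differ between interpreter versions, so none are assumed here.
instructions = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(instructions)
```

A tool that starts from ``code`` rather than from source text can analyze
any function that has been imported, however it was defined.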

The steps of the translation process can be summarized as follows:

* The code object of each source function is converted to a `control
  flow graph`_ by the `Flow Object Space`_.

* The control flow graphs are processed by the Annotator_, which
  performs whole-program type inference to annotate each variable of
  the control flow graph with the types it may take at run-time.

* The information provided by the annotator is used by the RTyper_ to
  convert the high level operations of the control flow graphs into
  operations closer to the abstraction level of the target platform.

* Optionally, `various transformations`_ can then be applied which, for
  example, perform optimizations such as inlining or add capabilities
  such as stackless_-style concurrency.

* Then, the graphs are converted to source code for the target platform
  and compiled into an executable.

This process is described in much more detail in the `document about
the translation process`_.
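
The shape of this pipeline can be sketched with toy stand-ins for each
stage. Everything below is hypothetical: the function names, the tuple
format for operations and the generated C-like text are invented for
illustration and are not PyPy's data structures or APIs. The optional
transformation step is left out, and the "graph" is a straight-line list
of operations with no branches.

```python
def build_flow_graph(name, operations):
    """Stand-in for graph construction: a named list of
    (result, operation, arguments) tuples."""
    return {"name": name, "ops": list(operations)}


def annotate(graph, argument_types):
    """Stand-in for annotation: infer a type for every variable."""
    types = dict(argument_types)
    for result, opname, args in graph["ops"]:
        operand_types = {types[arg] for arg in args}
        if len(operand_types) != 1:
            raise TypeError("mixed operand types in %r" % opname)
        types[result] = operand_types.pop()
    return types


def specialize(graph, types):
    """Stand-in for the RTyper: replace generic operations with
    type-specific, lower-level ones."""
    return [(result, "%s_%s" % (types[result], opname), args)
            for result, opname, args in graph["ops"]]


C_OPERATORS = {"int_add": "+", "int_mul": "*"}


def generate_source(graph, types, low_level_ops, argument_names):
    """Stand-in for a back-end: emit C-like source text."""
    last_result = low_level_ops[-1][0]
    params = ", ".join("%s %s" % (types[a], a) for a in argument_names)
    lines = ["%s %s(%s) {" % (types[last_result], graph["name"], params)]
    for result, opname, (left, right) in low_level_ops:
        lines.append("    %s %s = %s %s %s;" % (
            types[result], result, left, C_OPERATORS[opname], right))
    lines.append("    return %s;" % last_result)
    lines.append("}")
    return "\n".join(lines)


# f(x, y) = (x + y) * x, written as generic operations.
graph = build_flow_graph("f", [("v0", "add", ("x", "y")),
                               ("v1", "mul", ("v0", "x"))])
types = annotate(graph, {"x": "int", "y": "int"})
low_level_ops = specialize(graph, types)
source = generate_source(graph, types, low_level_ops, ["x", "y"])
print(source)
```

Each stage consumes only the output of the previous one, which is what
lets a stage be replaced without touching the others.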

.. _`control flow graph`: translation.html#the-flow-model
.. _`Flow Object Space`: objspace.html#the-flow-object-space
.. _Annotator: translation.html#the-annotation-pass
.. _RTyper: rtyper.html#overview
.. _`various transformations`: translation.html#the-optional-transformations
.. _`document about the translation process`: translation.html

.. _`standard interpreter`:
.. _`python interpreter`:

The Python Interpreter
-------------------------------------

PyPy's *Python Interpreter* is written in RPython and implements the
full Python language. This interpreter very closely emulates the
behavior of CPython. It contains the following key components:

- a bytecode compiler responsible for producing Python code objects
  from the source code of a user application;

- a `bytecode evaluator`_ responsible for interpreting
  these Python code objects;

- a `standard object space`_, responsible for creating and manipulating
  the Python objects seen by the application.

The *bytecode compiler* is the preprocessing phase that produces a
compact bytecode format via a chain of flexible passes (tokenizer,
lexer, parser, abstract syntax tree builder, bytecode generator). The
*bytecode evaluator* interprets this bytecode. It does most of its work
by delegating all actual manipulations of user objects to the *object
space*. The latter can be thought of as the library of built-in types.
It defines the implementation of the user objects, like integers and
lists, as well as the operations between them, like addition or

This division between bytecode evaluator and object space is very
important, as it gives a lot of flexibility. One can plug in
different `object spaces`_ to get different or enriched behaviours
of the Python objects. Additionally, a special more abstract object
space, the `flow object space`_, allows us to reuse the bytecode
evaluator for our translation framework.
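
The following minimal sketch shows the idea of this division, under
stated assumptions: the instruction set, the class names and the method
names (``wrap``, ``add``, ``mul``) are invented for the example and are
far simpler than PyPy's real evaluator and object spaces. The second
space only records operations instead of computing values, which hints
at how an abstract space can reuse the same evaluator for analysis.

```python
class ConcreteObjSpace:
    """Toy object space that computes on ordinary Python values."""

    def wrap(self, value):
        return value

    def add(self, left, right):
        return left + right

    def mul(self, left, right):
        return left * right


class RecordingObjSpace:
    """Toy abstract object space: records each operation and returns a
    fresh variable name instead of a computed value."""

    def __init__(self):
        self.operations = []

    def wrap(self, value):
        return repr(value)

    def _record(self, opname, left, right):
        result = "v%d" % len(self.operations)
        self.operations.append((result, opname, left, right))
        return result

    def add(self, left, right):
        return self._record("add", left, right)

    def mul(self, left, right):
        return self._record("mul", left, right)


def evaluate(bytecode, space, local_vars):
    """Toy stack-based evaluator. It handles control and the stack, and
    delegates every manipulation of objects to ``space``."""
    stack = []
    for opname, arg in bytecode:
        if opname == "LOAD_CONST":
            stack.append(space.wrap(arg))
        elif opname == "LOAD_VAR":
            stack.append(local_vars[arg])
        elif opname in ("ADD", "MUL"):
            right = stack.pop()
            left = stack.pop()
            operation = space.add if opname == "ADD" else space.mul
            stack.append(operation(left, right))
        else:
            raise ValueError("unknown instruction %r" % opname)
    return stack.pop()


# (x + 2) * x in the toy instruction set.
program = [("LOAD_VAR", "x"), ("LOAD_CONST", 2), ("ADD", None),
           ("LOAD_VAR", "x"), ("MUL", None)]

concrete_result = evaluate(program, ConcreteObjSpace(), {"x": 5})
recorder = RecordingObjSpace()
symbolic_result = evaluate(program, recorder, {"x": "x"})

print(concrete_result)      # 35
print(recorder.operations)  # [('v0', 'add', 'x', '2'), ('v1', 'mul', 'v0', 'x')]
```

The evaluator is identical in both runs; only the object space passed to
it changes.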

.. _`bytecode evaluator`: interpreter.html
.. _`standard object space`: objspace.html#the-standard-object-space
.. _`object spaces`: objspace.html
.. _`flow object space`: objspace.html#the-flow-object-space

.. _`the translation framework`:

All of PyPy's documentation can be reached from the `documentation
index`_. Of particular interest after reading this document might be:

* `getting-started`_: a hands-on guide to getting involved with the

* `PyPy's approach to virtual machine construction`_: a paper
  presented to the Dynamic Languages Symposium attached to OOPSLA

* `The translation document`_: a detailed description of our
  translation process. You might also be interested in reading the
  more theoretically-oriented paper `Compiling dynamic language
  implementations`_.

* All our `Technical reports`_.

.. _`documentation index`: index.html
.. _`getting-started`: getting-started.html
.. _`PyPy's approach to virtual machine construction`: http://codespeak.net/svn/pypy/extradoc/talk/dls2006/pypy-vm-construction.pdf
.. _`the translation document`: translation.html
.. _`Compiling dynamic language implementations`: dynamic-language-translation.html
.. _`Technical reports`: index-report.html

.. _`getting started`: getting-started.html
.. _`Extreme Programming`: http://www.extremeprogramming.org/

.. _fast: faq.html#how-fast-is-pypy
.. _`very compliant`: http://www2.openend.se/~pedronis/pypy-c-test/allworkingmodules/summary.html

.. _`RPython`: coding-guide.html#rpython

.. _Python: http://docs.python.org/ref
.. _Psyco: http://psyco.sourceforge.net
.. _stackless: stackless.html

.. include:: _ref.txt