==============================================
Integration of PyPy with host Virtual Machines
==============================================

This document is based on the discussion I had with Samuele during the
Duesseldorf sprint. It's not much more than random thoughts -- to be

Terminology disclaimer: both PyPy and .NET have the concept of
"wrapped" or "boxed" objects. To avoid confusion I will use "wrapping"
on the PyPy side and "boxing" on the .NET side.

The goal is to find a way to efficiently integrate the PyPy
interpreter with a hosting environment such as .NET. What we would
like to do includes, but is not limited to:

- calling .NET methods and instantiating .NET classes from Python

- subclassing a .NET class from Python

- handling native .NET objects as transparently as possible

- automatically applying obvious Python <--> .NET conversions when
  crossing the border (e.g. integers, strings, etc.)

One possible solution is the "proxy" approach, in which we manually
(un)wrap/(un)box all the objects when they cross the border.

::

    public static int foo(int x) { return x; }

    >>>> from somewhere import foo

In this case we need to take the intval field of W_IntObject, box it
to a .NET System.Int32, call foo using reflection, then unbox the return
value and construct a new W_IntObject (or reuse an existing one).

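To make the cost of the proxy approach concrete, here is a minimal
sketch in plain Python. It only models the four conversions listed
above; ``BoxedInt32``, ``host_foo`` and ``call_through_proxy`` are
hypothetical stand-ins, not real PyPy or .NET APIs, and reflection is
reduced to an ordinary function call.

```python
class W_IntObject:
    """Interpreter-level wrapper, as on the PyPy side."""
    def __init__(self, intval):
        self.intval = intval


class BoxedInt32:
    """Hypothetical model of a boxed System.Int32 on the .NET side."""
    def __init__(self, value):
        self.value = value


def host_foo(boxed):
    """Hypothetical model of the host method foo, which returns its argument."""
    return BoxedInt32(boxed.value)


def call_through_proxy(host_method, w_arg):
    raw = w_arg.intval                    # unwrap: read the interpreter-level field
    boxed_arg = BoxedInt32(raw)           # box: allocate a host-side object
    boxed_res = host_method(boxed_arg)    # cross the border and call the method
    return W_IntObject(boxed_res.value)   # unbox the result, then wrap it again


w_result = call_through_proxy(host_foo, W_IntObject(42))
```

Every call allocates one box on the way in and one wrapper on the way
out, which is the overhead the rest of this document tries to avoid.
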
The general idea for handling this problem is to split the
"stateful" and "behavioral" parts of wrapped objects, and to use
already-boxed values for storing the state.

This way when we cross the Python --> .NET border we can just throw
away the behavioral part; when crossing .NET --> Python we have to
find the correct behavioral part for that kind of boxed object and

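The two border crossings can be sketched as follows. This is only an
illustration under my own assumptions: the pair is modelled as a plain
tuple, and ``BoxedInt32``, ``IntBehaviour``, ``BEHAVIOUR_FOR``,
``to_host`` and ``from_host`` are hypothetical names.

```python
class BoxedInt32:
    """Hypothetical model of a boxed System.Int32 (the stateful part)."""
    def __init__(self, value):
        self.value = value


class IntBehaviour:
    """Stateless behavioural part; the state is passed in explicitly."""
    def foo(self, boxed, x):
        return boxed.value + x


# One shared behaviour object per kind of boxed value (assumed lookup table).
BEHAVIOUR_FOR = {BoxedInt32: IntBehaviour()}


def to_host(pair):
    """Python --> .NET: throw away the behaviour, hand over the boxed state."""
    boxed, _behaviour = pair
    return boxed


def from_host(boxed):
    """.NET --> Python: look up the behaviour for this kind of boxed object."""
    return (boxed, BEHAVIOUR_FOR[type(boxed)])


boxed = BoxedInt32(41)
pair = from_host(boxed)
result = pair[1].foo(pair[0], 1)
```

Note that neither direction allocates a new box: the boxed value itself
is what travels across the border.
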
Split state and behaviour in the flowgraphs
===========================================

The idea is to write a graph transformation that takes a regular
ootyped flowgraph and splits the classes and objects we want into a
stateful part and a behavioral part.

We need to introduce the new ootypesystem type ``Pair``: it acts like
a Record but it does not have its own identity: the id of the Pair is the id

XXX about ``Pair``: I'm not sure this is totally right. It means
that an object can change identity simply by changing the value of a
field??? Maybe we could add the constraint that the "id" field
can't be modified after initialization (but it's not easy to

XXX-2 about ``Pair``: how to implement it in the backends? One
possibility is to use "struct-like" types if available (as in
.NET). But in this case it's hard to implement methods/functions
that modify the state of the object (such as __init__, usually). The
other possibility is to use a reference type (i.e., a class), but in
this case there will be a gap between the RPython identity (in which
two Pairs with the same state are indistinguishable) and the .NET
identity (in which the two objects will have a different identity,

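The identity gap of the reference-type option can be illustrated with a
small sketch. ``PairAsClass`` and ``same_rpython_identity`` are
hypothetical names, and the comparison rule is my reading of "two Pairs
with the same state are indistinguishable", not a settled definition.

```python
class PairAsClass:
    """A Pair implemented as a host reference type (hypothetical name)."""
    def __init__(self, value, behaviour):
        self.value = value
        self.behaviour = behaviour


def same_rpython_identity(a, b):
    # Assumption: two Pairs count as "the same" when they share their state.
    return a.value is b.value and a.behaviour is b.behaviour


state, bhvr = object(), object()
p1 = PairAsClass(state, bhvr)
p2 = PairAsClass(state, bhvr)

host_identical = p1 is p2                           # host (reference) identity
rpython_identical = same_rpython_identity(p1, p2)   # identity derived from the state
```

Here the host sees two distinct objects while the RPython-level rule
treats them as one, which is exactly the mismatch described above.
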
Step 1: RPython source code
---------------------------

::

        def __init__(self, intval):

            return self.intval + x

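Only fragments of the class appear above. A complete, runnable version
that is consistent with the ``W_IntObject`` declaration and the
flowgraphs in the following steps might look like this; the body of
``__init__`` and the ``foo`` signature are inferred from those listings
and should be read as an assumption.

```python
class W_IntObject(object):
    def __init__(self, intval):
        # Assumed body: store the argument in the only field of the class.
        self.intval = intval

    def foo(self, x):
        return self.intval + x


# Mirrors the calls made by the main flowgraph in the next step.
x = W_IntObject(41)
result = x.foo(1)
```
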
The following examples are sometimes simplified and therefore not 100%
accurate (e.g., we directly list the types of the methods instead of the
ootype._meth instances that contain them).

::

    W_IntObject = Instance(
        "W_IntObject",                   # name
        ootype.OBJECT,                   # base class
        {"intval": (Signed, 0)},         # attributes
        {"foo": Meth([Signed], Signed)}  # methods
    )

Prebuilt constants (referred to by name in the flowgraphs)::

    W_IntObject_meta_pbc = (...)
    W_IntObject.__init__ = (static method pbc - see below for the graph)

::

    1. x = new(W_IntObject)
    2. oosetfield(x, "meta", W_IntObject_meta_pbc)
    3. direct_call(W_IntObject.__init__, x, 41)
    4. result = oosend("foo", x, 1)

::

    W_IntObject.__init__(W_IntObject self, Signed intval) {
      1. oosetfield(self, "intval", intval)
    }

    W_IntObject.foo(W_IntObject self, Signed x) {
      1. value = oogetfield(self, "intval")
      2. result = int_add(value, x)
    }

Step 3: Transformation
----------------------

This step is done before the backend plays any role, but it's still
driven by its needs, because at this time we want a mapping that tells
us which classes to split and how (i.e., which boxed value we want to

Let's suppose we want to map W_IntObject.intval to the .NET boxed
``System.Int32``. This is possible only because W_IntObject contains
a single field. Note that the "meta" field inherited from
ootype.OBJECT is special-cased because we know that it will never
change, so we can store it in the behaviour.

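The mapping and the single-field restriction could be expressed roughly
as below. The format of ``SPLIT_MAP`` and the helper ``plan_split`` are
hypothetical; this document does not define how the mapping is actually
written down.

```python
# Hypothetical mapping: class name -> (field to externalise, host boxed type name)
SPLIT_MAP = {"W_IntObject": ("intval", "System.Int32")}


def plan_split(class_name, fields):
    """Return (state_field, boxed_type, behaviour_fields) for a class to split."""
    state_field, boxed_type = SPLIT_MAP[class_name]
    # "meta" never changes after construction, so it can live in the behaviour.
    leftover = [f for f in fields if f not in (state_field, "meta")]
    if leftover:
        # A single boxed value cannot hold more than one field of state.
        raise ValueError("cannot map %s: extra fields %r" % (class_name, leftover))
    behaviour_fields = [f for f in fields if f == "meta"]
    return state_field, boxed_type, behaviour_fields


plan = plan_split("W_IntObject", ["meta", "intval"])
```
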
::

    W_IntObject_bhvr = Instance(
        {},                                                # no more fields!
        {"foo": Meth([W_IntObject_pair, Signed], Signed)}  # the Pair is also explicitly passed
    )

    W_IntObject_pair = Pair(
        ("value", (System.Int32, 0)),  # (name, (TYPE, default))
        ("behaviour", (W_IntObject_bhvr, W_IntObject_bhvr_pbc))
    )

::

    W_IntObject_meta_pbc = (...)
    W_IntObject.__init__ = (static method pbc - see below for the graph)
    W_IntObject_bhvr_pbc = new(W_IntObject_bhvr); W_IntObject_bhvr_pbc.meta = W_IntObject_meta_pbc
    W_IntObject_value_default = new System.Int32(0)

::

    1. x = new(W_IntObject_pair) # the behaviour has already been set because
                                 # it's the default value of the field

    2. # skipped (meta is already set in the W_IntObject_bhvr_pbc)

    3. direct_call(W_IntObject.__init__, x, 41)

    4. bhvr = oogetfield(x, "behaviour")
       result = oosend("foo", bhvr, x, 1) # note that "x" is explicitly passed to foo

::

    W_IntObject.__init__(W_IntObject_pair self, Signed value) {
      1. boxed = clibox(value) # boxed is of type System.Int32
         oosetfield(self, "value", boxed)
    }

    W_IntObject.foo(W_IntObject_bhvr bhvr, W_IntObject_pair self, Signed x) {
      1. boxed = oogetfield(self, "value")
         value = unbox(boxed, Signed)

      2. result = int_add(value, x)
    }

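The transformed graphs can be simulated in plain Python to check that
they still compute the same result. This is a sketch only: ``clibox``
and ``unbox`` are modelled as trivial helpers around a hypothetical
``BoxedInt32`` class, and the ``Pair`` is modelled as an ordinary
object.

```python
class BoxedInt32:
    """Hypothetical model of a boxed System.Int32."""
    def __init__(self, value):
        self.value = value


def clibox(value):
    return BoxedInt32(value)


def unbox(boxed):
    return boxed.value


class W_IntObject_bhvr:
    """Behavioural part: no fields, the pair is passed explicitly."""
    def foo(self, pair, x):
        value = unbox(pair.value)
        return value + x


W_IntObject_bhvr_pbc = W_IntObject_bhvr()


class W_IntObject_pair:
    def __init__(self):
        self.value = clibox(0)                  # default boxed value
        self.behaviour = W_IntObject_bhvr_pbc   # default behaviour


def W_IntObject_init(pair, value):
    pair.value = clibox(value)


# Same sequence of operations as the transformed main flowgraph.
x = W_IntObject_pair()
W_IntObject_init(x, 41)
bhvr = x.behaviour
result = bhvr.foo(x, 1)
```
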
Applying the transformation to a whole class (sub)hierarchy is a bit more
complex. Basically, we want to mirror the same hierarchy on the
``Pair``\s as well, but we have to work around the VM's limitations. In .NET,
for example, we can't have "covariant fields"::

    class Derived: Base {
        public Derived field;
    }

A solution is to use only one kind of ``Pair``, whose ``value`` and
``behaviour`` fields are of the most precise types that can hold all the
values needed by the subclasses::

    class W_IntObject(W_Object): ...
    class W_StringObject(W_Object): ...

    W_Object_pair = Pair(System.Object, W_Object_bhvr)

Where ``System.Object`` is of course the most precise type that can
hold both ``System.Int32`` and ``System.String``.

This means that the low level type of all the ``W_Object`` subclasses
will be ``W_Object_pair``, but it also means that we will need to
insert the appropriate downcasts every time we want to access its
fields. I'm not sure how much this would affect performance.

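A rough picture of the single ``Pair`` type and of where the downcasts
end up, again as a Python sketch under my own assumptions: Python's
``int`` and ``str`` stand in for ``System.Int32`` and ``System.String``,
and ``downcast`` and the ``length`` method are hypothetical.

```python
class W_Object_pair:
    """The only Pair type: both fields use the most general type."""
    def __init__(self, value, behaviour):
        self.value = value          # plays the role of a System.Object field
        self.behaviour = behaviour  # plays the role of a W_Object_bhvr field


def downcast(obj, expected_type):
    """Models a checked downcast: fail loudly on a type mismatch."""
    if not isinstance(obj, expected_type):
        raise TypeError("expected %s, got %s"
                        % (expected_type.__name__, type(obj).__name__))
    return obj


class W_Object_bhvr:
    pass


class W_IntObject_bhvr(W_Object_bhvr):
    def foo(self, pair, x):
        return downcast(pair.value, int) + x    # downcast on every access


class W_StringObject_bhvr(W_Object_bhvr):
    def length(self, pair):
        return len(downcast(pair.value, str))   # downcast on every access


w_int = W_Object_pair(41, W_IntObject_bhvr())
w_str = W_Object_pair("abc", W_StringObject_bhvr())
int_result = w_int.behaviour.foo(w_int, 1)
str_result = w_str.behaviour.length(w_str)
```

Each field access pays for one type check, which is the cost the
paragraph above is worried about.
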