1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
4
<title>Generalization of Deferred Execution in Python</title>
8
<h1>Generalization of Deferred Execution in Python</h1>
10
<p>Glyph Lefkowitz</p>
13
<div>Twisted Matrix Labs</div>
14
<div><a href="mailto:glyph@twistedmatrix.com">glyph@twistedmatrix.com</a></div>
19
<p>A deceptively simple architectural challenge faced by many multi-tasking
20
applications is gracefully doing nothing. Systems that must wait for the
21
results of a long-running process, network message, or database query while
22
continuing to perform other tasks must establish conventions for the semantics
23
of waiting. The simplest of these is blocking in a thread, but it has
24
significant scalability problems. In asynchronous frameworks, the most common
25
approach is for long-running methods to accept a callback that will be executed
26
when the command completes. These callbacks will have different signatures
27
depending on the nature of the data being requested, and often, a great deal of
28
code is necessary to glue one portion of an asynchronous networking system to
29
another. Matters become even more complicated when a developer wants to wait
30
for two different events to complete, requiring the developer to "juggle"
31
the callbacks and create a third, mutually incompatible callback type to handle
32
the final result. </p>
34
<p>This paper describes the mechanism used by the Twisted framework for waiting
35
for the results of long-running operations. This mechanism, the <code>Deferred</code>,
36
handles the often-neglected problems of error handling, callback juggling,
37
inter-system communication and code readability. </p>
39
<p> In a framework like Twisted, the ability to glue two existing components
40
together with a minimum of mediating code is paramount. Considering that the
41
vast majority of the code in Twisted is asynchronous I/O handling, it is
42
imperative that the mechanism for relaying the data between the output from one
43
system into the input of another be competitive with the simplicity of passing
44
the return value of one method to the argument of another. It was also
45
important to use only no new syntax to avoid confusing programmers who already
46
have experience with Python, and establish no dependencies on anything which
47
would break compatibility with existing Python code or C / Java
50
<h2>Other Popular Approaches</h2>
52
<p>There are several traditional approaches to handling concurrency that have
53
been taken by application frameworks in the past. Each has its own
58
<p>The problems with using threads for concurrency in systems that need to
59
scale is fairly well-documented. However, many systems that use asynchronous
60
multiplexing for I/O and system-level tasks, but run application code in a
61
thread. Zope's threaded request handling is a good example of this model.</p>
63
<p>It is optimal, however, to avoid <em>requiring</em> threads for any part of
64
a framework. Threading has a significant cost, especially in Python. The
65
global interpreter lock destroys any performance benefit that threading may
66
yield on SMP systems, and introduces significant complexity into both framework
67
and application code that needs to be thread-safe.</p>
69
<p>A full discussion of the pros and cons of threads is beyond the scope of
70
this paper, however, using threads merely for blocking operations is clearly
71
overkill. Since each thread represents some allocation of resources, all of
72
those resources are literally sitting idle if they are doing nothing but
73
waiting for the results from a blocking call.</p>
75
<p>In a fairly traditional networking situation, where the server is
76
asynchronously multiplexed, this waste of resources may be acceptable for
77
special-purpose, simple client programs, since only a few will be run at a
78
time. To create a generic system, however, one must anticipate cases when the
79
system in question is not only a multi-user server or a single-user client, but
80
also a multi-user hybrid client/server.</p>
82
<p>A good example of this is a high-volume web spider. A spider may have a
83
server for administrative purposes, but must also be able to spawn many
84
requests at once and wait for them all to return without allocating undue
85
resources for each request. The non-trivial overhead of threads, in addition
86
to sockets, would be a very serious performance problem. </p>
88
<h3>Callback Conventions</h3>
90
<p>At some level, any system for handling asynchronous results in Python will
91
be based on callback functions. The typical way to present this to the
92
application programmer is to have all asynchronous methods accept a callback as
93
one of their arguments.</p>
95
<p>This approach is usually standardized by giving the callback having a
96
standard name ("callback") or a particular position (first argument, last
97
argument). Even systems which rigorously adhere to such standardization run
98
into problems, however.</p>
100
<p>This approach does work for a variety of events. It is unwieldy when one is
101
attempting to write asynchronous "conversations" that involve multiple
102
stages. The first problem that we notice is the lack of error-handling. If an
103
error occurs in normal Python code, Exception handling provides clean and
104
powerful semantics for handling it. </p>
106
<a href="deferex-listing0.py" class="py-listing">Document Processor Example</a>
108
<p>In an asynchronous method such as the one given above, traditional
109
exceptions fall short. What if an error occurs retrieving the documents from
110
storage? Do we call the callback with an error rather than a result?</p>
112
<h3>Language Modifications</h3>
114
<p>Other languages handle this by associating different semantics with
115
threading, or providing different constructs altogether for concurrency. This
116
has the disadvantage that these languages aren't Python. Even Stackless Python
117
is problematic because it lacks integration with the wide variety of libraries
118
that Python provides access to. </p>
120
<p>The design of <code>Deferred</code> draws upon some of these other languages, and this
121
section will cover several languages and their impact.</p>
123
<p>In particular, the following list of languages were influential:</p>
133
<p> E, Smalltalk, and Scheme proved particularly influential. In E's, there is
134
a sharp distinction between objects which are synchronously accessible and
135
those which are asynchronously accessible. The original use for
136
<code>Deferred</code>s was to represent results from Perspective Broker method
137
calls. E was interesting in that the entire execution environment had
138
assumptions about networking built in. E's "eventually" operator <a
139
href="#steigler">[stiegler]</a> is what originally inspired the distinction
140
between "a method which returns X" and "a method which returns a
141
<code>Deferred</code> that fires X". </p>
144
Smalltalk was influential in that its syntax for closures provided some
145
precedent for thinking about the continuation of a "conversation" of execution
146
as itself an object with methods. The original thought-experiment that lead to
147
<code>Deferred</code>s was an attempt to write some Squeak code that looked like this:
150
(object callRemote: "fooBar") andThen: [ result |
151
Transcript show: result.
152
] orElse: [ failure |
153
failure printTraceback.
157
The hypothetical <code>callRemote</code> method here would return an object
158
with the method <code>andThen:orElse:</code> that took 2 code blocks, one for
159
handling results and the other for handling errors.
162
<p>It was challenging to write enough Smalltalk code to make anything
163
interesting happen with this paradigm, but within the existing framework of
164
Twisted, it was easy to convert several request/response idioms to use this
165
sort of object. Now that Twisted has dropped python 1.5.2 compatibility, and
166
2.1 is the baseline version, we can use <code>nested_scopes</code> <a
167
href="#hylton">[hylton]</a> and anonymous functions to make the code look
168
similar to this original model. </p>
170
<p> Scheme, of course, provides <code>call-with-current-continuation</code> (or
171
<code>call/cc</code>), the mind-bending control structure which has been a
172
subject of much debate in language-design circles. <code>call/cc</code> may
173
have provided more a model of things to avoid than a real inspiration, though.
174
While it is incredibly powerful, it creates almost as many problems as it
175
solves. In particular, the interaction between continuations and
176
<code>try:finally:</code> is undefined <a href="#pitman">[pitman]</a>, since it
177
is impossible to determine the final time the protected code will be run. The
178
strongest lesson from <code>call/cc</code> was to only take as much state in
179
the <code>Deferred</code> as necessary, and to avoid playing tricks with implicit context.
182
<p>The mechanisms that these languages use, however, often rely upon deeper
183
abstractions that make their interpreters less amenable than Python's to
184
convenient, idiomatic integration with C and UNIX. Scheme's
185
<code>call/cc</code> requires a large amount of work and creativity to
186
integrate with "C" language libraries, as C. Tismer's work in
187
Stackless Python Python has shown. <a href="#tismer">[tismer]</a> </p>
189
<h2>Basics of Deferreds</h2>
191
<p>After several months of working with Twisted's callback-based
192
request/response mechanisms, it became apparent that something more was
193
necessary. Often, errors would silently cause a particular process to halt.
194
The syntax for a multi-stage asynchronous process looked confusing, because
195
multiple different interfaces were being invoked, each of which taking multiple
196
callbacks. The complexity of constructing these stages was constantly being
197
exposed to the application developer, when it shouldn't really concern them.
200
<p>In order to make gluing different request/response systems together easy, we
201
needed to create a more uniform way of having them communicate than a simple
202
convention. In keeping with that goal, we reduced several conventions into one
203
class, <code>Deferred</code>, so that the request system could return a
204
<code>Deferred</code> as output and the responder could accept a <code>Deferred</code> as input..
205
<code>Deferred</code>s are objects which represent the result of a request that is not yet
206
available. It is suggested that any methods which must perform long-running
207
calculations or communication with a remote host return a <code>Deferred</code>.</p>
209
<p>This is similar to the Promise pattern, or lazy evaluation, except that it
210
is a promise that will not be resolved synchronously. The terminology usually
211
used to describe a <code>Deferred</code> is "a <code>Deferred</code> that will fire" a particular
214
<p><code>Deferred</code>s have a small interface, which boils down to these five methods,
215
plus convenience methods that call them:
218
<li><code>addCallbacks(self, callback, errback=None, callbackArgs=None,
219
callbackKeywords=None, errbackArgs=None, errbackKeywords=None)</code></li>
220
<li><code>callback(result)</code></li>
221
<li><code>errback(result)</code></li>
222
<li><code>pause()</code></li>
223
<li><code>unpause()</code></li>
228
<p>In general, code that initially returns <code>Deferred</code>s will be framework code,
229
such as a web request or a remote method call. This means that code that uses
230
the framework will call <code>addCallbacks</code> on the <code>Deferred</code> that is
231
returned by the framework. When the result is ready, the callback will be
232
triggered and the client code can process the result. Usually the utility
233
methods <code>addCallback</code> and <code>addErrback</code> are used.
236
<p>Using <code>addCallbacks</code> has slightly different semantics than using
237
<code>addCallback</code> followed by <code>addErrback</code>;
238
<code>addCallbacks</code> places the callback and the errback "in
239
parallel", meaning if there is an error in your callback, your errback will
240
not be called. Thus using <code>addCallbacks</code> has either/or semantics;
241
either the callback or the errback will be called, but not both.</p>
243
<a href="deferex-listing1.py" class="py-listing">Fictitious Request Example</a>
245
<p>The example given shows a method which returns a <code>Deferred</code> that will fire a
246
formatted string of the result of a given request. The return value of each
247
callback is passed to the first argument of the next.</p>
249
<h2>Generalized Error Handling</h2>
251
<p>As described above in the section on using callbacks for asynchronous result
252
processing, one of the most common application-level problems in an
253
asynchronous framework is an error that causes a certain task to stop
254
executing. For example, if an exception is raised while hashing a user's
255
password, the entire log-in sequence might be halted, leaving the connection in
256
an inconsistent state.</p>
258
<p>One way that Twisted remedies this is to have reasonable default behavior in
259
circumstances such as this: if an uncaught exception is thrown while in the
260
<code>dataReceived</code> callback for a particular connection, the connection
261
is terminated. However, for multi-step asynchronous conversations, this is not
264
<p>Python's basic exception handling provides a good example for an
265
error-handling mechanisms. If the programmer fails to account for an error, an
266
uncaught exception stops the program and produces information to help track it
267
down. Well-written python code never has to manually detect whether an error
268
has occurred or not: code which depends on the previous steps being successful
269
will not be run if they are not. It is easy to provide information about an
270
error by using attributes of exception objects. It is also easy to relay
271
contextual information between successful execution and error handlers, because
272
they execute in the same scope.</p>
274
<p><code>Deferred</code> attempts to mimic these properties as much as possible in an
275
asynchronous context.</p>
277
<h3>Reasonable Defaults</h3>
279
<p>When something unexpected goes wrong, the program should emit some debugging
280
information and terminate the asynchronous chain of processing as gracefully as
283
<p>Python exceptions do this very gracefully, with no effort required on the
284
part of the developer at all.</p>
286
<a href="deferex-simple-raise.py" class="py-listing">Simple Catastrophic Exception</a>
288
<p><code>Deferred</code>s provide a symmetrical facility, where the developer may register a
289
callback but then forego any error processing.</p>
291
<a href="deferex-simple-failure.py" class="py-listing">Simple Catastrophic Deferred Failure</a>
293
<p>In this example, the onSuccess callback will never be run, because the
294
blowUp callback creates an error condition which is not handled. </p>
296
<h3>No Ambiguity about Failure</h3>
298
<p>It is impossible to provide a reasonable default behavior if failure is
299
ambiguous. Code should never have to manually distinguish between success and
300
failure. An error-processing callback has a distinct signature to a
301
result-processing callback.</p>
303
<p>Forcing client code to manually introspect on return values creates a common
304
kind of error; when the success of a given long-running operation is assumed,
305
it appears to work, and it is easier (and less code) to write a callback that
306
only functions properly in a successful case, and creates bizarre errors in a
307
failure case. A simple example:.</p>
309
<a class="py-listing" href="deferex-bad-adding.py">Common Error Pattern</a>
311
<p>In this example, when the remote call to add the two numbers succeeds,
312
everything looks fine. However, when it fails, <code>result</code> will be an
313
exception and not an integer: therefore the printed traceback will say
314
something unhelpful, like:</p>
316
<pre>TypeError: unsupported operand types for +: 'instance' and 'int'</pre>
318
<h3>Rich Information about Errors</h3>
320
<p>It should be easy for developers to distinguish between fatal and non-fatal
321
errors. With Python exceptions, you can do this by specifying exception
322
classes, which is a fairly powerful technique.</p>
324
<a href="deferex-complex-raise.py" class="py-listing">Complex Python Exception</a>
326
<p>With <code>Deferred</code>, we planned to have a syntactically simple technique for
327
accomplishing something similar. The resulting code structure is tends to be a
328
bit more expansive than the synchronous equivalent, due to the necessity of
329
giving explicit names to the functions involved, but it can be just as easy to
332
<a href="deferex-complex-failure.py" class="py-listing">Complex Deferred Failure</a>
334
<p>In this example, we have a callback chain that begins with the result of a
335
remote method call of 3. We then encounter a <code>MyExc</code> error raised
336
in <code>blowUp</code>, which is caught by the errback <code>trapIt</code>.
337
The 'trap' method will re-raise the current failure unless its class matches
338
the given argument, so the error will continue to propagate if it doesn't
339
match, much like an <code>except:</code> clause.</p>
341
<h3>Easy Propagation of Context</h3>
343
<p>While it is dangerous to implicitly propagate too much context (leading to
344
problems similar to those with threads), we wanted to make sure that it is easy
345
to move context from one callback to the next, and to convert information in
346
errors into successful results after the errors have been handled. </p>
348
<p>Both <code>addCallback</code> and <code>addErrback</code> have the signature
349
<code>callable, *args, **kw</code>. The additional arguments are passed
350
through to the registered callback when it is invoked. This allows us to
351
easily send information about the current call to the error-handler in the same
352
way as the success callback.</p>
354
<h2>Patterns of Usage</h2>
356
<p>Since <code>Deferred</code> is designed for a fairly specific class of problems, most
357
places it is used tend to employ certain idioms.</p>
359
<h3>Request-ID Dictionary</h3>
361
<p>If you are implementing a symmetric, message-oriented protocol, you will
362
typically need to juggle an arbitrary number of outstanding requests at once.
363
The normal pattern for doing this is to create a dictionary mapping a request
364
number to a <code>Deferred</code>, and firing a <code>Deferred</code> when a response with a given
365
request-ID associated with it arrives.</p>
367
<p>A good example of this pattern is the Perspective Broker protocol. Each
368
method call has a request, but it is acceptable for the peer to make calls
369
before answering requests. Few protocols are as extensively permissive about
370
execution order as PB, but any full-fledged RPC or RMI protocol will enable
371
similar interactions. The MSN protocol implementation in Twisted also uses
372
something similar.</p>
374
<h3> Sometimes Synchronous Interface </h3>
376
<p>When writing interfaces that application programmers will be implementing
377
frequently, it is often convenient to allow them to either return either a
378
<code>Deferred</code> or a synchronous result. A good example of this is Twisted's Woven, a
379
dynamic web content system.
382
<p> The processing of any XML node within a page may be deferred until some
383
results are ready, be they results of a database query, a remote method call,
384
an authentication request, or a file upload. Many methods that may return
385
Nodes may also return <code>Deferred</code>s, so that in either case the application
386
developer need return the appropriate value. No wrapping is required if it is
387
synchronous, and no manual management of the result is required if it is not.
390
<p>This is the best way to assure that an application developer will never need
391
to care whether a certain method's results are synchronous or not. The first
392
usage of this was in Perspective Broker, to allow easy transparent forwarding
393
of method calls. If a <code>Deferred</code> is returned from a remotely accessible method,
394
the result will not be sent to the caller until the <code>Deferred</code> fires. </p>
396
<a href="deferex-forwarding.py" class="py-listing">Forwarding Local and Remote
399
<h3><code>callRemote</code></h3>
401
<p>Ideally, all interactions between communicating systems would be modeled as
402
asynchronous method calls. Twisted Words, the Twisted chat server, treats any
403
asynchronous operation as a subset of the functionality of Perspective Broker,
404
using the same interface. Eventually, the hope is to make greater use of this
405
pattern, and abstract asynchronous conversations up another level, by having
406
the actual mechanism of message transport wrapped so that client code is only
407
aware of what asynchronous interface is being invoked.</p>
409
<h2>Advanced Features</h2>
411
<p>The first "advanced" feature of <code>Deferred</code>s is actually
412
used quite frequently. As discussed previously, each <code>Deferred</code> has
413
not one, but a chain of callbacks, each of which is passed the result from the
414
previous callback. However, the mechanism that invokes each callback is itself
415
an implementor of the previously-discussed "Sometimes Synchronous
416
Interface" pattern - a callback may return either a value or a
417
<code>Deferred</code>.</p>
419
<p>For example, if we have a <code>Deferred</code> A, which has 2 callbacks: X,
420
which returns a deferred B, that fires the result to X in 2 seconds, and Y,
421
which prints its result, we will see the string "hello" on the screen
422
in 2 seconds. While it may sound complex, this style of coding one
423
<code>Deferred</code> which depends on another looks very natural.</p>
425
<a class="py-listing" href="deferex-chaining.py">Chaining 2
426
<code>Deferred</code>s Together</a>
428
<p>In this way, any asynchronous conversation may pause to wait for an
429
additional request, without knowing in advance of running the first request
430
what all the requests will be.</p>
432
<p>The other advanced feature of <code>Deferred</code>s is not terribly common,
433
but is still useful on occasion. We have glossed over the issue of
434
"pre-executed"<code>Deferred</code>s so far, e.g. <code>Deferred</code>s
435
which have already been called with a callback value before client code adds
436
callbacks to them. The default behavior, which works in almost every
437
situation, is simply to call the callback immediately (synchronously) as it is
438
added. However, there are rare circumstances where race conditions can occur
439
when this naive approach is taken.</p>
441
<p>For this reason, <code>Deferred</code> provides <code>pause</code> and
442
<code>unpause</code> methods, allowing you to put a <code>Deferred</code> into
443
a state where it will stop calling its callbacks as they are added; this will
444
allow you to set up a series of communicating <code>Deferred</code>s without
445
having anything execute, complete your setup work, and then unpause the
448
<p>In this way, you can create centralized choke-points for caring about whether
449
a process is synchronous or not, and completely ignore this problem in your
450
application code. For example, in the now-obsolete Twisted Web Widgets system
451
(a dynamic web content framework that predates woven), it was necessary to make
452
sure that certain <code>Deferred</code>s were always called in order, so the page would
453
render from top to bottom. However, the user's code did not need to concern
454
itself with this, because any <code>Deferred</code>s for which synchronous callback
455
execution would have been an issue were passed to user code paused.</p>
459
<p><code>Deferred</code>s are a powerful abstraction for dealing with
460
asynchronous results. Having a uniform approach to asynchronous conversations
461
allows Twisted APIs to provide a level of familiarity and flexibility for
462
network programmers that approaches that of domain-specific languages, but
463
still provides access to all of Python's power.</p>
465
<h2>Acknowledgements</h2>
467
<p>I would like to thank the entire Twisted team, for making me realize what a
468
good idea I had hit upon with <code>Deferred</code>s.</p>
470
<p>Special thanks go to Andrew Bennetts and Moshe Zadka, for implementing the
471
portion of Twisted used to generate this, and other, papers, and to Ying Li and
472
Donovan Preston for last-minute editorial assistance..</p>
478
<li><a name="stiegler"></a>Marc Stiegler, <a
479
href="http://www.skyhunter.com/marcs/ewalnut.html#SEC19">The E Language in a
480
Walnut</a>, <i>erights.org</i></li>
482
<li><a name="hylton"></a>Jeremy Hylton, <a
483
href="http://www.python.org/peps/pep-0227.html" >PEP 227, "Statically
484
Nested Scopes"</a></li>
486
<li><a name="pitman"></a>Kent Pitman, <a
487
href="http://www.nhplace.com/kent/PFAQ/unwind-protect-vs-continuations.html"
488
>UNWIND-PROTECT vs. Continuations</a>, <i>Kent Pitman's Personal FAQ</i></li>
490
<li><a name="tismer">Christian Tismer, <a
491
href="http://www.stackless.com/spcpaper.htm">Continuations and Stackless
492
Python</a>, <i>Proceedings of the Sixth International Python Conference</i>