3
Copyright (c) 1985, 1986, 1987, 1988, 1989 Xerox Corporation.
6
Use and copying of this document is permitted. Any distribution of this
7
document must comply with all applicable United States export control
10
Last updated: 6/3/89 by Gregor
11
10/26/89 by Gregor -- added :RETURN, removed :ISHIFT
13
This file contains documentation of the PCL abstract LAP code. Any port
14
of PCL is required to implement the abstract LAP code interface. There
15
is a portable, relatively good performance implementation in the file
16
lap.lisp, port-specific implementations are in that file as well.
18
The PCL abstract LAP code mechanism exists to provide PCL with a way to
19
create high-performance method lookup functions. Using this mechanism,
20
PCL can produce "LAP closures" which do the method lookup. By allowing
21
PCL to specify these closures using abstract LAP code rather that Lisp
22
code we hope to achieve the following:
24
* Better runtime performance. By using abstract LAP code, we
25
will get better machine instruction sequences than we would
26
from compiling Lisp code.
28
* Better load and update time performance. Because it should
29
be possible to "assemble" the LAP code more quickly than
30
compiling Lisp code, PCL will spend less time building the
33
* Ability to use PCL without a compiler. The LAP assembler will
34
still be required but this should be much smaller than the full
37
Of course, not all implementations of the LAP code mechanism will
38
satisfy all of these goals. The first is the most important.
40
In particular, many PCL ports will use the portable LAP implementation.
41
KCL will use the portable implementation in all of its ports. Other
42
Lisps may have custom LAP implementations for some ports and use the
43
portable implementation for other ports.
45
Some Lisps will have a custom LAP implementation but will nonetheless
46
require the compiler to be loaded to generate LAP closure constructors.
48
An important point is why we have chosen to take this route rather than
49
have each implementation implement the method lookup codes itself. This
50
was done because we are, at PARC, just beginning to study cache behavior
51
for CLOS programs. As we learn more about this we will want to modify
52
the caching strategy PCL uses. This architecture, because it leaves
53
PCL to implement caching behavior makes it possible to do this. Once
54
this study is complete, implementations may want to do their own, ultra
55
high performance implementations of caching strategies.
59
Production of LAP closures is a two step process. In the first step, a
60
port-specific function is called to take abstract LAP code and produce a
61
a "lap closure generator". Lap closure generators are functions which
62
are called with a set of closure variable values and return a LAP
65
The intermediary of the lap closure generators provides an important
66
optimization. Because it is assumed that producing the LAP closure
67
generator can take much longer than producing a LAP closure from the
68
generator, PCL attempts to make only one closure generator for each
69
sequence of LAP code. Because of the way PCL generates the LAP code
70
sequences, this is quite easy for it to do.
72
The rest of this document is divided into six parts.
74
* the metatypes std-instance and fsc-instance
76
* an abstraction for simple vector indices
78
* important optimizations
80
* the port specific function for making lap closure generators
82
* the actual abstract LAP code
86
*** The metatypes STD-INSTANCE and FSC-INSTANCE ***
88
In PCL, instances with metaclass STANDARD-CLASS are represented using
89
the metatype STD-INSTANCE. (Note that in Cinco de Mayo PCL, this
90
metatype is called IWMC-CLASS.) Each port must implement this metatype.
91
The metatype could be implemented by the following DEFSTRUCT.
93
(defstruct (std-instance
94
(:predicate std-instance-p)
95
(:conc-name %std-instance-)
96
(:constructor %allocate-std-instance (wrapper slots))
97
(:constructor %allocate-std-instance-1 ())
98
(:print-function print-std-instance))
102
PCL itself will guarantee correct access to this structure and the
103
accessors and constructors. With this in mind, the following are
106
* Being able to type test this structure quickly is critical. See
107
the :STD-INSTANCE-P opcode.
109
* The allocation functions should compile inline, do no argument
110
checking and be as fast as possible.
112
* The accessor functions should compile inline, do no checking of their
113
arguments and be as fast as possible. SETF of the accessors should
116
The port is also required to implement the metatype FSC-INSTANCE (called
117
FUNCALLABLE-INSTANCE, or FIN for short, in Cinco de Mayo PCL). Objects
118
with this metatype are used, among other things, to implement generic
119
functions. These objects have field structure associated with them and
120
are also functions that can be applied to arguments. The fields are the
121
same as those for STD-INSTANCE, the FSC-INSTANCE metatype has
122
predicates, print-functions, constructors and accessors as follows:
126
%fsc-instance-wrapper
128
%allocate-fsc-instance (wrapper slots)
129
%allocate-fsc-instance-1 ()
131
In addition, objects of metatype FSC-INSTANCE have a property called the
132
funcallable instance function. When an FSC-INSTANCE is applied to
133
arguments, the funcallable instance function is what is actually called.
134
The funcallable instance function of an FSC-INSTANCE can be changed
135
using the function SET-FUNCALLABLE-INSTANCE-FUNCTION. There is no
136
mechanism for obtaining the funcallable instance function of an
139
It is possible to implement the FSC-INSTANCE metatype in pure Common
140
Lisp. A simple implementation which uses lexical closures as the
141
instances and a hash table to record that the lexical closures are of
142
metatype FSC-INSTANCE is easy to write. Unfortunately, this
143
implementation adds significant overhead:
145
to generic-function-invocation (1 function call)
146
to slot-access (1 function call or one hash table lookup)
147
to class-of a generic-function (1 hash-table lookup)
149
In addition, it would prevent the FSC-INSTANCEs from being garbage
150
collected. In short, the pure Common Lisp implementation really isn't
153
Note that previous implementations of FINS were always based on the
154
lexical closure metatype. In some ports, that provides poor
155
performance. Those ports may want to consider reimplementing to use the
156
compiled code metatype. In that implementation strategy, LAP closure
157
variables would become constants of the compiled code object.
159
The following note from JonL is of interest when working on a FIN
162
Date: Tue, 16 May 89 05:45:56 PDT
163
From: Jon L White <jonl@lucid.com>
165
This isn't a bug in Lucid's compiler -- it's a lurking bug in PCL
166
that will "bite" most implementations where different settings of
167
the compiler optimization switches will produce morphologically
168
different (but of course functionally equivalent) function objects.
170
The difficulty is in how discriminator codes service cache misses.
171
They "call out" to (potentially) random functions that will in some
172
cases "smash" the function object that was actually running as the
173
discriminator code. This is all right providing you don't return to
174
that function frame, but alas ...
176
I know this is a more extensive problem because the code in the
177
port-independent function 'notice-methods-change' goes out of
178
its way to do a tail-recursive call to the function that is going
179
to smash the possibly-executing discriminator code. Here is the
180
commentary from that code (sic):
182
;; In order to prevent this we take a simple measure: we just
183
;; make sure that it doesn't try to reference our its own closure
184
;; variables after it makes the dcode change. This is done by
185
;; having notice-methods-change-2 do the work of making the change
186
;; AND calling the actual generic function (a closure variable)
187
;; over. This means that at the time the dcode change is made,
188
;; there is a pointer to the generic function on the stack where
189
;; it won't be affected by the change to the closure variables.
192
A similar thing should be done in the construction of standard-accessor,
193
checking, and caching dcodes. In an experimental version here at Lucid,
194
I rewrote dcode.lisp to do that, and there is no problem with it.
195
Although that code is somewhat Lucid-specific, it could be of help to
196
someone who wanted to rewrite the generic dcode.lisp (no pun intended).
197
Contact me privately if you are interested.
199
Doing a tail-recursive call out of dcodes when there is a cache miss
200
is a good thing, regardless of other problems. I think one might as
201
well do it. However, I should point out that in the presence of
202
multiprocessing, there is another more serious problem that cannot be
203
solved so simply. Think about what happens when one process decides
204
to update a dcode while another process is still using it; no such
205
stack-maintenance discipline will fix this case. A tail-recursive
206
exit from the dcode will *immensely* reduce the likelihood that
207
another process can sneak in during the interval in which the dcode
208
requires consistency in its function; but it can't reduce that
211
The more desirable thing to do is to put the whole "dcode" down one
212
more level of indirection through the symbol-function cell of the
213
generic function. This is effectively what PCL's 'make-trampoline'
214
function does, but unfortunately that is not a very efficient approach
215
when you consider how most compilers will compile it. Something akin
216
to the "mattress-pads" in Steve Haflich's code (in the fin.lisp file)
217
could probably be done for many other implementations as well.
220
*** Index Operations ***
222
Indexes are an abstraction for indexes into a simple vector. This
223
abstraction is used to make it possible to generate more efficient
224
code to access simple vectors. The idea being that this may make it
225
possible to use alternate addressing modes to address these.
227
The "index value" of an index is defined to be the fixnum of which that
228
index is an alternate form. So, using the Lisp function SVREF with the
229
index value of an index accesses the same element as using the index
230
with the appropriate access function or operand.
232
The format of an index is unspecified, but is assumed to be something
233
like a fixnum with certain bits ignored. Accessing a vector using an
234
index must be done using the appropriate special accessor function or
237
Conversion from index values to indices and vice-versa can be done with
238
the following functions:
240
INDEX-VALUE->INDEX (index-value)
241
INDEX->INDEX-VALUE (index)
244
The following constant indicates the maximum index value an index can
245
have in a given port. This must be at least 2^16.
247
INDEX-VALUE-LIMIT - a fixnum, must be at least 2^16.
250
MAKE-INDEX-MASK (<cache-size> <line-size>)
252
This function is used to make index masks. Because I am lazy, I show an
253
implementation of it in the common case where indexes are just fixnums:
255
(defun make-index-mask (cache-size line-size)
256
(let ((cache-size-in-bits (floor (log cache-size 2)))
257
(line-size-in-bits (floor (log line-size 2)))
259
(dotimes (i cache-size-in-bits) (setq mask (dpb 1 (byte 1 i) mask)))
260
(dotimes (i line-size-in-bits) (setq mask (dpb 0 (byte 1 i) mask)))
263
*** Optimizations ***
265
This section discusses two important optimizations related to LAP
266
closures. The first relates to calling LAP closures themselves, the
267
second relates to calling other functions from LAP closures.
269
The important point about calling LAP closures is that almost all of the
270
time, LAP closures will be used as the funcallable-instance-function of
271
funcallable instances. It is required that LAP closures be funcallable
272
themselves, but usually they will be stored in a FIN and the fin will
273
then be funcalled. This brings up several optimizations, including ones
274
having to do with access to the closure variables of a LAP closure.
276
When a LAP closure is used to do method lookup, the function the LAP
277
closure ends up calling has the same number of required arguments as the
278
LAP closure itself. Since the LAP closure must check its required
279
arguments to do the lookup, it is redundant for the function called to
280
do so as well. Since LAP closures do all calls in a tail recursive way,
281
it should even be possible to optimize out certain parts of the normal
282
stack frame initialization.
284
A similar situation occurs between effective method functions and the
285
individual method functions; the difference is that in effective method
286
functions, the calls are not necessarily tail recursive.
288
Consequently, it would be nice to have a way to call certain functions
289
and inhibit the checking of required arguments. This is made possible
290
by use of the PCL-FAST-APPLY and PCL-FAST-FUNCALL macros together with
291
the PCL-FAST-CALL compiler declaration.
293
The PCL-FAST-CALL compiler declaration declares that a function may be
294
fast called. Not all callers of the function will necessarily fast call
295
it, but most probably will. The :JMP opcode can only be used to call a
296
function compiled with the PCL-FAST-CALL declaration.
298
The PCL-FAST-APPLY and PCL-FAST-FUNCALL macros are used to fast call a
299
function. The function argument must be a compiled function that has
300
the PCL-FAST-CALL compiler declaration in its lambda declarations.
302
The basic idea is that the PCL-FAST-CALL compiler declaration causes the
303
compiler to set up an additional entrypoint to the function. This
304
entrypoint comes after checking of required arguments but before
305
processing of other arguments.
307
Note: When FAST-APPLY is used, the required arguments will be given as
308
separate arguments and all other arguments will appear as a single
309
spread argument. For example:
311
(let ((fn (compile () '(lambda (a b &optional (c 'z))
312
(declare (pcl-fast-call))
315
(pcl-fast-apply fn 'x 'y ()) ;legal
316
(pcl-fast-apply fn 'x 'y '(foo)) ;legal
317
(pcl-fast-apply fn '(a b c)) ;illegal
320
*** Producing LAP Closure Generators ***
322
Each implementation of the LAP code mechanism must provide a port
323
specific function making lap closure generators. In the portable
324
implementation, this function is called PLAP-CLOSURE-GENERATOR. In
325
ExCL it should be called EXCL-LAP-CLOSURE-GENERATOR etc.
327
At any time, the value of the variable *make-lap-closure-generator* is a
328
symbol which names the function currently being used to make lap closure
331
The port specific function must accept arguments as follows:
333
PLAP-CLOSURE-GENERATOR (<closure-vars>
336
<simple-vector-registers>
339
This returns a lap-closure generator. A lap-closure generator is a
340
function which is called with a number of arguments equal to the length
341
of <closure-vars>. These arguments are the values of the closure
342
variables for the lap closure. These values cannot be changed once the
343
LAP closure is created. PCL takes care of keeping track of
344
lap-closure-generators it already has on hand and reusing them. The
345
function RESET-LAP-CLOSURE-GENERATORS can be called to force PCL to
346
forget all the lap closure generators it has remembered.
350
A list of symbols. This provides a way to name particular arguments to
351
the LAP closure. Arguments which will not be referenced by name are
352
given as NIL. All required arguments to the LAP closure are explicitly
353
included (perhaps as NIL). If &REST appears at the end of arguments it
354
means that non-required arguments are allowed, these will be processed
355
by the methods. If &REST does not appear at the end of arguments, the
356
lap closure should signal an error if more than the indicated number of
357
arguments are supplied.
363
Specifies a two argument lap closure. If more or less than
364
two arguments are supplied an error is signalled. Within
365
the actual lap code, both arguments can be referenced by
366
name (see the :ARG operand).
370
Specifies a two or more argument lap closure. If less than
371
two arguments are supplied an error is signalled. Within
372
the actual lap code, the first argument can be referenced by
373
name (see the :ARG operand).
378
A list of symbols. The closure will have these as closure variables.
379
Within the lap code these can be accessed using the :CVAR operand. The
380
lap code cannot change these values. SET-FUNCALLABLE-INSTANCE-FUNCTION
381
is permitted to have the special knowledge that there are at most ?? of
382
these and to be prepared to do something special when the funcallable
383
instance function of a funcallable instance is set to a lap closure.
387
A list of register numbers. These registers will be used only to hold
388
indexes. Other registers may be used to hold indexes as well, but the
389
only values put into these registers will be indexes.
391
<simple-vector-registers>
393
A list of register numbers. These registers will be used only to hold
394
simple-vectors. Other registers may be used to hold simple-vectors as
395
well, but the only values put into these registers will be
401
The actual lap code for this closure. This is a list of LAP code
402
opcodes. See the section "Abstract LAP Code" for more details.
404
Each implementation must also supply a function named PRE-MAKE-xxx where
405
xxx is the same as the name of its make-lap-closure-generator function.
406
The macro doesn't evaluate its arguments, and when it appears in a file
407
it should try to do some of the work at load time. It might appear in a
412
(pre-make-plap-closure-generator ...)))
414
*** Abstract LAP Code ***
416
Each lap code operand has the form: (opcode operand1 ... operandn).
418
In some cases, the distinction between an operand and an opcode is
419
somewhat arbitrary. In general, opcodes have a significant "action"
420
component to their behavior. Operands select a piece of data to operate
421
on. Some operands select their data in a more complex way, but they are
424
All data must be in a register before it can be operated on. This
425
requirement means that the only place a non-register operand can appear
426
is as the first argument to the :move opcode. (Actually, there is one
427
other exception, a :iref operand can be the target of a move as well.)
428
Moreover, only register operands can appear as the second argument to
429
the :move opcode and this register must not appear in the <from>
436
A pseudo register. <n> is an integer in the range [0 , 31].
438
A particular implementation can map this to a real register, a memory
439
location or the stack. The abstract LAP code itself does not include
440
the notion of a stack.
442
PCL will attempt to optimize register use in two ways. PCL itself will
443
attempt to re-use registers whenever possible. That is, the port should
444
not have to worry with doing live register analysis for the registers.
445
In addition, PCL will consider lower numbered registers to be "faster"
446
than higher numbered ones.
451
A closure variable of the lap-closure. <name> is a symbol.
456
An argument to the LAP closure. <name> is a symbol.
458
(:std-wrapper <from>)
459
(:fsc-wrapper <from>)
460
(:built-in-wrapper <from>)
461
(:structure-wrapper <from>)
462
(:other-wrapper <from>)
464
Get the class wrapper of <from>. For std-instances and fsc-instances
465
this just fetches the wrapper field. The specific port is required to
466
implement fast access to the wrappers of built-in, structure and other
467
metatypes. A callback mechanism allows the port to ask PCL to generate
468
a class and wrapper for objects for which no class and wrapper exists
469
yet. This mechanism is <<to be spec'd out>>.
472
(:std-slots <operand>)
473
(:fsc-slots <operand>)
475
Fetch the slots field of a std-instance or a fsc-instance.
477
(:constant <constant>)
479
This just allows inline constants. <constant> can be any Lisp object.
481
The following operands operate on indexes. Each is patterned after a
482
Lisp function which would have a corresponding effect on the index value
486
(:i+ <index-1> <index-2>)
487
(:i- <index-1> <index-2>)
488
(:ilogand <index-1> <index-2>)
489
(:ilogxor <index-1> <index-2>)
491
Like the corresponding Lisp functions.
494
(:iref <vector> <index>)
496
Like the SVREF function. <vector> must be a simple vector.
498
(:cref <vector> <constant>)
500
The :cref operand is for constant vector references. <constant> must be
507
A full word move operation.
510
(:eq <from1> <from2> <label>)
511
(:neq <from1> <from2> <label>)
513
EQ and NOT EQ conditional branches. If the value contained in <from1>
514
is EQ (or not) to the value contained in <from2>, jump to <label>.
515
Otherwise continue execution at the next opcode. <label> is a symbol.
517
(:fix= <from1> <from2> <label>)
519
A fixnum = conditional branch.
521
(:izerop <from> <label>)
523
Jump to label if and only if the index <from> is 0.
525
(:std-instance-p <from> <destination-label>)
526
(:fsc-instance-p <from> <destination-label>)
527
(:built-in-instance-p <from> <destination-label>)
528
(:structure-instance-p <from> <destination-label>)
530
Test the metatype of <from> and dispatch. If the metatype of from is of
531
the specified type execution jumps to <destination-label>. Otherwise,
532
execution proceeds normally at the next opcode. See the :xxx-wrapper
537
fcn contains a function to call. This must be a compiled function,
538
which had the PCL-FAST-CALL declaration in it. The call should be "tail
539
recursive" in that whatever values it returns should be returned
540
immediately from the LAP closure itself.
542
Method lookup is implemented by finding a function in the cache and then
543
using :JMP to call it. The various kinds of traps are implemented by
544
using :JMP to call a trap function that was stored in one of the closure
549
Return immediately from the LAP closure. <value> is the single value to
552
In certain cases of method lookup when all the methods are accessor methods,
553
there is an important optimization which implements the slot access in the
554
discriminating function itself. This opcode is used in that case.
558
Identifies a point in the lap code. <label> is a symbol. This can be
559
the target of any of the control transfer opcodes (:GO, :EQ, :NEQ,
560
:FIX=, :STD-INSTANCE-P, :FSC-INSTANCE-P, :STRUCTURE-INSTANCE-P,
561
:BUILT-IN-INSTANCE-P)
565
Jump to the label <label>. <label> is a symbol.
569
Here is an example of the use of the abstract LAP mechanism. It doesn't
570
use all operands or opcodes, but it is representative of the LAP
571
sequences PCL will use.
573
Several things are worth noting:
574
* This is, I believe, just about the simplest such sequence. There
575
are some others of comparable simplicity, but none much simpler.
577
* A total of only 5 registers are used. I haven't checked, but I
578
am pretty sure most all such code sequences will get by with 16
579
or less and many will stay under 8.
581
(defun make-1-arg-std-dfun ()
583
(making-lap-closure-generator
584
(initialize-lap-cvars '(cache mask reflect trap))
585
(initialize-lap-args '(a0))
586
(initialize-lap-registers 4 4 3)
587
(let ((cache (allocate-lap-register 'simple-vector))
588
(count (allocate-lap-register))
589
(wrapper (allocate-lap-register))
590
(index (allocate-lap-register 'index))
591
(t1 (allocate-lap-register))
592
(t2 (allocate-lap-register))
593
(i1 (allocate-lap-register 'index))
594
(i2 (allocate-lap-register 'index)))
596
(opcode :move (operand :cvar 'cache) cache)
597
(opcode :move (operand :arg 'a0) t1)
598
(opcode :std-instance-p t1 'std-instance)
600
(opcode :label 'std-instance)
602
;; The stable register pattern for the rest of the code is:
606
;; index index under consideration
608
;; Whenever we jump to HIT, this pattern must be in force.
610
(opcode :move (operand :std-wrapper t1) wrapper);
611
(opcode :move (operand :cref cache 0) count) ;
612
(opcode :move (operand :cref wrapper 0) i2) ;wrapper-no -> i2
613
(opcode :move (operand :cvar 'mask) i1) ;mask -> i1
614
(opcode :move (operand :ilogand i1 i2) index) ;
616
(opcode :move (operand :iref cache index) t1) ;key -> t1
617
(opcode :eq t1 wrapper 'hit) ;
618
(opcode :move (operand :cvar 'reflect) i1) ;reflect -> i1
619
(opcode :move index i2) ;index -> i2
620
(opcode :move (operand :i- i1 i2) index) ;
621
(opcode :move (operand :iref cache index) t1) ;key -> t1
622
(opcode :eq t1 wrapper 'hit) ;
626
;; To do the hit, we use registers as follows:
627
;; 0 cache comes in here
628
;; 1 cache count comes in here
629
;; 2 use this for the function
630
;; 3 index comes in here
631
;; 4 1+ index, then new count
634
(opcode :move (operand :i1+ index) i1)
635
(opcode :move (operand :iref cache i1) t1)
637
(opcode :move (operand :cref cache 0) t2)
638
(opcode :fix= count t2 'call)
641
(opcode :label 'call)
644
(opcode :label 'trap)
645
(opcode :move (operand :cvar 'trap) t1)
650
(make-index-mask 16 2)
653
(declare (pcl-fast-call) (ignore a))