This file merges information from the various README.r[5-8] notes
shipped with Ben Jackson and Jay Carlson's "rogue" server patches.
The information below didn't fit well into the changelog format.

1.8.0r5 is a collection of unofficial patches to Erik Ostrom's
LambdaMOO 1.8.0p6 server release. They're primarily bug fixes and
speedups. For logistical reasons they're packaged as a tar file
rather than as a collection of diffs.

It's difficult to measure MOO server performance. All we can say is
that some plausible synthetic benchmarks are now two to four times
faster. Users have noted that production systems running this code
feel much more responsive at computationally expensive tasks.

All files were run through GNU indent with settings given in
.indent.pro in an attempt to normalize coding style.

Fixed bizarre bug where uninitialized memory was accessed; usually
multiplied by zero immediately, so nobody ever noticed.

eval_env.c, db_io.c, objects.c, utils.c:

Type identifiers (TYPE_STR et al) now contain a bit flag indicating
whether additional work needs to be done when a Var of their type is
freed. This allows free_var to run inline without a case statement
when "simple" Vars are freed. Code to translate between the internal
TYPE_STR and the previous external representation added.
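
A minimal sketch of the flag idea, not the server's actual definitions.
The flag value, the type numbering, the cut-down Var struct and the
type_to_external() helper are all assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical values throughout; the real server's numbering may differ. */
#define TYPE_COMPLEX_FLAG 0x80      /* set when freeing needs extra work */

typedef enum {
    TYPE_INT = 0,
    TYPE_OBJ = 1,
    TYPE_STR = 2 | TYPE_COMPLEX_FLAG,   /* owns heap memory */
    TYPE_LIST = 4 | TYPE_COMPLEX_FLAG
} var_type;

typedef struct {
    var_type type;
    union { long num; char *str; } v;
} Var;

static int complex_frees = 0;       /* instrumentation for this example only */

/* Out-of-line slow path: the only place that still needs a switch. */
static void complex_free_var(Var v)
{
    complex_frees++;
    switch (v.type) {
    case TYPE_STR:
        free(v.v.str);
        break;
    default:
        break;                      /* list handling omitted in this sketch */
    }
}

/* Fast path: a single bit test, no case statement for simple Vars. */
static inline void free_var(Var v)
{
    if (v.type & TYPE_COMPLEX_FLAG)
        complex_free_var(v);
}

/* Strip the flag to recover a flag-free external (on-disk) type number. */
static inline int type_to_external(var_type t)
{
    return t & ~TYPE_COMPLEX_FLAG;
}
```

Freeing an integer or object Var never leaves the inline test; only
strings and lists pay for the function call.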

db_verbs.c, db_objects.c:

(This part is primarily Jay's fault, so we'll let him talk about it
using the first person.)

The verb lookup cache. Traditionally, the server has spent large
amounts of time searching for what verbcode to run. MOO verbs can
have aliases ($object_utils:descendents/descendants), incomplete
specification ($room:l*ook), and command-line verbs distinguished by
args...and verb definition order matters during lookup! These
features ruled out the naive speedup of just dumping all verbdefs in a
hash table per object.

I decided not to work too hard on improving the performance of command
line verb lookups. Any solution that addressed them looked to be many
times more complex than just fixing verbs calling verbs
(db_find_callable_verb), and the latter appeared more significant to

Originally I built a 7 element per object table to cache lookups but
this significantly inflated the server size relative to the
performance increase. If you're interested in this, it's in the
moo-cows archive as one of the steak patches.

My current solution to lookup performance is to build a global hash

(hash(object_key x target_verbname), object_key, target_verbname)

used only for callable verb lookups.

Any action on the db that could affect the validity of this table
clears the whole table by calling
db_priv_affected_callable_verb_lookup(). Here's a list:

chparent(): in some circumstances
set_verb_info(): name changes, flag changes
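
For readers who want the shape of the table in code, here is a rough
sketch of a global hash keyed on (object_key, verbname) with a
flush-everything invalidation. The vc_* names, the integer "handle"
standing in for whatever the real table stores, and the hash function
are assumptions; case-insensitive name matching is left out.

```c
#include <stdlib.h>
#include <string.h>

#define VC_SIZE 7507     /* same figure these notes give for DEFAULT_VC_SIZE */

typedef struct vc_entry {
    struct vc_entry *next;
    unsigned hash;
    int object_key;
    char *verbname;
    int verb_handle;     /* hypothetical stand-in for the cached verb */
} vc_entry;

static vc_entry *vc_table[VC_SIZE];
static unsigned long table_clears;

static unsigned vc_hash(int object_key, const char *name)
{
    unsigned h = (unsigned) object_key;
    while (*name)
        h = h * 31u + (unsigned char) *name++;
    return h;
}

static void vc_store(int object_key, const char *name, int handle)
{
    unsigned h = vc_hash(object_key, name);
    size_t n = strlen(name) + 1;
    vc_entry *e = malloc(sizeof *e);
    if (!e)
        return;
    e->verbname = malloc(n);
    if (!e->verbname) {
        free(e);
        return;
    }
    memcpy(e->verbname, name, n);
    e->hash = h;
    e->object_key = object_key;
    e->verb_handle = handle;
    e->next = vc_table[h % VC_SIZE];
    vc_table[h % VC_SIZE] = e;
}

/* Returns 1 and fills *handle on a hit, 0 on a miss. */
static int vc_find(int object_key, const char *name, int *handle)
{
    unsigned h = vc_hash(object_key, name);
    vc_entry *e;
    for (e = vc_table[h % VC_SIZE]; e; e = e->next)
        if (e->hash == h && e->object_key == object_key
            && strcmp(e->verbname, name) == 0) {
            *handle = e->verb_handle;
            return 1;
        }
    return 0;
}

/* Plays the role of db_priv_affected_callable_verb_lookup(): drop it all. */
static void vc_clear_all(void)
{
    int i;
    for (i = 0; i < VC_SIZE; i++)
        while (vc_table[i]) {
            vc_entry *e = vc_table[i];
            vc_table[i] = e->next;
            free(e->verbname);
            free(e);
        }
    table_clears++;
}
```

Clearing the whole table is crude, but it keeps every invalidation
site down to a single call.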

Since a good number of objects don't have verbs on them (inheriting
all behavior from parents) I decided to use "first parent with verbs"
as the object_key. This means that all those kids of $exit don't need
to have separate table entries for :invoke or whatever. All kids of a
player class get a single entry for :tell unless the player has verbs
on emself. (Sadly, on LambdaMOO, the lag reduction feature object
places a trivial :tell on anyone using it. Since the verb is
immediately at hand the lookup is short but unavoidable for every

Since I use "first parent with verbs" as object_key, chparent() does
not need to clear the table that often. If the object has no verbs,
it can't be mentioned in the table directly; however, if it has
children it could indirectly affect lookup of its kids that do have
verbs. Transient objects going through the usual
$recycler:_create()/$recycler:_recycle() life cycle avoid both of
these problems and in this release no longer trigger a flush.
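
The "first parent with verbs" key is easy to picture in code. This is
only a sketch; the object struct and its field names are made up for
the example and are not the server's.

```c
#include <stddef.h>

typedef struct object {
    struct object *parent;
    int nverbs;                  /* verbs defined directly on this object */
} object;

/* Walk up the parent chain to the nearest object that defines any verb. */
static const object *first_parent_with_verbs(const object *o)
{
    while (o && o->nverbs == 0)
        o = o->parent;
    return o;                    /* NULL if no ancestor defines a verb */
}
```

Two verbless kids of the same class map to the same key, so they share
cache entries; an object with even one verb of its own becomes its own
key.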

For this release, Ben added negative caching---failed verb lookups are
stored in the table as well.
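
A small sketch of what negative caching means here, using a toy
direct-mapped table rather than the real hash chains. The slot layout,
the -1 "no such verb" marker and all names are assumptions.

```c
#include <string.h>

enum vc_result { VC_MISS, VC_HIT, VC_NEGATIVE_HIT };

#define NSLOTS 64
#define NAMELEN 32               /* longer names simply never hit in this toy */

typedef struct {
    int used;
    int object_key;
    char verbname[NAMELEN];
    int handle;                  /* -1 records "looked up, no such verb" */
} slot;

static slot slots[NSLOTS];
static unsigned long hits, negative_hits, misses;

static unsigned slot_for(int object_key, const char *name)
{
    unsigned h = (unsigned) object_key;
    while (*name)
        h = h * 31u + (unsigned char) *name++;
    return h % NSLOTS;
}

/* Record the outcome of a full lookup, successful or not. */
static void remember(int object_key, const char *name, int handle)
{
    slot *s = &slots[slot_for(object_key, name)];
    s->used = 1;
    s->object_key = object_key;
    strncpy(s->verbname, name, NAMELEN - 1);
    s->verbname[NAMELEN - 1] = '\0';
    s->handle = handle;
}

static enum vc_result lookup(int object_key, const char *name, int *handle)
{
    slot *s = &slots[slot_for(object_key, name)];
    if (!s->used || s->object_key != object_key
        || strcmp(s->verbname, name) != 0) {
        misses++;
        return VC_MISS;          /* caller must do the full search */
    }
    if (s->handle < 0) {
        negative_hits++;
        return VC_NEGATIVE_HIT;  /* known failure: skip the search entirely */
    }
    hits++;
    *handle = s->handle;
    return VC_HIT;
}
```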

The table itself is implemented as a fixed number of hash chains. The
compiled-in default is 7507 (DEFAULT_VC_SIZE in db_verbs.c).
Statistics on occupancy are available through two new wiz-only
primitives. log_cache_stats() dumps formatted info into the server
log; verb_cache_stats() returns a list of the form:

{hits, negative_hits, misses, table_clears, histogram}

where histogram is a 17 element list. histogram[1] is the number of
chains with length 0; histogram[2] is the number of chains with length
1 and so on up to histogram[17] which counts the number of chains with
length of 16 or greater.
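
The histogram layout can be sketched as follows. The chain_node type
and function name are hypothetical; note that the C array is 0-based,
so hist[0] here corresponds to histogram[1] in the MOO list.

```c
#include <stddef.h>

#define HIST_BUCKETS 17

typedef struct chain_node {
    struct chain_node *next;
} chain_node;

/* hist[0] counts empty chains; hist[16] counts chains of length >= 16. */
static void chain_histogram(chain_node *const *table, size_t nchains,
                            unsigned hist[HIST_BUCKETS])
{
    size_t i;
    for (i = 0; i < HIST_BUCKETS; i++)
        hist[i] = 0;
    for (i = 0; i < nchains; i++) {
        size_t len = 0;
        const chain_node *n;
        for (n = table[i]; n; n = n->next)
            len++;
        if (len > HIST_BUCKETS - 1)
            len = HIST_BUCKETS - 1;     /* clamp into the last bucket */
        hist[len]++;
    }
}
```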

hits, negative_hits, misses, and table_clears are counters only zeroed
at server start. The histogram is a snapshot of current cache
condition. If you're running a really busy server you can overflow
the hits counter in a few weeks; your server won't crash but values
reported by these functions will be wrong. Yes, LambdaMOO executes
*billions* of verbs in a typical run.
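
To get a feel for "a few weeks", here is a back-of-the-envelope
helper. It assumes the counter is a 32-bit signed integer, which these
notes do not actually state, and the call rate is whatever you measure
on your own server.

```c
#include <stdint.h>

/* Days until a counter that starts at zero passes INT32_MAX,
 * ASSUMING a 32-bit signed counter and a steady hit rate. */
static double days_until_overflow(double hits_per_second)
{
    return (double) INT32_MAX / hits_per_second / 86400.0;
}
```

Under that assumption, a steady 1000 cached lookups per second
overflows in a little under 25 days.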

If you start fretting about how much memory the lookup table is using,
write a continuously running verb that forces one of the table clear

extensions.c, db_tune.h:

The functions in extensions.c that provide verb cache stats need to
talk to the db layer's internals in order to gather information, but
they aren't part of the db layer proper. db_tune.h was invented as a
middle ground between db.h and db_private.h for source files that
needed access to implementation-specific interfaces provided by the db

Comments (and suggestions on a better name!) on this are solicited.

decompile.c, program.c:

When errors are thrown, the line number of the error is included in
the traceback information. Mapping between bytecode program counter
and line number is expensive, so each Program now maintains a single
pc->lineno cache entry---hopefully programs that fail multiple
times usually fail on the same line.
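
A sketch of a one-entry pc->lineno cache. The Program fields shown
here (a table of line start offsets plus the cache slots) are invented
for the example; the real mapping works by a different and more
expensive route.

```c
#include <stddef.h>

typedef struct {
    const unsigned *line_starts; /* hypothetical: first pc of each line, ascending */
    size_t nlines;
    int cached_valid;            /* single-entry cache */
    unsigned cached_pc;
    unsigned cached_lineno;
    unsigned slow_lookups;       /* instrumentation for this example only */
} Program;

/* Slow path: scan the line table to find the line containing pc. */
static unsigned find_line_slow(Program *p, unsigned pc)
{
    size_t i;
    unsigned line = 1;
    p->slow_lookups++;
    for (i = 0; i < p->nlines; i++)
        if (p->line_starts[i] <= pc)
            line = (unsigned) i + 1;
    return line;
}

/* Fast path: reuse the previous answer when asked about the same pc. */
static unsigned find_line_number(Program *p, unsigned pc)
{
    if (p->cached_valid && p->cached_pc == pc)
        return p->cached_lineno;
    p->cached_lineno = find_line_slow(p, pc);
    p->cached_pc = pc;
    p->cached_valid = 1;
    return p->cached_lineno;
}
```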

eval_env.c, execute.c:

To avoid calling malloc()/free() as often, the server now keeps a
central pool of rt_stacks and rt_envs of given sizes. They revert to
malloc()/free() for large requests.
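
The pooling idea, reduced to a single size class. The slot size, the
pool depth and the pool_* names are assumptions; the server keeps
separate pools for rt_stacks and rt_envs.

```c
#include <stdlib.h>

#define POOL_SLOT_SIZE 256       /* hypothetical "given size" served by the pool */
#define POOL_MAX_FREE 8

static void *pool_free_list[POOL_MAX_FREE];
static int pool_free_count;

/* Callers must later pass the same size to pool_release(). */
static void *pool_acquire(size_t size)
{
    if (size > POOL_SLOT_SIZE)
        return malloc(size);                    /* large request: plain malloc */
    if (pool_free_count > 0)
        return pool_free_list[--pool_free_count];   /* reuse a parked block */
    return malloc(POOL_SLOT_SIZE);              /* pool empty: make a full slot */
}

static void pool_release(void *p, size_t size)
{
    if (size <= POOL_SLOT_SIZE && pool_free_count < POOL_MAX_FREE)
        pool_free_list[pool_free_count++] = p;  /* park it for the next caller */
    else
        free(p);                                /* large block, or pool is full */
}
```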

General optimization; Ben can write more extensively about this. One of
the more significant changes is that OP_IMM followed by OP_POP is "peephole
optimized"; this makes verb comments like

"$string_utils:from_list(l, [, separator])";
"Return a string etc";
"and do some more work";

An important memory leak involving failed property lookups was closed.

Because very few sites actually use protected builtin properties and
using them is a very substantial performance hit, a new options.h
define, IGNORE_PROP_PROTECTED, allows them to be disabled at
compile-time. This is the default.

functions.c, server.c:

Doing property lookups per builtin function call to determine whether
the function needs the $server_options.protect_foo treatment is
extremely expensive. A protectedness flag was directly added to the
builtin function struct; the values of these flags are loaded from the
db at startup time, or whenever the new builtin function
load_server_options() is called.
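
In outline, the change trades a per-call property lookup for a cached
field. The builtin struct, the table contents, the option_reader
callback and demo_reader() below are all stand-ins made up for this
sketch.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    int is_protected;    /* cached copy of $server_options.protect_<name> */
} builtin;

static builtin builtins[] = {
    { "eval", 0 },
    { "read", 0 },
    { "shutdown", 0 },
};
#define NBUILTINS (sizeof builtins / sizeof builtins[0])

/* Hypothetical callback standing in for a (slow) db property lookup. */
typedef int (*option_reader)(const char *builtin_name);

/* Plays the role of load_server_options(): refresh every flag in one pass. */
static void load_protect_flags(option_reader read_protect_option)
{
    size_t i;
    for (i = 0; i < NBUILTINS; i++)
        builtins[i].is_protected = read_protect_option(builtins[i].name);
}

/* The per-call check is now a field read instead of a property lookup. */
static int builtin_is_protected(size_t index)
{
    return builtins[index].is_protected;
}

/* Example reader: pretends only protect_shutdown is set in the db. */
static int demo_reader(const char *builtin_name)
{
    return strcmp(builtin_name, "shutdown") == 0;
}
```

The cost is that edits to $server_options take effect only after the
flags are reloaded.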

There's now a canonical empty list.

The regexp pattern cache wasn't storing the case_matters flag, causing
many patterns to be impossible to find in the cache.
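
A sketch of the fixed behaviour: the flag is stored with the entry and
compared on lookup. The cache size, the round-robin replacement and
the integer "compiled_id" are assumptions for the example.

```c
#include <string.h>

#define CACHE_SLOTS 4
#define MAX_PATTERN 64

typedef struct {
    int used;
    char pattern[MAX_PATTERN];
    int case_matters;            /* part of the key, so it must be stored */
    int compiled_id;             /* stand-in for the compiled pattern */
} pat_entry;

static pat_entry pat_cache[CACHE_SLOTS];
static int next_slot, compile_count;

static int get_pattern(const char *pattern, int case_matters)
{
    int i;
    pat_entry *e;
    for (i = 0; i < CACHE_SLOTS; i++) {
        e = &pat_cache[i];
        if (e->used && e->case_matters == case_matters
            && strcmp(e->pattern, pattern) == 0)
            return e->compiled_id;          /* cache hit */
    }
    /* Miss: "compile" the pattern and record the flag with the entry. */
    e = &pat_cache[next_slot];
    next_slot = (next_slot + 1) % CACHE_SLOTS;
    e->used = 1;
    strncpy(e->pattern, pattern, MAX_PATTERN - 1);
    e->pattern[MAX_PATTERN - 1] = '\0';
    e->case_matters = case_matters;
    e->compiled_id = ++compile_count;
    return e->compiled_id;
}
```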

decode_binary() was broken on systems where char is signed by default.
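
This is the general class of bug, shown with a made-up helper rather
than the actual decode_binary() code: a plain char holding a byte
above 127 turns negative on signed-char systems unless it is cast
first.

```c
#include <stddef.h>

/* Copy raw bytes into ints so every value lands in 0..255. */
static void bytes_to_ints(const char *bytes, size_t n, int *out)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (unsigned char) bytes[i];  /* without the cast, 0xE9 can become -23 */
}
```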

doinsert reallocs lists with refcount 1 when appending rather than
calling var_ref/free_var on all the elements. (The general case could
be sped up with memcpy as well.)
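
The append fast path, sketched with a simplified list type. Elements
are plain longs here, so the var_ref/free_var traffic the real code
avoids is only hinted at in comments; list_new() and list_append() are
hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int refcount;
    size_t length;
    long *items;                 /* real lists hold Vars, not longs */
} list;

static list *list_new(void)
{
    list *l = malloc(sizeof *l);
    if (!l)
        return NULL;
    l->refcount = 1;
    l->length = 0;
    l->items = NULL;
    return l;
}

/* Appends value; *reused reports whether the storage was grown in place. */
static list *list_append(list *l, long value, int *reused)
{
    if (l->refcount == 1) {
        /* Sole owner: realloc in place, no per-element ref juggling. */
        long *grown = realloc(l->items, (l->length + 1) * sizeof *grown);
        if (!grown)
            return NULL;
        grown[l->length] = value;
        l->items = grown;
        l->length++;
        *reused = 1;
        return l;
    } else {
        /* Shared: build a fresh list and drop our reference to the old one. */
        list *copy = malloc(sizeof *copy);
        if (!copy)
            return NULL;
        copy->items = malloc((l->length + 1) * sizeof *copy->items);
        if (!copy->items) {
            free(copy);
            return NULL;
        }
        if (l->length)
            memcpy(copy->items, l->items, l->length * sizeof *copy->items);
        copy->items[l->length] = value;
        copy->length = l->length + 1;
        copy->refcount = 1;
        l->refcount--;
        *reused = 0;
        return copy;
    }
}
```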

sys/time.h may be necessary for FD_ZERO et al definitions.

parse_cmd.c, storage.h:

parse_into_words was incorrectly allocating an array of (char *) as
M_STRING. This caused a million unaligned memory access warnings on
the Alpha. Created a new M_STRING_PTRS allocation class for this.

fastmap was allocated with mymalloc() but freed with the normal

Refcounts are now allocated as part of objects that can be
addref()'d. This allows macros to manipulate those counts and makes a
request for the current refcount of an object much cheaper. This
completely replaces the old hash table implementation.
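
One way to picture an embedded refcount is a small header placed just
before the memory handed to the caller. This is a sketch under that
assumption; the header layout, ref_alloc(), ref_free() and the macro
spellings other than addref are invented for the example.

```c
#include <stdlib.h>

/* Header is a union so the payload after it stays suitably aligned. */
typedef union {
    int refcount;
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} ref_header;

static void *ref_alloc(size_t size)
{
    ref_header *h = malloc(sizeof *h + size);
    if (!h)
        return NULL;
    h->refcount = 1;
    return h + 1;                /* caller only ever sees this interior pointer */
}

/* Step back from the payload to its header; no table lookup required. */
#define REF_HEADER(p)  ((ref_header *) (p) - 1)
#define addref(p)      (++REF_HEADER(p)->refcount)
#define delref(p)      (--REF_HEADER(p)->refcount)
#define refcount_of(p) (REF_HEADER(p)->refcount)

static void ref_free(void *p)
{
    free(REF_HEADER(p));         /* free the block's true start */
}
```

Because the program holds only the interior pointer, a leak checker
that looks for pointers to the start of each block will see nothing
pointing there.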

There's now a canonical empty string.

myrealloc(), the mymalloc/myfree analog of realloc(), is now available.

As a result of the changes, the memory debugging code is no longer
available. Also, since we now hold pointers to only the interior of
some allocated objects, tools such as Purify will claim a million
possible memory leaks.

If a forked task was killed before it ever started, it leaked some

var_refcount(Var v) added. Returns the refcount of any Var.

The two big changes in r6 over r5 are:

o Bytecode optimizations to try to modify lists in-place whenever
possible. List manipulation and mutation should be orders of
magnitude faster in some cases.

o String "interning" during load; initially, there will be one and
only one in-memory copy of each identical string. (In JHCore that
means we only allocate memory for "do" once...)
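
String interning in miniature, as a hedged sketch: a hash table that
hands back the existing copy when an equal string has been seen
before. The table size, the hash and the str_intern() name are
assumptions, and the per-entry refcount is only illustrative.

```c
#include <stdlib.h>
#include <string.h>

#define INTERN_BUCKETS 1021

typedef struct intern_entry {
    struct intern_entry *next;
    int refcount;                /* how many loads asked for this string */
    char *str;
} intern_entry;

static intern_entry *intern_table[INTERN_BUCKETS];

static unsigned str_hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char) *s++;
    return h;
}

/* Return the shared copy of s, allocating it only on first sight. */
static const char *str_intern(const char *s)
{
    unsigned b = str_hash(s) % INTERN_BUCKETS;
    intern_entry *e;
    size_t n;
    for (e = intern_table[b]; e; e = e->next)
        if (strcmp(e->str, s) == 0) {
            e->refcount++;
            return e->str;       /* same pointer as every earlier caller */
        }
    e = malloc(sizeof *e);
    if (!e)
        return NULL;
    n = strlen(s) + 1;
    e->str = malloc(n);
    if (!e->str) {
        free(e);
        return NULL;
    }
    memcpy(e->str, s, n);
    e->refcount = 1;
    e->next = intern_table[b];
    intern_table[b] = e;
    return e->str;
}
```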

r7 fixes BYTECODE_REDUCE_REF. It's now safe to turn on.
[This turned out to be false.]

The default input and output buffer sizes in options.h are now 64k.

r8 adds more fixes to BYTECODE_REDUCE_REF. It's now safe to turn on.
[This appears to be true.]