<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<title>DM4 §31: Tokens of grammar</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" href="dm4.css">
<a href="index.html">home</a> /
<a href="contents.html">contents</a> /
<a href="ch4.html" title="Chapter IV: Describing and Parsing">chapter IV</a> /
<a href="s30.html" title="§30: How verbs are parsed">prev</a> /
<a href="s32.html" title="§32: Scope and what you can see">next</a> /
<a href="dm4index.html">index</a>
<a id="p232" name="p232"></a>
<h2>§31 Tokens of grammar</h2>
<p class="normal"><span class="atleft"><img src="dm4-232_1.jpg" alt=""></span>
The complete list of grammar tokens is given in the table below.
These tokens are all described in this section except for
<span class="grammartoken"><code>scope = </code>‹<span class="token">Routine</span>›</span>,
which is postponed to the next.</p>
<!-- editing note: cellpadding is necessary, or the grammartoken boxes get sheared in IE -->
<p class="lynxonly"></p>
<div class="inset"><table cellspacing="4" cellpadding="4">
<tr><td><span class="grammartoken"><code>'</code>‹<span class="token">word</span>›<code>'</code></span></td>
<td>that literal word only</td></tr>
<tr><td><span class="grammartoken"><code>noun</code></span></td>
<td>any object in scope</td></tr>
<tr><td><span class="grammartoken"><code>held</code></span></td>
<td>object held by the actor</td></tr>
<tr><td><span class="grammartoken"><code>multi</code></span></td>
<td>one or more objects in scope</td></tr>
<tr><td><span class="grammartoken"><code>multiheld</code></span></td>
<td>one or more held objects</td></tr>
<tr><td><span class="grammartoken"><code>multiexcept</code></span></td>
<td>one or more in scope, except the other object</td></tr>
<tr><td><span class="grammartoken"><code>multiinside</code></span></td>
<td>one or more in scope, inside the other object</td></tr>
<tr><td><span class="grammartoken">‹<span class="token">attribute</span>›</span></td>
<td>any object in scope which has the attribute</td></tr>
<tr><td><span class="grammartoken"><code>creature</code></span></td>
<td>an object in scope which is <code>animate</code></td></tr>
<tr><td><span class="grammartoken"><code>noun = </code>‹<span class="token">Routine</span>›</span></td>
<td>any object in scope passing the given test</td></tr>
<tr><td><span class="grammartoken"><code>scope = </code>‹<span class="token">Routine</span>›</span></td>
<td>an object in this definition of scope</td></tr>
<tr><td><span class="grammartoken"><code>number</code></span></td>
<td>a number only</td></tr>
<tr><td><span class="grammartoken">‹<span class="token">Routine</span>›</span></td>
<td>any text accepted by the given routine</td></tr>
<tr><td><span class="grammartoken"><code>topic</code></span></td>
<td>any text at all</td></tr>
</table></div>
<p class="normal">To recap, the parser goes through a line of grammar
tokens trying to match each against some text from the player's input.
Each token that matches must produce one of the following five results:</p>
<ol type="a">
<li>a single object;</li>
<li>a “multiple object”, that is, a set of objects;</li>
<li>a number;</li>
<li>a “consultation topic”, that is, a collection of words
left unparsed to be looked through later;</li>
<li>no information at all.</li>
</ol>
<p class="normal">Ordinarily, a single line, though it may contain many
tokens, can produce at most two substantial results ((a) to (d)),
at most one of which can be multiple
<a id="p233" name="p233"></a>
(b). (See the exercises below
if this is a problem.) For instance, suppose the text “green
apple on the table” is parsed against the grammar line:</p>
<p class="lynxonly"></p>
<pre class="code">* multi 'on' noun -> Insert</pre>
<p class="normal">The <span class="grammartoken"><code>multi</code></span>
token matches “green apple” (result: a single object, since
although <span class="grammartoken"><code>multi</code></span> can match
a multiple object, it doesn't have to), <span class="grammartoken"><code>'on'</code></span>
matches “on” (result: nothing) and the
<span class="grammartoken"><code>noun</code></span> token matches
“the table” (result: a single object again). There are two
substantial results, both objects, so the action that comes out is
<code>&lt;Insert apple table&gt;</code>. If the text had been “all
the fruit on the table”, the <span class="grammartoken"><code>multi</code></span>
token might have resulted in a list: perhaps of an apple, an orange
and a pear. The parser would then have generated and run through three
actions in turn: <code>&lt;Insert apple table&gt;</code>, then
<code>&lt;Insert orange table&gt;</code> and finally
<code>&lt;Insert pear table&gt;</code>, printing out the name of each
item and a colon before running the action:</p>
<p class="output">&gt;<tt>put all the fruit on the table</tt><br>
Cox's pippin: Done.<br>
Conference pear: Done.</p>
<p class="normal">The library's routine <code>InsertSub</code>, which
actually handles the action, only deals with single objects at a time,
and in each case it printed “Done.”</p>
<p class="dotbreak">&middot; &middot; &middot; &middot; &middot;</p>
<p class="tokendef"><span class="grammartoken"><code>'</code>‹<span class="token">word</span>›<code>'</code></span>
This matches only the literal word given, sometimes called
a preposition because it usually is one, and produces no resulting
information. (There can therefore be as many or as few of them on a
grammar line as desired.) It often happens that several prepositions really
mean the same thing for a given verb: for instance “in”,
“into” and “inside” are often synonymous.
As a convenient shorthand, then, you can write a series of prepositions
(only) with slashes <code>/</code> in between, to mean “one of
these words”. For example:</p>
<p class="lynxonly"></p>
<pre class="code">* noun 'in'/'into'/'inside' noun -> Insert</pre>
<p class="tokendef"><span class="grammartoken"><code>noun</code></span>
Matches any single object “in scope”, a
term which is defined in the next section and which roughly means “visible
to the player at the moment”.</p>
<p class="tokendef"><span class="grammartoken"><code>held</code></span>
Matches any single object which is an immediate possession
of the actor. (Thus, if a key is inside a box being carried by the actor,
the box might match but the key cannot.) This is convenient for two reasons. Firstly,
<a id="p234" name="p234"></a>
many actions, such as <code>Eat</code> or <code>Wear</code>,
only sensibly apply to things being held. Secondly, suppose we have grammar</p>
<p class="lynxonly"></p>
<pre class="code">Verb 'eat' * held -> Eat;</pre>
<p class="normal">and the player types “eat the banana” while
the banana is, say, in plain view on a shelf. It would be petty of the
game to refuse on the grounds that the banana is not being held.
So the parser will generate a <code>Take</code> action for the banana
and then, if the <code>Take</code> action succeeds, an <code>Eat</code>
action. Notice that the parser does not just pick up the object, but
issues an action in the proper way – so if the banana had rules
making it too slippery to pick up, it won't be picked up. This is
called “implicit taking”, and happens only for the player,
not for other actors.</p>
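<p class="normal">As a rough illustration of that last point, here is a
sketch of a banana whose own rules refuse the implicit <code>Take</code>.
The object and its message are invented for this example; only
<code>before</code>, <code>Take</code> and <code>edible</code> are assumed
to come from the library.</p>
<p class="lynxonly"></p>
<pre class="code">
! Hypothetical object, for illustration only.
Object banana "banana"
  with name 'banana',
       before [;
         ! Returning true (here via a printed message) stops the Take,
         ! whether the player typed it or the parser generated it.
         Take: "The banana squirts out of your grasp.";
       ],
  has  edible;
</pre>
<p class="normal">With such a rule in place, “eat the banana”
should get no further than the failed <code>Take</code>, so no
<code>Eat</code> action follows.</p>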
<p class="tokendef"><span class="grammartoken"><code>multi</code></span>
Matches one or more objects in scope. The
<span class="grammartoken"><code>multi-</code></span> tokens indicate
that a list of one or more objects can go here. The parser works out
all the things the player has asked for, sorting out plural nouns and
words like “except” in the process. For instance, “all
the apples” and “the duck and the drake” could match
a <span class="grammartoken"><code>multi</code></span> token but not
a <span class="grammartoken"><code>noun</code></span> token.</p>
<p class="tokendef"><span class="grammartoken"><code>multiexcept</code></span>
Matches one or more objects in scope, except that it does
not match the other single object parsed in the same grammar line. This
is provided to make commands like “put everything in the rucksack”
come out right: the “everything” is matched by all of the
player's possessions except the rucksack, which stops the parser from
generating an action to put the rucksack inside itself.</p>
<p class="tokendef"><span class="grammartoken"><code>multiinside</code></span>
Similarly, this matches anything inside the other single
object parsed on the same grammar line, which is good for parsing
commands like “remove everything from the cupboard”.</p>
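<p class="normal">A minimal sketch of how these two tokens might appear in
grammar lines. The verbs “stow” and “unload” are
invented here and are assumed not to clash with any verb the game already
defines; <code>Insert</code> and <code>Remove</code> are the library's
actions.</p>
<p class="lynxonly"></p>
<pre class="code">
! Hypothetical verbs, for illustration only.
Verb 'stow'
    * multiexcept 'in'/'into' noun -> Insert;   ! never stows the container in itself
Verb 'unload'
    * multiinside 'from' noun      -> Remove;   ! only things inside the second object
</pre>
<p class="normal">In each line the other object is the one matched by
<span class="grammartoken"><code>noun</code></span>, so “stow all in
rucksack” would leave the rucksack itself out of the list.</p>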
<p class="tokendef"><span class="grammartoken">‹<span class="token">attribute</span>›</span>
Matches any object in scope which has the given attribute.
This is useful for sorting out actions according to context, and perhaps
the ultimate example might be an old-fashioned “use” verb:</p>
<p class="lynxonly"></p>
<pre class="code">
Verb 'use' 'employ' 'utilise'
    * clothing  -> Wear
    * enterable -> Enter;
</pre>
<a id="p235" name="p235"></a>
<p class="tokendef"><span class="grammartoken"><code>creature</code></span>
Matches any object in scope which behaves as if living.
This normally means having <code>animate</code>: but, as an exceptional
rule, if the action on the grammar line is <code>Ask</code>,
<code>Answer</code>, <code>Tell</code> or <code>AskFor</code> then
having <code>talkable</code> is also acceptable.</p>
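<p class="normal">For instance, a new verb that only makes sense when aimed
at something living could be sketched as follows. The verb words, the
<code>Greet</code> action and its message are all hypothetical.</p>
<p class="lynxonly"></p>
<pre class="code">
! Hypothetical verb and action, for illustration only.
Verb 'greet' 'hail'
    * creature -> Greet;

[ GreetSub;
  ! noun is the animate object matched by the creature token
  print_ret (The) noun, " nods back at you.";
];
</pre>
<p class="normal">Because <code>Greet</code> is not one of the four
conversational actions listed above, a merely <code>talkable</code> object
such as a microphone would not be expected to match here.</p>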
<p class="tokendef"><span class="grammartoken"><code>noun = </code>‹<span class="token">Routine</span>›</span>
“Any single object in scope satisfying some condition”.
When determining whether an object passes this test, the parser sets the
variable <code>noun</code> to the object in question and calls the routine.
If it returns <code>true</code>, the parser accepts the object, and
otherwise it rejects it. For example, the following should only apply
to animals kept in a cage:</p>
<p class="lynxonly"></p>
<pre class="code">
[ CagedCreature;
  if (noun in wicker_cage) rtrue; rfalse;
];
Verb 'free' 'release'
    * noun=CagedCreature -> FreeAnimal;
</pre>
<p class="normal">Only nouns which pass the <code>CagedCreature</code>
test are allowed here. The <code>CagedCreature</code> routine can appear anywhere
in the source code, though it's tidier to keep it nearby.</p>
<p class="tokendef"><span class="grammartoken"><code>scope = </code>‹<span class="token">Routine</span>›</span>
An even more powerful token, which means “an object
in scope” where scope is redefined specially. You can also choose
whether or not it can accept a multiple object. See <a href="s32.html">§32</a>.</p>
<p class="tokendef"><span class="grammartoken"><code>number</code></span>
Matches any decimal number from 0 upwards (though it rounds
off large numbers to 10,000), and also matches the numbers “one”
to “twenty” written in English. For example:</p>
<p class="lynxonly"></p>
<pre class="code">Verb 'type' * number -> TypeNum;</pre>
<p class="normal">causes actions like <code>&lt;TypeNum 504&gt;</code> when
the player types “type 504”. Note that <code>noun</code> is
set to 504, not to an object. (Meanwhile <code>inp1</code> is set to 1,
indicating that this “first input” is intended as a number:
if the noun had been the object which happened to have number 504,
then <code>inp1</code> would have been set to this object, the same
as <code>noun</code>.) If you need more exact number parsing, without
rounding off, and including negative numbers, see the exercise below.</p>
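<p class="normal">A possible action routine to go with that grammar line,
sketched on the assumption that the game has a keypad whose code is 504;
the messages are invented for this example.</p>
<p class="lynxonly"></p>
<pre class="code">
[ TypeNumSub;
  ! Here noun holds the number typed, not an object.
  if (noun == 504) "The keypad chirps and the door slides open.";
  print_ret "You type ", noun, ", but nothing happens.";
];
</pre>
<p class="normal">Since <code>noun</code> is a plain number in this
action, the routine should not apply object tests such as
<code>has</code> or <code>in</code> to it.</p>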
<a id="p236" name="p236"></a>
<a id="ex83" name="ex83"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans83">EXERCISE 83</a></b></span><br>
Some games, such as David M. Baggett's game ‘The Legend Lives!’
produce footnotes every now and then. Arrange matters so that these
are numbered <code>[1]</code>, <code>[2]</code> and so on in order
of appearance, to be read by the player when “footnote 1”
is typed.</p>
<p class="aside"><span class="warning">▲</span>
The entry point <code>ParseNumber</code> allows you to provide your
own number-parsing routine, which opens up many sneaky possibilities –
Roman numerals, coordinates like “J4”, very long telephone
numbers and so on. This takes the form</p>
<p class="lynxonly"></p>
<pre class="code">
[ ParseNumber buffer length;
  ...returning false if no match is made, or the number otherwise...
];
</pre>
<p class="aside">and examines the supposed ‘number’ held
at the byte address <code>buffer</code>, a row of characters of the
given length. If you provide a <code>ParseNumber</code> routine but
return <code>false</code> from it, then the parser falls back on
its usual number-parsing mechanism to see if that does any better.</p>
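<p class="aside">As one hedged sketch of the idea, the routine below
accepts two-character grid coordinates from “a1” to
“j9”. The encoding (column times ten, plus row) is an
assumption made for this example, and the code relies on the typed text
having been reduced to lower case before the parser sees it.</p>
<p class="lynxonly"></p>
<pre class="code">
[ ParseNumber buffer length  col row;
  if (length ~= 2) return false;         ! not a coordinate: use normal parsing
  col = buffer->0; row = buffer->1;      ! the two characters typed
  if (col &gt;= 'a' &amp;&amp; col &lt;= 'j' &amp;&amp; row &gt;= '1' &amp;&amp; row &lt;= '9')
      return (col - 'a' + 1)*10 + (row - '0');   ! "j4" gives 104
  return false;
];
</pre>
<p class="aside">Every value this returns lies between 11 and 109, so it
is never confused with <code>false</code>.</p>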
<p class="aside"><span class="warning">▲▲</span>
Note that <code>ParseNumber</code> can't return 0 to mean the number
zero, because 0 is the same as <code>false</code>. Probably
“zero” won't be needed too often, but if it is you can always
return some value like 1000 and code the verb in question to understand
this as 0. (Sorry: this was a poor design decision made too long ago
to change now.)</p>
<p class="tokendef"><span class="grammartoken"><code>topic</code></span>
This token matches as much text as possible, regardless
of what it says, producing no result. As much text as possible means
“until the end of the typing, or, if the next token is a
preposition, until that preposition is reached”. The only way
this can fail is if it finds no text at all. Otherwise, the variable
<code>consult_from</code> is set to the number of the first word of the
matched text and <code>consult_words</code> to the number of words.
See <a href="s16.html">§16</a> and <a href="s18.html">§18</a>
for examples of topics being used.</p>
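<p class="normal">As a small sketch of how the two variables might be
read back, consider a book that reacts to the library's
<code>Consult</code> action. The almanac and its replies are invented for
this example.</p>
<p class="lynxonly"></p>
<pre class="code">
! Hypothetical object, for illustration only.
Object almanac "almanac"
  with name 'almanac',
       before [ w;
         Consult:
           wn = consult_from;      ! rewind to the first word of the topic
           w = NextWord();         ! consult_words says how many words follow
           if (w == 'tides' or 'tide') "High water is at noon today.";
           "The almanac has nothing to say on that subject.";
       ];
</pre>
<p class="normal">This version only looks at the first word of the topic;
a fuller one would loop <code>consult_words</code> times.</p>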
<p class="tokendef"><span class="grammartoken">‹<span class="token">Routine</span>›</span>
The most flexible token is simply the name of a “general
parsing routine”. As the name suggests, it is a routine to do some
parsing which can have any outcome you choose, and many of the interesting
things you can do with the parser involve writing one. A general parsing
routine looks at the word stream using <code>NextWord</code> and
<code>wn</code> (see <a href="s28.html">§28</a>) to make its decisions,
and should return one of the following. Note that the values beginning
<code>GPR_</code> are constants defined by the library.</p>
<p class="lynxonly"></p>
<div class="inset"><table>
<tr><td><code>GPR_FAIL</code></td><td>if there is no match;</td></tr>
<tr><td><code>GPR_MULTIPLE</code></td><td>if the result is a multiple object;</td></tr>
<tr><td><code>GPR_NUMBER</code></td><td>if the result is a number;</td></tr>
<tr><td><code>GPR_PREPOSITION</code></td><td>if there is a match but no result;</td></tr>
<tr><td><a id="p237" name="p237"></a><code>GPR_REPARSE</code></td><td>to reparse the whole command from scratch; or</td></tr>
<tr><td><i>O</i></td><td>if the result is a single object <i>O</i>.</td></tr>
</table></div>
<p class="normal">On an unsuccessful match, returning <code>GPR_FAIL</code>,
it doesn't matter what the final value of <code>wn</code> is. On a
successful match it should be left pointing to the next thing <em>after</em>
what the routine understood. Since <code>NextWord</code> moves <code>wn</code>
on by one each time it is called, this happens automatically unless
the routine has read too far. For example:</p>
<p class="lynxonly"></p>
<pre class="code">
[ OnAtOrIn;    ! routine name chosen for illustration
  if (NextWord() == 'on' or 'at' or 'in') return GPR_PREPOSITION;
  return GPR_FAIL;
];
</pre>
<p class="normal">duplicates the effect of
<span class="grammartoken"><code>'on'/'at'/'in'</code></span>,
that is, it makes a token which accepts any of the words “on”,
“at” or “in” as prepositions. Similarly,</p>
<p class="lynxonly"></p>
<pre class="code">
[ Anything;    ! routine name chosen for illustration
  while (NextWordStopped() ~= -1) ; return GPR_PREPOSITION;
];
</pre>
<p class="normal">accepts the entire rest of the line (even an empty
text, if there are no more words on the line), ignoring it.
<code>NextWordStopped</code> is a form of <code>NextWord</code> which
returns the special value −1 once the original word stream has
run out.</p>
<p class="indent">If you return <code>GPR_NUMBER</code>, the number
which you want to be the result should be put into the library's
variable <code>parsed_number</code>.</p>
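<p class="normal">A minimal sketch of a routine returning
<code>GPR_NUMBER</code>: it understands the words “dozen” and
“score” as 12 and 20. The routine name and the verb
“dial” are invented for this example, which also assumes that
a <code>TypeNum</code> action with its <code>TypeNumSub</code> routine is
defined elsewhere in the game.</p>
<p class="lynxonly"></p>
<pre class="code">
! Hypothetical token and verb, for illustration only.
[ DozenOrScore w;
  w = NextWord();                 ! reads one word and moves wn past it
  if (w == 'dozen') { parsed_number = 12; return GPR_NUMBER; }
  if (w == 'score') { parsed_number = 20; return GPR_NUMBER; }
  return GPR_FAIL;                ! wn need not be restored on failure
];

Verb 'dial'
    * number       -> TypeNum
    * DozenOrScore -> TypeNum;
</pre>
<p class="normal">Either grammar line should then deliver its number to
the action in <code>noun</code>, just as the
<span class="grammartoken"><code>number</code></span> token does.</p>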
<p class="indent">If you return <code>GPR_MULTIPLE</code>, place your
chosen objects in the table <code>multiple_object</code>: that is,
place the number of objects in <code>multiple_object-->0</code>
and the objects themselves in <code>-->1</code>, …</p>
<p class="indent">The value <code>GPR_REPARSE</code> should only be
returned if you have actually altered the text you were supposed to be
parsing. This is a feature used internally by the parser when it asks
“Which do you mean…?” questions, and you can use
it too, but be wary of loops in which the parser eternally changes
and reparses the same text.</p>
<p class="dotbreak">&middot; &middot; &middot; &middot; &middot;</p>
<p class="aside"><span class="warning">▲</span>
To parse a token, the parser uses a routine called <code>ParseToken</code>.
This behaves almost exactly like a general parsing routine, and returns
the same range of values. For instance,</p>
<p class="lynxonly"></p>
<pre class="code">ParseToken(ELEMENTARY_TT, NUMBER_TOKEN)</pre>
<a id="p238" name="p238"></a>
<p class="aside">parses exactly as <span class="grammartoken"><code>number</code></span>
does: similarly for <code>NOUN_TOKEN</code>, <code>HELD_TOKEN</code>,
<code>MULTI_TOKEN</code>, <code>MULTIHELD_TOKEN</code>,
<code>MULTIEXCEPT_TOKEN</code>, <code>MULTIINSIDE_TOKEN</code> and
<code>CREATURE_TOKEN</code>. The call</p>
<p class="lynxonly"></p>
<pre class="code">ParseToken(SCOPE_TT, MyRoutine)</pre>
<p class="aside">does what <span class="grammartoken"><code>scope=MyRoutine</code></span>
does. In fact <code>ParseToken</code> can parse any kind of token,
but these are the only cases which are both useful enough to mention
and safe enough to use. It means you can conveniently write a token
which matches, say, <em>either</em> the word “kit” <em>or</em>
any named set of items in scope:</p>
<p class="lynxonly"></p>
<pre class="code">
[ KitOrStuff; if (NextWord() == 'kit') return GPR_PREPOSITION;
  wn--; return ParseToken(ELEMENTARY_TT, MULTI_TOKEN);
];
</pre>
<p class="dotbreak">&middot; &middot; &middot; &middot; &middot;</p>
<a id="ex84" name="ex84"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans84">EXERCISE 84</a></b></span><br>
Write a token to detect small numbers in French, “un”
to “cinq”.</p>
<a id="ex85" name="ex85"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans85">EXERCISE 85</a></b></span><br>
Write a token called <code>Team</code>, which matches only against
the word “team” and results in a multiple object containing
each member of a team of adventurers in a game.</p>
<a id="ex86" name="ex86"></a>
<p class="aside"><span class="warning"><b>•</b>▲
<b><a href="sa6.html#ans86">EXERCISE 86</a></b></span><br>
Write a token to detect non-negative floating-point numbers like
“21”, “5.4623”, “two point oh eight”
or “0.01”, rounding off to two decimal places.</p>
<a id="ex87" name="ex87"></a>
<p class="aside"><span class="warning"><b>•</b>▲
<b><a href="sa6.html#ans87">EXERCISE 87</a></b></span><br>
Write a token to match a phone number, of any length from 1 to 30 digits,
possibly broken up with spaces or hyphens (such as “01245 666
737” or “123-4567”).</p>
<a id="ex88" name="ex88"></a>
<p class="aside"><span class="warning"><b>•</b>▲▲
<b><a href="sa6.html#ans88">EXERCISE 88</a></b></span><br>
(Adapted from code in <tt>"timewait.h"</tt>: see the references
below.) Write a token to match any description of a time of day, such
as “quarter past five”, “12:13 pm”,
“14:03”, “six fifteen” or “seven
<a id="ex89" name="ex89"></a>
<p class="aside"><span class="warning"><b>•</b>▲
<b><a href="sa6.html#ans89">EXERCISE 89</a></b></span><br>
Code a spaceship control panel with five sliding controls, each
set to a numerical value, so that the game looks like:</p>
<a id="p239" name="p239"></a>
<p class="output">&gt;<tt>look</tt><br>
<em>Machine Room</em><br>
There is a control panel here, with five slides, each of which can
be set to a numerical value.<br>
&gt;<tt>push slide one to 5</tt><br>
You set slide one to the value 5.<br>
&gt;<tt>examine the first slide</tt><br>
Slide one currently stands at 5.<br>
&gt;<tt>set four to six</tt><br>
You set slide four to the value 6.</p>
<a id="ex90" name="ex90"></a>
<p class="aside"><span class="warning"><b>•</b>▲
<b><a href="sa6.html#ans90">EXERCISE 90</a></b></span><br>
Write a general parsing routine accepting any amount of text, including
spaces, full stops and commas, between double-quotes as a single token.</p>
<a id="ex91" name="ex91"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans91">EXERCISE 91</a></b></span><br>
On the face of it, the parser only allows two parameters to an action,
<code>noun</code> and <code>second</code>. Write a general parsing routine
to accept a <code>third</code>. (This is easier than it looks: see
the specification of the <code>NounDomain</code> library routine
in <a href="sa3.html">§A3</a>.)</p>
<a id="ex92" name="ex92"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans92">EXERCISE 92</a></b></span><br>
Write a token to match any legal Inform decimal, binary or hexadecimal
constant (such as <code>-321</code>, <code>$4a7</code> or
<code>$$1011001</code>), producing the correct numerical value in
all cases, while not matching any number which overflows or underflows
the legal Inform range of −32,768 to 32,767.</p>
<a id="ex93" name="ex93"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans93">EXERCISE 93</a></b></span><br>
Add the ability to match the names of the built-in Inform constants
<code>true</code>, <code>false</code>, <code>nothing</code> and
<code>NULL</code>.</p>
<a id="ex94" name="ex94"></a>
<p class="aside"><span class="warning"><b>•</b>
<b><a href="sa6.html#ans94">EXERCISE 94</a></b></span><br>
Now add the ability to match character constants like <code>'7'</code>,
producing the correct character value (in this case 55, the ZSCII
value for the character ‘7’).</p>
<a id="ex95" name="ex95"></a>
<p class="aside"><span class="warning"><b>•</b>▲▲
<b><a href="sa6.html#ans95">EXERCISE 95</a></b></span><br>
Next add the ability to match the names of attributes, such as
<code>edible</code>, or negated attributes with a tilde in front,
such as <code>~edible</code>. An ordinary attribute should parse
to its number, a negated one should parse to its number plus 100.
(Hint: the library has a printing rule called <code>DebugAttribute</code>
which prints the name of an attribute.)</p>
<a id="ex96" name="ex96"></a>
<p class="aside"><span class="warning"><b>•</b>▲▲
<b><a href="sa6.html#ans96">EXERCISE 96</a></b></span><br>
And now add the names of properties.</p>
<a id="p240" name="p240"></a>
<p class="aside"><span class="warning"><b>•</b>
<b>REFERENCES</b></span><br>
Once upon a time, Andrew Clover wrote a neat library extension called
<tt>"timewait.h"</tt> for parsing times of day, and allowing
commands such as “wait until quarter to three”. L. Ross
Raszewski, Nicholas Daley and Kevin Forchione each tinkered with and
modernised this, so that there are now also <tt>"waittime.h"</tt>
and <tt>"timesys.h"</tt>. Each has its merits.</p>