3
<title>Can a Bayesian spam filter play chess?</title>
6
<h1>Can a Bayesian spam filter play chess?</h1>
7
<center>by Laird A. Breyer</center>
11
Many people these days depend on Bayesian filters to
12
protect them from the ever present email scourge that is spam.
13
Unlike older technologies, these programs' claim to fame is that
14
they <i>learn</i> the spam patterns automatically, and more importantly,
15
learn <i>personalized</i> spam (bad) and ham (good) email patterns.
17
Like many others, I wrote a Bayesian filter to protect me from unwanted
18
email, which I called <a href="http://dbacl.sourceforge.net"><i>dbacl</i></a>.
19
My implementation functions as a Unix command line text classifier,
20
with special email support, and can be used with procmail.
22
People are often astonished at how well statistical mail filtering works
23
after they first try it, and it's tempting to imagine that such programs
24
actually understand the emails being delivered, rather than merely matching
27
Now <a href="http://www.fide.com/official/handbook.asp?level=EE101">chess</a>
28
has always been a popular gauge of intelligence that everyone
29
can understand, so if we put all these ideas together,
30
then the question "Can a Bayesian spam filter play chess?"
31
seems like a fun experiment with a lot of appeal.
33
Let's put down some ground rules: This experiment will test a real spam filter, not a specially designed chess program. It won't aim to beat
34
<a href="http://www.research.ibm.com/deepblue/"><i>Deep Thought</i></a>
35
(I wouldn't know where to start, and I have a feeling this could be difficult anyway ;-), but it will aim to show signs of "intelligence", or we won't claim
37
Finally, since dry tables and graphs are no fun, a theoretical proof of concept is not enough:
38
the spam filter must <i>really</i> play chess
39
in a way that everyone can see, and try out at home.
41
The account below is designed so that you can follow and duplicate
43
you need is a Unix compatible computer. You'll have to open a terminal
44
and be ready to type shell commands. All shell commands below are preceded
45
by % to indicate the prompt, but you don't type the '%'.
46
Instructions are fairly detailed, and
47
various scripts can be downloaded when needed, but it helps if you're
48
familiar with the shell. Ask a friend if you need help.
50
<b>Important: You must follow these instructions if you want to actually play chess
51
against the spam filter. You must also download some training games and
52
teach the filter beforehand. Running the scripts alone is not enough.
53
The instructions below have been tested to work with the GNU utilities and
57
Start by making a directory to keep all the
58
workings in one place.
64
<h2>Setting up the game(s)</h2>
66
The first thing we have to do is obtain a (preferably large) collection
67
of chess games that we can learn.
69
Not being an expert, I started off by
70
browsing the web for likely keywords. It became soon apparent that a large
71
collection of free games is available electronically in something called the
72
PGN format. So I ended up downloading all the files available from
73
<a href="http://www.chessopolis.com/chessfiles/pgn_collections.htm">Chessopolis</a> and placing them into a subdirectory.
79
100-pg.zip can92-pg.zip gmcom3pg.zip krp-pg.zip swede2pg.zip
80
4queenpg.zip cp8687pg.zip irish-pg.zip lonepine.zip ten-pg.zip
81
acbul-pg.zip cp8891pg.zip italy-pg.zip maxg-pgn.zip trap-pg.zip
82
bril-pg.zip croat-pg.zip kbp-pg.zip minis-pg.zip wchamppg.zip
83
brit60pg.zip denm-pg.zip knp-pg.zip pca9395.zip
84
brit70pg.zip gmcom1pg.zip kp-kp-pg.zip sams-pg.zip
85
cal-pg.zip gmcom2pg.zip kqp-pg.zip storm-pg.zip
88
Now that we have a collection, let's look at a typical game, say from
91
% zcat sams-pg.zip | head -15
93
[Site "Active Chess Championship, Kuala Lumpur (Malays"]
96
[White "Anand Viswanathan (IND)"]
101
1. e4 e5 2. Nf3 Nc6 3. Bc4 Nf6 4. Ng5 Nxe4 5. Bxf7+ Ke7 6. d3 Nf6 7.
102
Bb3 d5 8. Nc3 Bg4 9. f3 Bf5 10. f4 Bg4 11. Qd2 h6 12. fxe5 Nxe5 13. Qe3
103
Kd6 14. d4 Nd3+ 15. Qxd3 Qe7+ 16. Be3 Re8 17. Nf7+ Qxf7 18. O-O c6 19.
104
Bf4+ Kd7 20. Be5 Be7 21. Rae1 Rhf8 22. Nxd5 cxd5 23. Ba4+ Kd8 24. Qc3
105
Bb4 25. Qxb4 Re6 26. c4 Rb6 27. Qa5 Bc8 28. c5 1-0
109
The trouble with data collections is that they are never exactly in the format
110
we want. The chess game is obviously the bit at the bottom, while the
111
text in square brackets looks quite useless to teach our filter.
113
Looking at the game itself, the numbers obviously count the moves,
114
while the actual symbols that follow just seem like noise. But look
115
more closely, and each move is actually followed by two expressions,
118
In chess, the White player always starts first,
119
and if you know that a chess board's columns are marked by letters,
120
and the rows are marked by numbers, then e4 is a square on the board.
121
The capital letters such as B, N, Q, K probably stand for Bishop,
122
kNight, Queen and King. Of course, if you get stuck, you might just
123
want to read the <a href="http://www.tim-mann.org/Standard">PGN format specification</a> instead of guessing.
125
So now we know that each player's moves are separated by spaces, and
126
that the numbers ending in a dot are just there to help people read the
127
moves, and can be ignored just like the text in brackets with the names
128
of the players etc. The real game information could be simply written like this:
130
e4 e5 Nf3 Nc6 Bc4 Nf6 Ng5 Nxe4 Bxf7+ Ke7 d3 Nf6
131
Bb3 d5 Nc3 Bg4 f3 Bf5 f4 Bg4 Qd2 h6 fxe5 Nxe5 Qe3
132
Kd6 d4 Nd3+ Qxd3 Qe7+ Be3 Re8 Nf7+ Qxf7 O-O c6
133
Bf4+ Kd7 Be5 Be7 Rae1 Rhf8 Nxd5 cxd5 Ba4+ Kd8 Qc3
134
Bb4 Qxb4 Re6 c4 Rb6 Qa5 Bc8 c5 1-0
137
<h2>The engine's brain</h2>
139
<a href="http://dbacl.sourceforge.net"><i>dbacl</i></a> is a text classifier. It reads words and builds a probability model
140
from their frequencies. It can also look at pairs of words, triples, etc. up
141
to 7. The tokens it looks at can be up to 30 characters long. These limitations
142
are technical and would take too long to explain, but are useful to know up front.
144
What happens when <i>dbacl</i> classifies a text is that it computes the
145
probability of seeing the text naturally occurring under the
146
model. This is how spam filtering works: we build two models, one for
147
spam and one for good emails, and then <i>dbacl</i> checks which model
148
probability is higher for every incoming email.
151
make sure that <i>dbacl</i> is available. You'll need version 1.11 at least, so
152
the easiest way is to download the program from its homepage. Place the
153
dbacl-1.11.tar.gz file in your <i>chess</i> directory.
156
% tar xfz dbacl-1.11.tar.gz
157
% mv dbacl-1.11 dbacl
159
% ./configure && make && make check
163
For chess, we first want to learn a model from many, many games. But
164
then what? If we play an actual game, we'll end up with another piece
165
of text that looks like the game above, but it will be incomplete. And
166
we want <i>dbacl</i> to choose each move for the side it plays.
168
Now <i>dbacl</i> doesn't know the rules of chess, in fact anybody would be
169
hard pressed to recognize that the text above is actually a game
170
played on a board with wooden or plastic pieces. Anybody who has never
171
heard of PGN of course.
173
What <i>dbacl</i> can do is <i>choose</i>. So here's what we will do: we
174
take an incomplete game, and we add one extra (half) move at the end ourselves.
175
We repeat this for all possible legal moves at that point. We'll get
176
nearly identical partial games whose only difference is the last expression.
177
We'll ask <i>dbacl</i> to work out the probabilities of each partial game under
178
its model. And then we'll pick the most likely partial game.
181
<h2>Figuring out what to learn</h2>
183
We know that <i>dbacl</i> can tell us if a piece of text is <i>typical</i> for
184
a model, and in turn a model is learned by reading many examples together.
185
So if a pattern occurs very often in the games being learned, then such
186
a typical pattern will be recommended by <i>dbacl</i>. And if the pattern is rare
187
then <i>dbacl</i> will recommend it rarely.
189
But we want <i>dbacl</i> to win. So we want it to recommend the kind of things
190
winners often do. So when <i>dbacl</i> plays White, it must learn a model
191
from games where White wins, and if <i>dbacl</i> plays Black, then its model must
192
be from games where Black wins.
194
At least that's a good first assumption. Sometimes, strong players lose a game
195
against weaker players, and if <i>dbacl</i> learns this type of game, then it
196
will pick up bad habits from the weaker player. But we'll assume that most
197
games the better player wins. Also, if we learn to play by studying games
198
from terrible players, then we'll pick up bad habits no matter what.
199
But this is for later, or we'll never get anywhere.
201
Unfortunately, we now have work to do. We must split our thousands of
202
sample games into White-Win (1-0) and Black-Win (0-1). And what to do
203
about draws (1/2-1/2)? We can put them in both categories or just ignore them.
205
The files I downloaded are zipped MS-DOS files called *.PGN whose lines end in "\r\n"
206
instead of ending in "\n", which is the Unix convention.
209
% for f in *.zip; do unzip $f; done
211
% mv *.PGN ../gamefiles
215
After inspecting a few *.PGN files, it's clear that
216
a typical game takes several lines to write out fully,
217
but the lines in between are either empty or contain all sorts of comments
218
and useless information which must be scrubbed. We can do this by recombining
219
the lines of a game into a single long line, and since all games start
220
with a "1.", any lines left that don't start this way can be thrown away.
223
% cat *.PGN | sed -e 's/\r//g' \
224
| sed -e :a -e '$!N;s/\n\([a-hKQNBRO0-9]\)/ \1/;ta' -e 'P;D' \
230
What we now have is a big file <i>allgames.txt</i> which contains very long lines
231
where each line is a single game. In the PGN format, the end result is
232
marked at the end of the game, so it is easy for us to sort the games
233
by throwing away the lines which either contain 1-0 (White wins) or 0-1 (Black wins). We also remove the move numbers which we don't need anymore.
235
% cat allgames.txt | grep -v '0-1' | sed 's/[0-9]*\.[ ]*//g' > WhiteWinDraw.txt
236
% cat allgames.txt | grep -v '1-0' | sed 's/[0-9]*\.[ ]*//g' > BlackWinDraw.txt
237
% cat allgames.txt | grep '1-0' | sed 's/[0-9]*\.[ ]*//g' > WhiteWin.txt
238
% cat allgames.txt | grep '0-1' | sed 's/[0-9]*\.[ ]*//g' > BlackWin.txt
240
Let's see how many games we've got:
244
26809 BlackWinDraw.txt
246
31332 WhiteWinDraw.txt
250
All right, around 15-20 thousand winning games of each type.
251
That should give <i>dbacl</i> something to read!
253
Remember, each game is on its own line. I'm going to leave the final
254
scores at the end of each line where they are, as they are harmless
255
(they can't occur in the middle of a game in progress). Let's learn
259
% ./dbacl/src/dbacl -T text -l ./WhiteWinDraw -e alnum -L uniform -j -w 2 -H 20 ./gamefiles/WhiteWinDraw.txt
260
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 2 -H 20 ./gamefiles/BlackWinDraw.txt
263
The most important option here is "-w 2", which tells <i>dbacl</i> that it
264
must pick up single words as well as word pairs. We'll see later if that's a
265
good idea. If all went well, then you should have two files in your <i>chess</i> directory.
268
-rw-r----- 1 laird laird 3.2M 2005-06-24 17:16 BlackWinDraw
269
-rw-r----- 1 laird laird 3.2M 2005-06-24 17:15 WhiteWinDraw
273
<h2>Building the engine</h2>
275
Now let's build the engine. This is where we'll <strike>cheat</strike>
276
take advantage of the wonderful world of open source.
278
Remember that what we want to do is take a PGN game that's in progress,
279
and let <i>dbacl</i> complete the next move by picking all possible legal next
280
moves and adding them in turn to the current game, then we let <i>dbacl</i>
281
compute the most probable game under its model. We don't really know
282
if <i>dbacl</i>'s model is a good one for chess, and let's not even <i>think</i>
283
about competing with <i>Deep Thought</i> here,
284
but this procedure will at least give us a
285
way to pick the move.
287
Unfortunately, computing legal moves is straightforward but tedious,
288
so I looked around the internet for something useful, and found the
289
<a href="http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/games/chess/san/0.html">SAN Toolkit</a>. This is an old
290
freeware toolkit for chess which does everything we want, even though
291
it was last updated in 1993 and probably no longer state of the art.
292
It's written in C and we have to compile it, but most importantly we
293
can use it directly in a Unix shell environment. Download it and place
294
it in the <i>chess</i> directory.
298
% cp SAN_DOC/Makefile SAN_SRC
304
The documentation for this program is a little wanting, but it's not
305
too hard to figure out, and the source code helps.
306
Everything is calculated by the program <i>san</i>,
307
which accepts commands. We'll write a script which accepts a PGN
308
partial game and one of the <i>dbacl</i> categories we learned, and outputs
309
the "best" move, meaning what <i>dbacl</i> thinks is closest to the category
312
Let's try those ideas out before we proceed. We'll need a test PGN file.
315
1. e4 c5 2. Nf3 e6 3. d3 Nc6
317
The cat commands waits until you press CTRL-D to finish. Let's first
318
ask the <i>san</i> program for a list of legal moves.
320
% echo -ne "svop verb namd nabd napr\nlfer test.pgn\nenum 1\n" \
321
| ./SAN/SAN_SRC/san > test.legal 2>&1
322
% head -10 test.legal
323
: : : Welcome to the SAN Kit.
335
Okay, this looks promising! There are some pieces of text we'll have to
336
remove, but the possible first moves are all there (the listing is cut after
337
10 lines). Let's build the game
340
% cat test.pgn | sed 's/[0-9]*\.[ ]*//g' > test.gameline
345
Next, we complete this gameline with each possible move:
347
% cat test.legal | grep '^.* : 1$' | cut -f 1 -d ' ' | \
349
echo `cat test.gameline` $move
351
% head -10 test.complete
352
e4 c5 Nf3 e6 d3 Nc6 Bd2
353
e4 c5 Nf3 e6 d3 Nc6 Be2
354
e4 c5 Nf3 e6 d3 Nc6 Be3
355
e4 c5 Nf3 e6 d3 Nc6 Bf4
356
e4 c5 Nf3 e6 d3 Nc6 Bg5
357
e4 c5 Nf3 e6 d3 Nc6 Bh6
358
e4 c5 Nf3 e6 d3 Nc6 Kd2
359
e4 c5 Nf3 e6 d3 Nc6 Ke2
360
e4 c5 Nf3 e6 d3 Nc6 Na3
361
e4 c5 Nf3 e6 d3 Nc6 Nbd2
364
And of course, let's check what <i>dbacl</i> thinks of this:
366
% cat test.complete | ./dbacl/src/dbacl -n -c ./WhiteWinDraw -f 1 > test.scores
368
WhiteWinDraw 53.88 e4 c5 Nf3 e6 d3 Nc6 Bd2
369
WhiteWinDraw 52.93 e4 c5 Nf3 e6 d3 Nc6 Be2
370
WhiteWinDraw 52.16 e4 c5 Nf3 e6 d3 Nc6 Be3
371
WhiteWinDraw 53.11 e4 c5 Nf3 e6 d3 Nc6 Bf4
372
WhiteWinDraw 52.88 e4 c5 Nf3 e6 d3 Nc6 Bg5
373
WhiteWinDraw 54.64 e4 c5 Nf3 e6 d3 Nc6 Bh6
374
WhiteWinDraw 55.32 e4 c5 Nf3 e6 d3 Nc6 Kd2
375
WhiteWinDraw 54.82 e4 c5 Nf3 e6 d3 Nc6 Ke2
376
WhiteWinDraw 54.55 e4 c5 Nf3 e6 d3 Nc6 Na3
377
WhiteWinDraw 57.48 e4 c5 Nf3 e6 d3 Nc6 Nbd2
378
WhiteWinDraw 52.07 e4 c5 Nf3 e6 d3 Nc6 Nc3
379
WhiteWinDraw 54.69 e4 c5 Nf3 e6 d3 Nc6 Nd4
380
WhiteWinDraw 53.83 e4 c5 Nf3 e6 d3 Nc6 Ne5
381
WhiteWinDraw 60.95 e4 c5 Nf3 e6 d3 Nc6 Nfd2
382
WhiteWinDraw 56.18 e4 c5 Nf3 e6 d3 Nc6 Ng1
383
WhiteWinDraw 54.51 e4 c5 Nf3 e6 d3 Nc6 Ng5
384
WhiteWinDraw 55.05 e4 c5 Nf3 e6 d3 Nc6 Nh4
385
WhiteWinDraw 52.88 e4 c5 Nf3 e6 d3 Nc6 Qd2
386
WhiteWinDraw 53.56 e4 c5 Nf3 e6 d3 Nc6 Qe2
387
WhiteWinDraw 53.83 e4 c5 Nf3 e6 d3 Nc6 Rg1
388
WhiteWinDraw 48.60 e4 c5 Nf3 e6 d3 Nc6 a3
389
WhiteWinDraw 49.86 e4 c5 Nf3 e6 d3 Nc6 a4
390
WhiteWinDraw 49.73 e4 c5 Nf3 e6 d3 Nc6 b3
391
WhiteWinDraw 49.77 e4 c5 Nf3 e6 d3 Nc6 b4
392
WhiteWinDraw 48.51 e4 c5 Nf3 e6 d3 Nc6 c3
393
WhiteWinDraw 49.23 e4 c5 Nf3 e6 d3 Nc6 c4
394
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
395
WhiteWinDraw 49.77 e4 c5 Nf3 e6 d3 Nc6 e5
396
WhiteWinDraw 48.33 e4 c5 Nf3 e6 d3 Nc6 g3
397
WhiteWinDraw 50.00 e4 c5 Nf3 e6 d3 Nc6 g4
398
WhiteWinDraw 49.23 e4 c5 Nf3 e6 d3 Nc6 h3
399
WhiteWinDraw 49.86 e4 c5 Nf3 e6 d3 Nc6 h4
402
First, you'll note that each line contains the current game, but ends
403
with one of the legal moves. Just before each game sequence, there
404
is a score for that sequence, and since the sequences are nearly identical,
405
the scores are nearly identical too.
407
In the world of <i>dbacl</i>, these scores are the negative logarithm (base 2) of the
408
probability of the sequence, based on the model <i>WhiteWinDraw</i>. It's best to think
409
of these scores as a distance away from the category model, so the line with
410
the smallest score is the most likely. If you want to know what probability
411
each sequence has, it's 1/(2^54.73) etc., which is pretty close to zero! But that's
412
normal with these kinds of models.
414
So what's the best move? Simply sort the lines in increasing order by score
415
and print out the first line:
417
% cat test.scores | sort -k 2 -n | head -1
418
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
420
Remember, <i>dbacl</i> recommends what it thinks most of the
421
games it has learned would do. We can take a peek at the effect of each half move on the
422
score for the first line of <i>test.complete</i> by using <i>dbacl</i>'s debugging switch:
424
% head -1 test.complete
425
e4 c5 Nf3 e6 d3 Nc6 Bd2
426
% head -1 test.complete | ./dbacl/src/dbacl -nv -c ./WhiteWinDraw -f 1 -d
427
# categories: WhiteWinDraw
428
# format: avg_score * complexity
429
20.25 * 0.5 []e4[](1)
430
2.91 * 1.0 []e4[]c5[](1)
432
4.55 * 2.0 []c5[]Nf3[](1)
433
7.65 * 2.5 []Nf3[](1)
434
3.15 * 3.0 []Nf3[]e6[](1)
436
3.21 * 4.0 []e6[]d3[](1)
438
4.34 * 5.0 []d3[]Nc6[](1)
439
5.83 * 5.5 []Nc6[](1)
440
4.31 * 6.0 []Nc6[]Bd2[](1)
441
5.75 * 6.5 []Bd2[](1)
443
The scores we saw earlier are obtained by multiplying the average by the
444
complexity, but <i>dbacl</i> internally works with <i>nats</i>, not <i>bits</i>,
445
so 5.75 * 6.5 / ln(2) = 53.9. Since here we just want to see the tokens that are being
446
used, these values aren't important. The most important thing to see is that
447
there are contributions from single half moves as well as pairs of half moves,
448
and the score balances them all.
450
<h2>Playing against people</h2>
452
Chess without a board is not as much fun as chess with an actual board.
453
If we really want to claim that <i>dbacl</i> can play chess, then we need to
454
make it work with something like
455
GNU <a href="http://savannah.gnu.org/projects/xboard/">XBoard</a> so it
456
can play against people. Before we go on, you'll have to make sure
457
this program is installed. It comes standard with most GNU/Linux distributions,
458
for example on <a href="http://www.debian.org">Debian</a> you can type
461
% apt-get install xboard
465
A real chess engine that works with XBoard must implement the
466
<a href="http://www.tim-mann.org/xboard/engine-intf.html">Chess Engine
467
Communication Protocol</a>. This is a bit tedious, so I prepared one earlier.
468
Save it as a file named <a href="dce-basic.sh">dce-basic.sh</a> in your <i>chess</i>
469
directory, or type it yourself from the listing below.
470
Note that I've just created enough code to play a simple game
471
without undo, force, switch sides or board setup. In other words, the only
472
thing you can do is start a new game, and play the moves, and <i>dbacl</i> must play Black.
476
# This script functions as an incomplete chess engine for XBoard.
478
DBACL=./dbacl/src/dbacl
479
SAN=./SAN/SAN_SRC/san
484
PGN="$TMP/$DCE.current.pgn"
485
GAMELINE="$TMP/$DCE.current.gameline"
486
SCORES="$TMP/$DCE.current.scores"
487
ENGINEMOVE="$TMP/$DCE.current.emove"
488
SANOUT="$TMP/$DCE.current.stdout"
489
SANERR="$TMP/$DCE.current.stderr"
493
CATFILE="./BlackWinDraw"
498
function exec_san() {
499
rm -rf $PGN.new $SANOUT $SANERR
500
echo -ne "svop namd nabd napr\nlfer $PGN\n$1\nsfer $PGN.new" \
501
| "$SAN" > "$SANOUT" 2> "$SANERR"
502
if grep Error "$SANOUT" > /dev/null; then
503
echo "Error (illegal move): $cmd"
511
function do_engine_move() {
512
if exec_san "enum 1"; then
513
# legal moves are in $SANOUT
516
| sed -e :a -e '$!N;s/\n\([a-hKQNBRO0-9]\)/ \1/;ta' -e 'P;D' \
519
| sed 's/[0-9]*\.[ ]*//g' \
523
# make all completions and let dbacl decide
527
| while read move; do
528
echo "`cat $GAMELINE` $move"
530
| "$DBACL" -n -c "$CATFILE" -f 1 \
533
if [ `cat "$SCORES"| wc -l` = "0" ]; then
534
# no moves left, game over!
535
# the gameline contains the result
536
echo "`cat $GAMELINE | sed 's/^.* //'` {Play again?}"
539
# pick best scoring move
543
| sed -e 's/^.* //' \
546
if exec_san "`cat $ENGINEMOVE`"; then
547
echo "move `cat $ENGINEMOVE`"
556
function do_reset() {
563
if [ "$SANOK" != "yes" ]; then
564
echo "Illegal move (you must use SAN): $1"
565
echo "Error (cannot play): $1"
567
if exec_san "$1"; then
575
echo "feature san=1 done=1"
587
echo "Error (only standard chess please): $cmd"
599
echo "Error (unknown command): $cmd"
602
# ignore other commands
603
echo "Error (command not implemented): $cmd"
606
if [ "$MOVENOW" = "yes" ]; then
607
if do_engine_move; then
610
echo "Error (engine blew up): kaboom"
617
The source code above is not very complex. The last part reads commands one
618
line at a time and calls various functions depending on the command.
619
Then, if it is Black's turn, it calls the <i>do_engine_move</i> function.
620
The current state of the game is constantly updated in the <i>dce-basic.current.pgn</i> file which is created in your <i>chess</i> directory, the list
621
of scores computed by <i>dbacl</i> is in the file <i>dce-basic.current.scores</i>, and the final best engine move is saved in the file <i>dce-basic.current.emove</i>.
623
Once you have this engine script in your <i>chess</i> directory, make sure it has
624
execute permissions, and then you can try it out with XBoard.
626
% chmod +x ./dce-basic.sh
627
% xboard -fcp ./dce-basic.sh
631
It's much more fun to read this account if you actually try this out
632
with XBoard yourself, so I won't say what happens next or how well
633
<i>dbacl</i> does as a chess player. The discussion with answers continues
634
in the next section so consider this a spoiler warning. Don't peek!
636
When playing chess on XBoard, don't forget that <a href="dce-basic.sh">dce-basic.sh</a> is
637
incomplete. All you can do is play White, and restart the game whenever
638
you want. No fancy menu choices. However, you can follow what's going
639
on at any time by looking at the temporary files named
640
<i>dce-basic.current.*</i> which are placed in your <i>chess</i> directory.
642
<img src="down.png"><p>
643
<img src="spoiler.png" alt="spoiler space" height="100%">
647
Okay, you've tried your first game against <i>dbacl</i>, and it wasn't very
648
impressive. In fact, it was practically random play! Note that it
649
isn't entirely random - <i>dbacl</i> doesn't pick for example a random pawn
650
on first move but slightly prefers positions towards the middle of the board. What's
653
There can be only one culprit, and it's the model we used for learning,
654
namely the "-w 2" switch. To understand this, let's look at the game line
661
Now <i>dbacl</i> learns by reading single words (ie half moves) and consecutive pairs
662
of words, then building a model out of these frequencies. This means that there is nothing in the model which makes for example the third word Nf3 depend on
663
the first one e4 except through the second one c5, which could well be anything.
664
In other words, consecutive moves by one colour are largely independent,
665
which is why it feels that <i>dbacl</i> has no strategy.
667
We can force <i>dbacl</i> instead to learn 3, or better yet 4 consecutive words
668
as a pattern. That should link e4 with Nf3 and even e6. Let's try it (this could take
669
a few minutes to run):
671
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 4 -H 20 ./gamefiles/BlackWinDraw.txt
672
dbacl:warning: table full, some tokens ignored - try with option -h 21
674
Oh, oh. Here's a problem that we might as well get out of the way now.
675
Looking at pairs, triples, etc. (these are called n-grams) uses a lot of memory
676
because of all the possible combinations. That's one reason why long n-grams aren't that
677
popular in machine classification.
678
Here <i>dbacl</i> ran out of space and told us so, but
679
still tried to learn something. But we don't want a skewed model, so we'll
680
increase the allowable space and relearn. You will have to be
681
mindful of this when you experiment.
683
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 4 -H 22 ./gamefiles/BlackWinDraw.txt
684
% xboard -fcp ./dce-basic.sh
688
<img src="down.png"><p>
689
<img src="spoiler.png" alt="spoiler space" height="100%">
691
<h2>Several steps</h2>
693
It's hard to see a big improvement with "-w 4", which is not very surprising if you think about it.
694
Let's try with "-w 7", which for technical reasons is the maximum <i>dbacl</i> can
695
handle. But before we do this, I want to briefly mention the work of
696
<a href="http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Shannon.html">C.E. Shannon</a>.
698
One of Shannon's famous experiments concerns the approximation of English. He
699
asked what would sentences look like if letters or words were picked randomly from
700
a book. Here is what he found:
702
(Single letters) OCRO HLI RGWR NMIELWIS EU LL NBNESEBYA TH EEI ALHENHTTPA OOBTTVA NAH BRL.
704
(Pairs of letters) ON IE ANTSOUTINYS ARE T INCTORE ST BE S DEAMY ACHIN D ILONASIVE TUCOOWE AT TEASONARE FUSO TIZIN ANDY TOBE SEACE CTISBE.
706
(Triples of letters) IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID PONDENOME OF DEMONSTURES OF THE REPTAGIN IS REGOACTIONA OF CRE.
708
(Single words) REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES
709
THE LINE MESSAGE HAD BE THESE.
711
(Pairs of words) THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT
712
THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
714
So can we expect this also with chess moves? Let's try it. Because of
715
the heavy memory requirements when "-w 7" is used (and the long time it takes),
716
we'll learn a smaller
717
game collection <i>BlackWin.txt</i>
718
(but we keep the <i>BlackWinDraw</i> category name so we don't have to modify <i>dce-basic.sh</i>).
720
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 7 -H 23 ./gamefiles/BlackWin.txt
723
Before we start playing, how can we be sure that <i>dbacl</i> will match patterns in the way we expect?
724
Let's try the debug switch like we did before on our test gameline:
726
% head -1 test.complete | ./dbacl/src/dbacl -nv -c ./BlackWinDraw -f 1 -d
727
# categories: BlackWinDraw
728
# format: avg_score * complexity
729
63.73 * 0.1 []e4[](1)
730
-9.44 * 0.3 []e4[]c5[](1)
731
15.46 * 0.4 []c5[](1)
732
6.91 * 0.6 []e4[]c5[]Nf3[](1)
733
-6.53 * 0.7 []c5[]Nf3[](1)
734
5.07 * 0.9 []Nf3[](1)
735
4.17 * 1.0 []e4[]c5[]Nf3[]e6[](1)
736
1.31 * 1.1 []c5[]Nf3[]e6[](1)
737
-9.96 * 1.3 []Nf3[]e6[](1)
738
-2.22 * 1.4 []e6[](1)
739
-0.86 * 1.6 []e4[]c5[]Nf3[]e6[]d3[](1)
740
-1.00 * 1.7 []c5[]Nf3[]e6[]d3[](1)
741
-3.88 * 1.9 []Nf3[]e6[]d3[](1)
742
-9.31 * 2.0 []e6[]d3[](1)
743
-3.80 * 2.1 []d3[](1)
744
-2.59 * 2.3 []e4[]c5[]Nf3[]e6[]d3[]Nc6[](1)
745
-1.58 * 2.4 []c5[]Nf3[]e6[]d3[]Nc6[](1)
746
-2.16 * 2.6 []Nf3[]e6[]d3[]Nc6[](1)
747
-2.99 * 2.7 []e6[]d3[]Nc6[](1)
748
-5.49 * 2.9 []d3[]Nc6[](1)
749
-2.12 * 3.0 []Nc6[](1)
750
0.51 * 3.1 []e4[]c5[]Nf3[]e6[]d3[]Nc6[]Bd2[](1)
751
2.92 * 3.3 []c5[]Nf3[]e6[]d3[]Nc6[]Bd2[](1)
752
5.13 * 3.4 []Nf3[]e6[]d3[]Nc6[]Bd2[](1)
753
7.16 * 3.6 []e6[]d3[]Nc6[]Bd2[](1)
754
6.37 * 3.7 []d3[]Nc6[]Bd2[](1)
755
3.35 * 3.9 []Nc6[]Bd2[](1)
756
5.84 * 4.0 []Bd2[](1)
759
Perfect! Clearly <i>dbacl</i> is picking up sequences up to seven long. Now
760
let's make a proper check:
762
% xboard -fcp ./dce-basic.sh
765
<img src="down.png"><p>
766
<img src="spoiler.png" alt="spoiler space" height="100%">
768
<h2>Another improvement</h2>
770
If you've tried the case "-w 7", then you may have noticed a true improvement
771
over the previous attempts. But while <i>dbacl</i> no longer plays <i>quite</i> so randomly, the overall
772
game seems touch and go. Openings are sometimes recognized, but there's no
773
strategy and frequently <i>dbacl</i> seems to forget what it was doing. Also,
774
there aren't many attempts to protect the 8th row from even direct attacks.
776
We can explain this type of behaviour with what we already know about the
777
<i>dbacl</i> model. Since one chess move needs two words in PGN notation,
778
then even with "-w 7", the longest connected word sequences
779
are just over 3 chess moves long. These sequences aren't under <i>dbacl</i>'s control
780
since you play White, so when they break, this causes confusion and another
781
potential sequence is followed. That's why <i>dbacl</i> seems to lose interest.
783
Another problem is that <i>dbacl</i>'s model has no concept of opening, middle and
784
endgame. If row 8 is attacked, it has no way of knowing if there are pawns
785
which protect the piece, and whether there is room to move away, because
786
the training patterns are averaged over many games. We'll come
787
back to this observation later.
789
So far we've treated half-moves as fundamental, but perhaps it makes more sense
790
to base our models on full moves? Since our engine always plays
791
Black, then the completed gamelines will always have an even number of
792
words. Moreover, the '-w 7" model shall truly be <i>seven chess moves</i> long, not three
793
and a half. Let's try it. We can combine pairs of moves by replacing the
794
space between them with underscores as follows (<a href="combine_half_moves.sh">combine_half_moves.sh</a>).
798
% cat > combine_half_moves.sh
800
sed -e 's/$/ Z/' -e 's/ \([^ ]* *[^ ]\)/_\1/g' -e 's/[ ]*Z$//'
801
% chmod +x combine_half_moves.sh
802
% cat test.gameline | ./combine_half_moves.sh
806
Now we have to adapt the training sets, and change the code of <i>dce-basic.sh</i>
807
slightly. I've called this modification <a href="dce-1.sh">dce-1.sh</a>
810
% cat BlackWinDraw.txt | ../combine_half_moves.sh > BlackWinDraw-1.txt
811
% cat BlackWin.txt | ../combine_half_moves.sh > BlackWin-1.txt
815
Naturally, we must learn the new dataset. You might want to make a cup of coffee while you wait.
817
% ./dbacl/src/dbacl -T text -l ./BlackWin-1 -e graph -L uniform -j -w 7 -H 23 ./gamefiles/BlackWin-1.txt
820
One last safety check and we'll be ready:
822
% cat test.gameline | ./combine_half_moves.sh | ./dbacl/src/dbacl -nv -c ./BlackWin-1 -f 1 -d
823
# categories: BlackWin-1
824
# format: avg_score * complexity
825
128.93 * 0.1 []e4_c5[](1)
826
-11.83 * 0.3 []e4_c5[]Nf3_e6[](1)
827
37.71 * 0.4 []Nf3_e6[](1)
828
17.15 * 0.6 []e4_c5[]Nf3_e6[]d3_Nc6[](1)
829
-20.69 * 0.7 []Nf3_e6[]d3_Nc6[](1)
830
7.60 * 0.9 []d3_Nc6[](1)
832
Okay, it seems to be working.
835
% xboard -fcp ./dce-1.sh
838
<img src="down.png"><p>
839
<img src="spoiler.png" alt="spoiler space" height="100%">
841
<h2>Aimless behaviour</h2>
843
Compared to the earlier attempts, this change seems to have improved
844
<i>dbacl</i>'s tactics. However, we
845
still have much aimless behaviour in the middle and end games. How do
848
Our biggest problem is possibly that <i>dbacl</i> is blind. It simply doesn't know or care about
849
the true configuration of the board pieces, like a real chess engine would.
850
Instead, everything it knows are the likely sequences of moves that it found
851
in the training games.
853
Computer chess is an area which has been studied extensively for a long time,
854
and while we could try to apply these methods to our little engine,
855
we wouldn't learn anything new about chess or about <i>dbacl</i>. So instead,
856
I am only going to try the kind of things that would not be out of place in spam filtering.
858
Deep in the dawn of spam filtering, people devised keyword rules to
859
send unwanted email to the trash. Even today, this is a popular method
860
to detect, for example, those messages which contain "VIAGRA" in the
861
subject line, and automatically pick an action to take.
862
Maybe we can detect some fixed text pattern in the gameline and use this to override
863
the normal <i>dbacl</i> scores? Let's look at a typical game.
865
% head -1 ./gamefiles/BlackWin.txt | fmt
866
d4 Nf6 c4 e6 Nf3 b6 g3 Ba6 Qc2 c5 Bg2 Bb7 O-O Be7 Nc3 cxd4 Nxd4 Bxg2
867
Kxg2 Qc7 Qd3 O-O e4 d6 f3 Nbd7 b3 a6 Be3 Qb7 Rfd1 Rfe8 Bf2 Bf8 Nc2 Rec8
868
Ne3 Rab8 a4 Ne5 Qd2 Rc7 Rac1 Rbc8 Qe2 Nc6 Be1 Nd7 g4 Nc5 Qc2 Nb4 Qb1
869
Be7 Ne2 Nc6 Bc3 Ne5 Ng3 Bg5 Ngf1 Bf4 Bxe5 Bxe5 Rd2 Bf4 Rcd1 b5 axb5
870
axb5 Ra2 Nd7 Qd3 Nc5 Qc2 h6 Ra5 bxc4 bxc4 Nd7 Qe2 Ne5 Rb5 Qa6 Rb2
871
Nxc4 Nxc4 Qxc4 Rd3 Qc5 Ne3 Qg5 Qf2 Rc3 Rxc3 Rxc3 Nf1 Kh7 Rc2 Qe5 Ng3
872
Be3 Qe2 g6 Nf1 Bf4 Ng3 Kg7 Qf2 Be3 Qe2 Qd4 Nf1 Bf4 Ng3 Rxc2 Qxc2 Qd2+
873
Qxd2 Bxd2 Ne2 g5 Nd4 Kf6 Nb3 Bc3 Kf2 Ke5 Ke3 Bb4 Nc1 Bc5+ Ke2 Bg1 Nd3+
874
Kd4 h3 Kc3 Nc1 Bh2 Nd3 Bf4 Nf2 d5 exd5 exd5 Nd3 Be5 Nc1 Kd4 Nd3 Bf4 Nb4
875
Ke5 Nd3+ Kd6 Nb4 Ke6 Nd3 Bd6 Nb2 Ke5 Nd3+ Kf6 Nb2 Ke6 Nd3 h5 Nf2 f5 Nd1
876
fxg4 fxg4 hxg4 hxg4 Kf6 Nb2 Ke5 Kf3 Kd4 Nd1 Kd3 Nf2+ Kd2 Nh3 Be7 Nf2
877
Bc5 Nh1 Bd6 Nf2 Bf4 Nh1 Be3 Ng3 Kd3 Nh1 Bg1 Ng3 Kd2 Nf5 d4 Ng3 d3 0-1
880
Besides the normal moves that represent a change in position, the most
881
obvious feature is that some of
882
the moves above also contain an 'x', which means that this is a capturing move.
883
Interesting! One of the problems with the <i>dbacl</i> engine so far is that often,
884
an opportunity for capturing White's pieces is simply ignored. Can we devise
885
a keyword rule which triggers on 'x' to force <i>dbacl</i> to capture an available
886
piece instead of ignoring it?
888
Let's try it: first we'll need a gameline which includes an 'x' type
889
move, since our previous <i>test.gameline</i> file doesn't. Note that
890
we'll forget temporarily the underscore trick we used in the previous
891
section, just to keep things simple at first.
893
If you look at the full game listed two paragraphs ago, you'll see that
894
the last full move on the first line is "Nxd4 Bxg2", so if we use this line
895
and delete the last full move, then the possible completions will contain at least "Nxd4" and we have a suitable test case.
898
1. d4 Nf6 2. c4 e6 3. Nf3 b6 4. g3 Ba6 5. Qc2 c5 6. Bg2 Bb7 7. O-O Be7 8. Nc3 cxd4
899
% echo -ne "svop verb namd nabd napr\nlfer test2.pgn\nenum 1\n" \
900
| ./SAN/SAN_SRC/san > test2.legal 2>&1
901
% cat test2.legal | grep '^.* : 1$' | cut -f 1 -d ' ' | \
903
echo `cat test2.gameline` $move
904
done > test2.complete
905
% cat test2.complete | ./dbacl/src/dbacl -n -c ./WhiteWinDraw -f 1 > test2.scores
908
There are 48 different potential moves in the <i>test2.scores</i> file, and
909
what we are interested in is the score (column 2) and the potential half move
912
% cat test2.scores | sort -k 2 -n | cut -f 2,21 -d ' ' | head -25
940
I've only listed the first 25 moves by score but it's clear that <i>dbacl</i>'s
941
model puts "h3" as the most likely move, and "Nxd4" way down in 20th position!
942
Let's extract the capturing moves.
944
% cat test2.scores | grep 'x[^ _]*$' | sort -k 2 -n
945
WhiteWinDraw 138.73 d4 Nf6 c4 e6 Nf3 b6 g3 Ba6 Qc2 c5 Bg2 Bb7 O-O Be7 Nc3 cxd4 Nxd4 Bxg2 Nxd4
946
WhiteWinDraw 162.90 d4 Nf6 c4 e6 Nf3 b6 g3 Ba6 Qc2 c5 Bg2 Bb7 O-O Be7 Nc3 cxd4 Nxd4 Bxg2 Qxh7
949
Since there is more than one possible capturing move, the engine has to decide what it wants
951
By sorting the capturing moves by their scores, we let <i>dbacl</i> tell us which move it prefers.
952
Obviously, "Nxd4" is this preferred candidate.
953
It's also possible that for some gamelines, there is no legal capturing move; we'll have
954
to use the full list of scores as before in that case.
956
Finally, we must decide how we are going to integrate this special handling of capturing moves
957
into our chess engine.
958
In spam filters, a keyword rule typically stops all other tests from happening afterwards, because the rule is trusted to override other decisions. Here, this means that we first score
959
all the capturing moves if there are any, and use the best one regardless of other options.
960
It's only if there are no capturing moves that we look at the remaining possibilities.
962
By using the explanations above, you can modify <a href="dce-1.sh">dce-basic.sh</a> yourself, or try
963
out <a href="dce-2.sh">dce-2.sh</a>, which implements both the 'x' and "underscore" tricks together.
966
% xboard -fcp ./dce-2.sh
970
<img src="down.png"><p>
971
<img src="spoiler.png" alt="spoiler space" height="100%">
973
<h2>Tactical adjustments</h2>
975
Quite a change in behaviour! The engine no longer ignores capture opportunities,
977
made it greedy. <i>dbacl</i> is now so greedy that there is no tension left in the game, and
978
since it doesn't know the value of each piece, it often makes bad bargains.
980
Unfortunately, there isn't enough data in the PGN format to let <i>dbacl</i>
981
read off easily the value of an exchange. If you look at the full training
982
game listed earlier, you'll see many moves marked with an 'x', such as
983
"cxd4" "Nxd4" "Bxg2" "Kxg2", but none of these moves identifies the
984
type of the captured piece, only the piece doing the capturing. It's certainly possible to <i>deduce</i>
985
the relevant piece by replaying all the moves on an imaginary board,
986
but then our chess engine would no longer act like a spam
987
filter. Keeping an imaginary board is roughly the equivalent of
988
<i>understanding the actual meaning</i> of an email.
990
So <i>dbacl</i>, as a chess-playing-spam-filter, is limited to two things:
991
it can limit the risk of an exchange, and limit the frequency of exchanges.
993
To limit risk, if there are several capture scenarios available, it can
994
always prefer to capture with the least valued piece (since the type of the piece
995
doing the capturing is known by looking at the move), but note that this doesn't help
996
when there is exactly one capture move possible.
998
Limiting the frequency of exchanges would make <i>dbacl</i> less greedy and
999
refuse to capture pieces all the time. As we noted earlier, <i>dbacl</i> doesn't naturally
1000
tend to capture pieces, so we must find a balance.
1002
We can force a capture when there are two or more capture opportunities. Together
1003
with the risk limitation idea, these types of captures are then performed by
1004
lower value pieces in relative terms.
1006
Clearly, there are endless other things we can try, but there is a price to pay.
1007
As capture rules become heavier and more complex, the original patterns learned by
1008
<i>dbacl</i> from the training games lose their importance. The <i>dbacl</i> chess engine becomes
1009
a hybrid, part ordinary chess engine and part Bayesian text classifier.
1011
I've implemented the two rules above in <a href="dce-3.sh">dce-3.sh</a>. Try it out now.
1014
% xboard -fcp ./dce-3.sh
1017
<!-- not finished, but doesn't look promising
1020
<img src="down.png"><p>
1021
<img src="spoiler.png" alt="spoiler space" height="100%">
1023
<h2>The carrot and the stick</h2>
1025
If you've followed the story so far, you may have noticed a missing ingredient.
1026
Bayesian spam filters require both examples of spam, but also examples of good email to
1027
perform well. The good email teaches the filter what features we want to accept (the carrot)
1028
and the spam messages teach it what features we want to reject (the stick).
1030
So far, the <i>dbacl</i> chess engine only uses a carrot, namely a collection of games we want
1031
to mimic. The best move is chosen to minimize the distance towards the model. If we want to use
1032
a second model (the stick), then we must find a way of combining the scores. The easiest
1033
way is to subtract them. Note that it's important to use comparable models when doing this,
1034
so let's learn brand new category files with identical constraints.
1037
% cat BlackWin.txt | ../combine_half_moves.sh > BlackWin-4.txt
1038
% cat WhiteWin.txt | ../combine_half_moves.sh > WhiteWin-4.txt
1040
% ./dbacl/src/dbacl -T text -l ./BlackWin-4 -e graph -L uniform -j -w 7 -H 23 ./gamefiles/BlackWin-4.txt
1041
% ./dbacl/src/dbacl -T text -l ./WhiteWin-4 -e graph -L uniform -j -w 7 -H 23 ./gamefiles/WhiteWin-4.txt
1044
So now we have a carrot model <i>Blackwin-4</i> and a stick model <i>WhiteWin-4</i>, let's score a set of completed gamelines:
1046
% cat test.complete | | sed -e 's/\([^ ]*\) \([^ ]*\) \([^ ]*\)/\1_\2 \3/g' \
1047
-e 's/\([^ ]*\) \([^ ]*\)$/\1_\2/' > test-4.complete
1048
% cat test-4.complete | ./dbacl/src/dbacl -n -c ./WhiteWin-4 -c ./BlackWin-4 -f 1,2 > test-4.scores
1049
% head -10 test-4.scores
1050
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Bd2
1051
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Be2
1052
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Be3
1053
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Bf4
1054
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Bg5
1055
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Bh6
1056
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Kd2
1057
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Ke2
1058
WhiteWin-4 169.69 BlackWin-4 192.53 e4_c5 Nf3_e6 d3_Nc6_Na3
1059
WhiteWin-4 177.67 BlackWin-4 200.51 e4_c5 Nf3_e6 d3_Nc6_Nbd2
1062
These are the first few possible moves. Clearly, <i>dbacl</i> thinks they are
1063
closer to <i>WhiteWin-4</i> than <i>BlackWin-4</i>. We can subtract the scores and sort the
1066
% cat test-4.scores | while read line; do
1067
S=`echo $line | cut -d ' ' -f2,4 | sed 's/ / - /' | bc -l`
1069
done | sort -k 1 -n > test-4.sorted
1070
% head -10 test-4.sorted
1071
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_a3
1072
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_a4
1073
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_b3
1074
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_b4
1075
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_c3
1076
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_c4
1077
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_d4
1078
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_e5
1079
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_g3
1080
-22.84 WhiteWin-4 161.71 BlackWin-4 184.55 e4_c5 Nf3_e6 d3_Nc6_g4
1083
Why do we have a negative difference? Since <i>dbacl</i> plays Black,
1084
we want the smallest scores to represent a case when
1085
<i>BlackWin-4</i> is more relevant than <i>WhiteWin-4</i>.
1087
What seems a little surprising is that all scores are the same,
1088
but this might just be coincidence. Let's try out <i>dce-4.sh</i>
1089
which implements all these changes.
1091
% chmod +x ./dce-4.sh
1092
% xboard -fcp ./dce-4.sh
1098
<img src="down.png"><p>
1099
<img src="spoiler.png" alt="spoiler space" height="100%">
1101
<h2>Randomized play</h2>
1103
We've come a long way, so let's go back to the initial question.
1104
"Can a spam filter play chess?". To answer this question affirmatively,
1105
we need at least one example which can't be explained by random play.
1106
While <i>dbacl</i> undoubtedly makes some strange chess decisions sometimes,
1107
let's look at the following game fragment played against
1108
<a href="dce-3.sh">dce-3.sh</a>:
1110
<img src="csfpc1.png">
1111
<img src="csfpc2.png">
1112
<img src="csfpc3.png">
1114
Clearly, <i>dbacl</i> picked up the pawn defensive moves from the games archive,
1115
as this succession of moves is very unlikely to be random (all moves are on
1116
the same side of the board, and none of the moves are capturing moves, so
1117
cannot be explained by capture heuristics which partially override <i>dbacl</i>'s
1120
So <i>dbacl</i> has definitely learned something about chess, at
1121
least in some tactical situations,
1122
and can probably hold
1123
its own against an average three year old. Mission accomplished!
1125
Well, the title of this section is "randomized play", and there is
1126
one more thing to do. So far, the <i>dbacl</i> chess engine is deterministic:
1127
when faced with identical White moves, it will always play the same way.
1128
This gets boring very fast. What we would like is more randomized
1129
play, but which still uses the <i>dbacl</i> scoring system.
1131
For randomized play, we can think of the scores as equivalent
1132
probabilities. Then we
1133
can pick not just the highest probability (lowest score) move as we've done so far, but instead any move according
1134
to its probability. Let's look again at the file <i>test.scores</i>
1135
that we created earlier:
1137
% head -5 test.scores
1138
WhiteWinDraw 53.88 e4 c5 Nf3 e6 d3 Nc6 Bd2
1139
WhiteWinDraw 52.93 e4 c5 Nf3 e6 d3 Nc6 Be2
1140
WhiteWinDraw 52.16 e4 c5 Nf3 e6 d3 Nc6 Be3
1141
WhiteWinDraw 53.11 e4 c5 Nf3 e6 d3 Nc6 Bf4
1142
WhiteWinDraw 52.88 e4 c5 Nf3 e6 d3 Nc6 Bg5
1145
As I already mentioned, the <i>dbacl</i> model probabilities are related to
1146
these scores as follows: Prob[Bd2] = 2^(-53.88), Prob[Be2] = 2^(-52.93), etc.
1147
Unfortunately, these are all very small probabilities and we can't
1148
easily represent them as floating point numbers once the gameline becomes
1149
much longer, let alone pick them at random. So first, let's subtract the biggest common factor: If we sort the scores, then the top score is the factor we want:
1151
% cat test.scores | sort -k 2 -n > test.scores.prob
1152
% head -1 test.scores.prob
1153
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
1156
We'll write an awk program to do the randomizing, so here's the first part:
1166
% cat test.scores.prob | awk -f ./renorm.awk | head -5
1167
0 WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
1168
0.77 WhiteWinDraw 48.33 e4 c5 Nf3 e6 d3 Nc6 g3
1169
0.95 WhiteWinDraw 48.51 e4 c5 Nf3 e6 d3 Nc6 c3
1170
1.04 WhiteWinDraw 48.60 e4 c5 Nf3 e6 d3 Nc6 a3
1171
1.67 WhiteWinDraw 49.23 e4 c5 Nf3 e6 d3 Nc6 c4
1174
Okay, the first column gives the modified score and the rest is the original
1175
line. Since these are binary exponents rather than actual probabilities,
1176
we'll use a rejection sampler to pick the correct line and print it out.
1177
Here's the full randomizer:
1179
% cat > randomizer.awk
1185
score[NR] = $2 - cf;
1190
# randomizer seeded by time of day
1191
# don't use more often than once per second.
1194
x = int(rand() * NR) + 1;
1196
if( log(2) * score[x] < t ) {
1202
% cat test.scores.prob | ./randomizer.awk
1203
WhiteWinDraw 48.33 e4 c5 Nf3 e6 d3 Nc6 g3
1204
% cat test.scores.prob | ./randomizer.awk
1205
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
1208
Awk's randomizer is seeded by the time of day, so if you repeatedly run the <i>randomizer.awk</i> script in a fast loop, it will always output the same result. But for chess against people, we don't really care about the quality of the randomness involved.
1210
I've added the randomizer to the final version of the chess engine, <a href="dce.sh">dce.sh</a>, which also contains a few other improvements. If you feel
1211
like experimenting further with the <i>dbacl</i> chess engine, that script is probably
1212
the best starting point. Enjoy!
1215
% xboard -fcp ./dce.sh
1218
<img src="down.png"><p>
1219
<img src="spoiler.png" alt="spoiler space" height="100%">
1221
<h2>Wrapping up</h2>
1223
It's time to conclude this investigation and see what we've learned. The
1224
original question was "Can a spam filter play chess?". Clearly, the answer to
1225
this is yes, but making it play well is not so easy.
1227
A crucial aspect I haven't touched on here are chess tournaments. The only way to
1228
reliably judge potential improvements in an engine is to make it play other engines
1229
with known strengths. Not all types of tournaments are appropriate for the <i>dbacl</i>
1230
chess engine - for example randomized initial positions and endgame puzzles are meaningless,
1231
because <i>dbacl</i> doesn't think ahead more than one half move, and it needs the full game
1232
history for matching patterns. Moreover, the fundamental behaviour is learned entirely
1233
from training sets. Thus <i>dbacl</i>'s strength as a chess player is meaningless without reference
1234
to its training archive. However, if this archive is fixed, then incremental improvements to the
1235
algorithms can be evaluated in that context.
1237
<i>dbacl</i> is able to learn some tactics simply by reading large collections of
1238
games. However, it seems that strategy is beyond its capabilities. Moreover,
1239
there are some fundamental limitations in treating a PGN format game like a text
1240
document: some information such as the values of exchanged pieces aren't easy to read off
1241
without keeping an imaginary board for replaying moves.
1243
There are many ways to change the characteristics of the basic chess engine we've
1244
built here. Besides more complex capture heuristics, one can try to account for
1245
the length of the game (opening/middle/endgame), the difference in the number of pieces
1246
captured by each side, etc. Beyond that, the PGN gameline representation which we used
1247
here can be replaced with more informative symbol sequences. In principle, one could replace
1248
each move with a pictorial representation of the board such as
1249
<a href="http://www.tim-mann.org/Standard">FEN</a>, but besides the added implementational complications, this causes difficulties because of <i>dbacl</i>'s limit of 30 character tokens, and possibly statistical issues in recognizing similar but not identical board configurations.
1251
Perhaps most interestingly overall, it should be remembered that <i>dbacl</i>
1252
<i>doesn't think ahead</i> like most chess engines do. Its successes and
1253
failures are almost entirely based on the historical record of the
1254
game as it develops and mimicry of training games, not at all on calculating moves and
1255
countermoves in the future.
1257
Download the latest dbacl chess engine: <a href="dce.sh">dce.sh</a>
1259
<h2>Post Scriptum</h2>
1261
After this essay was written, it <a href="http://games.slashdot.org/article.pl?sid=05/07/20/0317215&tid=156&tid=111&tid=10">appeared on slashdot</a>. This resulted in some interesting comments and emails. I've selected some of them
1262
below and added some responses.
1266
<i>aug24 writes on slashdot
1267
"For example, it currently uses entire games to
1268
compare. So if it comes across an unusual opening,
1269
even one close to a standard one, it's not able to
1270
decide effectively. Perhaps something using game
1271
fragments would be possible, then it might
1272
reproduce structured plays even when the previous
1273
game play has been unusual."</i>
1276
It's true that in the essay, the full game is scored from the beginning,
1277
but this is only for convenience. For the actual decisions, solely the
1278
last few moves matter (how many depends on the -w switch), and the extra
1279
score contributions from the beginning are identical for all possible
1280
choices, so cancel out in the final decision. The beginnings of the games
1281
could be cut away before scoring, but this would be more work and would
1282
make no difference in the way dbacl plays chess.
1286
<i>N3Roaster writes on slashdot
1287
"I've got one. I read somewhere that high level
1288
chess players don't view the current game
1289
state so much in terms of exactly where each
1290
piece is, but in terms of piece groupings. I'm
1291
more of a go player myself, but it seems like
1292
that would probably be right. Perhaps a richer
1293
yet more abstract structure representing game
1294
state would be better. In this way, situations
1295
where the structure of play dynamics are the
1296
same even while the key pieces are in
1297
different positions could be picked up on.
1298
This combined with your game fragment notion
1299
would probably allow the chess filter to learn
1303
Other representations besides PGN are certainly possible and worth
1304
trying. Since dbacl learns patterns directly from input text, it
1305
is first necessary to convert the training games into such another
1306
representation, then the current game must also be available in this
1307
representation during play.
1312
<i>pclminion writes on slashdot
1313
"I'm sure people have thought of it before, but
1314
they haven't done it because it's clear that
1315
it can't work unless it's given an infinite
1316
context of moves. Otherwise, you can set the
1317
computer up to fail by creating a situation
1318
near the beginning of the midgame and then
1319
playing out a line that traps the computer as
1320
soon as it runs out of context.
1321
It is also limited to what it learns by
1322
"observing" other games. So it can never move
1323
creatively or unexpectedly."</i>
1326
dbacl does not operate this way. It is not limited to
1327
matching known sequences which were seen in the training
1328
games, but instead predicts likelihoods for never before seen sequences
1329
from its model by generalization. There is no need for an infinite
1330
context of moves, the model is in fact learned from a limited number of
1331
training games, and once learned, it can handle any game sequence
1332
"creatively", since such a sequence is decomposed into known and unknown
1333
fragments and recombined through the rules of probability.
1334
The dbacl chess engine can also move "unexpectedly", either because
1335
the model differs from a player's expectation, or becasue of the
1336
randomization step explained at the end of the essay.
1341
<i>mrthoughtful writes on slashdot
1342
"It would never get to beat modern computer chess
1343
programs, as it depends upon a database of previous
1344
games that are similar to the current game played, and
1345
has no scope for examining possible futures."</i>
1348
While it is true that in the essay, dbacl predicts the next move
1349
by examining the immediate past, there is nothing inherently
1350
impossible about predicting several moves ahead. With some work,
1351
the engine can be modified to generate all the game sequences which
1352
are two or more moves into the future, and dbacl will happily score
1353
each such sequence as well.
1354
However, the fundamental difficulty with
1355
chess is that the number of game sequences grows exponentially with
1356
the number of predicted moves, so the computational cost of predicting
1357
several moves into the future grows very quickly.
1362
<i>Steinfiend writes on slashdot
1364
Chess playing Bayesian filter isn't necessarily 100%
1365
useful now, but what they learned from doing it might
1366
be able to be applied in some other situation. Maybe
1367
they can reapply whatever they learned back into spam
1368
filtering and improve all of our in-boxes."</i>
1371
Perhaps a different way of saying this is that both spam
1372
filtering and this chess playing experiment exhibit common
1373
weaknesses. For example, currently the dbacl chess engine
1374
has difficulty adapting to the middle and end games. This is
1375
difficult because there is no simple boundary between the phases
1376
of a chess game. Similarly, general emails have a complicated structure
1377
but it can be hard to distinguish its semantic parts. A message
1378
can switch from being professional to personal etc. This weakness is
1379
hard to observe in spam filtering, but much easier to see in the
1385
<i>Chris_Mir writes on slashdot
1386
"I can imagine, depending on how many games are played and
1387
the available memory space, that it will develop to have a
1388
decent opening. But nothing more then that." and
1389
aug24 responds "Absolutely, because it is effectively playing with
1390
complete games in order to do well, which is
1391
equivalent to making a tree of every possible chess
1392
move from the start - a rather big data set.</i>
1395
dbacl doesn't remember full trees, it picks up game fragments of
1396
a certain number of moves and uses these to analyze unknown and longer
1397
sequences using Bayesian theory. The openings are learned more quickly
1398
because everyone starts with the same initial board setup, and there
1399
are only a handful of very popular openings in training games.
1404
<i>Flyboy Connor writes on slashdot
1405
"The premise of a Bayesian filter is that is learns
1406
sequences of words, or characters, or whatever. Spam-chess
1407
learns sequences of moves. This premise is wrong, since
1408
good moves are related to complete board positions, not to
1409
what was done in the previous few moves.
1410
Of course, the longer your string of moves is, the better
1411
it will represent the board position, especially during
1412
the opening phase of the game. And the example the article
1413
provides of reasonable play of spam-chess, is actually
1414
from the game's opening, where the learned sequences
1415
indeed represent the complete game.
1416
For the middle game, however, spam chess will perform
1420
This is a good comment. It's true that chess requires an understanding
1421
of the board positions, and the PGN format is very limited in
1422
this respect (however, the PGN sequence is indirectly equivalent
1423
to a full board representation, since it is possible to reconstruct the
1424
board from such a sequence). However, it does not follow that
1425
this approach must always perform badly in the middle game. Chess is
1426
a combination of both strategy and tactics, and tactical thinking is
1427
confined to a small number of consecutive moves, precisely what dbacl
1428
tends to pick up. A good tactician can play well in the middle game,
1429
even if he is a bad strategist.
1434
<i>m50d writes on slashdot
1435
"How about applying it differently, getting it to learn
1436
to pick a move for each position? Obviously that's a
1437
little harder than a simple spam/not spam judgement,
1438
but I'd have thought you could get it to recognise
1439
that if there are 3 pawns in front of the king and you
1440
have a rook available you can do a back rank mate,
1444
This can be done by using an appropriate choice of representation.
1445
The PGN format is not the only possible format for representing chess.
1446
The FEN format represents the full board, and a hybrid representation
1447
could include any features of interest. One could represent each
1448
game as a sequence of FEN positions. dbacl has a limit on the length
1449
of words which makes using straight FEN impractical, but the positions
1450
could just as well be compressed by using a hash function.
1455
<i>barawn writes on slashdot
1456
"I think the main limitation - not being able to
1457
understand values of exchanges - is probably the
1458
biggest problem. It's also the easiest to fix, as you
1459
can just write a program to add the piece that's being
1460
taken (i.e. instead of Bxc4, you'd have BxBc4). The
1461
relative value of each piece can be determined just
1462
from the number of times an exchange happens for
1466
Nice idea. Does anyone want to try?
1471
<i>aridg writes on slashdot
1472
"Reading this article, I was reminded of the old children's
1473
story about "stone soup". You remember that one -- someone
1474
advertises that he can make soup from a stone, and various
1475
others gather around to watch this amazing feat. Well, the
1476
soup needs a little extra seasoning, so he gets someone to
1477
put in some carrots while the stone cooks, then he adds
1478
some onions, etc, etc... I think you can see where this is
1480
Sure you can make a chess playing program from a spam
1482
You just need to throw in a legal move generator, and a
1483
game database, and some capture heuristics, and position
1484
displayer, etc, etc..."</i>
1487
This is an interesting criticism, but is problematic on inspection.
1488
What does it take to play chess? Since a game is defined as a sequence
1489
of legal moves, there is nothing optional about a legal move generator.
1490
Imagine if two people are playing chess, and one tries an illegal
1491
move. The opponent will point out the mistake and refuse to allow it.
1492
The first person will try again until a legal move is found.
1493
Statistically, dbacl could learn to distinguish legal from illegal moves
1494
in any one board position by observing enough training games,
1495
ie the illegal moves can be learned
1496
to have arbitrarily small probabilities. However, in a real game the
1497
opponent won't allow illegal moves, so dbacl would continue to generate
1498
moves until it obtains a legal move. Mathematically, this is equivalent
1499
to using a move generator like SAN, but using SAN is faster. Similar
1500
comments apply to the other criticisms.
1505
<i>lawrenqj writes on slashdot
1506
"Well actually Bayesian filters aught to be able to
1507
discard inserted "noise". I'm not sure how the package
1508
the article pointed out works, but usually the attempt
1509
at pattern recognition accounts for either noise or
1510
randomization and finds appropriate matches despite
1511
small changes. So a bad move on the human's part
1512
should be ignored by the filter. I'm not sure it would
1513
be able to take advantage of the situation though."</i>
1516
This is correct. dbacl builds a statistical model from the training
1517
games. The model handles noise as appropriate via the rules
1518
of probability and the model assumptions. In other words, when
1519
a sequence of moves is analysed, it is decomposed into known
1520
subsequences and unknown subsequences, which are combined through
1521
a Bayesian weighting mechanism which depends on the model. In this
1522
way, all relevant training sequences, even if they only overlap
1523
partially, are used to predict the scores of a game.
1528
<i>blacksky writes on slashdot
1529
"Someone alluded to the Wargames movie/book. Which begs the
1530
question: would you see an improvement if you set the
1531
bayesian filter against itself, and fed the resulting
1532
games back into its knowledge base? Would favouring the
1533
shorter games in this feedback loop improve it further?"</i>
1536
This could be an interesting experiment. The chess engine presented in
1537
the essay is incomplete, it only plays Black, but with some work it
1538
can be completed. I don't know if playing against itself is a good
1539
idea. It might help to extract the likely moves, but it might also
1540
cause degeneration similar to incest.
1545
<i>Chuckstar writes on slashdot
1546
"What if you ignore the 'x' altogether.
1547
Specifically, delete all the 'x's before running anything
1548
through dbacl. Put back the 'x' where necessary in
1549
returning the move to the chess program.
1551
The idea being that whether or not you make a specific
1552
move is not always dependent on whether you make a
1553
capture. Sometimes you just want the piece on that spot on
1554
the board. The program currently treats those two
1555
instances (moving a piece for a to b, and moving a piece
1556
from a while capturing the piece at b) as different moves,
1557
so that the presence of a piece at b alters the outcome of
1558
the decision process."</i>
1561
This is an idea worth trying. However, if dbacl never sees
1562
an 'x' in the training games, then it will assign any move containing
1563
an 'x' during play a low probability, identical for all capturing moves.
1564
This would make it even more reluctant to capture pieces naturally,
1565
and give no way of choosing which pieces should be captured. A simple
1566
way to make things work correctly would be to remove all the 'x'
1568
games and train on both the old games containing 'x', and the same games
1569
where the 'x' is missing.
1574
<i>Eric Towers writes in an email
1575
"[I] wondered if dbacl would play better if instead of learning
1576
multi-move sequences, it learned {board state, next move} pairs."
1580
It's possible. I only tried the simplest things in the essay, but that
1581
approach is very promising. You don't actually have to represent the
1582
full board state, for example you might also represent a slightly
1583
higher level state which tells things like the number and values of
1584
important pieces etc.
1589
<i>John Kipling Lewis writes in an email
1590
"Also, as end games are probably going to be a bit of a bother, it
1591
would be nice to select a sub-section of possible moves when in the
1592
middle game and then compare their LAST 7 moves as an additional
1596
While I didn't do it in the writeup, it is certainly possible to
1597
generate more than a single move ahead and apply the same scoring
1598
method. In principle, up to 7 moves could be generated, although the
1599
exponential branching will always pose a practical problem just as in
1602
It's an interesting question however because it isn't obvious I
1603
think if this might help or not. It might help because it's a
1604
form of educated guessing where the opponent's moves are filled
1605
in from experience with a large collection of games, but it
1606
might not help because it would amount to wishful thinking, that
1607
the opponent behaves like in the training games.
1609
In the best case, it would amount to an effective doubling of the
1611
length from 7 to about 14, since the calculations would involve all the
1612
overlapping 7-move (or less) sequences straddling the current move.
b'\\ No newline at end of file'