<?xml version="1.0"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml" lang="en"><head><title>BuildBot: build/test automation</title><link href="stylesheet.css" type="text/css" rel="stylesheet" /></head><body bgcolor="white"><h1 class="title">BuildBot: build/test automation</h1><div class="toc"><ol><li><a href="#auto0">Abstract</a></li><li><a href="#auto1">Features</a></li><li><a href="#auto2">Overview</a></li><li><a href="#auto3">Design</a></li><ul><li><a href="#auto4">Build Master</a></li><li><a href="#auto5">Builders and BuildProcesses</a></li><li><a href="#auto6">Build Slaves</a></li><li><a href="#auto7">Build Status</a></li></ul><li><a href="#auto8">Installation</a></li><li><a href="#auto9">Security</a></li><li><a href="#auto10">Inspirations and Competition</a></li><li><a href="#auto11">Current Status</a></li><li><a href="#auto12">Future Directions</a></li><li><a href="#auto13">More Information</a></li></ol></div><div class="content"><span></span><ul><li>Author: Brian Warner &lt;<code>warner@lothar.com</code>&gt;</li><li>BuildBot Home Page:
<a href="http://buildbot.sourceforge.net">http://buildbot.sourceforge.net</a></li></ul><h2>Abstract<a name="auto0"></a></h2><p>The BuildBot is a system to automate the compile/test cycle required by
most software projects to validate code changes. By automatically rebuilding
and testing the tree each time something has changed, build problems are
pinpointed quickly, before other developers are inconvenienced by the
failure. The guilty developer can be identified and harassed without human
intervention. By running the builds on a variety of platforms, developers
who do not have the facilities to test their changes everywhere before
checkin will at least know shortly afterwards whether they have broken the
build or not. Warning counts, lint checks, image size, compile time, and
other build parameters can be tracked over time, are more visible, and are
therefore easier to improve.</p><p>The overall goal is to reduce tree breakage and provide a platform to run
tests or code-quality checks that are too annoying or pedantic for any human
to waste their time with. Developers get immediate (and potentially public)
feedback about their changes, encouraging them to be more careful about
testing before checkin.</p><h2>Features<a name="auto1"></a></h2><ul><li> run builds on a variety of slave platforms</li><li> arbitrary build process: handles projects using C, Python, whatever</li><li> minimal host requirements: Python and Twisted</li><li> slaves can be behind a firewall as long as they can still do a checkout</li><li> status delivery through web page, email, IRC, other protocols</li><li> track builds in progress, provide estimated completion time</li><li> flexible configuration by subclassing generic build process classes</li><li> debug tools to force a new build, submit fake Changes, query slave
status</li><li> released under the GPL</li></ul><h2>Overview<a name="auto2"></a></h2><img src="waterfall.png" alt="waterfall display" height="457" align="right" width="323" /><p>In general, the buildbot watches a source code repository (CVS or other
version control system) for <q>interesting</q> changes to occur, then
triggers builds with various steps (checkout, compile, test, etc.). The
builds are run on a variety of slave machines, to allow testing on different
architectures, compilation against different libraries, kernel versions,
etc. The results of the builds are collected and analyzed: compile succeeded
/ failed / had warnings, which tests passed or failed, memory footprint of
generated executables, total tree size, etc. The results are displayed on a
central web page in a <q>waterfall</q> display: time along the vertical
axis, build platform along the horizontal, <q>now</q> at the top. The
overall build status (red for failing, green for successful) is at the very
top of the page. After developers commit a change, they can check the web
page to watch the various builds complete. They are on the hook until they
see green for all builds: after that point they can reasonably assume that
they did not break anything. If they see red, they can examine the build
logs to find out what they broke.</p><p>The status information can be retrieved by a variety of means. The main
web page is one path, but the underlying Twisted framework allows other
protocols to be used: IRC or email, for example. A live status client (using
Gtk+ or Tkinter) can run on the developer's desktop, with a box per builder
that turns green or red as the builds succeed or fail. Once the build has
run a few times, the build process knows roughly how long it ought to take (by
measuring elapsed time, quantity of text output by the compile process,
searching for text indicating how many unit tests have been run, etc.), so it
can provide a progress bar and ETA display.</p><p>Each build involves a list of <q>Changes</q>: files that were changed
since the last build. If a build fails where it used to succeed, there is a
good chance that one of the Changes is to blame, so the developers who
submitted those Changes are put on the <q>blamelist</q>. The unfortunates on
this list are responsible for fixing their problems, and can be reminded of
this responsibility in increasingly hostile ways. They can receive private
mail, the main web page can put their name up in lights, etc. If the
developers use IRC to communicate, the buildbot can sit in on the channel
and tell developers directly about build status or failures.</p><p>The build master also provides a place where long-term statistics about
the build can be tracked. It is occasionally useful to create a graph
showing how the size of the compiled image or source tree has changed over
months or years: by collecting such metrics on each build and archiving
them, the historical data is available for later processing.</p><h2>Design<a name="auto3"></a></h2><p>The BuildBot consists of a master and a set of build slaves. The master
runs on any conveniently accessible host: it provides the status web server
and must be reachable by the build slaves, so for public projects it should
be reachable from the general internet. The slaves connect to the master and
actually perform the builds: they can be behind a firewall as long as they
can reach the master and check out source files.</p><h3>Build Master<a name="auto4"></a></h3><img src="overview.png" alt="overview diagram" height="383" width="595" /><p>The master receives information about changed source files from various
sources: it can connect to a CVSToys server, or watch a mailbox that is
subscribed to a CVS commit list of the type commonly provided for widely
distributed development projects. New forms of change notification (e.g. for
other version control systems) can be handled by writing an appropriate
class; each one is responsible for creating Change objects and delivering them
to the ChangeMaster service inside the master.</p><p>The build master is given a working directory where it is allowed to save
persistent information. It is told which TCP ports to use for slave
connections, status client connections, the built-in HTTP server, etc. The
master is also given a list of <q>build slaves</q> that are allowed to
connect, described below. Each slave gets a name and a password to use. The
buildbot administrator must give a password to each person who runs a build
slave.</p><p>The build master is the central point of control. All the decisions about
what gets built are made there, all the file change notices are sent there,
and all the status information is distributed from there. Build slave
configuration is minimal: everything is controlled on the master side by the
buildbot administrator. On the other hand, the build master does no actual
compilation or testing. It does not have to be able to check out or build the
tree. The build slaves are responsible for doing any work that actually
touches the project's source code.</p><h3>Builders and BuildProcesses<a name="auto5"></a></h3><p>Each <q>build process</q> is defined by an instance of a Builder class
which receives a copy of every Change that goes into the repository. It gets
to decide which changes are interesting (e.g. a Builder which only compiles
C code could ignore changes to documentation files). It can decide how long
to wait before starting the build: a quick build that just updates the files
that were changed (and will probably finish quickly) could start after 30
seconds, whereas a full build (remove the old tree, check out a new tree,
compile everything, test everything) would want to wait longer. The default
10-minute delay gives developers a chance to finish checking in a set of
related files while still providing timely feedback about the consequences
of those changes.</p><p>Once the build is started, the build process controls how it proceeds
with a series of BuildSteps, which are things like shell commands, CVS
update or checkout commands, etc. Each BuildStep can invoke SlaveCommands on
a connected slave. One generic command is ShellCommand, which takes a
string, hands it to <code>/bin/sh</code>, and returns the exit status and
stdout/stderr text. Other commands are layered on top of ShellCommand:
CVSCheckout, MakeTarget, CountKLOC, and so on. Some operations are faster or
easier to do with Python code on the slave side; others are easier to do on
the master side.</p><p>The Builder walks through a state machine, starting BuildSteps and
receiving a callback when they complete. Steps which fail may stop the
overall build (if the CVS checkout fails, there is little point in
attempting a compile), or may allow it to continue (unit tests could fail
but documentation may still be buildable). When the last step finishes, the
entire build is complete, and a function combines the completion status of
all the steps to decide how the overall build should be described:
successful, failing, or somewhere in between.</p><p>At each point in the build cycle (waiting to build, starting build,
starting a BuildStep, finishing the build), status information is delivered
to a special Status object. This information is used to update the main
status web page, and can be delivered to real-time status clients that are
attached at that moment. Intermediate status (stdout from a ShellCommand,
for example) is also delivered while the Step runs. This status can be used
to estimate how long the individual Step (or the overall build) has left
before it is finished, so an ETA can be listed on the web page or in the
status client.</p><p>The build master is persisted to disk when it is stopped with SIGINT,
preserving the status and historical build statistics.</p><p>Builders are set up by the buildbot administrator. Each one gets a name
and a BuildProcess object (which may be parameterized with things like which
CVS repository to use, which targets to build, which version of Python or
gcc it should use, etc.). Builders are also assigned to a BuildSlave,
described below. In the current implementation, Builders are defined by
adding lines to the setup script, but an HTML-based <q>create a builder</q>
scheme is planned for the future.</p><h3>Build Slaves<a name="auto6"></a></h3><p>BuildSlaves are where the actual compilation and testing get done. They
are programs which run on a variety of platforms, and communicate with the
BuildMaster over TCP connections.</p><p>Each build slave is given a name and a working directory. It is also
given the buildmaster's contact information: hostname, port number, and a
password. This information must come from the buildbot administrator, who
has created a corresponding entry in the buildmaster. The password exists to
make it clear that build slave operators need to coordinate with the
buildbot administrator.</p><p>When the Builders are created, they are given a name (like
<q>quick-bsd</q> or <q>full-linux</q>), and are tied to a particular slave.
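</p><p>The bookkeeping described so far (a name and password for each permitted slave, and a name and an assigned slave for each Builder) might be modelled roughly as below. This is only a sketch under stated assumptions, not the buildmaster's actual data structures: <code>SLAVE_PASSWORDS</code>, <code>BUILDER_SLAVES</code>, the slave names, the passwords, and both functions are hypothetical.</p>
<pre class="python">
```python
import hmac

# Hypothetical master-side tables. The slave names and passwords are
# invented; the builder names are the examples used in the text.
SLAVE_PASSWORDS = {"bsd-box": "sekrit1", "linux-box": "sekrit2"}
BUILDER_SLAVES = {"quick-bsd": "bsd-box", "full-linux": "linux-box"}

def may_connect(slave_name, password):
    """Accept a slave only if the administrator registered it."""
    expected = SLAVE_PASSWORDS.get(slave_name)
    if expected is None:
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode(), password.encode())

def builders_for(slave_name):
    """List the Builders tied to the given slave, sorted by name."""
    return sorted(b for b, s in BUILDER_SLAVES.items() if s == slave_name)
```
</pre><p>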
When that slave comes online, a RemoteBuilder object is created inside it,
where all the SlaveCommands are run. Each RemoteBuilder gets a separate
subdirectory inside the slave's working directory. Multiple Builders can
share the same slave: typically all Builders for a given architecture would
run inside the same slave.</p><img src="slave.png" alt="overview diagram" height="354" width="595" /><h3>Build Status<a name="auto7"></a></h3><p>The waterfall display holds short-term historical build status.
Developers can easily see what the buildbot is doing right now, how long it
will be until the current build is finished, and what the results of
each step of the build process are. Change comments and compile/test logs are
one click away. The top row shows overall status: green is good, red is bad,
yellow is a build still in progress.</p><p>Also available through the web page is information on the individual
builders: static information like what purpose the builder serves (specified
by the admin when configuring the buildmaster), and non-build status
information like which build slave it wants to use, whether the slave is
online or not, and how frequently the build has succeeded in the last 10
attempts. Build slave information is available here too, both data provided
by the build slave operator (which machine the slave is running on, who to
thank for the compute cycles being donated) and data extracted from the
system automatically (CPU type, OS name, versions of various build
tools).</p><p>The live status client shows the results of the last build, but does not
otherwise show historical information. It provides extensive information
about the current build: overall ETA, individual step ETA, and data about what
changes are being processed. It will be possible to get at the error logs
from the last build through this interface.</p><p>Eventually, e-mail and IRC notices can be sent when builds have succeeded
or failed. Mail messages can include the compile/test logs or summaries
thereof. The buildmaster can sit on the IRC channel and accept queries about
current build status, such as <q>how long until the current build
finishes</q>, or <q>what tests are currently failing</q>.</p><p>Other status displays are possible. Test and compile errors can be
tracked by filename or test case name, providing a view of how that one file
has fared over time. Errors can be tracked by username, giving a history of
how one developer has affected the build over time. </p><h2>Installation<a name="auto8"></a></h2><p>The buildbot administrator will find a publicly reachable machine to
host the buildmaster. They decide upon the BuildProcesses to be run, and
create the Builders that use them. Creating complex build processes will
involve writing a new Python class to implement the necessary
decision-making, but it will be possible to create simple ones like
<q>checkout, make, make test</q> from the command line or through a
web-based configuration interface. They also decide upon what forms of
status notification should be used: what TCP port should be used for the web
server, where mail should be sent, what IRC channels should receive
success/failure messages.</p><p>Next, they need to arrange for change notification. If the repository is
using <a href="http://purl.net/net/CVSToys">CVSToys</a>, then they simply
tell the buildmaster the host, port, and login information for the CVSToys
server. When the buildmaster starts up, it will contact the server and
subscribe to hear about all CVS changes. If not, a <q>cvs-commits</q>
mailing list is needed. Most large projects have such a list: every time a
change is committed, an email containing details about what was changed and
why (the checkin comments) is sent to everyone on the list. The admin
should subscribe to this list, and dump the resulting mail into a
qmail-style <q>maildir</q>. (It doesn't matter who is subscribed: it could
be the admin themselves or a buildbot-specific account, just as long as the
mail winds up in the right place.) Then they tell the buildmaster to monitor
that maildir. Each time a message arrives, it will be parsed, and the
contents used to trigger the build process. All forms of CVS notification
include a filtering prefix, to tell the buildmaster it should ignore commits
outside a certain directory. This is useful if the repository is used for
multiple projects.</p><p>Finally, they need to arrange for build slaves. Some projects use
dedicated machines for this purpose, but many do not have that luxury and
simply use developers' personal workstations. Projects that would benefit
from testing on multiple platforms will want to find build slaves on a
variety of operating systems. Frequently these build slaves are run by
volunteers or developers involved in the project who have access to the
right equipment. The admin will give each of these people a name/password
for their build slave, as well as the location (host/port) of the
buildmaster. The build slave owners simply start a process on their systems
with the appropriate parameters and it will connect to the build master.</p><p>Both the build master and the build slaves are Twisted
<code>Application</code> instances. A <code>.tap</code> file holds the
pickled application state, and a daemon-launching program called
<code>twistd</code> is used to start the process, detach from the current
tty, log output to a file, etc. When the program is terminated, it saves its
state to another <code>.tap</code> file. Next time, <code>twistd</code> is
told to start from that file and the application will be restarted exactly
where it left off.</p><h2>Security<a name="auto9"></a></h2><p>The master is intended to be publicly available, but of course
limitations can be put on it for private projects. User accounts and
passwords can be required for live status clients that want to connect, or
the master can allow arbitrary anonymous access to status information.
Twisted's <q>Perspective Broker</q> RPC system and careful design provide
security for the real-time status client port: those clients are read-only,
and cannot do anything to disrupt the build master or the build processes
running on the slaves.</p><p>Build slaves each have a name and password, and typically the project
coordinator would provide these to developers or volunteers who wish to
offer a host machine for builds. The build slaves connect to the master, so
they can be behind a firewall or NAT box, as long as they can still do a
checkout and compile. Registering build slaves helps prevent DoS attacks
where idiots attach fake build slaves that are not actually capable of
performing the build, displacing the actual slave.</p><p>Running a build slave on your machine is equivalent to giving a local
account to everyone who can commit code to the repository. Any such
developer could add an <q><code>rm -rf /</code></q> or code to start a
remotely accessible shell to a Makefile and then do naughty things with the
account under which the build slave was launched. If this is a concern, the
build slave can be run inside a chroot jail or some other sandbox (like a
user-mode-linux sub-kernel), as long as it is still capable of checking out
a tree and running all commands necessary for the build.</p><h2>Inspirations and Competition<a name="auto10"></a></h2><p>Buildbot was originally inspired by Mozilla's Tinderbox project, but is
intended to conserve resources better (Tinderbox uses dedicated build
machines to continually rebuild the tree; buildbot only rebuilds when
something has changed, and not even then for some builds) and deliver more
useful status information. I've seen other projects with similar goals
(CruiseControl on SourceForge is a Java-based one), but I believe this one
is more flexible.</p><h2>Current Status<a name="auto11"></a></h2><p>Buildbot is currently under development. Basic builds, web-based status
reporting, and a basic Gtk+-based real-time status client are all
functional. More work is being done to make the build process more flexible
and easier to configure, add better status reporting, and add new kinds of
build steps. An instance has been running against the Twisted source tree
(which includes extensive unit tests) since February 2003.</p><h2>Future Directions<a name="auto12"></a></h2><p>Once the configuration process is streamlined and a release is made, the
next major feature is the <q>try</q> command. This will be a tool to which
the developer can submit a series of <em>potential</em> changes, before
they are actually checked in. <q>try</q> will assemble the changed and/or
new files and deliver them to the build master, which will then initiate a
build cycle with the current tree plus the potential changes. This build is
private, just for the developer who requested it, so failures will not be
announced publicly. It will run all the usual tests from a full build and
report the results back to the developer. This way, a developer can verify
their changes, on more platforms than they directly have access to, with a
single command. By making it easy to thoroughly test their changes before
checkin, developers will have no excuse for breaking the build.</p><p>For projects that have unit tests which can be broken up into individual
test cases, the BuildProcess will have some steps to track each test case
separately. Developers will be able to look at the history of individual
tests, to find out things like <q>test_import passed until foo.c was changed
on Monday, then failed until bar.c was changed last night</q>. This can also
be used to make breaking a previously-passing test a higher crime than
failing to fix an already-broken one. It can also help to detect
intermittent failures, ones that need to be fixed but which can't be blamed
on the last developer to commit changes. For test cases that represent new
functionality which has not yet been implemented, the list of failing test
cases can serve as a convenient TODO list.</p><p>If a large number of changes occur at the same time and the build fails
afterwards, a clever process could try modifying one file (or one
developer's files) at a time, to find the one which is the actual cause of the
failure. Intermittent test failures could be identified by re-running the
failing test a number of times, looking for changes in the results.</p><p>Project-specific methods can be developed to identify the guilty
developer more precisely, for example grepping through source files for a
<q>Maintainer</q> tag, or consulting a static table of module owners. Build failures
could be reported to the owner of the module as well as the developer who
made the offending change.</p><p>The Builder could update entries in a bug database automatically: a
change could have comments which claim it <q>fixes #12345</q>, so the bug DB is
queried to find out that test case ABC should be used to verify the bug. If
test ABC was failing before and now passes, the bug DB can be told to mark
#12345 as machine-verified. Such entries could also be used to identify
which tests to run, for a quick build that wasn't running the entire test
suite.</p><p>The Buildbot could be integrated into the release cycle: once per week,
any build which passes a full test suite is automatically tagged and release
tarballs are created.</p><p>It should be possible to create and configure the Builders from the main
status web page, at least for processes that use a generic <q>checkout /
make / make test</q> sequence. Twisted's <q>Woven</q> framework provides a
powerful HTML tool that could be used to create the necessary controls.</p><p>If the master or a slave is interrupted during a build, it is frequently
possible to re-start the interrupted build. Some steps can simply be
re-invoked (<q>make</q> or <q>cvs update</q>). Interrupting others may
require the entire build to be re-started from scratch (<q>cvs export</q>).
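</p><p>The distinction drawn above, between steps that can simply be re-invoked and steps that force a restart from scratch, can be sketched in a few lines of Python. This is an illustrative sketch only, not Buildbot code: the <code>Step</code> tuple, its <code>reinvokable</code> flag, the <code>resume_index</code> function, and the <code>rm -rf build</code> step are all hypothetical.</p>
<pre class="python">
```python
from collections import namedtuple

# Hypothetical description of a build step: a shell command plus a flag
# saying whether it is safe to simply run it again after an interruption.
Step = namedtuple("Step", ["command", "reinvokable"])

def resume_index(steps, interrupted):
    """Return the index of the step at which an interrupted build resumes.

    A re-invokable step is simply run again; any other step forces the
    whole build to start over from the first step.
    """
    if steps[interrupted].reinvokable:
        return interrupted
    return 0

full_build = [
    Step("rm -rf build", True),
    Step("cvs export", False),
    Step("make", True),
]

print(resume_index(full_build, 2))  # interrupted during make: prints 2
print(resume_index(full_build, 1))  # interrupted during cvs export: prints 0
```
</pre><p>Here an interruption during <code>make</code> resumes at that step, while an interruption during <code>cvs export</code> sends the build back to its first step.</p><p>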
The Buildbot will be extended so that both master and slaves can report to
the other what happened while they were disconnected, allowing as much work
as possible to be salvaged.</p><h2>More Information<a name="auto13"></a></h2><p>The BuildBot home page is at <a href="http://buildbot.sourceforge.net">http://buildbot.sourceforge.net</a>,
and has pointers to publicly visible BuildBot installations. Mailing
lists, bug reporting, and of course source downloads are reachable from that
page. </p><!-- $Id: buildbot.lore,v 1.1 2003/03/19 01:27:51 warner Exp $ --></div></body></html>