~vila/byoci/trunk

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
===============
 babune reborn
===============

This describes re-building babune (the now defunct BAzaar BUilbot NEt) on
top of a byoci setup.

Architecture
============

Babune is composed of a jenkins master, a monitor and slaves providing lxd
containers.

All test jobs occur inside an lxd container, only the push to launchpad
occur on the slaves themselves so that the needed credentials are never
exposed to test jobs.

The monitor is used to create and update the jobs and the slaves from a
version controlled branch.


Conventions
===========

The ``{byoci}.name`` notation used in this file and the project
documentation refers to the configuration options in ``byov.conf``,
``byov.conf.d`` files and scripts.

There are two namespaces, ``testing`` and ``production`` to define the
corresponding byoci setups, so, for example, ``{byoci}.slaves`` refers
to the slaves and will resolve to ``production.slaves`` or
``testing.slaves`` depending on the context.

Actually there is a third namespace ``selftest`` which is only used
internally for tests.

Production
==========

Babune is deployed in production on a private host (FIXME ;).

The people administering babune listed in ``{byoci}.admin.users``
(referenced by their launchpad login).

The bot used to create/update the jobs and the slaves is
``{byoci}.monitor.user``.

The bot owning the trunk and the release branches is
``{byoci}.landing.user``.

Depending on your participation in the various teams, you may or not have
the needed credentials to administer the byoci instances, create or update
jobs, run them, approve branches and land them.

Admins setting up the breezy byoci need to add a stanza in their
``~/.ssh/config`` to provide ssh access to ``byov`` vms on
``ci.breezy-vcs.org``, something like::

  Host 10.79.10.*
     ProxyCommand ssh ci.breezy-vcs.org nc -q0 %h %p

Double-check the IP range with ``lxc network show lxdbr0`` on
``ci.breezy-vcs.org``.

But everybody can setup a local byoci for testing purposes (see `Test
setup`_).

More administrative details available in `doc/admin.rst`.


Jobs
====

Sooo, jenkins is about running jobs.

The jobs for byoci are defined here (this branch) and created/updated in
jenkins with jenkins-job-builder from the openstack project
(http://docs.openstack.org/infra/jenkins-job-builder/) from the `monitor`
with an API token created during `master` setup.

``{{byoci}.monitor}`` (aka a container with a name starting with
``{byoci.prefix}-monitor`` in ``byov.conf`` is the container responsible for
creating, updating and deleting the jobs.

/!\ The production deployment is not yet clearly defined, the instructions
below will need to be updated once it is.

Create your monitor with::

  $ byo-ci-host setup brz-monitor-production

You're good to go, edit job descriptions under 'jobs/' then do:

ubuntu@brz-monitor:~$ cd /byoci && jenkins-jobs update --delete-old jobs/

Note: This a WIP and only applies to people who have the needed credentials,
see `Test setup`_ for how to setup a test environment.


Test setup
==========

Pre-requisites::

# FIXME: This should be ppa:brz/ci -- vila 2015-11-16
  $ sudo add-apt-repository ppa:vila/ci
  $ sudo apt-get update
  $ sudo apt-get install wget ssh lxd/xenial-backports lxd-client/xenial-backports python3-byov python3-byot bzr-byov

A ``{byoci}.secrets`` branch needs to be created to hold the various
credentials and its location made available in ``~/.config/byov`` as
``testing.host.secrets``. The value has to be an absolute path so that tests
can use it to derive ``selftest.host.secrets`` while being isolated.
  
If you've never used lxd before, see 'setup/lxd-on-tools' and make sure your
setup is compatible.

Since the slaves interact with launchpad, they are configured via
``{{byoci}.landing.user}`` for the launchpad login (and for commits from
``{byoci}.landing.fullname`` and ``{byoci}.landing.email}``).

The slaves also need an OAuth token and an ssh key pair (public and
private), the public key being uploaded to launchpad (manually).

For tests, setting ``{{byoci}.landing.user}``,
``{{byoci}.landing.fullname}`` and ``{{byoci}.landing.email}`` to yourself
and being a jenkins master admin is the most convenient. The monitor uses
your API token exported during setup. This options have default values setup
from your current launchpad login in ``bzr``.

To allow the slaves to access launchpad branches and your current byoci
branch, you need to generate a password-less key in the
``testing.secrets`` branch::

  $ ssh-keygen -f ../testing-secrets/slaves/ssh/keys/jenkins@brz-slave-testing -N '' -t rsa -b 4096 -C jenkins@brz-slave-testing

FIXME: byov needs an ssh key on the deploying host (which can be id_rsa
that's the default). If a dedicated key is used, it should be declared in
~/.config/byov/byov.conf. This needs to be a pre-requisite and tested as such
-- vila 2018-02-07
  
Upload the public part to your launchpad account
https://launchpad.net/~/+editsshkeys.

Add the public part to your ~/.ssh/authorized_keys file.

Check that the pre-requisites are ok (report bugs if needed)::

  $ ./testing/pre-requisites

Run the script::
  
  $ ./setup/byoci

This will end displaying the jenkins url, something like:

  Jenkins master is at http://192.168.0.xxx:8080

See `doc/secrets` if you need to deal with landings and jobs requiring
secrets.

Don't forget to label slaves 'production' as needed.

Pending issues
==============

jenkins
=======

IRC bot reporting
-----------------

There is currently no failure reporting to appropriate irc channels.

jenkins UI
----------

- Developers should be able to see the workspace
- API access should be granted for managing views


views
-----

All views are managed manually through the jenkins UI. We need API access to
be able to create/update views specific to each project (including one for
byoci itself).

webhooks
--------

Receiving webhooks on jenkins requires writing some java code. There is an
existing plugin for github that could be used as a starting point (the
comments on the issues are not encouraging though :-/).

It would probably be simpler, cleaner and more reliable to just have a
python app to revceive the webhooks and trigger the jobs. See `brain`_.

job triggering
==============

Most jobs are triggered if some condition is verified. The 'trigger-X' jobs
runs every 5 mins.

There are two main event families we want to react to:

branch
------

If a branch is created or updated, it means a dev may want to run some
tests.

Only changes to known (mostly trunk) branches are handled for now.

review
------

If a review is created or updated, it means one or several devs agreed that
the associated branch should pass some tests (it may happen during the
review discussion or after a specific stage (top approved)).


In both cases, launchpad (or github) webhooks provide such events. We used
to poll for those events for ubuntone, partly from tarmac, partly from
specific code. This caused races and created noise *by design*, time to move
on ;)

<cough> until we get webhooks, approved proposals are checked every 5 minutes.


brain
=====

Jenkins needs to stay as dumb as possible. The least it does, the best
chance there is it'll do it well and reliably.

That's the #1 reason to not use more plugins than strictly necessary.

jenkins runs jobs and keeps test results (rotated as needed).

The scheduling is: one job on one slave at a time, scaling slaves
horizontally enhance the ci bandwith.

This makes the jobs simpler to write: "I have the whole ressources" is
simpler than "I should share with many foreigners doing unknown things".

This also makes the scheduling simpler: one executor per slave. Done.

So the "brain" should be elsewhere.

All jobs can still be run manually so the brain can be down without blocking
the CI service.

It can receive webhooks from launchpad and github and there are plenty of
wsgi and flask repositories on github, for github.

$ xdg-open https://github.com/carlos-jenkins/python-github-webhooks https://github.com/razius/github-webhook-handler https://github.com/bloomberg/python-github-webhook/blob/master/github_webhook/webhook.py

there may be others...

http://eli.thegreenplace.net/2014/07/09/payload-server-in-python-3-for-github-webhooks
seems to capture the smallest implementation.

remote jobs
===========

Some tests happen on different CI sites, we may need to import them, trigger
them or react on their success or failures.