~rogpeppe/juju-core/axwalk-lp1300889-disable-mongo-keyfile : revision 773.7.1

Charms in action
================

This document describes the behaviour of the Go implementation of the unit
agent, whose behaviour differs in some respects from that of the Python
implementation. This information is largely relevant to charm authors, and
potentially to developers interested in the unit agent.

Hooks
-----

A charm's direct action is entirely defined by its hooks. Hooks are executable
files in a charm's hooks directory; hooks with particular names will be invoked
by juju at particular times, and thereby cause the charm to run a unit of its
service and respond to changes in its environment.

Whenever a hook-worthy action takes place, juju tries to run the hook with the
appropriate name. If the hook doesn't exist, juju continues without error; if
it exists, it is invoked without arguments in a specific environment, and its
output is written to the unit's log. If it returns a non-zero exit code, juju
puts the unit into an error state and awaits resolution; otherwise it continues
to process environment changes as before.
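
The rules in the previous paragraph can be sketched in a few lines of Python.
This is only an illustration of the described behaviour, not juju's actual
implementation: `run_hook` and `demo` are hypothetical names, the hooks are
trivial shell scripts, and a POSIX system with /bin/sh is assumed.

```python
import os
import subprocess
import tempfile


def run_hook(charm_dir, name):
    """Run one hook as the text describes; return "skipped", "ok" or "error"."""
    path = os.path.join(charm_dir, "hooks", name)
    if not os.path.exists(path):
        # A missing hook is not an error: carry on as if nothing happened.
        return "skipped"
    # The hook gets no arguments and runs in the charm directory. Its output
    # is captured here; juju would write it to the unit's log instead.
    result = subprocess.run(
        [path],
        cwd=charm_dir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # A non-zero exit code is what puts the unit into an error state.
    return "ok" if result.returncode == 0 else "error"


def demo():
    """Build a throwaway charm with two shell hooks and try three hook names."""
    with tempfile.TemporaryDirectory() as charm_dir:
        os.mkdir(os.path.join(charm_dir, "hooks"))
        for name, exit_code in [("start", 0), ("config-changed", 1)]:
            path = os.path.join(charm_dir, "hooks", name)
            with open(path, "w") as f:
                f.write("#!/bin/sh\necho running %s\nexit %d\n" % (name, exit_code))
            os.chmod(path, 0o755)
        names = ["install", "start", "config-changed"]
        return {name: run_hook(charm_dir, name) for name in names}
```

In `demo`, the install hook does not exist and is skipped, the start hook
succeeds, and the config-changed hook exits non-zero and is reported as an
error.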

In general, a unit will run hooks in a clear sequence, about which a number of
useful guarantees can and will be made. All such guarantees come with the caveat
that there is [TODO: will be: `remove-unit --force`] a mechanism for forcible
termination of a unit, and that a unit so terminated will just stop, dead, and
completely fail to run anything else ever again. This shouldn't actually be a
big deal in practice.

Errors in hooks
---------------

Hooks should be idempotent, because they can fail, and may need to be re-executed
from scratch. As a hook author, you don't have complete control over the times
your hook might be stopped: if the unit agent process is killed for any reason
while running a hook, then when it recovers it will treat that hook as having
failed, just as if it had returned a non-zero exit code, and request user
intervention.

It is unrealistic to expect great sophistication on the part of the average user,
and as a charm author you should expect that users will attempt to re-execute
failed hooks before attempting to investigate or understand the situation. You
should therefore make every effort to ensure your hooks are idempotent when
aborted and restarted.

[TODO: I have a vague feeling that `juju resolved` actually defaults to "just
pretend the hook ran successfully" mode. I'm not sure that's really the best
default, but I'm also not sure we're in a position to change the UI that much.]

The most sophisticated charms will consider the nature of their operations with
care, and will be prepared to internally retry any operations they suspect of
having failed transiently, to ensure that they only request user intervention in
the most trying circumstances; they will also be careful to log any relevant
information or advice before signalling the error.
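
One way a hook written in Python might retry internally is sketched below. The
names `TransientError` and `retry` are hypothetical, and deciding which
failures count as transient is left to the charm author; nothing here is
provided by juju.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (a flaky download, say)."""


def retry(operation, attempts=3, delay=0.0, log=print):
    """Call operation() until it succeeds or the attempts run out.

    Only TransientError is retried; any other exception propagates at once.
    Each failure is logged, so the information is there before the hook
    finally signals an error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError as err:
            log("attempt %d of %d failed: %s" % (attempt, attempts, err))
            if attempt == attempts:
                raise
            time.sleep(delay)
```

A hook would wrap only idempotent operations in `retry`, and on final failure
would log any advice for the user before exiting with a non-zero code.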

[TODO: I just thought; it would be really nice to have a juju-fail hook tool,
which would allow charm authors to explicitly set the unit's error status to
something a bit more sophisticated than "X hook failed". Wishlist, really.]

Charm deployment
----------------

* A charm is deployed into a directory, managed by git, that is entirely
  owned and controlled by juju.
* At certain times, control of the directory is ceded to the charm (by
  running a hook) or to the user (by entering an error state).
* At these times, and only at these times, should the charm directory be
  used by anything other than juju itself.

The most important consequence of this is that it is a mistake to conflate the
state of the charm with the state of the software deployed by the charm: it's
fine to store *charm* state in the charm directory, but the charm must deploy
its actual software elsewhere on the system.

To put it another way: once a charm has put the system into a state, the system
should remain in that state completely independently of the charm's existence,
and there is no mechanism by which deployed software can safely feed information
back into the charm of its own accord.

[TODO: this sucks a bit. We have plans for a tool called `juju-run`, which
would allow an arbitrary script to be invoked as though it were a hook at any
time (well, it'd block until no other hook were running, but still). Probably
isn't even that hard but it's still rolling around my brain, might either click
soon or be overridden by higher priorities and be left for ages. I'm less sure,
but have a suspicion, that `juju ssh <unit>` should also default to a juju-run
environment: primarily because, without this, in the context of forced upgrades,
the system cannot offer *any* guarantees about what it might suddenly do to the
charm directory while the user's doing things with it. The alternative is to
allow unguarded ssh, but tell people that they have to use something like
`juju-run --interactive` before they modify the charm dir; this feels somewhat
user-hostile, though.]

Execution environment
---------------------

Every hook is run in the deployed charm directory, in an environment with the
following characteristics:

* $PATH is prefixed by a directory containing command line tools through
  which the hooks can interact with juju.
* $CHARM_DIR holds the path to the charm directory.
* $JUJU_UNIT_NAME holds the name of the local unit.
* $JUJU_CONTEXT_ID and $JUJU_AGENT_SOCKET are set (but should not be messed
  with: the command line tools won't work without them).
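
As a rough sketch, a hook written in Python could read this environment as
follows. The function name and the returned keys are hypothetical; the variable
names are the ones listed above, and the tools directory is assumed to be the
first entry of $PATH.

```python
import os

# Variables the text says are set for every hook.
REQUIRED = ("CHARM_DIR", "JUJU_UNIT_NAME", "JUJU_CONTEXT_ID", "JUJU_AGENT_SOCKET")


def hook_environment(environ=None):
    """Summarise the hook environment, or fail clearly when run outside a hook."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise RuntimeError("not running inside a hook; missing: " + ", ".join(missing))
    return {
        "charm_dir": env["CHARM_DIR"],
        "unit_name": env["JUJU_UNIT_NAME"],
        # $PATH is prefixed with the hook tools directory, so it comes first.
        "tools_dir": env.get("PATH", "").split(os.pathsep)[0],
    }
```

The context id and agent socket are only checked for presence and are never
modified, in keeping with the warning above.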

Hook tools
----------

All hooks can directly use the following tools:

* juju-log (write arguments directly to juju's log; potentially redundant,
  since hook output is all logged anyway, but --debug may remain useful)
* unit-get (returns the local unit's private-address or public-address)
* open-port (marks the supplied port/protocol as ready to open when the
  service is exposed)
* close-port (reverses the effect of open-port)
* config-get (get current service configuration values)
* relation-get (get the settings of some related unit)
* relation-set (write the local unit's relation settings)
* relation-ids (list all relations using a given charm relation)
* relation-list (list all units of a related service)

Within the context of a single hook execution, the above tools present a
sandboxed view of the system with the following properties:

* Any data retrieved corresponds to the real value of the underlying state at
  some point in time.
* Data is only written back to state when the hook completes without error;
  changes made by a failing hook will be discarded.
* Not actually sandboxed: open-port and close-port operate directly on state.
  [TODO: this is a stupid bug that I must capture properly.]
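
Since the hook tools are ordinary commands found on $PATH, a hook written in
Python can call them as subprocesses. The wrapper below is a minimal sketch:
the function names are hypothetical, and the only command lines used are
`unit-get private-address` and `open-port <port>/<protocol>`, as described in
the list above. The `run` parameter exists so the wrapper can be exercised
outside a real hook.

```python
import subprocess


def hook_tool(name, *args, run=subprocess.check_output):
    """Invoke a hook tool by name and return its output as stripped text."""
    output = run([name] + list(args))
    if isinstance(output, bytes):
        output = output.decode("utf-8")
    return output.strip()


def private_address(run=subprocess.check_output):
    """Ask unit-get for the local unit's private address."""
    return hook_tool("unit-get", "private-address", run=run)


def open_port(port, protocol="tcp", run=subprocess.check_output):
    """Mark port/protocol as ready to open when the service is exposed."""
    hook_tool("open-port", "%d/%s" % (port, protocol), run=run)
```

Because `subprocess.check_output` raises an exception when a command exits
non-zero, a failing tool will by default make the hook fail too, unless the
hook catches the exception.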

Hook kinds
----------

There are 5 `unit hooks` with predefined names that can be implemented by any
charm:

* install
* start
* config-changed
* upgrade-charm
* stop

For every relation defined by a charm, an additional 4 `relation hooks` can be
implemented, named after the charm relation:

* <name>-relation-joined
* <name>-relation-changed
* <name>-relation-departed
* <name>-relation-broken
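
The naming scheme above is mechanical, so the full set of hook names a charm
may implement can be computed from its relation names. A small sketch, with
hypothetical function names:

```python
UNIT_HOOKS = ["install", "start", "config-changed", "upgrade-charm", "stop"]
RELATION_HOOK_KINDS = ["joined", "changed", "departed", "broken"]


def relation_hooks(relation_name):
    """Return the four hook names for one charm relation."""
    return ["%s-relation-%s" % (relation_name, kind) for kind in RELATION_HOOK_KINDS]


def all_hook_names(relation_names):
    """Return every hook name a charm with these relations may implement."""
    names = list(UNIT_HOOKS)
    for relation_name in relation_names:
        names.extend(relation_hooks(relation_name))
    return names
```

A charm with two relations can therefore implement up to 13 hooks: the 5 unit
hooks plus 4 for each relation.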

Unit hooks
----------

The `install` hook always runs once, and only once, before any other hook.

The `start` hook always runs once immediately after the install hook; there are
currently no other circumstances in which it will be called, but this may change
in the future.

The `config-changed` hook always runs once immediately after the start hook,
and likewise after the upgrade-charm hook. It also runs whenever the service
configuration changes, and when recovering from transient errors.

The `upgrade-charm` hook always runs once immediately after the charm directory
contents have been changed by an unforced charm upgrade operation, and *may* do
so after a forced upgrade; but will *not* be run after a forced upgrade from an
existing error state.

The `stop` hook is the last hook to be run before the unit is destroyed. In the
future, it may be called in other situations.

In normal operation, a unit will run at least the install, start, config-changed
and stop hooks over the course of its lifetime.

Note that, while all hook tools are available to all hooks, the relation-*
tools are not useful to the install, start, and stop hooks; this is because
the first two are run before the unit has any opportunity to participate in
any relations, and the stop hook will not be run while the unit is still
participating in one.
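
The ordering guarantees for unit hooks can be illustrated with a small model.
This sketch covers only the error-free case with unforced upgrades, and ignores
relation hooks entirely; the event names "config" and "upgrade" are
hypothetical labels chosen for the example.

```python
def unit_hook_sequence(events):
    """Return the unit hooks run over a unit's lifetime, in order.

    events is a list of "config" (service configuration changed) and
    "upgrade" (unforced charm upgrade) entries, in the order they occur.
    """
    # install runs first, start follows it, config-changed follows start.
    hooks = ["install", "start", "config-changed"]
    for event in events:
        if event == "config":
            hooks.append("config-changed")
        elif event == "upgrade":
            # config-changed also runs immediately after upgrade-charm.
            hooks.extend(["upgrade-charm", "config-changed"])
        else:
            raise ValueError("unknown event: %r" % (event,))
    # stop is the last hook to run before the unit is destroyed.
    hooks.append("stop")
    return hooks
```

With no events at all, the result is the minimal lifetime described above:
install, start, config-changed, stop.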

Relation hooks
--------------

For each charm relation, any or all of the 4 relation hooks can be implemented.
Relation hooks operate in an environment slightly different to that of unit
hooks, in the following ways:

* JUJU_RELATION is set to the name of the charm relation. This is of limited
  value, because every relation hook already "knows" what charm relation it
  was written for; that is, in the "foo-relation-joined" hook, JUJU_RELATION
  is "foo".
* JUJU_RELATION_ID is more useful, because it serves as a unique identifier for
  a particular relation, and thereby allows the charm to handle distinct
  relations over a single endpoint. In hooks for the "foo" charm relation,
  JUJU_RELATION_ID always has the form "foo:<id>", where id uniquely but
  opaquely identifies the runtime relation currently in play.
* The relation-* hook tools, which ordinarily require that a relation be
  specified, assume they're being called with respect to the current
  relation. The default can of course be overridden as usual.

Furthermore, all relation hooks except relation-broken are notifications about
some specific unit of a related service, and operate in an environment with the
following additional properties:

* JUJU_REMOTE_UNIT is set to the name of the current related unit.
* The relation-get hook tool, which ordinarily requires that a related unit
  be specified, assumes that it is being called with respect to the current
  related unit. The default can of course be overridden as usual.
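
A relation hook written in Python might gather these variables as sketched
below. The function name and returned keys are hypothetical. The relation id is
treated as opaque apart from its documented "<name>:" prefix, and an unset or
empty JUJU_REMOTE_UNIT (as assumed here for relation-broken) is reported as
None.

```python
import os


def relation_context(environ=None):
    """Collect the relation-specific variables available to a relation hook."""
    env = os.environ if environ is None else environ
    relation = env["JUJU_RELATION"]
    relation_id = env["JUJU_RELATION_ID"]
    # The id has the form "<relation>:<id>"; the part after the colon is opaque.
    if not relation_id.startswith(relation + ":"):
        raise ValueError("unexpected relation id: %r" % (relation_id,))
    return {
        "relation": relation,
        "relation_id": relation_id,
        # Assumption: missing or empty in relation-broken, which is not about
        # any particular related unit.
        "remote_unit": env.get("JUJU_REMOTE_UNIT") or None,
    }
```

Keying any per-relation charm state on the whole `relation_id` string, rather
than on the relation name, lets a charm handle several distinct relations over
the same endpoint.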

For every relation in which a unit participates, hooks for the appropriate charm
relation are run according to the following rules.

The "relation-joined" hook always runs once when a related unit is first seen.

The "relation-changed" hook for a given unit always runs once immediately
following the relation-joined hook for that unit, and subsequently whenever
the related unit changes its settings (by calling relation-set and exiting
without error). Note that "immediately" only applies within the context of
this particular runtime relation -- that is, when "foo-relation-joined" is
run for unit "bar/99" in relation id "foo:123", the only guarantee is that
the next hook to be run *in relation id "foo:123"* will be "foo-relation-changed"
for "bar/99". Unit hooks may intervene, as may hooks for other relations,
and even for other "foo" relations.

The "relation-departed" hook for a given unit always runs once when a related
unit is no longer related. After the "relation-departed" hook has run, no
further notifications will be received from that unit; however, its settings
will remain accessible via relation-get for the complete lifetime of the
relation.

The "relation-broken" hook is not specific to any unit, and always runs once
when the local unit is ready to depart the relation itself. Before this hook
is run, a relation-departed hook will be executed for every unit known to be
related; it will never run while the relation appears to have members, but it
may be the first and only hook to run for a given relation. The stop hook will
not run while relations remain to be broken.

Relations in depth
------------------

A unit's `scope` consists of the group of units that are transitively connected
to that unit within a particular relation. So, for a globally-scoped relation,
that means every unit of each service in the relation; for a locally-scoped
relation, it means only those sets of units which are deployed alongside one
another. That is to say: a globally-scoped relation has a single unit scope,
whilst a locally-scoped relation has one for each principal unit.

When a unit becomes aware that it is a member of a relation, its only
self-directed action is to `join` its scope within that relation. This involves
two steps:

* Write initial relation settings (just one value, "private-address"), to
  ensure that they will be available to observers before they're triggered
  by the next step;
* Signal its existence, and role in the relation, to the rest of the system.

The unit then starts observing and reacting to any other units in its scope
which are playing a role in which it is interested. (Providers and requirers
only observe each other; peers observe all peers which are not themselves.)

Now, suppose that unit was the very first unit to join the relation; and let's
say it's a requirer. No provider units are present, so no hooks will fire. But,
when a provider unit joins the relation, the requirer and provider become aware
of each other almost simultaneously. (Similarly, the first two units in a peer
relation become aware of each other almost simultaneously.)

So, concurrently, the units on each side of the relation run their relation-joined
and relation-changed hooks with respect to their counterpart. The intent is that
they communicate appropriate information to each other to set up some sort of
connection, by using the relation-set and relation-get hook tools; but neither
unit is safe to assume that any particular setting has yet been set by its
counterpart.

This sounds kinda tricky to deal with, but merely requires disciplined use of
the relation-get tool. Advanced users may on occasion have cause to break the
following rules, but such situations are outside the scope of this document.

* In relation-changed hooks, use relation-get freely on units of the current
  relation (those returned by calling relation-list without arguments), but
  be prepared for missing data. This is best handled by exiting silently,
  without error: there is no cause for alarm, because the hook will be run
  again as soon as the related unit writes its settings.
* In relation-joined hooks, use only `relation-get private-address`.
* Otherwise, don't use relation-get.

In general, where it is practical to do so, it is also a good idea to restrict
use of relation-set to the relation-joined hook. This is by no means required,
or even guaranteed to be possible, but it is the usage pattern that integrates
most smoothly with the system, and is most likely to avoid causing fruitlessly
repeated executions of relation-changed hooks by counterpart units.
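
The rule for relation-changed hooks (read freely, but exit quietly when data is
missing) might look like this in a hook written in Python. It is a sketch
only: `settings` stands for whatever the hook has already fetched with
relation-get, and the keys "host" and "port" are hypothetical examples of
settings a counterpart might publish.

```python
def ready_settings(settings, required):
    """Return the required settings, or None if any is still missing or empty."""
    missing = [key for key in required if not settings.get(key)]
    if missing:
        return None
    return {key: settings[key] for key in required}


def relation_changed(settings, configure, required=("host", "port")):
    """Body of a relation-changed hook; returns the hook's exit code."""
    wanted = ready_settings(settings, required)
    if wanted is None:
        # Not an error: the hook runs again once the related unit writes
        # its settings, so exit silently with success.
        return 0
    configure(wanted)
    return 0
```

Note that `configure` itself should be idempotent, since relation-changed may
run many times with the same complete settings.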

Departing relations
-------------------

A unit will depart a relation when either the relation or the unit itself is
marked for termination. In either case, it follows the same sequence:

* For every known related unit -- those which have joined and not yet
  departed -- run the relation-departed hook.
* Run the relation-broken hook.
* `depart` from its scope in the relation.

The unit's departure from its scope will in turn be detected by units of the
related service, and cause them to run relation-departed hooks. A unit's
relation settings persist beyond its own departure from the relation; the
final unit to depart a relation marked for termination is responsible for
destroying the relation and all associated data.
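
The hook-running part of the departure sequence can be modelled as a simple
ordered list. The function below is an illustrative sketch with a hypothetical
name; it does not model the final `depart` step or the cleanup performed by the
last unit to leave.

```python
def departure_hooks(relation_name, known_units):
    """List (hook name, related unit) pairs in the order a departing unit runs them.

    known_units are the related units that have joined and not yet departed.
    """
    steps = [("%s-relation-departed" % relation_name, unit) for unit in known_units]
    # relation-broken is not about any particular unit, hence None.
    steps.append(("%s-relation-broken" % relation_name, None))
    return steps
```

When no related unit was ever seen, the list contains only the relation-broken
hook, which matches the earlier remark that it may be the first and only hook
to run for a given relation.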

Debugging charms
----------------

Facilities are currently not good.

* juju ssh
* juju debug-hooks [TODO: not implemented]
* juju debug-log [TODO: not implemented]