~mhr3/unity-scopes-api/merge-trunk

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
#
# Copyright (C) 2013 Canonical Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License version 3 as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# Authored by: Michi Henning <michi.henning@canonical.com>
#

This is an explanation of the layout of the source tree, and the
conventions used by this library. Please read this before making changes
to the source!

For instructions on how to build the code, see the INSTALL file.


Build targets
-------------

The build produces the Unity scopes library (libunity-scopes).

TODO: Flesh this out


Source tree layout
------------------

At the top level, we have src, include, and test, which contain what
they suggest.

Underneath src and include, we have subdirectories that correspond to
namespaces, that is, if we have src/internal, that automatically means
that there is a namespace unity::scopes::internal. This one-to-one correspondence
makes it easier to work out what is defined where.


Namespaces
----------

The library maintains a compilation firewall, strictly separating
external API and internal implementation.  This reduces the chance of
breaking the API, and it makes it clear what parts of the API are for
public consumption.

Anything that is not public is implemented in a namespace called "internal".

TODO: More explanation of the namespaces used and for what.


Header file conventions
-----------------------

All header files are underneath include/scopes. Source code always includes
headers using the full pathname, for example:

  #include <scopes/ScopeBase.h>

All source code uses angle brackets (<...>) for #include
directives. Double quotes ("...") are never used because the lookup
semantics can be surprising if there are any headers with the same name
in the tree. (Not that this should happen, but it's better to be safe. If
there are no duplicate names, inclusion with angle brackets behaves the
same way as inclusion with double quotes.)

All headers that are for public consumption appear in include/scopes/*
(provided the path does not include "internal").

Public header directories contain a private header directory called
"internal". This directory contains all the headers that are private
and specific to the implementation.

No public header file is allowed to include any header from one of the
internal directories. (Doing so would break the compilation firewall
and also prevent API clients from compiling because the internal headers
are not installed when unity is deployed. (This is enforced by the tests.)

All header files, whether public or internal, compile stand-alone, that
is, no header relies on another header being included first in order to
compile. (This is enforced by the tests.)


Compilation firewall
--------------------

Public classes use the pimpl (pointer-to-implemtation) idiom, also
known as "envelope-letter idiom", "Cheshire Cat", or "Compiler firewall
idiom". This makes it less likely that an update to the code will break
the ABI.

Many classes that are part of the public API contain only one private data
member, namely the pimpl. No other data members (public, protected,
or private) are permitted.

For public classes with shallow-copy semantics (classes that are
immutable after instantiation), the code uses std::shared_ptr as the
pimpl type because std::shared_ptr provides the correct semantics. (See
unity::Exception and its derived classes for an example.)

For classes that are non-copyable, std::unique_ptr is used as the
pimpl type.

If classes form derivation hierarchies, by convention, the pimpl is a
private data member of the root base class. Derived classes can access the
pimpl by calling the protected pimpl() method of the root base class. This
avoids redundantly storing a separated pimpl to the derived type in each
derived class. Instead, polymorphic dispatch to virtual methods in the
derived classes is achieved by using a dynamic_cast to the derived
type to forward an invocation to the corresponding virtual method in
the derived implementation class.


Error handling
--------------

No error codes, period. All errors are reported via exceptions. All
exceptions thrown by the library are derived from unity::Exception (which
is abstract). unity::Exception provides a to_string() method. This method
returns the exception history and prints all exceptions that were raised
along a particular throw path, even if a low-level exception was caught
and translated into a higher level exception. This works for exceptions
derived from std::nested_exception. (The exception chaining ends when it
encounters an exception that does not derive from std::nested_exception.)

If API clients intercept unity exceptions and rethrow their own exceptions,
it is recommended that API clients derive their exceptions from
unity::Exception or, alternatively, derive them from std::nested_exception
and implement a similar to_string() operation that, if it encounters a
unity::Exception while following a chain, calls the to_string() method on
unity::Exception. That way, the entire exception history will be returned
from to_string().

Functions that throw exceptions should, if at all possible, provide
the strong exception guarantee. Otherwise, they must provide the basic
(weak) exception guarantee. If it is impossible to maintain even the basic
guarantee, the code must call abort() instead of throwing an exception.


Resource management
-------------------

The code uses the RAII (resource acquisition is initialization)
idiom extensively. If you find yourself writing free, delete, Xfree,
XCloseResource, or any other kind of clean-up function, your code has a
problem. Instead of explictly cleaning up in destructors, *immediately*
assign the resource to a unique_ptr or shared_ptr with a custom deleter.

This guarantees that the resource will be released without having to
remember anything.  In particular, it guarantees that the resource
will be released even if allocated in a constructor near the beginning
and something called by the constructor throws before the constructor
completes.

For resources that cannot be managed by unique_ptr or shared_ptr
(because the allocator does not return a pointer or returns a
pointer to an opaque type), use the RAII helper class in ResourcePtr.h.
It does much the same thing as unique_ptr, but is more permissive
in the types it can manage.

Note the naming convention established by util/DefinesPtrs.h. All
classes that return or accept smart pointers derive from DefinesPtrs,
which is a code injection template that creates typedefs for Ptr and
UPtr (shared_ptr and unique_ptr, respectively) as well as CPtr and UCPtr
(shared_ptr and unique_ptr to constant instances).

Ideally, classes are fully initialized by their constructor so it is
impossible for a class to exist but not being in a usable state.

For some classes, it is unavoidable to provide a default constructor (for
example, if we want to put instances into an array). It is also sometimes
impossible to fully construct an instance immediately, for example if
the instance is member variable and the necessary initialization data
is not available until some time afterwards.

In this case, the default constructor must initialize the class to
a fail-safe state, that is, it must behave sanely in the face of
methods being invoked on it. This means that calling a method on a
default-constructed instance should throw a logic exception to
indicate to the caller that the instance is not fully initialized yet.

Note that turning method calls on not-yet-initialized instances into
no-ops is usually a bad idea: the caller thinks that everything worked
fine when, in fact, it did nothing. If such no-op methods do something
sensible (that is, they can do their job even on an incompletely
initialized instance), this begs the question of why the instance wasn't
default-constructible in the first place...

To sum it up: try to enforce complete initialization from the
constructor wherever possible. If it is impossible to do that, follow
the principle of least surprise for the caller if a method is called on
a not-yet-initialized instance.


Code structure
--------------------------------

The code is arranged in three layers:

- scopes (top level)

This level contains everything that is part of the public API. No source
file in this level includes any header from the internal namespace (or below
the internal namespace).

- internal

This level contains the implementation logic for the scopes run time,
independent of the particular middleware.

- zmq_middleware

This level contains the middleware-specific parts of the code. No internal
header file (except for MiddlewareFactory) includes any header in
zmq_middleware.


Client invocation classes
-------------------------

The call chain for an outgoing invocation is like this (using Scope as an
example):

ScopeProxy -> Scope -> ScopeImpl -> MWScopeProxy -> ZmqScope

Clients invoke remote operations via a proxy. The *Proxy types (such as
ScopeProxy) are shared_ptrs to the corresponding proxy class (such as
Proxy). Each proxy class provides a method that corresponds to a remote
operation in the server.

The proxy class stores a pimpl, which bridges the gap into the internal
namespace. In the internal namespace, the actual proxy implementation is
called *Impl, for example, ScopeImpl.

The *Impl classes contain any additional methods and data members that might
be needed by the run time during remote method invocation. In effect, they
provide an interception point where we can do things behind the scenes
during an invocation (such as sending metrics and transparently forwarding
query cancellation.)

Each *Impl class provides access to the actual middleware-specific proxy as
a shared_ptr to a base class. The accessor for this proxy is called fwd()
and returns an instance of MW*Proxy, for example, MWScopeProxy. This
actually is a shared_ptr<MWScope>. In other words, the MWScopeProxy pointer
does for our implementation what ScopeProxy does for our API clients: it
shields our implementation from the specifics of the middleware that is in
use.

MWScope is an abstract base interface. It contains a pure virtual method for
each remote operation. The signature of each method is the same as the
signature of the corresponding method on the public proxy.

The concrete implementation of the MWScope interface is provided by
ZmqScope.h. ZmqScope.h provides the client-side implementation of each
remote operation. The job of the implementation is to convert the
middleware-independent input parameters into their middleware-specific
counterparts, invoke the operation, wait for any return values, and convert
the middleware-specific return values into their middleware-independent
counterparts. For Zmq, the proxy implementation also drives the capnproto
marshaling and unmarshaling.


Server dispatch classes
-----------------------

The call chain for an incoming invocation is like this (using Scope as an
example):

ObjectAdapter -> ScopeI -> ScopeObject

On the server side, when a method invocation arrives over the wire, it ends
up in an instance of ObjectAdapter. Each network endpoint has one adapter.
The job of the adapter is to listen for incoming requests, map each request
to the correct C++ target object, and invoke a callback that corresponds to
the operation on that target object.

The endpoint to which a request is sent by a client implicitly identifies
which adapter handles the incoming request.  Each request that arrives over
the wire carries an object identity and an operation name. The adapter
unmarshals this information and then looks for a C++ object (called a
"servant") that was registered for that identity with the middleware to
receive requests of a particular type. Each adapter can have multiple C++
servants, each of which handles requests for a particular identity. The
different servants can implement objects of different types.

Each adapter contains a table that maps the incoming identity to the
corresponding servant. The key to the table is the identity, and the lookup
value is a shared_ptr to the servant. The table is populated by the
add_*_object() methods on MiddlewareBase.

If the adapter cannot find a servant with an identity that matches the
request, it marshals an exception back to the client. Otherwise, it
dispatches the request.

The concrete servants are of type *I, for example, ScopeI. (The "I" is short
for Implementation.) The ScopeI class is the middleware-specific server-side
equivalent of the ZmqScope client-side proxy. All servants derive from a
base ServantBase. ServantBase provides a generic dispatch_() method. The
job of dispatch_() is to check whether the servant can actually handle
an incoming operation. For example, if the incoming operation name is
"foo", but the servant does not provide a method called "foo", it sends
an OperationNotExistException back to the client.  Otherwise,
dispatch_() calls the corresponding method on the servant and waits for
the method to complete. The invocation to the servant is made in a
try-catch block; if the servant throws an exception, dispatch_() takes
care of returning an exception to the client.

The concrete servant method called by dispatch_ (such as
ScopeI::create_query) now unmarshals any in-parameters, translates them to
the middleware-independent equivalents, forwards the invocation to the
middleware-independent servant, and translates and marshals any return
values or exceptions.

Each servant class stores a pointer to a delegate. The delegate
implementation lives in the internal namespace and does not know by which
middleware it is called. The delegates are middleware-independent servants,
called *Object, for example, ScopeObject. Each *Object class implements a
method for the middleware-specific servant to call, with parameters that
correspond to the operation that was invoked by the client. In other words,
the *Object classes implement the server-side behavior of the operations
that clients call, and the servant classes forward an incoming request to
their corresponding *Object instance. This allows the middleware-independent
part of run time to implement operations without having to include and
middleware-specific header files.

If you think of ScopeProxy as a pointer that can point at an object in a
remote address space, then ScopeObject is the corresponding server-side
target that implements the methods that the client can invoke on the proxy.


Capnproto definitions
---------------------

The capnproto definitions live in zmq_middleware/capnproto and define what
is marshaled over the wire. (As far as zmq is concerned, things that go over
the wire are just opaque messages, that is, blobs of bytes. Zmq knows about
message boundaries, but knows nothing about what's inside them.)

Message.capnp defines the message headers.

A request contains the request mode (oneway or twoway), the identity of the
servant, the operation name, followed by the in-parameters as a blob (a
capnproto Object).

A response (sent only for twoway invocations) contains a status that
indicates success, run-time exception, or user exception, plus any return
values as a blob.

Success means the invocation was successful, and the return values are
inside the payload blob.  If the invocation failed, it can be for a variety
of reasons, some unexpected, some not. Unexpected reasons are things such as
general failures during dispatch, failure to locate the identity for a
servant or the operation within the servant, and any unexpected exceptions
that the operation implementation might throw. Unexpected errors are
indicated as a runtimeException. At the moment, there are two special
runtime exceptions (ObjectNotExist and OperationNotExist), which indicate
failure to find a servant or the correct operation. All other error
conditions (such as an operation implementation throwing 42), the client
gets an UnknownUserException with an error string.

Expected exceptions are exceptions that an operation is expected to produce.
(Think of Java exception specifications.) For example, NotFoundException can
be thrown by the Registry::get_metadata() method. To deal with these, the
servant implementation (ZmqRegistry) calls get_metadata() on the delegate in
a try-catch block that handles NotFoundException separately. If that
exception is thrown, ZmqRegistry::get_metadata() populates the payload that
is marshaled with the response with the exception details and sets the error
status in the response accordingly.

The renaming capnp definitions, such as Scope.capnp, each define the
operations and parameters for the corresponding interface. There is a
definition for each operation, <operation_name>Request, that lists all the
in-parameters and, for twoway methods, and <operation_name>Response (for
twoway operations), that contains a union of the return values and
exceptions that the operation can raise. (Oneway operations do not marshal
anything in the return direction and so do not have any Response
definitions.)


Loose ends
----------

Things that need fixing or checking are marked with a TODO comment in
the code.

Things that are work-arounds for bugs in the compiler or libraries are
marked with a BUGFIX comment.

Things that are dubious in some way with no better solution available
are marked with a HACK comment.


Style
-----

Consider running astyle over the code before checking it in.
See astyle-config for more details.


Test suite
----------

The test suite lives underneath the test directory.

test/headers contains tests relating to header file integrity.

test/gtest contains the C++ tests (which use Google test).

The Google gtest authors are adamant that it is a bad idea to just
link with a pre-installed version of libgtest.a. Therefore, libgtest.a
(and libgtest_main.a) are created as part of the build and can be found
in test/gtest/libgtest.

See the INSTALL file for how to run the tests, particularly the caveat
about "make test" not rebuilding the tests!


Building and installing the code
--------------------------------

See the INSTALL file.


Development hints and tricks
----------------------------

See the HACKING file.