world readable but only writable by the security (ticket granting)
agent, which is responsible for creating principals.

Ensemble relies on the security facilities provided by zookeeper's
coordination storage, whereby zookeeper automatically restricts access
to each node based on that node's ACL permission map. This ACL
facility maps permissions to principal identity tokens. Zookeeper
provides permissions for read, write, delete, create, and admin access
to each node. Every zookeeper connection can associate principal
credentials with itself, and all access by that connection is
validated against the per-node ACL mapping.

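As a minimal sketch of the per-node check described above (not zookeeper's
actual implementation), the following models an ACL map as a dictionary from
principal identity token to permission bits. The bit values match
zookeeper's ``ZooDefs.Perms`` constants; the ``is_allowed`` helper and the
principal names are assumptions made for illustration.

```python
from enum import IntFlag


class Perm(IntFlag):
    """Permission bits, using the same values as ZooKeeper's ZooDefs.Perms."""
    READ = 1
    WRITE = 2
    CREATE = 4
    DELETE = 8
    ADMIN = 16


def is_allowed(acl_map, principals, wanted):
    """Return True if any principal on the connection holds `wanted`.

    acl_map:    {principal identity token: Perm} for a single node
    principals: identity tokens associated with the connection
    """
    return any(
        (acl_map.get(principal, Perm(0)) & wanted) == wanted
        for principal in principals
    )


# Hypothetical ACL map for one node.
node_acl = {
    "unit-wordpress-0": Perm.READ | Perm.WRITE,
    "admin": Perm.READ | Perm.WRITE | Perm.CREATE | Perm.DELETE | Perm.ADMIN,
}
```

With this map, a connection holding only ``unit-wordpress-0`` may write to
the node but may not delete it, and a connection with no credentials is
refused everything.
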
An additional zookeeper-connected actor responsible for creating principals
and providing an up-to-date token database.

The security agent manages a token database (definition to follow),
and provides for the creation of new principals and for handing out their
hash tokens to inquiring parties.

Passing credentials to clients
--------------------------------------

How the system passes credentials is a critical aspect of managing
principal access. Instead of passing principal credentials directly
via insecure channels, an actor creating another actor also
establishes a principal creation token via the security agent. The
principal creation token is a one-time-use string to create a
principal and its password, and update. Compared with passing
credentials directly, a one-time token minimizes the time that a
malicious third party has to intercept and use it. Moreover, invalid
use of a token can be logged as forensic information.

Creating the initial principals. During bootstrap there are
Clients interact with the TGS to obtain a principal.

Global read access to token by name.

OTP for principal creation.

The clients are handed an initial token (separate from the auth token)
which will be consumed by the TGS when creating a principal. This is

Access for provisioning agents

OTP for principal access.

The token database will need to resolve services to service ids, as service names are reusable.

Privilege Escalation Scenarios
---------------------------------------

We have 5 different levels of escalation at the moment:

- container escalation (service unit environment)
- machine escalation (virtual machine)
- agent escalation (a malicious zk connected actor)

Beyond that we have escalations which are effectively fatal, as they have access to sensitive data.

- ensemble environment zookeeper on-disk data (i.e. bootstrap machine)
- trusted agent escalation (provisioning agent, bootstrap machine agent).

The system is composed of a number of actors connecting to and
communicating via a shared storage. The shared storage provides ACLs,
and we outline the communication channels between actors as
possible attack vectors.

Each actor employs a security policy to determine the ACL map for a given
node path that it may create. The policy simply takes the path of the node
to be created, and returns an ACL map that can be set on the node.

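A path-based policy of this kind could look roughly like the sketch below.
The tree layout (``/machines/<id>``, ``/tokens``), the principal names, and
the ``acl_for_path`` function are all hypothetical; only the idea of mapping
a node path to an ACL map, and the world-readable, agent-writable token
database, come from this document.

```python
import re

# Permission bits (same values as ZooKeeper's ZooDefs.Perms).
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16
ALL = READ | WRITE | CREATE | DELETE | ADMIN

# Hypothetical rules: (path pattern, function building the ACL map).
RULES = [
    # A machine node: full access for its creator, read/write for the machine.
    (re.compile(r"^/machines/(?P<id>[^/]+)$"),
     lambda m, owner: {owner: ALL, "machine-" + m.group("id"): READ | WRITE}),
    # The token database: world readable, writable only by the security agent.
    (re.compile(r"^/tokens(/.*)?$"),
     lambda m, owner: {"security-agent": ALL, "world": READ}),
]


def acl_for_path(path, owner):
    """Return the ACL map to set on a node about to be created at `path`."""
    for pattern, build in RULES:
        match = pattern.match(path)
        if match:
            return build(match, owner)
    # Default assumed here: only the creating actor may touch the node.
    return {owner: ALL}
```

An actor would call ``acl_for_path("/machines/7", "provisioning-agent")``
before creating the node and pass the result to the create call.
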
Creating principals for actors
------------------------------

How the system passes credentials to an actor is a critical aspect of
managing principals securely. Every actor in the system needs its own
unique principal to provide an auth identity. The credentials for a
principal are known only to the actor utilizing them and, transiently,
to the security agent when they are created.

Instead of passing principal credentials directly via insecure
channels, an actor creating another actor also establishes a principal
creation token via the security agent. The principal creation token is
a one-time-use string which can be used to create a principal and its
password, and update the token database.

The security agent has a simple policy in place regarding principal
names and which actors can create them, i.e. a provisioning agent can
create machine principals, but not service unit principals.

Compared with passing credentials directly, a one-time token minimizes
the time that a malicious third party has to intercept and use it.
Moreover, invalid use of a token can be logged as forensic
information.

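The token flow described above can be sketched as follows. This is an
in-memory illustration under stated assumptions, not the security agent's
real interface: the ``SecurityAgent`` class, its method names, the
``CREATION_POLICY`` table (beyond the provisioning agent rule given above),
and the use of SHA-256 for the hash token are all hypothetical.

```python
import hashlib
import logging
import secrets

log = logging.getLogger("security-agent")

# Which kind of actor may create which kind of principal. The first entry
# follows the text above; the second is an assumption for illustration.
CREATION_POLICY = {
    "provisioning-agent": {"machine"},
    "machine-agent": {"unit"},
}


class SecurityAgent:
    def __init__(self):
        self.pending = {}   # one-time token -> principal name
        self.token_db = {}  # principal name -> hash token

    def issue_token(self, creator_kind, principal_kind, principal_name):
        """Hand a one-time principal creation token to a creating actor."""
        if principal_kind not in CREATION_POLICY.get(creator_kind, set()):
            raise PermissionError(
                f"{creator_kind} may not create {principal_kind} principals")
        otp = secrets.token_urlsafe(16)
        self.pending[otp] = principal_name
        return otp

    def redeem(self, otp, password):
        """Consume the token, create the principal, update the token db."""
        name = self.pending.pop(otp, None)
        if name is None:
            # Unknown or already used token: record it for forensics.
            log.warning("invalid or reused principal creation token")
            return None
        digest = hashlib.sha256(f"{name}:{password}".encode()).hexdigest()
        self.token_db[name] = digest
        return name
```

The new actor picks its own password when redeeming the token, so the
password never travels with the token, and a second attempt to redeem the
same token fails and is logged.
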
One question that emerges with the use of a separate agent for creating
identities is how agents needed for bootstrap receive their credentials.

- The bootstrap can utilize a specialized OTP interface with a pre-created
  known value, which it can use to initialize the tree.

Encrypted zookeeper communications
----------------------------------

As zookeeper does not currently support SSL/TLS transport level
security, Ensemble utilizes SSH port forwarding to ensure encrypted
communications to zookeeper. One significant shortcoming of this
approach is that any process on the set of ensemble machines can
connect to zookeeper and attempt to brute-force principal passwords.

Certain data stored within zookeeper is by its nature privileged and
should only be shared with agents requiring it for their function. For
example, the Ensemble provider credentials should only be exposed to
the provisioning agent, as they are required for it to function; any
additional access to the data would be regarded as a data escalation

Additionally, services utilize relations to communicate with each
other. Every service unit of the services participating within a
relation gets write access only to its own node within the relation,
and has read access to all service unit relation settings. An
unrelated service unit from a different service is not allowed to
read any settings from the relation.

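The relation access rule above can be sketched as an ACL map per settings
node. The node layout (``<relation path>/settings/<unit>``), the principal
names, and the ``relation_acls`` helper are assumptions made for
illustration only.

```python
# Permission bits (same values as ZooKeeper's ZooDefs.Perms).
READ, WRITE = 1, 2


def relation_acls(relation_path, unit_principals):
    """Return {node path: ACL map} for every unit settings node.

    Each participating unit can read every settings node in the relation
    but can write only to its own. Principals that do not take part in
    the relation get no entry at all, so they cannot read anything.
    """
    acls = {}
    for owner in unit_principals:
        acl = {unit: READ for unit in unit_principals}
        acl[owner] = READ | WRITE
        acls[f"{relation_path}/settings/{owner}"] = acl
    return acls
```

For a relation between ``wordpress-0`` and ``mysql-0``, the node
``.../settings/wordpress-0`` is read/write for ``wordpress-0`` and read-only
for ``mysql-0``.
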
Ensemble is composed of a number of actors connecting to and
communicating via a shared storage. When two services enter into a
relation, a private bidirectional channel is created for them to

Ensemble ensures that the zookeeper nodes used for this communication
are subject to the proper ACL constraints such that unrelated services
are unable to access them.

But these relations represent ad hoc inter-machine communication, which
is formula-defined. A malicious agent could possibly abuse one of
these protocols to further compromise additional agents. Unlike other
attack vectors in ensemble, this is one for which ensemble can make
only minimal safety guarantees, outside of perhaps a simple
validation of relation data (currently treated as a binary blob)
against schemas associated with the relation type.

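One possible shape for such a validation step is sketched below. Relation
data is currently an opaque blob, so the decoded ``settings`` dictionary,
the ``SCHEMAS`` table, and the ``mysql`` field names are purely
hypothetical.

```python
# Hypothetical schemas keyed by relation type: field name -> expected type.
SCHEMAS = {
    "mysql": {"host": str, "port": int, "database": str},
}


def validate_relation_data(relation_type, settings):
    """Return a list of problems found in `settings`; empty means valid."""
    schema = SCHEMAS.get(relation_type)
    if schema is None:
        return [f"no schema registered for relation type {relation_type!r}"]
    errors = []
    for key in settings:
        if key not in schema:
            errors.append(f"unexpected field {key!r}")
    for key, expected in schema.items():
        if key not in settings:
            errors.append(f"missing field {key!r}")
        elif not isinstance(settings[key], expected):
            errors.append(f"field {key!r} should be {expected.__name__}")
    return errors
```

A check like this only constrains the form of the data; it cannot prove
that a well-formed value is harmless to the unit consuming it.
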
The formulas executed by the unit agent run user-supplied code
within an lxc container (with root privileges). LXC provides
limited support for security against root in a container, so a
those of the other units on a machine.

Port Access to services
-----------------------

If we don’t have static information, how can we prevent port conflicts
when doing unit placement? Short answer: we can’t. So we need a way
for services to interrogate information on open ports on their machine
so they can select a non-conflicting port (the container network is
separate from the machine's, so there is no way of identifying open
ports from within the container). Let’s say that’s fine for app
servers. Now we connect a proxy service to them, and we have a defined
traffic port. Ideally we’d just assign a dns entry to the proxy
service, but now we have a problem in that we have a port offset on
the url.

-------------------------

Privilege Escalation Scenarios
------------------------------

We have several different levels of escalation within ensemble for
malicious code that need to be considered.

All formula hooks are executed within an lxc container to give a
minimally isolated environment. This lxc container is rather trivially
exploitable to gain root access on the machine, as formulas execute
as root within the container and lxc currently provides minimal
security guarantees, which leads to the next escalation level.

Future work is needed to provide better security around lxc
integration, perhaps via integration of apparmor and ongoing lxc

A machine is considered compromised if malicious code has root access
on the machine. All service units colocated on the machine are also
considered compromised if this occurs.

An agent is considered compromised if malicious code has an open zookeeper
connection with a valid actor principal identity. The malicious code
has access to all data exposed via ACL to the compromised identity.

Beyond these generic scenarios we have particular escalations which
are effectively fatal, as they entail access to sensitive data that
spans the ensemble environment or machine provider.

A bootstrap machine compromise which allows for disk access could be
considered fatal as the Ensemble shared state (zookeeper) data is

Compromise of the identity of certain agents, like the provisioning
agent, would allow malicious code to utilize the machine provider
credentials.

Access to Deployed services
----------------------------

A plan for controlled public access to deployed services is provided
separately by the expose-services specification.

Currently all internal access within a machine provider environment
like ec2 is unfiltered.

In the future we should have machine level firewalling to allow access
between services based on their relations.

SSH Host Identity Checks

We should pull the ssh key of each machine into zk, so that connections
to a given machine can be verified against the valid keys of environment
machines.

Formula Storage must be referenced by

Currently the formula storage access is referenced by a storage key
which is retrieved via the machine provider storage interface. This
requires machine agents to have access to the machine provider
credentials in order to reach Formula Storage, which they shouldn't
need.

- Security Agent & Token Database
- Security Policy (Path Based ACL generator)
- Connections w/ Principal