13
13
compliance with the License. You should have received a copy of the
14
14
Erlang Public License along with this software. If not, it can be
15
15
retrieved online at http://www.erlang.org/.
17
17
Software distributed under the License is distributed on an "AS IS"
18
18
basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
19
19
the License for the specific language governing rights and limitations
24
<title>The SSL Protocol</title>
25
<prepared>Peter Högfeldt</prepared>
27
<date>2003-04-28</date>
24
<title>Transport Layer Security (TLS) and its predecessor, Secure Socket Layer (SSL)</title>
29
25
<file>ssl_protocol.xml</file>
31
<p>Here we provide a short introduction to the SSL protocol. We only
32
consider those parts of the protocol that are important from a
33
programming point of view.
35
<p>For a very good general introduction to SSL and TLS see the book
36
<cite id="rescorla"></cite>.
38
<p><em>Outline:</em></p>
39
<list type="bulleted">
40
<item>Two types of connections - connection: handshake, data transfer, and
42
SSL/TLS protocol - server must have certificate - what the
43
server sends to the client - client may verify the server -
44
server may ask client for certificate - what the client sends to
45
the server - server may then verify the client - verification -
46
certificate chains - root certificates - public keys - key
47
agreement - purpose of certificate - references</item>
51
<title>SSL Connections</title>
52
<p>The SSL protocol is implemented on top of the TCP/IP protocol.
53
From an endpoint view it also has the same type of connections
54
as that protocol, almost always created by calls to socket
55
interface functions <em>listen</em>, <em>accept</em> and
56
<em>connect</em>. The endpoints are <em>servers</em> and
59
<p>A <em>server</em> <em>listen</em>s for connections on a
60
specific address and port. This is done once. The server then
61
<em>accept</em>s each connection on that same address and
62
port. This is typically done indefinitely many times.
64
<p>A <em>client</em> connects to a server on a specific address
65
and port. For each purpose this is done once.
67
<p>For a plain TCP/IP connection the establishment of a connection
68
(through an accept or a connect) is followed by data transfer between
69
the client and server, finally ended by a connection close.
71
<p>An SSL connection also consists of data transfer and connection
72
close. However, the data transfer contains encrypted data, and
73
in order to establish the encryption parameters, the data
74
transfer is preceded by an SSL <em>handshake</em>. In this
75
handshake the server plays a dominant role, and the main
76
instrument used in achieving a valid SSL connection is the
77
server's <em>certificate</em>. We consider certificates in the
78
next section, and the SSL handshake in a subsequent section.</p>
82
<title>Certificates</title>
83
<p>A certificate is similar to a driver's license, or a
84
passport. The holder of the certificate is called the
85
<em>subject</em>. First of all the certificate identifies the
86
subject in terms of the name of the subject, its postal address,
87
country name, company name (if applicable), etc.
89
<p>Although a driver's license is always issued by a well-known and
90
distinct authority, a certificate may have an <em>issuer</em>
91
that is not so well-known. Therefore a certificate also always
92
contains information on the issuer of the certificate. That
93
information is of the same type as the information on the
94
subject. The issuer of a certificate also signs the certificate
95
with a <em>digital signature</em> (the signature is an inherent
96
part of the certificate), which allows others to verify that the
97
issuer really is the issuer of the certificate.
99
<p>Now that a certificate can be checked by verifying the
100
signature of the issuer, the question is how to trust the
101
issuer. The answer to this question is to require that there is
102
a certificate for the issuer as well. That issuer has in turn an
103
issuer, which must also have a certificate, and so on. This
104
<em>certificate chain</em> has to have an end, which then must
105
be a certificate that is trusted by other means. We shall cover
106
this problem of <em>authentication</em> in a subsequent
112
<title>Encryption Algorithms</title>
113
<p>An encryption algorithm is a mathematical algorithm for
114
encryption and decryption of messages (arrays of bytes,
115
say). The algorithm as such is always required to be publicly
116
known, otherwise its strength cannot be evaluated, and hence it
117
cannot be used reliably. The secrecy of an encrypted message is
118
not achieved by the secrecy of the algorithm used, but by the
119
secrecy of the <em>keys</em> used as input to the encryption and
120
decryption algorithms. For an account of cryptography in general
121
see <cite id="schneier"></cite>.
123
<p>There are two classes of encryption algorithms: <em>symmetric key</em> algorithms and <em>public key</em> algorithms. Both
124
types of algorithms are used in the SSL protocol.
126
<p>In the sequel we assume holders of keys keep them secret (except
127
public keys) and that they in that sense are trusted. How a
128
holder of a secret key is proved to be the one it claims to be
129
is a question of <em>authentication</em>, which, in the context
130
of the SSL protocol, is described in a section further below.
134
<title>Symmetric Key Algorithms</title>
135
<p>A <em>symmetric key</em> algorithm has one key only. The key
136
is used for both encryption and decryption. Obviously the key
137
of a symmetric key algorithm must always be kept secret by the
138
users of the key. DES is an example of a symmetric key
141
<p>Symmetric key algorithms are fast compared to public key
142
algorithms. They are therefore typically used for encrypting
148
<title>Public Key Algorithms</title>
149
<p>A <em>public key</em> algorithm has two keys. Either of the two
150
keys can be used for encryption. A message encrypted with one
151
of the keys, can only be decrypted with the other key. One of
152
the keys is public (known to the world), while the other key
153
is private, i.e. kept secret by the owner of the two keys.
155
<p>RSA is an example of a public key algorithm.
157
<p>Public key algorithms are slow compared to symmetric key
158
algorithms, and they are therefore seldom used for bulk data
159
encryption. They are therefore only used in cases where the
160
fact that one key is public and the other is private, provides
161
features that cannot be provided by symmetric algorithms.
166
<title>Digital Signature Algorithms</title>
167
<p>An interesting feature of a public key algorithm is that its
168
public and private keys can both be used for encryption.
169
Anyone can use the public key to encrypt a message, and send
170
that message to the owner of the private key, and be sure
171
that only the holder of the private key can decrypt the
174
<p>On the other hand, the owner of the private key can encrypt a
175
message with the private key, thus obtaining an encrypted
176
message that can be decrypted by anyone having the public key.
178
<p>The last approach can be used as a digital signature
179
algorithm. The holder of the private key signs an array of
180
bytes by performing a specified well-known <em>message digest algorithm</em> to compute a hash of the array, encrypts the
181
hash value with its private key, and then presents the original
182
array, the name of the digest algorithm, and the encryption of
183
the hash value as a <em>signed array of bytes</em>.
185
<p>Now anyone having the public key, can decrypt the encrypted
186
hash value with that key, compute the hash with the specified
187
digest algorithm, and check that the hash values compare equal
188
in order to verify that the original array was indeed signed
189
by the holder of the private key.
191
<p>What we have accounted for so far is by no means all that can
192
be said about digital signatures (see <cite id="schneier"></cite> for
198
<title>Message Digest Algorithms</title>
199
<p>A message digest algorithm is a hash function that accepts
200
an array of bytes of arbitrary but finite length as input, and
201
outputs an array of bytes of fixed length. Such an algorithm
202
is also required to be very hard to invert.
204
<p>MD5 (16 bytes output) and SHA1 (20 bytes output) are examples
205
of message digest algorithms.
211
<title>SSL Handshake</title>
212
<p>The main purpose of the handshake performed before an SSL
213
connection is established is to negotiate the encryption
214
algorithm and key to be used for the bulk data transfer between
215
the client and the server. We are writing <em>the</em> key,
216
since the algorithm chosen for bulk encryption is one of the
217
symmetric algorithms.
219
<p>There is thus only one key to agree upon, and obviously that
220
key has to be kept secret between the client and the server. To
221
achieve that, the handshake has to be encrypted as well.
223
<p>The SSL protocol requires that the server always sends its
224
certificate to the client in the beginning of the handshake. The
225
client then retrieves the server's public key from the
226
certificate, which means that the client can use the server's
227
public key to encrypt messages to the server, and the server can
228
decrypt those messages with its private key. Similarly, the
229
server can encrypt messages to the client with its private key,
230
and the client can decrypt messages with the server's public
231
key. It is thus with the server's public and private keys
232
that messages in the handshake are encrypted and decrypted, and
233
hence the key agreed upon for symmetric encryption of bulk data
234
can be kept secret (there are more things to consider to really
235
keep it secret, see <cite id="rescorla"></cite>).
237
<p>The above indicates that the server does not care who is
238
connecting, and that only the client has the possibility to
239
properly identify the server based on the server's certificate.
240
That is indeed true in the minimal use of the protocol, but it
241
is possible to instruct the server to request the certificate of
242
the client, in order to have a means to identify the client, but
243
it is by no means required to establish an SSL connection.
245
<p>If a server requests the client certificate, it verifies, as a
246
part of the protocol, that the client really holds the private
247
key of the certificate by sending the client a string of bytes
248
to encrypt with its private key, which the server then decrypts
249
with the client's public key, the result of which is compared
250
with the original string of bytes (a similar procedure is always
251
performed by the client when it has received the server's
254
<p>The way clients and servers <em>authenticate</em> each other,
255
i.e. prove that their respective peers are what they claim to
256
be, is the topic of the next section.
261
<title>Authentication</title>
262
<p>As we have already seen the reception of a certificate from a
263
peer is not enough to prove that the peer is authentic. More
264
certificates are needed, and we have to consider how certificates
265
are issued and on what grounds.
267
<p>Certificates are issued by <em>certification authorities</em>
268
(<em>CA</em>s) only. They issue certificates both for other CAs
269
and ordinary users (which are not CAs).
271
<p>Certain CAs are <em>top CAs</em>, i.e. they do not have a
272
certificate issued by another CA. Instead they issue their own
273
certificate, where the subject and issuer part of the
274
certificate are identical (such a certificate is called a
275
self-signed certificate). A top CA has to be well-known, and has
276
to have a publicly available policy telling on what grounds it
279
<p>There are a handful of top CAs in the world. You can examine the
280
certificates of several of them by clicking through the menus of
283
<p>A top CA typically issues certificates for other CAs, called
284
<em>intermediate CAs</em>, but possibly also to ordinary users. Thus
285
the certificates derivable from a top CA constitute a tree, where
286
the leaves of the tree are ordinary user certificates.
288
<p>A <em>certificate chain</em> is an ordered sequence of
289
certificates, <c>C1, C2, ..., Cn</c>, say, where <c>C1</c> is a
290
top CA certificate, and where <c>Cn</c> is an ordinary user
291
certificate, and where the holder of <c>C1</c> is the issuer of
292
<c>C2</c>, the holder of <c>C2</c> is the issuer of <c>C3</c>,
293
..., and the holder of <c>Cn-1</c> is the issuer of <c>Cn</c>,
294
the ordinary user certificate. The holders of <c>C2, C3, ..., Cn-1</c> are then intermediate CAs.
296
<p>Now to verify that a certificate chain is unbroken we have to
297
take the public key from each certificate <c>Ck</c>, and apply
298
that key to decrypt the signature of certificate <c>Ck+1</c>,
299
thus obtaining the message digest computed by the holder of the
300
<c>Ck</c> certificate, compute the real message digest of the
301
<c>Ck+1</c> certificate and compare the results. If they compare
302
equal the link of the chain between <c>Ck</c> and <c>Ck+1</c> is
303
considered to be unbroken. This is done for each link k = 1, 2,
304
..., n-1. If all links are found to be unbroken, the user
305
certificate <c>Cn</c> is considered authenticated.
309
<title>Trusted Certificates</title>
310
<p>Now that there is a way to authenticate a certificate by
311
checking that all links of a certificate chain are unbroken,
312
the question is how you can be sure to trust the certificates
313
in the chain, and in particular the top CA certificate of the
316
<p>To provide an answer to that question consider the
317
perspective of a client, which has just received the
318
certificate of the server. In order to authenticate the server
319
the client has to construct a certificate chain and to prove
320
that the chain is unbroken. The client has to have a set of CA
321
certificates (top CA or intermediate CA certificates) not
322
obtained from the server, but obtained by other means. Those
323
certificates are kept <c>locally</c> by the client, and are
324
trusted by the client.
326
<p>More specifically, the client does not really have to have
327
top CA certificates in its local storage. In order to
328
authenticate a server it is sufficient for the client to
329
possess the trusted certificate of the issuer of the server
332
<p>Now that is not the whole story. A server can send an
333
(incomplete) certificate chain to its client, and then the
334
task of the client is to construct a certificate chain that
335
begins with a trusted certificate and ends with the server's
336
certificate. (A client can also send a chain to its server,
337
provided the server requested the client's certificate.)
339
<p>All this means that an unbroken certificate chain begins with
340
a trusted certificate (top CA or not), and ends with the peer
341
certificate. That is, the end of the chain is obtained from the
342
peer, but the beginning of the chain is obtained from local
343
storage, which is considered trusted.
28
<p>The Erlang ssl application currently supports SSL 3.0 and TLS 1.0
29
RFC 2246, and will in the future also support later versions of TLS.
30
SSL 2.0 is not supported.
33
<p>By default Erlang ssl is run over the TCP/IP protocol even
34
though you could plug in another reliable transport protocol
35
with the same API as gen_tcp.</p>
37
<p>If a client and server want to use an upgrade mechanism, such as
38
the one defined by RFC 2817, to upgrade a regular TCP/IP connection to an ssl
39
connection, the Erlang ssl API supports this. This can be useful for
40
things such as supporting HTTP and HTTPS on the same port and
41
implementing virtual hosting.
45
<title>Security overview</title>
47
<p>To achieve authentication and privacy the client and server will
48
perform a TLS Handshake procedure before transmitting or receiving
49
any data. During the handshake they agree on a protocol version and
50
cryptographic algorithms, they generate shared secrets using public
51
key cryptography and optionally authenticate each other with
52
digital certificates.</p>
56
<title>Data Privacy and Integrity</title>
58
<p>A <em>symmetric key</em> algorithm has one key only. The key is
59
used for both encryption and decryption. These algorithms are fast
60
compared to public key algorithms (using two keys, a public and a
61
private one) and are therefore typically used for encrypting bulk
65
<p>The keys for the symmetric encryption are generated uniquely
66
for each connection and are based on a secret negotiated
67
in the TLS handshake. </p>
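<p>As a rough illustration of per-connection keys, the sketch below derives
a symmetric key from a shared secret and two per-connection random values.
It is written in Python rather than Erlang, the function
<c>derive_connection_key</c> and the label <c>key expansion</c> are
hypothetical names, and a single HMAC-SHA-256 call stands in for the real TLS
key derivation, which is more involved.</p>

```python
import hashlib
import hmac
import os


def derive_connection_key(master_secret: bytes, client_random: bytes,
                          server_random: bytes, length: int = 16) -> bytes:
    """Derive a per-connection key (illustrative only, NOT the TLS PRF)."""
    # Mixing in both random values makes the key unique per connection.
    seed = b"key expansion" + server_random + client_random
    return hmac.new(master_secret, seed, hashlib.sha256).digest()[:length]


# Assumption: both peers already share this secret after the handshake.
master_secret = os.urandom(48)

# Two connections, each with fresh random values from client and server.
c1, s1 = os.urandom(32), os.urandom(32)
c2, s2 = os.urandom(32), os.urandom(32)

client_key = derive_connection_key(master_secret, c1, s1)
server_key = derive_connection_key(master_secret, c1, s1)
other_key = derive_connection_key(master_secret, c2, s2)
```

<p>Both peers compute the same key for a given connection without ever
sending it, while a second connection based on the same secret gets a
different key.</p>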
69
<p>The TLS handshake protocol and data transfer is run on top of
70
the TLS Record Protocol that uses a keyed-hash MAC (Message
71
Authentication Code), or HMAC, to protect the message's data
72
integrity. From the TLS RFC "A Message Authentication Code is a
73
one-way hash computed from a message and some secret data. It is
74
difficult to forge without knowing the secret data. Its purpose is
75
to detect if the message has been altered."
81
<title>Digital Certificates</title>
82
<p>A certificate is similar to a driver's license, or a
83
passport. The holder of the certificate is called the
84
<em>subject</em>. The certificate is signed
85
with the private key of the issuer of the certificate. A chain
86
of trust is built by having the issuer in its turn being
87
certified by another certificate and so on until you reach the
88
so-called root certificate that is self-signed, i.e. issued
91
<p>Certificates are issued by <em>certification
92
authorities</em> (<em>CA</em>s) only. There are a handful of
93
top CAs in the world that issue root certificates. You can
94
examine the certificates of several of them by clicking
95
through the menus of your web browser.
100
<title>Authentication of Sender</title>
102
<p>Authentication of the sender is done by public key path
103
validation as defined in RFC 3280. Simplified that means that
104
each certificate in the certificate chain is issued by the one
105
before, the certificates' attributes are valid ones, and the
106
root cert is a trusted cert that is present in the trusted
107
certs database kept by the peer.</p>
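<p>The simplified rule above can be sketched as a toy model. The example is
in Python (3.8 or later), uses textbook RSA without padding on small,
insecure keys, and checks only the issuer links and the trusted root, not the
certificate attributes that RFC 3280 also requires. The names <c>Cert</c>,
<c>issue</c>, <c>signed_by</c> and <c>validate</c> are hypothetical and do
not correspond to any Erlang API.</p>

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys


def make_key(p, q):
    """Return (modulus, private exponent) for textbook RSA. Toy sizes only."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest(data, n):
    """Message digest of data, reduced to fit the signer's modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass
class Cert:
    subject: str
    issuer: str
    public_n: int
    signature: int = 0

    def tbs(self):
        """Bytes covered by the issuer's signature."""
        return f"{self.subject}|{self.issuer}|{self.public_n}".encode()


def issue(subject, subject_n, issuer, issuer_n, issuer_d):
    """Create a certificate signed with the issuer's private key."""
    cert = Cert(subject, issuer, subject_n)
    cert.signature = pow(digest(cert.tbs(), issuer_n), issuer_d, issuer_n)
    return cert


def signed_by(cert, issuer_cert):
    """Check one link: issuer name matches and the signature verifies."""
    n = issuer_cert.public_n
    return (cert.issuer == issuer_cert.subject
            and pow(cert.signature, E, n) == digest(cert.tbs(), n))


def validate(chain, trusted):
    """chain is ordered root first, peer last; trusted maps subject to Cert."""
    root = chain[0]
    if trusted.get(root.subject) != root or not signed_by(root, root):
        return False
    return all(signed_by(c, issuer) for issuer, c in zip(chain, chain[1:]))


# Mersenne primes keep the example short; real keys are far larger.
M61, M89, M107, M127 = (2**k - 1 for k in (61, 89, 107, 127))
root_n, root_d = make_key(M61, M89)
ca_n, ca_d = make_key(M89, M107)
srv_n, _ = make_key(M107, M127)

root = issue("Root CA", root_n, "Root CA", root_n, root_d)  # self-signed
ca = issue("Intermediate CA", ca_n, "Root CA", root_n, root_d)
server = issue("www.example.org", srv_n, "Intermediate CA", ca_n, ca_d)

trusted = {root.subject: root}  # the local trusted certs database
chain_ok = validate([root, ca, server], trusted)

# A certificate that reuses the signature but carries another key must fail.
forged = Cert("www.example.org", "Intermediate CA", root_n, server.signature)
chain_forged = validate([root, ca, forged], trusted)
```

<p>Here <c>chain_ok</c> is true and <c>chain_forged</c> is false. The same
chain validated against an empty trusted database is also rejected, because
the root certificate is then not trusted.</p>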
109
<p>The server will always send a certificate chain as part of
110
the TLS handshake, but the client will only send one if
111
the server requests it. If the client does not have
112
an appropriate certificate it may send an "empty" certificate
115
<p>The client may choose to accept some path evaluation errors;
116
for instance, a web browser may ask the user if they want to
117
accept an unknown CA root certificate. The server, if it requests
118
a certificate, will on the other hand not accept any path validation
119
errors. It is configurable if the server should accept
120
or reject an "empty" certificate as response to
121
a certificate request.</p>
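<p>For comparison, the same choice exists in other TLS libraries. The sketch
below uses Python's standard <c>ssl</c> module, not the Erlang ssl
application, and the helper name <c>server_context</c> is hypothetical. In
the Erlang ssl application the corresponding server options are, to our
understanding, <c>verify</c> and <c>fail_if_no_peer_cert</c>.</p>

```python
import ssl


def server_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side context that requests a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_REQUIRED: the handshake fails if the client sends no certificate.
    # CERT_OPTIONAL: a missing certificate is accepted, but a certificate
    # that is sent and fails verification still aborts the handshake.
    ctx.verify_mode = (ssl.CERT_REQUIRED if require_client_cert
                       else ssl.CERT_OPTIONAL)
    return ctx


strict = server_context(True)    # rejects an "empty" certificate
lenient = server_context(False)  # accepts an "empty" certificate
```

<p>A real server would additionally load its own certificate and the trusted
CA certificates into the context before accepting connections.</p>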
125
<title>TLS Sessions</title>
127
<p>From the TLS RFC "A TLS session is an association between a
128
client and a server. Sessions are created by the handshake
129
protocol. Sessions define a set of cryptographic security
130
parameters, which can be shared among multiple
131
connections. Sessions are used to avoid the expensive negotiation
132
of new security parameters for each connection."</p>
134
<p>Session data is by default kept by the ssl application in a
135
memory storage, hence session data will be lost at application
136
restart or takeover. Users may define their own callback module
137
to handle session data storage if persistent data storage is
138
required. Session data will also be invalidated 24 hours
139
after it was saved, for security reasons. It is of course
140
possible to configure the amount of time the session data should be
143
<p>Ssl clients will by default try to reuse an available session;
144
ssl servers will by default agree to reuse sessions when clients