by Christoph Martin
Import upstream version 0.9.7d

OpenSSL ASN1 Revision
=====================

This document describes some of the issues relating to the new ASN1 code.

Previous OpenSSL ASN1 problems
=============================

OK, why did the OpenSSL ASN1 code need revising in the first place? Well,
there are lots of reasons, some of which are listed below...

1. The code is difficult to read and write. For every single ASN1 structure
(e.g. SEQUENCE) four functions need to be written for new, free, encode and
decode operations. This is a very painful and error-prone operation. Very few
people have ever written any OpenSSL ASN1 and those that have usually wish
they hadn't.

2. Partly because of 1. the code is bloated and takes up a disproportionate
amount of space. The SEQUENCE encoder is particularly bad: it essentially
contains two copies of the same operation, one to compute the SEQUENCE length
and the other to encode it.

3. The code is memory based: that is, it expects to be able to read the whole
structure from memory. This is fine for small structures but if you have a
(say) 1GB PKCS#7 signedData structure it isn't such a good idea...

4. The code for the ASN1 IMPLICIT tag is evil. It is handled by temporarily
changing the tag to the expected one, attempting to read it, then changing it
back again. This means that decode buffers have to be writable even though they
are ultimately unchanged. This gets in the way of constification.

5. The handling of EXPLICIT isn't much better. It adds a chunk of code into
the decoder and encoder for every EXPLICIT tag.

6. APPLICATION and PRIVATE tags aren't supported at all.

7. Even IMPLICIT isn't complete: there is no support for implicitly tagged
types that are not OPTIONAL.

8. Much of the code assumes that a tag will fit in a single octet. This is
only true if the tag is 30 or less (mercifully tags over 30 are rare).

9. The ASN1 CHOICE type has to be largely handled manually: there aren't any
macros that properly support it.

10. Encoders have no concept of OPTIONAL and have no error checking. If the
passed structure contains a NULL in a mandatory field it will not be encoded,
resulting in an invalid structure.

11. It is tricky to add ASN1 encoders and decoders to external applications.

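To make points 6 and 8 a bit more concrete, here is a minimal sketch of how
DER identifier octets are laid out for all four tag classes and for tags
above 30. It is not taken from the OpenSSL sources: put_identifier() and the
CLS_* names are made up for this illustration. Calling it with a NULL buffer
returns just the length, which is one way of avoiding the duplicated
"compute length" and "encode" code mentioned in point 2.

```c
#include <stddef.h>

/* Class bits of the first identifier octet (names are illustrative). */
enum {
    CLS_UNIVERSAL   = 0x00,
    CLS_APPLICATION = 0x40,
    CLS_CONTEXT     = 0x80,
    CLS_PRIVATE     = 0xc0
};

/*
 * Write the identifier octets for (cls, constructed, tag) to out and return
 * the number of octets. If out is NULL only the length is returned.
 * Hypothetical helper, not an OpenSSL function.
 */
size_t put_identifier(unsigned char *out, int cls, int constructed,
                      unsigned long tag)
{
    unsigned char first = (unsigned char)(cls | (constructed ? 0x20 : 0));
    unsigned long t;
    size_t n = 0, i;

    if (tag <= 30) {
        /* Low tag number form: everything fits in a single octet. */
        if (out != NULL)
            out[0] = (unsigned char)(first | tag);
        return 1;
    }

    /* High tag number form: 0x1f marker followed by base 128 digits. */
    for (t = tag; t != 0; t >>= 7)
        n++;
    if (out != NULL) {
        out[0] = (unsigned char)(first | 0x1f);
        for (i = 0; i < n; i++) {
            unsigned char digit =
                (unsigned char)((tag >> (7 * (n - 1 - i))) & 0x7f);
            /* All digits except the last carry the continuation bit. */
            out[1 + i] = (unsigned char)(digit | (i + 1 < n ? 0x80 : 0));
        }
    }
    return 1 + n;
}
```

A universal constructed SEQUENCE (tag 16) comes out as the familiar single
octet 0x30, whereas a context specific tag of 31 already needs two octets.
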
Template model
==============

One of the major problems with revision is the sheer volume of the ASN1 code.
Attempts to change (for example) the IMPLICIT behaviour would result in a
modification of *every* single decode function.

I decided to adopt a template based approach. I'm using the term 'template'
in a manner similar to SNACC templates: it has nothing to do with C++
templates.

A template is a description of an ASN1 module as several constant C structures.
It describes in a machine readable way exactly how the ASN1 structure should
behave. If this template contains enough detail then it is possible to write
versions of new, free, encode, decode (and possibly other operations) that
operate on templates.

Instead of having to write code to handle each operation only a single
template needs to be written. If new operations are needed (such as a 'print'
operation) only a single new template based function needs to be written
which will then automatically handle all existing templates.

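The idea can be sketched with a toy example. What follows is NOT the layout
OpenSSL actually uses for its templates: field_tmpl, item_tmpl, the tmpl_*()
functions and toy_cert are all invented here, and only two field kinds are
handled. It just shows constant C structures describing a type, and generic
new, free and print routines that work from the description alone.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Field kinds the generic routines understand (toy set). */
typedef enum { F_LONG, F_STRING } field_kind;

typedef struct {
    const char *name;    /* field name, used by the print operation */
    field_kind  kind;
    size_t      offset;  /* offsetof() the field in the C structure */
} field_tmpl;

typedef struct {
    const char       *name;
    size_t            size;     /* sizeof the C structure */
    const field_tmpl *fields;
    size_t            nfields;
} item_tmpl;

/* Generic "new": a zeroed structure of the size given by the template. */
void *tmpl_new(const item_tmpl *it)
{
    return calloc(1, it->size);
}

/* Generic "free": release owned strings, then the structure itself. */
void tmpl_free(const item_tmpl *it, void *obj)
{
    size_t i;

    if (obj == NULL)
        return;
    for (i = 0; i < it->nfields; i++)
        if (it->fields[i].kind == F_STRING)
            free(*(char **)((char *)obj + it->fields[i].offset));
    free(obj);
}

/* Generic "print": returns the length written, or -1 if buf is too small. */
int tmpl_print(const item_tmpl *it, const void *obj, char *buf, size_t len)
{
    size_t i, used;
    int n = snprintf(buf, len, "%s:", it->name);

    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;
    for (i = 0; i < it->nfields; i++) {
        const field_tmpl *f = &it->fields[i];
        const char *p = (const char *)obj + f->offset;

        if (f->kind == F_LONG) {
            n = snprintf(buf + used, len - used, " %s=%ld", f->name,
                         *(const long *)p);
        } else {
            const char *s = *(char *const *)p;
            n = snprintf(buf + used, len - used, " %s=%s", f->name,
                         s != NULL ? s : "(absent)");
        }
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}

/* One concrete type: all that has to be written for it is the template. */
typedef struct {
    long  version;
    char *subject;
} toy_cert;

static const field_tmpl toy_cert_fields[] = {
    { "version", F_LONG,   offsetof(toy_cert, version) },
    { "subject", F_STRING, offsetof(toy_cert, subject) }
};

const item_tmpl toy_cert_it = {
    "toy_cert", sizeof(toy_cert), toy_cert_fields,
    sizeof(toy_cert_fields) / sizeof(toy_cert_fields[0])
};
```

Adding a second type means writing another pair of tables, while adding a
new operation means writing one more function in the style of tmpl_print().
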
Plans for revision
==================

The revision will consist of the following steps. Other than the first two
these can be handled in any order.

o Design and write template new, free, encode and decode operations, initially
  memory based. *DONE*

o Convert existing ASN1 code to template form. *IN PROGRESS*

o Convert an existing ASN1 compiler (probably SNACC) to output templates
  in OpenSSL form.

o Add support for BIO based ASN1 encoders and decoders to handle large
  structures, initially blocking I/O.

o Add support for non blocking I/O: this is quite a bit harder than blocking
  I/O.

o Add new ASN1 structures, such as OCSP, CRMF, S/MIME v3 (CMS), attribute
  certificates etc.

Description of major changes
============================

The BOOLEAN type now takes three values. 0xff is TRUE, 0 is FALSE and -1 is
absent. The meaning of absent depends on the context. If for example the
boolean type is DEFAULT FALSE (as in the case of the critical flag for
certificate extensions) then -1 is FALSE, if DEFAULT TRUE then -1 is TRUE.
Usually the value will only ever be read via an API which will hide this from
an application.

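The rule for resolving the three values can be written down in a few lines.
This is only a sketch of the logic described above: toy_boolean and
toy_boolean_get() are names invented for the example and are not part of
the OpenSSL API.

```c
/* Stored values as described above: 0xff is TRUE, 0 is FALSE, -1 is absent. */
typedef int toy_boolean;

/*
 * Resolve a stored value against the DEFAULT of the field it belongs to.
 * Returns 1 for TRUE and 0 for FALSE. Hypothetical helper.
 */
int toy_boolean_get(toy_boolean stored, int default_value)
{
    if (stored == -1)
        return default_value != 0;  /* absent: the DEFAULT applies */
    return stored != 0;
}
```

For the critical flag of a certificate extension (DEFAULT FALSE) an absent
value therefore resolves to 0.
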
There is an evil bug in the old ASN1 code that mishandles OPTIONAL with
SEQUENCE OF or SET OF. These are both implemented as a STACK structure. The
old code would omit the structure if the STACK was NULL (which is fine) or if
it had zero elements (which is NOT OK). This causes problems because an empty
SEQUENCE OF or SET OF will result in an empty STACK when it is decoded but when
it is encoded it will be omitted, resulting in different encodings. The new code
only omits the encoding if the STACK is NULL; if it contains zero elements it
is encoded as empty. There is an additional problem though: because an empty
STACK was omitted, sometimes the corresponding *_new() function would
initialize the STACK to empty so an application could immediately use it. With
the new code the STACK starts out as NULL, so this won't work and a new
STACK should be allocated first. One instance of this is the X509_CRL list of
revoked certificates: a helper function X509_CRL_add0_revoked() has been added
for this purpose.

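The difference between NULL and empty is easy to see in a toy encoder. The
sketch below is an illustration only: toy_stack stands in for a real STACK
and records nothing but an element count, every element is encoded as an
ASN1 NULL (05 00), and toy_encode_optional_seq() is an invented name.

```c
#include <stddef.h>

/* Stand-in for a STACK: only the number of elements matters here. */
typedef struct {
    size_t num;
} toy_stack;

/*
 * Encode an OPTIONAL SEQUENCE OF NULL following the new behaviour.
 * out must have room for 2 + 2 * sk->num octets. Returns the number of
 * octets written, or -1 if a long form length would be needed.
 */
int toy_encode_optional_seq(const toy_stack *sk, unsigned char *out)
{
    size_t i;

    if (sk == NULL)
        return 0;                   /* absent: nothing is encoded */
    if (sk->num > 63)
        return -1;                  /* keep to short form lengths */
    out[0] = 0x30;                  /* SEQUENCE, constructed */
    out[1] = (unsigned char)(2 * sk->num);
    for (i = 0; i < sk->num; i++) {
        out[2 + 2 * i] = 0x05;      /* NULL */
        out[3 + 2 * i] = 0x00;
    }
    return (int)(2 + 2 * sk->num);
}
```

A NULL stack produces no octets at all, while an empty one produces 30 00,
so decoding and re-encoding an empty SEQUENCE OF gives back the same bytes.
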
The X509_ATTRIBUTE structure used to have an element called 'set' which took
the value 1 if the attribute value was a SET OF or 0 if it was a single value.
Due to the behaviour of CHOICE in the new code this has been changed to a field
called 'single' which is 0 for a SET OF and 1 for a single value. The old field
has been deleted to deliberately break source compatibility. Since this
structure is normally accessed via higher level functions this shouldn't break
too much.

The X509_REQ_INFO certificate request info structure no longer has a field
called 'req_kludge'. This used to be set to 1 if the attributes field was
(incorrectly) omitted. You can check to see if the field is omitted now by
checking if the attributes field is NULL. Similarly if you need to omit
the field then free attributes and set it to NULL.

The top level 'detached' field in the PKCS7 structure is no longer set when
a PKCS#7 structure is read in. PKCS7_is_detached() should be called instead.
The behaviour of PKCS7_get_detached() is unaffected.

The values of 'type' in the GENERAL_NAME structure have changed. This is
because the old code used the ASN1 initial octet as the selector. The new
code uses the index in the ASN1_CHOICE template.

The DIST_POINT_NAME structure has changed to be a true CHOICE type.

typedef struct DIST_POINT_NAME_st {
    int type;
    union {
        STACK_OF(GENERAL_NAME) *fullname;
        STACK_OF(X509_NAME_ENTRY) *relativename;
    } name;
} DIST_POINT_NAME;

This means that name.fullname or name.relativename should be set
and type reflects the option. That is if name.fullname is set then
type is 0 and if name.relativename is set type is 1.

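As a rough illustration of keeping the selector and the union in step, here
is a stand-alone sketch. It does not use the real OpenSSL types: toy_names
and toy_rdn are placeholders for the two STACK types, and the toy_dpn_*()
setters are invented for the example.

```c
/* Placeholders for STACK_OF(GENERAL_NAME) and STACK_OF(X509_NAME_ENTRY). */
typedef struct { int unused; } toy_names;
typedef struct { int unused; } toy_rdn;

/* Same shape as DIST_POINT_NAME above, using the placeholder types. */
typedef struct {
    int type;
    union {
        toy_names *fullname;
        toy_rdn   *relativename;
    } name;
} toy_dist_point_name;

/* Select the fullname option: type 0. */
void toy_dpn_set_fullname(toy_dist_point_name *d, toy_names *names)
{
    d->type = 0;
    d->name.fullname = names;
}

/* Select the relativename option: type 1. */
void toy_dpn_set_relativename(toy_dist_point_name *d, toy_rdn *rdn)
{
    d->type = 1;
    d->name.relativename = rdn;
}
```

Code reading the structure should look at type first and only then touch
the matching member of the union.
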
With the old code using the i2d functions would typically involve:

unsigned char *buf, *p;
int len;
/* Find length of encoding */
len = i2d_SOMETHING(x, NULL);
if(len <= 0) {
    /* Encoding error */
}
/* Allocate buffer */
buf = OPENSSL_malloc(len);
if(buf == NULL) {
    /* Malloc error */
}
/* Use temp variable because p gets updated to point to end of
 * encoding.
 */
p = buf;
i2d_SOMETHING(x, &p);

Using the new i2d you can also do:

unsigned char *buf = NULL;
int len;
len = i2d_SOMETHING(x, &buf);
if(len < 0) {
    /* Encoding or malloc error */
}

and it will automatically allocate and populate a buffer with the
encoding. After this call 'buf' will point to the start of the
encoding which is len bytes long. Since the buffer is allocated for you,
it should be released with OPENSSL_free() when it is no longer needed.

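The three ways of calling an i2d function can be modelled with a toy
encoder. This is a sketch of the calling convention only, not the OpenSSL
implementation: toy_octet and i2d_toy_octet() are invented, the encoding is
a fixed three octets, and plain malloc() is used where a real implementation
would presumably go through OPENSSL_malloc().

```c
#include <stdlib.h>

/* Hypothetical type holding a single octet of data. */
typedef struct {
    unsigned char value;
} toy_octet;

/*
 * Encode x as a one octet OCTET STRING (04 01 vv).
 *   out == NULL    return the length only
 *   *out == NULL   allocate a buffer; *out is left at its start
 *   otherwise      write at *out and advance *out past the encoding
 * Returns the encoded length, or -1 on allocation failure.
 */
int i2d_toy_octet(const toy_octet *x, unsigned char **out)
{
    const int len = 3;
    unsigned char *p;

    if (out == NULL)
        return len;
    if (*out == NULL) {
        p = malloc((size_t)len);
        if (p == NULL)
            return -1;
        *out = p;
    } else {
        p = *out;
        *out += len;
    }
    p[0] = 0x04;        /* OCTET STRING */
    p[1] = 0x01;        /* length */
    p[2] = x->value;
    return len;
}
```

Note the asymmetry: when the caller supplies the buffer the pointer ends up
after the encoding, whereas an allocated buffer is returned pointing at the
start so that it can be used and freed directly.
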