This is the combined README for pam_otp_auth, a PAM module, and
rlm_otp, a FreeRADIUS module. See the COPYRIGHT file, included with
this distribution, for copyright and redistribution information.
If you have questions not answered in this doc, please contact
Frank Cusack, <fcusack@fcusack.com>. Please send bug reports to

FreeRADIUS is available at <http://www.freeradius.org/>. The PAM
module is available at <http://www.fcusack.com/>.

In addition to this module, you need the state manager software.
The state manager primarily handles global state (shared across all
of your authentication servers) associated with synchronous mode
tokens (see section 4). It also handles other bookkeeping data
used to prevent passcode guessing attacks. The state manager
is available from <http://www.fcusack.com/>.

Tokens that use ANSI X9.9 or HOTP (these two cover all tokens made
today, except for RSA SecurID) can theoretically be authenticated
via this module. In practice, however, only the TRI-D Systems and
PassGo/Axent "Defender Handheld" tokens are functional, and due to
the weakness of X9.9 (see next section) use of the Defender token
should, in a just world, cause you to lose your job.

Various CRYPTOCard tokens are fully supported, but with the problem
that you need to reverse engineer either the token programming
protocol or the keystore encryption. Both of these are quite
possible; I've done it myself and have done a very large CRYPTOCard
deployment at my former employer. However, don't ask me for help
with this; your message will simply be trashed.

ActivCard can theoretically be supported; however, you'll need to
purchase their dev kit ($$$) due to patents they hold on their
specific X9.9 implementation. For exorbitant fees, I can write the
code for you. Note that you may not redistribute any such code,
again due to patent issues.

Other vendors' tokens are also theoretically supported, with the
additional problem that you'll need to reverse engineer their
synchronous challenge generation algorithm. Again, I can help you
with this for an exorbitant fee.

I *strongly* discourage the use of "soft tokens" or PDA tokens.
These are easily compromisable, since the key is insufficiently

Throughout the remainder of this document, wherever applicable I
point out differences in the two main tokens supported, TRI-D and

2. STRONG WARNING SECTION

ANSI X9.9 has been withdrawn as a standard, due to the weakness
of DES. An attacker can learn the token's secret by observing
two challenge/response pairs. See ANSI document X9 TG-24-1999,
<http://www.x9.org/docs/TG24_1999.pdf>.

For X9.9 tokens, the obvious fix is to not issue a challenge, so that
the attacker does not have access to the plaintext. This is possible
since most X9.9 tokens support a synchronous mode; the only exception
I know of is the PassGo/Axent Defender Handheld.

For this reason, the default configuration of this module effectively
disables pure challenge/response (hereafter: async) mode.

In practice, async mode authentication is a poor user experience and
is exceedingly rare. No new token deployments should use async mode.

Does your token use X9.9? Ask your vendor. (Don't ask if they use
X9.9; ask what response generation method they use. If they won't
give you an answer, email me and I'll tell you what they use. Then
make sure you don't do business with them.)

CRYPTOCard uses X9.9; TRI-D uses HOTP.

You'll need to have DES and SHA-1 libraries in order to build and
use this module. Currently, only OpenSSL is supported.

You will also need /dev/urandom available. It is present on all
Linux and *BSD systems and on Solaris 9+. For Solaris 8, you'll need
to install patch 112438-01 (sparc) or 112439-01 (x86). Information
for other

You'll also need to write a site-specific challenge transform in
order to use async mode. For CRYPTOCard, you might need async mode to
sync the user's token with the server initially. More on this below.
For TRI-D, async mode is not supported.

In the very old days, the server would present a challenge to the
user, who would enter it into their token and give the server the
response. We call this async mode. This is "klunky" by modern
standards of usability, and for X9.9 tokens is actually unsafe given
that DES is so weak. As noted above, CRYPTOCard supports async mode;
TRI-D does not.

Luckily, most tokens support a synchronous mode which lets the user
skip the part where they enter the challenge. In this mode, the
token and the server generate a "next challenge" which is derived
from an event and/or time counter and is implicit. Besides offering
better security, this mode also has the advantage of giving a much
better user experience. Both the TRI-D and CRYPTOCard tokens have a

For some tokens, the token can display the synchronous challenge.
The idea here is that the server would still present a challenge
to the user, but the user wouldn't have to enter it--they'd just
have to verify it matches. Then they can safely just press some
function key to obtain the response. From a security perspective,
this is no better than pure async mode, since an attacker can still
observe the plaintext/ciphertext pair.

So when operating in this mixed async-sync mode, instead of presenting
the synchronous challenge, the server ALWAYS displays a random
challenge. Instead of verifying that the challenge matches the token
display, the user should just skip past the token challenge display
to obtain the response. This might be confusing; you will need to
train users. Even with training, they will forget. Be warned!
This mixed mode is useless and stupid. If you can disable token
support for this, do so.

For other tokens, the token does not display the synchronous
challenge--only the response is displayed. This is a bit easier on
the user; they won't be confused as to which number to enter for the
response. I can't recommend this mode highly enough. With tokens
like this, you should configure the server to likewise not present
a challenge (this is the default). To the user, this looks very much
like a normal password authentication.

Older CRYPTOCard tokens only supported the mixed async-sync mode.
Newer ones support both sync modes. TRI-D supports only the "pure"

It's worth repeating that async mode is vastly inferior to either
sync mode, and the mixed async-sync mode is vastly inferior to the
pure sync mode. In addition to the shielding of the plaintext,
and ease of use, another advantage of sync mode is that it supports
authentication methods where a challenge cannot be presented to the
user, e.g. PPTP without EAP.

In sync mode, there are two ways to generate the implied challenge:
event based or time based. "Events" are token operations--each
time the token is activated an event counter advances.
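
To make the event counter idea concrete, here is a minimal sketch of
HOTP (RFC 4226) in Python, where the implied challenge is simply the
counter value. This is an illustration only, not the code this module
uses; the function name hotp and the 6-digit default are assumptions,
and the secret is the published RFC 4226 test key.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute the RFC 4226 HOTP value for one event counter value."""
    # HMAC-SHA-1 over the counter encoded as 8 bytes, big-endian.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks 4 bytes.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 4226 Appendix D test secret. Each token activation advances the
# counter, so three activations yield three different passcodes.
secret = b"12345678901234567890"
first_three = [hotp(secret, counter) for counter in range(3)]
print(first_three)  # ['755224', '287082', '359152']
```

The server can only reproduce a passcode if it knows (or can search
for) the counter value the token used, which is why counter tracking
matters so much for event synchronous tokens.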

CRYPTOCard is event synchronous; TRI-D is both time and event

Event synchronous tokens have the problem that if users play with
the token as a toy (say, to generate winning lottery numbers),
the server has no way to know this and so it has a different idea
of the counter value. Since there are typically only 1-10 million
passcodes (6-7 digit decimal display), the server cannot simply test
"many" passcodes in an attempt to discover the event counter value,
because that would make a guessing attack trivial with such a small
response space. Our solution for this is noted in section 6, below.

Time synchronous tokens solve this problem quite nicely by eliminating
the user from the equation. As PEBKAC is generally the worst kind
of problem, and the most difficult to solve, this is clearly better
than event synchronous. However, it is not without its own problems.
First, a real time clock must be on the token, which today is not
a technical hurdle, but it is an added expense. To keep costs low,
the clock on the token keeps poor time, so the server has to track
drift. Also, the token is typically exposed to adverse environmental
conditions, which (especially in such a small and necessarily cheap
package) affect the clock, so the drift is not constant.

But even varying clock drift is not especially difficult to handle on
the server. A worse problem is that the timer interval (normally one
minute) also limits login rate. Even "normal" users commonly want
to log in more frequently than this. Making users wait one minute to
log in again is practically forever. TRI-D addresses this with the
activation button on the token. Each time it is pressed an event
counter is combined with the time counter to generate a new passcode.
The event counter is reset whenever the time counter advances.

5. SITE-SPECIFIC CHALLENGE TRANSFORM

Since the normal mode of operation will be sync mode, we really only
have async mode support for "last resort" user resync of the event
counter. (For "normal" resync see the rwindow description

Note that only some tokens support "user" sync/resync. For others,
admin intervention is required for resync. CRYPTOCard supports
this; TRI-D does not (since it is time-based, there is no resync).

Since pure challenge/response with X9.9 is unsafe, I came up with the
concept of the "site-specific challenge transform". For the user,
this means that instead of entering the challenge as presented to
them, they enter something based on the challenge. For example,
a simple transform would be to enter the challenge backwards; if
the server presents "123456" the user enters "654321". This has
the effect that an observer does not have access to the plaintext.

This is security through obscurity, and is not really "safe", but
for an outsider it may present at least some barrier. Even though
it presents no advantage in the face of a determined attacker,
I recommend using it. It may stop a more opportunistic attacker
and isn't difficult to use.

The server logs each time a user authenticates via async mode,
so I recommend a log scanner which alerts you to this. You should
reprogram tokens when the user authenticates via async mode.

otp_site.c implements the site-specific challenge transform.
The default transform is to replace the challenge with the text
"DISABLED". This effectively disables async mode (the user will
not be able to enter this into their token).
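
As a rough illustration of where the transform sits, the following
Python sketch models the documented default behavior. It is not the
code in otp_site.c; the function names and the single-argument
signature are hypothetical, and only the "DISABLED" text and the
MAX_CHALLENGE_LEN limit of 32 come from this README.

```python
MAX_CHALLENGE_LEN = 32  # limit named in this README


def default_transform(challenge: str) -> str:
    """Model of the documented default: discard the challenge entirely."""
    # The user cannot key this text into a token, so async mode is
    # effectively disabled.
    return "DISABLED"


def effective_challenge(challenge: str, transform=default_transform) -> str:
    """Return the string the expected response would be computed over.

    The server shows `challenge` to the user, but checks the response
    against the transformed value (hypothetical flow).
    """
    transformed = transform(challenge)
    if len(transformed) > MAX_CHALLENGE_LEN:
        raise ValueError("transformed challenge exceeds MAX_CHALLENGE_LEN")
    return transformed


print(effective_challenge("123456"))                 # DISABLED
print(effective_challenge("123456", lambda c: c))    # 123456 (no transform)
```

A site would substitute its own secret function for default_transform;
the identity function shown in the last line corresponds to running
with no transform at all.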

DO NOT use the transform suggested above, reversing the challenge.
Having been published here, it is now exceptionally weak. An example
of a possibly strong transform is to have the user enter the square
of the challenge. The VASCO DigiPass 500 is also a [regular]
calculator, so this could be a good one if you use that token. Well,
there's no support for that token, and now that I've mentioned it,
it is another exceptionally weak transform, but you get the idea.

Note that older CRYPTOCard RB-1 tokens support arbitrarily
long challenge strings. You should take advantage of this when
implementing your transform. You will still have to stay under
MAX_CHALLENGE_LEN digits. (This is why MAX_CHALLENGE_LEN is set to 32
even though the displayed challenge would generally be much shorter.)

If you do not believe applying a transform gives any advantage, you
can just comment out the single line of code there. This actually
may have some benefit, since your users don't need to be trained.
I can guarantee your most annoying user will complain when they
can't remember what they really are supposed to enter into the token.
Also, this can be safe if you diligently reprogram tokens when async
mode has been used. You might automatically disable a token after
two async authentications.

Most of the configuration is documented fairly well in the sample
otp.conf file (FreeRADIUS) or man page (PAM). I will only discuss

After hardfail consecutive failed login attempts, the user's
token is disabled. Because this allows a trivial DoS attack,
the default value is 0, and instead we recommend using softfail.

After softfail consecutive failed login attempts, the user is put
into "delay mode", where they are unable to log in for a delay which
increases with each failed attempt.

It is critically important to have these options since the
passcode (response) space is so small. Without a delay/lockout,
it would be trivially easy for an attacker to just try every
possible passcode. With the default softfail setting of 5, an
attacker could try, at most, ~50 passcodes/day. No indication
is given to the user that they are in delay mode (except that
a valid passcode doesn't work), further thwarting an attacker,
albeit at some small cost to the legitimate user.
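
The effect of an increasing delay on guessing rate can be sketched as
follows. The schedule here (60 seconds, doubling per failure, capped
at one hour) is purely hypothetical and is NOT the module's actual
formula, so the resulting count will not match the ~50/day figure
above; the point is only that a growing delay caps an attacker at a
few dozen guesses per day against a space of millions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def delay_seconds(failures, softfail=5, base=60, cap=3600):
    """Lockout delay after `failures` consecutive bad passcodes.

    Hypothetical schedule: no delay until softfail is reached, then
    `base` seconds, doubling with each further failure, up to `cap`.
    """
    if base <= 0:
        raise ValueError("base must be positive")
    if failures < softfail:
        return 0
    return min(base * 2 ** (failures - softfail), cap)


def guesses_per_day(softfail=5, base=60, cap=3600):
    """Count the wrong guesses that fit into one day under that schedule."""
    elapsed = 0
    failures = 0
    while True:
        # Wait out the current delay before the next guess is accepted.
        elapsed += delay_seconds(failures, softfail, base, cap)
        if elapsed >= SECONDS_PER_DAY:
            return failures
        failures += 1  # one more wrong guess submitted


print(guesses_per_day())
```

Changing base or cap shows how sensitive the daily guess budget is to
the schedule, which is worth checking against the real values in the
module source before relying on any particular number.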

Some tokens have what we call a "hard PIN"; users enter a PIN into
the token to activate it. This has the advantage that only the
user knows the PIN, and that it is only entered into a secure
device; however, it has [token] UI challenges.

For usability reasons, other tokens have a constantly active
display and the user enters a "soft PIN" as part of the passcode.
This has the advantage of a better UI, but has the disadvantages
that the PIN is susceptible to capture, which can reduce the
token to a single factor device; and that the server admins know
the PIN. (Note that it doesn't matter for hard PIN devices that
admins don't know the PIN, since they know the token secret;
the loss incurred by admin exposure is not for security of the
device, but compromise of personal information.)

The prepend_pin setting toggles whether the user must prepend or
append the soft PIN; the default is to prepend. Note that hard
PIN devices can utilize a soft PIN as well.
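
A small sketch of what prepend versus append means for the string the
user types. The function name and the idea that the server knows the
PIN length are assumptions for illustration; only the prepend_pin
setting and its default come from this README.

```python
def split_passcode(passcode: str, pin_len: int, prepend_pin: bool = True):
    """Split the typed passcode into (soft PIN, token response)."""
    if pin_len < 0 or len(passcode) <= pin_len:
        raise ValueError("passcode too short for the configured PIN")
    if pin_len == 0:
        return "", passcode
    if prepend_pin:
        # PIN first, then the response shown on the token.
        return passcode[:pin_len], passcode[pin_len:]
    # Response first, then the PIN.
    return passcode[-pin_len:], passcode[:-pin_len]


print(split_passcode("1234755224", 4))                     # ('1234', '755224')
print(split_passcode("7552241234", 4, prepend_pin=False))  # ('1234', '755224')
```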

CRYPTOCard supports a hard PIN; the biometric input on the TRI-D
3-factor card is roughly equivalent to a hard PIN.

ewindow_size: (event window)
For event-synchronous-only tokens (CRYPTOCard), this is how far
out of [event] sync the server can get with the token. The value
is how far the user can be ahead of the server--essentially
how many times the user can play with the token. You'll want
to set this to at least 1 or 2, in case the user mistypes the
response and the token turns off before he is able to try again.
A more reasonable value is 5.
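
The event window check can be sketched as a bounded search over
upcoming counter values. This is an illustration, not the module's
code: the function name is hypothetical, and the exact window
boundary (the next ewindow_size responses, starting at the server's
next counter) is an assumption taken from the worked example later
in this README.

```python
import hmac


def match_in_event_window(response, next_counter, ewindow_size, gen_response):
    """Return the counter whose response matches, or None if none does."""
    for counter in range(next_counter, next_counter + ewindow_size):
        expected = gen_response(counter)
        # Constant-time comparison, so timing does not leak partial matches.
        if hmac.compare_digest(expected.encode(), response.encode()):
            return counter
    return None


# Toy generator: response N is just the text of N, as in the README's
# own example. A real server would compute the token algorithm here.
print(match_in_event_window("2", 1, 2, str))  # 2 (within the window)
print(match_in_event_window("3", 1, 2, str))  # None (too far ahead)
```

Keeping the window small matters because every extra counter value
tested is one more passcode an attacker's guess could hit.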

For event+time synchronous tokens (TRI-D), this value has no
meaning; the server determines how many events to test based on

This value is ignored for time-synchronous-only tokens.

Note that there is no analogous twindow_size setting; for
time synchronous (event+time or time only) tokens, the server
determines how far forward or backward to look based on card

rwindow_size/rwindow_delay: (resync window)
This is similar to ewindow_size. For event-synchronous-only
tokens (CRYPTOCard), when the user goes into delay mode (>softfail
consecutive incorrect passcodes), this extends the allowable
event window, but requires the user to enter TWO consecutive sync
responses correctly, within rwindow_delay seconds. The upside
of having to enter 2 passcodes is that the delay is overridden.

In practice, users that do have problems with the allowable
event window (and those users tend to have them consistently)
get into long lockout delays, and since no indication is given
to the user about this state, they need a way to get past it
without calling the helpdesk.

For example, say softfail=1, ewindow_size=2 and rwindow_size=8
(ignore rwindow_delay). The server's state is such that the
next 8 responses are 1, 2, ..., 8. The user, however, has played
with the token and the response showing is '3', which he enters

This is ahead of ewindow_size, so the server refuses him,
and places the user into delay mode, since softfail is only 1.
Note that even though this response is within rwindow_size events,
it is not recorded as such because when checking the passcode,
the user was /not yet/ in delay mode and so only ewindow_size
events were considered. /AFTER/ testing the passcode, the user
is /THEN/ placed into delay mode.

The user tries again immediately, using '3' again. Since the
user /is now/ in delay mode, the server would normally refuse
him (remember, we said he tried again "immediately"). Even if
the user weren't in delay mode (say, softfail is larger), the
server would still refuse him because he is too far ahead of
the normal ewindow_size window.

But since he is in delay mode, and rwindow_size is non-zero,
instead of simply rejecting responses beyond ewindow_size
events, the server looks ahead up to rwindow_size (8 in this
case) events. It sees that '3' is within rwindow_size events,
records that the user gave a correct sync response at position 3,

Now the user tries again immediately, this time using the next
response of '4'. Again, normally this would be refused since
the user is in delay mode. But because rwindow_size is set,
the server sees that '4' is within the rwindow_size window,
and that the user's previous response ('3') matches the previous
response in the window, so the user is authenticated and returned

Note that the user actually entered 3,3,4 and although the user
entered 3 correct passcodes, only the last 2 were consecutive so
this seems to match the description of this feature. However,
if the user had entered 3,4,5 he still would have had to enter
3 passcodes! Review the example to understand why.
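
The example above can be replayed with a small state machine. This
is a simplified model written for this walkthrough, not the module's
implementation: the class and attribute names are hypothetical, it
ignores rwindow_delay and the expiry of the lockout delay, it assumes
softfail is at least 1, and it uses the toy convention that response
N is the text of N.

```python
class ResyncSketch:
    """Toy model of ewindow/rwindow handling for an event-sync token."""

    def __init__(self, softfail, ewindow_size, rwindow_size, gen_response=str):
        self.softfail = softfail
        self.ewindow_size = ewindow_size
        self.rwindow_size = rwindow_size
        self.gen_response = gen_response
        self.next_counter = 1   # server's idea of the next event
        self.failures = 0       # consecutive failed attempts
        self.pending = None     # counter of last in-rwindow match, if any

    def _match(self, response, window):
        for counter in range(self.next_counter, self.next_counter + window):
            if self.gen_response(counter) == response:
                return counter
        return None

    def authenticate(self, response):
        # Delay mode is decided BEFORE this passcode is tested.
        in_delay = self.failures >= self.softfail
        window = self.rwindow_size if in_delay else self.ewindow_size
        hit = self._match(response, window)
        # Normal mode: any hit succeeds. Delay mode: the hit must directly
        # follow the previously recorded in-window response.
        if hit is not None and (not in_delay or self.pending == hit - 1):
            self.next_counter = hit + 1
            self.failures = 0
            self.pending = None
            return True
        self.pending = hit if in_delay else None
        self.failures += 1
        return False


server = ResyncSketch(softfail=1, ewindow_size=2, rwindow_size=8)
print([server.authenticate(r) for r in ("3", "3", "4")])  # [False, False, True]
```

Running the same model with 3,4,5 also needs three entries, because
the first '3' is tested before the user is in delay mode and so is
never recorded as a resync response.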

In practice, users generally enter a lot of bad passcodes to get
into softfail and then finally see what they're doing wrong, so
they do enter only 2 correct passcodes; i.e., if they are even
aware of this feature, they don't get confused about why they
had to enter the '5' part of 3,4,5.

It is recommended that you tell users to /always/ advance to
the next passcode on error, and that they should always try at
least 3 (or 4) consecutive entries before calling the helpdesk.

The Windows VPN password error dialog is confusing and is a
major source of duplicate entries, which add an extra passcode
entry to rwindow mode. Another significant source of passcode
errors is PC laptop users who have a docking station with
keyboard. Windows keeps the numlock setting when undocking,
and my experience is that one of the first things that folks do
after undocking is to VPN in. The '0' key on the number row
is a '.' instead of a '0' when numlock is on. And since the
Windows VPN dialog can't know that it's safe to display the
passcode, the user can't tell that he's misentering zeroes.
This encourages getting out of sync. Ouch.

For time synchronous tokens (event+time or time only), the
rwindow_size value has no meaning as there is no event counter
to lose track of. (Clock drift affecting the time counter is
tracked by the server.)

However, the rwindow_delay value does have meaning. If a user
goes into softfail (maybe by repeatedly trying their long-term
password or by a password guessing attack), they can still get
out of delay mode by entering two consecutive passcodes within
rwindow_delay seconds.

Also, for TRI-D tokens, rwindow_delay has an additional meaning.
You'll need to read the state manager documentation to understand
this, but the TRI-D token supports "null state", meaning that
the admin does not have to (and in fact must not) manually
initialize state when issuing a token. State is automatically
initialized when a user first authenticates; however, the user
must authenticate twice, which uses the softfail mechanism and
thus depends on rwindow_delay. It's not quite softfail because
the user cannot simply wait for the delay period to expire and
then authenticate only once.

/etc/otppasswd, a file similar to /etc/passwd, contains usernames
and keys. See the sample otppasswd file.

All errors begin with "rlm_otp" (FreeRADIUS) or "pam_otp_auth"
(PAM). Only errors are logged; there are no "success" log messages
(besides FreeRADIUS/PAM standard messages). You will want to scan
for errors automatically or periodically.

"bad state" messages (FreeRADIUS) indicate a problem with the State
attribute, which the server uses to track async challenges. They are
all of the form "bad state for [%s]: <problem>", where <problem>

length: The length is not as expected. Could be an attempted attack,
    but more likely a network blip.
hmac: The state is protected by a cryptographic hash which could
    not be verified. This could be because you just HUP'd
expired: The state is older than maxdelay seconds. If you get a lot
    of these you may wish to increase the value.
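
To show how the three failure cases relate, here is a sketch of a
keyed-hash-protected state blob in Python. The layout (4-byte
timestamp, challenge, HMAC-SHA-1) and every name in it are
assumptions made for illustration; the real State attribute format is
defined by the module source, not by this sketch. Only the problem
names and the maxdelay setting come from this README.

```python
import hashlib
import hmac
import struct

MAC_LEN = 20  # HMAC-SHA-1 output size in bytes


def make_state(key: bytes, challenge: bytes, now: int) -> bytes:
    """Build state = timestamp || challenge || HMAC(key, timestamp || challenge)."""
    body = struct.pack(">I", now) + challenge
    return body + hmac.new(key, body, hashlib.sha1).digest()


def check_state(key: bytes, state: bytes, challenge_len: int,
                maxdelay: int, now: int):
    """Return None if the state is acceptable, else the problem name."""
    if len(state) != 4 + challenge_len + MAC_LEN:
        return "length"
    body, mac = state[:-MAC_LEN], state[-MAC_LEN:]
    expected = hmac.new(key, body, hashlib.sha1).digest()
    if not hmac.compare_digest(mac, expected):
        return "hmac"
    issued = struct.unpack(">I", body[:4])[0]
    if now - issued > maxdelay:
        return "expired"
    return None


key = b"k" * 16
state = make_state(key, b"123456", now=1000)
print(check_state(key, state, 6, maxdelay=40, now=1010))       # None
print(check_state(b"x" * 16, state, 6, maxdelay=40, now=1010)) # hmac
print(check_state(key, state, 6, maxdelay=40, now=1100))       # expired
```

In a design like this, a server whose key changes between issuing the
challenge and receiving the response would report "hmac" for states
that were perfectly valid when issued.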

Another set of messages you'll want to look out for is "valid but in
hardfail" and "valid but in softfail", which indicate a user who is
locked out due to exceeding hardfail or softfail failures.

Also, look for "[%s] authenticated in async mode" which indicates
a user with a sync mode card that used async authentication. You
may wish to reprogram these users' cards.
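
A log scanner for these messages could start from something like the
sketch below. The message fragments are the ones quoted in this
README; the surrounding line layout, the sample log lines, and all
function and pattern names are assumptions, so adjust the patterns
to whatever your syslog actually records.

```python
import re

PATTERNS = {
    "async": re.compile(r"\[(?P<user>[^\]]+)\] authenticated in async mode"),
    "bad_state": re.compile(r"bad state for \[(?P<user>[^\]]+)\]: "),
    "softfail": re.compile(r"valid but in softfail"),
    "hardfail": re.compile(r"valid but in hardfail"),
}


def scan(lines):
    """Return (kind, user) for each line worth alerting on."""
    alerts = []
    for line in lines:
        # Only lines from the two modules are of interest.
        if "rlm_otp" not in line and "pam_otp_auth" not in line:
            continue
        for kind, pattern in PATTERNS.items():
            found = pattern.search(line)
            if found:
                # user is None when the pattern does not capture one.
                alerts.append((kind, found.groupdict().get("user")))
    return alerts


# Hypothetical sample lines, for illustration only.
sample = [
    "rlm_otp: [alice] authenticated in async mode",
    "rlm_otp: bad state for [bob]: expired",
    "pam_otp_auth: valid but in softfail",
    "sshd: unrelated message",
]
print(scan(sample))
```

Feeding the "async" alerts into whatever process you use to reprogram
tokens closes the loop described above.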

Send bug reports or any other questions to Frank Cusack,
<fcusack@fcusack.com>.