Postfix Debugging Howto

-------------------------------------------------------------------------------

Purpose of this document

This document describes how to debug parts of the Postfix mail system when
things do not work according to expectation. The methods vary from making
Postfix log a lot of detail, to running some daemon processes under control of
a call tracer or debugger.

The text assumes that the Postfix main.cf and master.cf configuration files are
stored in directory /etc/postfix. You can use the command "postconf
config_directory" to find out the actual location of this directory on your
machine.

Listed in order of increasing invasiveness, the debugging techniques are as
follows:

* Look for obvious signs of trouble
* Debugging Postfix from inside
* Try turning off chroot operation in master.cf
* Verbose logging for specific SMTP connections
* Record the SMTP session with a network sniffer
* Making Postfix daemon programs more verbose
* Manually tracing a Postfix daemon process
* Automatically tracing a Postfix daemon process
* Running daemon programs with the interactive xxgdb debugger
* Running daemon programs under a non-interactive debugger
* Unreasonable behavior
* Reporting problems to postfix-users@postfix.org

Look for obvious signs of trouble

Postfix logs all failed and successful deliveries to a logfile. The file is
usually called /var/log/maillog or /var/log/mail; the exact pathname is defined
in the /etc/syslog.conf file.

When Postfix does not receive or deliver mail, the first order of business is
to look for errors that prevent Postfix from working properly:

    % egrep '(warning|error|fatal|panic):' /some/log/file | more

Note: the most important message is near the BEGINNING of the output. Error
messages that come later are less useful.

The nature of each problem is indicated as follows:

* "panic" indicates a problem in the software itself that only a programmer
  can fix. Postfix cannot proceed until this is fixed.

* "fatal" is the result of missing files, incorrect permissions, or incorrect
  configuration file settings that you can fix. Postfix cannot proceed until
  this is fixed.

* "error" reports a fatal or non-fatal error condition. Postfix cannot
  proceed until this is fixed.

* "warning" indicates a non-fatal error. These are problems that you may not
  be able to fix (such as a broken DNS server elsewhere on the network) but
  may also indicate local configuration errors that could become a problem
  later.
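
Because the earliest message matters most, it can help to cut the search output
off after the first few matches instead of paging through all of it. The
following is a minimal sketch, not part of Postfix: the function name
first_problems and the default of 10 lines are assumptions made for
illustration. It uses "grep -E", the POSIX spelling of egrep, with the same
pattern as the command shown earlier.

```shell
# Sketch: print the earliest problem reports found in a Postfix logfile.
# first_problems and its default line count are illustrative assumptions.
first_problems() {
    logfile=$1          # pathname of the logfile to search
    count=${2:-10}      # how many matching lines to show (default 10)
    # Same severity pattern as the egrep command in this document.
    grep -E '(warning|error|fatal|panic):' "$logfile" | head -n "$count"
}
```

For example, "first_problems /some/log/file 5" shows at most the first five
matching lines, oldest first.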

Debugging Postfix from inside

With Postfix version 2.1 and later you can ask Postfix to produce mail delivery
reports for debugging purposes. These reports not only show sender/recipient
addresses after address rewriting and alias expansion or forwarding, they also
show information about delivery to mailbox, delivery to non-Postfix command,
responses from remote SMTP servers, and so on.

Postfix can produce two types of mail delivery reports for debugging:

* What-if: report what would happen, but do not actually deliver mail. This
  mode of operation is requested with:

      $ /usr/sbin/sendmail -bv address...
      Mail Delivery Status Report will be mailed to <your login name>.

* What happened: deliver mail and report successes and/or failures, including
  replies from remote SMTP servers. This mode of operation is requested with:

      $ /usr/sbin/sendmail -v address...
      Mail Delivery Status Report will be mailed to <your login name>.

These reports contain information that is generated by Postfix delivery agents.
Since these run as daemon processes and do not interact with users directly,
the result is sent as mail to the sender of the test message. The format of
these reports is practically identical to that of ordinary non-delivery
notifications.

For a detailed example of a mail delivery status report, see the debugging
section at the end of the ADDRESS_REWRITING_README document.

Try turning off chroot operation in master.cf

A common mistake is to turn on chroot operation in the master.cf file without
going through all the necessary steps to set up a chroot environment. This
causes Postfix daemon processes to fail due to all kinds of missing files.

The example below shows an SMTP server that is configured with chroot turned
off:

    /etc/postfix/master.cf:
        # =============================================================
        # service type  private unpriv  chroot  wakeup  maxproc command
        #               (yes)   (yes)   (yes)   (never) (100)
        # =============================================================
        smtp      inet  n       -       n       -       -       smtpd

Inspect master.cf for any processes that have chroot operation not turned off.
If you find any, save a copy of the master.cf file, and edit the entries in
question. After executing the command "postfix reload", see if the problem has
gone away.

If turning off chrooted operation made the problem go away, then
congratulations. Leaving Postfix running in this way is adequate for most
sites. If you prefer chrooted operation, see the Postfix
BASIC_CONFIGURATION_README file for information about how to prepare Postfix
for chrooted operation.
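
Inspecting master.cf by eye is error-prone when the file is long. The sketch
below lists every service whose chroot column (the fifth field) is anything
other than "n". The function name chrooted_services is an assumption made for
illustration, and the check follows the table above, where "-" selects the
default of "yes". Newer Postfix releases (3.0 and later) reportedly default to
no chroot, so on those systems a "-" entry may be a false positive.

```shell
# Sketch: report master.cf services that do not have chroot turned off.
# chrooted_services is a hypothetical helper, not a Postfix command.
chrooted_services() {
    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        /^[ \t]*$/ { next }    # skip empty lines
        /^[ \t]/   { next }    # skip continuation lines of the previous entry
        # A service entry has at least 8 fields; field 5 is the chroot column.
        NF >= 8 && $5 != "n" { print $1 "/" $2 }
    ' "${1:-/etc/postfix/master.cf}"
}
```

Run "chrooted_services" without arguments to check /etc/postfix/master.cf, or
pass the pathname of a different master.cf file. Each output line has the form
service/type.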

Verbose logging for specific SMTP connections

In /etc/postfix/main.cf, list the remote site name or address in the
debug_peer_list parameter. For example, in order to make the software log a lot
of information to the syslog daemon for connections from or to the loopback
interface:

    /etc/postfix/main.cf:
        debug_peer_list = 127.0.0.1

You can specify one or more hosts, domains, addresses or net/masks. To make the
change effective immediately, execute the command "postfix reload".

Record the SMTP session with a network sniffer

This example uses tcpdump. In order to record a conversation you need to
specify a large enough buffer with the "-s" option or else you will miss some
or all of the packet payload.

    # tcpdump -w /file/name -s 2000 host example.com and port 25

Run this for a while, stop with Ctrl-C when done. To view the data use a binary
viewer, or ethereal, or use my tcpdumpx utility that is available from
ftp://ftp.porcupine.org/pub/debugging/.
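
The capture command is easy to mistype, so it can be wrapped in a small
function. The following is a sketch under stated assumptions: the function
names capture_smtp and show_smtp and the TCPDUMP override variable are made up
for illustration. The second function assumes a tcpdump version that supports
"-r" (read a saved capture) and "-A" (print payload as ASCII), which gives a
quick look at the SMTP dialogue without a separate viewer.

```shell
# Sketch: record and replay an SMTP conversation with tcpdump.
# Setting TCPDUMP=echo prints the arguments instead of running tcpdump.
capture_smtp() {
    # $1 = remote host, $2 = capture file; same options as shown above.
    ${TCPDUMP:-tcpdump} -w "$2" -s 2000 host "$1" and port 25
}

show_smtp() {
    # $1 = capture file; -n suppresses DNS lookups while printing.
    ${TCPDUMP:-tcpdump} -r "$1" -A -n
}
```

As root, "capture_smtp example.com /file/name" records until interrupted with
Ctrl-C, and "show_smtp /file/name" prints the recorded packets.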

Making Postfix daemon programs more verbose

Append one or more "-v" options to selected daemon definitions in
/etc/postfix/master.cf and type "postfix reload". This will cause a lot of
activity to be logged to the syslog daemon. Example:

    /etc/postfix/master.cf:
        smtp      inet  n       -       n       -       -       smtpd -v

This makes the Postfix SMTP server more verbose. To diagnose problems with
address rewriting one would specify a "-v" option for the cleanup(8) and/or
trivial-rewrite(8) daemon, and to diagnose problems with mail delivery one
would specify a "-v" option for the qmgr(8) or oqmgr(8) queue manager, or for
the lmtp(8), local(8), pipe(8), smtp(8), or virtual(8) delivery agent.

Manually tracing a Postfix daemon process

Many systems allow you to inspect a running process with a system call tracer.
For example:

    # trace -p process-id       (SunOS 4)
    # strace -p process-id      (Linux and many others)
    # truss -p process-id       (Solaris, FreeBSD)
    # ktrace -p process-id      (generic 4.4BSD)

Even more informative are traces of system library calls. Examples:

    # ltrace -p process-id      (Linux, also ported to FreeBSD and BSD/OS)
    # sotruss -p process-id     (Solaris)

See your system documentation for details.

Tracing a running process can give valuable information about what a process is
attempting to do. This is as much information as you can get without running an
interactive debugger program, as described in a later section.
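
When the same instructions have to work on several platforms, the choice of
system call tracer can be derived from the output of "uname -s". The sketch
below is an illustration only: the function name tracer_for is hypothetical,
and the mapping is an assumption based on the table above (NetBSD and OpenBSD
stand in for "generic 4.4BSD"). Systems not listed make the function fail
rather than guess.

```shell
# Sketch: pick a system call tracer by operating system name.
# tracer_for is a hypothetical helper; the mapping is an assumption.
tracer_for() {
    # $1 = system name; defaults to the local "uname -s" output.
    case "${1:-$(uname -s)}" in
        Linux)           echo strace ;;
        SunOS|FreeBSD)   echo truss ;;
        NetBSD|OpenBSD)  echo ktrace ;;
        *)               return 1 ;;    # unknown system: no suggestion
    esac
}
```

As root, "$(tracer_for) -p process-id" then attaches the selected tracer to the
given process. Note that ktrace records into a file that is displayed with
kdump, rather than printing to the terminal.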

Automatically tracing a Postfix daemon process

Postfix can attach a call tracer whenever a daemon process starts. Call tracers
come in several kinds.

1. System call tracers such as trace, truss, strace, or ktrace. These show the
   communication between the process and the kernel.

2. Library call tracers such as sotruss and ltrace. These show calls of
   library routines, and give a better idea of what is going on within the
   process.

Append a -D option to the suspect command in /etc/postfix/master.cf, for
example:

    /etc/postfix/master.cf:
        smtp      inet  n       -       n       -       -       smtpd -D

Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
the call tracer of your choice, for example:

    /etc/postfix/main.cf:
        debugger_command =
             PATH=/bin:/usr/bin:/usr/local/bin;
             (truss -p $process_id 2>&1 | logger -p mail.info) & sleep 5

Type "postfix reload" and watch the logfile.

Running daemon programs with the interactive xxgdb debugger

If you have X Windows installed on the Postfix machine, then an interactive
debugger such as xxgdb can be convenient.

Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
xxgdb:

    /etc/postfix/main.cf:
        debugger_command =
             PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
             xxgdb $daemon_directory/$process_name $process_id & sleep 5

Be sure that gdb is in the command search path, and export XAUTHORITY so that X
access control works, for example:

    % setenv XAUTHORITY ~/.Xauthority           (csh syntax)
    $ export XAUTHORITY=$HOME/.Xauthority       (sh syntax)

Append a -D option to the suspect daemon definition in /etc/postfix/master.cf,
for example:

    /etc/postfix/master.cf:
        smtp      inet  n       -       n       -       -       smtpd -D

Stop and start the Postfix system. This is necessary so that Postfix runs with
the proper XAUTHORITY and DISPLAY settings.

Whenever the suspect daemon process is started, a debugger window pops up and
you can watch in detail what happens.

Running daemon programs under a non-interactive debugger

If you do not have X Windows installed on the Postfix machine, or if you are
not familiar with interactive debuggers, then you can try to run gdb in
non-interactive mode, and have it print a stack trace when the process crashes.

Edit the debugger_command definition in /etc/postfix/main.cf so that it invokes
gdb:

    /etc/postfix/main.cf:
        debugger_command =
            PATH=/bin:/usr/bin:/usr/local/bin; export PATH; (echo cont;
            echo where) | gdb $daemon_directory/$process_name $process_id 2>&1
            >$config_directory/$process_name.$process_id.log & sleep 5

Append a -D option to the suspect daemon in /etc/postfix/master.cf, for
example:

    /etc/postfix/master.cf:
        smtp      inet  n       -       n       -       -       smtpd -D

Type "postfix reload" to make the configuration changes effective.

Whenever a suspect daemon process is started, an output file is created, named
after the daemon and process ID (for example, smtpd.12345.log). When the
process crashes, a stack trace (with output from the "where" command) is
written to its logfile.

Unreasonable behavior

Sometimes the behavior exhibited by Postfix just does not match the source
code. Why can a program deviate from the instructions given by its author?
There are two possibilities.

* The compiler has erred. This rarely happens.

* The hardware has erred. Does the machine have ECC memory?

In both cases, the program being executed is not the program that was supposed
to be executed, so anything could happen.

There is a third possibility:

* Bugs in system software (kernel or libraries).

Hardware-related failures usually do not reproduce in exactly the same way
after power cycling and rebooting the system. There's little Postfix can do
about bad hardware. Be sure to use hardware that at the very least can detect
memory errors. Otherwise, Postfix will just be waiting to be hit by a bit
error. Critical systems deserve real hardware.

When a compiler makes an error, the problem can be reproduced whenever the
resulting program is run. Compiler errors are most likely to happen in the code
optimizer. If a problem is reproducible across power cycles and system reboots,
it can be worthwhile to rebuild Postfix with optimization disabled, and to see
if optimization makes a difference.

In order to compile Postfix with optimizations turned off:

    % make makefiles OPT=

This produces a set of Makefiles that do not request compiler optimization.

Once the makefiles are set up, rebuild the software in the usual way.

If the problem goes away, then it is time to ask your vendor for help.

Reporting problems to postfix-users@postfix.org

The people who participate on the postfix-users@postfix.org mailing list are
very helpful, especially if YOU provide them with sufficient information.
Remember, these volunteers are willing to help, but their time is limited.

When reporting a problem, be sure to include the following information.

* A summary of the problem. Please do not just send some logging without
  explanation of what YOU believe is wrong.

* Consider using a test email address so that you don't have to reveal email
  addresses of innocent people.

* If you can't use a test email address, please anonymize information
  consistently. Replace each letter by "A", each digit by "D" so that the
  helpers can still recognize syntactical errors.

* Complete error messages. Please use cut-and-paste, or use attachments,
  instead of reciting information from memory.

* Postfix logging. See the text at the top of the DEBUG_README document to
  find out where logging is stored. Please do not frustrate the helpers by
  word wrapping the logging.

* Output from "postconf -n". Please do not send your main.cf file. Or better,
  provide output from the "postfinger" tool.

* If the problem is about too much mail in the queue, consider including
  output from the qshape tool, as described in the QSHAPE_README file.

* If the problem is protocol related (connections time out or an SMTP server
  complains about syntax errors etc.) consider recording a session with
  tcpdump, as described in the DEBUG_README document.
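
The anonymization rule in the list above (each letter becomes "A", each digit
becomes "D") is mechanical and can be applied with sed. The following is a
minimal sketch; the function name anonymize is an assumption made for
illustration. Letters are replaced before digits so that the "D" characters
produced by the second step are not turned into "A" again.

```shell
# Sketch: replace each ASCII letter by "A" and each digit by "D".
# anonymize is a hypothetical helper; it filters standard input.
anonymize() {
    # LC_ALL=C keeps the bracket ranges limited to plain ASCII characters.
    LC_ALL=C sed -e 's/[A-Za-z]/A/g' -e 's/[0-9]/D/g'
}
```

For example, "echo user42@example.com | anonymize" prints AAAADD@AAAAAAA.AAA.
Punctuation is left alone, so syntax errors in an address remain visible. Apply
the function only to the addresses that need hiding, not to entire log lines.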