:mod:`rfc822` --- Parse RFC 2822 mail headers
=============================================

:synopsis: Parse 2822 style mail messages.

The :mod:`email` package should be used in preference to the :mod:`rfc822`
module. This module is present only to maintain backward compatibility, and
has been removed in 3.0.

This module defines a class, :class:`Message`, which represents an "email
message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
consist of a collection of message headers and a message body. This module
also defines a helper class, :class:`AddressList`, for parsing :rfc:`2822`
addresses. Please refer to the RFC for information on the specific syntax of
:rfc:`2822` messages.

.. index:: module: mailbox

The :mod:`mailbox` module provides classes to read mailboxes produced by
various end-user mail programs.

.. class:: Message(file[, seekable])

   A :class:`Message` instance is instantiated with an input object as its
   parameter. :class:`Message` relies only on the input object having a
   :meth:`readline` method; in particular, ordinary file objects qualify.
   Instantiation reads headers from the input object up to a delimiter line
   (normally a blank line) and stores them in the instance. The message body,
   following the headers, is not consumed.

   This class can work with any input object that supports a :meth:`readline`
   method. If the input object has seek and tell capability, the
   :meth:`rewindbody` method will work; also, illegal lines will be pushed back
   onto the input stream. If the input object lacks seek but has an :meth:`unread`
   method that can push back a line of input, :class:`Message` will use that to
   push back illegal lines. Thus this class can be used to parse messages coming
   from a buffered stream.

   The optional *seekable* argument is provided as a workaround for certain stdio
   libraries in which :cfunc:`tell` discards buffered data before discovering that
   the :cfunc:`lseek` system call doesn't work. For maximum portability, you
   should set the *seekable* argument to zero to prevent that initial :meth:`tell`
   when passing in an unseekable object such as a file object created from a socket
   object.

   Input lines as read from the file may either be terminated by CR-LF or by a
   single linefeed; a terminating CR-LF is replaced by a single linefeed before the
   line is stored.

   All header matching is done independently of upper or lower case; e.g.
   ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
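
The case-insensitive lookup described above can be sketched as follows. Because :mod:`rfc822` is not available in 3.0, this sketch assumes the :mod:`email` package as a stand-in; ``email.message_from_string`` and the sample message text are choices made for this illustration and are not part of :mod:`rfc822`.

```python
import email

# Sample message invented for this illustration: two headers, a blank
# delimiter line, then the body.
raw = "From: jack@cwi.nl (Jack Jansen)\nSubject: hello\n\nbody text\n"

# Stand-in for rfc822.Message: the email package parses the headers too.
m = email.message_from_string(raw)

# Header names are matched without regard to case.
values = [m['From'], m['from'], m['FROM']]
print(values)
```

All three lookups return the same header value.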

.. class:: AddressList(field)

   You may instantiate the :class:`AddressList` helper class using a single string
   parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
   parameter ``None`` yields an empty list.)

.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes and
   double quotes replaced by backslash-double quote.

.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* begins
   and ends with double quotes, they are stripped off. Likewise, if *str* begins
   and ends with angle brackets, they are stripped off.
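
A short sketch of the quoting rules above. It assumes the same-named helpers in ``email.utils`` as stand-ins, since :mod:`rfc822` itself cannot be imported in 3.0; the sample strings are made up.

```python
from email.utils import quote, unquote

# Backslashes are doubled and double quotes gain a leading backslash.
quoted = quote('say "hi" \\ bye')
print(quoted)

# Surrounding double quotes or angle brackets are stripped.
name = unquote('"Jack Jansen"')
addr = unquote('<jack@cwi.nl>')
print(name, addr)
```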

.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
   "email address" parts. Returns a 2-tuple of that information, unless the parse
   fails, in which case the 2-tuple ``(None, None)`` is returned.
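
As an illustration of the realname/address split, the sketch below assumes ``email.utils.parseaddr`` as a stand-in for this function. Only the successful case is shown, because the stand-in signals failure differently (a pair of empty strings rather than ``(None, None)``).

```python
from email.utils import parseaddr

# Two common spellings of the same mailbox.
pair_a = parseaddr('Jack Jansen <jack@cwi.nl>')
pair_b = parseaddr('jack@cwi.nl (Jack Jansen)')

# Both spellings split into the same (realname, email address) pair.
print(pair_a)
print(pair_b)
```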

.. function:: dump_address_pair(pair)

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header. If the first element of *pair* is false, then the
   second element is returned unmodified.
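
The round trip can be sketched as below. Note the assumption: ``email.utils`` has no function named ``dump_address_pair``, so this example substitutes ``email.utils.formataddr``, which plays the same role in the :mod:`email` package.

```python
from email.utils import formataddr, parseaddr

pair = ('Jack Jansen', 'jack@cwi.nl')

# Build a header value from the pair, then parse it back.
header_value = formataddr(pair)
round_trip = parseaddr(header_value)
print(header_value)

# With a false first element, only the address is returned.
bare = formataddr(('', 'jack@cwi.nl'))
print(bare)
```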

.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
   date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
   7, and 8 of the result tuple are not usable.
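
A minimal sketch of the behaviour described above, assuming ``email.utils.parsedate`` as a stand-in for this function. Only the first six fields are inspected, in line with the note that indexes 6, 7, and 8 are not usable.

```python
from email.utils import parsedate

parsed = parsedate('Mon, 20 Nov 1995 19:12:08 -0500')

# Year, month, day, hour, minute, second.
usable = parsed[:6]
print(usable)

# Input that cannot be interpreted as a date yields None.
failed = parsedate('not a date')
print(failed)
```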

.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time). (Note that the sign of
   the timezone offset is the opposite of the sign of the ``time.timezone``
   variable for the same timezone; the latter variable follows the POSIX standard
   while this module follows :rfc:`2822`.) If the input string has no timezone,
   the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
   8 of the result tuple are not usable.
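
The sign convention of the tenth element can be illustrated as follows, assuming ``email.utils.parsedate_tz`` as a stand-in. The case of a date string without a timezone is left out, because the stand-in may not report it the same way as this module.

```python
from email.utils import parsedate_tz

parsed = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')

# The first nine elements have the same layout as the parsedate() result.
fields = parsed[:9]

# The tenth element is the offset from UTC in seconds; zones west of
# Greenwich, such as -0500, give a negative number.
offset = parsed[9]
print(offset, offset // 3600)
```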

.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
   the timezone item in the tuple is ``None``, assume local time. Minor
   deficiency: this first interprets the first 8 elements as a local time and then
   compensates for the timezone difference; this may yield a slight error around
   daylight saving time switch dates. This is not enough to worry about for
   common use.
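
The conversion can be sketched as below, assuming ``email.utils.mktime_tz`` and ``email.utils.parsedate_tz`` as stand-ins. The sketch only shows the relationship between input and output; the stand-in may compute the result differently from the procedure described above.

```python
import calendar
from email.utils import mktime_tz, parsedate_tz

parsed = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')
timestamp = mktime_tz(parsed)

# 19:12:08 at five hours behind UTC is 00:12:08 UTC on the following day.
expected = calendar.timegm((1995, 11, 21, 0, 12, 8, 0, 0, 0))
print(timestamp == expected)
```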

Module :mod:`email`
   Comprehensive email handling package; supersedes the :mod:`rfc822` module.

Module :mod:`mailbox`
   Classes to read various mailbox formats produced by end-user mail programs.

Module :mod:`mimetools`
   Subclass of :class:`rfc822.Message` that handles MIME encoded messages.

A :class:`Message` instance has the following methods:

.. method:: Message.rewindbody()

   Seek to the start of the message body. This only works if the file object is
   seekable.

.. method:: Message.isheader(line)

   Returns a line's canonicalized fieldname (the dictionary key that will be used
   to index it) if the line is a legal :rfc:`2822` header; otherwise returns
   ``None`` (implying that parsing should stop here and the line be pushed back on
   the input stream). It is sometimes useful to override this method in a
   subclass.

.. method:: Message.islast(line)

   Return true if the given line is a delimiter on which Message should stop. The
   delimiter line is consumed, and the file object's read location positioned
   immediately after it. By default this method just checks that the line is
   blank, but you can override it in a subclass.

.. method:: Message.iscomment(line)

   Return ``True`` if the given line should be ignored entirely, just skipped. By
   default this is a stub that always returns ``False``, but you can override it in
   a subclass.

.. method:: Message.getallmatchingheaders(name)

   Return a list of lines consisting of all headers matching *name*, if any. Each
   physical line, whether it is a continuation line or not, is a separate list
   item. Return the empty list if no header matches *name*.

.. method:: Message.getfirstmatchingheader(name)

   Return a list of lines comprising the first header matching *name*, and its
   continuation line(s), if any. Return ``None`` if there is no header matching
   *name*.

.. method:: Message.getrawheader(name)

   Return a single string consisting of the text after the colon in the first
   header matching *name*. This includes leading whitespace, the trailing
   linefeed, and internal linefeeds and whitespace if any continuation
   line(s) were present. Return ``None`` if there is no header matching *name*.

.. method:: Message.getheader(name[, default])

   Return a single string consisting of the last header matching *name*,
   but strip leading and trailing whitespace.
   Internal whitespace is not stripped. The optional *default* argument can be
   used to specify a different default to be returned when there is no header
   matching *name*; it defaults to ``None``.
   This is the preferred way to get parsed headers.

.. method:: Message.get(name[, default])

   An alias for :meth:`getheader`, to make the interface more compatible with
   regular dictionaries.

.. method:: Message.getaddr(name)

   Return a pair ``(full name, email address)`` parsed from the string returned by
   ``getheader(name)``. If no header matching *name* exists, return ``(None,
   None)``; otherwise both the full name and the address are (possibly empty)
   strings.

   Example: If *m*'s first :mailheader:`From` header contains the string
   ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
   ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
   <jack@cwi.nl>'`` instead, it would yield the exact same result.

.. method:: Message.getaddrlist(name)

   This is similar to ``getaddr(name)``, but parses a header containing a list of
   email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
   name, email address)`` pairs (even if there was only one address in the header).
   If there is no header matching *name*, return an empty list.

   If multiple headers exist that match the named header (e.g. if there are several
   :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
   the named headers contain are also parsed.
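
The handling of repeated headers and continuation lines can be sketched as below. This is an approximation built on the :mod:`email` package rather than on :class:`Message` itself: ``email.message_from_string``, ``get_all`` and ``email.utils.getaddresses`` are stand-ins chosen for this example, and the ``example.com`` addresses are made up.

```python
import email
from email.utils import getaddresses

# Two Cc headers; the second one continues onto an indented line.
raw = (
    "To: Jack Jansen <jack@cwi.nl>\n"
    "Cc: alice@example.com (Alice)\n"
    "Cc: Bob <bob@example.com>,\n"
    "    carol@example.com\n"
    "\n"
    "body\n"
)
m = email.message_from_string(raw)

# Collect every Cc header, then parse all of them into (name, address) pairs.
cc_pairs = getaddresses(m.get_all('Cc', []))
for full_name, address in cc_pairs:
    print(repr(full_name), address)

# A header that does not occur gives an empty list.
no_pairs = getaddresses(m.get_all('Bcc', []))
```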

.. method:: Message.getdate(name)

   Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
   with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
   there is no header matching *name*, or it is unparsable, return ``None``.

   Date parsing appears to be a black art, and not all mailers adhere to the
   standard. While this function has been tested and found correct on a large
   collection of email from many sources, it is still possible that it may
   occasionally yield an incorrect result.

.. method:: Message.getdate_tz(name)

   Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
   first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
   10th is a number giving the offset of the date's timezone from UTC. Note that
   fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
   no header matching *name*, or it is unparsable, return ``None``.

:class:`Message` instances also support a limited mapping interface. In
particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``name in m``, ``m.keys()``, ``m.values()``, ``m.items()``, and
``m.setdefault(name[, default])`` act as expected, with the one difference
that :meth:`setdefault` uses an empty string as the default value.
:class:`Message` instances also support the mapping writable interface ``m[name]
= value`` and ``del m[name]``. :class:`Message` objects do not support the
:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
added in Python 2.2.)

Finally, :class:`Message` instances have some public instance variables:

.. attribute:: Message.headers

   A list containing the entire set of header lines, in the order in which they
   were read (except that setitem calls may disturb this order). Each line contains
   a trailing newline. The blank line terminating the headers is not contained in
   the list.

.. attribute:: Message.fp

   The file or file-like object passed at instantiation time. This can be used to
   read the message content.

.. attribute:: Message.unixfrom

   The Unix ``From`` line, if the message had one, or an empty string. This is
   needed to regenerate the message in some contexts, such as an ``mbox``\ -style
   mailbox file.

.. _addresslist-objects:

An :class:`AddressList` instance has the following methods:

.. method:: AddressList.__len__()

   Return the number of addresses in the address list.

.. method:: AddressList.__str__()

   Return a canonicalized string representation of the address list. Addresses are
   rendered in ``"name" <host@domain>`` form, comma-separated.

.. method:: AddressList.__add__(alist)

   Return a new :class:`AddressList` instance that contains all addresses in both
   :class:`AddressList` operands, with duplicates removed (set union).

.. method:: AddressList.__iadd__(alist)

   In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
   into the union of itself and the right-hand instance, *alist*.

.. method:: AddressList.__sub__(alist)

   Return a new :class:`AddressList` instance that contains every address in the
   left-hand :class:`AddressList` operand that is not present in the right-hand
   address operand (set difference).
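
The set-like semantics of these operators can be modelled on plain lists of ``(name, address)`` pairs. The helper functions ``union`` and ``difference`` below are hypothetical names introduced for this sketch; they are not part of :class:`AddressList`, and the addresses are made up.

```python
def union(left, right):
    """Addresses found in either list, in order, without duplicates."""
    result = []
    for pair in left + right:
        if pair not in result:
            result.append(pair)
    return result


def difference(left, right):
    """Addresses of the left list that do not occur in the right list."""
    return [pair for pair in left if pair not in right]


a = [('Jack Jansen', 'jack@cwi.nl'), ('Alice', 'alice@example.com')]
b = [('Alice', 'alice@example.com'), ('Bob', 'bob@example.com')]

both = union(a, b)            # models a + b
only_a = difference(a, b)     # models a - b
print(both)
print(only_a)
```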

.. method:: AddressList.__isub__(alist)

   In-place version of :meth:`__sub__`, removing addresses in this list which are
   also in *alist*.

Finally, :class:`AddressList` instances have one public instance variable:

.. attribute:: AddressList.addresslist

   A list of tuple string pairs, one per address. In each member, the first is the
   canonicalized name part, the second is the actual route-address (``'@'``\ -separated
   username-host.domain pair).

.. rubric:: Footnotes

.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
   :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
   considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
   semantics have changed since :rfc:`822`.