:mod:`email.policy`: Policy Objects
-----------------------------------

.. module:: email.policy
   :synopsis: Controlling the parsing and generating of messages

.. moduleauthor:: R. David Murray <rdmurray@bitdance.com>
.. sectionauthor:: R. David Murray <rdmurray@bitdance.com>

The :mod:`email` package's prime focus is the handling of email messages as
described by the various email and MIME RFCs. However, the general format of
email messages (a block of header fields each consisting of a name followed by
a colon followed by a value, the whole block followed by a blank line and an
arbitrary 'body') is a format that has found utility outside of the realm of
email. Some of these uses conform fairly closely to the main RFCs, some do
not. And even when working with email, there are times when it is desirable to
break strict compliance with the RFCs.

Policy objects give the email package the flexibility to handle all these
use cases.

A :class:`Policy` object encapsulates a set of attributes and methods that
control the behavior of various components of the email package during use.
:class:`Policy` instances can be passed to various classes and methods in the
email package to alter the default behavior. The settable values and their
defaults are described below.

There is a default policy used by all classes in the email package. This
policy is named :class:`Compat32`, with a corresponding pre-defined instance
named :const:`compat32`. It provides for complete backward compatibility (in
some cases, including bug compatibility) with the pre-Python 3.3 version of the
email package.

The first part of this documentation covers the features of :class:`Policy`, an
:term:`abstract base class` that defines the features that are common to all
policy objects, including :const:`compat32`. This includes certain hook
methods that are called internally by the email package, which a custom policy
could override to obtain different behavior.

When a :class:`~email.message.Message` object is created, it acquires a policy.
By default this will be :const:`compat32`, but a different policy can be
specified. If the ``Message`` is created by a :mod:`~email.parser`, a policy
passed to the parser will be the policy used by the ``Message`` it creates. If
the ``Message`` is created by the program, then the policy can be specified
when it is created. When a ``Message`` is passed to a :mod:`~email.generator`,
the generator uses the policy from the ``Message`` by default, but you can also
pass a specific policy to the generator that will override the one stored on
the ``Message`` object.

:class:`Policy` instances are immutable, but they can be cloned. The
:meth:`~Policy.clone` method accepts the same keyword arguments as the class
constructor and returns a new :class:`Policy` instance that is a copy of the
original but with the specified attribute values changed.

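As a rough illustration of the cloning behavior described above, the following sketch starts from the predefined ``policy.default`` instance. The attribute values chosen here are arbitrary.

```python
from email import policy

base = policy.default

# clone() returns a new policy object; the original is left untouched.
wide = base.clone(max_line_length=120, linesep='\r\n')
print(base.max_line_length, repr(base.linesep))
print(wide.max_line_length, repr(wide.linesep))

# Policy instances are immutable, so direct assignment is rejected.
assignment_rejected = False
try:
    base.max_line_length = 100
except AttributeError as exc:
    assignment_rejected = True
    print('rejected:', exc)
```

Because assignment is rejected, ``clone`` is the only way to obtain a policy with different settings from an existing instance.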

As an example, the following code could be used to read an email message from a
file on disk and pass it to the system ``sendmail`` program on a Unix system:

.. testsetup::

   >>> from unittest import mock
   >>> mocker = mock.patch('subprocess.Popen')
   >>> m = mocker.start()
   >>> proc = mock.MagicMock()
   >>> m.return_value = proc
   >>> proc.stdin.close.return_value = None
   >>> mymsg = open('mymsg.txt', 'w')
   >>> mymsg.write('To: abc@xyz.com\n\n')
   17
   >>> mymsg.flush()

.. doctest::

   >>> from email import message_from_binary_file
   >>> from email.generator import BytesGenerator
   >>> from email import policy
   >>> from subprocess import Popen, PIPE
   >>> with open('mymsg.txt', 'rb') as f:
   ...     msg = message_from_binary_file(f, policy=policy.default)
   >>> p = Popen(['sendmail', msg['To'].addresses[0]], stdin=PIPE)
   >>> g = BytesGenerator(p.stdin, policy=msg.policy.clone(linesep='\r\n'))
   >>> g.flatten(msg)
   >>> p.stdin.close()
   >>> rc = p.wait()

.. testcleanup::

   >>> mymsg.close()
   >>> mocker.stop()
   >>> import os
   >>> os.remove('mymsg.txt')

Here we are telling :class:`~email.generator.BytesGenerator` to use the RFC
correct line separator characters when creating the binary string to feed into
``sendmail``'s ``stdin``, where the default policy would use ``\n`` line
separators.

Some email package methods accept a *policy* keyword argument, allowing the
policy to be overridden for that method. For example, the following code uses
the :meth:`~email.message.Message.as_bytes` method of the *msg* object from
the previous example and writes the message to a file using the native line
separators for the platform on which it is running::

   >>> import os
   >>> with open('converted.txt', 'wb') as f:
   ...     f.write(msg.as_bytes(policy=msg.policy.clone(linesep=os.linesep)))

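To see what a ``linesep`` override changes, a short sketch can compare the serialized bytes directly. The message text below is made up for illustration, and the message is parsed from a string rather than read from disk.

```python
from email import message_from_string, policy

# Hypothetical message text, parsed with the EmailPolicy defaults.
text = 'To: abc@xyz.com\nSubject: hi\n\nbody\n'
msg = message_from_string(text, policy=policy.default)

# The policy stored on the message uses '\n' as its line separator.
unix_bytes = msg.as_bytes()

# Overriding the policy for this one call switches to CRLF line endings.
smtp_bytes = msg.as_bytes(policy=msg.policy.clone(linesep='\r\n'))

print(unix_bytes)
print(smtp_bytes)
```

The policy stored on ``msg`` is unchanged by the second call; the override applies only to that one serialization.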

Policy objects can also be combined using the addition operator, producing a
policy object whose settings are a combination of the non-default values of the
summed objects::

   >>> compat_SMTP = policy.compat32.clone(linesep='\r\n')
   >>> compat_strict = policy.compat32.clone(raise_on_defect=True)
   >>> compat_strict_SMTP = compat_SMTP + compat_strict

This operation is not commutative; that is, the order in which the objects are
added matters. Where both operands set the same attribute, the value from the
right-hand operand wins. To illustrate::

   >>> policy100 = policy.compat32.clone(max_line_length=100)
   >>> policy80 = policy.compat32.clone(max_line_length=80)
   >>> apolicy = policy100 + policy80
   >>> apolicy.max_line_length
   80
   >>> apolicy = policy80 + policy100
   >>> apolicy.max_line_length
   100

.. class:: Policy(**kw)

   This is the :term:`abstract base class` for all policy classes. It provides
   default implementations for a couple of trivial methods, as well as the
   implementation of the immutability property, the :meth:`clone` method, and
   the constructor semantics.

   The constructor of a policy class can be passed various keyword arguments.
   The arguments that may be specified are any non-method properties on this
   class, plus any additional non-method properties on the concrete class. A
   value specified in the constructor will override the default value for the
   corresponding attribute.

   This class defines the following properties, and thus values for the
   following may be passed in the constructor of any policy class:

   .. attribute:: max_line_length

      The maximum length of any line in the serialized output, not counting the
      end of line character(s). Default is 78, per :rfc:`5322`. A value of
      ``0`` or :const:`None` indicates that no line wrapping should be
      done.

   .. attribute:: linesep

      The string to be used to terminate lines in serialized output. The
      default is ``\n`` because that's the internal end-of-line discipline used
      by Python, though ``\r\n`` is required by the RFCs.

   .. attribute:: cte_type

      Controls the type of Content Transfer Encodings that may be or are
      required to be used. The possible values are:

      .. tabularcolumns:: |l|L|

      ========  ===============================================================
      ``7bit``  all data must be "7 bit clean" (ASCII-only). This means that
                where necessary data will be encoded using either
                quoted-printable or base64 encoding.

      ``8bit``  data is not constrained to be 7 bit clean. Data in headers is
                still required to be ASCII-only and so will be encoded (see
                :meth:`fold_binary` below for an exception), but body parts
                may use the ``8bit`` CTE.
      ========  ===============================================================

      A ``cte_type`` value of ``8bit`` only works with ``BytesGenerator``, not
      ``Generator``, because strings cannot contain binary data. If a
      ``Generator`` is operating under a policy that specifies
      ``cte_type=8bit``, it will act as if ``cte_type`` is ``7bit``.

   .. attribute:: raise_on_defect

      If :const:`True`, any defects encountered will be raised as errors. If
      :const:`False` (the default), defects will be passed to the
      :meth:`register_defect` method.

   .. attribute:: mangle_from\_

      If :const:`True`, lines starting with *"From "* in the body are
      escaped by putting a ``>`` in front of them. This parameter is used when
      the message is being serialized by a generator.
      Default: :const:`False`.

      .. versionadded:: 3.5
         The *mangle_from_* parameter.

   The following :class:`Policy` method is intended to be called by code using
   the email library to create policy instances with custom settings:

   .. method:: clone(**kw)

      Return a new :class:`Policy` instance whose attributes have the same
      values as the current instance, except where those attributes are
      given new values by the keyword arguments.

   The remaining :class:`Policy` methods are called by the email package code,
   and are not intended to be called by an application using the email package.
   A custom policy must implement all of these methods.

   .. method:: handle_defect(obj, defect)

      Handle a *defect* found on *obj*. When the email package calls this
      method, *defect* will always be an instance of a subclass of
      :class:`~email.errors.MessageDefect`.

      The default implementation checks the :attr:`raise_on_defect` flag. If
      it is ``True``, *defect* is raised as an exception. If it is ``False``
      (the default), *obj* and *defect* are passed to :meth:`register_defect`.

   .. method:: register_defect(obj, defect)

      Register a *defect* on *obj*. In the email package, *defect* will always
      be an instance of a subclass of :class:`~email.errors.MessageDefect`.

      The default implementation calls the ``append`` method of the ``defects``
      attribute of *obj*. When the email package calls :meth:`handle_defect`,
      *obj* will normally have a ``defects`` attribute that has an ``append``
      method. Custom object types used with the email package (for example,
      custom ``Message`` objects) should also provide such an attribute,
      otherwise defects in parsed messages will raise unexpected errors.

   .. method:: header_max_count(name)

      Return the maximum allowed number of headers named *name*.

      Called when a header is added to a :class:`~email.message.Message`
      object. If the returned value is not ``0`` or ``None``, and there are
      already a number of headers with the name *name* equal to the value
      returned, a :exc:`ValueError` is raised.

      Because the default behavior of ``Message.__setitem__`` is to append the
      value to the list of headers, it is easy to create duplicate headers
      without realizing it. This method allows certain headers to be limited
      in the number of instances of that header that may be added to a
      ``Message`` programmatically. (The limit is not observed by the parser,
      which will faithfully produce as many headers as exist in the message
      being parsed.)

      The default implementation returns ``None`` for all header names.

   .. method:: header_source_parse(sourcelines)

      The email package calls this method with a list of strings, each string
      ending with the line separation characters found in the source being
      parsed. The first line includes the field header name and separator.
      All whitespace in the source is preserved. The method should return the
      ``(name, value)`` tuple that is to be stored in the ``Message`` to
      represent the parsed header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, *name* should be the case preserved name (all
      characters up to the '``:``' separator), while *value* should be the
      unfolded value (all line separator characters removed, but whitespace
      kept intact), stripped of leading whitespace.

      *sourcelines* may contain surrogateescaped binary data.

      There is no default implementation.

   .. method:: header_store_parse(name, value)

      The email package calls this method with the name and value provided by
      the application program when the application program is modifying a
      ``Message`` programmatically (as opposed to a ``Message`` created by a
      parser). The method should return the ``(name, value)`` tuple that is to
      be stored in the ``Message`` to represent the header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, the *name* and *value* should be strings or
      string subclasses that do not change the content of the passed in
      arguments.

      There is no default implementation.

   .. method:: header_fetch_parse(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` when that header is requested by the
      application program, and whatever the method returns is what is passed
      back to the application as the value of the header being retrieved.
      Note that there may be more than one header with the same name stored in
      the ``Message``; the method is passed the specific name and value of the
      header destined to be returned to the application.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the value returned by the method.

      There is no default implementation.

   .. method:: fold(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` for a given header. The method should return a
      string that represents that header "folded" correctly (according to the
      policy settings) by composing the *name* with the *value* and inserting
      :attr:`linesep` characters at the appropriate places. See :rfc:`5322`
      for a discussion of the rules for folding email headers.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the string returned by the method.

   .. method:: fold_binary(name, value)

      The same as :meth:`fold`, except that the returned value should be a
      bytes object rather than a string.

      *value* may contain surrogateescaped binary data. These could be
      converted back into binary data in the returned bytes object.

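The defect-handling hooks described above can be observed from the outside by parsing a deliberately malformed message. This is a minimal sketch, assuming the concrete ``policy.default`` and ``policy.strict`` instances and a made-up message whose declared MIME boundary never appears in the body.

```python
from email import errors, message_from_string, policy

# A multipart message whose body never contains the declared boundary.
# The text is invented purely to trigger a parsing defect.
raw = (
    'Content-Type: multipart/mixed; boundary="XX"\n'
    '\n'
    'There is no boundary line in this body.\n'
)

# raise_on_defect is False here, so handle_defect() passes each defect to
# register_defect(), which appends it to the defects list of the message.
lenient = message_from_string(raw, policy=policy.default)
defect_names = [type(d).__name__ for d in lenient.defects]
print(defect_names)

# raise_on_defect is True here, so handle_defect() raises the defect.
try:
    message_from_string(raw, policy=policy.strict)
    raised = None
except errors.MessageDefect as exc:
    raised = type(exc).__name__
print(raised)
```

The same input therefore either parses with recorded defects or fails outright, depending only on the policy passed to the parser.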

.. class:: Compat32(**kw)

   This concrete :class:`Policy` is the backward compatibility policy. It
   replicates the behavior of the email package in Python 3.2. The
   :mod:`~email.policy` module also defines an instance of this class,
   :const:`compat32`, that is used as the default policy. Thus the default
   behavior of the email package is to maintain compatibility with Python 3.2.

   The following attributes have values that are different from the
   :class:`Policy` default:

   .. attribute:: mangle_from_

      The default is ``True``.

   The class provides the following concrete implementations of the
   abstract methods of :class:`Policy`:

   .. method:: header_source_parse(sourcelines)

      The name is parsed as everything up to the '``:``' and returned
      unmodified. The value is determined by stripping leading whitespace off
      the remainder of the first line, joining all subsequent lines together,
      and stripping any trailing carriage return or linefeed characters.

   .. method:: header_store_parse(name, value)

      The name and value are returned unmodified.

   .. method:: header_fetch_parse(name, value)

      If the value contains binary data, it is converted into a
      :class:`~email.header.Header` object using the ``unknown-8bit`` charset.
      Otherwise it is returned unmodified.

   .. method:: fold(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. Non-ASCII binary data is
      CTE encoded using the ``unknown-8bit`` charset.

   .. method:: fold_binary(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. If ``cte_type`` is
      ``7bit``, non-ASCII binary data is CTE encoded using the ``unknown-8bit``
      charset. Otherwise the original source header is used, with its existing
      line breaks and any (RFC invalid) binary data it may contain.

An instance of :class:`Compat32` is provided as a module constant:

.. data:: compat32

   An instance of :class:`Compat32`, providing backward compatibility with the
   behavior of the email package in Python 3.2.

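As a small sketch of the ``Compat32`` hook behavior described above, the methods can be called directly on the ``compat32`` instance. Application code would not normally do this, and the sample header lines are invented for illustration.

```python
from email import policy

# Raw source lines for one folded header, as a parser would pass them in.
sourcelines = ['Subject: a long\n', ' folded value\n']

# The name is everything before the colon; the value keeps its internal
# line break but loses leading whitespace and the trailing newline.
name, value = policy.compat32.header_source_parse(sourcelines)
print(repr(name))
print(repr(value))

# header_store_parse() hands the pair back unchanged under compat32.
stored = policy.compat32.header_store_parse('X-Test', 'some value')
print(stored)

# fold() turns a stored pair back into serialized header text.
folded = policy.compat32.fold(name, value)
print(repr(folded))
```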

The documentation below describes new policies that are included in the
standard library on a :term:`provisional basis <provisional package>`.
Backwards incompatible changes (up to and including removal of the feature)
may occur if deemed necessary by the core developers.

.. class:: EmailPolicy(**kw)

   This concrete :class:`Policy` provides behavior that is intended to be fully
   compliant with the current email RFCs. These include (but are not limited
   to) :rfc:`5322`, :rfc:`2047`, and the current MIME RFCs.

   This policy adds new header parsing and folding algorithms. Instead of
   simple strings, headers are ``str`` subclasses with attributes that depend
   on the type of the field. The parsing and folding algorithms fully implement
   :rfc:`2047` and :rfc:`5322`.

   In addition to the settable attributes listed above that apply to all
   policies, this policy adds the following additional attributes:

   .. attribute:: utf8

      If ``False``, follow :rfc:`5322`, supporting non-ASCII characters in
      headers by encoding them as "encoded words". If ``True``, follow
      :rfc:`6532` and use ``utf-8`` encoding for headers. Messages
      formatted in this way may be passed to SMTP servers that support
      the ``SMTPUTF8`` extension (:rfc:`6531`).

   .. attribute:: refold_source

      If the value for a header in the ``Message`` object originated from a
      :mod:`~email.parser` (as opposed to being set by a program), this
      attribute indicates whether or not a generator should refold that value
      when transforming the message back into stream form. The possible values
      are:

      ========  ===============================================================
      ``none``  all source values use original folding

      ``long``  source values that have any line that is longer than
                ``max_line_length`` will be refolded

      ``all``   all values are refolded.
      ========  ===============================================================

      The default is ``long``.

   .. attribute:: header_factory

      A callable that takes two arguments, ``name`` and ``value``, where
      ``name`` is a header field name and ``value`` is an unfolded header field
      value, and returns a string subclass that represents that header. A
      default ``header_factory`` (see :mod:`~email.headerregistry`) is provided
      that understands some of the :rfc:`5322` header field types. (Currently
      address fields and date fields have special treatment, while all other
      fields are treated as unstructured. This list will be completed before
      the extension is marked stable.)

   .. attribute:: content_manager

      An object with at least two methods: ``get_content`` and
      ``set_content``. When the :meth:`~email.message.Message.get_content` or
      :meth:`~email.message.Message.set_content` method of a
      :class:`~email.message.Message` object is called, it calls the
      corresponding method of this object, passing it the message object as its
      first argument, and any arguments or keywords that were passed to it as
      additional arguments. By default ``content_manager`` is set to
      :data:`~email.contentmanager.raw_data_manager`.

      .. versionadded:: 3.4

   The class provides the following concrete implementations of the abstract
   methods of :class:`Policy`:

   .. method:: header_max_count(name)

      Returns the value of the
      :attr:`~email.headerregistry.BaseHeader.max_count` attribute of the
      specialized class used to represent the header with the given name.

   .. method:: header_source_parse(sourcelines)

      The implementation of this method is the same as that for the
      :class:`Compat32` policy.

   .. method:: header_store_parse(name, value)

      The name is returned unchanged. If the input value has a ``name``
      attribute and it matches *name* ignoring case, the value is returned
      unchanged. Otherwise the *name* and *value* are passed to
      ``header_factory``, and the resulting header object is returned as
      the value. In this case a ``ValueError`` is raised if the input value
      contains CR or LF characters.

   .. method:: header_fetch_parse(name, value)

      If the value has a ``name`` attribute, it is returned unmodified.
      Otherwise the *name*, and the *value* with any CR or LF characters
      removed, are passed to the ``header_factory``, and the resulting
      header object is returned. Any surrogateescaped bytes get turned into
      the unicode unknown-character glyph.

   .. method:: fold(name, value)

      Header folding is controlled by the :attr:`refold_source` policy setting.
      A value is considered to be a 'source value' if and only if it does not
      have a ``name`` attribute (having a ``name`` attribute means it is a
      header object of some sort). If a source value needs to be refolded
      according to the policy, it is converted into a header object by
      passing the *name* and the *value* with any CR and LF characters removed
      to the ``header_factory``. Folding of a header object is done by
      calling its ``fold`` method with the current policy.

      Source values are split into lines using :meth:`~str.splitlines`. If
      the value is not to be refolded, the lines are rejoined using the
      ``linesep`` from the policy and returned. The exception is lines
      containing non-ASCII binary data. In that case the value is refolded
      regardless of the ``refold_source`` setting, which causes the binary data
      to be CTE encoded using the ``unknown-8bit`` charset.

   .. method:: fold_binary(name, value)

      The same as :meth:`fold` if :attr:`~Policy.cte_type` is ``7bit``, except
      that the returned value is bytes.

      If :attr:`~Policy.cte_type` is ``8bit``, non-ASCII binary data is
      converted back into bytes. Headers with binary data are not refolded,
      regardless of the ``refold_source`` setting, since there is no way to
      know whether the binary data consists of single byte characters or
      multibyte characters.

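The effect of ``header_max_count`` and ``header_store_parse`` is visible when headers are set programmatically. The sketch below assumes the ``policy.default`` instance of ``EmailPolicy`` and uses placeholder addresses; ``compat32`` is shown only for contrast.

```python
from email import policy
from email.message import EmailMessage, Message

msg = EmailMessage(policy=policy.default)
msg['To'] = 'first@example.com'

# header_max_count() reports the limit taken from the header class.
to_limit = policy.default.header_max_count('To')
print(to_limit)

# A second To header would exceed that limit and is rejected.
try:
    msg['To'] = 'second@example.com'
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)

# header_store_parse() refuses plain strings containing CR or LF.
try:
    msg['Subject'] = 'two\nlines'
    linebreak_error = None
except ValueError as exc:
    linebreak_error = str(exc)
print(linebreak_error)

# For contrast, compat32 imposes no limit and appends a duplicate header.
old = Message(policy=policy.compat32)
old['To'] = 'first@example.com'
old['To'] = 'second@example.com'
print(old.get_all('To'))
```

To replace an existing unique header under ``EmailPolicy``, delete it first (``del msg['To']``) and then assign the new value.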

The following instances of :class:`EmailPolicy` provide defaults suitable for
specific application domains. Note that in the future the behavior of these
instances (in particular the ``HTTP`` instance) may be adjusted to conform even
more closely to the RFCs relevant to their domains.

.. data:: default

   An instance of ``EmailPolicy`` with all defaults unchanged. This policy
   uses the standard Python ``\n`` line endings rather than the RFC-correct
   ``\r\n``.

.. data:: SMTP

   Suitable for serializing messages in conformance with the email RFCs.
   Like ``default``, but with ``linesep`` set to ``\r\n``, which is RFC
   compliant.

.. data:: SMTPUTF8

   The same as ``SMTP`` except that :attr:`~EmailPolicy.utf8` is ``True``.
   Useful for serializing messages to a message store without using encoded
   words in the headers. Should only be used for SMTP transmission if the
   sender or recipient addresses have non-ASCII characters (the
   :meth:`smtplib.SMTP.send_message` method handles this automatically).

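The difference between ``SMTP`` and ``SMTPUTF8`` shows up when a header contains non-ASCII text. A minimal sketch, using an invented subject line and a message with no body:

```python
from email import policy
from email.message import EmailMessage

msg = EmailMessage(policy=policy.default)
msg['Subject'] = 'Grüße'

# utf8 is False: the non-ASCII text is written as an RFC 2047 encoded word,
# so the serialized header is pure ASCII.
smtp_bytes = msg.as_bytes(policy=policy.SMTP)

# utf8 is True: the header text is written as raw UTF-8, as in RFC 6532.
smtputf8_bytes = msg.as_bytes(policy=policy.SMTPUTF8)

print(smtp_bytes)
print(smtputf8_bytes)
```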

.. data:: HTTP

   Suitable for serializing headers for use in HTTP traffic. Like
   ``SMTP`` except that ``max_line_length`` is set to ``None`` (unlimited).

.. data:: strict

   Convenience instance. The same as ``default`` except that
   ``raise_on_defect`` is set to ``True``. This allows any policy to be made
   strict by writing::

        somepolicy + policy.strict

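For instance, a strict variant of the ``SMTP`` policy could be built as sketched below. The variable name ``strict_smtp`` is only illustrative.

```python
from email import policy

# Adding policy.strict layers raise_on_defect=True on top of the SMTP
# settings; the CRLF line separator of the left operand is kept.
strict_smtp = policy.SMTP + policy.strict

print(repr(strict_smtp.linesep))
print(strict_smtp.raise_on_defect)

# The operands themselves are not modified by the addition.
print(policy.SMTP.raise_on_defect)
```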

With all of these :class:`EmailPolicies <.EmailPolicy>`, the effective API of
the email package is changed from the Python 3.2 API in the following ways:

* Setting a header on a :class:`~email.message.Message` results in that
  header being parsed and a header object created.

* Fetching a header value from a :class:`~email.message.Message` results
  in that header being parsed and a header object created and
  returned.

* Any header object, or any header that is refolded due to the
  policy settings, is folded using an algorithm that fully implements the
  RFC folding algorithms, including knowing where encoded words are required
  and allowed.

From the application view, this means that any header obtained through the
:class:`~email.message.Message` is a header object with extra
attributes, whose string value is the fully decoded unicode value of the
header. Likewise, a header may be assigned a new value, or a new header
created, using a unicode string, and the policy will take care of converting
the unicode string into the correct RFC encoded form.

The header objects and their attributes are described in
:mod:`~email.headerregistry`.
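
The application-level view described above can be tried out with a short sketch. The message text and addresses are invented, and the attribute names used on the address header (``addresses``, ``display_name``, ``addr_spec``) come from :mod:`~email.headerregistry`.

```python
from email import message_from_string, policy

raw = (
    'To: Ann Example <ann@example.com>\n'
    'Subject: =?utf-8?q?Gr=C3=BC=C3=9Fe?=\n'
    '\n'
    'body\n'
)
msg = message_from_string(raw, policy=policy.default)

# Fetching a header returns a header object (a str subclass) whose string
# value is the decoded text rather than the encoded word.
subject = msg['Subject']
print(subject == 'Grüße')

# Address headers expose structured attributes.
to = msg['To']
print(to.addresses[0].display_name)
print(to.addresses[0].addr_spec)

# Assigning a unicode string lets the policy handle the encoding. Subject
# is limited to one occurrence, so the old header is deleted first.
del msg['Subject']
msg['Subject'] = 'Größe'
serialized = msg.as_string()
print(serialized)
```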