========================================
Notes from the PyCon 2012 Mailman Sprint
========================================

The notes are based on Barry Warsaw's description of the Mailman 3
pipeline at the PyCon 2012 Mailman sprint on March 13, with
diagrams from his "Mailman" presentation at PyCon 2012.
Transcribed by Stephen Turnbull.

*These are notes from the Mailman sprint at PyCon 2012. They are not
terribly well organized, nor fully fleshed out. Please edit and push
branches to Launchpad at lp:mailman or post patches to
<https://bugs.launchpad.net/mailman>.*

The intent of this document is to provide a view of Mailman 3's workflow and
structures from "eight miles high".

Basic Messaging Handling Workflow
=================================

Mailman accepts a message via the LMTP protocol (RFC 2033). It implements a
simple LMTP server internally based on the SMTP server (`smtpd`) provided in
the Python standard library. The LMTP server's responsibility is to parse the
message into a tuple (*mlist*, *msg*, *msgdata*). If the parse fails (including
messages which Mailman considers to be invalid due to lack of `Message-Id` as
strongly recommended by RFC 2822 and RFC 5322), the message will be rejected;
otherwise the parsed message and metadata dictionary are pickled, and the
resulting *message pickle* is added to one of the `in`, `command`, or `bounce`
queues.

node [shape=box, color=lightblue, style=filled];
msg [shape=ellipse, color=black, fillcolor=white];
lmtpd [label="LMTP\nSERVER"];
msg -> MTA [label="SMTP"];
MTA -> lmtpd [label="LMTP"];
lmtpd -> MTA [label="reject"];
lmtpd -> IN -> PIPELINE [label=".pck"];
lmtpd -> BOUNCES [label=".pck"];
lmtpd -> COMMAND [label=".pck"];
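
The intake step described above can be sketched roughly as follows. This is
only an illustration, not Mailman's actual code: the names `accept`,
`classify`, and `RejectMessage`, the recipient suffixes, and the keys in the
metadata dictionary are all assumptions made for the example.

```python
import pickle
from email import message_from_bytes


class RejectMessage(Exception):
    """Raised when an incoming message must be refused at LMTP time."""


def classify(local_part):
    """Split a recipient's local part into (list name, queue name).

    The suffix conventions used here are assumptions for illustration.
    """
    for suffix, queue in (("-bounces", "bounce"), ("-request", "command")):
        if local_part.endswith(suffix):
            return local_part[: -len(suffix)], queue
    return local_part, "in"


def accept(raw, recipient):
    """Parse one delivery and return (queue name, message pickle)."""
    msg = message_from_bytes(raw)
    # Header lookup is case-insensitive; None means the header is absent.
    if msg["Message-ID"] is None:
        raise RejectMessage("missing Message-ID header")
    mlist, queue = classify(recipient.partition("@")[0])
    msgdata = {"listname": mlist}  # hypothetical metadata key
    # The parsed message and its metadata are pickled together.
    return queue, pickle.dumps((msg, msgdata))
```

A caller would write the returned pickle into the directory for the named
queue, or turn `RejectMessage` into an LMTP error reply to the MTA.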

The `in` queue is processed by *filter chains* (explained below) to determine
whether the post (or administrative request) will be processed. If not
allowed, the message pickle is discarded, rejected (returned to sender), or
held (saved for moderator approval -- not shown). Otherwise the message is
added to the `pipeline` (i.e. posting) queue.

Each of the `command`, `bounce`, and `pipeline` queues is processed by a
*pipeline of handlers* as in Mailman 2's pipeline. (Some functions such as
spam detection that were handled in the Mailman 2 pipeline are now in the
filter chains.)

Handlers may copy messages to other queues (*e.g.*, `archive`), and eventually
posted messages for distribution to the list membership end up in the `out`
queue for injection into the MTA.

The `virgin` queue is a special queue for messages created by Mailman.

node [shape=box, style=rounded, group=0]
{ "MIME\ndelete" -> "cleanse headers" -> "add headers" -> \
"calculate\nrecipients" -> "to digest" -> "to archive" -> \
node [shape=box, color=lightblue, style=filled, group=1]
{ rank=same; PIPELINE -> "MIME\ndelete" }
{ rank=same; "to digest" -> DIGEST }
{ rank=same; "to archive" -> ARCHIVE }
{ rank=same; "to outgoing" -> OUT }
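
A pipeline of handlers can be pictured as a list of functions applied to the
message in order, some of which copy it to other queues. The sketch below is
an illustration under assumptions, not Mailman's handler API: the handler
signatures, the `X-Example-List` header, and the `members` and `recipients`
metadata keys are invented for the example, and only a few of the stages from
the diagram are shown.

```python
from email.message import Message


def cleanse_headers(msg, msgdata, queues):
    # Deleting a header that is not present is a no-op in email.message.
    del msg["Approved"]


def add_headers(msg, msgdata, queues):
    msg["X-Example-List"] = msgdata["listname"]  # hypothetical header


def calculate_recipients(msg, msgdata, queues):
    msgdata["recipients"] = set(msgdata.get("members", ()))


def to_archive(msg, msgdata, queues):
    # Copy the message to another queue and keep going.
    queues.setdefault("archive", []).append((msg, dict(msgdata)))


def to_outgoing(msg, msgdata, queues):
    queues.setdefault("out", []).append((msg, dict(msgdata)))


PIPELINE = [cleanse_headers, add_headers, calculate_recipients,
            to_archive, to_outgoing]


def run_pipeline(msg, msgdata, handlers=PIPELINE):
    """Apply each handler in order; return the queues that were written."""
    queues = {}
    for handler in handlers:
        handler(msg, msgdata, queues)
    return queues
```

Because every handler shares the same signature, a different pipeline is just
a different list of the same kind of function.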

Once a message has been classified as a post or administrivia, rules are
applied to determine whether the message should be distributed or acted on.
Rules include things like "if the message's sender is a non-member, hold it
for moderation", or "if the message contains an `Approved` header with a valid
password, allow it to be posted". A rule may also make no decision, in which
case message processing is passed on to the next rule in the filter chain.
The default set of rules looks something like this:

rankdir=LR /* This gives the right orientation of the columns. */
subgraph in { IN [shape=box, color=lightblue, style=filled] }
approved [group=0, label="<f0> approved | {<f1> | <f2>}"]
emergency [group=0, label="<f0> emergency | {<f1> | <f2>}"]
loop [group=0, label="<f0> loop | {<f1> | <f2>}"]
modmember [group=0, label="<f0> member\nmoderated | {<f1> | <f2>}"]
administrivia [group=0, label="<f0> administrivia | <f1>"]
maxsize [group=0, label="<f0> max\ size | {<f1> | <f2>}"]
any [group=0, label="<f0> any | {<f1> | <f2>}"]
truth [label="<f0> truth | <f1>"]
approved:f1 -> emergency:f0 [weight=100]
emergency:f1 -> loop:f0
loop:f1 -> modmember:f0
modmember:f1 -> administrivia:f0
administrivia:f1 -> maxsize:f0
node [shape=box, style=filled];
DISCARD [shape=invhouse, color=black, style=solid];
MODERATION [color=wheat];
{ PIPELINE [shape=box, style=filled, color=cyan]; }
approved:f2 -> PIPELINE [minlen=2]
modmember:f2 -> MODERATION
emergency:f2:e -> HOLD
maxsize:f2 -> MODERATION
truth:f1 -> PIPELINE [minlen=2]
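
The "decide or pass along" behaviour of a filter chain can be sketched as
follows. This is a simplified model under assumptions, not Mailman's rule
API: messages and lists are plain dictionaries, the keys (`approved`,
`password`, `emergency`, `max_size`, `body`) are hypothetical, only four of
the rules in the diagram are modelled, and both "hold" outcomes are collapsed
into one action string.

```python
def approved(mlist, msg):
    # A valid Approved password short-circuits every later rule.
    if msg.get("approved") is not None and msg["approved"] == mlist.get("password"):
        return "accept"
    return None  # no decision: fall through to the next rule


def emergency(mlist, msg):
    return "hold" if mlist.get("emergency") else None


def max_size(mlist, msg):
    limit = mlist.get("max_size")
    if limit is not None and len(msg.get("body", "")) > limit:
        return "hold"
    return None


def truth(mlist, msg):
    # Always decides, so the chain always terminates here.
    return "accept"


CHAIN = [approved, emergency, max_size, truth]


def process(mlist, msg, chain=CHAIN):
    """Return (rule name, action) for the first rule that decides."""
    for rule in chain:
        action = rule(mlist, msg)
        if action is not None:
            return rule.__name__, action
    raise RuntimeError("chain ended without a decision")
```

Ending the chain with a rule that always decides, like `truth` in the diagram,
means a message that no earlier rule objects to falls through to the
`pipeline` queue.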

Each Runner's configuration object knows whether it should be started
when the Mailman daemon starts, and what queue the Runner manages.
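
As a rough picture of what such a configuration object carries, consider the
sketch below. The class name `RunnerConfig` and its field names are
hypothetical and chosen only for illustration; they are not taken from
Mailman's configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerConfig:
    name: str           # hypothetical: identifies the runner
    queue: str          # which queue this runner manages
    start: bool = True  # start when the Mailman daemon starts?


def runners_to_start(configs):
    """Return the names of runners the daemon should launch at startup."""
    return [config.name for config in configs if config.start]
```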

`bin/mailman`: This is an ubercommand, with subcommands for all the various
things admins might want to do, similar to Mailman 2's mailmanctl, but with

`bin/master`: The runner manager: starts, watches, stops the runner

`bin/runner`: Individual runner daemons. Each instance is configured with
arguments specified on the command line.

A *user* represents a person. A user has an *id* and a *display
name*, and optionally a list of linked addresses.

Each *address* is a separate object, linked to no more than one user.

A list *member* associates an address with a mailing list. Each list member
has an id, a mailing list name, an address (which may be `None`, representing
the user's *preferred address*), a list of preferences, and a *role* such as
"owner" or "moderator". Roles are used to determine what kinds of mail the
user receives via that membership. *Owners* will receive mail to
*list*-owner, but not posts and moderation traffic, for example. A user with
multiple roles on a single list will therefore have multiple memberships in
that list, one for each role.

Roles are implemented by "magical, invisible" *rosters* which are objects
representing queries on the membership database.
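
The relationships above can be sketched with a few small classes. This is a
simplified model, not Mailman's schema: addresses are plain strings rather
than separate objects, preferences are omitted, and the `user` link on
`Member` and the `roster` helper are assumptions made so that the example can
resolve a preferred address and express a roster as a query.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class User:
    id: int
    display_name: str
    addresses: List[str] = field(default_factory=list)
    preferred_address: Optional[str] = None


@dataclass
class Member:
    id: int
    list_name: str
    user: User                     # assumption: link back to the user
    role: str                      # e.g. "member", "owner", "moderator"
    address: Optional[str] = None  # None means "the user's preferred address"

    def delivery_address(self):
        if self.address is not None:
            return self.address
        return self.user.preferred_address


def roster(members, list_name, role):
    """A roster is a query over memberships, not a stored collection."""
    return [m for m in members
            if m.list_name == list_name and m.role == role]
```

A user who is both a regular member and an owner of the same list appears as
two `Member` records, one per role, and shows up in two different rosters.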

Each list *style* is a named object. Its attributes are functions used to
apply the relevant style settings to the mailing list *at creation time*.
Since these are functions, they can be composed in various ways, to create