~barry/mailman/events-and-web

Viewing changes to Mailman/rules/docs/header-matching.txt

Committer: Barry Warsaw
Date: 2008-02-03 04:03:19 UTC
mfrom: (6581.1.27 rules)
Revision ID: barry@python.org-20080203040319-mnb1sar9bumaih01

Merge the 'rules' branch.

Give the first alpha a code name.

This branch mostly gets rid of all the approval-oriented handlers in favor of
a chain-of-rules approach.  This will be much more powerful and extensible,
allowing rule definition by plugin and chain creation via web page.

When a message is processed by the incoming queue, it gets sent through a
chain of rules.  The starting chain is defined on the mailing list object, and
there is a built-in default starting chain, called 'built-in'.  Each chain is
made up of links, which describe a rule and an action, along with possibly
some other information.  Actions allow processing to take a detour through
another chain, jump to another chain, stop processing, run a function, etc.
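
The chain walk described above can be sketched roughly as follows.  This is a
toy model, not the code in Mailman/app/chains.py: the Link class, the
run_chain() function, the action names as plain strings, and the contents of
each chain are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical, simplified model of chains and links; not Mailman's classes.
@dataclass
class Link:
    rule: Callable[[dict], bool]   # returns True when the rule hits
    action: str                    # 'jump', 'detour', or 'stop'
    target: str = ''               # chain name for 'jump' and 'detour'

def run_chain(chains, name, msg, trace):
    """Walk the named chain; return True if processing has stopped."""
    trace.append(name)
    for link in chains[name]:
        if not link.rule(msg):
            continue
        if link.action == 'stop':
            return True
        if link.action == 'jump':
            # A jump never returns to the chain it came from.
            run_chain(chains, link.target, msg, trace)
            return True
        if link.action == 'detour':
            # A detour returns here unless the other chain stopped.
            if run_chain(chains, link.target, msg, trace):
                return True
    return False

always = lambda msg: True
CHAINS = {
    'built-in': [
        Link(lambda msg: not msg.get('subject'), 'jump', 'hold'),
        Link(always, 'detour', 'header-match'),
        Link(always, 'jump', 'accept'),
    ],
    'header-match': [
        Link(lambda msg: '****' in msg.get('x-spam-score', ''),
             'jump', 'discard'),
    ],
    'accept': [Link(always, 'stop')],
    'hold': [Link(always, 'stop')],
    'discard': [Link(always, 'stop')],
}

trace = []
run_chain(CHAINS, 'built-in', {'subject': 'hello'}, trace)
print(trace)
```

The trace lists the chains visited in order, which makes the difference
between a detour (control comes back) and a jump (it does not) easy to see.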

The built-in chain essentially implements the original early part of the
handler pipeline.  If a message makes it through the built-in chain, it gets
sent to the prep queue, where the message is decorated and such before sending
out to the list membership.  The 'accept' chain is what moves the message into
the prep queue.

There are also 'hold', 'discard', and 'reject' chains, which do what you would
expect them to.  There are lots of built-in rules, implementing everything
from the old emergency handler to new handlers such as one not allowing empty
subject headers.

IMember grows an is_moderated attribute.

The 'adminapproved' metadata key is renamed 'moderator_approved'.

Fix some bogus uses of noreply_address to no_reply_address.

Stash an 'original_size' attribute on the message after parsing its plain
text.  This can be used later to ensure the original message does not exceed a
specified size without having to flatten the message again.
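
A minimal sketch of the idea, using only the standard library.  The helper
names parse_with_size() and exceeds_limit() are hypothetical, and treating the
limit as kilobytes with 0 meaning "no limit" is an assumption, not something
this revision states.

```python
from email import message_from_string

def parse_with_size(text):
    """Parse the text and remember how large it was before parsing."""
    msg = message_from_string(text)
    # Stash the size once, so later checks need not flatten the message.
    msg.original_size = len(text)
    return msg

def exceeds_limit(msg, max_kb):
    """Assumed semantics: max_kb is in kilobytes, 0 disables the check."""
    return max_kb > 0 and msg.original_size / 1024.0 > max_kb

text = 'Subject: hi\n\n' + 'x' * 2048
msg = parse_with_size(text)
print(msg.original_size, exceeds_limit(msg, 1))
```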

The KNOWN_SPAMMERS global variable is replaced with HEADER_MATCHES.  The
mailing list's header_filter_rules variable is replaced with header_matches
which has the same semantics as HEADER_MATCHES, but is list-specific.

DEFAULT_MAIL_COMMANDS_MAX_LINES -> EMAIL_COMMANDS_MAX_LINES.

Update smtplistener.py to be much better: use maildir format instead of mbox
format, respond to RSET commands by clearing the maildir, and silence annoying
asyncore error messages.

Extend the doctest runner so that it will run .txt files in any docs
subdirectory in the code tree.

Add pluggable keys 'mailman.mta' and 'mailman.rules'.  The latter may have only
one setting while the former is extensible.

There are lots of doctests which should give all the gory details.

Mailman/Post.py -> Mailman/inject.py and the command line usage of this module
is removed.

SQLALCHEMY_ECHO, which was unused, is removed.

Backport the ability to specify additional footer interpolation variables by
the message metadata 'decoration-data' key.

can_acknowledge() defines whether a message can be responded to by the email
robot.

Simplify the implementation of _reset() based on Storm fixes.  Be able to
handle lists in Storm values.

Do some reorganization.

files added:
Mailman/app/chains.py

Mailman/app/rules.py

Mailman/chains

Mailman/chains/__init__.py

Mailman/chains/accept.py

Mailman/chains/base.py

Mailman/chains/builtin.py

Mailman/chains/discard.py

Mailman/chains/headers.py

Mailman/chains/hold.py

Mailman/chains/reject.py

Mailman/docs/chains.txt

Mailman/interfaces/chain.py

Mailman/interfaces/rules.py

Mailman/queue/docs

Mailman/queue/docs/OVERVIEW.txt

Mailman/queue/docs/incoming.txt

Mailman/rules

Mailman/rules/__init__.py

Mailman/rules/administrivia.py

Mailman/rules/any.py

Mailman/rules/approved.py

Mailman/rules/docs

Mailman/rules/docs/administrivia.txt

Mailman/rules/docs/emergency.txt

Mailman/rules/docs/header-matching.txt

Mailman/rules/docs/implicit-dest.txt

Mailman/rules/docs/loop.txt

Mailman/rules/docs/max-size.txt

Mailman/rules/docs/moderation.txt

Mailman/rules/docs/news-moderation.txt

Mailman/rules/docs/no-subject.txt

Mailman/rules/docs/recipients.txt

Mailman/rules/docs/rules.txt

Mailman/rules/docs/suspicious.txt

Mailman/rules/docs/truth.txt

Mailman/rules/emergency.py

Mailman/rules/implicit_dest.py

Mailman/rules/loop.py

Mailman/rules/max_recipients.py

Mailman/rules/max_size.py

Mailman/rules/moderation.py

Mailman/rules/news_moderation.py

Mailman/rules/no_subject.py

Mailman/rules/suspicious.py

Mailman/rules/truth.py

Mailman/tests/helpers.py

files removed:
Mailman/Handlers/Approve.py

Mailman/Handlers/Emergency.py

Mailman/Handlers/Hold.py

Mailman/Handlers/SpamDetect.py

Mailman/docs/antispam.txt

Mailman/docs/hold.txt

Mailman/queue/tests

Mailman/queue/tests/__init__.py

files renamed:
Mailman/Post.py => Mailman/inject.py

Mailman/docs/news-runner.txt => Mailman/queue/docs/news.txt

Mailman/docs/outgoing.txt => Mailman/queue/docs/outgoing.txt

Mailman/docs/runner.txt => Mailman/queue/docs/runner.txt

Mailman/docs/switchboard.txt => Mailman/queue/docs/switchboard.txt

Mailman/docs/approve.txt => Mailman/rules/docs/approve.txt

files modified:
Mailman/Defaults.py

Mailman/Handlers/Decorate.py

Mailman/Message.py

Mailman/Utils.py

Mailman/app/bounces.py

Mailman/app/moderator.py

Mailman/app/replybot.py

Mailman/app/styles.py

Mailman/bin/inject.py

Mailman/configuration.py

Mailman/database/mailinglist.py

Mailman/database/mailman.sql

Mailman/database/member.py

Mailman/database/model.py

Mailman/database/pending.py

Mailman/docs/mlist-addresses.txt

Mailman/docs/requests.txt

Mailman/initialize.py

Mailman/interfaces/__init__.py

Mailman/interfaces/mailinglist.py

Mailman/interfaces/member.py

Mailman/queue/__init__.py

Mailman/queue/command.py

Mailman/queue/incoming.py

Mailman/templates/en/__init__.py *

Mailman/templates/en/postauth.txt

Mailman/tests/bounces/__init__.py *

Mailman/tests/smtplistener.py

Mailman/tests/test_documentation.py

docs/NEWS.txt

setup.py

Mailman/rules/docs/header-matching.txt

Header matching
===============

Mailman can do pattern-based header matching during its normal rule
processing.  There is a set of site-wide default header matches specified in
the configuration file under the HEADER_MATCHES variable.

>>> from Mailman.app.lifecycle import create_list
>>> mlist = create_list(u'_xtest@example.com')

Because the default HEADER_MATCHES variable is empty when the configuration
file is read, we'll just extend the current header matching chain with a
pattern that matches 4 or more stars, discarding the message if it hits.

>>> from Mailman.configuration import config
>>> chain = config.chains['header-match']
>>> chain.extend('x-spam-score', '[*]{4,}', 'discard')

First, if the message has no X-Spam-Score header, the message passes through
the chain untouched (i.e. no disposition).

>>> msg = message_from_string("""\
... From: aperson@example.com
... To: _xtest@example.com
... Subject: Not spam
... Message-ID: <one>
...
... This is a message.
... """)
>>> from Mailman.app.chains import process

A pass-through shows up as nothing being written to the log file during
processing.

# XXX This checks the vette log file because there is no other evidence
# that this chain has done anything.
>>> import os
>>> fp = open(os.path.join(config.LOG_DIR, 'vette'))
>>> fp.seek(0, 2)
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG:
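
The log check above relies on a simple pattern: remember the end-of-file
offset, run the action, then read only what was appended after that offset.
Here is a self-contained sketch of that pattern against a temporary file.  The
helper names output_during() and append_line() are hypothetical, and the
temporary file merely stands in for the vette log.

```python
import os
import tempfile

def append_line(log_path, line):
    """Stand-in for whatever writes to the log during processing."""
    with open(log_path, 'a') as fp:
        fp.write(line + '\n')

def output_during(log_path, action):
    """Return only the text appended to the log while action() runs."""
    with open(log_path) as fp:
        fp.seek(0, os.SEEK_END)      # same as fp.seek(0, 2)
        position = fp.tell()
        action()
        fp.seek(position)
        return fp.read()

log_path = os.path.join(tempfile.mkdtemp(), 'vette')
append_line(log_path, 'an older entry')

# Nothing is logged, so nothing new is read back.
print('LOG:', output_during(log_path, lambda: None))
# One line is logged, and only that line is read back.
print('LOG:', output_during(
    log_path, lambda: append_line(log_path, 'DISCARD: <three>')))
```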

Now, if the header exists but does not match, then it also passes through
untouched.

>>> msg['X-Spam-Score'] = '***'
>>> del msg['subject']
>>> msg['Subject'] = 'This is almost spam'
>>> del msg['message-id']
>>> msg['Message-ID'] = '<two>'
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG:

But now if the header matches, then the message gets discarded.

>>> del msg['x-spam-score']
>>> msg['X-Spam-Score'] = '****'
>>> del msg['subject']
>>> msg['Subject'] = 'This is spam, but barely'
>>> del msg['message-id']
>>> msg['Message-ID'] = '<three>'
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG: ... DISCARD: <three>

For kicks, let's show a message that's really spammy.

>>> del msg['x-spam-score']
>>> msg['X-Spam-Score'] = '**********'
>>> del msg['subject']
>>> msg['Subject'] = 'This is really spammy'
>>> del msg['message-id']
>>> msg['Message-ID'] = '<four>'
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG: ... DISCARD: <four>

Flush out the extended header matching rules.

>>> chain.flush()
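
The behavior of the '[*]{4,}' pattern in the cases above can be checked with
the standard library alone.  This is only a sketch: header_hits() is a
hypothetical helper, and the use of re.search() with case-insensitive matching
is an assumption about how the chain applies the pattern.

```python
import re
from email import message_from_string

def header_hits(msg, header, pattern):
    """True if any value of the named header matches the pattern."""
    # Assumption: search semantics, case-insensitive, over every value.
    return any(re.search(pattern, value, re.IGNORECASE)
               for value in msg.get_all(header, []))

base = 'From: aperson@example.com\nSubject: test\n'
for score in (None, '***', '****', '**********'):
    extra = '' if score is None else 'X-Spam-Score: %s\n' % score
    msg = message_from_string(base + extra + '\nThis is a message.\n')
    print(score, header_hits(msg, 'x-spam-score', '[*]{4,}'))
```

Header lookup in the email package is case-insensitive, which is why the
lowercase 'x-spam-score' finds the X-Spam-Score header.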

List-specific header matching
-----------------------------

Each mailing list can also be configured with a set of header matching regular
expression rules.  These are used to impose list-specific header filtering
with the same semantics as the global `HEADER_MATCHES` variable.

Suppose the list administrator wants to match on three or more plus signs
rather than four stars, and only for the current mailing list.

>>> mlist.header_matches = [('x-spam-score', '[+]{3,}', 'discard')]

A message with a spam score of two pluses does not match.

>>> del msg['x-spam-score']
>>> msg['X-Spam-Score'] = '++'
>>> del msg['message-id']
>>> msg['Message-ID'] = '<five>'
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG:

A message with a spam score of three pluses does match.

>>> del msg['x-spam-score']
>>> msg['X-Spam-Score'] = '+++'
>>> del msg['message-id']
>>> msg['Message-ID'] = '<six>'
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG: ... DISCARD: <six>

As does a message with a spam score of four pluses.

>>> del msg['x-spam-score']
>>> msg['X-Spam-Score'] = '++++'
>>> del msg['message-id']
>>> msg['Message-ID'] = '<seven>'
>>> file_pos = fp.tell()
>>> process(mlist, msg, {}, 'header-match')
>>> fp.seek(file_pos)
>>> print 'LOG:', fp.read()
LOG: ... DISCARD: <seven>
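
To make the (header, pattern, action) triple format concrete, here is a rough
standalone sketch of how such a list might be evaluated.  The first_action()
and scored() helpers are hypothetical, and returning the action of the first
triple that matches is an assumption; the real chain may order or combine
rules differently.

```python
import re
from email.message import Message

def first_action(msg, header_matches):
    """Return the action of the first matching triple, or None."""
    for header, pattern, action in header_matches:
        for value in msg.get_all(header, []):
            # Assumption: case-insensitive search anywhere in the value.
            if re.search(pattern, value, re.IGNORECASE):
                return action
    return None

def scored(score):
    """Build a bare message carrying only an X-Spam-Score header."""
    msg = Message()
    msg['X-Spam-Score'] = score
    return msg

list_matches = [('x-spam-score', '[+]{3,}', 'discard')]
for score in ('++', '+++', '++++'):
    print(score, first_action(scored(score), list_matches))
```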
