Code Clean-Up - Phase 1
-----------------------

Bogofilter was released over a year ago and has continually been
extended, corrected, enhanced, and refined. Over this time it has
evolved from a simple Bayesian filter to a sophisticated filter that
understands email, decodes text parts of multi-part MIME messages,

During this evolution, old functions have remained in the code and
command-line options have been added to provide compatibility with
older versions. Many of these functions and options have started
collecting dust - some are not commonly used and others are not

Bogofilter is suffering from creeping featuritis and optionitis.

It is time to clean house!

The goal of the bogofilter 0.16 series is to clean out this excess
code and create a core of high-quality code. This will necessarily cut
some ties with previous versions, and you may need to adjust your
wrapper scripts to make up for features we have dropped.

The following list is intended to be complete. Let us know if we've
omitted anything. We shall try to provide workarounds and migration
paths whenever possible.

1) Scoring algorithms:

Bogofilter will support only the Robinson-Fisher algorithm,
commonly called the "Fisher algorithm". The Graham algorithm and
Robinson geometric-mean algorithm, a.k.a. Robinson algorithm, have

Bogofilter will now support only the combined wordlist, i.e.
wordlist.db, which contains both the ham and spam counts for each
token. The older, separate wordlists (spamlist.db and goodlist.db)
are no longer supported.

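To illustrate the difference between the separate and combined layouts, here is a minimal sketch in Python. It is not bogofilter's database format or the bogoupgrade implementation; the `TokenCounts` record, the `combine_wordlists` function, and the sample tokens are hypothetical names chosen only for this example. It assumes each separate wordlist can be viewed as a plain token-to-count mapping.

```python
from collections import namedtuple

# Hypothetical record: one entry per token holding both counts,
# mirroring the idea of a combined wordlist.
TokenCounts = namedtuple("TokenCounts", ["spam", "ham"])


def combine_wordlists(spam_counts, ham_counts):
    """Merge two token->count mappings into one token->TokenCounts mapping.

    A token missing from one of the inputs gets a count of 0 on that side.
    """
    combined = {}
    for token in set(spam_counts) | set(ham_counts):
        combined[token] = TokenCounts(
            spam=spam_counts.get(token, 0),
            ham=ham_counts.get(token, 0),
        )
    return combined


if __name__ == "__main__":
    # Made-up sample data standing in for spamlist.db and goodlist.db.
    spamlist = {"offer": 12, "meeting": 1}
    goodlist = {"meeting": 30, "patch": 7}
    merged = combine_wordlists(spamlist, goodlist)
    for token in sorted(merged):
        print(token, merged[token].spam, merged[token].ham)
```

With a single record per token, one lookup yields both counts, which is the practical benefit of the combined layout.
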
The bogoupgrade program can still be used to merge the separate
databases for you. Type "bogoupgrade -d /your/wordlist/directory/"

Ignore lists, i.e. ignorelist.db, are also being deprecated. The
ignore list feature has never been thoroughly tested and is not
used (as far as we know).

Binary RPM packages are now being built with BerkeleyDB-4.1 (or

For convenience, use whatever BerkeleyDB version came with your
system. We have tested BerkeleyDB 3.2 and newer, but our testing
focuses on the recent 4.X releases. We developers no longer use
BerkeleyDB-3.3, but will leave the code in bogofilter to allow its
continued use.

4) Command line switches:

Bogofilter will no longer support the switches listed in this
section. If one of them is used, bogofilter will print an error
message and exit.

Scoring-related switches:

-g - select Graham algorithm
-r - select Robinson Geometric-Mean algorithm
-f - select Robinson-Fisher algorithm
-2 - set binary classification mode
-3 - set ternary classification mode

Note: The Robinson-Fisher algorithm is bogofilter's one and
only algorithm. The classification mode switches are
unnecessary. Bogofilter will use binary mode if ham_cutoff is
zero and will use ternary mode (Yes, No, Unsure) if ham_cutoff
is non-zero and less than spam_cutoff.

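The mode selection rule in the note above can be sketched as follows. This is a hedged illustration in Python, not bogofilter's code: the function names are hypothetical, the cutoff values in the usage lines are made up, and the direction and inclusiveness of the score comparisons (a score at or above spam_cutoff is "Yes", a score at or below ham_cutoff is "No") are assumptions. Only the rule for choosing between binary and ternary mode comes from the note.

```python
def classification_mode(ham_cutoff, spam_cutoff):
    """Pick the mode from the two cutoffs, following the note's rule."""
    if ham_cutoff == 0:
        return "binary"
    if ham_cutoff < spam_cutoff:  # non-zero and less than spam_cutoff
        return "ternary"
    # The note does not cover this case; rejecting it is an assumption.
    raise ValueError("ham_cutoff must be zero or less than spam_cutoff")


def classify(score, ham_cutoff, spam_cutoff):
    """Return "Yes" (spam), "No" (ham), or "Unsure" for a score.

    The comparison operators below are assumptions made for this sketch.
    """
    mode = classification_mode(ham_cutoff, spam_cutoff)
    if score >= spam_cutoff:
        return "Yes"
    if mode == "binary" or score <= ham_cutoff:
        return "No"
    return "Unsure"


if __name__ == "__main__":
    # Illustrative cutoffs only, not bogofilter defaults.
    print(classify(0.50, 0, 0.95))     # binary mode: never "Unsure"
    print(classify(0.50, 0.10, 0.95))  # ternary mode: middle band is "Unsure"
```

Because the mode follows directly from the two cutoff values, a separate switch to select it carries no extra information.
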
-W  - use combined wordlist for spam and ham tokens
-WW - use separate wordlists for spam and ham tokens

Note: Combined mode is now the only supported mode.

Backwards-compatible token generation switches:

-Pi and -PI - ignore_case
-Pt and -PT - tokenize_html_tags
-Pc and -PC - strict_check
-Pd and -PD - degen_enabled
-Pf and -PF - first_match

Note: Since last May, the default values for these switches

tokenize_html_tags  enabled
strict_check        disabled
degen_enabled       disabled

There will be no change in the default values.

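Since using any of the switches in this section makes bogofilter exit with an error, it can help to check wrapper scripts before upgrading. The following Python sketch reports which arguments in a command line are on the removed list. The table is copied from this section; the function name is hypothetical, and the sketch assumes each switch is passed as its own argument (it does not try to handle switches bundled together in one argument).

```python
# Switches removed in this release, with the descriptions given above.
REMOVED_SWITCHES = {
    "-g": "select Graham algorithm",
    "-r": "select Robinson Geometric-Mean algorithm",
    "-f": "select Robinson-Fisher algorithm",
    "-2": "set binary classification mode",
    "-3": "set ternary classification mode",
    "-W": "use combined wordlist",
    "-WW": "use separate wordlists",
    "-Pi": "ignore_case", "-PI": "ignore_case",
    "-Pt": "tokenize_html_tags", "-PT": "tokenize_html_tags",
    "-Pc": "strict_check", "-PC": "strict_check",
    "-Pd": "degen_enabled", "-PD": "degen_enabled",
    "-Pf": "first_match", "-PF": "first_match",
}


def find_removed_switches(argv):
    """Return the arguments that exactly match a removed switch, in order."""
    return [arg for arg in argv if arg in REMOVED_SWITCHES]


if __name__ == "__main__":
    # Example wrapper arguments; "-v" stands for any argument not in the table.
    args = ["-f", "-v", "-WW", "-Pi"]
    for switch in find_removed_switches(args):
        print(f"{switch}: no longer supported ({REMOVED_SWITCHES[switch]})")
```

The sketch only reports matches rather than stripping them, because dropping a switch such as -g or -WW silently would change what the wrapper script does.
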
5) Configuration options:

The following configuration options (for the above switches) are

The following configuration options (which don't correspond to
switches) are deprecated:

Note: Bogofilter will print a warning message if it sees any of
these options, but will otherwise run normally.

The user-formatted SPAM_HEADER will no longer support the format
specification "%a" (for algorithm) since bogofilter now has only

With the 0.16.0 release, a number of features have been deprecated.
The relevant code is bracketed by "#ifdef ENABLE_DEPRECATED_CODE" and
"#endif" statements. The default build will not include the
deprecated features. For those who still need these features,
configure option "--enable-deprecated-code" exists to allow them to be

Bogofilter 0.16.0 will be the "Code Clean-Up - Phase 1" release. The
"deprecated" state will exist until 0.16.X is promoted to "stable"
status, or for a month, whichever is longer.

Bogofilter 0.17.0 will be the "Code Clean-Up - Phase 2" release. All the
deprecated code will be removed.