NOTE: More information on important changes for bogofilter updaters
is in the RELEASE.NOTES files. Read them!!
* Added ESF options to bogofilter's man page.
* Revised man pages' description of multi-parameter options.
* Fixed problem recognizing empty line ending of header in
  files with CRLFs and X-Bogosity line as last header line.
* Added ESF options to bogofilter's help message.
* t.lock3 regression modified for Solaris shell compatibility.
* Fix abort during db_open (ds_open wasn't first calling
* Added regression test t.lock3 for this fix.
* Revise datastore and database levels so that each level
  calls its own init() and cleanup() routines.
* Added format specification '%I' to allow logging of the IP
  address from which an email was received.
* Avoid "Invalid buffer size, exiting." problems by discarding
  text from an excessively long html tag.
* Bogofilter's creation of new wordlists now includes
  .WORDLIST_VERSION token.
* Fix erroneous double opening of wordlists specified on
* Corrected tests so that "make check" passes with qdbm and
* Fix included GSL compile for compilers that do not support
  "extern inline", such as Compaq C V6.3.
* FAQ updated with info on multiple wordlists and ignore
* FAQ updated with info on building as non-root user.
* Fix problem with not expanding tildes.
* Add DS_LOAD flag to distinguish between database creation by
  bogofilter and bogoutil so .WORDLIST_VERSION is only added
* Modify regression tests to use bogoutil to create empty
  wordlists (as needed).
* Fixed registration. When multiple wordlists are specified,
  registration is to the first regular wordlist.
* Added regression test for multiple wordlists.
* Distinguish open modes of READ, WRITE, and CREATE so
  bogofilter will include .WORDLIST_VERSION in new database.
* Modify contrib scripts so they're sh compatible.
* Fix problem with not expanding tildes.
* Use named constants for wordlist 'type' attribute.
* Fix problem when bogofilter's home is a symlink.
* Miscellaneous cleanups for multiple wordlist code.
* Clean up variable names in database open() code.
* Increase width of 'count' column for -vvv output.
* Add ignore list capability.
* Revive and revise multiple wordlist code.
* Revised bogotune's parameter ranges for coarse scan.
* Changed list address in FAQ to bogofilter.org
* Added code for Robinson's Effective Size Factor (ESF)
  to score.c, bogotune.c, bogofilter.cf.example, etc.
* Added '-E' option to bogotune to suppress ESF scan.
* Remove unreferenced enum wl_e.
* Add .WORDLIST_VERSION meta symbol.
* Change subnet prefix from url: to ip:
* Add -u switch to bogoutil to do wordlist upgrade.
* Add regression test t.upgrade.subnet.prefix
* Fixed configure's --enable-memdebug option.
* Updated TODO list and procmailrc.example
* Lower output precision for regression tests, by using %f
  rather than %e, to mask differences between GSL versions.
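The effect of switching from %e to %f can be sketched in Python, whose printf-style formatting follows the same conventions. The two values are hypothetical stand-ins for results that differ only in low-order digits; they are not taken from the regression tests.

```python
def render(value: float, spec: str) -> str:
    """Format a value with a printf-style conversion such as '%e' or '%f'."""
    return spec % value


# Hypothetical results from two library versions that disagree
# only in the low-order digits.
a = 0.000123456789
b = 0.000123456999

# %e keeps seven significant digits, so the difference is visible.
print(render(a, "%e"), render(b, "%e"))
# %f keeps six digits after the decimal point, so both print alike.
print(render(a, "%f"), render(b, "%f"))
```

For small values %f discards more digits than %e, which is why reference outputs written with %f are less sensitive to such differences.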
2004-04-08 - New Stable Release
* Change default parameters to the results of Greg's 300k ham
  and 300k spam bogotune run:
         robs      robx      min_dev   spam_co
    old  0.010000  0.415000  0.100000  0.950000
    new  0.017800  0.520000  0.375000  0.990000
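To illustrate what these parameters control, here is a minimal sketch based on Robinson's smoothing formula f(w) = (s*x + n*p) / (s + n), with s = robs and x = robx. It assumes that min_dev is the minimum distance from 0.5 a token needs in order to be scored; the function names are hypothetical and the code is not taken from score.c.

```python
def robinson_f(p: float, n: int, robs: float, robx: float) -> float:
    """Robinson's smoothed probability for a token seen n times
    with raw spam probability p."""
    return (robs * robx + n * p) / (robs + n)


def is_scored(fw: float, min_dev: float) -> bool:
    """Assumption: a token counts only if it deviates enough from 0.5."""
    return abs(fw - 0.5) >= min_dev


# Parameter sets copied from the table above.
OLD = {"robs": 0.0100, "robx": 0.415, "min_dev": 0.100}
NEW = {"robs": 0.0178, "robx": 0.520, "min_dev": 0.375}

# A hypothetical token seen once, with a raw spam probability of 0.8.
for name, params in (("old", OLD), ("new", NEW)):
    fw = robinson_f(0.8, 1, params["robs"], params["robx"])
    print(name, round(fw, 4), is_scored(fw, params["min_dev"]))
```

Under these assumptions the larger min_dev makes the new defaults ignore weakly informative tokens that the old defaults would have scored.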
* Fix check for PGP signatures.
* Ignore data portion of PGP signatures.
* Use "mime:" (rather than "head:") to tag mime part headers.
* Fix "Can't find '.MSG_COUNT'" problem in bogotune.
* Fix defect that continues decoding base64/qp after invalid chars.
* Fix message reporting lack of config file.
* Warn if user-specified config file doesn't exist.
* Fix tagging of IPAddrs in header lines.
* Added code to use Berkeley DB Concurrent Data Store, which
  uses a "DB_ENV" BerkeleyDB environment and the BerkeleyDB
  Lock Manager. This is documented in file doc/README.db.
* "bogofilter -Q" output now includes all config file options
  and is formatted to be read in as a config file.
* Removed t.lock2 from "make check".
* Fixed minor bogotune problems related to building wordlists.
* Exempt tokens .MSG_COUNT and .ROBX from maintenance operations.
* Remove unused 'active' and 'weight' attributes of wordlists.
* If message uses CRLF, put CRLF after added header lines.
* Fixed code causing gcc 2.9x complaint.
* Algorithm code cleanup -- created file score.[ch] from
  files method.[ch], robinson.[ch], and fisher.[ch]
* Changed path in example config files to /var/spool/bogofilter.
* Corrected typos in doc/bogofilter.xml (the man page)
  and in contrib/mailfilter.example
* Documented bogoutil's histogram ('-H') option.
* Revised documentation of '-d', '-p', and '-w' options
* Relax filename vs. path restrictions in bogoutil.
* Added experimental code for BerkeleyDB environments
  for improved locking and caching.
* Additional portability fixes for SunOS 4.1.4
* Fix '-d' and '--bogofilter_dir=directory' options.
* Documentation cleanups.
* Fixed warnings from Intel C++ 8
* Portability fixes for SunOS 4.1.4
* Removed DEPRECATED CODE.
* Bogoutil now 'and's the maintenance operations together.
* Bogofilter and bogolexer now use standard long option
* Extracted basic typedefs from system.h into bftypes.h.
* Correct processing order as '-V' doesn't need to read the
* Correct SIGSEGV caused by missing environment variables.
0.16.4 2004-01-23 - Released
2004-01-31 - Promoted to Stable Version
* Config options 'thresh_stats' and 'thresh_rtable' have been
* Fix bogus config file warnings from bogolexer.
* Merge lexer rules to save a few bytes.
* Additional portability fixes.
* Removed optional argument from "-u" flag.
* Config file options can be put on the command line by using
  "--key=value" syntax.
* Added '-c' (compact) option to bogominitrain.pl
* Tag Cc: lines as To: lines.
* Allow setting threshold for auto-updating, using the "-u
  value" flag or the "thresh_update=value" option.
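The "--key=value" form for config file options, such as thresh_update=value, can be sketched as a small argument splitter. This is an illustration only; the function name and the sample values are hypothetical, and bogofilter's own option parser is written in C and may behave differently.

```python
def parse_long_options(argv):
    """Split '--key=value' arguments into a dict and leave
    every other argument in place."""
    options, rest = {}, []
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            options[key] = value
        else:
            rest.append(arg)
    return options, rest


# Hypothetical command line mixing a short flag with two config options.
opts, others = parse_long_options(
    ["-p", "--thresh_update=0.1", "--bogofilter_dir=/tmp/bf"]
)
print(opts)    # config-style options keyed by name
print(others)  # remaining arguments, untouched
```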
* Pass -mieee to GCC on machines that know this option (Alpha, SH).
  Prevents SIGFPE and program termination with coredumps in some
  conditions. Reported by Paul Ackersville, fix suggested by Clint
* Fix defined minimum autoconf/automake versions to match actual
  requirements: autoconf 2.54, automake 1.7.
* Change AS_HELP_STRING to AC_HELP_STRING for compatibility with older
  autoconf than version 2.58.
* Fix two compiler warnings in BerkeleyDB data store code. It is
  unknown whether one has real-world impact when opening data bases,
  fixing nonetheless. The other is cosmetic in any case.
* '-H' now disables header tagging for bogofilter and bogolexer.
* Ignore X-Bogosity lines in message header.
* Updated bogotune man page.
* Fixed minor errors in bogotune.
* Corrected memory leaks affecting bogotune.
* Added additional ifdefs for deprecated code.
* Added database histogram to bogoutil (using '-H' flag).
* Added "#ifdef ENABLE_DEPRECATED_CODE" statements as
* Corrected configure.ac defect for "--enable-memdebug".
* Corrected lexer defect that created unnecessary
  "rcvd:Received" tokens.
* Fixed header file problem that causes sigsegv with qdbm.
* Expanded VERP pattern to reduce hapax count.
0.15.13 2003-12-31 - Stable Release
If updating from a bogofilter version older than 0.15.4, it's
a good idea to retrain or else use the '-H' (header tag
degeneration) option.
* Fix problem with separate wordlists that causes
  "db_get_dbvalue( '...' ), err: 12, out of memory".
* Fixed memory leak affecting bogotune.
* Added -V (version) option to bogotune.
* RPM build fixed. (with help from Charles A Edwards)
* Compiler options changed to reduce warnings during gsl builds.
* Distinguish between pipe and stdin as input sources.
  (contributed by Henning Makholm)
* Minor fixes for parsing msg-count files.
* Fix sigsegv in bogotune when '-D' option is used.
* Clarify list of supported mail formats in FAQ.
* Force line buffered output when not in passthrough mode.
* Enhance memdebug capabilities.
* Fix decoding of escaped urls.
* Fix compilation problems with datastore_tdb.c and datastore_qdbm.c
* Additional portability fixes for DGUX in configure.ac and bogogrep.c
* Bugfix for SIGFPE (division by zero) crash on start-up on systems
  with BerkeleyDB 3.2 or older.
* bogotune-faq.html - new file
* bogofilter-faq.html - updated
* bogoupgrade - revised help message and man page
* bogotune now understands degeneration options and can use
  them when creating message-count files.
* Fixed CRLF problem in bogoreader.c
* Updates to bogofilter man page and FAQ.
* Updates to configure scripts and makefiles to support DGUX.
* Improved configuration of BerkeleyDB and GSL.
* Removed unused '-F' (force) option from bogofilter.
* Exclude ~ (tilde) at the end of tokens.
* Removed unused '-q' (quiet) option from bogofilter.
* Important string format fixes that address "prints garbage" and
* Formatting and portability fixes for DGUX.
* The test suite ("make check") now works without formail for
* Added -M option to bogotune for creating message count
  files. msg-count.sh no longer needed.
* Fixed "configure --enable-static" for building statically
* "make check" now uses static executables when "configure
  --enable-static" was used.
* Revised msg-count.sh so that formail isn't needed.
* Script msg-count.sh has been added to create message count
  files from mailboxes and mail directories.
* Fixed bug in header degeneration.
* Added degeneration options to config file.
* Added subject line tagging for Unsures. (Contributed by
* Multiple fixes, revisions, and changes to bogotune.
* Fix defect in robx calculation (for bogoutil and bogotune).
* The test suite ("make check") now works without procmail for t.MH
* Moved robx calculation code to new file for sharing by bogoutil
* Fix segfault when using '-H' (header_degen) option.
* Minor revisions to the FAQ.
* Miscellaneous cleanups to bogotune.c
* Clean up bogoutil's help message.
* Fixed a defect in lexer.c that was discarding X-Bogosity
  lines in rfc822 attachments. Thanks to Martin Gagern for
  reporting the problem and supplying a patch.
* Fixed a memory leak in bogoutil. Thanks to Dan Deward for
  reporting the problem.
* Refactored passthrough.c.
* Test suite bugfixes for TDB/QDBM.
* TDB passes all checks again.
* Fixed a defect in QDBM support that was breaking maintenance mode.
* Revised wordhash code to minimize storage needed for bogotune.
* Exclude apostrophes and backticks at the end of a token.
* Updated bogominitrain.pl to v1.4.2
* BerkeleyDB support warns if data base size approaches file size
  resource limit, to avoid DB corruption when bogofilter is spawned
  from Postfix-controlled systems (Postfix by default limits file
  size to 51,200,000 bytes).
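The size check described in the last entry can be sketched as a comparison against the file size limit. The 90% margin and the function name are hypothetical choices for illustration; the entry does not say how close to the limit the warning is triggered.

```python
# Postfix default file size limit in bytes, as quoted in the entry above.
POSTFIX_DEFAULT_LIMIT = 51_200_000


def approaches_limit(db_size: int, limit: int, margin: float = 0.9) -> bool:
    """Return True once the database has reached `margin` of the limit.

    The 0.9 default is an assumption made for this sketch.
    """
    return db_size >= limit * margin


# Hypothetical database sizes in bytes.
for size in (10_000_000, 50_000_000):
    print(size, approaches_limit(size, POSTFIX_DEFAULT_LIMIT))
```

On a Unix system the actual limit could be read with resource.getrlimit(resource.RLIMIT_FSIZE) and the database size with os.path.getsize().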
* "<!DOCTYPE HTML PUBLIC...>" is now recognized as starting
* Bogotune now checks for incorrectly classified messages in the test
  data and exits if so.
* Configure now finds a POSIX compliant shell for running version.sh
* Several minor lexer bugs fixed.
* Lexer changes reduce size of bogofilter executable by approx 90%.
* Minor error message cleanups.
* Minor documentation fixes.
* Fixed timestamp config option.
* Removed repetition counts in lexer for TOKEN and MIME_BOUNDARY
  patterns to reduce executable size.
* Remove --disable-* options for algorithms. Has never been supported
  well and serves no useful purpose, the algorithm code is irrelevant
  compared to lexer or other stuff.
* Print "X-Bogosity" line when "-t" is used alone.
* Modified handling of mime attachments to decode rfc822 and
  to ignore applications and images.
* Change bogoupgrade back to using 2 arg open for perl-5.6
* Added man page for bogotune.
* Initial release of faster, C language version of bogotune.
* Configure script modified to better detect BerkeleyDB libs.
* Makefile modified to build bogolexer and bogoutil with fewer
* Added decoding of percent escaped characters in URLs.
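Percent-escape decoding itself is easy to demonstrate with Python's standard library. This only shows what the decoding does to a URL; bogofilter implements it in its own C code, and the sample URL is made up.

```python
from urllib.parse import unquote


def decode_url_escapes(text: str) -> str:
    """Replace %XX escapes with the characters they encode."""
    return unquote(text)


# Hypothetical URL that hides the word "buy" behind percent escapes.
escaped = "http://example.com/%62%75%79"
print(decode_url_escapes(escaped))
```

Decoding before tokenizing means an obfuscated word and its plain spelling yield the same token.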
* English and French versions of bogofilter-faq.html revised.
* Tuning script doc/bogotune rewritten as C program.
* Fix build problem in doc directory.
* Refactored source code so bogolexer will build without
* Fix initialization problem that prevents reading more than
* Initialize wordhash storage.
* Include all tokens in bogoutil dump output (unless in
* Disable header line tagging when processing msg-count files.
* Added decoding of escaped characters in html.
* Revised mailbox processing so type recognition is now
* Added support for ANT mailboxes.
* Made portability changes for OS/2 and RISC-OS
* Fixed problem in bogoupgrade.
* Revised configure.ac to remove GLIBC-2.3 dependency and
  include info for DOS-ish file in config.h.
* Bogofilter can now use GSL 1.0 to 1.3 as well as 1.4.
  If your distribution splits GSL into a library and a developer
  package (Mandrake and Debian Linux), remember to install both!
* Rebuild i586.rpm with GSL dynamically linked; source rpms
  and bogofilter-static rpm not affected.
* Don't allow whitespace in SMTP and ESMTP tokens.
* Revised reference outputs for SMTP/ESMTP change.
* Rebuild of i586.rpm with GSL-1.4 dynamically linked.
  Source rpms and bogofilter-static rpm not affected.
* Added GSL-1.4 as requirement for binary rpm.
* Fixed up t.separate reference test and cleaned up
  t.degen, t.split, and t.regtest.
* Man page and French FAQ revised.
* Added '-H' (header-degen) option to aid transition to new
  parsing. See RELEASE.NOTES-0.15 for more info.
* Minor revisions of OS/2 and RISC-OS compatibility code.
* Transaction code added for wordlist maintenance.
* Timestamp code refactored and moved from maint.c to datastore.c
* VERPs (Variable Envelope Return Paths) now have their
  sequence numbers replaced by a '#' for scoring.
* Fixed problem that caused auto-update ("-u") to not update
* End-of-header code revised to ensure that passthrough ("-p")
  properly places the X-Bogosity line.
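The VERP change in this group, replacing sequence numbers by '#', can be sketched with a regular expression. The pattern and the sample addresses are hypothetical; the lexer's real VERP pattern is defined in bogofilter's flex rules and is not reproduced here.

```python
import re

# Hypothetical pattern: a run of digits that follows a hyphen in the
# local part and is itself followed by a hyphen or the '@' sign.
_VERP_SEQ = re.compile(r"(?<=-)\d+(?=[-@])")


def normalize_verp(address: str) -> str:
    """Replace the per-message sequence number with '#'."""
    return _VERP_SEQ.sub("#", address)


# Two made-up bounce addresses that differ only in the sequence number.
first = normalize_verp("list-return-1234-user=example.org@lists.example.com")
second = normalize_verp("list-return-1235-user=example.org@lists.example.com")
print(first)
print(first == second)
```

Without this step every message from the same list would contribute a token seen only once, which adds no information for scoring.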
* GNU GSL 1.4 has replaced DCDFLIB. If GSL 1.4 or newer is
  installed in your system, bogofilter will use that (which will
  usually be a shared library); if GSL is missing or a prior
  version is present, bogofilter will statically link against GSL 1.4
  excerpts from the gsl/ directory.
  While we believe we have been allowed to include DCDFLIB with
  bogofilter, some people had expressed concerns. GSL is subject
  to the GNU General Public License v2, so we are DEFINITELY
  allowed to include it.
* Added support for OS/2's file system.
* Fixed logging behavior when scoring mailboxes, maildirs, etc.
* Fixed processing of rmail files.
* Additional header line tagging as suggested by Michael O'Reilly.
* Minor revision of bogotune.
* Report if database file permissions wrong.
* No longer including pid in syslog error messages.
* No longer ignoring message separators.
* Added BOGOTEST environment variable to enable flex debugging.
* Minor revision of bogominitrain.pl
* Fixed bogoutil problem with '-w' and '-p'.
* Revise parsing pattern for "encoded text" and regression
  test for folded text.
* Use GSL (the GNU Scientific Library) when it's available.
* Fixed maintenance mode (broken during database API rewrite).
* Added regression test for maintenance mode.
* Re-organized test framework to put all scripts in src/tests,
  all input files in src/tests/inputs, and reference outputs
  in src/tests/outputs.
* Revised parsing to ignore additional headers, i.e.
  Resent-Message-ID, In-Reply-To, and References.
* Fix auto-update ('-u') bug that double registers ham and spam.
* Correct QDBM optimization problems arising from API change.
* Header line unfolding now handled by flex rules.
  Special thanks to Michael O'Reilly for his help!
* Initial release of RISC-OS support, including qdbm and tdb.
* QDBM is now supported.
* The data base configuration has changed. --with-tdb is gone,
  use --with-database=db, --with-database=tdb or
  --with-database=qdbm instead.
* Update bogowordfreq to work with bogoreader.
* Fatal flex errors are now caught and bogofilter exits
  gracefully after closing its database(s).
* Limit size of unfolded header lines.
* Check for xmlto during configuration.
* Fix problem in empty line parsing rule.
* Fix string termination problem for bulk mode paths.
* Allow -I to be used with file or directory.
* Revise flex rule for encoded text to reduce program size.
* Revise flex grammar:
  - to reduce size of generated rules
  - to simplify handling of header tags and mime parts
* Clean-up message header processing:
  - Don't tokenize message separator lines.
  - Merge whitespace separated encoded words.
  - Unfold header lines.
* Fix defective printing in 'bogofilter -Q' output.
* Revise mime processing to cure "fatal flex scanner internal
  error--end of buffer missed".
* Restore parsing rule for ending a "loose" html comment.
* Change mime boundary line to operate on raw input,
  i.e. before decoding it.
* Add charset map for windows-1251 to KOI8-R (Cyrillic).
* Fix some printf calls for 64-bit machines (%*s).
* Fix compilation with TDB.
* The -b and -B options now autodetect Maildir/s and iterate over each
  mail in them. When the named input is a file, it is assumed to be a
  single mail unless -s, -S, -n, -N or -M is given - in that case it's
* Revised message reader implementation so the way of specifying the
  input is independent from the input format (such as mbox vs.
* Bugfixed Maildir implementation to read cur/ and new/.
* Implement support for MH directory (such as used by Sylpheed).
* Implemented new message reading protocol for processing
  bulk mode, splitting mailboxes into messages, reading
  messages in maildirs, etc.
* _Really_ fix defective printing in 'bogofilter -Q' output.
* Fix parser errors that can cause:
  1. Incorrect processing of html comments.
  2. "fatal flex scanner internal error--end of buffer missed",
     which kills bogofilter.
* Fix defective printing in 'bogofilter -Q' output.
* Compiles with TDB again.
0.14.5.2 2003-08-20 - Stable Release
* bogominitrain.pl - removed email 'cruft' and revised format
* Fixed parameter type error in dbh_print_names() that causes
* Enhanced verbose output of bogominitrain.pl
* Documented '-T' option in man page.
* Fixed parsing error that treated "^From " in encoded text as
* Revised format for '-T'.
* Fixed defect in bogominitrain.pl's norepetition mode.
* Updated bogominitrain.pl to version 1.3.
* Corrected parsing error (in html code) that caused
  bogofilter to miss message separators.
* Added '-T' as terse mode (with fixed formatting).
* Revised processing of From and empty lines so that parsing
  works correctly with both flex-2.5.4 and flex-2.5.31.
* Revised database API so that there are 3 distinct layers
  (program, datastore, and database) with a clean interface
* Correct exitcodes in bogoutil by using EX_ERROR.
* Fixed token registration bug in 0.14.x versions.
* Fixed seg fault caused by database lock contention.
* Fixed critical locking bug introduced into bogofilter 0.14.0
  with the combined-wordlist code: when working with separate
  wordlists, bogofilter would lock only the first one opened,
* Documentation updates.
* %g formatting is now supported by bogofilter's formatting functions.
* Merged trio 1.10 (http://ctrio.sourceforge.net/) to support
  compilation on ancient systems (Solaris 2.5) that do not have
  [v]snprintf functions.
  Trio is Copyright (C) 1998-2000 Bjorn Reese and Daniel Stenberg.
* Various documentation updates, including the FAQ.
* The test suite was adjusted for older grep variants (Solaris
  2.5) that don't cope with long lines.
* Print database version in print_version().
* Postfix integration instructions have been upgraded.
* Debug output for wordlists and databases was enhanced.
* Replaced use of memcpy() by memmove() in an input routine. The
  overlapping copy might cause data corruption on some systems.
* Fixed "make check" failures for bogoutil introduced with the
  "combined wordlist" feature in 0.14.0. There has been a buffer
  overflow. All users of bogofilter with combined wordlist prior to
  0.14.2 are advised to upgrade.
* Fixed bogus "t.valgrind" test FAILures.
* Fixed uninitialized data in db_get_dbvalue(), for split word lists.
* New file, contrib/vm-bogofilter.el, provides an interface
  between the VM mail reader and bogofilter.
* Revised lexer_v3.l for compatibility with flex-2.5.31
* Break up long line in regression test input for Solaris 2.5
* Fixed check for adding spam_subject_tag to Subject: line.
* Updated French version of FAQ.
* Correct problem with t.degen regression test.
* Updated English version of FAQ.
* Initial release of token degeneration code.
* Revised lexer pattern to better recognize encoded tokens.
* Implemented named exitcodes, with Unsure having its own
  value (2) and changing the value for error from 2 to 3.
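A calling script could map the exit codes as sketched below. Only the values for Unsure (2) and error (3) come from the entry above; the values 0 for spam and 1 for ham are an assumption based on the bogofilter man page, and the names are hypothetical.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SPAM = 0    # assumption: taken from the man page, not from this entry
    HAM = 1     # assumption, as above
    UNSURE = 2  # stated in the entry above
    ERROR = 3   # stated in the entry above (previously 2)


def describe(returncode: int) -> str:
    """Translate a process return code into a readable label."""
    try:
        return ExitCode(returncode).name.lower()
    except ValueError:
        return "unknown"


for code in (2, 3, 99):
    print(code, describe(code))
```

Scripts written for earlier versions that treated 2 as an error would need the same kind of update.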
* Fix problem with encoded text.
* Fix handling of absolute paths.
* Fix defect in base64 decoding that can cause segfaults.
* Bogoutil now complains before exiting when it can't open a
* Updated bogominitrain.pl to work with combined wordlists.
* Updated contrib/bogominitrain.pl; it prints more info and can save
  messages used in training.
* Miscellaneous documentation updates.
* Decode encoded text in header lines.
* Bogofilter and bogoutil detect whether one or two wordlists
  are in BOGOFILTER_DIR and use the appropriate wordlist mode
  (combined or separate).
* Bogofilter's -V output now includes algorithm and database
* Default wordlist mode is single, combined wordlist.
  File wordlist.db contains all spam and ham tokens.
* Added tdb (trivial database) support.
* Initial release of code allowing bogofilter to use a single,
  combined BerkeleyDB database for storing both ham and spam
  tokens. The file is named wordlist.db.
* Added contrib/bogominitrain.pl
* Fix bug in "boundary" parser, which ignored boundaries with spaces.
* Updated doc/integrating-with-qmail
* Accept whitespace in html tags.
* Checks for "Subject:" are now case insensitive.
* Adds "Subject:" header if needed for including
  (Thanks to Pavel Kankovsky for these patches.)
0.13.7.2 2003-07-02 - Stable Release
* Fixed loop in yyinput() caused by unexpected EOF.
* Update bogotune to version 0.3
* Added '-k size' option to bogofilter and bogoutil for
  setting BerkeleyDB's cache size.
* For bogotune change processing of '-t' switch from pass 1 to
  pass 2 so that it supersedes the config file.
* Man pages now use '\ ' when a non-breaking space is needed,
* '-Q' processing no longer requires that spamlist.db be present.
* Replaced tuning/tuning.sh with tuning/bogotune (and related files).
* Minor code rewrites to speed up processing messages, mboxes,
  and msg-count files. In particular, tuning/tuning.sh runs
  are approx 47% faster than before.
* Fixed several errors in tuning/tuning.sh and reformatted
  "Top 10 Results" output.
* Minor changes to bogoutil to support bogotune script.
* Added newlines to correct usage messages.
* Don't allow square brackets in tokens. Do allow dollar
  signs in tokens in msg-count files.
* Bogolexer now ignores first 'From' token to match scoring
  behavior of bogofilter.
* Updated file tuning/README and script tuning/tuning.sh.
* Fix check for "^From " lines to work properly during base64
* End html comment processing when a message header is found.
* Improve README for the tuning scripts and simplify them.
* Allow terminal exclamation points on tokens.
* Updated contrib/mime.get.rfc822
* Fixed bogofilter's non-use of message counts in msg-count files.
* Diagnose invalid values of robx.
* Modified rstats_print_histogram() so it doesn't print 'nan's.
* Modified t.frame to find version of grep on Solaris so
  t.bulkmode can run successfully.
* Modified t.parsing test so it works with OSX's default file system.
* Changed default value of ROBS from 0.001 to 0.01
* Fixed options '-M' (mailbox mode) and '-p' (passthrough
  mode) so they work properly together.
* Minor cleanups in bogofilter.cf.example
* Added db-3.2 and db-3.1 to list for AC_CHECK_DB in configure.ac
* Minor code tweaks to quiet gcc-3.3 warnings.
* Added doc/programmer/README.osx to distribution.
* Corrected FAQ's procmail recipe for training with SpamAssassin.
* Added -V (version) option to bogolexer.
* Tweaked long line check used to prevent scanner buffer overflow.
* In bulkmode, output filenames to stdout.
* Further fixes for static-build system.
* Autoconfiguration of BerkeleyDB library has been improved.
* Build procedure for statically linked binaries has been improved.
* Fixed defect in replace_nonascii_characters that was
  superseding ignore_case option.
* Portability fix for efence usage in t.frame.
* Added static-build to solve glibc version problem.
* Modified "make rpm" to also build statically linked binaries.
  They're packaged in bogofilter-static-x.y.z-1.i586.rpm
* Fixed bogofilter.spec.in to include files CHANGES-0.13 and
  RELEASE.NOTES-0.13 which had been left out.
* tests/t.frame portability fix for non-Linux compatibility.
* Added file RELEASE.NOTES-0.13. Read it!
* Changed parsing defaults to:
  -PI ignore_case (default is disabled)
  -Ph header_line_markup (default is enabled)
  -Pt tokenize_html_tags (default is enabled)
* Recognize a line of whitespace as ending the message header.
* contrib/randomtrain and contrib/scramble can now process
  both mbox and maildir formats.
* Added perl script contrib/mime.get.rfc822 to extract
  forwarded messages from within a message.
* Added basic support for emacs RMAIL mailboxes.
* Removed incomplete RMAIL/Babyl-5 support.
* Registration code modified to count unique tokens for each
  message and display the total of the counts.
* Added 'bogo-what?' to FAQ.
* Modified bulk mode code to allow registering maildirs.
* Added options to return tokens from inside HTML tags.
  Switch '-Ht' and option "tokenize_html_tags" turn it on.
* Bogofilter's '-e' switch can now be used without '-p'.
* Added doc/integrating-with-postfix.
* Added bogofilter-faq-fr.html, a French translation of the FAQ.
* Revise description of verbose output in FAQ.
* Update man page documentation of bogofilter's switches.
* Added basic memory accounting and debug capability.
* Fixed memory leak in rstats.c
* Fixed defect in handling of folded spam header lines.
* Modified parsing so that yyredo() and yy_use_redo_text()
  are no longer needed.
* Corrected bulkmode problem processing messages without
* Corrected alignment of wordprop_t which caused bus error
* Added directory to 'Error creating directory' message.
* Corrected bad BOGOFILTER_DIR value in t.bulkmode
* Subdirectories contrib and tuning now install from rpm
  to /usr/share/bogofilter.
* Corrected some errors in rpm specfile.
* Added 'tuning' directory with scripts for tuning bogofilter.
  (cf. bogofilter-tuning.HOWTO)
* Added '-M' to allow classification of multiple messages in
  mbox formatted files.
* New option '-Q' (query/display config) replaces '-qv'.
* Grouped options into logical groups for help message and
  man page. Revised option descriptions.
* Added bogofilter-tuning.HOWTO as replacement for README.Robinson
* Added classification support for msg-count formatted files.
* Add bogolex.sh for creating msg-count formatted email file.
* Added bulk mode processing for Maildirs.
  '-b' reads filenames from stdin.
  '-B' gets filenames from the command line.
* Miscellaneous refactoring in main.c
0.11.2 - stable version
* Added 'terse' option to bogofilter.cf for selecting format of
* Use frexp() to retain maximum precision of floating point results.
* Reformat histogram output (from "-vv") to fit in 80 columns.
* Added sample configuration for maildrop.
* Added protections against negative token counts to
  bogoutil.c and database_db.c
* Additional portability changes made to the regression tests.
* Enhanced '-m' option allows specifying robs value.
* Include 'strict_check' in '-qv' output.
* Correct outdated acinclude.m4, as it causes the configure
  script to be invalid.
* Revised UPGRADE document.
* Added contrib/bogotrain.sh
* Change bogoutil's '-p' option to require a database.
* Fix OS X segfault caused by using DB handle after closing database.
* Improve bogoutil's reporting of a bad directory or filename.
* Simplify configure check for BerkeleyDB.
* Extend configure's compiler checks for AIX.
* Changed default value of 'strict_check' to 'no' (disabled).
* Added config file option 'strict_check' for processing
  html comments. Enabled means to use "<!--" and "-->" to
  delimit comments. Disabled uses "<!" and ">".
* Bogofilter now frees _all_ memory that it allocates.
* FAQ reorganized and info added on Asian spam, the format of
  verbose output, and using SpamAssassin to train bogofilter.
* Fixed processing of '-o' option.
* Cleaned up help messages and added version info.
* Expanded bogofilter-faq.html
* Fixed precedence for directory specifications.
* Fixed processing of folded X-Bogosity line.
* Fixed processing of spam_subject_tag.
0.11.1.3 - stable release
* Expanded regression tests.
* Cleaned up fprintf() arguments.
* Cleaned up message and mime header checks.
* Additional improvements to maintenance code.
* Fixed bogoutil's broken maintenance mode.
* Update bogofilter documentation and FAQ.
* Explicitly check linking against libdb early to avoid unspecific
  error messages such as "cannot determine size of unsigned short".
* Retry locking without mmap() on systems that return the old-fashioned
  EACCES rather than EAGAIN for locking failures such as AIX 4.3.3.
* Fix potential division by zero in histogram generator, it caused
  program abort after not handling floating point exceptions on some
  architectures such as Alpha. The division by zero is now avoided.
* Fixed flaw that caused user config file to be ignored.
* Fixed broken '-u' (update) code.
* Updated documentation of bogolexer and bogoutil.
* Using standard html comment delimiters when ignoring comments.
* Fixed charset initialization flaw.
* The Robinson-Fisher algorithm is now the default algorithm.
* The configuration file parser is stricter and more correct.
* Separated message registration options from unregistration
  options. '-S' and '-N' have been changed and now just do
  unregistration. To move a message from one wordlist to the
  other, use '-S -n' or '-N -s' (as appropriate)
* Bogofilter's -p (passthrough) mode will no longer read the entire
  mail into memory if the standard input is a seekable regular file.
* Bogofilter's '-l' option was changed and no longer allows an
  argument. Use the new '-L yourtag' option to provide a tag
1341
* Database access efficiency changes.
1342
* Improvements in html comment handling code.
1343
* Internal cleanup of storage used in parsing messages and
1344
working with databases.
1345
* Manual pages now contain the proper path to bogofilter.cf.

* Updated bogofilter and bogoutil man pages.
* Give command line options preference over config file options.

* Database access efficiency changes.
* Database properly closed at end of maintenance pass.
* Improvements in html comment handling code.
* Support option '-?'

* Stable release of 0.10.1.5

* A variety of robustness and portability changes, code and
file cleanups and documentation updates.
* Multiple fixes for mime and html processing.
* Additional support and fixes for the various spam scoring
** See file CHANGES-0.10 for details of the above items.

* Added mime processing capability, with decoding of base64,
quoted-printable and uuencoded sections. Ignores attachments when
computing spamicity.
* Added wordlist maintenance capability to bogoutil. Can ignore
tokens based on count, age, or length. Can replace non-ASCII chars
with question marks.
* Added dates to wordlist tokens. Option "datestamp_tokens=true|false"
can be used to enable/disable them.
* Moved most documentation files to doc directory.
* Added sample procmail file, contrib/procmailrc.example
* Spamicity score now computable from multiple word list pairs, i.e. all
spam and ham word lists in directories named on command line or in
config file (via "wordlist=" or "bogofilter_dir=" lines).
* Lexer is now case insensitive
* Increase MAXTOKENLEN from 20 to 30, allowing more and longer tokens
* New options for setting of default charset and replacing of non-ASCII
characters. New character set handling routines to provide charset
specific token parsing.
* New error handling routine will output error messages to stderr and,
if '-l' (logging) is enabled, to syslog.
* New message formatting capability allows formats to be put in config
file for X-Bogosity line and logging messages. Message content can
include status, spamicity, version, etc.
* Long-standing locking bugs that caused corruption in the database
* Work around ash-0.2 and bash-1 bugs, needed for make check.
* Cater for malloc/calloc implementations that return NULL when 0 bytes
of memory are requested (e.g. some AIX versions). Bogofilter would
previously falsely claim an "out of memory" condition.
(also available as patch for 0.9.1.2)
* Reorder gcc __attribute__ lines for gcc-2.7 compatibility.
(also available as patch for 0.9.1.2)

* A defect in the collect_words routines (in 0.9.1) caused
incorrect generation of "must get only one message to
calculate spamicity!" messages. This has been fixed.
* A defect in the contrib/bogopass script caused the base64-decoded
version of the mail to be printed rather than the original with just
the header added. This has been fixed.
* Documentation has been revised and updated.
* Robinson-Fisher method now produces a tristate status, i.e.
spam/ham/unsure, if ham_cutoff is non-zero. ham_cutoff defaults to
0.1 and can be set via config file.
* Script contrib/bogopass has revised error and environment checking.
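
The tristate status described above can be sketched as follows. This is
a minimal illustration, not bogofilter's code. Only the ham_cutoff
default of 0.1 comes from the entry. The function name classify, the
spam_cutoff parameter with its value of 0.95, and the inclusive
comparisons are assumptions made for the example.

```python
def classify(spamicity, ham_cutoff=0.1, spam_cutoff=0.95):
    """Return "spam", "ham" or "unsure" for a spamicity score in [0, 1].

    spam_cutoff=0.95 is an assumed value for illustration only.
    """
    if spamicity >= spam_cutoff:
        return "spam"
    if ham_cutoff == 0:
        # Tristate disabled: everything that is not spam counts as ham.
        return "ham"
    return "ham" if spamicity <= ham_cutoff else "unsure"
```

Setting ham_cutoff to zero collapses the result back to the two-state
spam/ham answer.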

* New script contrib/bogopass allows processing of base64
attachments. This is a temporary solution until base64
code can be built into bogofilter.
* New file README.Robinson describes the tunable parameters
for the Robinson algorithm and what to do for best performance.
* Changed the default behavior to use the Robinson algorithm.
* Corrected incorrect sort order when printing statistics.
* Added support for Fisher's method of combining probabilities,
as optimized for this purpose by Rob W. W. Hooft, to the
"Robinson" algorithm.
* The new file METHODS describes the Graham, Robinson, and Fisher
methods that bogofilter supports for computing spamicity.
* New file README.dcdflib gives some info on the dcdflib free
library of routines for cumulative distribution functions.
* A new '-f' option tells bogofilter to use Fisher's method.
* A new '-c' option in bogofilter allows specification of the
configuration file to read.
* A new '-C' option tells bogofilter not to read any config file.
* The syslog facility in '-l' mode has changed from "daemon" to "mail",
so your logs may now be in /var/log/maillog or /var/log/mail rather
than /var/log/messages. Check your /etc/syslog.conf.
* The testing framework now works on Solaris.
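
For readers unfamiliar with Fisher's method of combining probabilities,
the sketch below shows the commonly published chi-square formulation.
It is an illustration only and is not taken from bogofilter, which per
the entry above relies on dcdflib and on Rob W. W. Hooft's optimization.
The function names are hypothetical, and every per-word probability is
assumed to lie strictly between 0 and 1.

```python
import math


def chi2_survival(x, dof):
    """P(X >= x) for a chi-square variable with an even number of
    degrees of freedom, using the closed-form series."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = math.exp(-half)
    total = term
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return min(total, 1.0)


def fisher_indicator(word_probs):
    """Combine per-word spam probabilities into one score in [0, 1]."""
    if not word_probs:
        return 0.5  # no evidence either way
    n = len(word_probs)
    spam_stat = -2.0 * sum(math.log(p) for p in word_probs)
    ham_stat = -2.0 * sum(math.log(1.0 - p) for p in word_probs)
    spam_evidence = chi2_survival(spam_stat, 2 * n)  # near 1 if probs are high
    ham_evidence = chi2_survival(ham_stat, 2 * n)    # near 1 if probs are low
    return (1.0 + spam_evidence - ham_evidence) / 2.0
```

A message whose words are all neutral (probability 0.5) scores exactly
0.5. Consistently high or low word probabilities push the score toward
1 or 0.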
* Fixed several portability problems uncovered by the new
* Added three more regression tests designed to confirm that
bogofilter's results match saved reference results.
* Implemented an object oriented API for using computational methods.
* Split the main module into a registration module and three algorithm
modules - for the fisher, graham and robinson methods.
* Registering big mbox files is much faster now, at the expense of some

* The lexer code now detects read errors (and exits with code 2
* Fixed passthrough mode in bogofilter: it no longer strips the
spam-header from a mail body.
* Fixed portability to some systems, notably, Solaris
and HP-UX, added README. for some systems to describe build
* Fixed "rpl_malloc" link failures.
* Fixed bogofilter 0.7.6 passthrough regression on some
systems: The X-Bogofilter header would be added to the body
and a bogus blank line would be added.
* Bogofilter now supports a configuration file named
/etc/bogofilter.cf and/or ~/.bogofilter.cf.
* Bogofilter's use of '-v' for printing spamicity statistics
has been organized with increasing levels of details as
additional '-v's are added.
* When using the Robinson algorithm, bogofilter can print a
simple histogram showing word probability distribution.
* Bogoutil supports a new '-w' switch for displaying tokens
from the word list databases.
* Bogolexer added to distribution. Provides easy access to
parsing a file to examine the tokens.
* Bogolexer has a new '-p' (passthrough) for printing tokens and
bogoutil has a new '-p' (probability) for printing the
probabilities of one or more tokens. They can be connected
via pipe to display the probabilities of all words in a message.
* DB 4.1 support has been fixed.
* Documentation updates.

* Added README.hp-ux for those using HP-UX.
* Added support for additional architectures - ia64, arm,
* Bogofilter -p mode now preserves CR and NUL characters.
* Bogofilter -p mode now detects if the computer runs out of
* Bogofilter supports a new "-l" switch to write run-time log
information to syslog.
* Bogofilter supports a new algorithm to calculate the
"spamicity", the "Robinson" algorithm. It is enabled with the
new "-r" switch. The old behavior is called the "Graham"
algorithm and can be enforced with the new "-g" switch. The
default behavior is to use the "Graham" algorithm.
* Bogofilter now has an "-R FILE" option (that implies -r) to print
an R data frame to FILE.
* Bogofilter and bogoutil now have a "-x CLASSES" option to turn
* Bogoupgrade.pl has been renamed to bogoupgrade.
* There is now a man page for bogoupgrade.
* BASE64 treatment has been fixed. It ignored whole lines if
they consisted of a single token. Now a token is only
considered base64 and ignored if it's >= 32 characters or ends
in one or two padding "=" signs.
* MIME boundary lines are now emitted as tokens. Some of them
are typical of certain spam software, so they might turn out
* All control characters are now considered token delimiters.
* Bogofilter now aborts if it cannot figure out where to look for its
* The software no longer crashes on machines that do not allow
for unaligned memory access (m68k; many RISC, e.g. SPARC).
* DB 4.1 is now supported.
* Documentation updates.
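
The base64 rule stated above (a token is ignored if it is at least 32
characters long or ends in one or two "=" padding signs) can be
sketched as follows. The function names are hypothetical and this is
not the lexer's actual code. In particular, whether the real lexer also
checks the base64 alphabet is not stated in the entry and is not
modelled here.

```python
def looks_like_base64(token):
    """Apply the rule from the NEWS entry: long tokens, or tokens that
    end in exactly one or two '=' padding characters."""
    padding = len(token) - len(token.rstrip("="))
    return len(token) >= 32 or padding in (1, 2)


def drop_base64_tokens(tokens):
    """Keep only the tokens that the rule does not treat as base64."""
    return [t for t in tokens if not looks_like_base64(t)]
```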

0.7.5 Sun Oct 20 17:34:35 PDT 2002

* The header in bogofilter -p mode now defaults to X-Bogosity, but
can be changed by using "./configure --enable-spam-header=name" at
* The option names -h/-H are back to -n/-N like they were in version
0.6, and -h now means "help".
* A utility has been added to help upgrade wordlists from older
versions of bogofilter to the current format. See the UPGRADE file
for more information.
* Support has been added for the environment variable BOGOFILTER_DIR
to control where bogofilter looks for its wordlists.
* Now bogofilter no longer depends on the Judy package. We now use a
high performance hashing algorithm for message evaluation. The Judy
package is no longer required to compile or run bogofilter.
* Support for the -e flag, which will cause bogofilter to exit with a
value of 0 regardless of the spamicity of the message. This is useful
* Support for -u flag. This allows message evaluation and training to
happen in the same invocation of bogofilter.
* Extended TOKEN patterns to improve support for European languages.
* Improved wordlist locking to prevent data corruption.
* Added procmail recipes for example usage in the man page.

0.7.4 Tue Sep 17 02:29:48 EDT 2002

* Added infrastructure to support multiple wordlists
* Fixed classification bug
* Fixed errors in documentation
* Improved portability of locking code
* Fixed 'last line occasionally emitted twice' bug
* Cleaned up underflow checking for word counts in bogofilter.c
* Code readability improvements
* Split main() function in bogofilter.c into smaller pieces
* Message processing performance improvements

0.7.3 Thu Sep 12 13:28:37 PDT 2002

* Added portable file locking support for files and databases
* Bug fix for negative counts in word registration
* Bug fix for SEGV in $HOME path code
* Bug fix for trailing slash in -d option

0.7.2: Wed Sep 11 15:28:00 PDT 2002

* Introduced GNU configure for portability code

0.7.1: Tue Sep 10 00:59:00 PDT 2002

* Skip existing X-Spam-Header
* Performance improvement for -p mode
* Bug fix in getopt argument

0.7: Sat Sep 7 14:18:33 EDT 2002

* Check your scripts! Option names have changed.
* Name changes: goodlist -> hamlist, badlist -> spamlist.
This is a step towards supporting more categories.
* Autodaemon is gone. Instead, the new implementation uses
DBM. Optimization with mmap will be in a future release.
* Speed-tuning of the bogofilter function.
* We're back to not ignoring HTML comments.

0.6: Fri Aug 30 00:25:49 EDT 2002

* Fixed a fluky bug in the socket-transmission logic
* Fixed an edge case where a single message with a From line
was getting counted twice.
* Unknown-word probability bumped from 0.2 to 0.4, tracking a
change by Paul Graham.
* Documented -d option.

0.5: Thu Aug 29 13:38:12 EDT 2002

* Passthrough option can be used to add an X-Spam-Status header.
* There is now a per-message word frequency cap, so spammers can't do
an equivalent of Google fodder.
* HTML comments are now ignored.
* HTML 4.0 keywords and attributes are now ignored.
* Improved extrema calculation.
* Mutt patch withdrawn -- have a better version of mutt macros instead.
* -S and -N options from matt@lickey.com (Matt Armstrong).
* Client-server partitioning with a persistent server, drastically
reducing startup cost after the first run.
* Minor bug fix by Eric Seppanen.
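
The per-message word frequency cap mentioned above can be illustrated
with a short sketch. The function name capped_counts and the cap value
of 4 are assumptions made for the example. The entry does not state the
actual limit or how bogofilter applies it.

```python
from collections import Counter


def capped_counts(tokens, cap=4):
    """Count tokens in one message, limiting each word's count to `cap`.

    cap=4 is an arbitrary illustrative value, not bogofilter's setting.
    """
    counts = Counter(tokens)
    return {word: min(n, cap) for word, n in counts.items()}
```

Repeating a word hundreds of times in one message then contributes no
more than `cap` occurrences to the wordlist.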

0.4: Sat Aug 24 09:07:45 EDT 2002

* regenerated bogofilter mutt patch.
* wordlist files are now automatically created in -s and -n modes.
* Reversed the exit values, following a suggestion by Michael Elkins
about how to make bogofilter fail gracefully.
* -Wall cleanup and uninitialized-variable fix from Eric Seppanen.
* fcntl(2) file locking to head off a race condition in write_list.
* Added the long-sought procmail recipe.
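
As background for the fcntl(2) locking entry above, the sketch below
shows how an exclusive lock serializes writers to a shared file. It
uses Python's fcntl.lockf wrapper on Unix, not bogofilter's C code, and
the helper names append_locked and try_lock are hypothetical. The
non-blocking variant treats both EACCES and EAGAIN as "already locked",
since POSIX permits either error for a conflicting lock.

```python
import errno
import fcntl


def append_locked(path, line):
    """Append a line while holding an exclusive lock on the file."""
    with open(path, "a") as handle:
        fcntl.lockf(handle, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            handle.write(line)
            handle.flush()
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)


def try_lock(handle):
    """Try to take an exclusive lock without blocking.

    Returns False if another process holds a conflicting lock.
    """
    try:
        fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise
```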

0.3: Fri 23 Aug 03:30:49 EDT 2002

* Specfile/Makefile improvements from Graham Reed.
* Case blindness fix from Eric Seppanen.
* Deallocation fix from Mike Mayfield.
* Wordlist file format changed.

0.2: Tue Aug 20 06:49:42 EDT 2002

* Added mutt-1.4 interface patch
* Note: Location of the base directory has changed.

0.1: Mon 19 Aug 2002 03:07:31

$Id: NEWS,v 1.265 2004/06/27 00:45:35 relson Exp $

vim:tw=79 com=bf\:* ts=8 sts=8 sw=8 ai:

LocalWords: BOGOFILTER Exp procmail contrib Spamicity config spamicity
LocalWords: bogoutil datestamp MAXTOKENLEN charset stderr syslog gcc malloc
LocalWords: calloc Bogosity