.TH MU-INDEX 1 "May 2011" "User Manuals"
mu index \- index e-mail messages stored in Maildirs
\fBmu index\fR is the \fBmu\fR command for scanning the contents of Maildir
directories and storing the results in a Xapian database. The data can then be
understands Maildirs as defined by Daniel Bernstein for qmail(7). In addition,
it understands recursive Maildirs (Maildirs within Maildirs) and Maildir++. It
can also deal with the VFAT-based Maildirs used by \fITinymail\fR/\fIModest\fR
and some other e-mail programs, which use '!' instead of ':' as the separator.
E-mail messages which are not stored in something resembling a maildir
leaf-directory (\fIcur\fR and \fInew\fR) are ignored, as are the cache
directories for \fInotmuch\fR and \fIgnus\fR.

Symlinks are not followed.

If there is a file called \fI.noindex\fR in a directory, the contents of that
directory and all of its subdirectories will be ignored. This can be useful to
exclude certain directories from the indexing process, for example directories

The first run of \fBmu index\fR may take a few minutes if you have a lot of
mail (tens of thousands of messages). Fortunately, such a full scan needs to be
done only once; after that it suffices to index the changes, which goes much
faster. See the 'Note on performance' below for more information.

The optional 'phase two' of the indexing process is the removal of messages
from the database for which there is no longer a corresponding file in the
Maildir. If you do not want this, you can use \fB\-n\fR, \fB\-\-nocleanup\fR.

When \fBmu index\fR catches one of the signals \fBSIGINT\fR, \fBSIGHUP\fR or
\fBSIGTERM\fR (e.g., when you press Ctrl-C during the indexing process), it
tries to shut down gracefully: it tries to save and commit data, close the
database, and so on. If it receives another signal (e.g., when Ctrl-C is
pressed once more), \fBmu index\fR will terminate immediately.
Note, some of the general options are described in the \fBmu(1)\fR man-page
and not here, as they apply to multiple mu commands.
\fB\-m\fR, \fB\-\-maildir\fR=\fI<maildir>\fR
starts searching at \fI<maildir>\fR. By default, \fBmu\fR uses whatever the
\fBMAILDIR\fR environment variable is set to; if it is not set, it tries
\fI~/Maildir\fR. See the note on mixing sub-maildirs below.
re-index all mails, even ones that are already in the database.
disables the database cleanup that \fBmu\fR does by default after indexing.
clear all messages from the database before
indexing. This is effectively the same as removing the database. The
difference with \fB\-\-reindex\fR is that \fB\-\-rebuild\fR guarantees that
after the indexing has finished, there are no 'old' messages in the database
anymore, which is not true with \fB\-\-reindex\fR when indexing only a part of
the messages (using \fB\-\-maildir\fR). For this reason, it is necessary to run
\fBmu index \-\-rebuild\fR when there is an upgrade in the database
format. \fBmu index\fR will issue a warning about this.
automatically use \fB\-y\fR, \fB\-\-rebuild\fR
when \fBmu\fR notices that the database version is not up-to-date. This option
is for use in cron scripts and the like, so they won't require any user
interaction, even when mu introduces a new database version.
\fB\-\-xbatchsize\fR=\fI<batch size>\fR
set the maximum number of messages to process in a single Xapian
transaction. In practice, this option is only useful if you find that \fBmu\fR
is running out of memory while indexing; in that case, you can set the batch
size to (for example) 1000, which will reduce memory consumption, but also
substantially reduce the indexing performance.
\fB\-\-max-msg-size\fR=\fI<max msg size>\fR
set the maximum size (in bytes) for messages. The default maximum (currently
50 MB) should be enough in most cases, but if you encounter warnings from
\fBmu\fR about ignoring messages because they are too big, you may want to
increase this. Note that the reason for having a maximum size is that big
messages require big memory allocations, which may lead to problems.
It is not recommended to mix maildirs and sub-maildirs within the hierarchy
in the same database; for example, it's better not to index both with
\fB\-\-maildir\fR=~/MyMaildir and \fB\-\-maildir\fR=~/MyMaildir/foo, as this
may lead to unexpected results when searching with the 'maildir:' search
parameter (see below).
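
A script that indexes several locations could guard against this kind of
overlap before calling \fBmu index\fR. The following is a minimal sketch in
POSIX shell; the helper name is_nested_maildir is hypothetical and not part of
\fBmu\fR, and it only compares path strings (it does not resolve symlinks or
relative paths).

.nf
```shell
# Hypothetical helper (not part of mu): succeeds when $2 lies strictly
# below $1, judged purely by path prefix.
is_nested_maildir() {
    parent=${1%/}   # drop one trailing slash, if any
    child=${2%/}
    case "$child" in
        "$parent"/*) return 0 ;;
        *)           return 1 ;;
    esac
}

if is_nested_maildir "$HOME/MyMaildir" "$HOME/MyMaildir/foo"; then
    echo "warning: second maildir is nested inside the first"
fi
```
.fi

The check is deliberately strict about the path separator, so a sibling such
as ~/MyMaildirOld is not reported as nested.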
.SS A note on performance
As a non-scientific benchmark, consider a simple test on the author's machine
(a Thinkpad X61s laptop using Linux 2.6.35 and an ext3 file system) with no
existing database, and a maildir with 27273 messages:
.nf
$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
66,65s user 6,05s system 27% cpu 4:24,20 total
.fi

(about 103 messages per second)

A second run, which is the more typical use case when there is a database
already, goes much faster:

.nf
$ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
$ time mu index --quiet
0,48s user 0,76s system 10% cpu 11,796 total
.fi

(more than 2300 messages per second)

Note that each of these tests flushes the caches first; a more common use case
might be to run \fBmu index\fR when new mail has arrived; the cache may stay
quite 'warm' in that case:

.nf
$ time mu index --quiet
0,33s user 0,40s system 80% cpu 0,905 total
.fi

which is more than 30000 messages per second.
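
The rates quoted above follow from dividing the message count (27273) by the
wall-clock 'total' time of each run. A small sketch that reproduces the
arithmetic with awk; the helper name msgs_per_sec is made up for this example,
and the decimal commas of the timings are written as decimal points.

.nf
```shell
# Hypothetical helper: messages per second, rounded to a whole number.
# $1 = number of messages, $2 = wall-clock time in seconds.
msgs_per_sec() {
    awk -v n="$1" -v t="$2" 'BEGIN { print int(n / t + 0.5) }'
}

msgs_per_sec 27273 264.20   # first run, 4:24,20 total
msgs_per_sec 27273 11.796   # second run, caches flushed
msgs_per_sec 27273 0.905    # third run, warm cache
```
.fi

This prints 103, 2312 and 30136, matching the rounded figures given in the
text.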

In general, \fBmu\fR has been getting faster with each release, even with
relatively expensive new features such as text-normalization (for
case-insensitive/accent-insensitive matching). The profiles are dominated by
operations in the Xapian database now.
By default, \fBmu index\fR stores its message database in \fI~/.mu/xapian\fR;
the database has an embedded version number, and \fBmu\fR will automatically
update it when it notices a different version. This allows for automatic
updating of \fBmu\fR-versions, without the need to clear out any old
However, note that versions of \fBmu\fR before 0.7 used a different scheme,
which put the database in \fI~/.mu/xapian\-<version>\fR. These older databases
can safely be deleted. Starting from version 0.7, this manual cleanup should
\fBmu\fR stores logs of its operations and queries in \fI<muhome>/mu.log\fR
(by default, this is \fI~/.mu/mu.log\fR). Upon startup, \fBmu\fR checks the
size of this log file. If it exceeds 1 MB, the file is moved to
\fI~/.mu/mu.log.old\fR, overwriting any existing file of that name, and
\fBmu\fR starts with an empty log file. This scheme allows for continued use
of \fBmu\fR without the need for any manual maintenance of log files.
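
The rotation scheme described above can be sketched in POSIX shell as follows.
This only illustrates the described behaviour and is not the code \fBmu\fR
itself runs; the helper name rotate_log is hypothetical, and treating 1 MB as
1000000 bytes is an assumption.

.nf
```shell
# Sketch only (not mu's code): rotate a log file once it exceeds a limit.
# $1 = log file, $2 = size limit in bytes (assumed default: 1000000).
rotate_log() {
    log=$1
    limit=${2:-1000000}
    [ -f "$log" ] || return 0            # nothing to rotate yet
    size=$(( $(wc -c < "$log") ))        # arithmetic strips any padding
    if [ "$size" -gt "$limit" ]; then
        mv -f "$log" "$log.old"          # overwrites any previous .old file
        : > "$log"                       # start again with an empty log
    fi
}

# Example call on the default log location:
#   rotate_log "$HOME/.mu/mu.log"
```
.fi

Because only a single old file is kept, the disk space used by logs stays
bounded at roughly twice the limit.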
\fBmu index\fR uses \fBMAILDIR\fR to find the user's Maildir if it has not
been specified explicitly with \fB\-\-maildir\fR=\fI<maildir>\fR. If
\fBMAILDIR\fR is not set, \fBmu index\fR will try \fI~/Maildir\fR.
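
The lookup order described above maps directly onto shell parameter expansion,
which can be handy in wrapper scripts. A minimal sketch; resolve_maildir is a
hypothetical helper name, and the expansion falls back to ~/Maildir only when
MAILDIR is unset (an empty value is kept), which is one reading of 'not set'.

.nf
```shell
# Hypothetical helper: explicit argument first, then $MAILDIR,
# then ~/Maildir.
resolve_maildir() {
    echo "${1:-${MAILDIR-$HOME/Maildir}}"
}

# Index whatever the lookup yields (requires mu to be installed):
#   mu index --maildir="$(resolve_maildir)"
```
.fi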
\fBmu index\fR returns 0 upon successful completion; any other number
signals an error, for example:
.nf
| code | meaning                        |
|------+--------------------------------|
|    1 | general error                  |
|    3 | could not obtain db write lock |
|    4 | database is corrupted          |
|------+--------------------------------|
.fi
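
In scripts, the exit status can be turned into a readable message. The sketch
below covers only the codes listed above; the helper name describe_mu_status
is hypothetical, and any other value is reported generically rather than
guessed at.

.nf
```shell
# Hypothetical helper: translate the exit codes documented above.
describe_mu_status() {
    case "$1" in
        0) echo "success" ;;
        1) echo "general error" ;;
        3) echo "could not obtain db write lock" ;;
        4) echo "database is corrupted" ;;
        *) echo "other error (code $1)" ;;
    esac
}

# Typical use after an indexing run (requires mu to be installed):
#   mu index --quiet
#   describe_mu_status $?
```
.fi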
Please report bugs if you find them:
.BR http://code.google.com/p/mu0/issues/list
Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>