For good performance with the BDB backend, a well-tuned DB_CONFIG file in the
database directory (usually /var/lib/ldap) is crucial. The following two
articles should help you determine a good configuration for your
requirements. A standard DB_CONFIG is installed, but it may not be adequate
for your system.

The current version of OpenLDAP supports putting DB_CONFIG parameters into
slapd.conf instead by prefixing those options with dbconfig.  See the
slapd-bdb(5) man page for more information.  If there is no DB_CONFIG file
when slapd starts and there are dbconfig lines in slapd.conf, slapd will
write out a DB_CONFIG file with those settings before initializing the
database.
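
For example, such dbconfig lines in the database section of slapd.conf might
look like the following sketch (the values are placeholders, not
recommendations):

    dbconfig set_cachesize 0 2097152 0
    dbconfig set_lg_bsize 2097152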

With the current version of OpenLDAP, any changes to DB_CONFIG will take
effect automatically after restarting slapd.  Running db_recover is no
longer required.

 -- Torsten Landschoff <torsten@debian.org>  Sun, 29 May 2005 18:08:10 +0200
    Russ Allbery <rra@debian.org>  Fri, 01 Jun 2007 23:57:33 -0700

How do I configure the BDB backend?
-----------------------------------
(Taken from http://www.openldap.org/faq/data/cache/893.html, author unknown)

The BDB backend ("back-bdb") uses many special features of Sleepycat's
Berkeley DB library, and a number of details must be set correctly to get the
best results from it. Even though the LDBM backend ("back-ldbm") can use the
BerkeleyDB library, the BDB and LDBM backends have some very important
differences, as already noted in (Xref) What are the different backends? What
are their differences?.

Because back-bdb is transaction-based and uses write-ahead logging to ensure
data consistency, it has much heavier I/O demands than back-ldbm. Also, the
transaction log files accumulate as data is written to the directory, and these
log files must be cleaned out periodically. Otherwise the log files will
consume enormous amounts of disk space. The cleanup procedures are described in
(Xref) How to maintain Berkeley DB (logs etc.) ?.
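
As an illustration only (see the Xref above and your Berkeley DB version's
documentation for the authoritative procedure), logs that are no longer needed
can usually be removed with db_archive, or deleted automatically by BDB
itself:

    # remove transaction logs that are no longer needed
    db_archive -d -h /var/lib/ldap

    # or, in DB_CONFIG, let BDB remove old logs on its own
    # (the flag name varies between BDB releases)
    set_flags DB_LOG_AUTOREMOVE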

The information needed to fully understand and properly configure back-bdb is
split between the slapd-bdb(5) manual page and the Sleepycat BerkeleyDB
documentation (http://www.sleepycat.com/docs/).

You should read the entire slapd-bdb(5) manpage before proceeding. The only
mandatory keyword is "directory", which sets the location of the database
files. The other keywords control tradeoffs between data reliability,
performance, and memory use. To ensure that committed transactions actually get
flushed to disk, you should use the "checkpoint" keyword; otherwise your data
is vulnerable to loss due to system failures. See the Sleepycat documentation
for more information about checkpoints. (In fact, you should read all of
chapter 9, "Berkeley DB Transactional Data Store Applications", in the
Sleepycat reference manual. At the very least, read sections 1-3 and 13-24.)
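
For example, a common sketch in the database section of slapd.conf (the
numbers are illustrative, not a recommendation):

    # checkpoint after every 512KB written or every 30 minutes,
    # whichever comes first
    checkpoint 512 30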

The "dbnosync" keyword is provided for compatibility with back-ldbm; the
preferred method of setting this is to use the BDB DB_CONFIG file option
set_flags DB_TXN_NOSYNC. The "lockdetect" keyword is also deprecated, you
should instead use the BDB DB_CONFIG file set_lk_detect keyword. (It's safe to
leave this at the default setting.)
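
In DB_CONFIG those settings look roughly like this (only use DB_TXN_NOSYNC if
you can afford to lose the transactions written since the last checkpoint
after a crash):

    # trade durability for write speed
    set_flags DB_TXN_NOSYNC

    # deadlock detection policy; the default is normally fine
    set_lk_detect DB_LOCK_DEFAULT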

A number of important items must be configured in the BDB DB_CONFIG file and
not in slapd.conf. You should, at least, read about the following items; a
sample DB_CONFIG combining them appears after this list:

set_cachesize
    The BDB library maintains its own cache separate from the back-bdb entry
    cache. You must set this cache to a size appropriate for your database and
    physical memory size. Note that this is a persistent setting - after you
    set it the first time, further changes will be ignored until you recreate
    the environment using db_recover. 
set_lg_dir
    Set the directory for storing transaction logs. For best performance,
    the transaction logs must be located on a different physical disk from
    the database files. 
set_lg_bsize
    Set the buffer size for the transaction log. Larger is better, but it
    doesn't have much effect unless you're also using the DB_TXN_NOSYNC
    option. With a default log file size of 10MB I usually set this to 2MB.
    The default is only 32K, which is too small for back-bdb. 
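
Putting these three together, a sketch of a DB_CONFIG for a small directory
might look like this (the sizes and the log directory are placeholders; tune
them for your data and hardware):

    # a single 64MB cache (gbytes, bytes, ncaches)
    set_cachesize 0 67108864 1
    # keep transaction logs on a different physical disk (example path)
    set_lg_dir /var/log/openldap-bdb
    # 2MB transaction log buffer, per the advice above
    set_lg_bsize 2097152

Remember that set_cachesize only takes effect when the environment is
(re)created; after changing it, run "db_recover -h /var/lib/ldap" with slapd
stopped so that the new size is actually used.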

On a very busy system you might see error messages talking about running out of
locks, lockers, or lock objects. Usually the default values are plenty, and in
older versions of the BDB library the errors were more likely due to library
bugs than actual system load. However, it is possible that you have actually
run out of lock resources due to heavy system usage. If this happens, you
should read about the set_lk_max_lockers, set_lk_max_locks, and
set_lk_max_objects keywords. 
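
If "db_stat -c" confirms that the lock limits really are being hit, the maxima
can be raised in DB_CONFIG, for example (the values are illustrative):

    set_lk_max_locks 4000
    set_lk_max_lockers 4000
    set_lk_max_objects 4000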

How do I determine the proper BDB/HDB database cache size?
----------------------------------------------------------
(Taken from http://www.openldap.org/faq/data/cache/1075.html, written by 
hyc@openldap.org, Kurt@OpenLDAP.org)

Not having a proper database cache size will cause performance issues. (Note:
in older versions of Berkeley DB, an improper database cache size could also
cause the server to hang.)

These issues are not an indication of corruption occurring in the database; it
is merely the cache thrashing that causes performance and response times to
degrade. If you take the time to read and understand the Berkeley DB
documentation, measure the library performance using db_stat, and tune your
settings, you will avoid these problems.

It is not absolutely necessary to configure a BerkeleyDB cache equal in size to
your entire database. All that you need is a cache that's large enough for your
"working set." That means, large enough to hold all of the most frequently
accessed data, plus a few less-frequently accessed items.

You should really read the BDB documentation referenced above, but let me spell
out what that really means here, in detail. The discussion here is focused on
back-bdb and back-hdb, but most of it also applies to back-ldbm when using
BerkeleyDB as the underlying database engine.

Start with the most obvious - the back-bdb database lives in two main files,
dn2id.bdb and id2entry.bdb. These are B-tree databases. We have never
documented the back-bdb internal layout before, because it didn't seem like
something anyone should have to worry about, nor was it necessarily cast in
stone. But here's how it works today, in OpenLDAP 2.1 and 2.2. (All of the
database files in back-ldbm are B-trees by default.)

A B-tree is a balanced tree; it stores data in its leaf nodes and bookkeeping
data in its interior nodes. (If you don't know what tree data structures look
like in general, Google for some references, because that's getting far too
elementary for the purposes of this discussion.)

For decent performance, you need enough cache memory to contain all the nodes
along the path from the root of the tree down to the particular data item
you're accessing. That's enough cache for a single search. For the general
case, you want enough cache to contain all the internal nodes in the database.
"db_stat -d" will tell you how many internal pages are present in a database.
You should check this number for both dn2id and id2entry.
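
For example, assuming the default database directory, something like the
following prints those statistics (the exact output format depends on the BDB
release):

    db_stat -d dn2id.bdb -h /var/lib/ldap
    db_stat -d id2entry.bdb -h /var/lib/ldap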

Also note that id2entry always uses 16KB per "page", while dn2id uses whatever
the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing the
cache and triggering these infinite hang bugs in BDB 4.1.25, your cache must be
at least as large as the number of internal pages in both the dn2id and
id2entry databases, plus some extra space to accommodate the actual leaf data
pages.

For example, in my OpenLDAP 2.2 test database, I have an input LDIF file that's
about 360MB. With the back-hdb backend this creates a dn2id.bdb that's 68MB,
and an id2entry that's 800MB. db_stat tells me that dn2id uses 4KB pages, has
433 internal pages, and 6378 leaf pages. The id2entry uses 16KB pages, has 52
internal pages, and 45912 leaf pages. In order to efficiently retrieve any
single entry in this database, the cache should be at least

(433+1) * 4KB + (52+1) * 16KB in size: 1736KB + 848KB =~ 2.5MB.

This doesn't take into account other library overhead, so this is even lower
than the barest minimum. The default cache size, when nothing is configured, is
only 256KB. If you tried to do much of anything with this database and only
default settings, BDB 4.1.25 would lock up in an infinite loop.

This 2.5MB number also doesn't take indexing into account. Each indexed
attribute uses another database file of its own, using a Hash structure.
(Again, in back-ldbm, the indexes also use B-trees by default, so this part of
the discussion doesn't apply unless back-ldbm was explicitly compiled to use
Hashes instead. Also, from OpenLDAP 2.2 onward, all of the indexes use B-trees
and there are no more Hash database files, so just use the B-tree information
above and ignore this Hash discussion.)

Unlike the B-trees, where you only need to touch one data page to find an entry
of interest, doing an index lookup generally touches multiple keys, and the
point of a hash structure is that the keys are evenly distributed across the
data space. That means there's no convenient compact subset of the database
that you can keep in the cache to ensure quick operation; you can pretty much
expect references to be scattered across the whole thing. My strategy here
would be to provide enough cache for at least 50% of all of the hash data:
(number of hash buckets + number of overflow pages + number of duplicate pages)
* page size / 2.

The objectClass index for my example database is 5.9MB and uses 3 hash buckets
and 656 duplicate pages. So ( 3 + 656 ) * 4KB / 2 =~ 1.3MB.

With only this index enabled, I'd figure at least a 4MB cache for this backend.
(Of course you're using a single cache shared among all of the database files,
so the cache pages will most likely get used for something other than what you
accounted for, but this gives you a fighting chance.)
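
Expressed as a DB_CONFIG setting, that estimate would be something like the
following (a sketch only; redo the arithmetic for your own database):

    # one 4MB cache (gbytes, bytes, ncaches)
    set_cachesize 0 4194304 1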

With this 4MB cache I can slapcat this entire database on my 1.3GHz PIII in 1
minute, 40 seconds. With the cache doubled to 8MB, it still takes the same
1:40s. Once you've got enough cache to fit the B-tree internal pages,
increasing it further won't have any effect until the cache really is large
enough to hold 100% of the data pages. I don't have enough free RAM to hold all
the 800MB id2entry data, so 4MB is good enough.

With back-bdb and back-hdb you can use "db_stat -m" to check how well the
database cache is performing. Unfortunately you can't do this with back-ldbm,
as the statistics are not accessible when slapd is running, nor are they saved
anywhere when slapd is stopped. (Yet another reason not to use back-ldbm.)
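
For example (assuming the default database directory):

    db_stat -m -h /var/lib/ldap

Compare the number of requested pages found in the cache with the number of
pages read into the cache from disk; a high miss rate is a sign that the cache
is still too small.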