<!--Copyright 1997-2002 by Sleepycat Software, Inc.-->
<!--All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<title>Berkeley DB Reference Guide: Selecting a cache size</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
<a name="2"><!--meow--></a>
<table width="100%"><tr valign=top>
<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Access Methods</dl></h3></td>
<td align=right><a href="../../ref/am_conf/pagesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/byteorder.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<h1 align=center>Selecting a cache size</h1>
<p>The size of the cache used for the underlying database can be specified
by calling the <a href="../../api_c/db_set_cachesize.html">DB->set_cachesize</a> method.
Choosing a cache size is, unfortunately, an art. Your cache must be at
least large enough for your working set plus some overlap for unexpected
situations.
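<p>As a minimal sketch of how a chosen size maps onto the method's arguments:
DB->set_cachesize takes the size as a count of gigabytes plus a count of
additional bytes, followed by a number of caches. The helper below is
hypothetical (split_cache_size and struct cache_args are not part of
Berkeley DB) and only performs the split. The call itself appears in a
comment because it needs db.h and the Berkeley DB library.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_GB (1024ULL * 1024ULL * 1024ULL)

/* Hypothetical helper type: the (gbytes, bytes) pair passed to
 * DB->set_cachesize for a given total cache size. */
struct cache_args {
    uint32_t gbytes;   /* whole gigabytes */
    uint32_t bytes;    /* remaining bytes, always less than 1GB */
};

/* Split a total size in bytes into gigabytes plus leftover bytes. */
struct cache_args split_cache_size(uint64_t total_bytes)
{
    struct cache_args a;

    /* The gigabyte count must fit in a 32-bit argument. */
    assert(total_bytes / ONE_GB <= UINT32_MAX);
    a.gbytes = (uint32_t)(total_bytes / ONE_GB);
    a.bytes = (uint32_t)(total_bytes % ONE_GB);
    return a;
}

/*
 * Intended use, before the database is opened (dbp is a DB handle
 * obtained from db_create):
 *
 *     struct cache_args a = split_cache_size(64ULL * 1024 * 1024);
 *     ret = dbp->set_cachesize(dbp, a.gbytes, a.bytes, 1);
 */
```

<p>The 64MB figure in the comment is an arbitrary illustration, not a
recommended size.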
<p>When using the Btree access method, you must have a cache big enough for
the minimum working set for a single access. This will include a root
page, one or more internal pages (depending on the depth of your tree),
and a leaf page. If your cache is any smaller than that, each new page
will force out the least-recently-used page, and Berkeley DB will re-read the
root page of the tree on each database request.
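<p>The effect of an undersized cache can be illustrated with a small
simulation. This is a sketch only: it models a plain least-recently-used
list of page numbers, not Berkeley DB's actual buffer pool, and the page
numbering and function names are made up for illustration. Each request
walks a three-level tree: the root, one internal page, and a distinct leaf.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 16

/* A tiny LRU cache of page numbers; slot 0 is the most recently used. */
struct lru {
    int pages[MAX_SLOTS];
    size_t used;       /* slots currently holding a page */
    size_t capacity;   /* cache size in pages */
};

/* Touch a page. Returns 1 on a cache hit, 0 on a miss (page read in). */
static int lru_touch(struct lru *c, int page)
{
    size_t i, pos = c->used;
    int hit;

    for (i = 0; i < c->used; i++) {
        if (c->pages[i] == page) {
            pos = i;
            break;
        }
    }
    hit = (pos < c->used);
    if (!hit) {
        if (c->used < c->capacity)
            c->used++;          /* room left: take a fresh slot */
        pos = c->used - 1;      /* otherwise reuse the LRU slot */
    }
    /* Shift the more recent entries down and move the page to the front. */
    for (i = pos; i > 0; i--)
        c->pages[i] = c->pages[i - 1];
    c->pages[0] = page;
    return hit;
}

/* Count how often the root must be read over a number of lookups.
 * Page 0 is the root, page 1 the internal page, and request r
 * visits leaf page 100 + r. */
int root_reads(size_t capacity, int requests)
{
    struct lru c = { {0}, 0, 0 };
    int r, reads = 0;

    assert(capacity >= 1 && capacity <= MAX_SLOTS);
    c.capacity = capacity;
    for (r = 0; r < requests; r++) {
        if (!lru_touch(&c, 0))
            reads++;
        lru_touch(&c, 1);
        lru_touch(&c, 100 + r);
    }
    return reads;
}
```

<p>Under these assumptions, root_reads(2, 10) counts a root read on every
one of the 10 requests, while root_reads(3, 10) reads the root only once.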
<p>If your keys are of moderate size (a few tens of bytes) and your pages
are on the order of 4K to 8K, most Btree applications will need only
three levels. For example, using 20-byte keys with 20 bytes of data
associated with each key, an 8KB page can hold roughly 400 keys (or 200
key/data pairs), so a fully populated three-level Btree will hold 32
million key/data pairs, and a tree with only a 50% page-fill factor will
still hold 16 million key/data pairs. We rarely expect trees to exceed
five levels, although Berkeley DB will support trees up to 255 levels.
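<p>One reading of this arithmetic that reproduces the 32 million figure is
that each root or internal page fans out to about 400 children and each
leaf holds about 200 key/data pairs. The function below is a sketch under
that assumption; btree_capacity and its parameter names are illustrative
and not part of Berkeley DB.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate number of key/data pairs in a uniformly filled Btree.
 *   keys_per_internal: entries on each root or internal page
 *   pairs_per_leaf:    key/data pairs on each leaf page
 *   levels:            tree depth, counting the root and the leaf level */
uint64_t btree_capacity(uint64_t keys_per_internal,
                        uint64_t pairs_per_leaf, unsigned levels)
{
    uint64_t leaves = 1;
    unsigned i;

    assert(levels >= 1);
    /* Every level above the leaves multiplies the number of leaf pages. */
    for (i = 1; i < levels; i++)
        leaves *= keys_per_internal;
    return leaves * pairs_per_leaf;
}
```

<p>With these inputs, btree_capacity(400, 200, 3) is 32,000,000. Halving
only the leaf fill, btree_capacity(400, 100, 3), gives 16,000,000. If every
level were half full, btree_capacity(200, 100, 3) would give 4,000,000, so
the 16 million figure is best read as half of the fully populated total.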
<p>The rule-of-thumb is that cache is good, and more cache is better.
Generally, applications benefit from increasing the cache size up to a
point, at which the performance will stop improving as the cache size
increases. When this point is reached, one of two things has happened:
either the cache is large enough that the application is almost never
having to retrieve information from disk, or your application is doing
truly random accesses, and therefore increasing the size of the cache doesn't
significantly increase the odds of finding the next requested information
in the cache. The latter is fairly rare: almost all applications show
some form of locality of reference.
<p>That said, it is important not to increase your cache size beyond the
capabilities of your system, as that will result in reduced performance.
Under many operating systems, tying down enough virtual memory will cause
your memory and potentially your program to be swapped. This is
especially likely on systems without unified OS buffer caches and virtual
memory spaces, as the buffer cache is allocated at boot time and so
cannot be adjusted based on application requests for large amounts of
virtual memory.
<p>Even if accesses are truly random within a Btree, your
access pattern will favor internal pages over leaf pages, so your cache
should be large enough to hold all internal pages. In the steady state,
this requires at most one I/O per operation to retrieve the appropriate
leaf page.
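<p>A rough way to estimate how much cache it takes to hold all internal pages
is to count the pages above the leaf level. The sketch below assumes
uniformly filled pages. The function internal_pages and its parameters are
illustrative names, not part of Berkeley DB.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate the number of root and internal pages in a Btree.
 *   pairs:             key/data pairs stored in the tree
 *   pairs_per_leaf:    key/data pairs on each leaf page
 *   keys_per_internal: entries on each root or internal page */
uint64_t internal_pages(uint64_t pairs, uint64_t pairs_per_leaf,
                        uint64_t keys_per_internal)
{
    uint64_t level, total = 0;

    assert(pairs_per_leaf >= 1 && keys_per_internal >= 2);
    /* Number of leaf pages, rounded up. */
    level = (pairs + pairs_per_leaf - 1) / pairs_per_leaf;
    /* Each pass counts the pages one level further up, ending at the root. */
    while (level > 1) {
        level = (level + keys_per_internal - 1) / keys_per_internal;
        total += level;
    }
    return total;
}
```

<p>For 32 million key/data pairs at 200 pairs per leaf and 400 keys per
internal page, this gives 401 internal pages, or roughly 3.1MB of 8KB pages.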
<p>You can use the <a href="../../utility/db_stat.html">db_stat</a> utility to monitor the effectiveness of
your cache. The following is excerpted from the output of that
utility's <b>-m</b> option:
<p><blockquote><pre>prompt: db_stat -m
131072 Cache size (128K).
4273 Requested pages found in the cache (97%).
134 Requested pages not found in the cache.
18 Pages created in the cache.
116 Pages read into the cache.
93 Pages written from the cache to the backing file.
5 Clean pages forced from the cache.
13 Dirty pages forced from the cache.
0 Dirty buffers written by trickle-sync thread.
130 Current clean buffer count.
4 Current dirty buffer count.
</pre></blockquote>
<p>The statistics for this cache say that there have been 4,407 requests of
the cache (4,273 found in the cache plus 134 not found), and only 116 of
those requests required an I/O from disk. The remaining 18 misses correspond
to pages created in the cache. This means that the cache is working well,
yielding a 97% cache hit rate. The
<a href="../../utility/db_stat.html">db_stat</a> utility will present these statistics both for the cache
as a whole and for each file within the cache separately.
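<p>The hit rate reported by db_stat can be reproduced from the two request
counters. The function below is a small sketch of that arithmetic;
cache_hit_percent is an illustrative name, not a Berkeley DB interface.

```c
#include <assert.h>

/* Percentage of page requests that were satisfied from the cache.
 *   hits:   requested pages found in the cache
 *   misses: requested pages not found in the cache */
double cache_hit_percent(unsigned long hits, unsigned long misses)
{
    unsigned long requests = hits + misses;

    assert(requests > 0);   /* the rate is undefined with no requests */
    return 100.0 * (double)hits / (double)requests;
}
```

<p>With the figures above, cache_hit_percent(4273, 134) is roughly 96.96,
which db_stat displays rounded to 97%.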
<table width="100%"><tr><td><br></td><td align=right><a href="../../ref/am_conf/pagesize.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/am_conf/byteorder.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>