<!--Copyright 1997-2002 by Sleepycat Software, Inc.-->
<!--All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<title>Berkeley DB Reference Guide: Recoverability and deadlock handling</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
<table width="100%"><tr valign=top>
<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></h3></td>
<td align=right><a href="../../ref/transapp/data_open.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/atomicity.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<h1 align=center>Recoverability and deadlock handling</h1>
<p>The first reason listed for using transactions was recoverability. Any
logical change to a database may require multiple changes to underlying
data structures. For example, modifying a record in a Btree may require
leaf and internal pages to split, so a single <a href="../../api_c/db_put.html">DB->put</a> method
call can potentially require that multiple physical database pages be
written. If only some of those pages are written and then the system
or application fails, the database is left inconsistent and cannot be
used until it has been recovered; that is, until the partially completed
changes have been undone.
<p><i>Write-ahead logging</i> is the term that describes the underlying
implementation that Berkeley DB uses to ensure recoverability. It means
that before any change is made to a database, information about the
change is written to a database log. During recovery, the log is read,
and databases are checked to ensure that changes described in the log
for committed transactions appear in the database. Changes that appear
in the database but are related to aborted or unfinished transactions
in the log are undone from the database.
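<p>As a rough illustration of the write-ahead idea, and not of Berkeley DB's
actual log format or recovery algorithm, the following self-contained C
sketch keeps a toy database of integer pages and an in-memory log. Every
name in it (toy_db, toy_update, toy_commit, toy_recover, and so on) is
hypothetical and exists only for this example. It also assumes that a
page is changed by at most one unfinished transaction at a time, which
is what locking would normally guarantee.

```c
#include <assert.h>
#include <string.h>

#define NPAGES 4    /* pages in the toy database */
#define MAXLOG 16   /* capacity of the toy log */

enum rec_type { REC_UPDATE, REC_COMMIT };

/* One log record: which transaction changed which page, and how. */
struct log_rec {
    enum rec_type type;
    int txn;      /* hypothetical transaction ID */
    int page;     /* page index (REC_UPDATE only) */
    int before;   /* value before the change, used to undo */
    int after;    /* value after the change, used to redo */
};

struct toy_db {
    int pages[NPAGES];          /* stands in for the database file */
    struct log_rec log[MAXLOG]; /* stands in for the log file */
    int nlog;                   /* number of log records written */
};

static void
toy_init(struct toy_db *db)
{
    memset(db, 0, sizeof(*db));
}

/* Write-ahead rule: append the log record before changing the page. */
static void
toy_update(struct toy_db *db, int txn, int page, int value)
{
    struct log_rec *r;

    assert(db->nlog < MAXLOG && page >= 0 && page < NPAGES);
    r = &db->log[db->nlog++];
    r->type = REC_UPDATE;
    r->txn = txn;
    r->page = page;
    r->before = db->pages[page];
    r->after = value;
    db->pages[page] = value;
}

/* Record that a transaction committed. */
static void
toy_commit(struct toy_db *db, int txn)
{
    struct log_rec *r;

    assert(db->nlog < MAXLOG);
    r = &db->log[db->nlog++];
    memset(r, 0, sizeof(*r));
    r->type = REC_COMMIT;
    r->txn = txn;
}

/* Return 1 if the log holds a commit record for txn, else 0. */
static int
toy_committed(const struct toy_db *db, int txn)
{
    int i;

    for (i = 0; i < db->nlog; i++)
        if (db->log[i].type == REC_COMMIT && db->log[i].txn == txn)
            return (1);
    return (0);
}

/* Recovery: redo committed changes, then undo everything else. */
static void
toy_recover(struct toy_db *db)
{
    const struct log_rec *r;
    int i;

    for (i = 0; i < db->nlog; i++) {        /* forward pass: redo */
        r = &db->log[i];
        if (r->type == REC_UPDATE && toy_committed(db, r->txn))
            db->pages[r->page] = r->after;
    }
    for (i = db->nlog - 1; i >= 0; i--) {   /* backward pass: undo */
        r = &db->log[i];
        if (r->type == REC_UPDATE && !toy_committed(db, r->txn))
            db->pages[r->page] = r->before;
    }
}
```

<p>For instance, suppose transaction 1 sets page 0 to 5 and commits,
transaction 2 sets page 1 to 9 and never commits, and the write to page
0 is then lost in a crash. Calling toy_recover restores page 0 to 5 from
the log and returns page 1 to its previous value of 0.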
<p>For recoverability after application or system failure, operations that
modify the database must be protected by transactions. More
specifically, operations are not recoverable unless a transaction is
begun, each operation is associated with the transaction via the
Berkeley DB interfaces, and the transaction is then successfully
committed. This is true even if logging is turned on in the database
environment.
<p>Here is an example function that updates a record in a database in a
transactionally protected manner. The function takes a key and a data
item as arguments and then attempts to store them in the database.
<p><blockquote><pre>int
main(int argc, char *argv[])
{
    DB *db_cats, *db_color, *db_fruit;
    DB_ENV *dbenv;
    int ch;

    while ((ch = getopt(argc, argv, "")) != EOF)
        switch (ch) {
        case '?':
        default:
            usage();
        }

    /* Helpers defined in the earlier sections of this guide. */
    env_dir_create();
    env_open(&dbenv);

    /* Open database: Key is fruit class; Data is specific type. */
    db_open(dbenv, &db_fruit, "fruit", 0);

    /* Open database: Key is a color; Data is an integer. */
    db_open(dbenv, &db_color, "color", 0);

    /*
     * Open database:
     *    Key is a name; Data is: company name, cat breeds.
     */
    db_open(dbenv, &db_cats, "cats", 1);

<b>    add_fruit(dbenv, db_fruit, "apple", "yellow delicious");</b>

    return (0);
}

<b>int
add_fruit(DB_ENV *dbenv, DB *db, char *fruit, char *name)
{
    DBT key, data;
    DB_TXN *tid;
    int fail, ret, t_ret;

    /* Initialization. */
    memset(&key, 0, sizeof(key));
    memset(&data, 0, sizeof(data));
    key.data = fruit;
    key.size = strlen(fruit);
    data.data = name;
    data.size = strlen(name);

    for (fail = 0;;) {
        /* Begin the transaction. */
        if ((ret = dbenv->txn_begin(dbenv, NULL, &tid, 0)) != 0) {
            dbenv->err(dbenv, ret, "DB_ENV->txn_begin");
            exit (1);
        }

        /* Store the value. */
        switch (ret = db->put(db, tid, &key, &data, 0)) {
        case 0:
            /* Success: commit the change. */
            if ((ret = tid->commit(tid, 0)) != 0) {
                dbenv->err(dbenv, ret, "DB_TXN->commit");
                exit (1);
            }
            return (0);
        case DB_LOCK_DEADLOCK:
        default:
            /* Retry the operation. */
            if ((t_ret = tid->abort(tid)) != 0) {
                dbenv->err(dbenv, t_ret, "DB_TXN->abort");
                exit (1);
            }
            if (++fail == MAXIMUM_RETRY)
                return (ret);
            break;
        }
    }
}</b></pre></blockquote>
<p>Berkeley DB also uses transactions to recover from deadlock. Database
operations (that is, any call to a function underlying the handles
returned by <a href="../../api_c/db_open.html">DB->open</a> and <a href="../../api_c/db_cursor.html">DB->cursor</a>) are usually
performed on behalf of a unique locker. Transactions can be used to
perform multiple calls on behalf of the same locker within a single
thread of control. For example, consider the case in which a cursor
scan locates a record and then accesses some other item in the database,
based on that record. If these operations are done using the handle's
default locker IDs, they may conflict. If the locks are obtained on
behalf of a transaction, using the transaction's locker ID instead of
the handle's locker ID, the operations will not conflict.
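<p>The effect of locker IDs can be sketched with a toy lock table. The
following C fragment is only a model, assuming simple read and write
locks; it is not the Berkeley DB lock manager, and the names
lock_table, lock_request, and the LK_* modes are hypothetical. A request
conflicts only with locks held by a <i>different</i> locker, so two
operations that share one locker ID (as they would when both are done on
behalf of the same transaction) never conflict with each other.

```c
#include <assert.h>
#include <string.h>

#define NLOCKERS 4  /* hypothetical locker IDs 0..3 */
#define NITEMS   4  /* hypothetical lockable items 0..3 */

enum lock_mode { LK_NONE = 0, LK_READ = 1, LK_WRITE = 2 };

/* held[l][i] is the strongest lock that locker l holds on item i. */
struct lock_table {
    enum lock_mode held[NLOCKERS][NITEMS];
};

static void
lock_table_init(struct lock_table *t)
{
    memset(t, 0, sizeof(*t));   /* LK_NONE is 0 */
}

/*
 * Return 1 and record the lock if it can be granted, or 0 if it
 * conflicts with a lock held by a different locker. (A real lock
 * manager would block the caller instead of returning 0.)
 */
static int
lock_request(struct lock_table *t, int locker, int item, enum lock_mode want)
{
    int other;

    assert(locker >= 0 && locker < NLOCKERS);
    assert(item >= 0 && item < NITEMS);
    assert(want == LK_READ || want == LK_WRITE);

    for (other = 0; other < NLOCKERS; other++) {
        if (other == locker || t->held[other][item] == LK_NONE)
            continue;
        /* Two readers are compatible; anything with a writer is not. */
        if (want == LK_WRITE || t->held[other][item] == LK_WRITE)
            return (0);
    }

    /* Never downgrade a lock this locker already holds. */
    if (t->held[locker][item] < want)
        t->held[locker][item] = want;
    return (1);
}
```

<p>In this model, if locker 1 (standing in for a cursor) holds a read lock
on item 0, a write request on item 0 from locker 2 (standing in for a
second handle) is refused, whereas a read followed by a write on item 0
from the single locker 3 (standing in for a transaction) is granted both
times.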
<p>The sample add_fruit function handles an error return that you may not
have seen before. In transactional (not Concurrent Data Store)
applications supporting both readers and writers, or just multiple
writers, Berkeley DB functions have an additional possible error return:
<a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>. This means that two threads of control
deadlocked, and the thread receiving the <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> error
return has been selected to discard its locks in order to resolve the
problem. When an application receives a <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>
return, the correct action is to close any cursors involved in the
operation and abort any enclosing transaction. In the sample code, any
time the <a href="../../api_c/db_put.html">DB->put</a> method returns <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>,
<a href="../../api_c/txn_abort.html">DB_TXN->abort</a> is called (which releases the transaction's Berkeley DB
resources and undoes any partial changes to the databases), and then
the transaction is retried from the beginning.
<p>There is no requirement that the transaction be attempted again, but
that is a common course of action for applications. Applications may
want to set an upper bound on the number of times an operation will be
retried because some operations on some data sets may simply be unable
to succeed. For example, updating all of the pages on a large Web site
during prime business hours may simply be impossible because of the high
access rate to the database.
<p>The <a href="../../api_c/txn_abort.html">DB_TXN->abort</a> method is also called in error cases other than
deadlock. Any time an error occurs such that a transactionally protected
set of operations cannot complete successfully, the transaction must be
aborted. While deadlock is by far the most common of these errors,
there are other possibilities; for example, running out of disk space
for the filesystem. In Berkeley DB transactional applications, there are
three classes of error returns: "expected" errors, "unexpected but
recoverable" errors, and a single "unrecoverable" error. Expected
errors are errors like <a href="../../ref/program/errorret.html#DB_NOTFOUND">DB_NOTFOUND</a>, which indicates that a
searched-for key item is not present in the database. Applications may
want to explicitly test for and handle this error, or, in the case where
the absence of a key implies the enclosing transaction should fail,
simply call <a href="../../api_c/txn_abort.html">DB_TXN->abort</a>. Unexpected but recoverable errors are
errors like <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>, which indicates that an operation
has been selected to resolve a deadlock, or a system error such as EIO,
which indicates an I/O failure (for example, a write that could not
complete because the filesystem has no available disk space).
Applications must immediately call <a href="../../api_c/txn_abort.html">DB_TXN->abort</a> when these returns
occur, as it is not possible to proceed otherwise. The only
unrecoverable error is <a href="../../ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a>, which indicates that the
system must stop and recovery must be run.
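<p>One possible way to express these three classes in code is a small
classification helper, sketched below. The txn_action enumeration and
the classify_return function are hypothetical names invented for this
example. So that the fragment compiles on its own, it also defines
placeholder values for the Berkeley DB error constants when db.h has not
been included; the real values come from db.h and differ from these.

```c
#include <errno.h>

/*
 * Placeholder values, used only when db.h has not been included.
 * They are NOT the values Berkeley DB uses.
 */
#ifndef DB_NOTFOUND
#define DB_NOTFOUND      (-1001)
#endif
#ifndef DB_LOCK_DEADLOCK
#define DB_LOCK_DEADLOCK (-1002)
#endif
#ifndef DB_RUNRECOVERY
#define DB_RUNRECOVERY   (-1003)
#endif

/* What the caller should do next (hypothetical names). */
enum txn_action {
    TXN_CONTINUE,    /* success: carry on, eventually commit */
    TXN_HANDLE,      /* expected error: application decides */
    TXN_ABORT_RETRY, /* unexpected but recoverable: abort, maybe retry */
    TXN_FATAL        /* unrecoverable: stop and run recovery */
};

static enum txn_action
classify_return(int ret)
{
    switch (ret) {
    case 0:
        return (TXN_CONTINUE);
    case DB_NOTFOUND:
        /* Expected: the key is simply not in the database. */
        return (TXN_HANDLE);
    case DB_RUNRECOVERY:
        /* The single unrecoverable error. */
        return (TXN_FATAL);
    case DB_LOCK_DEADLOCK:
    default:
        /*
         * Deadlock, or any other Berkeley DB or system error
         * such as EIO: the transaction must be aborted.
         */
        return (TXN_ABORT_RETRY);
    }
}
```

<p>Note that the helper does not try to list every possible error: only
the expected return and the unrecoverable one are named, and everything
else falls through to the abort case.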
<p>It is possible to simplify the above code in the case of a transaction
consisting entirely of a single database put or delete operation. The
<a href="../../api_c/db_put.html">DB->put</a> and <a href="../../api_c/db_del.html">DB->del</a> methods (among others) support the
<a href="../../api_c/env_set_flags.html#DB_AUTO_COMMIT">DB_AUTO_COMMIT</a> flag, which allows applications to implicitly wrap
the operation in a transaction. For example, with the
<a href="../../api_c/env_set_flags.html#DB_AUTO_COMMIT">DB_AUTO_COMMIT</a> flag, the above code could be more simply written
as:
<p><blockquote><pre><b>    for (fail = 0; fail++ <= MAXIMUM_RETRY && (ret =
        db->put(db, NULL, &key, &data, DB_AUTO_COMMIT)) == DB_LOCK_DEADLOCK;)
        continue;
    return (ret == 0 ? 0 : 1);</b></pre></blockquote>
<p>Programmers should not attempt to enumerate all possible error returns
in their software. Instead, they should explicitly handle expected
returns and default to aborting the transaction for the rest. It is
entirely the choice of the programmer whether to check for
<a href="../../ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a> explicitly or not; attempting new Berkeley DB
operations after <a href="../../ref/program/errorret.html#DB_RUNRECOVERY">DB_RUNRECOVERY</a> is returned does not worsen the
situation. Alternatively, using the <a href="../../api_c/env_set_paniccall.html">DB_ENV->set_paniccall</a> method to
handle an unrecoverable error and simply doing some number of
abort-and-retry cycles for any unexpected Berkeley DB or system error in the
mainline code often results in the simplest and cleanest application
code.
<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>