<!--Copyright 1997-2002 by Sleepycat Software, Inc.-->
<!--All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<title>Berkeley DB Reference Guide: Application structure</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
<table width="100%"><tr valign=top>
<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></h3></td>
<td align=right><a href="../../ref/transapp/term.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/env_open.html"><img src="../../images/next.gif" alt="Next"></a>
</td></tr></table>
<h1 align=center>Application structure</h1>
<p>Transactionally protected applications raise some special issues. The
most important is that if any thread of control exits for any reason
while holding Berkeley DB resources, recovery must be performed to do the
following:
<ol>
<li>Recover the Berkeley DB resources.
<li>Release any locks or mutexes that may have been held, to avoid starvation
as the remaining threads of control convoy behind the failed thread's
locks.
<li>Clean up any partially completed operations that may have left a
database in an inconsistent or corrupted state.
</ol>
<p>Complicating this problem is the fact that the Berkeley DB library itself
cannot determine whether recovery is required; the application
<b>must</b> make that decision. A further complication is that
recovery must be single-threaded; that is, one thread of control or
process must perform recovery before any other thread of control or
process attempts to create or join the Berkeley DB environment.
<p>There are two approaches to handling this problem:
<dl>
<p><dt>The hard way:<dd>An application can track its own state carefully enough that it knows
when recovery needs to be performed. Specifically, the rule is that
recovery must be performed before using a Berkeley DB environment any
time the threads of control previously using the environment did not
shut it down cleanly before exiting, for any reason (including
application or system failure).
<p>Requirements for shutting down the environment cleanly differ, depending
on the type of environment created. If the environment is public and
persistent (that is, the <a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag was not specified to
the <a href="../../api_c/env_open.html">DB_ENV->open</a> method), recovery must be performed if any transaction
was not committed or aborted, or if the <a href="../../api_c/env_close.html">DB_ENV->close</a> method was not
called for any open <a href="../../api_c/env_class.html">DB_ENV</a> handle.
<p>If the environment is private and temporary (that is, the
<a href="../../api_c/env_open.html#DB_PRIVATE">DB_PRIVATE</a> flag was specified to the <a href="../../api_c/env_open.html">DB_ENV->open</a> method),
recovery must be performed if any transaction was not committed or
aborted, or if the <a href="../../api_c/env_close.html">DB_ENV->close</a> method was not called for any open
<a href="../../api_c/env_class.html">DB_ENV</a> handle. In addition, at least one transaction checkpoint
must be performed after all existing transactions have been committed
or aborted.
<p><dt>The easy way:<dd>It greatly simplifies matters that recovery may be performed regardless
of whether it strictly needs to be performed; that is, it is not
an error to run recovery on a database for which no recovery is
necessary. Because of this fact, it is almost invariably simpler to
ignore the previous rules about shutting an application down cleanly,
and simply run recovery each time a thread of control accessing a
database environment fails for any reason, as well as before accessing
any database environment after system reboot.
</dl>
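<p>The shutdown rules given under "the hard way" can be written down as a
small decision function. This is only a sketch: the
<code>shutdown_state</code> structure and its field names are hypothetical
bookkeeping that an application would have to maintain itself, and are
not part of the Berkeley DB API.

```c
#include <stdbool.h>

/* Hypothetical record of how the previous users left the environment. */
struct shutdown_state {
    bool env_private;           /* DB_PRIVATE was specified to DB_ENV->open */
    int  unresolved_txns;       /* transactions neither committed nor aborted */
    int  unclosed_env_handles;  /* DB_ENV handles never given to DB_ENV->close */
    bool checkpoint_after_txns; /* a checkpoint ran after all txns resolved */
};

/* Return true if recovery must run before the environment is used again. */
static bool recovery_required(const struct shutdown_state *s)
{
    /* Applies to both public and private environments. */
    if (s->unresolved_txns > 0 || s->unclosed_env_handles > 0)
        return true;

    /* Private environments additionally need a final checkpoint. */
    if (s->env_private && !s->checkpoint_after_txns)
        return true;

    return false;
}
```

<p>Keeping such a record accurate across application and system failures
is the difficult part, which is why running recovery unconditionally is
usually the simpler choice.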
<p>There are two common ways to build transactionally protected Berkeley DB
applications. The most common way is as a single, usually
multithreaded, process. This architecture is simplest because it
requires no monitoring of other threads of control. When the
application starts, it opens and potentially creates the environment,
runs recovery (whether or not it was needed), and then opens its
databases. From then on, the application can create new threads of
control as it chooses. All threads of control share the open Berkeley DB
<a href="../../api_c/env_class.html">DB_ENV</a> and <a href="../../api_c/db_class.html">DB</a> handles. In this model, databases are
rarely opened or closed when more than a single thread of control is
running; that is, they are opened when only a single thread is running,
and closed after all threads but one have exited. The last thread of
control to exit closes the databases and the environment.
<p>An alternative way to build Berkeley DB applications is as a set of
cooperating processes, which may or may not be multithreaded. This
architecture is more complicated.
<p>First, because recovery must be single-threaded, this architecture
requires that the order in which threads of control are created and
subsequently access the Berkeley DB environment be controlled. The first
thread of control to access the environment must run recovery, and no
other thread should attempt to access the environment until recovery is
complete. (Note that this ordering requirement does not apply to
environment creation without recovery. If multiple threads attempt to
create a Berkeley DB environment, only one will perform the creation and the
others will join the already existing environment.)
<p>Second, this architecture requires that threads of control be monitored.
If any thread of control that owns Berkeley DB resources exits without first
cleanly discarding those resources, recovery is usually necessary.
Before running recovery, all threads using the Berkeley DB environment must
relinquish all of their Berkeley DB resources (it does not matter whether
they do so gracefully or because they are forced to exit). Then, recovery
can be run and the threads of control continued or restarted.
<p>We have found that the safest way to structure groups of cooperating
processes is to first create a single process (often a shell script)
that opens the Berkeley DB environment (creating it if necessary) and runs
recovery, and that then creates the processes or threads that will
actually perform work. The initial process has no further
responsibilities other than to monitor the threads of control it has
created, to ensure that none of them unexpectedly exits. If one exits,
the initial process then forces all of the threads of control using the
Berkeley DB environment to exit, runs recovery, and restarts the working
threads of control.
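<p>The monitoring loop of such a parent process might look like the
following POSIX sketch. Everything named here is an assumption made for
illustration: <code>run_generation</code>, the worker functions, and
<code>count_recovery</code> are hypothetical, an "unexpected exit" is taken
to mean a nonzero exit status or death by signal, and the
<code>recover</code> callback stands in for whatever actually runs Berkeley DB
recovery.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16

/*
 * Fork nworkers children running work(id) and wait for them.
 * Returns 0 if every worker exited with status 0. If any worker exits
 * abnormally (or a fork fails), the remaining workers are killed and
 * reaped, recover() is called once, and 1 is returned so that the caller
 * can start a new generation. Returns -1 for an invalid worker count.
 */
static int run_generation(int nworkers, int (*work)(int), void (*recover)(void))
{
    pid_t pids[MAX_WORKERS];
    int started = 0, remaining, failed = 0, status;

    if (nworkers < 1 || nworkers > MAX_WORKERS)
        return -1;

    for (int i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(work(i));     /* child: do the work, never return */
        if (pid < 0) {
            failed = 1;         /* treat a failed fork like a failed worker */
            break;
        }
        pids[started++] = pid;
    }

    remaining = started;
    while (remaining > 0 && !failed) {
        pid_t done = wait(&status);
        if (done < 0) {
            failed = 1;
            break;
        }
        for (int i = 0; i < started; i++)
            if (pids[i] == done) {
                pids[i] = 0;    /* mark as reaped */
                remaining--;
            }
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    if (!failed)
        return 0;

    /* Force every surviving worker out of the environment and reap it. */
    for (int i = 0; i < started; i++)
        if (pids[i] > 0) {
            kill(pids[i], SIGKILL);
            waitpid(pids[i], NULL, 0);
        }

    recover();                  /* no worker is running at this point */
    return 1;
}

/* Illustrative workers; real ones would join the environment and work. */
static int steady_worker(int id) { (void)id; return 0; }

static int flaky_worker(int id)
{
    if (id == 0)
        return 1;               /* worker 0 fails immediately */
    pause();                    /* the others block until they are killed */
    return 0;
}

static int recoveries;          /* stands in for running real recovery */
static void count_recovery(void) { recoveries++; }
```

<p>A parent would typically call <code>run_generation</code> in a loop for
as long as it returns 1. The sketch kills the survivors before calling
<code>recover</code> because recovery must be single-threaded.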
<p>If it is not practical to have a single parent for the processes sharing
a Berkeley DB environment, each process sharing the environment should log
its connection to and exit from the environment in a way that allows
a monitoring process to detect whether a thread of control might have
acquired Berkeley DB resources and never released them. In this model, an
initial "watcher" process opens the Berkeley DB environment (creating it if
necessary), runs recovery, and then creates a sentinel file. Any other
process wanting to use the Berkeley DB environment checks for the sentinel
file; if the sentinel file exists, the other process registers its
process ID with the watcher and joins the database environment. When
the other process finishes with the environment, it unregisters its
process ID with the watcher. The watcher periodically checks to ensure
that no process has failed while using the environment. If a process
does fail while using the environment, the watcher removes the sentinel
file, kills all processes currently using the environment, runs
recovery, and re-creates the sentinel file.
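<p>Two building blocks of the watcher model described above, the sentinel
file and the periodic liveness check, can be sketched with POSIX calls.
The function names are hypothetical, and how process IDs are registered
with the watcher is left open here, as it is in the text; the sketch
simply assumes the watcher holds them in an array.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if a process with this ID exists, 0 if it does not. */
static int process_exists(pid_t pid)
{
    /* Signal 0 performs the existence and permission checks only. */
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;      /* exists, but we may not signal it */
}

/* Count the registered processes that have disappeared. */
static int count_missing(const pid_t *registered, int n)
{
    int missing = 0;

    for (int i = 0; i < n; i++)
        if (!process_exists(registered[i]))
            missing++;
    return missing;
}

/* Create the sentinel file; fails if it already exists. */
static int sentinel_create(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd < 0)
        return -1;
    return close(fd);
}

/* Return 1 if the sentinel file is present, 0 otherwise. */
static int sentinel_present(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Remove the sentinel file; returns 0 on success, -1 on error. */
static int sentinel_remove(const char *path)
{
    return unlink(path);
}
```

<p>When <code>count_missing</code> returns a nonzero value, the watcher would
remove the sentinel, kill the remaining registered processes, run
recovery, and create the sentinel again. Note that a process ID check of
this kind is only approximate: an exited process that its parent has not
yet reaped still exists, and process IDs can eventually be reused.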
<p>In either case, it is important that the monitoring process be as
simple and well-tested as possible because there is no recourse if it
fails.
<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>