<!--Copyright 1997-2002 by Sleepycat Software, Inc.-->
<!--All rights reserved.-->
<!--See the file LICENSE for redistribution information.-->
<title>Berkeley DB Reference Guide: Elections</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
<table width="100%"><tr valign=top>
<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Replication</dl></h3></td>
<td align=right><a href="../../ref/rep/init.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rep/logonly.html"><img src="../../images/next.gif" alt="Next"></a>
<h1 align=center>Elections</h1>
<p>Berkeley DB never initiates elections; that is the responsibility of the
application. It is not dangerous to hold an election, as the Berkeley DB
election process ensures there is never more than a single master
environment. Clients should initiate an election whenever they lose
contact with the master environment, whenever they see a return of
<a href="../../api_c/rep_message.html#DB_REP_HOLDELECTION">DB_REP_HOLDELECTION</a> from the <a href="../../api_c/rep_message.html">DB_ENV->rep_process_message</a> method, or when, for
whatever reason, they do not know who the master is. It is not
necessary for applications to hold elections immediately when they
start, as any existing master will be quickly discovered after calling
<a href="../../api_c/rep_start.html">DB_ENV->rep_start</a>. If no master has been found after a short wait
period, then the application should call for an election.
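<p>The conditions above can be sketched as a small decision function. This
is only an illustration: the <b>client_view</b> structure, the
<b>should_hold_election</b> function and the five-second
<b>STARTUP_WAIT_SECS</b> value are hypothetical names and numbers chosen
for this example, not part of the Berkeley DB API.

```c
#include <stdbool.h>

/*
 * Hypothetical application-side bookkeeping for one client.
 * None of these fields are Berkeley DB structures.
 */
struct client_view {
    bool lost_master_contact;   /* application noticed the master is unreachable */
    bool saw_holdelection;      /* rep_process_message returned DB_REP_HOLDELECTION */
    bool master_known;          /* the client currently knows who the master is */
    unsigned secs_since_start;  /* seconds since DB_ENV->rep_start was called */
};

/* Assumed length of the "short wait period" after startup. */
#define STARTUP_WAIT_SECS 5

/* Return true if the application should call for an election now. */
bool should_hold_election(const struct client_view *c)
{
    /* Lost contact, or the library asked for an election. */
    if (c->lost_master_contact || c->saw_holdelection)
        return true;

    /*
     * Master unknown: give an existing master a short time to be
     * discovered after startup before calling for an election.
     */
    if (!c->master_known)
        return c->secs_since_start >= STARTUP_WAIT_SECS;

    return false;
}
```

<p>An application might evaluate such a function each time it processes a
replication message or a timer fires. The appropriate wait period depends
on the network and is left to the application.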
<p>For a client to become the master, the client must win an election. To
win an election, the replication group must currently have no master,
the client must have the highest priority of the database environments
participating in the election, and at least (N / 2 + 1) of the N members
of the replication group must participate in the election. In the case
of multiple database environments with equal priorities, the environment
with the most recent log records will win.
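<p>As a rough model of these rules (not the library's internal
implementation), the following sketch picks a winner from a list of
participants. The <b>participant</b> structure and both functions are
hypothetical. Two assumptions are made for the example: (N / 2 + 1) is
evaluated with integer division, and the recency of each environment's log
is reduced to a single number where larger means more recent.

```c
#include <stddef.h>

/* Hypothetical description of one election participant. */
struct participant {
    int env_id;                 /* environment ID */
    int priority;               /* higher value is preferred */
    unsigned long log_position; /* assumed model: larger = more recent log */
};

/* Participants required: N / 2 + 1, assuming integer division. */
int votes_needed(int group_size)
{
    return group_size / 2 + 1;
}

/*
 * Return the env_id of the winner, or -1 if nobody can win because a
 * master already exists or too few environments are participating.
 */
int pick_winner(const struct participant *p, size_t n,
    int group_size, int master_exists)
{
    size_t i, best = 0;

    if (master_exists || n == 0 || (int)n < votes_needed(group_size))
        return -1;

    for (i = 1; i < n; i++) {
        /* Highest priority wins; ties go to the most recent log. */
        if (p[i].priority > p[best].priority ||
            (p[i].priority == p[best].priority &&
            p[i].log_position > p[best].log_position))
            best = i;
    }
    return p[best].env_id;
}
```

<p>Under these assumptions, a group of five needs three participants, and
an election with only two participants out of five has no winner.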
<p>It is dangerous to configure more than one master environment using the
<a href="../../api_c/rep_start.html">DB_ENV->rep_start</a> method, and applications should be careful not to do so.
Applications should configure themselves as the master environment only
if they are the only possible master, or if they have won an election.
An application can know it has won an election only if the
<a href="../../api_c/rep_elect.html">DB_ENV->rep_elect</a> method returns success and the local database environment's
ID as the new master environment ID, or if the <a href="../../api_c/rep_message.html">DB_ENV->rep_process_message</a> method
returns <a href="../../api_c/rep_message.html#DB_REP_NEWMASTER">DB_REP_NEWMASTER</a> and the local database environment's
ID as the new master environment ID.
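<p>The check described above amounts to comparing the reported master ID
with the local environment ID. A minimal sketch follows; the
<b>rep_event</b> enumeration and <b>local_env_won</b> function are
hypothetical names for this example, standing in for however the
application records the outcome of its DB_ENV->rep_elect and
DB_ENV->rep_process_message calls.

```c
#include <stdbool.h>

/* Hypothetical summary of what the application just observed. */
enum rep_event {
    EV_ELECT_SUCCEEDED, /* DB_ENV->rep_elect returned success */
    EV_NEWMASTER,       /* rep_process_message returned DB_REP_NEWMASTER */
    EV_OTHER            /* anything else */
};

/*
 * Return true only when one of the two notifications was seen and it
 * named the local environment as the new master.
 */
bool local_env_won(enum rep_event ev, int reported_master_id,
    int local_env_id)
{
    if (ev != EV_ELECT_SUCCEEDED && ev != EV_NEWMASTER)
        return false;
    return reported_master_id == local_env_id;
}
```

<p>Only when such a check succeeds (or the environment is the only possible
master) should the application go on to configure itself as the master.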
<p>To add a database environment to the replication group with the intent
of it becoming the master, first add it as a client. Since it may be
out of date with respect to the current master, allow it to update
itself from the current master. Then, shut the current master down.
Presumably, the added client will win the subsequent election. If the
client does not win the election, it is likely that it was not given
sufficient time to update itself with respect to the current master.
<p>If a client is unable to find a master or win an election, it means that
the network has been partitioned and there are not enough environments
participating in the election for one of the participants to win. In
this case, the application should repeatedly call <a href="../../api_c/rep_start.html">DB_ENV->rep_start</a> and
<a href="../../api_c/rep_elect.html">DB_ENV->rep_elect</a>, alternating between attempting to discover an
existing master and holding an election to declare a new one. In
desperate circumstances, an application could simply declare itself the
master by calling <a href="../../api_c/rep_start.html">DB_ENV->rep_start</a>, or by reducing the number of
participants required to win an election until the election is won.
Neither of these solutions is recommended: in the case of a network
partition, either of these choices can result in there being two masters
in one replication group, and the databases in the environment might
irretrievably diverge as they are modified in different ways by the
two masters.
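<p>The alternating retry described above might be structured as follows.
This is a sketch under stated assumptions: <b>step_fn</b>,
<b>find_or_elect_master</b> and the two <b>demo_</b> callbacks are
hypothetical. In a real application the callbacks would wrap the
DB_ENV->rep_start and DB_ENV->rep_elect calls; here they are stand-ins so
the control flow can be followed without the library.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical callback type: try one step and, on success, store the
 * master's environment ID in *master_id and return true.
 */
typedef bool (*step_fn)(void *ctx, int *master_id);

/*
 * Alternate between discovering an existing master and holding an
 * election, for at most max_rounds rounds. Returns true once either
 * step reports a master.
 */
bool find_or_elect_master(step_fn discover, step_fn elect, void *ctx,
    int max_rounds, int *master_id)
{
    int round;

    for (round = 0; round < max_rounds; round++) {
        if (discover(ctx, master_id))
            return true;
        if (elect(ctx, master_id))
            return true;
    }
    return false;
}

/* Demo stand-in: discovery never finds a master. */
bool demo_discover_none(void *ctx, int *master_id)
{
    (void)ctx;
    (void)master_id;
    return false;
}

/* Demo stand-in: the third election attempt elects environment 7. */
bool demo_elect_third_try(void *ctx, int *master_id)
{
    int *attempts = ctx;

    if (++*attempts < 3)
        return false;
    *master_id = 7;
    return true;
}
```

<p>A real loop would normally pause between rounds. If it gives up, the
application should report the failure rather than declare itself the
master.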
<p>It is possible for a less-preferred database environment to win an
election if a number of systems crash at the same time. Because an
election winner is declared as soon as enough environments participate
in the election, the environment on a slow-booting but well-connected
machine might lose to an environment on a badly connected but
faster-booting machine. In the case of a number of environments crashing at
the same time (for example, a set of replicated servers in a single
machine room), applications should bring the database environments
online as clients initially (which will allow them to process read queries
immediately), and then hold an election after sufficient time has passed
for the slower-booting machines to catch up.
<p>If, for any reason, a less-preferred database environment becomes the
master, it is possible to switch masters in a replicated environment,
although it is not a simple operation. For example, suppose the preferred
master crashes and one of the replication group clients becomes the
group master. In order to restore the preferred master to master
status, take the following steps:
<ol>
<p><li>The preferred master should reboot and re-join the replication group
as a client.
<li>Once the preferred master has caught up with the replication group, the
application on the current master should complete all active
transactions, close all open database handles, and reconfigure itself
as a client using the <a href="../../api_c/rep_start.html">DB_ENV->rep_start</a> method.
<li>Then, the current or preferred master should call for an election using
the <a href="../../api_c/rep_elect.html">DB_ENV->rep_elect</a> method.
</ol>
<table width="100%"><tr><td><br></td><td align=right><a href="../../ref/rep/init.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/rep/logonly.html"><img src="../../images/next.gif" alt="Next"></a>
<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>