<!--$Id: archival.so,v 10.49 2001/06/12 14:12:18 bostic Exp $-->
<!--Copyright 1997-2001 by Sleepycat Software, Inc.-->
<!--All rights reserved.-->
<title>Berkeley DB Reference Guide: Database and log file archival</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
<a name="2"><!--meow--></a><a name="3"><!--meow--></a>
<table width="100%"><tr valign=top>
<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Berkeley DB Transactional Data Store Applications</dl></h3></td>
<td align=right><a href="../../ref/transapp/checkpoint.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/logfile.html"><img src="../../images/next.gif" alt="Next"></a>
<h1 align=center>Database and log file archival</h1>
<p>The third component of the administrative infrastructure, archival for
catastrophic recovery, concerns the recoverability of the database in
the face of catastrophic failure. Recovery after catastrophic failure
is intended to minimize data loss when physical hardware has been
destroyed -- for example, loss of a disk that contains databases or log
files. Although the application may still experience data loss in this
case, it is possible to minimize it.
<p>First, you may want to periodically create snapshots (that is, backups)
of your databases to make it possible to recover from catastrophic
failure. A snapshot is either a standard backup, which creates a
consistent picture of the databases as of a single instant in time, or
an on-line backup (also known as a <i>hot</i> backup), which creates
a consistent picture of the databases as of an unspecified instant
during the period of time when the snapshot was made. The advantage of
a hot backup is that applications may continue to read and write the
databases while the snapshot is being taken. The disadvantage of a hot
backup is that more information must be archived, and recovery based on
a hot backup is to an unspecified time between the start of the backup
and when the backup is completed.
<p>Second, after taking a snapshot, you should periodically archive the
log files being created in the environment. It is often helpful to
think of database archival in terms of full and incremental filesystem
backups. A snapshot is a full backup, whereas the periodic archival of
the current log files is an incremental backup. For example, it might
be reasonable to take a full snapshot of a database environment weekly
or monthly, and archive additional log files daily. Using both the
snapshot and the log files, a catastrophic crash at any time can be
recovered to the time of the most recent log archival, which may be
long after the original snapshot.
<p>To create a standard backup of your database that can be used to recover
from catastrophic failure, take the following steps:
<p><ol>
<p><li>Commit or abort all ongoing transactions.
<p><li>Stop writing to your databases until the backup has completed. Read-only
operations are permitted, but no write operations and no filesystem
operations may be performed (for example, the <a href="../../api_c/env_remove.html">DB_ENV->remove</a> and
<a href="../../api_c/db_open.html">DB->open</a> functions may not be called).
<p><li>Force an environment checkpoint (see <a href="../../utility/db_checkpoint.html">db_checkpoint</a> for more
information).
<p><li>Run <a href="../../utility/db_archive.html">db_archive</a> <b>-s</b> to identify all the database data
files, and copy them to a backup device such as CD-ROM, alternate disk,
or tape.
<p>If the database files are stored in a separate directory from the other
Berkeley DB files, it may be simpler to archive the directory itself instead
of the individual files (see <a href="../../api_c/env_set_data_dir.html">DB_ENV->set_data_dir</a> for additional
information). <b>Note: if any of the database files did not have
an open DB handle during the lifetime of the current log files,
<a href="../../utility/db_archive.html">db_archive</a> will not list them in its output!</b> This is another
reason it may be simpler to use a separate database file directory and
archive the entire directory instead of archiving only the files listed
by <a href="../../utility/db_archive.html">db_archive</a>.
<p><li>Run <a href="../../utility/db_archive.html">db_archive</a> <b>-l</b> to identify all the log files,
and copy the last one (that is, the one with the highest number) to a
backup device such as CD-ROM, alternate disk, or tape.
</ol>
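<p>The last step of the standard backup calls for copying only the
highest-numbered log file. As a minimal sketch of that selection, the
following fragment picks the newest entry from a NULL-terminated list of
names such as the one <b>db_archive -l</b> prints. The helper names
<b>log_number</b> and <b>newest_log</b> are hypothetical and not part of
Berkeley DB, and the fragment assumes log files are named
<b>log.</b> followed by a decimal sequence number, so check that against
your own environment before relying on it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the sequence number encoded in a name such
 * as "log.0000000007" (optionally preceded by a directory path), or -1 if
 * the name does not match the assumed "log.<digits>" form.
 */
static long
log_number(const char *path)
{
    const char *base, *digits;
    char *end;
    long n;

    /* Skip any leading directory components. */
    base = strrchr(path, '/');
    base = (base == NULL) ? path : base + 1;

    if (strncmp(base, "log.", 4) != 0)
        return (-1);
    digits = base + 4;
    if (*digits < '0' || *digits > '9')
        return (-1);

    /* Reject names with trailing non-digit characters. */
    n = strtol(digits, &end, 10);
    return (*end == '\0' ? n : -1);
}

/*
 * Hypothetical helper: given a NULL-terminated list of file names, return
 * the entry with the highest log sequence number, or NULL if no entry
 * looks like a log file.
 */
const char *
newest_log(char **list)
{
    const char *best;
    long best_n, n;

    best = NULL;
    best_n = -1;
    for (; list != NULL && *list != NULL; ++list)
        if ((n = log_number(*list)) > best_n) {
            best_n = n;
            best = *list;
        }
    return (best);
}
```

<p>Comparing the parsed numbers rather than the raw strings keeps the
selection correct even if the list mixes bare file names with absolute
paths.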
<p>To create a <i>hot</i> backup of your database that can be used to
recover from catastrophic failure, take the following steps:
<p><ol>
<p><li>Archive your databases, as described in step #4 of the standard
backup procedure above. You do not have to halt ongoing transactions or force a
checkpoint. In the case of a hot backup, the utility you use to copy
the databases must read database pages atomically (as described by
<a href="../../ref/transapp/reclimit.html">Berkeley DB recoverability</a>).
<p><li>When performing a hot backup, you must additionally archive all of the
log files. The order of these two operations matters: the database
files must be archived before the log files. This
means that if the database files and log files are in the same
directory, you cannot simply archive the directory; you must make sure
that the correct order of archival is maintained.
<p>To archive your log files, run the <a href="../../utility/db_archive.html">db_archive</a> utility using
the <b>-l</b> option to identify all the database log files, and
copy them to your backup media. If the database log files are stored
in a separate directory from the other database files, it may be simpler
to archive the directory itself instead of the individual files (see
the <a href="../../api_c/env_set_lg_dir.html">DB_ENV->set_lg_dir</a> function for more information).
</ol>
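<p>The hot backup procedure requires a copy utility that reads database
pages atomically. As a rough sketch of what that can look like, the
following fragment copies a file by issuing one <b>read</b> system call
per page-sized block. The function name <b>copy_by_page</b> is
hypothetical and not part of Berkeley DB. The fragment assumes that the
caller passes the database's page size (or a multiple of it) and that
the operating system performs a single read of that size atomically with
respect to concurrent writes, which is a property of the platform and
not something this code can guarantee.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy src to dst, issuing one read(2) call per
 * pagesize-byte block.  Returns 0 on success and -1 on failure.
 */
int
copy_by_page(const char *src, const char *dst, size_t pagesize)
{
    char *buf;
    ssize_t nr, nw, off;
    int in, out, ret;

    if (pagesize == 0 || (buf = malloc(pagesize)) == NULL)
        return (-1);
    in = out = -1;
    ret = -1;
    if ((in = open(src, O_RDONLY)) == -1)
        goto done;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600)) == -1)
        goto done;

    /* Read one block at a time; write it out fully before reading more. */
    while ((nr = read(in, buf, pagesize)) > 0)
        for (off = 0; off < nr; off += nw)
            if ((nw = write(out, buf + off, (size_t)(nr - off))) <= 0)
                goto done;
    if (nr == 0)            /* Clean end-of-file. */
        ret = 0;

done:
    if (in != -1)
        (void)close(in);
    if (out != -1 && close(out) != 0)
        ret = -1;
    free(buf);
    return (ret);
}
```

<p>To preserve the required ordering, a backup script built on a helper
like this would copy every database file first and only then copy the
log files.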
<p>Once these steps are completed, your database can be recovered from
catastrophic failure (see <a href="recovery.html">Recovery procedures</a> for
more information).
<p>To update your snapshot so that recovery from catastrophic failure is
possible up to a new point in time, repeat step 2 under the hot backup
instructions -- copying all existing log files to a backup device. This
is applicable to both standard and hot backups; that is, you can update
snapshots made either way. Each time both the database and log files
are copied to backup media, you may discard all previous database
snapshots and saved log files. Archiving additional log files does not
allow you to discard either previous database snapshots or log files.
<p>The time to restore from catastrophic failure is a function of the
number of log records that have been written since the snapshot was
originally created. Perhaps more importantly, the more separate pieces
of backup media you use, the more likely it is that you will have a
problem reading from one of them. For these reasons, it is often best
to make snapshots on a regular basis.
<p><b>The reliability of your archive media will affect the safety
of your data. For archival safety, ensure that you have multiple copies
of your database backups, verify that your archival media is error-free
and readable, and store copies of your backups offsite!</b>
<p>The functionality provided by the <a href="../../utility/db_archive.html">db_archive</a> utility is also
available directly from the Berkeley DB library. The following code fragment
prints out a list of log and database files that need to be archived:
<p><blockquote><pre>void
log_archlist(DB_ENV *dbenv)
{
    int ret;
    char **begin, **list;

    /* Get the list of database files. */
    if ((ret = log_archive(dbenv,
        &list, DB_ARCH_ABS | DB_ARCH_DATA)) != 0) {
        dbenv->err(dbenv, ret, "log_archive: DB_ARCH_DATA");
        return;
    }
    /* The list is NULL when there are no files to report. */
    if (list != NULL) {
        for (begin = list; *list != NULL; ++list)
            printf("database file: %s\n", *list);
        free(begin);
    }

    /* Get the list of log files. */
    if ((ret = log_archive(dbenv,
        &list, DB_ARCH_ABS | DB_ARCH_LOG)) != 0) {
        dbenv->err(dbenv, ret, "log_archive: DB_ARCH_LOG");
        return;
    }
    if (list != NULL) {
        for (begin = list; *list != NULL; ++list)
            printf("log file: %s\n", *list);
        free(begin);
    }
}</pre></blockquote>
<table width="100%"><tr><td><br></td><td align=right><a href="../../ref/transapp/checkpoint.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../reftoc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/transapp/logfile.html"><img src="../../images/next.gif" alt="Next"></a>
<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>