<!-- $PostgreSQL: pgsql/doc/src/sgml/pgstandby.sgml,v 2.10.2.1 2010/08/17 04:49:33 petere Exp $ -->

<sect1 id="pgstandby">
 <title>pg_standby</title>

 <indexterm zone="pgstandby">
  <primary>pg_standby</primary>
 </indexterm>

 <para>
  <application>pg_standby</> supports creation of a <quote>warm standby</>
  database server.  It is designed to be a production-ready program, as well
  as a customizable template should you require specific modifications.
 </para>

 <para>
  <application>pg_standby</> is designed to be a waiting
  <literal>restore_command</literal>, which is needed to turn a standard
  archive recovery into a warm standby operation.  Other
  configuration is required as well, all of which is described in the main
  server manual (see <xref linkend="warm-standby">).
 </para>

 <para>
  <application>pg_standby</application> features include:
 </para>
 <itemizedlist>
  <listitem>
   <para>
    Written in C, so very portable and easy to install
   </para>
  </listitem>
  <listitem>
   <para>
    Easy-to-modify source code, with specifically designated
    sections to modify for your own needs
   </para>
  </listitem>
  <listitem>
   <para>
    Already tested on Linux and Windows
   </para>
  </listitem>
 </itemizedlist>

 <sect2>
  <title>Usage</title>

  <para>
   To configure a standby
   server to use <application>pg_standby</>, put this into its
   <filename>recovery.conf</filename> configuration file:
  </para>
  <programlisting>
restore_command = 'pg_standby <replaceable>archivelocation</> %f %p %r'
  </programlisting>
  <para>
   where <replaceable>archivelocation</> is the directory from which WAL
   segment files should be restored.
  </para>
  <para>
   The full syntax of <application>pg_standby</>'s command line is
  </para>
  <synopsis>
pg_standby <optional> <replaceable>option</> ... </optional> <replaceable>archivelocation</> <replaceable>nextwalfile</> <replaceable>xlogfilepath</> <optional> <replaceable>restartwalfile</> </optional>
  </synopsis>
  <para>
   When used within <literal>restore_command</literal>, the <literal>%f</> and
   <literal>%p</> macros should be specified for <replaceable>nextwalfile</>
   and <replaceable>xlogfilepath</> respectively, to provide the actual file
   and path required for the restore.
  </para>
  <para>
   If <replaceable>restartwalfile</> is specified, normally by using the
   <literal>%r</literal> macro, then all WAL files logically preceding this
   file will be removed from <replaceable>archivelocation</>. This minimizes
   the number of files that need to be retained, while preserving
   crash-restart capability.  Use of this parameter is appropriate if the
   <replaceable>archivelocation</> is a transient staging area for this
   particular standby server, but <emphasis>not</> when the
   <replaceable>archivelocation</> is intended as a long-term WAL archive area.
  </para>
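  <para>
   As a sketch of that distinction, the two alternative
   <filename>recovery.conf</filename> entries below differ only in whether
   <literal>%r</literal> is passed.  The directory names
   <filename>/path/to/staging</> and <filename>/path/to/archive</> are
   hypothetical placeholders, and only one
   <literal>restore_command</literal> would be active at a time:
  </para>
  <programlisting>
# transient staging area used by this standby only: pass %r so that
# pg_standby removes WAL files that are no longer needed
restore_command = 'pg_standby /path/to/staging %f %p %r'

# long-term WAL archive: omit %r so that nothing is removed
restore_command = 'pg_standby /path/to/archive %f %p'
  </programlisting>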
  <para>
   <application>pg_standby</application> assumes that
   <replaceable>archivelocation</> is a directory readable by the
   server-owning user.  If <replaceable>restartwalfile</> (or <literal>-k</>)
   is specified,
   the <replaceable>archivelocation</> directory must be writable too.
  </para>
  <para>
   There are two ways to fail over to a <quote>warm standby</> database server
   when the master server fails:

   <variablelist>
    <varlistentry>
     <term>Smart Failover</term>
     <listitem>
      <para>
       In smart failover, the server is brought up after applying all WAL
       files available in the archive. This results in zero data loss, even if
       the standby server has fallen behind, but if there is a lot of
       unapplied WAL it can be a long time before the standby server becomes
       ready. To trigger a smart failover, create a trigger file containing
       the word <literal>smart</>, or just create it and leave it empty.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>Fast Failover</term>
     <listitem>
      <para>
       In fast failover, the server is brought up immediately. Any WAL files
       in the archive that have not yet been applied will be ignored, and
       all transactions in those files are lost. To trigger a fast failover,
       create a trigger file and write the word <literal>fast</> into it.
       <application>pg_standby</> can also be configured to execute a fast
       failover automatically if no new WAL file appears within a defined
       interval.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
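  <para>
   For instance, assuming the standby was started with
   <literal>-t /tmp/pgsql.trigger.5442</> (the trigger file name is only an
   illustration and must match whatever was passed to <literal>-t</>),
   either kind of failover could be requested from a Unix shell roughly as
   follows:
  </para>
  <programlisting>
# smart failover: apply all WAL available in the archive, then come up
echo smart > /tmp/pgsql.trigger.5442

# fast failover: come up immediately, ignoring unapplied WAL
echo fast > /tmp/pgsql.trigger.5442
  </programlisting>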

  <table>
   <title><application>pg_standby</> options</title>
   <tgroup cols="3">
     <thead>
     <row>
      <entry>Option</entry>
      <entry>Default</entry>
      <entry>Description</entry>
     </row>
    </thead>
    <tbody>
     <row>
      <entry><literal>-c</></entry>
      <entry>yes</entry>
      <entry>
       Use <literal>cp</> or <literal>copy</> command to restore WAL files
       from archive.
      </entry>
     </row>
     <row>
      <entry><literal>-d</></entry>
      <entry>no</entry>
      <entry>Print lots of debug logging output on <filename>stderr</>.</entry>
     </row>
     <row>
      <entry><literal>-k</> <replaceable>numfiles</></entry>
      <entry>0</entry>
      <entry>
       Remove files from <replaceable>archivelocation</replaceable> so that
       no more than this many WAL files before the current one are kept in the
       archive.  Zero (the default) means not to remove any files from
       <replaceable>archivelocation</replaceable>.
       This parameter will be silently ignored if
       <replaceable>restartwalfile</replaceable> is specified, since that
       specification method is more accurate in determining the correct
       archive cut-off point.
       Use of this parameter is <emphasis>deprecated</> as of
       <productname>PostgreSQL</> 8.3; it is safer and more efficient to
       specify a <replaceable>restartwalfile</replaceable> parameter.  Too
       small a setting could result in removal of files that are still needed
       for a restart of the standby server, while too large a setting wastes
       archive space.
      </entry>
     </row>
     <row>
      <entry><literal>-r</> <replaceable>maxretries</></entry>
      <entry>3</entry>
      <entry>
       Set the maximum number of times to retry the copy command if it
       fails.  After each failure, <application>pg_standby</> waits for
       <replaceable>sleeptime</> * <replaceable>num_retries</> seconds,
       so that the wait time increases progressively.  With the default
       settings, it therefore waits 5, 10, and then 15 seconds before
       reporting the failure back to the standby server.  The server
       interprets this as the end of recovery, and the standby comes
       up fully as a result.
      </entry>
     </row>
     <row>
      <entry><literal>-s</> <replaceable>sleeptime</></entry>
      <entry>5</entry>
      <entry>
       Set the number of seconds (up to 60) to sleep between tests to see
       if the WAL file to be restored is available in the archive yet.
       The default setting is not necessarily recommended;
       consult <xref linkend="warm-standby"> for discussion.
      </entry>
     </row>
     <row>
      <entry><literal>-t</> <replaceable>triggerfile</></entry>
      <entry>none</entry>
      <entry>
       Specify a trigger file whose presence should cause failover.
       It is recommended that you use a structured filename to
       avoid confusion as to which server is being triggered
       when multiple servers exist on the same system; for example
       <filename>/tmp/pgsql.trigger.5432</>.
      </entry>
     </row>
     <row>
      <entry><literal>-w</> <replaceable>maxwaittime</></entry>
      <entry>0</entry>
      <entry>
       Set the maximum number of seconds to wait for the next WAL file,
       after which a fast failover will be performed.
       A setting of zero (the default) means wait forever.
       The default setting is not necessarily recommended;
       consult <xref linkend="warm-standby"> for discussion.
      </entry>
     </row>
    </tbody>
   </tgroup>
  </table>
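  <para>
   To illustrate how the timing options combine, the hypothetical entry
   below polls every 2 seconds and performs a fast failover if no new WAL
   file has appeared after 300 seconds.  The 300-second limit is an
   arbitrary value chosen for illustration, not a recommendation, and the
   archive path is deliberately left elided:
  </para>
  <programlisting>
# -s 2    check for the next WAL file every 2 seconds
# -w 300  give up and perform a fast failover after 300 seconds without new WAL
# -t ...  or fail over on demand when the trigger file appears
restore_command = 'pg_standby -s 2 -w 300 -t /tmp/pgsql.trigger.5442 .../archive %f %p %r'
  </programlisting>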
 </sect2>

 <sect2>
  <title>Examples</title>

  <para>On Linux or Unix systems, you might use:</para>

  <programlisting>
archive_command = 'cp %p .../archive/%f'

restore_command = 'pg_standby -d -s 2 -t /tmp/pgsql.trigger.5442 .../archive %f %p %r 2>>standby.log'

recovery_end_command = 'rm -f /tmp/pgsql.trigger.5442'
  </programlisting>
  <para>
   where the archive directory is physically located on the standby server,
   so that the <literal>archive_command</> is accessing it across NFS,
   but the files are local to the standby (enabling use of <literal>ln</>).
   This will:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     produce debugging output in <filename>standby.log</>
    </para>
   </listitem>
   <listitem>
    <para>
     sleep for 2 seconds between checks for next WAL file availability
    </para>
   </listitem>
   <listitem>
    <para>
     stop waiting only when a trigger file called
     <filename>/tmp/pgsql.trigger.5442</> appears,
     and perform failover according to its content
    </para>
   </listitem>
   <listitem>
    <para>
     remove the trigger file when recovery ends
    </para>
   </listitem>
   <listitem>
    <para>
     remove no-longer-needed files from the archive directory
    </para>
   </listitem>
  </itemizedlist>

  <para>On Windows, you might use:</para>

  <programlisting>
archive_command = 'copy %p ...\\archive\\%f'

restore_command = 'pg_standby -d -s 5 -t C:\pgsql.trigger.5442 ...\archive %f %p %r 2>>standby.log'

recovery_end_command = 'del C:\pgsql.trigger.5442'
  </programlisting>
  <para>
   Note that backslashes need to be doubled in the
   <literal>archive_command</>, but <emphasis>not</emphasis> in the
   <literal>restore_command</> or <literal>recovery_end_command</>.
   This will:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     use the <literal>copy</> command to restore WAL files from archive
    </para>
   </listitem>
   <listitem>
    <para>
     produce debugging output in <filename>standby.log</>
    </para>
   </listitem>
   <listitem>
    <para>
     sleep for 5 seconds between checks for next WAL file availability
    </para>
   </listitem>
   <listitem>
    <para>
     stop waiting only when a trigger file called
     <filename>C:\pgsql.trigger.5442</> appears,
     and perform failover according to its content
    </para>
   </listitem>
   <listitem>
    <para>
     remove the trigger file when recovery ends
    </para>
   </listitem>
   <listitem>
    <para>
     remove no-longer-needed files from the archive directory
    </para>
   </listitem>
  </itemizedlist>

  <para>
   The <literal>copy</> command on Windows sets the final file size
   before the file is completely copied, which would ordinarily confuse
   <application>pg_standby</application>.  Therefore
   <application>pg_standby</application> waits <literal>sleeptime</>
   seconds once it sees the proper file size.  GNUWin32's <literal>cp</>
   sets the file size only after the file copy is complete.
  </para>

  <para>
   Since the Windows example uses <literal>copy</> at both ends, either
   or both servers might be accessing the archive directory across the
   network.
  </para>

 </sect2>

 <sect2>
  <title>Supported server versions</title>

  <para>
   <application>pg_standby</application> is designed to work with
   <productname>PostgreSQL</> 8.2 and later.
  </para>
  <para>
   <productname>PostgreSQL</> 8.3 provides the <literal>%r</literal> macro,
   which is designed to let <application>pg_standby</application> know the
   last file it needs to keep.  With <productname>PostgreSQL</> 8.2, the
   <literal>-k</literal> option must be used if archive cleanup is
   required.  This option remains available in 8.3, but its use is deprecated.
  </para>
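  <para>
   A minimal sketch of the difference, using an arbitrary illustrative
   value of 255 for <literal>-k</literal> and leaving the archive path
   elided:
  </para>
  <programlisting>
# PostgreSQL 8.2: no %r macro, so archive cleanup relies on -k
restore_command = 'pg_standby -k 255 .../archive %f %p'

# PostgreSQL 8.3 and later: let %r determine the cut-off point
restore_command = 'pg_standby .../archive %f %p %r'
  </programlisting>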
  <para>
   <productname>PostgreSQL</> 8.4 provides the
   <literal>recovery_end_command</literal> option.  Without this option
   a leftover trigger file can be hazardous.
  </para>
 </sect2>

 <sect2>
  <title>Author</title>

  <para>
   Simon Riggs <email>simon@2ndquadrant.com</email>
  </para>
 </sect2>

</sect1>