<chapter id="maintenance">
 <title>Routine Database Maintenance Tasks</title>

 <indexterm zone="maintenance">
  <primary>maintenance</primary>
 </indexterm>

 <indexterm zone="maintenance">
  <primary>routine maintenance</primary>
 </indexterm>
 <para>
  <productname>PostgreSQL</>, like any database software, requires that certain tasks
  be performed regularly to achieve optimum performance. The tasks
  discussed here are <emphasis>required</emphasis>, but they
  are repetitive in nature and can easily be automated using standard
  tools such as <application>cron</application> scripts or
  Windows' <application>Task Scheduler</>. But it is the database
  administrator's responsibility to set up appropriate scripts, and to
  check that they execute successfully.
 </para>
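 <para>
  For example, the nightly maintenance work could be driven by a
  <application>cron</application> entry along these lines (a sketch only;
  the installation path and schedule are illustrative, not recommendations):
<programlisting>
# as the postgres user: vacuum and analyze every database at 3:00 AM
0 3 * * *   /usr/local/pgsql/bin/vacuumdb --all --analyze --quiet
</programlisting>
 </para>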
 <para>
  One obvious maintenance task is creation of backup copies of the data on a
  regular schedule. Without a recent backup, you have no chance of recovery
  after a catastrophe (disk failure, fire, mistakenly dropping a critical
  table, etc.). The backup and recovery mechanisms available in
  <productname>PostgreSQL</productname> are discussed at length in
  <xref linkend="backup">.
 </para>
 <para>
  The other main category of maintenance task is periodic <quote>vacuuming</>
  of the database. This activity is discussed in
  <xref linkend="routine-vacuuming">. Closely related to this is updating
  the statistics that will be used by the query planner, as discussed in
  <xref linkend="vacuum-for-statistics">.
 </para>
 <para>
  Another task that might need periodic attention is log file management.
  This is discussed in <xref linkend="logfile-maintenance">.
 </para>
 <para>
  <productname>PostgreSQL</productname> is low-maintenance compared
  to some other database management systems. Nonetheless,
  appropriate attention to these tasks will go far towards ensuring a
  pleasant and productive experience with the system.
 </para>
 <sect1 id="routine-vacuuming">
  <title>Routine Vacuuming</title>

  <indexterm zone="routine-vacuuming">
   <primary>vacuum</primary>
  </indexterm>
  <para>
   <productname>PostgreSQL</productname> databases require periodic
   maintenance known as <firstterm>vacuuming</>. For many installations, it
   is sufficient to let vacuuming be performed by the <firstterm>autovacuum
   daemon</>, which is described in <xref linkend="autovacuum">. You might
   need to adjust the autovacuuming parameters described there to obtain best
   results for your situation. Some database administrators will want to
   supplement or replace the daemon's activities with manually-managed
   <command>VACUUM</> commands, which typically are executed according to a
   schedule by <application>cron</application> or <application>Task
   Scheduler</> scripts. To set up manually-managed vacuuming properly,
   it is essential to understand the issues discussed in the next few
   subsections. Administrators who rely on autovacuuming may still wish
   to skim this material to help them understand and adjust autovacuuming.
  </para>
  <sect2 id="vacuum-basics">
   <title>Vacuuming Basics</title>

   <para>
    <productname>PostgreSQL</productname>'s
    <xref linkend="sql-vacuum" endterm="sql-vacuum-title"> command has to
    process each table on a regular basis for several reasons:

    <orderedlist>
     <listitem>
      <simpara>To recover or reuse disk space occupied by updated or deleted
      rows.</simpara>
     </listitem>

     <listitem>
      <simpara>To update data statistics used by the
      <productname>PostgreSQL</productname> query planner.</simpara>
     </listitem>

     <listitem>
      <simpara>To protect against loss of very old data due to
      <firstterm>transaction ID wraparound</>.</simpara>
     </listitem>
    </orderedlist>

    Each of these reasons dictates performing <command>VACUUM</> operations
    of varying frequency and scope, as explained in the following subsections.
   </para>
   <para>
    There are two variants of <command>VACUUM</>: standard <command>VACUUM</>
    and <command>VACUUM FULL</>. <command>VACUUM FULL</> can reclaim more
    disk space but runs much more slowly. Also,
    the standard form of <command>VACUUM</> can run in parallel with production
    database operations. (Commands such as <command>SELECT</command>,
    <command>INSERT</command>, <command>UPDATE</command>, and
    <command>DELETE</command> will continue to function normally, though you
    will not be able to modify the definition of a table with commands such as
    <command>ALTER TABLE</command> while it is being vacuumed.)
    <command>VACUUM FULL</> requires an exclusive lock on the table it is
    working on, and therefore cannot be done in parallel with other use
    of the table. Another disadvantage of <command>VACUUM FULL</> is that
    while it reduces table size, it does not reduce index size proportionally;
    in fact it can make indexes <emphasis>larger</>. Generally, therefore,
    administrators should strive to use standard <command>VACUUM</> and
    avoid <command>VACUUM FULL</>.
   </para>
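   <para>
    For example (the table name is a placeholder):
<programlisting>
VACUUM mytable;         -- reclaims space for reuse; runs alongside normal traffic
VACUUM FULL mytable;    -- compacts the table, but takes an exclusive lock
</programlisting>
   </para>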
   <para>
    <command>VACUUM</command> creates a substantial amount of I/O
    traffic, which can cause poor performance for other active sessions.
    There are configuration parameters that can be adjusted to reduce the
    performance impact of background vacuuming — see
    <xref linkend="runtime-config-resource-vacuum-cost">.
   </para>
  </sect2>

  <sect2 id="vacuum-for-space-recovery">
   <title>Recovering Disk Space</title>

   <indexterm zone="vacuum-for-space-recovery">
    <primary>disk space</primary>
   </indexterm>
   <para>
    In <productname>PostgreSQL</productname>, an
    <command>UPDATE</> or <command>DELETE</> of a row does not
    immediately remove the old version of the row.
    This approach is necessary to gain the benefits of multiversion
    concurrency control (see <xref linkend="mvcc">): the row version
    must not be deleted while it is still potentially visible to other
    transactions. But eventually, an outdated or deleted row version is no
    longer of interest to any transaction. The space it occupies must then be
    reclaimed for reuse by new rows, to avoid infinite growth of disk
    space requirements. This is done by running <command>VACUUM</>.
   </para>
   <para>
    The standard form of <command>VACUUM</command> removes dead row
    versions in tables and indexes and marks the space available for
    future reuse. However, it will not return the space to the operating
    system, except in the special case where one or more pages at the
    end of a table become entirely free and an exclusive table lock can be
    easily obtained. In contrast, <command>VACUUM FULL</> actively compacts
    tables by moving row versions to earlier pages. It is thus able to
    force pages at the end of the table to become entirely free, whereupon
    it will return them to the operating system. However, if many rows
    must be moved, this can take a long time. Also, moving a row requires
    transiently making duplicate index entries for it (the entry pointing
    to its new location must be made before the old entry can be removed);
    so moving a lot of rows this way causes severe index bloat.
   </para>
   <para>
    The usual goal of routine vacuuming is to do standard <command>VACUUM</>s
    often enough to avoid needing <command>VACUUM FULL</>. The
    autovacuum daemon attempts to work this way, and in fact will
    never issue <command>VACUUM FULL</>. In this approach, the idea
    is not to keep tables at their minimum size, but to maintain steady-state
    usage of disk space: each table occupies space equivalent to its
    minimum size plus however much space gets used up between vacuumings.
    Although <command>VACUUM FULL</> can be used to shrink a table back
    to its minimum size and return the disk space to the operating system,
    there is not much point in this if the table will just grow again in the
    future. Thus, moderately-frequent standard <command>VACUUM</> runs are a
    better approach than infrequent <command>VACUUM FULL</> runs for
    maintaining heavily-updated tables.
   </para>
   <para>
    Some administrators prefer to schedule vacuuming themselves, for example
    doing all the work at night when load is low.
    The difficulty with doing vacuuming according to a fixed schedule
    is that if a table has an unexpected spike in update activity, it may
    get bloated to the point that <command>VACUUM FULL</> is really necessary
    to reclaim space. Using the autovacuum daemon alleviates this problem,
    since the daemon schedules vacuuming dynamically in response to update
    activity. It is unwise to disable the daemon completely unless you
    have an extremely predictable workload. One possible compromise is
    to set the daemon's parameters so that it will only react to unusually
    heavy update activity, thus keeping things from getting out of hand,
    while scheduled <command>VACUUM</>s are expected to do the bulk of the
    work when the load is typical.
   </para>
   <para>
    For those not using autovacuum, a typical approach is to schedule a
    database-wide <command>VACUUM</> once a day during a low-usage period,
    supplemented by more frequent vacuuming of heavily-updated tables as
    necessary. (Some installations with extremely high update rates vacuum
    their busiest tables as often as once every few minutes.) If you have
    multiple databases in a cluster, don't forget to
    <command>VACUUM</command> each one; the program <xref
    linkend="app-vacuumdb" endterm="app-vacuumdb-title"> might be helpful.
   </para>
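   <para>
    For example, the daily database-wide run could be a single
    <command>vacuumdb</> invocation from the shell (a sketch; add
    <option>--analyze</> to refresh planner statistics at the same time):
<programlisting>
vacuumdb --all
</programlisting>
   </para>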
   <para>
    Neither form of <command>VACUUM</> is entirely satisfactory when
    a table contains large numbers of dead row versions as a result of
    massive update or delete activity. If you have such a table and
    you need to reclaim the excess disk space it occupies, the best
    way is to use <xref linkend="sql-cluster" endterm="sql-cluster-title">
    or one of the table-rewriting variants of
    <xref linkend="sql-altertable" endterm="sql-altertable-title">.
    These commands write an entire new copy of the table and build
    new indexes for it. Like <command>VACUUM FULL</>, they require an
    exclusive lock. Note that they also temporarily use extra disk
    space, since the old copies of the table and indexes can't be
    released until the new ones are complete. In the worst case where
    your disk is nearly full, <command>VACUUM FULL</> may be the only
    workable alternative.
   </para>
   <para>
    If you have a table whose entire contents are deleted on a periodic
    basis, consider doing it with
    <xref linkend="sql-truncate" endterm="sql-truncate-title"> rather
    than using <command>DELETE</command> followed by
    <command>VACUUM</command>. <command>TRUNCATE</command> removes the
    entire content of the table immediately, without requiring a
    subsequent <command>VACUUM</command> or <command>VACUUM
    FULL</command> to reclaim the now-unused disk space.
    The disadvantage is that strict MVCC semantics are violated.
   </para>
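   <para>
    For example, a job that periodically empties a work table (the table
    name is a placeholder) can be simplified from:
<programlisting>
DELETE FROM work_queue;
VACUUM work_queue;
</programlisting>
    to just:
<programlisting>
TRUNCATE work_queue;
</programlisting>
   </para>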
  </sect2>

  <sect2 id="vacuum-for-statistics">
   <title>Updating Planner Statistics</title>

   <indexterm zone="vacuum-for-statistics">
    <primary>statistics</primary>
    <secondary>of the planner</secondary>
   </indexterm>

   <indexterm zone="vacuum-for-statistics">
    <primary>ANALYZE</primary>
   </indexterm>
   <para>
    The <productname>PostgreSQL</productname> query planner relies on
    statistical information about the contents of tables in order to
    generate good plans for queries. These statistics are gathered by
    the <xref linkend="sql-analyze" endterm="sql-analyze-title"> command,
    which can be invoked by itself or
    as an optional step in <command>VACUUM</>. It is important to have
    reasonably accurate statistics; otherwise, poor plan choices might
    degrade database performance.
   </para>
   <para>
    The autovacuum daemon, if enabled, will automatically issue
    <command>ANALYZE</> commands whenever the content of a table has
    changed sufficiently. However, administrators might prefer to rely
    on manually-scheduled <command>ANALYZE</> operations, particularly
    if it is known that update activity on a table will not affect the
    statistics of <quote>interesting</> columns. The daemon schedules
    <command>ANALYZE</> strictly as a function of the number of rows
    inserted or updated; it has no knowledge of whether that will lead
    to meaningful statistical changes.
   </para>
   <para>
    As with vacuuming for space recovery, frequent updates of statistics
    are more useful for heavily-updated tables than for seldom-updated
    ones. But even for a heavily-updated table, there might be no need for
    statistics updates if the statistical distribution of the data is
    not changing much. A simple rule of thumb is to think about how much
    the minimum and maximum values of the columns in the table change.
    For example, a <type>timestamp</type> column that contains the time
    of row update will have a constantly-increasing maximum value as
    rows are added and updated; such a column will probably need more
    frequent statistics updates than, say, a column containing URLs for
    pages accessed on a website. The URL column might receive changes just
    as often, but the statistical distribution of its values probably
    changes relatively slowly.
   </para>
   <para>
    It is possible to run <command>ANALYZE</> on specific tables and even
    just specific columns of a table, so the flexibility exists to update some
    statistics more frequently than others if your application requires it.
    In practice, however, it is usually best to just analyze the entire
    database, because it is a fast operation. <command>ANALYZE</> uses a
    statistical random sampling of the rows of a table rather than reading
    every single row.
   </para>
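   <para>
    For example (table and column names are placeholders):
<programlisting>
ANALYZE;                       -- analyze every table in the database
ANALYZE mytable;               -- analyze a single table
ANALYZE mytable (mycolumn);    -- analyze just one column
</programlisting>
   </para>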
   <para>
    Although per-column tweaking of <command>ANALYZE</> frequency might not be
    very productive, you might well find it worthwhile to do per-column
    adjustment of the level of detail of the statistics collected by
    <command>ANALYZE</>. Columns that are heavily used in <literal>WHERE</>
    clauses and have highly irregular data distributions might require a
    finer-grain data histogram than other columns. See <command>ALTER TABLE
    SET STATISTICS</>, or change the database-wide default using the <xref
    linkend="guc-default-statistics-target"> configuration parameter.
   </para>
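   <para>
    For instance, to collect a finer-grained histogram for one
    heavily-filtered column (the names and the target value of 500 are
    illustrative; the usual default target is 100):
<programlisting>
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;    -- rebuild the statistics with the new target
</programlisting>
   </para>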
  </sect2>

  <sect2 id="vacuum-for-wraparound">
   <title>Preventing Transaction ID Wraparound Failures</title>

   <indexterm zone="vacuum-for-wraparound">
    <primary>transaction ID</primary>
    <secondary>wraparound</secondary>
   </indexterm>
   <para>
    <productname>PostgreSQL</productname>'s MVCC transaction semantics
    depend on being able to compare transaction ID (<acronym>XID</>)
    numbers: a row version with an insertion XID greater than the current
    transaction's XID is <quote>in the future</> and should not be visible
    to the current transaction. But since transaction IDs have limited size
    (32 bits at this writing), a cluster that runs for a long time (more
    than 4 billion transactions) would suffer <firstterm>transaction ID
    wraparound</>: the XID counter wraps around to zero, and all of a sudden
    transactions that were in the past appear to be in the future — which
    means their outputs become invisible. In short, catastrophic data loss.
    (Actually the data is still there, but that's cold comfort if you cannot
    get at it.) To avoid this, it is necessary to vacuum every table
    in every database at least once every two billion transactions.
   </para>
   <para>
    The reason that periodic vacuuming solves the problem is that
    <productname>PostgreSQL</productname> distinguishes a special XID,
    <literal>FrozenXID</>. This XID is always considered older
    than every normal XID. Normal XIDs are
    compared using modulo-2<superscript>31</> arithmetic. This means
    that for every normal XID, there are two billion XIDs that are
    <quote>older</> and two billion that are <quote>newer</>; another
    way to say it is that the normal XID space is circular with no
    endpoint. Therefore, once a row version has been created with a particular
    normal XID, the row version will appear to be <quote>in the past</> for
    the next two billion transactions, no matter which normal XID we are
    talking about. If the row version still exists after more than two billion
    transactions, it will suddenly appear to be in the future. To
    prevent data loss, old row versions must be reassigned the XID
    <literal>FrozenXID</> sometime before they reach the
    two-billion-transactions-old mark. Once they are assigned this
    special XID, they will appear to be <quote>in the past</> to all
    normal transactions regardless of wraparound issues, and so such
    row versions will be good until deleted, no matter how long that is.
    This reassignment of old XIDs is handled by <command>VACUUM</>.
   </para>
   <para>
    <command>VACUUM</>'s behavior is controlled by two configuration
    parameters: <xref linkend="guc-vacuum-freeze-min-age"> and
    <xref linkend="guc-vacuum-freeze-table-age">.
    <varname>vacuum_freeze_table_age</> controls when <command>VACUUM</>
    performs a full sweep of the table, in order to replace old XID values
    with <literal>FrozenXID</>. <varname>vacuum_freeze_min_age</>
    controls how old an XID value has to be before it's replaced with
    <literal>FrozenXID</>. Larger values of these settings
    preserve transactional information longer, while smaller values increase
    the number of transactions that can elapse before the table must be
    vacuumed again.
   </para>
   <para>
    The maximum time that a table can go unvacuumed is two billion
    transactions minus the <varname>vacuum_freeze_min_age</> that was used
    when <command>VACUUM</> last scanned the whole table. If it were to go
    unvacuumed for longer than
    that, data loss could result. To ensure that this does not happen,
    autovacuum is invoked on any table that might contain XIDs older than the
    age specified by the configuration parameter <xref
    linkend="guc-autovacuum-freeze-max-age">. (This will happen even if
    autovacuum is otherwise disabled.)
   </para>
   <para>
    This implies that if a table is not otherwise vacuumed,
    autovacuum will be invoked on it approximately once every
    <varname>autovacuum_freeze_max_age</> minus
    <varname>vacuum_freeze_min_age</> transactions.
    For tables that are regularly vacuumed for space reclamation purposes,
    this is of little importance. However, for static tables
    (including tables that receive inserts, but no updates or deletes),
    there is no need for vacuuming for space reclamation, and so it can
    be useful to try to maximize the interval between forced autovacuums
    on very large static tables. Obviously one can do this either by
    increasing <varname>autovacuum_freeze_max_age</> or by decreasing
    <varname>vacuum_freeze_min_age</>.
   </para>
   <para>
    The sole disadvantage of increasing <varname>vacuum_freeze_table_age</>
    and <varname>autovacuum_freeze_max_age</>
    is that the <filename>pg_clog</> subdirectory of the database cluster
    will take more space, because it must store the commit status for all
    transactions back to the <varname>autovacuum_freeze_max_age</> horizon.
    The commit status uses two bits per transaction, so if
    <varname>autovacuum_freeze_max_age</> has its maximum allowed value of
    a little less than two billion, <filename>pg_clog</> can be expected to
    grow to about half a gigabyte. If this is trivial compared to your
    total database size, setting <varname>autovacuum_freeze_max_age</> and
    <varname>vacuum_freeze_table_age</varname> to their maximum allowed values
    is recommended. Otherwise, set them depending
    on what you are willing to allow for <filename>pg_clog</> storage.
    (The default, 200 million transactions, translates to about 50MB of
    <filename>pg_clog</> storage.)
   </para>
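   <para>
    The arithmetic behind these figures: the commit status occupies two bits
    per transaction, so
<programlisting>
  200,000,000 transactions * 2 bits / 8 =  50,000,000 bytes, about 50MB
2,000,000,000 transactions * 2 bits / 8 = 500,000,000 bytes, about half a gigabyte
</programlisting>
   </para>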
   <para>
    One disadvantage of decreasing <varname>vacuum_freeze_min_age</> is that
    it might cause <command>VACUUM</> to do useless work: changing a table row's
    XID to <literal>FrozenXID</> is a waste of time if the row is modified
    soon thereafter (causing it to acquire a new XID). So the setting should
    be large enough that rows are not frozen until they are unlikely to change
    any more. Another disadvantage of decreasing this setting is
    that details about exactly which transaction inserted or modified a
    row will be lost sooner. This information sometimes comes in handy,
    particularly when trying to analyze what went wrong after a database
    failure. For these two reasons, decreasing this setting is not
    recommended except for completely static tables.
   </para>
   <para>
    To track the age of the oldest XIDs in a database,
    <command>VACUUM</> stores XID
    statistics in the system tables <structname>pg_class</> and
    <structname>pg_database</>. In particular,
    the <structfield>relfrozenxid</> column of a table's
    <structname>pg_class</> row contains the freeze cutoff XID that was used
    by the last <command>VACUUM</> for that table. All normal
    XIDs older than this cutoff XID are guaranteed to have been replaced by
    <literal>FrozenXID</> within the table. Similarly,
    the <structfield>datfrozenxid</> column of a database's
    <structname>pg_database</> row is a lower bound on the normal XIDs
    appearing in that database — it is just the minimum of the
    per-table <structfield>relfrozenxid</> values within the database.
    A convenient way to
    examine this information is to execute queries such as:

<programlisting>
SELECT relname, age(relfrozenxid) FROM pg_class WHERE relkind = 'r';
SELECT datname, age(datfrozenxid) FROM pg_database;
</programlisting>
   </para>
   <para>
    The <literal>age</> column measures the number of transactions from the
    cutoff XID to the current transaction's XID. When <command>VACUUM</>
    scans the whole table, once it has finished <literal>age(relfrozenxid)</>
    should be a little more than the <varname>vacuum_freeze_min_age</> setting
    that was used (more by the number of transactions started since the
    <command>VACUUM</> started).
   </para>
   <para>
    <command>VACUUM</> normally only scans pages that have been modified
    since the last vacuum, but <structfield>relfrozenxid</> can only be advanced
    when the whole table is scanned. The whole table is scanned when
    <structfield>relfrozenxid</> is more than
    <varname>vacuum_freeze_table_age</> transactions old, when the
    <command>VACUUM FREEZE</> command is used, or when all pages happen to
    require vacuuming to remove dead row versions. If no whole-table-scanning
    <command>VACUUM</> is issued on the table until
    <varname>autovacuum_freeze_max_age</> is reached, an autovacuum will soon
    be forced for the table.
   </para>
   <para>
    If for some reason autovacuum fails to clear old XIDs from a table,
    the system will begin to emit warning messages like this when the
    database's oldest XIDs reach ten million transactions from the wraparound
    point:

<programlisting>
WARNING:  database "mydb" must be vacuumed within 177009986 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in "mydb".
</programlisting>

    (A manual <command>VACUUM</> should fix the problem, as suggested by the
    hint; but note that the <command>VACUUM</> must be performed by a
    superuser, else it will fail to process system catalogs and thus not
    be able to advance the database's <structfield>datfrozenxid</>.)
    If these warnings are
    ignored, the system will shut down and refuse to execute any new
    transactions once there are fewer than 1 million transactions left
    until wraparound:

<programlisting>
ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
HINT:  Stop the postmaster and use a standalone backend to VACUUM in "mydb".
</programlisting>

    The 1-million-transaction safety margin exists to let the
    administrator recover without data loss, by manually executing the
    required <command>VACUUM</> commands. However, since the system will not
    execute commands once it has gone into the safety shutdown mode,
    the only way to do this is to stop the server and use a single-user
    backend to execute <command>VACUUM</>. The shutdown mode is not enforced
    by a single-user backend. See the <xref linkend="app-postgres"> reference
    page for details about using a single-user backend.
   </para>
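   <para>
    A sketch of that recovery procedure (the data directory path and
    database name are illustrative; consult the reference page cited above
    for the authoritative invocation):
<programlisting>
$ pg_ctl stop
$ postgres --single -D /usr/local/pgsql/data mydb

backend> VACUUM;
</programlisting>
    Repeat for each database in the cluster, then restart the server
    normally.
   </para>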
  </sect2>

  <sect2 id="autovacuum">
   <title id="autovacuum-title">The Autovacuum Daemon</title>

   <indexterm>
    <primary>autovacuum</primary>
    <secondary>general information</secondary>
   </indexterm>
   <para>
    <productname>PostgreSQL</productname> has an optional but highly
    recommended feature called <firstterm>autovacuum</firstterm>,
    whose purpose is to automate the execution of
    <command>VACUUM</command> and <command>ANALYZE</command> commands.
    When enabled, autovacuum checks for
    tables that have had a large number of inserted, updated or deleted
    tuples. These checks use the statistics collection facility;
    therefore, autovacuum cannot be used unless <xref
    linkend="guc-track-counts"> is set to <literal>true</literal>.
    In the default configuration, autovacuuming is enabled and the related
    configuration parameters are appropriately set.
   </para>
   <para>
    The <quote>autovacuum daemon</> actually consists of multiple processes.
    There is a persistent daemon process, called the
    <firstterm>autovacuum launcher</firstterm>, which is in charge of starting
    <firstterm>autovacuum worker</firstterm> processes for all databases. The
    launcher will distribute the work across time, attempting to start one
    worker on each database every <xref linkend="guc-autovacuum-naptime">
    seconds. One worker will be launched for each database, with a maximum
    of <xref linkend="guc-autovacuum-max-workers"> processes running at the
    same time. If there are more than
    <xref linkend="guc-autovacuum-max-workers"> databases to be processed,
    the next database will be processed as soon as the first worker finishes.
    Each worker process will check each table within its database and
    execute <command>VACUUM</> and/or <command>ANALYZE</> as needed.
   </para>
   <para>
    The <xref linkend="guc-autovacuum-max-workers"> setting limits how many
    workers may be running at any time. If several large tables all become
    eligible for vacuuming in a short amount of time, all autovacuum workers
    may become occupied with vacuuming those tables for a long period.
    This would result
    in other tables and databases not being vacuumed until a worker became
    available. There is no limit on how many workers might be in a
    single database, but workers do try to avoid repeating work that has
    already been done by other workers. Note that the number of running
    workers does not count towards the <xref linkend="guc-max-connections"> or
    the <xref linkend="guc-superuser-reserved-connections"> limits.
   </para>
   <para>
    Tables whose <structfield>relfrozenxid</> value is more than
    <varname>autovacuum_freeze_max_age</> transactions old are always
    vacuumed (this also applies to those tables whose freeze max age has
    been modified via storage parameters; see below). Otherwise, if the
    number of tuples obsoleted since the last
    <command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
    table is vacuumed. The vacuum threshold is defined as:

<programlisting>
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
</programlisting>

    where the vacuum base threshold is
    <xref linkend="guc-autovacuum-vacuum-threshold">,
    the vacuum scale factor is
    <xref linkend="guc-autovacuum-vacuum-scale-factor">,
    and the number of tuples is
    <structname>pg_class</structname>.<structfield>reltuples</structfield>.
    The number of obsolete tuples is obtained from the statistics
    collector; it is a semi-accurate count updated by each
    <command>UPDATE</command> and <command>DELETE</command> operation. (It
    is only semi-accurate because some information might be lost under heavy
    load.)
   </para>
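   <para>
    As a worked example, with the default base threshold of 50 tuples and
    scale factor of 0.2, a table whose <structfield>reltuples</> is 10,000
    becomes eligible for vacuuming once roughly this many tuples have been
    obsoleted:
<programlisting>
vacuum threshold = 50 + 0.2 * 10000 = 2050
</programlisting>
   </para>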
   <para>
    For analyze, a similar condition is used: the threshold, defined as:

<programlisting>
analyze threshold = analyze base threshold + analyze scale factor * number of tuples
</programlisting>

    is compared to the total number of tuples inserted or updated
    since the last <command>ANALYZE</command>.
   </para>
   <para>
    The default thresholds and scale factors are taken from
    <filename>postgresql.conf</filename>, but it is possible to override them
    on a table-by-table basis; see
    <xref linkend="sql-createtable-storage-parameters"
    endterm="sql-createtable-storage-parameters-title"> for more information.
    If a setting
    has been changed via storage parameters, that value is used; otherwise the
    global settings are used. See <xref linkend="runtime-config-autovacuum"> for
    more details on the global settings.
   </para>
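   <para>
    For example (the table name and value are illustrative), the scale
    factor can be lowered for one very large table so that autovacuum
    processes it after a smaller fraction of it has changed:
<programlisting>
ALTER TABLE big_history SET (autovacuum_vacuum_scale_factor = 0.01);
</programlisting>
   </para>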
   <para>
    Besides the base threshold values and scale factors, there are six
    more autovacuum parameters that can be set for each table via
    storage parameters.
    The first parameter, <literal>autovacuum_enabled</>,
    can be set to <literal>false</literal> to instruct the autovacuum daemon
    to skip that particular table entirely. In this case
    autovacuum will only touch the table if it must do so
    to prevent transaction ID wraparound.
    Another two parameters,
    <literal>autovacuum_vacuum_cost_delay</literal> and
    <literal>autovacuum_vacuum_cost_limit</literal>, are used to set
    table-specific values for the
    <xref linkend="runtime-config-resource-vacuum-cost"
    endterm="runtime-config-resource-vacuum-cost-title">
    feature. Finally,
    <literal>autovacuum_freeze_min_age</literal>,
    <literal>autovacuum_freeze_max_age</literal> and
    <literal>autovacuum_freeze_table_age</literal> are used to set
    values for <xref linkend="guc-vacuum-freeze-min-age">,
    <xref linkend="guc-autovacuum-freeze-max-age"> and
    <xref linkend="guc-vacuum-freeze-table-age"> respectively.
   </para>
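   <para>
    For instance (the table name is a placeholder), a staging table that a
    batch job rewrites wholesale could be excluded from routine
    autovacuuming:
<programlisting>
ALTER TABLE staging_load SET (autovacuum_enabled = false);
</programlisting>
    As noted above, autovacuum will still process such a table when
    necessary to prevent transaction ID wraparound.
   </para>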
   <para>
    When multiple workers are running, the cost limit is
    <quote>balanced</quote> among all the running workers, so that the
    total impact on the system is the same, regardless of the number
    of workers actually running.
   </para>
  </sect2>
 </sect1>

 <sect1 id="routine-reindex">
  <title>Routine Reindexing</title>

  <indexterm zone="routine-reindex">
   <primary>reindex</primary>
  </indexterm>
  <para>
   In some situations it is worthwhile to rebuild indexes periodically
   with the <xref linkend="sql-reindex" endterm="sql-reindex-title">
   command.
  </para>
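  <para>
   For example (the names are placeholders):
<programlisting>
REINDEX INDEX my_index;    -- rebuild a single index
REINDEX TABLE my_table;    -- rebuild all indexes of a table
</programlisting>
  </para>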
  <para>
   In <productname>PostgreSQL</> releases before 7.4, periodic reindexing
   was frequently necessary to avoid <quote>index bloat</>, due to lack of
   internal space reclamation in B-tree indexes. Any situation in which the
   range of index keys changed over time — for example, an index on
   timestamps in a table where old entries are eventually deleted —
   would result in bloat, because index pages for no-longer-needed portions
   of the key range were not reclaimed for re-use. Over time, the index size
   could become indefinitely much larger than the amount of useful data in it.
  </para>
  <para>
   In <productname>PostgreSQL</> 7.4 and later, index pages that have become
   completely empty are reclaimed for re-use. There is still a possibility
   of inefficient use of space: if all but a few index keys on a page have
   been deleted, the page remains allocated. So a usage pattern in which all
   but a few keys in each range are eventually deleted will see poor use of
   space. For such usage patterns, periodic reindexing is recommended.
  </para>
  <para>
   The potential for bloat in non-B-tree indexes has not been well
   characterized. It is a good idea to keep an eye on the index's physical
   size when using any non-B-tree index type.
  </para>
  <para>
   Also, for B-tree indexes, a freshly-constructed index is somewhat faster to
   access than one that has been updated many times, because logically
   adjacent pages are usually also physically adjacent in a newly built index.
   (This consideration does not currently apply to non-B-tree indexes.) It
   might be worthwhile to reindex periodically just to improve access speed.
  </para>
 </sect1>

 <sect1 id="logfile-maintenance">
  <title>Log File Maintenance</title>

  <indexterm zone="logfile-maintenance">
   <primary>server log</primary>
   <secondary>log file maintenance</secondary>
  </indexterm>
  <para>
   It is a good idea to save the database server's log output
   somewhere, rather than just routing it to <filename>/dev/null</>.
   The log output is invaluable when it comes time to diagnose
   problems. However, the log output tends to be voluminous
   (especially at higher debug levels) and you won't want to save it
   indefinitely. You need to <quote>rotate</> the log files so that
   new log files are started and old ones removed after a reasonable
   period of time.
  </para>
  <para>
   If you simply direct the <systemitem>stderr</> of
   <command>postgres</command> into a
   file, you will have log output, but
   the only way to truncate the log file is to stop and restart
   the server. This might be OK if you are using
   <productname>PostgreSQL</productname> in a development environment,
   but few production servers would find this behavior acceptable.
  </para>
  <para>
   A better approach is to send the server's
   <systemitem>stderr</> output to some type of log rotation program.
   There is a built-in log rotation program, which you can use by
   setting the configuration parameter <literal>logging_collector</> to
   <literal>true</> in <filename>postgresql.conf</>. The control
   parameters for this program are described in <xref
   linkend="runtime-config-logging-where">. You can also use this approach
   to capture the log data in machine-readable CSV format.
  </para>
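  <para>
   A minimal <filename>postgresql.conf</> fragment for the built-in
   collector might look like this (the values are illustrative, not
   recommendations):
<programlisting>
logging_collector = on
log_directory = 'pg_log'                        # relative to the data directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1d
log_rotation_size = 10MB
</programlisting>
  </para>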
  <para>
   Alternatively, you might prefer to use an external log rotation
   program, if you have one that you are already using with other
   server software. For example, the <application>rotatelogs</application>
   tool included in the <productname>Apache</productname> distribution
   can be used with <productname>PostgreSQL</productname>. To do this,
   just pipe the server's
   <systemitem>stderr</> output to the desired program.
   If you start the server with
   <command>pg_ctl</>, then <systemitem>stderr</>
   is already redirected to <systemitem>stdout</>, so you just need a
   pipe command, for example:

<programlisting>
pg_ctl start | rotatelogs /var/log/pgsql_log 86400
</programlisting>
  </para>
  <para>
   Another production-grade approach to managing log output is to
   send it all to <application>syslog</> and let
   <application>syslog</> deal with file rotation. To do this, set the
   configuration parameter <literal>log_destination</> to <literal>syslog</>
   (to log to <application>syslog</> only) in
   <filename>postgresql.conf</>. Then you can send a <literal>SIGHUP</literal>
   signal to the <application>syslog</> daemon whenever you want to force it
   to start writing a new log file. If you want to automate log
   rotation, the <application>logrotate</application> program can be
   configured to work with log files from
   <application>syslog</application>.
  </para>
  <para>
   On many systems, however, <application>syslog</> is not very reliable,
   particularly with large log messages; it might truncate or drop messages
   just when you need them the most. Also, on <productname>Linux</>,
   <application>syslog</> will sync each message to disk, yielding poor
   performance. (You can use a <literal>-</> at the start of the file name
   in the <application>syslog</> configuration file to disable this behavior.)
  </para>
  <para>
   Note that all the solutions described above take care of starting new
   log files at configurable intervals, but they do not handle deletion
   of old, no-longer-interesting log files. You will probably want to set
   up a batch job to periodically delete old log files. Another possibility
   is to configure the rotation program so that old log files are overwritten
   cyclically.
  </para>