2
doc/src/sgml/ref/analyze.sgml
3
PostgreSQL documentation
6
<refentry id="SQL-ANALYZE">
8
<refentrytitle>ANALYZE</refentrytitle>
9
<manvolnum>7</manvolnum>
10
<refmiscinfo>SQL - Language Statements</refmiscinfo>
14
<refname>ANALYZE</refname>
15
<refpurpose>collect statistics about a database</refpurpose>
18
<indexterm zone="sql-analyze">
19
<primary>ANALYZE</primary>
24
ANALYZE [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> [ ( <replaceable class="PARAMETER">column</replaceable> [, ...] ) ] ]
29
<title>Description</title>
32
<command>ANALYZE</command> collects statistics about the contents
33
of tables in the database, and stores the results in the <link
34
linkend="catalog-pg-statistic"><structname>pg_statistic</></>
35
system catalog. Subsequently, the query planner uses these
36
statistics to help determine the most efficient execution plans for
41
With no parameter, <command>ANALYZE</command> examines every table in the
42
current database. With a parameter, <command>ANALYZE</command> examines
43
only that table. It is further possible to give a list of column names,
44
in which case only the statistics for those columns are collected.
49
<title>Parameters</title>
53
<term><literal>VERBOSE</literal></term>
56
Enables display of progress messages.
62
<term><replaceable class="PARAMETER">table</replaceable></term>
65
The name (possibly schema-qualified) of a specific table to
66
analyze. Defaults to all tables in the current database.
72
<term><replaceable class="PARAMETER">column</replaceable></term>
75
The name of a specific column to analyze. Defaults to all columns.
83
<title>Outputs</title>
86
When <literal>VERBOSE</> is specified, <command>ANALYZE</> emits
87
progress messages to indicate which table is currently being
88
processed. Various statistics about the tables are printed as well.
96
In the default <productname>PostgreSQL</productname> configuration,
97
the autovacuum daemon (see <xref linkend="autovacuum">)
98
takes care of automatic analyzing of tables when they are first loaded
99
with data, and as they change throughout regular operation.
100
When autovacuum is disabled,
101
it is a good idea to run <command>ANALYZE</command> periodically, or
102
just after making major changes in the contents of a table. Accurate
103
statistics will help the planner to choose the most appropriate query
104
plan, and thereby improve the speed of query processing. A common
105
strategy is to run <xref linkend="sql-vacuum">
106
and <command>ANALYZE</command> once a day during a low-usage time of day.
110
<command>ANALYZE</command>
111
requires only a read lock on the target table, so it can run in
112
parallel with other activity on the table.
116
The statistics collected by <command>ANALYZE</command> usually
117
include a list of some of the most common values in each column and
118
a histogram showing the approximate data distribution in each
119
column. One or both of these can be omitted if
120
<command>ANALYZE</command> deems them uninteresting (for example,
121
in a unique-key column, there are no common values) or if the
122
column data type does not support the appropriate operators. There
123
is more information about the statistics in <xref
124
linkend="maintenance">.
128
For large tables, <command>ANALYZE</command> takes a random sample
129
of the table contents, rather than examining every row. This
130
allows even very large tables to be analyzed in a small amount of
131
time. Note, however, that the statistics are only approximate, and
132
will change slightly each time <command>ANALYZE</command> is run,
133
even if the actual table contents did not change. This might result
134
in small changes in the planner's estimated costs shown by
135
<xref linkend="sql-explain">.
136
In rare situations, this non-determinism will cause the planner's
137
choices of query plans to change after <command>ANALYZE</command> is run.
138
To avoid this, raise the amount of statistics collected by
139
<command>ANALYZE</command>, as described below.
143
The extent of analysis can be controlled by adjusting the
144
<xref linkend="guc-default-statistics-target"> configuration variable, or
145
on a column-by-column basis by setting the per-column statistics
146
target with <command>ALTER TABLE ... ALTER COLUMN ... SET
147
STATISTICS</command> (see <xref linkend="sql-altertable">).
148
The target value sets the
149
maximum number of entries in the most-common-value list and the
150
maximum number of bins in the histogram. The default target value
151
is 100, but this can be adjusted up or down to trade off accuracy of
152
planner estimates against the time taken for
153
<command>ANALYZE</command> and the amount of space occupied in
154
<literal>pg_statistic</literal>. In particular, setting the
155
statistics target to zero disables collection of statistics for
156
that column. It might be useful to do that for columns that are
157
never used as part of the <literal>WHERE</>, <literal>GROUP BY</>,
158
or <literal>ORDER BY</> clauses of queries, since the planner will
159
have no use for statistics on such columns.
163
The largest statistics target among the columns being analyzed determines
164
the number of table rows sampled to prepare the statistics. Increasing
165
the target causes a proportional increase in the time and space needed
166
to do <command>ANALYZE</command>.
170
One of the values estimated by <command>ANALYZE</command> is the number of
171
distinct values that appear in each column. Because only a subset of the
172
rows are examined, this estimate can sometimes be quite inaccurate, even
173
with the largest possible statistics target. If this inaccuracy leads to
174
bad query plans, a more accurate value can be determined manually and then
176
<command>ALTER TABLE ... ALTER COLUMN ... SET (n_distinct = ...)</>
177
(see <xref linkend="sql-altertable">).
181
If the table being analyzed has one or more children,
182
<command>ANALYZE</command> will gather statistics twice: once on the
183
rows of the parent table only, and a second time on the rows of the
184
parent table with all of its children. The autovacuum daemon, however,
185
will only consider inserts or updates on the parent table when deciding
186
whether to trigger an automatic analyze. If that table is rarely
187
inserted into or updated, the inheritance statistics will not be up to date
188
unless you run <command>ANALYZE</command> manually.
193
<title>Compatibility</title>
196
There is no <command>ANALYZE</command> statement in the SQL standard.
201
<title>See Also</title>
203
<simplelist type="inline">
204
<member><xref linkend="sql-vacuum"></member>
205
<member><xref linkend="app-vacuumdb"></member>
206
<member><xref linkend="runtime-config-resource-vacuum-cost"></member>
207
<member><xref linkend="autovacuum"></member>