mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-12-21 08:29:39 +08:00
Clean up autovacuum documentation, which was a bit out of sync with what
the code actually does, and needed copy-editing anyway. Also take the opportunity to expand the section on routine reindexing.
parent 9fc24f2bf6
commit fdff883aca
@@ -1,5 +1,5 @@
|
|||||||
<!--
|
<!--
|
||||||
$PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.48 2005/09/23 02:01:34 momjian Exp $
|
$PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.49 2005/10/21 19:39:08 tgl Exp $
|
||||||
-->
|
-->
|
||||||
|
|
||||||
<chapter id="maintenance">
|
<chapter id="maintenance">
|
||||||
@@ -474,9 +474,9 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
|
|||||||
tuples. These checks use the row-level statistics collection facility;
|
tuples. These checks use the row-level statistics collection facility;
|
||||||
therefore, the autovacuum daemon cannot be used unless <xref
|
therefore, the autovacuum daemon cannot be used unless <xref
|
||||||
linkend="guc-stats-start-collector"> and <xref
|
linkend="guc-stats-start-collector"> and <xref
|
||||||
linkend="guc-stats-row-level"> are set <literal>true</literal>. Also, it's
|
linkend="guc-stats-row-level"> are set to <literal>true</literal>. Also,
|
||||||
important to allow a slot for the autovacuum process when choosing the
|
it's important to allow a slot for the autovacuum process when choosing
|
||||||
value of <xref linkend="guc-superuser-reserved-connections">.
|
the value of <xref linkend="guc-superuser-reserved-connections">.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
@@ -487,75 +487,91 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
|
|||||||
database-wide <command>VACUUM</command> call, or <command>VACUUM
|
database-wide <command>VACUUM</command> call, or <command>VACUUM
|
||||||
FREEZE</command> if it's a template database, and then terminates. If
|
FREEZE</command> if it's a template database, and then terminates. If
|
||||||
no database fulfills this criterion, the one that was least recently
|
no database fulfills this criterion, the one that was least recently
|
||||||
processed by autovacuum itself is chosen. In this mode, each table in
|
processed by autovacuum is chosen. In this case each table in
|
||||||
the database is checked for new and obsolete tuples, according to the
|
the selected database is checked, and individual <command>VACUUM</command>
|
||||||
applicable autovacuum parameters. If a <link linkend="catalog-pg-autovacuum">
|
or <command>ANALYZE</command> commands are issued as needed.
|
||||||
<structname>pg_autovacuum</structname></link> tuple is found for this
|
|
||||||
table, these settings are applied; otherwise the global values in
|
|
||||||
<filename>postgresql.conf</filename> are used. See <xref linkend="runtime-config-autovacuum">
|
|
||||||
for more details on the global settings.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
For each table, two conditions are used to determine which operation to
|
For each table, two conditions are used to determine which operation(s)
|
||||||
apply. If the number of obsolete tuples since the last
|
to apply. If the number of obsolete tuples since the last
|
||||||
<command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
|
<command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
|
||||||
table is vacuumed and analyzed. The vacuum threshold is defined as:
|
table is vacuumed. The vacuum threshold is defined as:
|
||||||
<programlisting>
|
<programlisting>
|
||||||
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
|
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
|
||||||
</programlisting>
|
</programlisting>
|
||||||
where the vacuum base threshold is
|
where the vacuum base threshold is
|
||||||
<structname>pg_autovacuum</structname>.<structfield>vac_base_thresh</structfield>,
|
<xref linkend="guc-autovacuum-vacuum-threshold">,
|
||||||
the vacuum scale factor is
|
the vacuum scale factor is
|
||||||
<structname>pg_autovacuum</structname>.<structfield>vac_scale_factor</structfield>
|
<xref linkend="guc-autovacuum-vacuum-scale-factor">,
|
||||||
and the number of tuples is
|
and the number of tuples is
|
||||||
<structname>pg_class</structname>.<structfield>reltuples</structfield>.
|
<structname>pg_class</structname>.<structfield>reltuples</structfield>.
|
||||||
The number of obsolete tuples is taken from the statistics
|
The number of obsolete tuples is obtained from the statistics
|
||||||
collector, which is a semi-accurate count updated by each
|
collector; it is a semi-accurate count updated by each
|
||||||
<command>UPDATE</command> and <command>DELETE</command> operation. (It
|
<command>UPDATE</command> and <command>DELETE</command> operation. (It
|
||||||
is only semi-accurate because some information may be lost under heavy
|
is only semi-accurate because some information may be lost under heavy
|
||||||
load.) For analyze, a similar condition is used: the threshold, calculated
|
load.) For analyze, a similar condition is used: the threshold, defined as
|
||||||
by an equivalent equation to that above, is compared to the number of
|
<programlisting>
|
||||||
new tuples, that is, those created by the <command>INSERT</command> and
|
analyze threshold = analyze base threshold + analyze scale factor * number of tuples
|
||||||
<command>COPY</command> commands.
|
</programlisting>
|
||||||
|
is compared to the total number of tuples inserted, updated, or deleted
|
||||||
|
since the last <command>ANALYZE</command>.
|
||||||
</para>
|
</para>
|
||||||
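The two threshold formulas in the revised paragraph can be checked with a little arithmetic. The sketch below is illustrative only and is not how the autovacuum daemon is implemented. The function names and the sample numbers (base thresholds of 1000 and 500, scale factors of 0.5 and 0.25, a table of 10000 tuples) are assumptions chosen for easy arithmetic, not defaults taken from this commit.

```python
def autovacuum_threshold(base_threshold: float, scale_factor: float, reltuples: float) -> float:
    """threshold = base threshold + scale factor * number of tuples.

    The same shape is used for both the vacuum and the analyze threshold;
    only the inputs differ.
    """
    return base_threshold + scale_factor * reltuples


def needs_vacuum(obsolete_tuples: float, vacuum_threshold: float) -> bool:
    # Vacuum when the obsolete-tuple count since the last VACUUM exceeds the threshold.
    return obsolete_tuples > vacuum_threshold


def needs_analyze(changed_tuples: float, analyze_threshold: float) -> bool:
    # Analyze when tuples inserted, updated, or deleted since the last ANALYZE
    # exceed the threshold.
    return changed_tuples > analyze_threshold


# Hypothetical settings and table size, for illustration only.
reltuples = 10_000
vacuum_threshold = autovacuum_threshold(1000, 0.5, reltuples)    # 1000 + 0.5 * 10000
analyze_threshold = autovacuum_threshold(500, 0.25, reltuples)   # 500 + 0.25 * 10000

print(vacuum_threshold, analyze_threshold)
print(needs_vacuum(6001, vacuum_threshold), needs_analyze(3000, analyze_threshold))
```

With these assumed inputs, a table needs more than 6000 obsolete tuples before it is vacuumed and more than 3000 changed tuples before it is analyzed. Because the comparison is "exceeds", a count exactly equal to the threshold does not trigger the operation.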
|
|
||||||
<para>
|
<para>
|
||||||
Note that if any of the values in <structname>pg_autovacuum</structname>
|
The default thresholds and scale factors are taken from
|
||||||
are set to a negative number, or if a tuple is not present at all in
|
<filename>postgresql.conf</filename>, but it is possible to override them
|
||||||
<structname>pg_autovacuum</structname> for any particular table, the
|
on a table-by-table basis by making entries in the system catalog
|
||||||
equivalent value from <filename>postgresql.conf</filename> is used.
|
<link
|
||||||
|
linkend="catalog-pg-autovacuum"><structname>pg_autovacuum</></link>.
|
||||||
|
If a <structname>pg_autovacuum</structname> row exists for a particular
|
||||||
|
table, the settings it specifies are applied; otherwise the global
|
||||||
|
settings are used. See <xref linkend="runtime-config-autovacuum"> for
|
||||||
|
more details on the global settings.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Besides the base threshold values and scale factors, there are three
|
Besides the base threshold values and scale factors, there are three
|
||||||
parameters that can be set for each table in <structname>pg_autovacuum</structname>.
|
more parameters that can be set for each table in
|
||||||
The first parameter, <structname>pg_autovacuum</>.<structfield>enabled</>,
|
<structname>pg_autovacuum</structname>.
|
||||||
can be used to instruct the autovacuum daemon to skip any particular table
|
The first, <structname>pg_autovacuum</>.<structfield>enabled</>,
|
||||||
by setting it to <literal>false</literal>.
|
can be set to <literal>false</literal> to instruct the autovacuum daemon
|
||||||
The other two, the vacuum cost delay
|
to skip that particular table entirely. In this case
|
||||||
|
autovacuum will only touch the table when it vacuums the entire database
|
||||||
|
to prevent transaction ID wraparound.
|
||||||
|
The other two parameters, the vacuum cost delay
|
||||||
(<structname>pg_autovacuum</structname>.<structfield>vac_cost_delay</structfield>)
|
(<structname>pg_autovacuum</structname>.<structfield>vac_cost_delay</structfield>)
|
||||||
and the vacuum cost limit
|
and the vacuum cost limit
|
||||||
(<structname>pg_autovacuum</structname>.<structfield>vac_cost_limit</structfield>),
|
(<structname>pg_autovacuum</structname>.<structfield>vac_cost_limit</structfield>),
|
||||||
are used to set table-specific values for the
|
are used to set table-specific values for the
|
||||||
<xref linkend="runtime-config-resource-vacuum-cost" endterm="runtime-config-resource-vacuum-cost-title">
|
<xref linkend="runtime-config-resource-vacuum-cost" endterm="runtime-config-resource-vacuum-cost-title">
|
||||||
feature. The above note about negative values also applies here, but
|
feature.
|
||||||
also note that if the <filename>postgresql.conf</filename> variables
|
|
||||||
<varname>autovacuum_vacuum_cost_limit</varname> and
|
|
||||||
<varname>autovacuum_vacuum_cost_delay</varname> are also set to negative
|
|
||||||
values, the global <varname>vacuum_cost_limit</varname> and
|
|
||||||
<varname>vacuum_cost_delay</varname> values will be used instead.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<note>
|
<para>
|
||||||
|
If any of the values in <structname>pg_autovacuum</structname>
|
||||||
|
are set to a negative number, or if a row is not present at all in
|
||||||
|
<structname>pg_autovacuum</structname> for any particular table, the
|
||||||
|
corresponding values from <filename>postgresql.conf</filename> are used.
|
||||||
|
</para>
|
||||||
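The fallback rule in this note (a negative value, or no row at all, means the corresponding postgresql.conf value applies) can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not the server's code: the helper name `effective_setting` is hypothetical, the row is modeled as a plain dictionary, and the column names `vac_base_thresh` and `vac_scale_factor` are the ones the old text of this diff mentions.

```python
from typing import Mapping, Optional


def effective_setting(
    row: Optional[Mapping[str, float]],
    column: str,
    global_value: float,
) -> float:
    """Return the per-table value if one is usable, else the global value.

    row is a hypothetical stand-in for a table's pg_autovacuum entry
    (None when the table has no entry).
    """
    if row is None:
        # No pg_autovacuum row for this table: use the global setting.
        return global_value
    value = row.get(column, -1)
    # A negative value means "not set here": use the global setting.
    return global_value if value < 0 else value


# Illustrative global value and per-table rows; the numbers are assumptions.
global_base = 1000
print(effective_setting(None, "vac_base_thresh", global_base))
print(effective_setting({"vac_base_thresh": 200}, "vac_base_thresh", global_base))
print(effective_setting({"vac_base_thresh": -1}, "vac_base_thresh", global_base))
```

Only a strictly negative value falls back in this sketch, so a per-table value of zero is kept as a real setting. That matches the wording of the note, which speaks of values "set to a negative number".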
|
|
||||||
|
<para>
|
||||||
|
There is not currently any support for making
|
||||||
|
<structname>pg_autovacuum</structname> entries, except by doing
|
||||||
|
manual <command>INSERT</>s into the catalog. This feature will be
|
||||||
|
improved in future releases, and it is likely that the catalog
|
||||||
|
definition will change.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<caution>
|
||||||
<para>
|
<para>
|
||||||
The contents of the <structname>pg_autovacuum</structname> system
|
The contents of the <structname>pg_autovacuum</structname> system
|
||||||
catalog are currently not saved in database dumps created by
|
catalog are currently not saved in database dumps created by
|
||||||
the tools <command>pg_dump</command> and <command>pg_dumpall</command>.
|
the tools <command>pg_dump</command> and <command>pg_dumpall</command>.
|
||||||
If you need to preserve them across a dump/reload cycle, make sure you
|
If you want to preserve them across a dump/reload cycle, make sure you
|
||||||
dump the catalog manually.
|
dump the catalog manually.
|
||||||
</para>
|
</para>
|
||||||
</note>
|
</caution>
|
||||||
|
|
||||||
</sect2>
|
</sect2>
|
||||||
</sect1>
|
</sect1>
|
||||||
@@ -571,8 +587,42 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple
|
|||||||
<para>
|
<para>
|
||||||
In some situations it is worthwhile to rebuild indexes periodically
|
In some situations it is worthwhile to rebuild indexes periodically
|
||||||
with the <command>REINDEX</> command.
|
with the <command>REINDEX</> command.
|
||||||
However, <productname>PostgreSQL</> 7.4 has substantially reduced the need
|
</para>
|
||||||
for this activity compared to earlier releases.
|
|
||||||
|
<para>
|
||||||
|
In <productname>PostgreSQL</> releases before 7.4, periodic reindexing
|
||||||
|
was frequently necessary to avoid <quote>index bloat</>, due to lack of
|
||||||
|
internal space reclamation in btree indexes. Any situation in which the
|
||||||
|
range of index keys changed over time — for example, an index on
|
||||||
|
timestamps in a table where old entries are eventually deleted —
|
||||||
|
would result in bloat, because index pages for no-longer-needed portions
|
||||||
|
of the key range were not reclaimed for re-use. Over time, the index size
|
||||||
|
could become indefinitely much larger than the amount of useful data in it.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
In <productname>PostgreSQL</> 7.4 and later, index pages that have become
|
||||||
|
completely empty are reclaimed for re-use. There is still a possibility
|
||||||
|
for inefficient use of space: if all but a few index keys on a page have
|
||||||
|
been deleted, the page remains allocated. So a usage pattern in which all
|
||||||
|
but a few keys in each range are eventually deleted will see poor use of
|
||||||
|
space. The potential for bloat is not indefinite — at worst there
|
||||||
|
will be one key per page — but it may still be worthwhile to schedule
|
||||||
|
periodic reindexing for indexes that have such usage patterns.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
The potential for bloat in non-btree indexes has not been well
|
||||||
|
characterized. It is a good idea to keep an eye on the index's physical
|
||||||
|
size when using any non-btree index type.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
Also, for btree indexes a freshly-constructed index is somewhat faster to
|
||||||
|
access than one that has been updated many times, because logically
|
||||||
|
adjacent pages are usually also physically adjacent in a newly built index.
|
||||||
|
(This consideration does not currently apply to non-btree indexes.) It
|
||||||
|
might be worthwhile to reindex periodically just to improve access speed.
|
||||||
</para>
|
</para>
|
||||||
</sect1>
|
</sect1>
|
||||||
|
|
||||||
|