mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-24 18:55:04 +08:00
Copy-editing for recent documentation changes relevant to WAL,
full_page_writes, etc.
parent
6d6c3722fb
commit
f72a342fb7
@ -1,5 +1,5 @@
|
|||||||
<!--
|
<!--
|
||||||
$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.71 2005/10/15 01:15:33 alvherre Exp $
|
$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.72 2005/10/22 21:56:07 tgl Exp $
|
||||||
-->
|
-->
|
||||||
<chapter id="backup">
|
<chapter id="backup">
|
||||||
<title>Backup and Restore</title>
|
<title>Backup and Restore</title>
|
||||||
@ -1148,21 +1148,20 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
It should also be noted that the default <acronym>WAL</acronym>
|
It should also be noted that the default <acronym>WAL</acronym>
|
||||||
format is fairly bulky since it includes many disk page snapshots. The pages
|
format is fairly bulky since it includes many disk page snapshots.
|
||||||
are partially compressed, using the simple expedient of removing the
|
These page snapshots are designed to support crash recovery,
|
||||||
empty space (if any) within each block. You can significantly reduce
|
since we may need to fix partially-written disk pages. Depending
|
||||||
|
on your system hardware and software, the risk of partial writes may
|
||||||
|
be small enough to ignore, in which case you can significantly reduce
|
||||||
the total volume of archived logs by turning off page snapshots
|
the total volume of archived logs by turning off page snapshots
|
||||||
using the <xref linkend="guc-full-page-writes"> parameter,
|
using the <xref linkend="guc-full-page-writes"> parameter.
|
||||||
though you should read the notes and warnings in
|
(Read the notes and warnings in
|
||||||
<xref linkend="reliability"> before you do so.
|
<xref linkend="reliability"> before you do so.)
|
||||||
These page snapshots are designed to allow crash recovery,
|
Turning off page snapshots does not prevent use of the logs for PITR
|
||||||
since we may need to fix partially-written disk pages. It is not
|
operations.
|
||||||
necessary to store these page copies for PITR operations, however.
|
|
||||||
If you turn off <xref linkend="guc-full-page-writes">, your PITR
|
|
||||||
backup and recovery operations will continue to work successfully.
|
|
||||||
An area for future development is to compress archived WAL data by
|
An area for future development is to compress archived WAL data by
|
||||||
removing unnecessary page copies when <xref linkend="guc-full-page-writes">
|
removing unnecessary page copies even when <varname>full_page_writes</>
|
||||||
is turned on. In the meantime, administrators
|
is on. In the meantime, administrators
|
||||||
may wish to reduce the number of page snapshots included in WAL by
|
may wish to reduce the number of page snapshots included in WAL by
|
||||||
increasing the checkpoint interval parameters as much as feasible.
|
increasing the checkpoint interval parameters as much as feasible.
|
||||||
</para>
|
</para>
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
<!--
|
<!--
|
||||||
$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.31 2005/10/15 20:12:32 neilc Exp $
|
$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.32 2005/10/22 21:56:07 tgl Exp $
|
||||||
-->
|
-->
|
||||||
<chapter Id="runtime-config">
|
<chapter Id="runtime-config">
|
||||||
<title>Run-time Configuration</title>
|
<title>Run-time Configuration</title>
|
||||||
@ -1251,14 +1251,15 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
If this option is on, the <productname>PostgreSQL</> server
|
If this option is on, the <productname>PostgreSQL</> server
|
||||||
will use the <function>fsync()</> system call in several places
|
will try to make sure that updates are physically written to
|
||||||
to make sure that updates are physically written to disk. This
|
disk, by issuing <function>fsync()</> system calls or various
|
||||||
insures that a database cluster will recover to a
|
equivalent methods (see <xref linkend="guc-wal-sync-method">).
|
||||||
|
This ensures that the database cluster can recover to a
|
||||||
consistent state after an operating system or hardware crash.
|
consistent state after an operating system or hardware crash.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
However, using <function>fsync()</function> results in a
|
However, using <varname>fsync</varname> results in a
|
||||||
performance penalty: when a transaction is committed,
|
performance penalty: when a transaction is committed,
|
||||||
<productname>PostgreSQL</productname> must wait for the
|
<productname>PostgreSQL</productname> must wait for the
|
||||||
operating system to flush the write-ahead log to disk. When
|
operating system to flush the write-ahead log to disk. When
|
||||||
@ -1268,7 +1269,7 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
However, if the system crashes, the results of the last few
|
However, if the system crashes, the results of the last few
|
||||||
committed transactions may be lost in part or whole. In the
|
committed transactions may be lost in part or whole. In the
|
||||||
worst case, unrecoverable data corruption may occur.
|
worst case, unrecoverable data corruption may occur.
|
||||||
(Crashes of the database server itself are <emphasis>not</>
|
(Crashes of the database software itself are <emphasis>not</>
|
||||||
a risk factor here. Only an operating-system-level crash
|
a risk factor here. Only an operating-system-level crash
|
||||||
creates a risk of corruption.)
|
creates a risk of corruption.)
|
||||||
</para>
|
</para>
|
||||||
@ -1277,8 +1278,8 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
Due to the risks involved, there is no universally correct
|
Due to the risks involved, there is no universally correct
|
||||||
setting for <varname>fsync</varname>. Some administrators
|
setting for <varname>fsync</varname>. Some administrators
|
||||||
always disable <varname>fsync</varname>, while others only
|
always disable <varname>fsync</varname>, while others only
|
||||||
turn it off for bulk loads, where there is a clear restart
|
turn it off during initial bulk data loads, where there is a clear
|
||||||
point if something goes wrong, whereas some administrators
|
restart point if something goes wrong. Others
|
||||||
always leave <varname>fsync</varname> enabled. The default is
|
always leave <varname>fsync</varname> enabled. The default is
|
||||||
to enable <varname>fsync</varname>, for maximum reliability.
|
to enable <varname>fsync</varname>, for maximum reliability.
|
||||||
If you trust your operating system, your hardware, and your
|
If you trust your operating system, your hardware, and your
|
||||||
@ -1288,9 +1289,9 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
This option can only be set at server start or in the
|
This option can only be set at server start or in the
|
||||||
<filename>postgresql.conf</filename> file. If this option
|
<filename>postgresql.conf</filename> file. If you turn
|
||||||
is <literal>off</>, consider also turning off
|
this option off, also consider turning off
|
||||||
<varname>guc-full-page-writes</>.
|
<xref linkend="guc-full-page-writes">.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
@ -1302,8 +1303,10 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
</indexterm>
|
</indexterm>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
Method used for forcing WAL updates out to disk. Possible
|
Method used for forcing WAL updates out to disk.
|
||||||
values are:
|
If <varname>fsync</varname> is off then this setting is irrelevant,
|
||||||
|
since updates will not be forced out at all.
|
||||||
|
Possible values are:
|
||||||
</para>
|
</para>
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
<listitem>
|
<listitem>
|
||||||
@ -1313,7 +1316,12 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<literal>fdatasync</> (call <function>fdatasync()</> at each commit),
|
<literal>fdatasync</> (call <function>fdatasync()</> at each commit)
|
||||||
|
</para>
|
||||||
|
</listitem>
|
||||||
|
<listitem>
|
||||||
|
<para>
|
||||||
|
<literal>fsync_writethrough</> (call <function>fsync()</> at each commit, forcing write-through of any disk write cache)
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
@ -1322,11 +1330,6 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
|
||||||
<literal>fsync_writethrough</> (force write-through of any disk write cache)
|
|
||||||
</para>
|
|
||||||
</listitem>
|
|
||||||
<listitem>
|
|
||||||
<para>
|
<para>
|
||||||
<literal>open_sync</> (write WAL files with <function>open()</> option <symbol>O_SYNC</>)
|
<literal>open_sync</> (write WAL files with <function>open()</> option <symbol>O_SYNC</>)
|
||||||
</para>
|
</para>
|
||||||
@ -1334,8 +1337,7 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
<para>
|
<para>
|
||||||
Not all of these choices are available on all platforms.
|
Not all of these choices are available on all platforms.
|
||||||
The top-most supported option is used as the default.
|
The default is the first method in the above list that is supported.
|
||||||
If <varname>fsync</varname> is off then this setting is irrelevant.
|
|
||||||
This option can only be set at server start or in the
|
This option can only be set at server start or in the
|
||||||
<filename>postgresql.conf</filename> file.
|
<filename>postgresql.conf</filename> file.
|
||||||
</para>
|
</para>
|
||||||
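For readers who want to see what "forcing WAL updates out to disk" looks like at the system-call level, here is a minimal sketch in Python. It is not PostgreSQL's implementation: the `durable_append` helper, the file name, and the `method` argument are hypothetical, and only the `fsync` and `fdatasync` choices from the list above are modelled. `os.fdatasync` is not available on every platform, so the sketch falls back to `os.fsync`, loosely mirroring the note that not all choices exist everywhere.

```python
import os
import tempfile


def durable_append(path, payload, method="fsync"):
    """Append payload to path and ask the OS to flush it before returning.

    Returns the file size after the append. `method` is a hypothetical
    stand-in for the wal_sync_method setting; only two values are modelled.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(payload)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        if method == "fdatasync" and hasattr(os, "fdatasync"):
            os.fdatasync(fd)  # flush file data, may skip some metadata
        else:
            os.fsync(fd)  # flush file data and metadata
    finally:
        os.close(fd)
    return os.path.getsize(path)


def demo():
    """Append two 9-byte 'commit records' and return the final file size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal_demo.log")
        durable_append(path, b"commit-1\n", method="fsync")
        return durable_append(path, b"commit-2\n", method="fdatasync")
```

Neither call can see past the operating system: if a controller or drive cache acknowledges the write early, the call still returns successfully, which is the hazard the reliability chapter describes.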
@ -1349,21 +1351,37 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
<term><varname>full_page_writes</varname> (<type>boolean</type>)</term>
|
<term><varname>full_page_writes</varname> (<type>boolean</type>)</term>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
A page write in process during an operating system crash might
|
When this option is on, the <productname>PostgreSQL</> server
|
||||||
be only partially written to disk, leading to an on-disk page
|
writes the entire content of each disk page to WAL during the
|
||||||
that contains a mix of old and new data. During recovery, the
|
first modification of that page after a checkpoint.
|
||||||
row changes stored in WAL are not enough to completely restore
|
This is needed because
|
||||||
the page.
|
a page write that is in process during an operating system crash might
|
||||||
|
be only partially completed, leading to an on-disk page
|
||||||
|
that contains a mix of old and new data. The row-level change data
|
||||||
|
normally stored in WAL will not be enough to completely restore
|
||||||
|
such a page during post-crash recovery. Storing the full page image
|
||||||
|
guarantees that the page can be correctly restored, but at a price
|
||||||
|
in increasing the amount of data that must be written to WAL.
|
||||||
|
(Because WAL replay always starts from a checkpoint, it is sufficient
|
||||||
|
to do this during the first change of each page after a checkpoint.
|
||||||
|
Therefore, one way to reduce the cost of full-page writes is to
|
||||||
|
increase the checkpoint interval parameters.)
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
When this option is on, the <productname>PostgreSQL</> server
|
Turning this option off speeds normal operation, but
|
||||||
writes full pages to WAL when they are first modified after a
|
might lead to a corrupt database after an operating system crash
|
||||||
checkpoint so crash recovery is possible. Turning this option off
|
or power failure. The risks are similar to turning off
|
||||||
might lead to a corrupt system after an operating system crash
|
<varname>fsync</>, though smaller. It may be safe to turn off
|
||||||
or power failure because uncorrected partial pages might contain
|
this option if you have hardware (such as a battery-backed disk
|
||||||
inconsistent or corrupt data. The risks are less but similar to
|
controller) or filesystem software (e.g., Reiser4) that reduces
|
||||||
<varname>fsync</>.
|
the risk of partial page writes to an acceptably low level.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
Turning off this option does not affect use of
|
||||||
|
WAL archiving for point-in-time recovery (PITR)
|
||||||
|
(see <xref linkend="backup-online">).
|
||||||
</para>
|
</para>
|
||||||
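The relationship between checkpoint spacing and full-page-write cost described above can be illustrated with a toy model. This is a sketch under stated assumptions, not a measurement: `wal_bytes` is a hypothetical function, the 100-byte row-level record size is invented for illustration, and checkpoints are triggered by a simple modification count rather than by time or segment usage.

```python
def wal_bytes(page_ids, checkpoint_every, page_size=8192,
              record_size=100, full_page_writes=True):
    """Estimate WAL volume for a stream of single-page modifications.

    page_ids: page numbers modified, in order.
    checkpoint_every: number of modifications between checkpoints.
    page_size, record_size: illustrative sizes in bytes (assumptions).
    """
    imaged_since_checkpoint = set()
    total = 0
    for i, page in enumerate(page_ids):
        if i % checkpoint_every == 0:
            # A checkpoint just happened: the next touch of any page
            # must log a full page image again.
            imaged_since_checkpoint.clear()
        total += record_size  # row-level change data is always logged
        if full_page_writes and page not in imaged_since_checkpoint:
            total += page_size  # first modification since the checkpoint
            imaged_since_checkpoint.add(page)
    return total


# 100 modifications spread round-robin over 4 pages.
workload = [0, 1, 2, 3] * 25

frequent = wal_bytes(workload, checkpoint_every=10)   # 337680
sparse = wal_bytes(workload, checkpoint_every=50)     # 75536
disabled = wal_bytes(workload, checkpoint_every=10,
                     full_page_writes=False)          # 10000
```

In this model, spacing checkpoints five times further apart cuts the number of page images from 40 to 8, while turning the option off leaves only the row-level records.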
|
|
||||||
<para>
|
<para>
|
||||||
@ -1384,7 +1402,7 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
Number of disk-page buffers allocated in shared memory for WAL data.
|
Number of disk-page buffers allocated in shared memory for WAL data.
|
||||||
The default is 8. The setting need only be large enough to hold
|
The default is 8. The setting need only be large enough to hold
|
||||||
the amount of WAL data generated by one typical transaction, since
|
the amount of WAL data generated by one typical transaction, since
|
||||||
the data is flushed to disk at every transaction commit.
|
the data is written out to disk at every transaction commit.
|
||||||
This option can only be set at server start.
|
This option can only be set at server start.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -1481,8 +1499,9 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
<para>
|
<para>
|
||||||
Write a message to the server log if checkpoints caused by
|
Write a message to the server log if checkpoints caused by
|
||||||
the filling of checkpoint segment files happen closer together
|
the filling of checkpoint segment files happen closer together
|
||||||
than this many seconds. The default is 30 seconds.
|
than this many seconds (which suggests that
|
||||||
Zero turns off the warning.
|
<varname>checkpoint_segments</> ought to be raised). The default is
|
||||||
|
30 seconds. Zero disables the warning.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.36 2005/10/13 17:32:42 momjian Exp $ -->
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.37 2005/10/22 21:56:07 tgl Exp $ -->
|
||||||
|
|
||||||
<chapter id="reliability">
|
<chapter id="reliability">
|
||||||
<title>Reliability</title>
|
<title>Reliability</title>
|
||||||
@ -7,12 +7,12 @@
|
|||||||
Reliability is a major feature of any serious database system, and
|
Reliability is a major feature of any serious database system, and
|
||||||
<productname>PostgreSQL</> does everything possible to guarantee
|
<productname>PostgreSQL</> does everything possible to guarantee
|
||||||
reliable operation. One aspect of reliable operation is that all data
|
reliable operation. One aspect of reliable operation is that all data
|
||||||
recorded by a transaction should be stored in a non-volatile area
|
recorded by a committed transaction should be stored in a non-volatile area
|
||||||
that is safe from power loss, operating system failure, and hardware
|
that is safe from power loss, operating system failure, and hardware
|
||||||
failure (unrelated to the non-volatile area itself). To accomplish
|
failure (except failure of the non-volatile area itself, of course).
|
||||||
this, <productname>PostgreSQL</> uses the magnetic platters of modern
|
Successfully writing the data to the computer's permanent storage
|
||||||
disk drives for permanent storage that is immune to the failures
|
(disk drive or equivalent) ordinarily meets this requirement.
|
||||||
listed above. In fact, even if a computer is fatally damaged, if
|
In fact, even if a computer is fatally damaged, if
|
||||||
the disk drives survive they can be moved to another computer with
|
the disk drives survive they can be moved to another computer with
|
||||||
similar hardware and all committed transactions will remain intact.
|
similar hardware and all committed transactions will remain intact.
|
||||||
</para>
|
</para>
|
||||||
@ -21,60 +21,64 @@
|
|||||||
While forcing data periodically to the disk platters might seem like
|
While forcing data periodically to the disk platters might seem like
|
||||||
a simple operation, it is not. Because disk drives are dramatically
|
a simple operation, it is not. Because disk drives are dramatically
|
||||||
slower than main memory and CPUs, several layers of caching exist
|
slower than main memory and CPUs, several layers of caching exist
|
||||||
between the computer's main memory and the disk drive platters.
|
between the computer's main memory and the disk platters.
|
||||||
First, there is the operating system kernel cache, which caches
|
First, there is the operating system's buffer cache, which caches
|
||||||
frequently requested disk blocks and delays disk writes. Fortunately,
|
frequently requested disk blocks and combines disk writes. Fortunately,
|
||||||
all operating systems give applications a way to force writes from
|
all operating systems give applications a way to force writes from
|
||||||
the kernel cache to disk, and <productname>PostgreSQL</> uses those
|
the buffer cache to disk, and <productname>PostgreSQL</> uses those
|
||||||
features. In fact, the <xref linkend="guc-wal-sync-method"> parameter
|
features. (See the <xref linkend="guc-wal-sync-method"> parameter
|
||||||
controls how this is done.
|
to adjust how this is done.)
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Secondly, there is an optional disk drive controller cache,
|
Next, there may be a cache in the disk drive controller; this is
|
||||||
particularly popular on <acronym>RAID</> controller cards. Some of
|
particularly common on <acronym>RAID</> controller cards. Some of
|
||||||
these caches are <literal>write-through</>, meaning writes are passed
|
these caches are <firstterm>write-through</>, meaning writes are passed
|
||||||
along to the drive as soon as they arrive. Others are
|
along to the drive as soon as they arrive. Others are
|
||||||
<literal>write-back</>, meaning data is passed on to the drive at
|
<firstterm>write-back</>, meaning data is passed on to the drive at
|
||||||
some later time. Such caches can be a reliability problem because the
|
some later time. Such caches can be a reliability hazard because the
|
||||||
disk controller card cache is volatile, unlike the disk driver
|
memory in the disk controller cache is volatile, and will lose its
|
||||||
platters, unless the disk drive controller has a battery-backed
|
contents in a power failure. Better controller cards have
|
||||||
cache, meaning the card has a battery that maintains power to the
|
<firstterm>battery-backed</> caches, meaning the card has a battery that
|
||||||
cache in case of server power loss. When the disk drives are later
|
maintains power to the cache in case of system power loss. After power
|
||||||
accessible, the data is written to the drives.
|
is restored the data will be written to the disk drives.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
And finally, most disk drives have caches. Some are write-through
|
And finally, most disk drives have caches. Some are write-through
|
||||||
(typically SCSI), and some are write-back(typically IDE), and the
|
while some are write-back, and the
|
||||||
same concerns about data loss exist for write-back drive caches as
|
same concerns about data loss exist for write-back drive caches as
|
||||||
exist for disk controller caches. To have reliability, all
|
exist for disk controller caches. Consumer-grade IDE drives are
|
||||||
storage subsystems must be reliable in their storage characteristics.
|
particularly likely to contain write-back caches that will not
|
||||||
When the operating system sends a write request to the drive platters,
|
survive a power failure.
|
||||||
there is little it can do to make sure the data has arrived at a
|
|
||||||
non-volatile store area on the system. Rather, it is the
|
|
||||||
administrator's responsibility to be sure that all storage components
|
|
||||||
have reliable characteristics.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
One other area of potential data loss are the disk platter writes
|
When the operating system sends a write request to the disk hardware,
|
||||||
themselves. Disk platters are internally made up of 512-byte sectors.
|
there is little it can do to make sure the data has arrived at a truly
|
||||||
|
non-volatile storage area. Rather, it is the
|
||||||
|
administrator's responsibility to be sure that all storage components
|
||||||
|
ensure data integrity. Avoid disk controllers that have non-battery-backed
|
||||||
|
write caches. At the drive level, disable write-back caching if the
|
||||||
|
drive cannot guarantee the data will be written before shutdown.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
Another risk of data loss is posed by the disk platter write
|
||||||
|
operations themselves. Disk platters are divided into sectors,
|
||||||
|
commonly 512 bytes each. Every physical read or write operation
|
||||||
|
processes a whole sector.
|
||||||
When a write request arrives at the drive, it might be for 512 bytes,
|
When a write request arrives at the drive, it might be for 512 bytes,
|
||||||
1024 bytes, or 8192 bytes, and the process of writing could fail due
|
1024 bytes, or 8192 bytes, and the process of writing could fail due
|
||||||
to power loss at any time, meaning some of the 512-byte sectors were
|
to power loss at any time, meaning some of the 512-byte sectors were
|
||||||
written, and others were not, or the first half of a 512-byte sector
|
written, and others were not. To guard against such failures,
|
||||||
has new data, and the remainder has the original data. Obviously, on
|
|
||||||
startup, <productname>PostgreSQL</> would not be able to deal with
|
|
||||||
these partially written cases. To guard against that,
|
|
||||||
<productname>PostgreSQL</> periodically writes full page images to
|
<productname>PostgreSQL</> periodically writes full page images to
|
||||||
permanent storage <emphasis>before</> modifying the actual page on
|
permanent storage <emphasis>before</> modifying the actual page on
|
||||||
disk. By doing this, during crash recovery <productname>PostgreSQL</> can
|
disk. By doing this, during crash recovery <productname>PostgreSQL</> can
|
||||||
restore partially-written pages. If you have a battery-backed disk
|
restore partially-written pages. If you have a battery-backed disk
|
||||||
controller or filesystem (e.g. Reiser4) that prevents partial page writes,
|
controller or filesystem software (e.g., Reiser4) that prevents partial
|
||||||
you can turn off this page imaging by using the
|
page writes, you can turn off this page imaging by using the
|
||||||
<xref linkend="guc-full-page-writes"> parameter. This parameter has no
|
<xref linkend="guc-full-page-writes"> parameter.
|
||||||
effect on the successful use of Point in Time Recovery (PITR),
|
|
||||||
described in <xref linkend="backup-online">.
|
|
||||||
</para>
|
</para>
|
||||||
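The partial-write failure and its repair can be simulated in a few lines. The sketch below is a simplified model, not PostgreSQL code: all function names are hypothetical, the page is reduced to an 8-byte sequence number followed by filler, and replay is assumed to skip any change that the page's sequence number claims is already applied. That skip rule is an assumption of this model, chosen because it shows why row-level records alone may not repair a torn page.

```python
import struct

SECTOR = 512   # bytes per sector, as in the text above
PAGE = 8192    # bytes per page


def make_page(seq, fill):
    """Build a page: 8-byte sequence number header, then filler bytes."""
    return struct.pack(">Q", seq) + fill * (PAGE - 8)


def page_seq(page):
    return struct.unpack(">Q", page[:8])[0]


def apply_change(page, seq, offset, data):
    """Apply one row-level change and stamp the page with its sequence number."""
    buf = bytearray(page)
    buf[offset:offset + len(data)] = data
    buf[:8] = struct.pack(">Q", seq)
    return bytes(buf)


def torn_write(old_page, new_page, sectors_written):
    """On-disk result if power fails after only some sectors are written."""
    cut = sectors_written * SECTOR
    return new_page[:cut] + old_page[cut:]


def recover(on_disk, log):
    """Replay log entries of the form (seq, offset, data, full_page_image)."""
    page = on_disk
    for seq, offset, data, image in log:
        if image is not None:
            page = image  # whole page restored; image includes this change
        elif page_seq(page) < seq:
            page = apply_change(page, seq, offset, data)
        # otherwise the page claims to contain the change already: skip it
    return page


old = make_page(1, b"A")
new = apply_change(old, 2, 6000, b"row-v2")
torn = torn_write(old, new, sectors_written=1)  # header is new, row data is old

without_image = recover(torn, [(2, 6000, b"row-v2", None)])
with_image = recover(torn, [(2, 6000, b"row-v2", new)])
```

Here `without_image` stays corrupt, because the torn page's header says the change is already present, while `with_image` equals the intended page.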
|
|
||||||
<para>
|
<para>
|
||||||
@ -111,11 +115,7 @@
|
|||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
WAL brings three major benefits:
|
A major benefit of using <acronym>WAL</acronym> is a
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
The first major benefit of using <acronym>WAL</acronym> is a
|
|
||||||
significantly reduced number of disk writes, because only the log
|
significantly reduced number of disk writes, because only the log
|
||||||
file needs to be flushed to disk at the time of transaction
|
file needs to be flushed to disk at the time of transaction
|
||||||
commit, rather than every data file changed by the transaction.
|
commit, rather than every data file changed by the transaction.
|
||||||
@ -129,30 +129,7 @@
|
|||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The next benefit is crash recovery protection. The truth is
|
<acronym>WAL</acronym> also makes it possible to support on-line
|
||||||
that, before <acronym>WAL</acronym> was introduced back in release 7.1,
|
|
||||||
<productname>PostgreSQL</productname> was never able to guarantee
|
|
||||||
consistency in the case of a crash. Now,
|
|
||||||
<acronym>WAL</acronym> protects fully against the following problems:
|
|
||||||
|
|
||||||
<orderedlist>
|
|
||||||
<listitem>
|
|
||||||
<simpara>index rows pointing to nonexistent table rows</simpara>
|
|
||||||
</listitem>
|
|
||||||
|
|
||||||
<listitem>
|
|
||||||
<simpara>index rows lost in split operations</simpara>
|
|
||||||
</listitem>
|
|
||||||
|
|
||||||
<listitem>
|
|
||||||
<simpara>totally corrupted table or index page content, because
|
|
||||||
of partially written data pages</simpara>
|
|
||||||
</listitem>
|
|
||||||
</orderedlist>
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
Finally, <acronym>WAL</acronym> makes it possible to support on-line
|
|
||||||
backup and point-in-time recovery, as described in <xref
|
backup and point-in-time recovery, as described in <xref
|
||||||
linkend="backup-online">. By archiving the WAL data we can support
|
linkend="backup-online">. By archiving the WAL data we can support
|
||||||
reverting to any time instant covered by the available WAL data:
|
reverting to any time instant covered by the available WAL data:
|
||||||
@ -169,7 +146,7 @@
|
|||||||
<title><acronym>WAL</acronym> Configuration</title>
|
<title><acronym>WAL</acronym> Configuration</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
There are several <acronym>WAL</acronym>-related configuration parameters that
|
There are several <acronym>WAL</>-related configuration parameters that
|
||||||
affect database performance. This section explains their use.
|
affect database performance. This section explains their use.
|
||||||
Consult <xref linkend="runtime-config"> for general information about
|
Consult <xref linkend="runtime-config"> for general information about
|
||||||
setting server configuration parameters.
|
setting server configuration parameters.
|
||||||
@ -178,16 +155,17 @@
|
|||||||
<para>
|
<para>
|
||||||
<firstterm>Checkpoints</firstterm><indexterm><primary>checkpoint</></>
|
<firstterm>Checkpoints</firstterm><indexterm><primary>checkpoint</></>
|
||||||
are points in the sequence of transactions at which it is guaranteed
|
are points in the sequence of transactions at which it is guaranteed
|
||||||
that the data files have been updated with all information logged before
|
that the data files have been updated with all information written before
|
||||||
the checkpoint. At checkpoint time, all dirty data pages are flushed to
|
the checkpoint. At checkpoint time, all dirty data pages are flushed to
|
||||||
disk and a special checkpoint record is written to the log file. As a
|
disk and a special checkpoint record is written to the log file.
|
||||||
result, in the event of a crash, the crash recovery procedure knows from
|
In the event of a crash, the crash recovery procedure looks at the latest
|
||||||
what point in the log (known as the redo record) it should start the
|
checkpoint record to determine the point in the log (known as the redo
|
||||||
REDO operation, since any changes made to data files before that point
|
record) from which it should start the REDO operation. Any changes made to
|
||||||
are already on disk. After a checkpoint has been made, any log segments
|
data files before that point are known to be already on disk. Hence, after
|
||||||
written before the redo record are no longer needed and can be recycled
|
a checkpoint has been made, any log segments preceding the one containing
|
||||||
or removed. (When <acronym>WAL</acronym> archiving is being done, the
|
the redo record are no longer needed and can be recycled or removed. (When
|
||||||
log segments must be archived before being recycled or removed.)
|
<acronym>WAL</acronym> archiving is being done, the log segments must be
|
||||||
|
archived before being recycled or removed.)
|
||||||
</para>
|
</para>
|
||||||
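The recycling rule in the revised paragraph (segments preceding the one that contains the redo record are no longer needed, subject to archiving) can be sketched as follows. The function name, the integer segment numbering, and the byte-offset representation of the redo position are hypothetical simplifications; the 16 MB segment size is the usual default but is treated here as a parameter.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size in bytes


def recyclable_segments(existing, redo_offset, archived=None,
                        segment_size=SEGMENT_SIZE):
    """Return the segment numbers that may be recycled or removed.

    existing: iterable of segment numbers currently on disk.
    redo_offset: byte position of the redo record in the log stream.
    archived: set of already-archived segment numbers, or None when
              WAL archiving is not in use.
    """
    redo_segment = redo_offset // segment_size
    # Only segments strictly before the one holding the redo record qualify.
    candidates = [s for s in sorted(existing) if s < redo_segment]
    if archived is not None:
        # With archiving on, a segment must be archived before it is reused.
        candidates = [s for s in candidates if s in archived]
    return candidates


redo = 5 * SEGMENT_SIZE + 100  # redo record lies inside segment 5
no_archiving = recyclable_segments([3, 4, 5, 6], redo)
with_archiving = recyclable_segments([3, 4, 5, 6], redo, archived={3})
```

Segment 5 is kept in both cases because it contains the redo record itself, which matches the wording change from "written before the redo record" to "preceding the one containing the redo record".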
|
|
||||||
<para>
|
<para>
|
||||||
@ -206,7 +184,7 @@
|
|||||||
more often. This allows faster after-crash recovery (since less work
|
more often. This allows faster after-crash recovery (since less work
|
||||||
will need to be redone). However, one must balance this against the
|
will need to be redone). However, one must balance this against the
|
||||||
increased cost of flushing dirty data pages more often. If
|
increased cost of flushing dirty data pages more often. If
|
||||||
<xref linkend="guc-full-page-writes"> is set (the default), there is
|
<xref linkend="guc-full-page-writes"> is set (as is the default), there is
|
||||||
another factor to consider. To ensure data page consistency,
|
another factor to consider. To ensure data page consistency,
|
||||||
the first modification of a data page after each checkpoint results in
|
the first modification of a data page after each checkpoint results in
|
||||||
logging the entire page content. In that case,
|
logging the entire page content. In that case,
|
||||||
@ -228,8 +206,9 @@
|
|||||||
<varname>checkpoint_segments</varname>. Occasional appearance of such
|
<varname>checkpoint_segments</varname>. Occasional appearance of such
|
||||||
a message is not cause for alarm, but if it appears often then the
|
a message is not cause for alarm, but if it appears often then the
|
||||||
checkpoint control parameters should be increased. Bulk operations such
|
checkpoint control parameters should be increased. Bulk operations such
|
||||||
as a COPY, INSERT SELECT etc. may cause a number of such warnings if you
|
as large <command>COPY</> transfers may cause a number of such warnings
|
||||||
do not set <xref linkend="guc-checkpoint-segments"> high enough.
|
to appear if you have not set <varname>checkpoint_segments</> high
|
||||||
|
enough.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
@ -273,8 +252,7 @@
|
|||||||
correspondingly increase shared memory usage. When
|
correspondingly increase shared memory usage. When
|
||||||
<xref linkend="guc-full-page-writes"> is set and the system is very busy,
|
<xref linkend="guc-full-page-writes"> is set and the system is very busy,
|
||||||
setting this value higher will help smooth response times during the
|
setting this value higher will help smooth response times during the
|
||||||
period immediately following each checkpoint. As a guide, a setting of 1024
|
period immediately following each checkpoint.
|
||||||
would be considered to be high.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
@ -310,8 +288,7 @@
|
|||||||
(provided that <productname>PostgreSQL</productname> has been
|
(provided that <productname>PostgreSQL</productname> has been
|
||||||
compiled with support for it) will result in each
|
compiled with support for it) will result in each
|
||||||
<function>LogInsert</function> and <function>LogFlush</function>
|
<function>LogInsert</function> and <function>LogFlush</function>
|
||||||
<acronym>WAL</acronym> call being logged to the server log. The output
|
<acronym>WAL</acronym> call being logged to the server log. This
|
||||||
is too verbose for use as a guide to performance tuning. This
|
|
||||||
option may be replaced by a more general mechanism in the future.
|
option may be replaced by a more general mechanism in the future.
|
||||||
</para>
|
</para>
|
||||||
</sect1>
|
</sect1>
|
||||||
@ -340,15 +317,6 @@
|
|||||||
available stock of numbers.
|
available stock of numbers.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
|
||||||
The <acronym>WAL</acronym> buffers and control structure are in
|
|
||||||
shared memory and are handled by the server child processes; they
|
|
||||||
are protected by lightweight locks. The demand on shared memory is
|
|
||||||
dependent on the number of buffers. The default size of the
|
|
||||||
<acronym>WAL</acronym> buffers is 8 buffers of 8 kB each, or 64 kB
|
|
||||||
total.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
It is of advantage if the log is located on another disk than the
|
It is of advantage if the log is located on another disk than the
|
||||||
main database files. This may be achieved by moving the directory
|
main database files. This may be achieved by moving the directory
|
||||||
|