mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-12 18:34:36 +08:00
8fd733bd19
file to be a symlink. We tried to fix this issue with an earlier server-side patch, but it didn't fix the whole issue. The same bug is present in older releases as well, but the 8.4 train is about to leave the station, and I'm not sure if have consensus on whether we can remove the -l option in back-branches or do we need to attempt a server-side fix to make symlinking safe. Patch by Simon Riggs, per discussion on bug identified by Fujii Masao.
365 lines
12 KiB
Plaintext
365 lines
12 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgstandby.sgml,v 2.10 2009/06/25 12:03:11 heikki Exp $ -->
|
|
|
|
<sect1 id="pgstandby">
|
|
<title>pg_standby</title>
|
|
|
|
<indexterm zone="pgstandby">
|
|
<primary>pg_standby</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<application>pg_standby</> supports creation of a <quote>warm standby</>
|
|
database server. It is designed to be a production-ready program, as well
|
|
as a customizable template should you require specific modifications.
|
|
</para>
|
|
|
|
<para>
|
|
<application>pg_standby</> is designed to be a waiting
|
|
<literal>restore_command</literal>, which is needed to turn a standard
|
|
archive recovery into a warm standby operation. Other
|
|
configuration is required as well, all of which is described in the main
|
|
server manual (see <xref linkend="warm-standby">).
|
|
</para>
|
|
|
|
<para>
|
|
<application>pg_standby</application> features include:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Written in C, so very portable and easy to install
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Easy-to-modify source code, with specifically designated
|
|
sections to modify for your own needs
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Already tested on Linux and Windows
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<sect2>
|
|
<title>Usage</title>
|
|
|
|
<para>
|
|
To configure a standby
|
|
server to use <application>pg_standby</>, put this into its
|
|
<filename>recovery.conf</filename> configuration file:
|
|
</para>
|
|
<programlisting>
|
|
restore_command = 'pg_standby <replaceable>archiveDir</> %f %p %r'
|
|
</programlisting>
|
|
<para>
|
|
where <replaceable>archiveDir</> is the directory from which WAL segment
|
|
files should be restored.
|
|
</para>
|
|
<para>
|
|
The full syntax of <application>pg_standby</>'s command line is
|
|
</para>
|
|
<synopsis>
|
|
pg_standby <optional> <replaceable>option</> ... </optional> <replaceable>archivelocation</> <replaceable>nextwalfile</> <replaceable>xlogfilepath</> <optional> <replaceable>restartwalfile</> </optional>
|
|
</synopsis>
|
|
<para>
|
|
When used within <literal>restore_command</literal>, the <literal>%f</> and
|
|
<literal>%p</> macros should be specified for <replaceable>nextwalfile</>
|
|
and <replaceable>xlogfilepath</> respectively, to provide the actual file
|
|
and path required for the restore.
|
|
</para>
|
|
<para>
|
|
If <replaceable>restartwalfile</> is specified, normally by using the
|
|
<literal>%r</literal> macro, then all WAL files logically preceding this
|
|
file will be removed from <replaceable>archivelocation</>. This minimizes
|
|
the number of files that need to be retained, while preserving
|
|
crash-restart capability. Use of this parameter is appropriate if the
|
|
<replaceable>archivelocation</> is a transient staging area for this
|
|
particular standby server, but <emphasis>not</> when the
|
|
<replaceable>archivelocation</> is intended as a long-term WAL archive area.
|
|
</para>
|
|
<para>
|
|
<application>pg_standby</application> assumes that
|
|
<replaceable>archivelocation</> is a directory readable by the
|
|
server-owning user. If <replaceable>restartwalfile</> (or <literal>-k</>)
|
|
is specified,
|
|
the <replaceable>archivelocation</> directory must be writable too.
|
|
</para>
|
|
<para>
|
|
There are two ways to fail over to a <quote>warm standby</> database server
|
|
when the master server fails:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Smart Failover</term>
|
|
<listitem>
|
|
<para>
|
|
In smart failover, the server is brought up after applying all WAL
|
|
files available in the archive. This results in zero data loss, even if
|
|
the standby server has fallen behind, but if there is a lot of
|
|
unapplied WAL it can be a long time before the standby server becomes
|
|
ready. To trigger a smart failover, create a trigger file containing
|
|
the word <literal>smart</>, or just create it and leave it empty.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>Fast Failover</term>
|
|
<listitem>
|
|
<para>
|
|
In fast failover, the server is brought up immediately. Any WAL files
|
|
in the archive that have not yet been applied will be ignored, and
|
|
all transactions in those files are lost. To trigger a fast failover,
|
|
create a trigger file and write the word <literal>fast</> into it.
|
|
<application>pg_standby</> can also be configured to execute a fast
|
|
failover automatically if no new WAL file appears within a defined
|
|
interval.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
|
|
<table>
|
|
<title><application>pg_standby</> options</title>
|
|
<tgroup cols="3">
|
|
<thead>
|
|
<row>
|
|
<entry>Option</entry>
|
|
<entry>Default</entry>
|
|
<entry>Description</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry><literal>-c</></entry>
|
|
<entry>yes</entry>
|
|
<entry>
|
|
Use <literal>cp</> or <literal>copy</> command to restore WAL files
|
|
from archive.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-d</></entry>
|
|
<entry>no</entry>
|
|
<entry>Print lots of debug logging output on <filename>stderr</>.</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-k</> <replaceable>numfiles</></entry>
|
|
<entry>0</entry>
|
|
<entry>
|
|
Remove files from <replaceable>archivelocation</replaceable> so that
|
|
no more than this many WAL files before the current one are kept in the
|
|
archive. Zero (the default) means not to remove any files from
|
|
<replaceable>archivelocation</replaceable>.
|
|
This parameter will be silently ignored if
|
|
<replaceable>restartwalfile</replaceable> is specified, since that
|
|
specification method is more accurate in determining the correct
|
|
archive cut-off point.
|
|
Use of this parameter is <emphasis>deprecated</> as of
|
|
<productname>PostgreSQL</> 8.3; it is safer and more efficient to
|
|
specify a <replaceable>restartwalfile</replaceable> parameter. A too
|
|
small setting could result in removal of files that are still needed
|
|
for a restart of the standby server, while a too large setting wastes
|
|
archive space.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-r</> <replaceable>maxretries</></entry>
|
|
<entry>3</entry>
|
|
<entry>
|
|
Set the maximum number of times to retry the copy command if it
|
|
fails. After each failure, we wait for <replaceable>sleeptime</> *
|
|
<replaceable>num_retries</>
|
|
so that the wait time increases progressively. So by default,
|
|
we will wait 5 secs, 10 secs, then 15 secs before reporting
|
|
the failure back to the standby server. This will be
|
|
interpreted as end of recovery and the standby will come
|
|
up fully as a result.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-s</> <replaceable>sleeptime</></entry>
|
|
<entry>5</entry>
|
|
<entry>
|
|
Set the number of seconds (up to 60) to sleep between tests to see
|
|
if the WAL file to be restored is available in the archive yet.
|
|
The default setting is not necessarily recommended;
|
|
consult <xref linkend="warm-standby"> for discussion.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-t</> <replaceable>triggerfile</></entry>
|
|
<entry>none</entry>
|
|
<entry>
|
|
Specify a trigger file whose presence should cause failover.
|
|
It is recommended that you use a structured filename to
|
|
avoid confusion as to which server is being triggered
|
|
when multiple servers exist on the same system; for example
|
|
<filename>/tmp/pgsql.trigger.5432</>.
|
|
</entry>
|
|
</row>
|
|
<row>
|
|
<entry><literal>-w</> <replaceable>maxwaittime</></entry>
|
|
<entry>0</entry>
|
|
<entry>
|
|
Set the maximum number of seconds to wait for the next WAL file,
|
|
after which a fast failover will be performed.
|
|
A setting of zero (the default) means wait forever.
|
|
The default setting is not necessarily recommended;
|
|
consult <xref linkend="warm-standby"> for discussion.
|
|
</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Examples</title>
|
|
|
|
<para>On Linux or Unix systems, you might use:</para>
|
|
|
|
<programlisting>
|
|
archive_command = 'cp %p .../archive/%f'
|
|
|
|
restore_command = 'pg_standby -d -s 2 -t /tmp/pgsql.trigger.5442 .../archive %f %p %r 2>>standby.log'
|
|
|
|
recovery_end_command = 'rm -f /tmp/pgsql.trigger.5442'
|
|
</programlisting>
|
|
<para>
|
|
where the archive directory is physically located on the standby server,
|
|
so that the <literal>archive_command</> is accessing it across NFS,
|
|
but the files are local to the standby (enabling use of <literal>ln</>).
|
|
This will:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
produce debugging output in <filename>standby.log</>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
sleep for 2 seconds between checks for next WAL file availability
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
stop waiting only when a trigger file called
|
|
<filename>/tmp/pgsql.trigger.5442</> appears,
|
|
and perform failover according to its content
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
remove the trigger file when recovery ends
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
remove no-longer-needed files from the archive directory
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>On Windows, you might use:</para>
|
|
|
|
<programlisting>
|
|
archive_command = 'copy %p ...\\archive\\%f'
|
|
|
|
restore_command = 'pg_standby -d -s 5 -t C:\pgsql.trigger.5442 ...\archive %f %p %r 2>>standby.log'
|
|
|
|
recovery_end_command = 'del C:\pgsql.trigger.5442'
|
|
</programlisting>
|
|
<para>
|
|
Note that backslashes need to be doubled in the
|
|
<literal>archive_command</>, but <emphasis>not</emphasis> in the
|
|
<literal>restore_command</> or <literal>recovery_end_command</>.
|
|
This will:
|
|
</para>
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
use the <literal>copy</> command to restore WAL files from archive
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
produce debugging output in <filename>standby.log</>
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
sleep for 5 seconds between checks for next WAL file availability
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
stop waiting only when a trigger file called
|
|
<filename>C:\pgsql.trigger.5442</> appears,
|
|
and perform failover according to its content
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
remove the trigger file when recovery ends
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
remove no-longer-needed files from the archive directory
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<para>
|
|
The <literal>copy</> command on Windows sets the final file size
|
|
before the file is completely copied, which would ordinarly confuse
|
|
<application>pg_standby</application>. Therefore
|
|
<application>pg_standby</application> waits <literal>sleeptime</>
|
|
seconds once it sees the proper file size. GNUWin32's <literal>cp</>
|
|
sets the file size only after the file copy is complete.
|
|
</para>
|
|
|
|
<para>
|
|
Since the Windows example uses <literal>copy</> at both ends, either
|
|
or both servers might be accessing the archive directory across the
|
|
network.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Supported server versions</title>
|
|
|
|
<para>
|
|
<application>pg_standby</application> is designed to work with
|
|
<productname>PostgreSQL</> 8.2 and later.
|
|
</para>
|
|
<para>
|
|
<productname>PostgreSQL</> 8.3 provides the <literal>%r</literal> macro,
|
|
which is designed to let <application>pg_standby</application> know the
|
|
last file it needs to keep. With <productname>PostgreSQL</> 8.2, the
|
|
<literal>-k</literal> option must be used if archive cleanup is
|
|
required. This option remains available in 8.3, but its use is deprecated.
|
|
</para>
|
|
<para>
|
|
<productname>PostgreSQL</> 8.4 provides the
|
|
<literal>recovery_end_command</literal> option. Without this option
|
|
a leftover trigger file can be hazardous.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Author</title>
|
|
|
|
<para>
|
|
Simon Riggs <email>simon@2ndquadrant.com</email>
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|