mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-24 18:55:04 +08:00
Add pg_upgrade IMPLEMENTATION file to CVS.
This commit is contained in:
parent
6c4a98d99c
commit
a898199df5
121
contrib/pg_upgrade/IMPLEMENTATION
Normal file
121
contrib/pg_upgrade/IMPLEMENTATION
Normal file
@ -0,0 +1,121 @@
|
||||
------------------------------------------------------------------------------
|
||||
PG_MIGRATOR: IN-PLACE UPGRADES FOR POSTGRESQL
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
Upgrading a PostgreSQL database from one major release to another can be
|
||||
an expensive process. For minor upgrades, you can simply install new
|
||||
executables and forget about upgrading existing data. But for major
|
||||
upgrades, you have to export all of your data using pg_dump, install the
|
||||
new release, run initdb to create a new cluster, and then import your
|
||||
old data. If you have a lot of data, that can take a considerable amount
|
||||
of time. If you have too much data, you may have to buy more storage
|
||||
since you need enough room to hold the original data plus the exported
|
||||
data. pg_migrator can reduce the amount of time and disk space required
|
||||
for many upgrades.
|
||||
|
||||
The URL http://momjian.us/main/writings/pgsql/pg_migrator.pdf contains a
|
||||
presentation about pg_migrator internals that mirrors the text
|
||||
description below.
|
||||
|
||||
------------------------------------------------------------------------------
|
||||
WHAT IT DOES
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
pg_migrator is a tool that performs an in-place upgrade of existing
|
||||
data. Some upgrades change the on-disk representation of data;
|
||||
pg_migrator cannot help in those upgrades. However, many upgrades do
|
||||
not change the on-disk representation of a user-defined table. In those
|
||||
cases, pg_migrator can move existing user-defined tables from the old
|
||||
database cluster into the new cluster.
|
||||
|
||||
There are two factors that determine whether an in-place upgrade is
|
||||
practical.
|
||||
|
||||
Every table in a cluster shares the same on-disk representation of the
|
||||
table headers and trailers and the on-disk representation of tuple
|
||||
headers. If this changes between the old version of PostgreSQL and the
|
||||
new version, pg_migrator cannot move existing tables to the new cluster;
|
||||
you will have to pg_dump the old data and then import that data into the
|
||||
new cluster.
|
||||
|
||||
Second, all data types should have the same binary representation
|
||||
between the two major PostgreSQL versions.
|
||||
|
||||
------------------------------------------------------------------------------
|
||||
HOW IT WORKS
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
To use pg_migrator during an upgrade, start by installing a fresh
|
||||
cluster using the newest version in a new directory. When you've
|
||||
finished installation, the new cluster will contain the new executables
|
||||
and the usual template0, template1, and postgres, but no user-defined
|
||||
tables. At this point, you can shut down the old and new postmasters and
|
||||
invoke pg_migrator.
|
||||
|
||||
When pg_migrator starts, it ensures that all required executables are
|
||||
present and contain the expected version numbers. The verification
|
||||
process also checks the old and new $PGDATA directories to ensure that
|
||||
the expected files and subdirectories are in place. If the verification
|
||||
process succeeds, pg_migrator starts the old postmaster and runs
|
||||
pg_dumpall --schema-only to capture the metadata contained in the old
|
||||
cluster. The script produced by pg_dumpall will be used in a later step
|
||||
to recreate all user-defined objects in the new cluster.
|
||||
|
||||
Note that the script produced by pg_dumpall will only recreate
|
||||
user-defined objects, not system-defined objects. The new cluster will
|
||||
contain the system-defined objects created by the latest version of
|
||||
PostgreSQL.
|
||||
|
||||
Once pg_migrator has extracted the metadata from the old cluster, it
|
||||
performs a number of bookkeeping tasks required to 'sync up' the new
|
||||
cluster with the existing data.
|
||||
|
||||
First, pg_migrator renames any tablespace directories in the old cluster
|
||||
--- the new cluster will need to use the same tablespace directories and
|
||||
will complain if those directories exist when pg_migrator imports the
|
||||
metadata in a later step. It then freeze all transaction information
|
||||
stored in old server rows.
|
||||
|
||||
Next, pg_migrator copies the commit status information and 'next
|
||||
transaction ID' from the old cluster to the new cluster. This is the
|
||||
steps ensures that the proper tuples are visible from the new cluster.
|
||||
Remember, pg_migrator does not export/import the content of user-defined
|
||||
tables so the transaction IDs in the new cluster must match the
|
||||
transaction IDs in the old data. pg_migrator also copies the starting
|
||||
address for write-ahead logs from the old cluster to the new cluster.
|
||||
|
||||
Now pg_migrator begins reconstructing the metadata obtained from the old
|
||||
cluster using the first part of the pg_dumpall output. Once all of the
|
||||
databases have been created in the new cluster, pg_migrator tackles the
|
||||
problem of naming toast relations. Toast tables are used to store
|
||||
oversized data out-of-line, i.e., in a separate file. When the server
|
||||
decides to move a datum out of a tuple and into a toast table, it stores
|
||||
a pointer in the original slot in the tuple. That pointer contains the
|
||||
relfilenode (i.e. filename) of the toast table. That means that any
|
||||
table which contains toasted data will contain the filename of the toast
|
||||
table in each toast pointer. Therefore, it is very important that toast
|
||||
tables retain their old names when they are created in the new cluster.
|
||||
CREATE TABLE does not offer any explicit support for naming toast
|
||||
tables. To ensure that the toast table names retain their old names,
|
||||
pg_migrator reserves the name of each toast table before importing the
|
||||
metadata from the old cluster. To reserve a filename, pg_migrator simply
|
||||
creates an empty file with the appropriate name and the server avoids
|
||||
that name when it detects a collision.
|
||||
|
||||
Next, pg_migrator executes the remainder of the script produced earlier
|
||||
by pg_dumpall --- this script effectively creates the complete
|
||||
user-defined metadata from the old cluster to the new cluster. When that
|
||||
script completes, pg_migrator, after shutting down the new postmaster,
|
||||
deletes the placeholder toast tables and sets the proper toast tuple
|
||||
names into the new cluster.
|
||||
|
||||
Finally, pg_migrator links or copies each user-defined table and its
|
||||
supporting indexes and toast tables from the old cluster to the new
|
||||
cluster. In this last step, pg_migrator assigns a new name to each
|
||||
relation so it matches the pg_class.relfilenode in the new
|
||||
cluster. Toast file names are preserved, as outlined above.
|
||||
|
||||
An important feature of the pg_migrator design is that it leaves the
|
||||
original cluster intact --- if a problem occurs during the upgrade, you
|
||||
can still run the previous version, after renaming the tablespaces back
|
||||
to the original names.
|
Loading…
Reference in New Issue
Block a user