From a164213219260fe88e0678c3db41b1a4f1af1364 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 30 Jun 2000 16:14:21 +0000 Subject: [PATCH] New and revised material for Admin guide, re backup & restore and database management --- doc/src/sgml/admin.sgml | 15 +- doc/src/sgml/backup.sgml | 414 ++++++++++++++++++++++++++++++++++ doc/src/sgml/manage-ag.sgml | 422 +++++++++++++++++++---------------- doc/src/sgml/start-ag.sgml | 123 ---------- doc/src/sgml/user-manag.sgml | 2 +- 5 files changed, 648 insertions(+), 328 deletions(-) create mode 100644 doc/src/sgml/backup.sgml delete mode 100644 doc/src/sgml/start-ag.sgml diff --git a/doc/src/sgml/admin.sgml b/doc/src/sgml/admin.sgml index c8fb66b4112..7eb80a71073 100644 --- a/doc/src/sgml/admin.sgml +++ b/doc/src/sgml/admin.sgml @@ -1,5 +1,5 @@ + + Backup and Restore + + + As everything that contains valuable data, Postgres + databases should be backed up regularly. While the procedure is + essentially simple, it is important to have a basic understanding of + the underlying techniques and assumptions. + + + + There are two fundamentally different approaches to backing up + Postgres data: + + SQL dump + File system level backup + + + + + <acronym>SQL</> Dump + + + The idea behind this method is to generate a text file with SQL + commands that, when fed back to the server, will recreate the + database in the same state as it was at the time of the dump. + Postgres provides the utility program + pg_dump for this purpose. The basic usage of this + command is: + +pg_dump dbname > outfile + + As you see, pg_dump writes its results to the + standard output. We will see below how this can be useful. + + + + pg_dump is a regular Postgres + client application (albeit a particularly clever one). This means + that you can do this backup procedure from any remote host that has + access to the database. But remember that pg_dump + does not operate with special permissions. 
In particular, you must + have read access to all tables that you want to back up, so in + practice you almost always have to be a database superuser. + + + + To specify which database server pg_dump should + contact, use the command line options + + + Like any other Postgres client application, + pg_dump will by default connect with the database + user name that is equal to the current Unix user name. To override + this, either specify the option to force a prompt for + the user name, or set the environment variable + PGUSER. Remember that pg_dump + connections are subject to the normal client authentication + mechanisms (which are described in ). + + + + Dumps created by pg_dump are internally consistent, + that is, updates to the database while pg_dump is + running will not be in the dump. pg_dump does not + block other operations on the database while it is working. + (Exceptions are those operations that need to operate with an + exclusive lock, such as VACUUM.) + + + + + When your database schema relies on OIDs (for instance as foreign + keys) you must instruct pg_dump to dump the OIDs + as well. To do this, use the command line + option. + + + + + Restoring the dump + + + The text files created by pg_dump are intended to + be read in by the psql program. The + general command form to restore a dump is + +psql dbname < infile + + where infile is what + you used as outfile + for the pg_dump command. The database dbname will not be created by this + command, you must do that yourself before executing + psql (e.g., with createdb dbname). psql + supports similar options to pg_dump for + controlling the database server location and the user names. See + its reference page for more information. + + + + If the objects in the original database were owned by different + users, then the dump will instruct psql to connect + as each affected user in turn and then create the relevant + objects. This way the original ownership is preserved. 
This also + means, however, that all these users must already exist, and + furthermore that you must be allowed to connect as each of them. + It might therefore be necessary to temporarily relax the client + authentication settings. + + + + The ability of pg_dump and psql to + write or read from pipes also makes it possible to dump a database + directly from one server to another, for example + + +pg_dump -h host1 dbname | psql -h host2 dbname + + + + + + + Using <command>pg_dumpall</> + + + The above mechanism is cumbersome and inappropriate when backing + up an entire database cluster. For this reason the + pg_dumpall program is provided. + pg_dumpall backs up each database in a given + cluster and also makes sure that the state of global data such as + users and groups is preserved. The call sequence for + pg_dumpall is simply + +pg_dumpall > outfile + + The resulting dumps can be restored with psql as + described above. But in this case it is definitely necessary that + you have database superuser access, as that is required to restore + the user and group information. + + + + pg_dumpall has one little flaw: It is + not prepared for interactively authenticating to each database it + dumps. If you are using password authentication then you need to + set the environment variable PGPASSWORD to + communicate the password to the underlying calls to + pg_dump. More severely, if you have different + passwords set up for each database, then + pg_dumpall will fail. You can either choose a + different authentication mechanism for the purposes of backup or + adjust the pg_dumpall shell script to your + needs. + + + + + Large Databases + + + Acknowledgement + + Originally written by Hannu Krosing + (hannu@trust.ee) on 1999-06-19 + + + + + Since Postgres allows tables larger + than the maximum file size on your system, it can be problematic + to dump the table to a file, since the resulting file will likely + be larger than the maximum size allowed by your system. 
As + pg_dump writes to the standard output, you can + just use standard *nix tools to work around this possible problem. + + + + Use compressed dumps. + + Use your favorite compression program, for example + gzip. + + +pg_dump dbname | gzip > filename.gz + + + Reload with + + +createdb dbname +gunzip -c filename.gz | psql dbname + + + or + + +cat filename.gz | gunzip | psql dbname + + + + + + Use <application>split</>. + + This allows you to split the output into pieces that are + acceptable in size to the underlying file system. For example, to + make chunks of 1 megabyte: + + + +pg_dump dbname | split -b 1m - filename + + + + Reload with + + + +createdb dbname +cat filename.* | psql dbname + + + + + + + + + Caveats + + + pg_dump (and by implication + pg_dumpall) has a few limitations which stem from + the difficulty to reconstruct certain information from the system + catalogs. + + + + Specifically, the order in which pg_dump writes + the objects is not very sophisticated. This can lead to problems + for example when functions are used as column default values. The + only answer is to manually reorder the dump. If you created + circular dependencies in your schema then you will have more work + to do. + + + + Large objects are not handled by pg_dump. The + directory contrib/pg_dumplo of the + Postgres source tree contains a program that can + do that. + + + + Please familiarize yourself with the + pg_dump reference page. + + + + + + File system level backup + + + An alternative backup strategy is to directly copy the files that + Postgres uses to store the data in the database. In + it is explained where these files + are located, but you have probably found them already if you are + interested in this method. 
You can use whatever method you prefer + for doing usual file system backups, for example + + +tar -cf backup.tar /usr/local/pgsql/data + + + + + + There are two restrictions, however, which make this method + impractical, or at least inferior to the pg_dump + method: + + + + + The database server must be shut down in order to + get a usable backup. Half-way measures such as disallowing all + connections will not work as there is always some buffering + going on. For this reason it is also not advisable to trust file + systems that claim to support consistent + snapshots. Information about stopping the server can be + found in . + + + + Needless to say that you also need to shut down the server + before restoring the data. + + + + + + If you have dug into the details of the file system layout you + may be tempted to try to back up or restore only certain + individual tables or databases from their respective files or + directories. This will not work because the + information contained in these files contains only half the + truth. The other half is in the file + pg_log, which contains the commit status of + all transactions. A table file is only usable with this + information. Of course it is also impossible to restore only a + table and the associated pg_log file + because that will render all other tables in the database + cluster useless. + + + + + + + Also note that the file system backup will not necessarily be + smaller than an SQL dump. On the contrary, it will most likely be + larger. (pg_dump does not need to dump + the contents of indices for example, just the commands to recreate + them.) + + + + + + Migration between releases + + + As a general rule, the internal data storage format is subject to + change between releases of Postgres. This does not + apply to different patch levels, these always have + compatible storage formats. For example, releases 6.5.3, 7.0.1, and + 7.1 are not compatible, whereas 7.0.2 and 7.0.1 are. 
When you + update between compatible versions, you can simply reuse the + data area on disk with the new executables. Otherwise you need to + back up your data and restore it on the new + server, using pg_dump. (There are checks in place + that prevent you from doing the wrong thing, so no harm can be done + by confusing these things.) The precise installation procedure is + not the subject of this section; the Installation + Instructions carry these details. + + + + The least downtime can be achieved by installing the new server in + a different directory and running both the old and the new servers + in parallel, on different ports. Then you can use something like + + +pg_dumpall -p 5432 | psql -d template1 -p 6543 + + + to transfer your data, or use an intermediate file if you want. + Then you can shut down the old server and start the new server at + the port the old one was running at. You should make sure that the + database is not updated after you run pg_dumpall, + otherwise you will obviously lose that data. See for information on how to prohibit + access. In practice you probably want to test your client + applications on the new setup before switching over. + + + + If you cannot or do not want to run two servers in parallel you can + do the backup step before installing the new version, bring down + the server, move the old version out of the way, install the new + version, start the new server, and restore the data. For example: + + +pg_dumpall > backup +kill -INT `cat /usr/local/pgsql/postmaster.pid` +mv /usr/local/pgsql /usr/local/pgsql.old +cd /usr/src/postgresql-7.1 +gmake install +initdb -D /usr/local/pgsql/data +postmaster -D /usr/local/pgsql/data +psql < backup + + + See about ways to start and stop the + server and other details. The installation instructions will advise + you of strategic places to perform these steps. + + + + + When you move the old installation out of the way + it is no longer perfectly usable. 
Some parts of the installation + contain information about where the other parts are located. This + is usually not a big problem but if you plan on using two + installations in parallel for a while you should assign them + different installation directories at build time. + + + + diff --git a/doc/src/sgml/manage-ag.sgml b/doc/src/sgml/manage-ag.sgml index d27a2094a07..a05b984ef55 100644 --- a/doc/src/sgml/manage-ag.sgml +++ b/doc/src/sgml/manage-ag.sgml @@ -1,79 +1,221 @@ - - Managing a Database + + Managing Databases + + + A database is a named collection of SQL objects (database + objects); every database object (table, function, etc.) + belongs to one and only one database. An application that connects + to the database server specifies with its connection request the + name of the database it wants to connect to. It is not possible to + access more than one database per connection. (But an application + is not restricted in the number of connections it opens to the same + or other databases.) + + + + + SQL calls databases catalogs, but there is no + difference in practice. + + + + + In order to create or drop databases, the Postgres + postmaster must be up and running (see ). + + + + Creating a Database - If the Postgres - postmaster is up and running we can create - some databases to experiment with. Here, we describe the - basic commands for managing a database. + Databases are created with the query language command + CREATE DATABASE: + +CREATE DATABASE name + + where name can be chosen freely. (Depending on the + current implementation, certain characters that are special to the + underlying operating system might be prohibited. There will be + run-time checks for that.) The current user automatically becomes + the owner of the new database. It is the privilege of the owner of + a database to remove it later on (which also removes all the + objects in it, even if they have a different owner). 
+ - - Creating a Database + + The creation of databases is a restricted operation. See how to grant permission. + + + Bootstrapping - Let's say you want to create a database named mydb. - You can do this with the following command: - - -% createdb dbname - - - Postgres allows you to create - any number of databases - at a given site and you automatically become the - database administrator of the database you just created. - Database names must have an alphabetic first - character and are limited to 31 characters in length. - Not every user has authorization to become a database - administrator. If Postgres - refuses to create databases - for you, then the site administrator needs to grant you - permission to create databases. Consult your site - administrator if this occurs. + Since you need to be connected to the database server in order to + execute the CREATE DATABASE command, the + question remains how the first database at any given + site can be created. The first database is always created by the + initdb command when the data storage area is + initialized. (See .) This + database is called template1 and cannot be deleted. So + to create the first real database you can connect to + template1. - + - - Accessing a Database + + The name template1 is no accident: When a new + database is created, the template database is essentially cloned. + This means that any changes you make in template1 are + propagated to all subsequently created databases. This implies that + you should not use the template database for real work, but when + used judiciously this feature can be convenient. + + + + As an extra convenience, there is also a program that you can + execute from the shell to create new databases, + createdb. + + +createdb dbname + + + createdb does no magic. It connects to the template1 + database and executes the CREATE DATABASE command, + exactly as described above. It uses the psql program + internally. 
The reference page on createdb contains the invocation + details. In particular, createdb without any arguments will create + a database with the current user name, which may or may not be what + you want. + + + + Alternative Locations - Once you have constructed a database, you can access it - by: + It is possible to create a database in a location other than the + default. Remember that all database access occurs through the + database server backend, so that any location specified must be + accessible by the backend. + - - - - running the Postgres terminal monitor program - (psql) which allows you to interactively - enter, edit, and execute SQL commands. - - + + Alternative database locations are referenced by an environment + variable which gives the absolute path to the intended storage + location. This environment variable must have been defined before + the backend was started. Any valid environment variable name may + be used to reference an alternative location, although using + variable names with a prefix of PGDATA is recommended + to avoid confusion and conflict with other variables. + - - - writing a C program using the libpq subroutine - library. This allows you to submit SQL commands - from C and get answers and status messages back to - your program. This interface is discussed further - in the PostgreSQL Programmer's Guide. - - - + + To create the variable in the environment of the server process + you must first shut down the server, define the variable, + initialize the data area, and finally restart the server. (See + and .) To set an environment variable, type + + +PGDATA2=/home/postgres/data +export PGDATA2 + + + in Bourne shells, or + + +setenv PGDATA2 /home/postgres/data + + + in csh or tcsh. You have to make sure that this environment + variable is always defined in the server environment, otherwise + you won't be able to access that database. Therefore you probably + want to set it in some sort of shell startup file or server + startup script. 
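As a minimal sketch of the Bourne-shell case, assuming a POSIX `sh`: a plain assignment stays local to the current shell, so a server started from that shell sees `PGDATA2` only once the variable has been exported. The inner `sh -c` below is merely a stand-in for the server process, not a real postmaster invocation, and the path is the example value used in the text.

```shell
# Start from a clean slate so an inherited PGDATA2 does not skew the result.
unset PGDATA2

# Plain assignment: visible in this shell only.
PGDATA2=/home/postgres/data

# A child process (stand-in for the server) does not see it yet.
before=$(sh -c 'printf "%s" "${PGDATA2:-unset}"')

# Export places the variable in the environment of child processes.
export PGDATA2
after=$(sh -c 'printf "%s" "${PGDATA2:-unset}"')

echo "child sees before export: $before"
echo "child sees after export:  $after"
```

The same reasoning applies to a server startup script: the variable has to be exported in the environment from which the server is launched.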
+ - You might want to start up psql, - to try out the examples in this manual. It can be activated for the - dbname database by typing the command: + + To create a data storage area in PGDATA2, ensure that + /home/postgres already exists and is writable + by the user account that runs the server (see ). Then from the command line, type + + +initlocation PGDATA2 + + + Then you can restart the server. + + + + To create a database at the new location, use the command + +CREATE DATABASE name WITH LOCATION = 'location' + + where location is the environment variable you + used, PGDATA2 in this example. The createdb + command has the option + + + Databases created at alternative locations using this method can be + accessed and dropped like any other database. + + + + + It can also be possible to specify absolute paths directly to the + CREATE DATABASE command without defining environment + variables. This is disallowed by default because it is a security + risk. To allow it, you must compile Postgres with + the C preprocessor macro ALLOW_ABSOLUTE_DBPATHS + defined. One way to do this is to run the compilation step like + this: gmake COPT=-DALLOW_ABSOLUTE_DBPATHS all. + + + + + + + + Accessing a Database + + + Once you have constructed a database, you can access it by: + + + + + running the Postgres terminal monitor program + (psql) which allows you to interactively + enter, edit, and execute SQL commands. + + + + + + writing a C program using the libpq subroutine + library. This allows you to submit SQL commands + from C and get answers and status messages back to + your program. This interface is discussed further + in the PostgreSQL Programmer's Guide. + + + + + You might want to start up psql, + to try out the examples in this manual. It can be activated for the + dbname database by typing the command: psql dbname - You will be greeted with the following message: + You will be greeted with the following message: Welcome to psql, the PostgreSQL interactive terminal. 
@@ -138,151 +280,39 @@ Type: \copyright for distribution terms are denoted by "/* ... */", a convention borrowed from Ingres. - + - - Destroying a Database + + Destroying a Database - - If you are the database administrator for the database - mydb, you can destroy it using the following Unix command: + + Databases are destroyed with the command DROP DATABASE: + +DROP DATABASE name + + Only the owner of the database (i.e., the user that created it) can + drop it. Dropping a database removes all objects that were + contained within the database. The destruction of a database cannot + be undone. + - -% dropdb dbname - + + You cannot execute the DROP DATABASE command + while connected to the victim database. You can, however, be + connected to any other database, including the template1 database, + which would be the only option for dropping the last database of a + given cluster. + - This action physically removes all of the Unix files - associated with the database and cannot be undone, so - this should only be done with a great deal of forethought. - - - - It is also possible to destroy a database from within an - SQL session by using - - -> drop database dbname - - - - - - Backup and Restore - - - - Every database should be backed up on a regular basis. Since - Postgres manages it's own files in the - file system, it is not advisable to rely on - system backups of your file system for your database backups; - there is no guarantee that the files will be in a usable, - consistant state after restoration. - - - - - Postgres provides two utilities to - backup your system: pg_dump to backup - individual databases and - pg_dumpall to backup your installation - in one step. - - - - An individual database can be backed up using the following - command: - - -% pg_dump dbname > dbname.pgdump - - - and can be restored using - - -cat dbname.pgdump | psql dbname - - - - - This technique can be used to move databases to new - locations, and to rename existing databases. 
- - - - Large Databases - - - Author - - Written by Hannu Krosing on - 1999-06-19. - - - - - Since Postgres allows tables larger - than the maximum file size on your system, it can be problematic - to dump the table to a file, since the resulting file will likely - be larger than the maximum size allowed by your system. - - - As pg_dump writes to stdout, - you can just use standard *nix tools - to work around this possible problem: - - - - - Use compressed dumps: - - -% pg_dump dbname | gzip > filename.dump.gz - - - reload with - - -% createdb dbname -% gunzip -c filename.dump.gz | psql dbname - - -or - - -% cat filename.dump.gz | gunzip | psql dbname - - - - - - - Use split: - - -% pg_dump dbname | split -b 1m - filename.dump. - - -reload with - - -% createdb dbname -% cat filename.dump.* | pgsql dbname - - - - - - - - Of course, the name of the file - (filename) and the - content of the pg_dump output need not - match the name of the database. Also, the restored database can - have an arbitrary new name, so this mechanism is also suitable - for renaming databases. - - - + + For convenience, there is also a shell program to drop databases: + +dropdb dbname + + (Unlike createdb, it is not the default action to drop + the database with the current user name.) + + - - - Disk Management - - - Alternate Locations - - - It is possible to create a database in a location other than the default - location for the installation. Remember that all database access actually - occurs through the database backend, so that any location specified must - be accessible by the backend. - - - - Alternate database locations are created and referenced by an environment variable - which gives the absolute path to the intended storage location. - This environment variable must have been defined before the backend was started - and must be writable by the postgres administrator account. 
- Any valid environment variable name may be used to reference an alternate - location, although using variable name with a prefix of PGDATA is recommended - to avoid confusion and conflict with other variables. - - - - - In previous versions of Postgres, - it was also permissable to use an absolute path name - to specify an alternate storage location. - The environment variable style of specification - is to be preferred since it allows the site administrator more flexibility in - managing disk storage. - If you prefer using absolute paths, you may do so by defining - "ALLOW_ABSOLUTE_DBPATHS" and recompiling Postgres - To do this, either add this line - - -#define ALLOW_ABSOLUTE_DBPATHS 1 - - - to the file src/include/config.h, or by specifying - - - CFLAGS+= -DALLOW_ABSOLUTE_DBPATHS - - - in your Makefile.custom. - - - - - Remember that database creation is actually performed by the database backend. - Therefore, any environment variable specifying an alternate location must have - been defined before the backend was started. To define an alternate location - PGDATA2 pointing to /home/postgres/data, first type - - -% setenv PGDATA2 /home/postgres/data - - - to define the environment variable to be used with subsequent commands. - Usually, you will want to define this variable in the - Postgres superuser's - .profile - or - .cshrc - initialization file to ensure that it is defined upon system startup. - Any environment variable can be used to reference alternate location, - although it is preferred that the variables be prefixed with "PGDATA" - to eliminate confusion and the possibility of conflicting with or - overwriting other variables. - - - - To create a data storage area in PGDATA2, ensure - that /home/postgres already exists and is writable - by the postgres administrator. 
- Then from the command line, type - - -% setenv PGDATA2 /home/postgres/data -% initlocation $PGDATA2 -Creating Postgres database system directory /home/postgres/data - -Creating Postgres database system directory /home/postgres/data/base - - - - - - To test the new location, create a database test by typing - - -% createdb -D PGDATA2 test -% dropdb test - - - - - - - diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml index 255b5f9801a..942dde5b35f 100644 --- a/doc/src/sgml/user-manag.sgml +++ b/doc/src/sgml/user-manag.sgml @@ -56,7 +56,7 @@ CREATE USER name constrained in its login name by her real name.) - + User attributes