August version of docs superceded by sgml version from Jan.

This commit is contained in:
Thomas G. Lockhart 1998-11-06 15:04:31 +00:00
parent a00c668139
commit 3d22596fe7

View File

@ -1,448 +0,0 @@
PL/pgSQL
A procedural language for the PostgreSQL RDBMS
Jan Wieck <jwieck@debis.com>
Preface
PL/pgSQL is a procedural language based on SQL designed for
the PostgreSQL database system.
The extensibility features of PostgreSQL are mostly based on
the ability to define functions for various operations.
Functions could have been written in PostgreSQL's SQL dialect
or in the C programming language. Functions written in C are
compiled into a shared object and loaded by the database
backend process on demand. Also the trigger features of
PostgreSQL are based on functions but required the use of the
C language.
Since version 6.3 PostgreSQL supports the definition of
procedural languages. In the case of a function or trigger
procedure defined in a procedural language, the database has
no builtin knowlege how to interpret the functions source
text. Instead, the function and trigger calls are passed into
a handler that knows the details of the language. The
handler itself is a function compiled into a shared object
and loaded on demand.
Overview
The PL/pgSQL language is case insensitive. All keywords and
identifiers can be used in upper-/lowercase mixed.
PL/pgSQL is a block oriented language. A block is defined as
[<<label>>]
[DECLARE
-- declarations]
BEGIN
-- statements
END;
There can be any number of subblocks in the statements
section of a block. Subblocks can be used to hide variables
from outside a block of statements (see Scope and visability
below). The variables declared in the declarations section
preceding a block are initialized to their default values
every time the block is entered, not only once per function
call.
It is important not to misunderstand the meaning of BEGIN/END
for grouping statements in PL/pgSQL and the database commands
for transaction control. Functions or trigger procedures
cannot start or commit transactions and PostgreSQL
transactions cannot have subtransactions.
Comments
There are two types of comments in PL/pgSQL. A double dash
'--' starts a comment that extends to the end of the line. A
'/*' starts a block comment that extends to the next '*/'.
Block comments cannot be nested, but double dash comments can
be enclosed into a block comment and double dashes can hide
'/*' and '*/'.
Declarations
All variables, rows and records used in a block or it's
subblocks must be declared in the declarations section of the
block except for the loop variable of a FOR loop iterating
over a range of integer values. The parameters given to the
function are automatically declared with the usual
identifiers $n. The declarations have the following syntax:
<name> [CONSTANT] <type> [NOT NULL]
[DEFAULT | := <value>];
Declares a variable of the specified type. If the
variable is declared as CONSTANT, the value cannot be
changed. If NOT NULL is specified, an assignment of a
NULL value results in a runtime error. Since the
default value of a variable is the SQL NULL value,
all variables declared as NOT NULL must also have a
default value.
The default value is evaluated at the actual function
call. So assigning 'now' to an abstime variable
causes the variable to have the time of the actual
function call, not when the function was compiled
(during it's first call since the lifetime of the
database connection).
<name> <class>%ROWTYPE;
Declares a row with the structure of the given class.
Class must be an existing table- or viewname of the
database. The fields of the row are accessed in the
dot notation. Parameters to a procedure could be
tuple types. In that case the corresponding
identifier $n will be a rowtype. Only the user
attributes of a tuple are accessible in the row.
There must be no whitespaces between the classname,
the percent and the ROWTYPE keyword.
The fields of the rowtype inherit the tables
fieldsizes for char() etc. data types (atttypmod
from pg_attribute).
<name> RECORD;
Records are similar to rowtypes, but they have no
predefined structure They are used in selections and
FOR loops to hold one actual database tuple from a
select operation. One and the same record can be used
in different selections. Accessing a record or an
attempt to assign a value to a record field when
there's no actual tuple in it results in a runtime
error.
The new and old tuples in triggers are given to the
trigger procedure as records. This is necessary,
because under PostgreSQL one and the same trigger
procedure can handle trigger events for different
tables.
<name> ALIAS FOR $n;
For better readability of the code it's possible to
define an alias for a positional parameter to the
function.
RENAME <oldname> TO <newname>;
Change the name of a variable, record or rowtype.
This is useful if new or old should be referenced by
another name inside a trigger procedure.
Datatypes
The type of a variable can be any of the existing data types
of the database. <type> above is defined as:
postgesql-basetype
or variable%TYPE
or rowtype.field%TYPE
or class.field%TYPE
As for the rowtype declaration, there must be no whitespaces
between the classname, the percent and the TYPE keyword.
Expressions
All expressions used in PL/pgSQL statements are processed
using the backends executor. Since even a constant looking
expression can have a totally different meaning for a
particular data type (as 'now' for abstime), it is impossible
for the PL/pgSQL parser to identify real constant values
other than the NULL keyword. The expressions are evaluated by
internally executing a query
SELECT <expr>
over the SPI manager. In the expression, occurences of
variable identifiers are substituted by parameters and the
actual values from the variables are passed to the executor
as query parameters. All the expressions used in a PL/pgSQL
function are only prepared and saved once.
If record fields are used in expressions or database
statements, the data types of the fields should not change
between calls of one and the same expression. Keep this in
mind when writing trigger procedures that handle events for
more than one table.
Statements
Anything not understood by the parser as specified below will
be put into a query and sent down to the database engine to
execute. The resulting query should not return any data
(insert, update, delete queries and all utility statements).
Assignment
An assignment of a value to a variable or rowtype field
is written as:
<identifier> := <expr>;
If the expressions result data type doesn't match the
variables data type, or the variables atttypmod value is
known (as for char(20)), the result value will be
implicitly casted by the PL/pgSQL executor using the
result types output- and the variables type input-
functions. Note that this could potentially result in
runtime errors generated by the types input functions.
An assignment of a complete selection into a record or
rowtype can be done as:
SELECT expressions INTO <target> FROM fromlist;
Target can be a record or rowtype variable, or a comma
separated list of variables and record/row fields.
If a rowtype or a variable list is used as target, the
selected values must exactly match the structure of the
target(s) or a runtime error occurs. The fromlist can be
followed by any valid qualification, grouping, sorting
etc.
There is a special condition [NOT] FOUND that can be used
immediately after a SELECT INTO to check if the data has
been found.
SELECT * INTO myrec FROM EMP WHERE empname = myname;
IF NOT FOUND THEN
RAISE EXCEPTION 'employee % not found', myname;
END IF;
If the selection returns multiple rows, only the first is
moved into the target fields. All others are discarded.
Calling another function
If a function should be called, this is normally done by
a SELECT query. But there are cases where someone isn't
interested in the functions result.
PERFORM querystring;
executes a 'SELECT querystring' over the SPI manager and
discards the result.
Returning from the function
RETURN <expr>;
The function terminates and the value of <expr> will be
returned to the upper executor. The return value of a
function cannot be undefined. If control reaches the end
of the toplevel block of the function without hitting a
RETURN statement, a runtime error will occur.
Aborting and messages
As indicated above there is an RAISE statement that can
throw messages into the PostgreSQL elog mechanism.
RAISE level 'format' [, identifier [...]];
Inside the format, % can be used as a placeholder for the
following, comma separated identifiers. The identifiers
must specify an existing variable or row/record field.
Conditionals
IF <expr> THEN
-- statements
[ELSE
-- statements]
END IF;
The expression <expr> must return a value that at least
can be casted into a boolean.
Loops
There are multiple types of loops.
[<<label>>]
LOOP
-- statements
END LOOP;
An unconditional loop that must be terminated explicitly
by an EXIT statement. The optional label can be used by
EXIT statements of nested loops to specify which level of
nesting should be terminated.
[<<label>>]
WHILE <expr> LOOP
-- statements
END LOOP;
A conditional loop that is executed as long as the
evaluation of <expr> returns true.
[<<label>>]
FOR <name> IN [REVERSE] <expr>..<expr> LOOP
-- statements
END LOOP.
A loop that iterates over a range of integer values. The
variable <name> is automatically created as type integer
and exists only inside the loop. The two expressions
giving the lower and upper bound of the range are
evaluated only when entering the loop. The iteration step
is 1.
FOR <recname|rowname> IN <select_clause> LOOP
-- statements
END LOOP;
The record or row is assigned all the rows resulting from
the select clause and the statements executed for each.
If the loop is terminated with an EXIT statement, the
last accessed row is still accessible in the record or
rowtype.
EXIT [label] [WHEN <expr>];
If no label given, the innermost loop is terminated and
the statement following END LOOP is executed next. If
label is given, it must be the label of the current or an
upper level of nested loops or blocks. Then the named
loop or block is terminated and control continues with
the statement after the loops/blocks corresponding END.
Trigger procedures
PL/pgSQL can also be used to define trigger procedures. They
are created using CREATE FUNCTION as a function with no
arguments and a return type of opaque.
There are some PostgreSQL specific details in functions used
as trigger procedures.
First they have some special variables created above the
toplevel statement block. These are:
new (record)
The new database tuple on INSERT/UPDATE operations at
ROW level.
old (record)
The old database tuple on UPDATE/DELETE operations at
ROW level.
tg_name (type name)
The triggers name from pg_trigger.
tg_when (type text)
A string of either 'BEFORE' or 'AFTER' depending on
the triggers definition.
tg_level (type text)
A string of either 'ROW' or 'STATEMENT' depending on
the triggers definition.
tg_op (type text)
A string of 'INSERT', 'UPDATE' or 'DELETE' telling
for which operation the trigger is actually fired.
tg_relid (type oid)
The Oid of the relation for which the trigger is
actually fired.
tg_relname (type name)
The relations name for which the trigger is actually
fired.
tg_nargs (type integer)
The number of arguments given to the trigger
procedure in the CREATE TRIGGER statement.
tg_argv[] (types text)
The arguments from the CREATE TRIGGER statement. The
index counts from 0 and can be given as expression.
Invalid indices (< 0 or >= tg_nargs) result in a NULL
value.
Second, they must return either NULL, or a record/rowtype
containing exactly the structure of the table the trigger was
fired for. Triggers fired AFTER might allways return NULL
with no effect. Triggers fired BEFORE signal the trigger
manager to skip the operation for this actual row when
returning NULL. Otherwise, the returned record/rowtype
replaces the inserted/updated tuple in the operation. It is
possible to replace single values directly in new and return
that, or to build a complete new record/rowtype to return.
Exceptions
PostgreSQL doesn't have a very smart exception handling
model. Whenever the parser, planner/optimizer or executor
decide that a statement cannot be processed any longer, the
whole transaction gets aborted and the the system jumps back
into the mainloop using longjmp() to get the next query from
the client application.
It is possible to hook into the longjmp() mechanism to notice
that this happens. But currently it's impossible to tell what
really caused the abort (input/output conversion error,
floating point error, parse error) And it's possible that the
backend is in an inconsistent state at this point so
returning to the upper executor or issuing more commands
might corrupt the whole database.
Thus, the only thing PL/pgSQL currently does when it
encounters an abort during execution of a function or trigger
procedure is to write some additional DEBUG log messages
telling in which function and where (line number and type of
statement) this happened.
This might change in the future.