From d849c5d182f93a8b26ef66d7b71b0380e9f3c0f1 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 14 Jul 2007 23:02:25 +0000 Subject: [PATCH] Editorial overhaul of plpgsql documentation. Provide detailed documentation of variable substitution and plan caching behavior in dedicated sections. (A lot of this material existed already, but was scattered in various places in the chapter.) Reorganize material a little bit, mostly to try to avoid diving into deep details in the first introductory sections. Document some fine points that had escaped treatment before, notably the ability to qualify plpgsql variable names with block labels. Some minor wordsmithing here and there. --- doc/src/sgml/plpgsql.sgml | 1325 ++++++++++++++++++++++--------------- 1 file changed, 797 insertions(+), 528 deletions(-) diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index 18ba19dd77..562dfad728 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -1,4 +1,4 @@ - + <application>PL/pgSQL</application> - <acronym>SQL</acronym> Procedural Language @@ -7,6 +7,9 @@ PL/pgSQL + + Overview + PL/pgSQL is a loadable procedural language for the PostgreSQL database @@ -48,101 +51,13 @@ - Except for input/output conversion and calculation functions - for user-defined types, anything that can be defined in C language - functions can also be done with PL/pgSQL. + Functions created with PL/pgSQL can be + used anywhere that built-in functions could be used. For example, it is possible to create complex conditional computation functions and later use them to define operators or use them in index expressions. - - Overview - - - The PL/pgSQL call handler parses the function's source text and - produces an internal binary instruction tree the first time the - function is called (within each session). The instruction tree - fully translates the - PL/pgSQL statement structure, but individual - SQL expressions and SQL commands - used in the function are not translated immediately. - - - - As each expression and SQL command is first - used in the function, the PL/pgSQL interpreter - creates a prepared execution plan (using the - SPI manager's SPI_prepare - and SPI_saveplan - functions).preparing a queryin - PL/pgSQL Subsequent visits to that expression or command - reuse the prepared plan. Thus, a function with conditional code - that contains many statements for which execution plans might be - required will only prepare and save those plans that are really - used during the lifetime of the database connection. This can - substantially reduce the total amount of time required to parse - and generate execution plans for the statements in a - PL/pgSQL function. A disadvantage is that errors - in a specific expression or command cannot be detected until that - part of the function is reached in execution. - - - - Once PL/pgSQL has made an execution plan for a particular - command in a function, it will reuse that plan for the life of the - database connection. This is usually a win for performance, but it - can cause some problems if you dynamically - alter your database schema. For example: - - -CREATE FUNCTION populate() RETURNS integer AS $$ -DECLARE - -- declarations -BEGIN - PERFORM my_function(); -END; -$$ LANGUAGE plpgsql; - - - If you execute the above function, it will reference the OID for - my_function() in the execution plan produced for - the PERFORM statement. Later, if you - drop and recreate my_function(), then - populate() will not be able to find - my_function() anymore. You would then have to - recreate populate(), or at least start a new - database session so that it will be compiled afresh. Another way - to avoid this problem is to use CREATE OR REPLACE - FUNCTION when updating the definition of - my_function (when a function is - replaced, its OID is not changed). - - - - Because PL/pgSQL saves execution plans - in this way, SQL commands that appear directly in a - PL/pgSQL function must refer to the - same tables and columns on every execution; that is, you cannot use - a parameter as the name of a table or column in an SQL command. To get - around this restriction, you can construct dynamic commands using - the PL/pgSQL EXECUTE - statement — at the price of constructing a new execution plan on - every execution. - - - - - The PL/pgSQL - EXECUTE statement is not related to the - SQL - statement supported by the - PostgreSQL server. The server's - EXECUTE statement cannot be used within - PL/pgSQL functions (and is not needed). - - - Advantages of Using <application>PL/pgSQL</application> @@ -167,23 +82,22 @@ $$ LANGUAGE plpgsql; computation and a series of queries inside the database server, thus having the power of a procedural language and the ease of use of SQL, but with considerable - savings because you don't have the whole client/server - communication overhead. + savings of client/server communication overhead. - Elimination of additional round trips between - client and server + Extra round trips between + client and server are eliminated Intermediate results that the client does not - need do not need to be marshaled or transferred between server + need do not have to be marshaled or transferred between server and client - There is no need for additional rounds of query - parsing + Multiple rounds of query + parsing can be avoided - This can allow for a considerable performance increase as + This can result in a considerable performance increase as compared to an application that does not use stored functions. @@ -244,201 +158,6 @@ $$ LANGUAGE plpgsql; - - Tips for Developing in <application>PL/pgSQL</application> - - - One good way to develop in - PL/pgSQL is to use the text editor of your - choice to create your functions, and in another window, use - psql to load and test those functions. - If you are doing it this way, it - is a good idea to write the function using CREATE OR - REPLACE FUNCTION. That way you can just reload the file to update - the function definition. For example: - -CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $$ - .... -$$ LANGUAGE plpgsql; - - - - - While running psql, you can load or reload such - a function definition file with: - -\i filename.sql - - and then immediately issue SQL commands to test the function. - - - - Another good way to develop in PL/pgSQL is with a - GUI database access tool that facilitates development in a - procedural language. One example of such as a tool is - PgAccess, although others exist. These tools often - provide convenient features such as escaping single quotes and - making it easier to recreate and debug functions. - - - - Handling of Quotation Marks - - - The code of a PL/pgSQL function is specified in - CREATE FUNCTION as a string literal. If you - write the string literal in the ordinary way with surrounding - single quotes, then any single quotes inside the function body - must be doubled; likewise any backslashes must be doubled (assuming - escape string syntax is used). - Doubling quotes is at best tedious, and in more complicated cases - the code can become downright incomprehensible, because you can - easily find yourself needing half a dozen or more adjacent quote marks. - It's recommended that you instead write the function body as a - dollar-quoted string literal (see ). In the dollar-quoting - approach, you never double any quote marks, but instead take care to - choose a different dollar-quoting delimiter for each level of - nesting you need. For example, you might write the CREATE - FUNCTION command as: - -CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $PROC$ - .... -$PROC$ LANGUAGE plpgsql; - - Within this, you might use quote marks for simple literal strings in - SQL commands and $$ to delimit fragments of SQL commands - that you are assembling as strings. If you need to quote text that - includes $$, you could use $Q$, and so on. - - - - The following chart shows what you have to do when writing quote - marks without dollar quoting. It might be useful when translating - pre-dollar quoting code into something more comprehensible. - - - - - 1 quotation mark - - - To begin and end the function body, for example: - -CREATE FUNCTION foo() RETURNS integer AS ' - .... -' LANGUAGE plpgsql; - - Anywhere within a single-quoted function body, quote marks - must appear in pairs. - - - - - - 2 quotation marks - - - For string literals inside the function body, for example: - -a_output := ''Blah''; -SELECT * FROM users WHERE f_name=''foobar''; - - In the dollar-quoting approach, you'd just write: - -a_output := 'Blah'; -SELECT * FROM users WHERE f_name='foobar'; - - which is exactly what the PL/pgSQL parser would see - in either case. - - - - - - 4 quotation marks - - - When you need a single quotation mark in a string constant inside the - function body, for example: - -a_output := a_output || '' AND name LIKE ''''foobar'''' AND xyz'' - - The value actually appended to a_output would be: - AND name LIKE 'foobar' AND xyz. - - - In the dollar-quoting approach, you'd write: - -a_output := a_output || $$ AND name LIKE 'foobar' AND xyz$$ - - being careful that any dollar-quote delimiters around this are not - just $$. - - - - - - 6 quotation marks - - - When a single quotation mark in a string inside the function body is - adjacent to the end of that string constant, for example: - -a_output := a_output || '' AND name LIKE ''''foobar'''''' - - The value appended to a_output would then be: - AND name LIKE 'foobar'. - - - In the dollar-quoting approach, this becomes: - -a_output := a_output || $$ AND name LIKE 'foobar'$$ - - - - - - - 10 quotation marks - - - When you want two single quotation marks in a string constant (which - accounts for 8 quotation marks) and this is adjacent to the end of that - string constant (2 more). You will probably only need that if - you are writing a function that generates other functions, as in - . - For example: - -a_output := a_output || '' if v_'' || - referrer_keys.kind || '' like '''''''''' - || referrer_keys.key_string || '''''''''' - then return '''''' || referrer_keys.referrer_type - || ''''''; end if;''; - - The value of a_output would then be: - -if v_... like ''...'' then return ''...''; end if; - - - - In the dollar-quoting approach, this becomes: - -a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$ - || referrer_keys.key_string || $$' - then return '$$ || referrer_keys.referrer_type - || $$'; end if;$$; - - where we assume we only need to put single quote marks into - a_output, because it will be re-quoted before use. - - - - - - - - Structure of <application>PL/pgSQL</application> @@ -465,10 +184,25 @@ END label ; concludes a function body does not require a semicolon. + + + A common mistake is to write a semicolon immediately after + BEGIN. This is incorrect and will result in a syntax error. + + + - All key words and identifiers can be written in mixed upper and - lower case. Identifiers are implicitly converted to lowercase - unless double-quoted. + A label is only needed if you want to + identify the block for use + in an EXIT statement, or to qualify the names of the + variables declared in the block. If a label is given after + END, it must match the label at the block's beginning. + + + + All key words are case-insensitive. + Identifiers are implicitly converted to lowercase + unless double-quoted, just as they are in ordinary SQL commands. @@ -485,19 +219,17 @@ END label ; Any statement in the statement section of a block can be a subblock. Subblocks can be used for logical grouping or to localize variables to a small group - of statements. - - - - The variables declared in the declarations section preceding a - block are initialized to their default values every time the - block is entered, not only once per function call. For example: + of statements. Variables declared in a subblock mask any + similarly-named variables of outer blocks for the duration + of the subblock; but you can access the outer variables anyway + if you qualify their names with their block's label. For example: CREATE FUNCTION somefunc() RETURNS integer AS $$ +<< outerblock >> DECLARE quantity integer := 30; BEGIN - RAISE NOTICE 'Quantity here is %', quantity; -- Quantity here is 30 + RAISE NOTICE 'Quantity here is %', quantity; -- Prints 30 quantity := 50; -- -- Create a subblock @@ -505,10 +237,11 @@ BEGIN DECLARE quantity integer := 80; BEGIN - RAISE NOTICE 'Quantity here is %', quantity; -- Quantity here is 80 + RAISE NOTICE 'Quantity here is %', quantity; -- Prints 80 + RAISE NOTICE 'Outer quantity here is %', outerblock.quantity; -- Prints 50 END; - RAISE NOTICE 'Quantity here is %', quantity; -- Quantity here is 50 + RAISE NOTICE 'Quantity here is %', quantity; -- Prints 50 RETURN quantity; END; @@ -579,8 +312,9 @@ arow RECORD; - The default value is evaluated every time the block is entered. So, - for example, assigning now() to a variable of type + A variable's default value is evaluated and assigned to the variable + each time the block is entered (not just once per function call). + So, for example, assigning now() to a variable of type timestamp causes the variable to have the time of the current function call, not the time when the function was precompiled. @@ -916,88 +650,43 @@ RENAME this_var TO that_var; All expressions used in PL/pgSQL - statements are processed using the server's regular - SQL executor. In effect, a query like + statements are processed using the server's main + SQL executor. For example, when you write + a PL/pgSQL statement like + +IF expression THEN ... + + PL/pgSQL will evaluate the expression by + feeding a query like SELECT expression - is executed using the SPI manager. Before evaluation, - occurrences of PL/pgSQL variable - identifiers are replaced by parameters, and the actual values from - the variables are passed to the executor in the parameter array. + to the main SQL engine. While forming the SELECT command, + any occurrences of PL/pgSQL variable names + are replaced by parameters, as discussed in detail in + . This allows the query plan for the SELECT to be prepared just once and then reused for subsequent - evaluations. - - - - The evaluation done by the PostgreSQL - main parser has some side - effects on the interpretation of constant values. In detail there - is a difference between what these two functions do: - + evaluations with different values of the variables. Thus, what + really happens on first use of an expression is essentially a + PREPARE command. For example, if we have declared + two integer variables x and y, and we write -CREATE FUNCTION logfunc1(logtxt text) RETURNS timestamp AS $$ - BEGIN - INSERT INTO logtable VALUES (logtxt, 'now'); - RETURN 'now'; - END; -$$ LANGUAGE plpgsql; +IF x < y THEN ... - - and: - + what happens behind the scenes is -CREATE FUNCTION logfunc2(logtxt text) RETURNS timestamp AS $$ - DECLARE - curtime timestamp; - BEGIN - curtime := 'now'; - INSERT INTO logtable VALUES (logtxt, curtime); - RETURN curtime; - END; -$$ LANGUAGE plpgsql; +PREPARE statement_name(integer, integer) AS SELECT $1 < $2; - - - - In the case of logfunc1, the - PostgreSQL main parser knows when - preparing the plan for the INSERT that the - string 'now' should be interpreted as - timestamp because the target column of - logtable is of that type. Thus, - 'now' will be converted to a constant when the - INSERT is planned, and then used in all - invocations of logfunc1 during the lifetime - of the session. Needless to say, this isn't what the programmer - wanted. - - - - In the case of logfunc2, the - PostgreSQL main parser does not know - what type 'now' should become and therefore - it returns a data value of type text containing the string - now. During the ensuing assignment - to the local variable curtime, the - PL/pgSQL interpreter casts this - string to the timestamp type by calling the - text_out and timestamp_in - functions for the conversion. So, the computed time stamp is updated - on each execution as the programmer expects. - - - - The mutable nature of record variables presents a problem in this - connection. When fields of a record variable are used in - expressions or statements, the data types of the fields must not - change between calls of one and the same expression, since the - expression will be planned using the data type that is present - when the expression is first reached. Keep this in mind when - writing trigger procedures that handle events for more than one - table. (EXECUTE can be used to get around - this problem when necessary.) + and then this prepared statement is EXECUTEd for each + execution of the IF statement, with the current values + of the PL/pgSQL variables supplied as + parameter values. + The query plan prepared in this way is saved for the life of the database + connection, as described in + . Normally these details are + not important to a PL/pgSQL user, but + they are useful to know when trying to diagnose a problem. @@ -1021,7 +710,7 @@ $$ LANGUAGE plpgsql; An assignment of a value to a PL/pgSQL variable or row/record field is written as: -identifier := expression; +variable := expression; As explained above, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main @@ -1042,36 +731,29 @@ $$ LANGUAGE plpgsql; Examples: -user_id := 20; tax := subtotal * 0.06; +my_record.user_id := 20; - Executing a Query With No Result + Executing a Command With No Result - For any SQL query that does not return rows, for example + For any SQL command that does not return rows, for example INSERT without a RETURNING clause, you can - execute the query within a PL/pgSQL function - just by writing the query. + execute the command within a PL/pgSQL function + just by writing the command. Any PL/pgSQL variable name appearing - in the query text is replaced by a parameter symbol, and then the + in the command text is replaced by a parameter symbol, and then the current value of the variable is provided as the parameter value - at run time. This allows the same textual query to do different - things in different calls of the function. - - - - - This two-step process allows - PL/pgSQL to plan the query just once - and re-use the plan on subsequent executions. As an example, - if you write: + at run time. This is exactly like the processing described earlier + for expressions; for details see . + As an example, if you write: DECLARE key TEXT; @@ -1080,30 +762,33 @@ BEGIN ... UPDATE mytab SET val = val + delta WHERE id = key; - the query text seen by the main SQL engine will look like: + the command text seen by the main SQL engine will look like: UPDATE mytab SET val = val + $1 WHERE id = $2; - Although you don't normally have to think about this, it's helpful - to know it when you need to make sense of syntax-error messages. - - + Although you don't normally have to think about this, it's helpful + to know it when you need to make sense of syntax-error messages. + PL/pgSQL will substitute for any identifier matching one of the function's declared variables; it is not bright enough to know whether that's what you meant! Thus, it is a bad idea - to use a variable name that is the same as any table or column name - that you need to reference in queries within the function. Sometimes - you can work around this by using qualified names in the query: - PL/pgSQL will not substitute in a - qualified name foo.bar, even if - foo or bar is a declared variable - name. + to use a variable name that is the same as any table, column, or + function name that you need to reference in commands within the + function. For more discussion see . + + When executing a SQL command in this way, + PL/pgSQL plans the command just once + and re-uses the plan on subsequent executions, for the life of + the database connection. The implications of this are discussed + in detail in . + + Sometimes it is useful to evaluate an expression or SELECT query but discard the result, for example when calling a function @@ -1120,9 +805,11 @@ PERFORM query; way you would write an SQL SELECT command, but replace the initial keyword SELECT with PERFORM. PL/pgSQL variables will be - substituted into the query as usual. Also, the special variable + substituted into the query just as for commands that return no result, + and the plan is cached in the same way. Also, the special variable FOUND is set to true if the query produced at - least one row, or false if it produced no rows. + least one row, or false if it produced no rows (see + ). @@ -1175,7 +862,8 @@ DELETE ... RETURNING expressions INTO STRIC variable, or a comma-separated list of simple variables and record/row fields. PL/pgSQL variables will be - substituted into the rest of the query as usual. + substituted into the rest of the query, and the plan is cached, + just as described above for commands that do not return rows. This works for SELECT, INSERT/UPDATE/DELETE with RETURNING, and utility commands that return row-set @@ -1215,9 +903,9 @@ DELETE ... RETURNING expressions INTO STRIC - If STRICT is not specified, then - target will be set to the first row - returned by the query, or to nulls if the query returned no rows. + If STRICT is not specified in the INTO + clause, then target will be set to the first + row returned by the query, or to nulls if the query returned no rows. (Note that the first row is not well-defined unless you've used ORDER BY.) Any result rows after the first row are discarded. @@ -1258,7 +946,7 @@ END; an error for more than one returned row, even when STRICT is not specified. This is because there is no option such as ORDER BY with which to determine - which affected row would be returned. + which affected row should be returned. @@ -1275,52 +963,6 @@ END; - - Doing Nothing At All - - - Sometimes a placeholder statement that does nothing is useful. - For example, it can indicate that one arm of an if/then/else - chain is deliberately empty. For this purpose, use the - NULL statement: - - -NULL; - - - - - For example, the following two fragments of code are equivalent: - - BEGIN - y := x / 0; - EXCEPTION - WHEN division_by_zero THEN - NULL; -- ignore the error - END; - - - - BEGIN - y := x / 0; - EXCEPTION - WHEN division_by_zero THEN -- ignore the error - END; - - Which is preferable is a matter of taste. - - - - - In Oracle's PL/SQL, empty statement lists are not allowed, and so - NULL statements are required for situations - such as this. PL/pgSQL allows you to - just write nothing, instead. - - - - - Executing Dynamic Commands @@ -1329,7 +971,8 @@ NULL; PL/pgSQL functions, that is, commands that will involve different tables or different data types each time they are executed. PL/pgSQL's - normal attempts to cache plans for commands will not work in such + normal attempts to cache plans for commands (as discussed in + ) will not work in such scenarios. To handle this sort of problem, the EXECUTE statement is provided: @@ -1345,16 +988,15 @@ EXECUTE command-string INT - Note in particular that no substitution of PL/pgSQL - variables is done on the computed command string. The values of - variables must be inserted in the command string as it is constructed. + No substitution of PL/pgSQL variables is done on the + computed command string. Any required variable values must be inserted + in the command string as it is constructed. - Unlike all other commands in PL/pgSQL, a command - run by an EXECUTE statement is not prepared - and saved just once during the life of the session. Instead, the - command is prepared each time the statement is run. The command + Also, there is no plan caching for commands executed via + EXECUTE. Instead, the + command is prepared each time the statement is run. Thus the command string can be dynamically created within the function to perform actions on different tables and columns. @@ -1368,7 +1010,7 @@ EXECUTE command-string INT result structure automatically). If multiple rows are returned, only the first will be assigned to the INTO variable. If no rows are returned, NULL is assigned to the - INTO variable. If no INTO + INTO variable(s). If no INTO clause is specified, the query results are discarded. @@ -1379,9 +1021,23 @@ EXECUTE command-string INT SELECT INTO is not currently supported within - EXECUTE. + EXECUTE; instead, execute a plain SELECT + command and specify INTO as part of the EXECUTE + itself. + + + The PL/pgSQL + EXECUTE statement is not related to the + SQL + statement supported by the + PostgreSQL server. The server's + EXECUTE statement cannot be used directly within + PL/pgSQL functions (and is not needed). + + + When working with dynamic commands you will often have to handle escaping of single quotes. The recommended method for quoting fixed text in your @@ -1393,7 +1049,7 @@ EXECUTE command-string INT Dynamic values that are to be inserted into the constructed - query require special handling since they might themselves contain + query require careful handling since they might themselves contain quote characters. An example (this assumes that you are using dollar quoting for the function as a whole, so the quote marks need not be doubled): @@ -1505,8 +1161,8 @@ GET DIAGNOSTICS integer_var = ROW_COUNT; A PERFORM statement sets FOUND - true if it produces (and discards) a row, false if no row is - produced. + true if it produces (and discards) one or more rows, false if + no row is produced. @@ -1551,6 +1207,52 @@ GET DIAGNOSTICS integer_var = ROW_COUNT; + + + Doing Nothing At All + + + Sometimes a placeholder statement that does nothing is useful. + For example, it can indicate that one arm of an if/then/else + chain is deliberately empty. For this purpose, use the + NULL statement: + + +NULL; + + + + + For example, the following two fragments of code are equivalent: + + BEGIN + y := x / 0; + EXCEPTION + WHEN division_by_zero THEN + NULL; -- ignore the error + END; + + + + BEGIN + y := x / 0; + EXCEPTION + WHEN division_by_zero THEN -- ignore the error + END; + + Which is preferable is a matter of taste. + + + + + In Oracle's PL/SQL, empty statement lists are not allowed, and so + NULL statements are required for situations + such as this. PL/pgSQL allows you to + just write nothing, instead. + + + + @@ -1652,8 +1354,10 @@ RETURN NEXT expression; If you declared the function with output parameters, write just - RETURN NEXT with no expression. The current values - of the output parameter variable(s) will be saved for eventual return. + RETURN NEXT with no expression. On each + execution, the current values + of the output parameter variable(s) will be saved for eventual return + as a row of the result. Note that you must declare the function as returning SETOF record when there are multiple output parameters, or @@ -1840,7 +1544,7 @@ END IF; IF-THEN-ELSIF-ELSE provides a more convenient method of checking many alternatives in one statement. - Formally it is equivalent to nested + Functionally it is equivalent to nested IF-THEN-ELSE-IF-THEN commands, but only one END IF is needed. @@ -1916,7 +1620,7 @@ END LOOP label ; -EXIT label WHEN expression ; +EXIT label WHEN boolean-expression ; @@ -1931,7 +1635,7 @@ EXIT label WHEN If WHEN is specified, the loop exit occurs only if - expression is true. Otherwise, control passes + boolean-expression is true. Otherwise, control passes to the statement after EXIT. @@ -1976,21 +1680,23 @@ END; -CONTINUE label WHEN expression ; +CONTINUE label WHEN boolean-expression ; If no label is given, the next iteration of - the innermost loop is begun. That is, control is passed back - to the loop control expression (if any), and the body of the - loop is re-evaluated. If label is present, it + the innermost loop is begun. That is, all statements remaining + in the loop body are skipped, and control returns + to the loop control expression (if any) to determine whether + another loop iteration is needed. + If label is present, it specifies the label of the loop whose execution will be continued. If WHEN is specified, the next iteration of the - loop is begun only if expression is + loop is begun only if boolean-expression is true. Otherwise, control passes to the statement after CONTINUE. @@ -2024,15 +1730,16 @@ END LOOP; <<label>> -WHILE expression LOOP +WHILE boolean-expression LOOP statements END LOOP label ; The WHILE statement repeats a - sequence of statements so long as the condition expression - evaluates to true. The condition is checked just before + sequence of statements so long as the + boolean-expression + evaluates to true. The expression is checked just before each entry to the loop body. @@ -2043,7 +1750,7 @@ WHILE amount_owed > 0 AND gift_certificate_balance > 0 LOOP -- some computations here END LOOP; -WHILE NOT boolean_expression LOOP +WHILE NOT done LOOP -- some computations here END LOOP; @@ -2069,9 +1776,9 @@ END LOOP label ; The two expressions giving the lower and upper bound of the range are evaluated once when entering the loop. If the BY clause isn't specified the iteration - step is 1 otherwise it's the value specified in the BY + step is 1, otherwise it's the value specified in the BY clause. If REVERSE is specified then the step value is - considered negative. + subtracted, rather than added, after each iteration. @@ -2154,6 +1861,13 @@ $$ LANGUAGE plpgsql; commands such as EXPLAIN will work too. + + PL/pgSQL variables are substituted into the query text, + and the query plan is cached for possible re-use, as discussed in + detail in and + . + + The FOR-IN-EXECUTE statement is another way to iterate over rows: @@ -2169,20 +1883,6 @@ END LOOP label ; choose the speed of a preplanned query or the flexibility of a dynamic query, just as with a plain EXECUTE statement. - - - - The PL/pgSQL parser presently distinguishes the - two kinds of FOR loops (integer or query result) by checking - whether .. appears outside any parentheses between - IN and LOOP. If .. is not seen then - the loop is presumed to be a loop over rows. Mistyping the .. - is thus likely to lead to a complaint along the lines of - loop variable of loop over rows must be a record or row variable - or list of scalar variables, - rather than the simple syntax error one might expect to get. - - @@ -2314,16 +2014,19 @@ CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS $$ BEGIN LOOP + -- first try to update the key UPDATE db SET b = data WHERE a = key; IF found THEN RETURN; END IF; - + -- not there, so try to insert the key + -- if someone else inserts the same key concurrently, + -- we could get a unique-key failure BEGIN INSERT INTO db(a,b) VALUES (key, data); RETURN; EXCEPTION WHEN unique_violation THEN - -- do nothing + -- do nothing, and loop to try the UPDATE again END; END LOOP; END; @@ -2433,7 +2136,12 @@ OPEN unbound_cursor NO PL/pgSQL: PL/pgSQL variable names are substituted, and the query plan is cached for - possible reuse. The SCROLL and NO SCROLL + possible reuse. When a PL/pgSQL + variable is substituted into the cursor query, the value that is + substituted is the one it has at the time of the OPEN; + subsequent changes to the variable will not affect the cursor's + behavior. + The SCROLL and NO SCROLL options have the same meanings as for a bound cursor. @@ -2458,8 +2166,11 @@ OPEN unbound_cursor NO refcursor variable). The query is specified as a string expression, in the same way as in the EXECUTE - command. As usual, this gives flexibility so the query can vary - from one run to the next. The SCROLL and + command. As usual, this gives flexibility so the query plan can vary + from one run to the next (see ), + and it also means that variable substitution is not done on the + command string. + The SCROLL and NO SCROLL options have the same meanings as for a bound cursor. @@ -2492,6 +2203,17 @@ OPEN bound_cursor ( argument_ behavior was already determined. + + Note that because variable substitution is done on the bound + cursor's query, there are two ways to pass values into the cursor: + either with an explicit argument to OPEN, or + implicitly by referencing a PL/pgSQL variable + in the query. However, only variables declared before the bound + cursor was declared will be substituted into it. In either case + the value to be passed is determined at the time of the + OPEN. + + Examples: @@ -2537,7 +2259,8 @@ FETCH direction { FROM | IN } FETCH retrieves the next row from the cursor into a target, which might be a row variable, a record variable, or a comma-separated list of simple variables, just like - SELECT INTO. As with SELECT + SELECT INTO. If there is no next row, the + target is set to NULL(s). As with SELECT INTO, the special variable FOUND can be checked to see whether a row was obtained or not. @@ -2562,6 +2285,11 @@ FETCH direction { FROM | IN } with the SCROLL option. + + cursor must be the name of a refcursor + variable that references an open cursor portal. + + Examples: @@ -2586,8 +2314,7 @@ MOVE direction { FROM | IN } < FETCH command, except it only repositions the cursor and does not return the row moved to. As with SELECT INTO, the special variable FOUND can - be checked to see whether the cursor was successfully - repositioned or not. + be checked to see whether there was a next row to move to. @@ -2817,7 +2544,7 @@ RAISE level ' + <application>PL/pgSQL</> Under the Hood + + + This section discusses some implementation details that are + frequently important for PL/pgSQL users to know. + + + + Variable Substitution + + + When PL/pgSQL prepares a SQL statement or expression + for execution, any PL/pgSQL variable name + appearing in the statement or expression is replaced by a parameter symbol, + $n. The current value + of the variable is then provided as the value for the parameter whenever + the statement or expression is executed. As an example, consider the + function + +CREATE FUNCTION logfunc(logtxt text) RETURNS void AS $$ + DECLARE + curtime timestamp := now(); + BEGIN + INSERT INTO logtable VALUES (logtxt, curtime); + END; +$$ LANGUAGE plpgsql; + + The INSERT statement will effectively be processed as + +PREPARE statement_name(text, timestamp) AS + INSERT INTO logtable VALUES ($1, $2); + + followed on each execution by EXECUTE with the current + actual values of the two variables. (Note: here we are speaking of + the main SQL engine's + command, + not PL/pgSQL's EXECUTE.) + + + + The substitution mechanism will replace any token that matches a + known variable's name. This poses various traps for the unwary. + For example, it is a bad idea + to use a variable name that is the same as any table or column name + that you need to reference in queries within the function, because + what you think is a table or column name will still get replaced. + In the above example, suppose that logtable has + column names logtxt and logtime, + and we try to write the INSERT as + + INSERT INTO logtable (logtxt, logtime) VALUES (logtxt, curtime); + + This will be fed to the main SQL parser as + + INSERT INTO logtable ($1, logtime) VALUES ($1, $2); + + resulting in a syntax error like this: + +ERROR: syntax error at or near "$1" +LINE 1: INSERT INTO logtable ( $1 , logtime) VALUES ( $1 , $2 ) + ^ +QUERY: INSERT INTO logtable ( $1 , logtime) VALUES ( $1 , $2 ) +CONTEXT: SQL statement in PL/PgSQL function "logfunc2" near line 5 + + + + + This example is fairly easy to diagnose, since it leads to an + obvious syntax error. Much nastier are cases where the substitution + is syntactically permissible, since the only symptom may be misbehavior + of the function. In one case, a user wrote something like this: + + DECLARE + val text; + search_key integer; + BEGIN + ... + FOR val IN SELECT val FROM table WHERE key = search_key LOOP ... + + and wondered why all his table entries seemed to be NULL. Of course + what happened here was that the query became + + SELECT $1 FROM table WHERE key = $2 + + and thus it was just an expensive way of assigning val's + current value back to itself for each row. + + + + A commonly used coding rule for avoiding such traps is to use a + different naming convention for PL/pgSQL + variables than you use for table and column names. For example, + if all your variables are named + v_something while none of your + table or column names start with v_, you're pretty safe. + + + + Another workaround is to use qualified (dotted) names for SQL entities. + For instance we could safely have written the above example as + + FOR val IN SELECT table.val FROM table WHERE key = search_key LOOP ... + + because PL/pgSQL will not substitute a + variable for a trailing component of a qualified name. + However this solution does not work in every case — you can't + qualify a name in an INSERT's column name list, for instance. + Another point is that record and row variable names will be matched to + the first components of qualified names, so a qualified SQL name is + still vulnerable in some cases. + In such cases choosing a non-conflicting variable name is the only way. + + + + Another technique you can use is to attach a label to the block in + which your variables are declared, and then qualify the variable names + in your SQL commands (see ). + For example, + + <<pl>> + DECLARE + val text; + BEGIN + ... + UPDATE table SET col = pl.val WHERE ... + + This is not in itself a solution to the problem of conflicts, + since an unqualified name in a SQL command is still at risk of being + interpreted the wrong way. But it is useful for clarifying + the intent of potentially-ambiguous code. + + + + Variable substitution does not happen in the command string given + to EXECUTE or one of its variants. If you need to + insert a varying value into such a command, do so as part of + constructing the string value, as illustrated in + . + + + + Variable substitution currently works only in SELECT, + INSERT, UPDATE, and DELETE commands, + because the main SQL engine allows parameter symbols only in these + commands. To use a non-constant name or value in other statement + types (generically called utility statements), you must construct + the utility statement as a string and EXECUTE it. + + + + + + Plan Caching + + + The PL/pgSQL interpreter parses the function's source + text and produces an internal binary instruction tree the first time the + function is called (within each session). The instruction tree + fully translates the + PL/pgSQL statement structure, but individual + SQL expressions and SQL commands + used in the function are not translated immediately. + + + + As each expression and SQL command is first + executed in the function, the PL/pgSQL interpreter + creates a prepared execution plan (using the + SPI manager's SPI_prepare + and SPI_saveplan + functions).preparing a queryin + PL/pgSQL Subsequent visits to that expression or command + reuse the prepared plan. Thus, a function with conditional code + that contains many statements for which execution plans might be + required will only prepare and save those plans that are really + used during the lifetime of the database connection. This can + substantially reduce the total amount of time required to parse + and generate execution plans for the statements in a + PL/pgSQL function. A disadvantage is that errors + in a specific expression or command cannot be detected until that + part of the function is reached in execution. (Trivial syntax + errors will be detected during the initial parsing pass, but + anything deeper will not be detected until execution.) + + + + Once PL/pgSQL has made an execution plan for a particular + command in a function, it will reuse that plan for the life of the + database connection. This is usually a win for performance, but it + can cause some problems if you dynamically + alter your database schema. For example: + + +CREATE FUNCTION populate() RETURNS integer AS $$ +DECLARE + -- declarations +BEGIN + PERFORM my_function(); +END; +$$ LANGUAGE plpgsql; + + + If you execute the above function, it will reference the OID for + my_function() in the execution plan produced for + the PERFORM statement. Later, if you + drop and recreate my_function(), then + populate() will not be able to find + my_function() anymore. You would then have to + start a new database session so that populate() + will be compiled afresh, before it will work again. You can avoid + this problem by using CREATE OR REPLACE FUNCTION + when updating the definition of + my_function, since when a function is + replaced, its OID is not changed. + + + + + In PostgreSQL 8.3 and later, saved plans + will be replaced whenever any schema changes have occurred to any + tables they reference. This eliminates one of the major disadvantages + of saved plans. However, there is no such mechanism for function + references, and thus the above example involving a reference to a + deleted function is still valid. + + + + + Because PL/pgSQL saves execution plans + in this way, SQL commands that appear directly in a + PL/pgSQL function must refer to the + same tables and columns on every execution; that is, you cannot use + a parameter as the name of a table or column in an SQL command. To get + around this restriction, you can construct dynamic commands using + the PL/pgSQL EXECUTE + statement — at the price of constructing a new execution plan on + every execution. + + + + Another important point is that the prepared plans are parameterized + to allow the values of PL/pgSQL variables + to change from one use to the next, as discussed in detail above. + Sometimes this means that a plan is less efficient than it would be + if generated for a specific variable value. As an example, consider + +SELECT * INTO myrec FROM dictionary WHERE word LIKE search_term; + + where search_term is a PL/pgSQL + variable. The cached plan for this query will never use an index on + word, since the planner cannot assume that the + LIKE pattern will be left-anchored at runtime. To use + an index the query must be planned with a specific constant + LIKE pattern provided. This is another situation where + EXECUTE can be used to force a new plan to be + generated for each execution. + + + + The mutable nature of record variables presents another problem in this + connection. When fields of a record variable are used in + expressions or statements, the data types of the fields must not + change from one call of the function to the next, since each + expression will be planned using the data type that is present + when the expression is first reached. EXECUTE can be + used to get around this problem when necessary. + + + + If the same function is used as a trigger for more than one table, + PL/pgSQL prepares and caches plans + independently for each such table — that is, there is a cache + for each trigger function and table combination, not just for each + function. This alleviates some of the problems with varying + data types; for instance, a trigger function will be able to work + successfully with a column named key even if it happens + to have different types in different tables. + + + + Likewise, functions having polymorphic argument types have a separate + plan cache for each combination of actual argument types they have been + invoked for, so that data type differences do not cause unexpected + failures. + + + + Plan caching can sometimes have surprising effects on the interpretation + of time-sensitive values. For example there + is a difference between what these two functions do: + + +CREATE FUNCTION logfunc1(logtxt text) RETURNS void AS $$ + BEGIN + INSERT INTO logtable VALUES (logtxt, 'now'); + END; +$$ LANGUAGE plpgsql; + + + and: + + +CREATE FUNCTION logfunc2(logtxt text) RETURNS void AS $$ + DECLARE + curtime timestamp; + BEGIN + curtime := 'now'; + INSERT INTO logtable VALUES (logtxt, curtime); + END; +$$ LANGUAGE plpgsql; + + + + + In the case of logfunc1, the + PostgreSQL main parser knows when + preparing the plan for the INSERT that the + string 'now' should be interpreted as + timestamp, because the target column of + logtable is of that type. Thus, + 'now' will be converted to a constant when the + INSERT is planned, and then used in all + invocations of logfunc1 during the lifetime + of the session. Needless to say, this isn't what the programmer + wanted. + + + + In the case of logfunc2, the + PostgreSQL main parser does not know + what type 'now' should become and therefore + it returns a data value of type text containing the string + now. During the ensuing assignment + to the local variable curtime, the + PL/pgSQL interpreter casts this + string to the timestamp type by calling the + text_out and timestamp_in + functions for the conversion. So, the computed time stamp is updated + on each execution as the programmer expects. + + + + + + + + Tips for Developing in <application>PL/pgSQL</application> + + + One good way to develop in + PL/pgSQL is to use the text editor of your + choice to create your functions, and in another window, use + psql to load and test those functions. + If you are doing it this way, it + is a good idea to write the function using CREATE OR + REPLACE FUNCTION. That way you can just reload the file to update + the function definition. For example: + +CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $$ + .... +$$ LANGUAGE plpgsql; + + + + + While running psql, you can load or reload such + a function definition file with: + +\i filename.sql + + and then immediately issue SQL commands to test the function. + + + + Another good way to develop in PL/pgSQL is with a + GUI database access tool that facilitates development in a + procedural language. One example of such as a tool is + PgAccess, although others exist. These tools often + provide convenient features such as escaping single quotes and + making it easier to recreate and debug functions. + + + + Handling of Quotation Marks + + + The code of a PL/pgSQL function is specified in + CREATE FUNCTION as a string literal. If you + write the string literal in the ordinary way with surrounding + single quotes, then any single quotes inside the function body + must be doubled; likewise any backslashes must be doubled (assuming + escape string syntax is used). + Doubling quotes is at best tedious, and in more complicated cases + the code can become downright incomprehensible, because you can + easily find yourself needing half a dozen or more adjacent quote marks. + It's recommended that you instead write the function body as a + dollar-quoted string literal (see ). In the dollar-quoting + approach, you never double any quote marks, but instead take care to + choose a different dollar-quoting delimiter for each level of + nesting you need. For example, you might write the CREATE + FUNCTION command as: + +CREATE OR REPLACE FUNCTION testfunc(integer) RETURNS integer AS $PROC$ + .... +$PROC$ LANGUAGE plpgsql; + + Within this, you might use quote marks for simple literal strings in + SQL commands and $$ to delimit fragments of SQL commands + that you are assembling as strings. If you need to quote text that + includes $$, you could use $Q$, and so on. + + + + The following chart shows what you have to do when writing quote + marks without dollar quoting. It might be useful when translating + pre-dollar quoting code into something more comprehensible. + + + + + 1 quotation mark + + + To begin and end the function body, for example: + +CREATE FUNCTION foo() RETURNS integer AS ' + .... +' LANGUAGE plpgsql; + + Anywhere within a single-quoted function body, quote marks + must appear in pairs. + + + + + + 2 quotation marks + + + For string literals inside the function body, for example: + +a_output := ''Blah''; +SELECT * FROM users WHERE f_name=''foobar''; + + In the dollar-quoting approach, you'd just write: + +a_output := 'Blah'; +SELECT * FROM users WHERE f_name='foobar'; + + which is exactly what the PL/pgSQL parser would see + in either case. + + + + + + 4 quotation marks + + + When you need a single quotation mark in a string constant inside the + function body, for example: + +a_output := a_output || '' AND name LIKE ''''foobar'''' AND xyz'' + + The value actually appended to a_output would be: + AND name LIKE 'foobar' AND xyz. + + + In the dollar-quoting approach, you'd write: + +a_output := a_output || $$ AND name LIKE 'foobar' AND xyz$$ + + being careful that any dollar-quote delimiters around this are not + just $$. + + + + + + 6 quotation marks + + + When a single quotation mark in a string inside the function body is + adjacent to the end of that string constant, for example: + +a_output := a_output || '' AND name LIKE ''''foobar'''''' + + The value appended to a_output would then be: + AND name LIKE 'foobar'. + + + In the dollar-quoting approach, this becomes: + +a_output := a_output || $$ AND name LIKE 'foobar'$$ + + + + + + + 10 quotation marks + + + When you want two single quotation marks in a string constant (which + accounts for 8 quotation marks) and this is adjacent to the end of that + string constant (2 more). You will probably only need that if + you are writing a function that generates other functions, as in + . + For example: + +a_output := a_output || '' if v_'' || + referrer_keys.kind || '' like '''''''''' + || referrer_keys.key_string || '''''''''' + then return '''''' || referrer_keys.referrer_type + || ''''''; end if;''; + + The value of a_output would then be: + +if v_... like ''...'' then return ''...''; end if; + + + + In the dollar-quoting approach, this becomes: + +a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$ + || referrer_keys.key_string || $$' + then return '$$ || referrer_keys.referrer_type + || $$'; end if;$$; + + where we assume we only need to put single quote marks into + a_output, because it will be re-quoted before use. + + + + + + + + @@ -3350,6 +3618,7 @@ SELECT * FROM sales_summary_bytime; function_name.parameter_name. In PL/pgSQL, you can instead avoid a conflict by qualifying the column or table name. + (See .) @@ -3365,8 +3634,8 @@ SELECT * FROM sales_summary_bytime; In PostgreSQL the function body must be written as a string literal. Therefore you need to use dollar quoting or escape - single quotes in the function body. See . + single quotes in the function body. (See .)