<application>ECPG</application> - Embedded <acronym>SQL</acronym> in C

<application>ECPG</application> - Embedded <acronym>SQL</acronym> in C embedded SQLin C C ECPG This chapter describes the embedded SQL package for PostgreSQL. It was written by Linus Tolke (linus@epact.se) and Michael Meskes (meskes@postgresql.org). Originally it was written to work with C. It also works with C++, but it does not recognize all C++ constructs yet. This documentation is quite incomplete. But since this interface is standardized, additional information can be found in many resources about SQL. The Concept An embedded SQL program consists of code written in an ordinary programming language, in this case C, mixed with SQL commands in specially marked sections. To build the program, the source code is first passed through the embedded SQL preprocessor, which converts it to an ordinary C program, and afterwards it can be processed by a C compiler. Embedded SQL has advantages over other methods for handling SQL commands from C code. First, it takes care of the tedious passing of information to and from variables in your C program. Second, the SQL code in the program is checked at build time for syntactical correctness. Third, embedded SQL in C is specified in the SQL standard and supported by many other SQL database systems. The PostgreSQL implementation is designed to match this standard as much as possible, and it is usually possible to port embedded SQL programs written for other SQL databases to PostgreSQL with relative ease. As already stated, programs written for the embedded SQL interface are normal C programs with special code inserted to perform database-related actions. This special code always has the form EXEC SQL ...; These statements syntactically take the place of a C statement. Depending on the particular statement, they may appear at the global level or within a function. Embedded SQL statements follow the case-sensitivity rules of normal SQL code, and not those of C. The following sections explain all the embedded SQL statements. Connecting to the Database Server One connects to a database using the following statement: EXEC SQL CONNECT TO target AS connection-name USER user-name; The target can be specified in the following ways: dbname@hostname:port tcp:postgresql://hostname:port/dbname?options unix:postgresql://hostname:port/dbname?options an SQL string literal containing one of the above forms a reference to a character variable containing one of the above forms (see examples) DEFAULT If you specify the connection target literally (that is, not through a variable reference) and you don't quote the value, then the case-insensitivity rules of normal SQL are applied. In that case you can also double-quote the individual parameters separately as needed. In practice, it is probably less error-prone to use a (single-quoted) string literal or a variable reference. The connection target DEFAULT initiates a connection to the default database under the default user name. No separate user name or connection name may be specified in that case. There are also different ways to specify the user name: username username/password username IDENTIFIED BY password username USING password As above, the parameters username and password may be an SQL identifier, an SQL string literal, or a reference to a character variable. The connection-name is used to handle multiple connections in one program. It can be omitted if a program uses only one connection. The most recently opened connection becomes the current connection, which is used by default when an SQL statement is to be executed (see later in this chapter). Here are some examples of CONNECT statements: EXEC SQL CONNECT TO mydb@sql.mydomain.com; EXEC SQL CONNECT TO 'unix:postgresql://sql.mydomain.com/mydb' AS myconnection USER john; EXEC SQL BEGIN DECLARE SECTION; const char *target = "mydb@sql.mydomain.com"; const char *user = "john"; EXEC SQL END DECLARE SECTION; ... EXEC SQL CONNECT TO :target USER :user; The last form makes use of the variant referred to above as character variable reference. You will see in later sections how C variables can be used in SQL statements when you prefix them with a colon. Be advised that the format of the connection target is not specified in the SQL standard. So if you want to develop portable applications, you might want to use something based on the last example above to encapsulate the connection target string somewhere. Closing a Connection To close a connection, use the following statement: EXEC SQL DISCONNECT connection; The connection can be specified in the following ways: connection-name DEFAULT CURRENT ALL If no connection name is specified, the current connection is closed. It is good style that an application always explicitly disconnect from every connection it opened. Running SQL Commands Any SQL command can be run from within an embedded SQL application. Below are some examples of how to do that. Creating a table: EXEC SQL CREATE TABLE foo (number integer, ascii char(16)); EXEC SQL CREATE UNIQUE INDEX num1 ON foo(number); EXEC SQL COMMIT; Inserting rows: EXEC SQL INSERT INTO foo (number, ascii) VALUES (9999, 'doodad'); EXEC SQL COMMIT; Deleting rows: EXEC SQL DELETE FROM foo WHERE number = 9999; EXEC SQL COMMIT; Single-row select: EXEC SQL SELECT foo INTO :FooBar FROM table1 WHERE ascii = 'doodad'; Select using cursors: EXEC SQL DECLARE foo_bar CURSOR FOR SELECT number, ascii FROM foo ORDER BY ascii; EXEC SQL OPEN foo_bar; EXEC SQL FETCH foo_bar INTO :FooBar, DooDad; ... EXEC SQL CLOSE foo_bar; EXEC SQL COMMIT; Updates: EXEC SQL UPDATE foo SET ascii = 'foobar' WHERE number = 9999; EXEC SQL COMMIT; The tokens of the form :something are host variables, that is, they refer to variables in the C program. They are explained in . In the default mode, statements are committed only when EXEC SQL COMMIT is issued. The embedded SQL interface also supports autocommit of transactions (similar to libpq behavior) via the command-line option to ecpg (see below) or via the EXEC SQL SET AUTOCOMMIT TO ON statement. In autocommit mode, each command is automatically committed unless it is inside an explicit transaction block. This mode can be explicitly turned off using EXEC SQL SET AUTOCOMMIT TO OFF. Choosing a Connection The SQL statements shown in the previous section are executed on the current connection, that is, the most recently opened one. If an application needs to manage multiple connections, then there are two ways to handle this. The first option is to explicitly choose a connection for each SQL statement, for example EXEC SQL AT connection-name SELECT ...; This option is particularly suitable if the application needs to use several connections in mixed order. If your application uses multiple threads of execution, they cannot share a connection concurrently. You must either explicitly control access to the connection (using mutexes) or use a connection for each thread. If each thread uses its own connection, you will need to use the AT clause to specify which connection the thread will use. The second option is to execute a statement to switch the current connection. That statement is: EXEC SQL SET CONNECTION connection-name; This option is particularly convenient if many statements are to be executed on the same connection. It is not thread-aware. Using Host Variables In you saw how you can execute SQL statements from an embedded SQL program. Some of those statements only used fixed values and did not provide a way to insert user-supplied values into statements or have the program process the values returned by the query. Those kinds of statements are not really useful in real applications. This section explains in detail how you can pass data between your C program and the embedded SQL statements using a simple mechanism called host variables. Overview Passing data between the C program and the SQL statements is particularly simple in embedded SQL. Instead of having the program paste the data into the statement, which entails various complications, such as properly quoting the value, you can simply write the name of a C variable into the SQL statement, prefixed by a colon. For example: EXEC SQL INSERT INTO sometable VALUES (:v1, 'foo', :v2); This statements refers to two C variables named v1 and v2 and also uses a regular SQL string literal, to illustrate that you are not restricted to use one kind of data or the other. This style of inserting C variables in SQL statements works anywhere a value expression is expected in an SQL statement. In the SQL environment we call the references to C variables host variables. Declare Sections To pass data from the program to the database, for example as parameters in a query, or to pass data from the database back to the program, the C variables that are intended to contain this data need to be declared in specially marked sections, so the embedded SQL preprocessor is made aware of them. This section starts with EXEC SQL BEGIN DECLARE SECTION; and ends with EXEC SQL END DECLARE SECTION; Between those lines, there must be normal C variable declarations, such as int x; char foo[16], bar[16]; You can have as many declare sections in a program as you like. The declarations are also echoed to the output file as normal C variables, so there's no need to declare them again. Variables that are not intended to be used in SQL commands can be declared normally outside these special sections. The definition of a structure or union also must be listed inside a DECLARE section. Otherwise the preprocessor cannot handle these types since it does not know the definition. The special type VARCHAR is converted into a named struct for every variable. A declaration like VARCHAR var[180]; is converted into struct varchar_var { int len; char arr[180]; } var; This structure is suitable for interfacing with SQL datums of type varchar. <command>SELECT INTO</command> and <command>FETCH INTO</command> Now you should be able to pass data generated by your program into an SQL command. But how do you retrieve the results of a query? For that purpose, embedded SQL provides special variants of the usual commands SELECT and FETCH. These commands have a special INTO clause that specifies which host variables the retrieved values are to be stored in. Here is an example: /* * assume this table: * CREATE TABLE test1 (a int, b varchar(50)); */ EXEC SQL BEGIN DECLARE SECTION; int v1; VARCHAR v2; EXEC SQL END DECLARE SECTION; ... EXEC SQL SELECT a, b INTO :v1, :v2 FROM test; So the INTO clause appears between the select list and the FROM clause. The number of elements in the select list and the list after INTO (also called the target list) must be equal. Here is an example using the command FETCH: EXEC SQL BEGIN DECLARE SECTION; int v1; VARCHAR v2; EXEC SQL END DECLARE SECTION; ... EXEC SQL DECLARE foo CURSOR FOR SELECT a, b FROM test; ... do { ... EXEC SQL FETCH NEXT FROM foo INTO :v1, :v2; ... } while (...); Here the INTO clause appears after all the normal clauses. Both of these methods only allow retrieving one row at a time. If you need to process result sets that potentially contain more than one row, you need to use a cursor, as shown in the second example. Indicators The examples above do not handle null values. In fact, the retrieval examples will raise an error if they fetch a null value from the database. To be able to pass null values to the database or retrieve null values from the database, you need to append a second host variable specification to each host variable that contains data. This second host variable is called the indicator and contains a flag that tells whether the datum is null, in which case the value of the real host variable is ignored. Here is an example that handles the retrieval of null values correctly: EXEC SQL BEGIN DECLARE SECTION; VARCHAR val; int val_ind; EXEC SQL END DECLARE SECTION: ... EXEC SQL SELECT b INTO :val :val_ind FROM test1; The indicator variable val_ind will be zero if the value was not null, and it will be negative if the value was null. The indicator has another function: if the indicator value is positive, it means that the value is not null, but it was truncated when it was stored in the host variable. Dynamic SQL In many cases, the particular SQL statements that an application has to execute are known at the time the application is written. In some cases, however, the SQL statements are composed at run time or provided by an external source. In these cases you cannot embed the SQL statements directly into the C source code, but there is a facility that allows you to call arbitrary SQL statements that you provide in a string variable. The simplest way to execute an arbitrary SQL statement is to use the command EXECUTE IMMEDIATE. For example: EXEC SQL BEGIN DECLARE SECTION; const char *stmt = "CREATE TABLE test1 (...);"; EXEC SQL END DECLARE SECTION; EXEC SQL EXECUTE IMMEDIATE :stmt; You may not execute statements that retrieve data (e.g., SELECT) this way. A more powerful way to execute arbitrary SQL statements is to prepare them once and execute the prepared statement as often as you like. It is also possible to prepare a generalized version of a statement and then execute specific versions of it by substituting parameters. When preparing the statement, write question marks where you want to substitute parameters later. For example: EXEC SQL BEGIN DECLARE SECTION; const char *stmt = "INSERT INTO test1 VALUES(?, ?);"; EXEC SQL END DECLARE SECTION; EXEC SQL PREPARE mystmt FROM :stmt; ... EXEC SQL EXECUTE mystmt USING 42, 'foobar'; If the statement you are executing returns values, then add an INTO clause: EXEC SQL BEGIN DECLARE SECTION; const char *stmt = "SELECT a, b, c FROM test1 WHERE a > ?"; int v1, v2; VARCHAR v3; EXEC SQL END DECLARE SECTION; EXEC SQL PREPARE mystmt FROM :stmt; ... EXEC SQL EXECUTE mystmt INTO v1, v2, v3 USING 37; An EXECUTE command may have an INTO clause, a USING clause, both, or neither. When you don't need the prepared statement anymore, you should deallocate it: EXEC SQL DEALLOCATE PREPARE name; Using SQL Descriptor Areas An SQL descriptor area is a more sophisticated method for processing the result of a SELECT or FETCH statement. An SQL descriptor area groups the data of one row of data together with metadata items into one data structure. The metadata is particularly useful when executing dynamic SQL statements, where the nature of the result columns may not be known ahead of time. An SQL descriptor area consists of a header, which contains information concerning the entire descriptor, and one or more item descriptor areas, which basically each describe one column in the result row. Before you can use an SQL descriptor area, you need to allocate one: EXEC SQL ALLOCATE DESCRIPTOR identifier; The identifier serves as the variable name of the descriptor area. The scope of the allocated descriptor is WHAT?. When you don't need the descriptor anymore, you should deallocate it: EXEC SQL DEALLOCATE DESCRIPTOR identifier; To use a descriptor area, specify it as the storage target in an INTO clause, instead of listing host variables: EXEC SQL FETCH NEXT FROM mycursor INTO DESCRIPTOR mydesc; Now how do you get the data out of the descriptor area? You can think of the descriptor area as a structure with named fields. To retrieve the value of a field from the header and store it into a host variable, use the following command: EXEC SQL GET DESCRIPTOR name :hostvar = field; Currently, there is only one header field defined: COUNT, which tells how many item descriptor areas exist (that is, how many columns are contained in the result). The host variable needs to be of an integer type. To get a field from the item descriptor area, use the following command: EXEC SQL GET DESCRIPTOR name VALUE num :hostvar = field; num can be a literal integer or a host variable containing an integer. Possible fields are: CARDINALITY (integer) number of rows in the result set DATA actual data item (therefore, the data type of this field depends on the query) DATETIME_INTERVAL_CODE (integer) ? DATETIME_INTERVAL_PRECISION (integer) not implemented INDICATOR (integer) the indicator (indicating a null value or a value truncation) KEY_MEMBER (integer) not implemented LENGTH (integer) length of the datum in characters NAME (string) name of the column NULLABLE (integer) not implemented OCTET_LENGTH (integer) length of the character representation of the datum in bytes PRECISION (integer) precision (for type numeric) RETURNED_LENGTH (integer) length of the datum in characters RETURNED_OCTET_LENGTH (integer) length of the character representation of the datum in bytes SCALE (integer) scale (for type numeric) TYPE (integer) numeric code of the data type of the column Error Handling This section describes how you can handle exceptional conditions and warnings in an embedded SQL program. There are several nonexclusive facilities for this. Setting Callbacks One simple method to catch errors and warnings is to set a specific action to be executed whenever a particular condition occurs. In general: EXEC SQL WHENEVER condition action; condition can be one of the following: SQLERROR The specified action is called whenever an error occurs during the execution of an SQL statement. SQLWARNING The specified action is called whenever a warning occurs during the execution of an SQL statement. NOT FOUND The specified action is called whenever an SQL statement retrieves or affects zero rows. (This condition is not an error, but you might be interested in handling it specially.) action can be one of the following: CONTINUE This effectively means that the condition is ignored. This is the default. GOTO label GO TO label Jump to the specified label (using a C goto statement). SQLPRINT Print a message to standard error. This is useful for simple programs or during prototyping. The details of the message cannot be configured. STOP Call exit(1), which will terminate the program. BREAK Execute the C statement break. This should only be used in loops or switch statements. CALL name (args) DO name (args) Call the specified C functions with the specified arguments. The SQL standard only provides for the actions CONTINUE and GOTO (and GO TO). Here is an example that you might want to use in a simple program. It prints a simple message when a warning occurs and aborts the program when an error happens. EXEC SQL WHENEVER SQLWARNING SQLPRINT; EXEC SQL WHENEVER SQLERROR STOP; The statement EXEC SQL WHENEVER is a directive of the SQL preprocessor, not a C statement. The error or warning actions that it sets apply to all embedded SQL statements that appear below the point where the handler is set, unless a different action was set for the same condition between the first EXEC SQL WHENEVER and the SQL statement causing the condition, regardless of the flow of control in the C program. So neither of the two following C program excerpts will have the desired effect. /* * WRONG */ int main(int argc, char *argv[]) { ... if (verbose) { EXEC SQL WHENEVER SQLWARNING SQLPRINT; } ... EXEC SQL SELECT ...; ... } /* * WRONG */ int main(int argc, char *argv[]) { ... set_error_handler(); ... EXEC SQL SELECT ...; ... } static void set_error_handler(void) { EXEC SQL WHENEVER SQLERROR STOP; } sqlca For more powerful error handling, the embedded SQL interface provides a global variable with the name sqlca that has the following structure: struct { char sqlcaid[8]; long sqlabc; long sqlcode; struct { int sqlerrml; char sqlerrmc[70]; } sqlerrm; char sqlerrp[8]; long sqlerrd[6]; char sqlwarn[8]; char sqlstate[5]; } sqlca; (In a multithreaded program, every thread automatically gets its own copy of sqlca. This works similarly to the handling of the standard C global variable errno.) sqlca covers both warnings and errors. If multiple warnings or errors occur during the execution of a statement, then sqlca will only contain information about the last one. If no error occurred in the last SQL statement, sqlca.sqlcode will be 0 and sqlca.sqlstate will be "00000". If a warning or error occurred, then sqlca.sqlcode will be negative and sqlca.sqlstate will be different from "00000". A positive sqlca.sqlcode indicates a harmless condition, such as that the last query returned zero rows. sqlcode and sqlstate are two different error code schemes; details appear below. If the last SQL statement was successful, then sqlca.sqlerrd[1] contains the OID of the processed row, if applicable, and sqlca.sqlerrd[2] contains the number of processed or returned rows, if applicable to the command. In case of an error or warning, sqlca.sqlerrm.sqlerrmc will contain a string that describes the error. The field sqlca.sqlerrm.sqlerrml contains the length of the error message that is stored in sqlca.sqlerrm.sqlerrmc (the result of strlen(), not really interesting for a C programmer). Note that some messages are too long to fit in the fixed-size sqlerrmc array; they will be truncated. In case of a warning, sqlca.sqlwarn[2] is set to W. (In all other cases, it is set to something different from W.) If sqlca.sqlwarn[1] is set to W, then a value was truncated when it was stored in a host variable. sqlca.sqlwarn[0] is set to W if any of the other elements are set to indicate a warning. The fields sqlcaid, sqlcabc, sqlerrp, and the remaining elements of sqlerrd and sqlwarn currently contain no useful information. The structure sqlca is not defined in the SQL standard, but is implemented in several other SQL database systems. The definitions are similar at the core, but if you want to write portable applications, then you should investigate the different implementations carefully. <literal>SQLSTATE</literal> vs <literal>SQLCODE</literal> The fields sqlca.sqlstate and sqlca.sqlcode are two different schemes that provide error codes. Both are specified in the SQL standard, but SQLCODE has been marked deprecated in the 1992 edition of the standard and has been dropped in the 1999 edition. Therefore, new applications are strongly encouraged to use SQLSTATE. SQLSTATE is a five-character array. The five characters contain digits or upper-case letters that represent codes of various error and warning conditions. SQLSTATE has a hierarchical scheme: the first two characters indicate the general class of the condition, the last three characters indicate a subclass of the general condition. A successful state is indicated by the code 00000. The SQLSTATE codes are for the most part defined in the SQL standard. The PostgreSQL server natively supports SQLSTATE error codes; therefore a high degree of consistency can be achieved by using this error code scheme throughout all applications. For further information see . SQLCODE, the deprecated error code scheme, is a simple integer. A value of 0 indicates success, a positive value indicates success with additional information, a negative value indicates an error. The SQL standard only defines the positive value +100, which indicates that the last command returned or affected zero rows, and no specific negative values. Therefore, this scheme can only achieve poor portability and does not have a hierarchical code assignment. Historically, the embedded SQL processor for PostgreSQL has assigned some specific SQLCODE values for its use, which are listed below with their numeric value and their symbolic name. Remember that these are not portable to other SQL implementations. To simplify the porting of applications to the SQLSTATE scheme, the corresponding SQLSTATE is also listed. There is, however, no one-to-one or one-to-many mapping between the two schemes (indeed it is many-to-many), so you should consult the global SQLSTATE listing in in each case. These are the assigned SQLCODE values: -12 (ECPG_OUT_OF_MEMORY) Indicates that your virtual memory is exhausted. (SQLSTATE YE001) -200 (ECPG_UNSUPPORTED) Indicates the preprocessor has generated something that the library does not know about. Perhaps you are running incompatible versions of the preprocessor and the library. (SQLSTATE YE002) -201 (ECPG_TOO_MANY_ARGUMENTS) This means that the command specified more host variables than the command expected. (SQLSTATE 07001 or 07002) -202 (ECPG_TOO_FEW_ARGUMENTS) This means that the command specified fewer host variables than the command expected. (SQLSTATE 07001 or 07002) -203 (ECPG_TOO_MANY_MATCHES) This means a query has returned multiple rows but the statement was only prepared to store one result row (for example, because the specified variables are not arrays). (SQLSTATE 21000) -204 (ECPG_INT_FORMAT) The host variable is of type int and the datum in the database is of a different type and contains a value that cannot be interpreted as an int. The library uses strtol() for this conversion. (SQLSTATE 42804) -205 (ECPG_UINT_FORMAT) The host variable is of type unsigned int and the datum in the database is of a different type and contains a value that cannot be interpreted as an unsigned int. The library uses strtoul() for this conversion. (SQLSTATE 42804) -206 (ECPG_FLOAT_FORMAT) The host variable is of type float and the datum in the database is of another type and contains a value that cannot be interpreted as a float. The library uses strtod() for this conversion. (SQLSTATE 42804) -207 (ECPG_CONVERT_BOOL) This means the host variable is of type bool and the datum in the database is neither 't' nor 'f'. (SQLSTATE 42804) -208 (ECPG_EMPTY) The statement sent to the PostgreSQL server was empty. (This cannot normally happen in an embedded SQL program, so it may point to an internal error.) (SQLSTATE YE002) -209 (ECPG_MISSING_INDICATOR) A null value was returned and no null indicator variable was supplied. (SQLSTATE 22002) -210 (ECPG_NO_ARRAY) An ordinary variable was used in a place that requires an array. (SQLSTATE 42804) -211 (ECPG_DATA_NOT_ARRAY) The database returned an ordinary variable in a place that requires array value. (SQLSTATE 42804) -220 (ECPG_NO_CONN) The program tried to access a connection that does not exist. (SQLSTATE 08003) -221 (ECPG_NOT_CONN) The program tried to access a connection that does exist but is not open. (This is an internal error.) (SQLSTATE YE002) -230 (ECPG_INVALID_STMT) The statement you are trying to use has not been prepared. (SQLSTATE 26000) -240 (ECPG_UNKNOWN_DESCRIPTOR) The descriptor specified was not found. The statement you are trying to use has not been prepared. (SQLSTATE 33000) -241 (ECPG_INVALID_DESCRIPTOR_INDEX) The descriptor index specified was out of range. (SQLSTATE 07009) -242 (ECPG_UNKNOWN_DESCRIPTOR_ITEM) An invalid descriptor item was requested. (This is an internal error.) (SQLSTATE YE002) -243 (ECPG_VAR_NOT_NUMERIC) During the execution of a dynamic statement, the database returned a numeric value and the host variable was not numeric. (SQLSTATE 07006) -244 (ECPG_VAR_NOT_CHAR) During the execution of a dynamic statement, the database returned a non-numeric value and the host variable was numeric. (SQLSTATE 07006) -400 (ECPG_PGSQL) Some error caused by the PostgreSQL server. The message contains the error message from the PostgreSQL server. -401 (ECPG_TRANS) The PostgreSQL server signaled that we cannot start, commit, or rollback the transaction. (SQLSTATE 08007) -402 (ECPG_CONNECT) The connection attempt to the database did not succeed. (SQLSTATE 08001) 100 (ECPG_NOT_FOUND) This is a harmless condition indicating that the last command retrieved or processed zero rows, or that you are at the end of the cursor. (SQLSTATE 02000) Including Files To include an external file into your embedded SQL program, use: EXEC SQL INCLUDE filename; The embedded SQL preprocessor will look for a file named filename.h, preprocess it, and include it in the resulting C output. Thus, embedded SQL statements in the included file are handled correctly. Note that this is not the same as #include <filename.h> because this file would not be subject to SQL command preprocessing. Naturally, you can continue to use the C #include directive to include other header files. The include file name is case-sensitive, even though the rest of the EXEC SQL INCLUDE command follows the normal SQL case-sensitivity rules. Processing Embedded SQL Programs Now that you have an idea how to form embedded SQL C programs, you probably want to know how to compile them. Before compiling you run the file through the embedded SQL C preprocessor, which converts the SQL statements you used to special function calls. After compiling, you must link with a special library that contains the needed functions. These functions fetch information from the arguments, perform the SQL command using the libpq interface, and put the result in the arguments specified for output. The preprocessor program is called ecpg and is included in a normal PostgreSQL installation. Embedded SQL programs are typically named with an extension .pgc. If you have a program file called prog1.pgc, you can preprocess it by simply calling ecpg prog1.pgc This will create a file called prog1.c. If your input files do not follow the suggested naming pattern, you can specify the output file explicitly using the option. The preprocessed file can be compiled normally, for example: cc -c prog1.c The generated C source files include header files from the PostgreSQL installation, so if you installed PostgreSQL in a location that is not searched by default, you have to add an option such as -I/usr/local/pgsql/include to the compilation command line. To link an embedded SQL program, you need to include the libecpg library, like so: cc -o myprog prog1.o prog2.o ... -lecpg Again, you might have to add an option like -L/usr/local/pgsql/lib to that command line. If you manage the build process of a larger project using make, it may be convenient to include the following implicit rule to your makefiles: ECPG = ecpg %.c: %.pgc $(ECPG) $< The complete syntax of the ecpg command is detailed in . The ecpg library is thread-safe if it is built using the Library Functions The libecpg library primarily contains hidden functions that are used to implement the functionality expressed by the embedded SQL commands. But there are some functions that can usefully be called directly. Note that this makes your code unportable. ECPGdebug(int on, FILE *stream) turns on debug logging if called with the first argument non-zero. Debug logging is done on stream. The log contains all SQL statements with all the input variables inserted, and the results from the PostgreSQL server. This can be very useful when searching for errors in your SQL statements. ECPGstatus(int lineno, const char* connection_name) returns true if you are connected to a database and false if not. connection_name can be NULL if a single connection is being used. Internals This section explains how ECPG works internally. This information can occasionally be useful to help users understand how to use ECPG. The first four lines written by ecpg to the output are fixed lines. Two are comments and two are include lines necessary to interface to the library. Then the preprocessor reads through the file and writes output. Normally it just echoes everything to the output. When it sees an EXEC SQL statement, it intervenes and changes it. The command starts with EXEC SQL and ends with ;. Everything in between is treated as an SQL statement and parsed for variable substitution. Variable substitution occurs when a symbol starts with a colon (:). The variable with that name is looked up among the variables that were previously declared within a EXEC SQL DECLARE section. The most important function in the library is ECPGdo, which takes care of executing most commands. It takes a variable number of arguments. This can easily add up to 50 or so arguments, and we hope this will not be a problem on any platform. The arguments are: A line number This is the line number of the original line; used in error messages only. A string This is the SQL command that is to be issued. It is modified by the input variables, i.e., the variables that where not known at compile time but are to be entered in the command. Where the variables should go the string contains ?. Input variables Every input variable causes ten arguments to be created. (See below.) ECPGt_EOIT An enum telling that there are no more input variables. Output variables Every output variable causes ten arguments to be created. (See below.) These variables are filled by the function. ECPGt_EORT An enum telling that there are no more variables. For every variable that is part of the SQL command, the function gets ten arguments: The type as a special symbol. A pointer to the value or a pointer to the pointer. The size of the variable if it is a char or varchar. The number of elements in the array (for array fetches). The offset to the next element in the array (for array fetches). The type of the indicator variable as a special symbol. A pointer to the indicator variable. 0 The number of elements in the indicator array (for array fetches). The offset to the next element in the indicator array (for array fetches). Note that not all SQL commands are treated in this way. For instance, an open cursor statement like EXEC SQL OPEN cursor; is not copied to the output. Instead, the cursor's DECLARE command is used at the position of the OPEN command because it indeed opens the cursor. Here is a complete example describing the output of the preprocessor of a file foo.pgc (details may change with each particular version of the preprocessor): EXEC SQL BEGIN DECLARE SECTION; int index; int result; EXEC SQL END DECLARE SECTION; ... EXEC SQL SELECT res INTO :result FROM mytable WHERE index = :index; is translated into: /* Processed by ecpg (2.6.0) */ /* These two include files are added by the preprocessor */ #include <ecpgtype.h>; #include <ecpglib.h>; /* exec sql begin declare section */ #line 1 "foo.pgc" int index; int result; /* exec sql end declare section */ ... ECPGdo(__LINE__, NULL, "SELECT res FROM mytable WHERE index = ? ", ECPGt_int,&(index),1L,1L,sizeof(int), ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EOIT, ECPGt_int,&(result),1L,1L,sizeof(int), ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT); #line 147 "foo.pgc" (The indentation here is added for readability and not something the preprocessor does.)