From 8e6467fff304fbfb0bb66925aa08a9f11c125daf Mon Sep 17 00:00:00 2001
From: Bruce Momjian
Date: Wed, 21 Nov 2001 03:17:22 +0000
Subject: [PATCH] Peter Eisentraut wrote:

> So I would base this discussion on the premise "bytea stores binary data"
> (insert examples).
>
> Some stylistic issues:
>
> bytea => bytea
>
> NULLs => zero bytes/bytes of value zero ("NULL" is too overloaded)
>
> 'non-printable' => nonprintable
>
> MUST => must
>

Here's a patch against *CVS tip* to address Peter's comments. Please let
me know what you think!

Joe Conway
---
 doc/src/sgml/datatype.sgml | 135 +++++++++++++++++++------------------
 1 file changed, 69 insertions(+), 66 deletions(-)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 016993c0eb..a6012e56ac 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -1,5 +1,5 @@
@@ -984,7 +984,7 @@ SELECT b, char_length(b) FROM test2;
  bytea
- 4 bytes plus the actual string
+ 4 bytes plus the actual binary string
  Variable (not specifically limited) length binary string
@@ -994,29 +994,28 @@ SELECT b, char_length(b) FROM test2;
  A binary string is a sequence of octets that does not have either a
- character set or collation associated with it. Bytea specifically
- allows storage of NULLs and other 'non-printable' ASCII
- characters.
+ character set or collation associated with it. Bytea
+ specifically allows storing octets of zero value and other
+ non-printable octets.
- Certain ASCII characters MUST be escaped (but all
- characters MAY be escaped) when used as part of a string literal in an
- SQL statement. In general, to escape a character, it
- is converted into the three digit octal number equal to the decimal
- ASCII value, and preceeded by two backslashes. The
- single quote (') and backslash (\) characters have special alternate
- escape sequences. Details are in
+ Octets of certain values must be escaped (but all
+ octet values may be escaped) when used as part of
+ a string literal in an SQL statement. In general,
+ to escape an octet, it is converted into the three digit octal number
+ equivalent of its decimal octet value, and preceeded by two
+ backslashes. Octets with the decimal values 39 (single quote), and 92
+ (backslash), have special alternate escape sequences. Details are in
  .
- <acronym>SQL</acronym> Literal Escaped <acronym>ASCII</acronym>
- Characters
+ <acronym>SQL</acronym> Literal Escaped Octets
- Decimal ASCII Value
+ Decimal Octet Value
  Description
  Input Escaped Representation
  Example
@@ -1027,7 +1026,7 @@ SELECT b, char_length(b) FROM test2;
  0
- null byte
+ zero octet
  '\\000'
  select '\\000'::bytea;
  \000
@@ -1055,24 +1054,23 @@ SELECT b, char_length(b) FROM test2;
  Note that the result in each of the examples above was exactly one
- byte in length, even though the output representation of the null byte
- and backslash are more than one character. Bytea output characters
- are also escaped. In general, each "non-printable" character is
- converted into the three digit octal number equal to its decimal
- ASCII value, and preceeded by one backslash. Most
- "printable" characters are represented by their standard
- ASCII representation. The backslash (\) character
- has a special alternate output representation. Details are in
- .
+ octet in length, even though the output representation of the zero
+ octet and backslash are more than one character. Bytea
+ output octets are also escaped. In general, each
+ non-printable octet decimal value is converted into
+ its equivalent three digit octal value, and preceeded by one backslash.
+ Most printable octets are represented by their standard
+ representation in the client character set. The octet with decimal
+ value 92 (backslash) has a special alternate output representation.
+ Details are in .
- <acronym>SQL</acronym> Output Escaped <acronym>ASCII</acronym>
- Characters
+ <acronym>SQL</acronym> Output Escaped Octets
- Decimal ASCII Value
+ Decimal Octet Value
  Description
  Output Escaped Representation
  Example
@@ -1100,7 +1098,7 @@ SELECT b, char_length(b) FROM test2;
  0 to 31 and 127 to 255
- non-printable characters
+ non-printable octets
  \### (octal value)
  select '\\001'::bytea;
  \001
@@ -1108,8 +1106,8 @@ SELECT b, char_length(b) FROM test2;
  32 to 126
- printable characters
- ASCII representation
+ printable octets
+ client character set representation
  select '\\176'::bytea;
  ~
@@ -1123,76 +1121,81 @@ SELECT b, char_length(b) FROM test2;
  preceeded with two backslashes due to the fact that they must pass
  through two parsers in the PostgreSQL backend. The first backslash
  is interpreted as an escape character by the string literal parser,
- and therefore is consumed, leaving the characters that follow it.
- The second backslash is recognized by bytea input function
+ and therefore is consumed, leaving the octets that follow.
+ The second backslash is recognized by bytea input function
  as the prefix of a three digit octal value. For example, a string
  literal passed to the backend as '\\001' becomes
  '\001' after passing through the string literal
- parser. The '\001' is then sent to the bytea
- input function, where it is converted to a single byte with a decimal
- ASCII value of 1.
+ parser. The '\001' is then sent to the
+ bytea input function, where it is converted to a single
+ octet with a decimal value of 1.
  For a similar reason, a backslash must be input as
  '\\\\' (or '\\134'). The first
- and third backslashes are interpreted as escape characters by the
+ and third backslashes are interpreted as escape octets by the
  string literal parser, and therefore are consumed, leaving the
  second and forth backslashes untouched. The second and forth
- backslashes are recognized by bytea input function as a single
- backslash. For example, a string literal passed to the backend as
- '\\\\' becomes '\\' after passing
- through the string literal parser. The '\\' is then
- sent to the bytea input function, where it is converted to a single
- byte with a decimal ASCII value of 92.
+ backslashes are recognized by the bytea input function
+ as a single backslash. For example, a string literal passed to the
+ backend as '\\\\' becomes '\\'
+ after passing through the string literal parser. The
+ '\\' is then sent to the bytea input
+ function, where it is converted to a single octet with a decimal
+ value of 92.
  A single quote is a bit different in that it must be input as
- '\'' (or '\\134'), NOT as
- '\\''. This is because, while the literal parser
- interprets the single quote as a special character, and will consume
- the single backslash, the bytea input function does NOT recognize
- a single quote as a special character. Therefore a string
+ '\'' (or '\\134'),
+ not as '\\''. This is because,
+ while the literal parser interprets the single quote as a special
+ character, and will consume the single backslash, the
+ bytea input function does not
+ recognize a single quote as a special octet. Therefore a string
  literal passed to the backend as '\'' becomes
  ''' after passing through the string literal
- parser. The ''' is then sent to the bytea
- input function, where it is retains its single byte decimal
- ASCII value of 39.
+ parser. The ''' is then sent to the
+ bytea input function, where it is retains its single
+ octet decimal value of 39.
  Depending on the front end to PostgreSQL you use, you may have
- additional work to do in terms of escaping and unescaping bytea
- strings. For example, you may also have to escape line feeds and
- carriage return if your interface automatically translates these.
- Or you may have to double up on backslashes if the parser for your
- language or choice also treats them as an escape character.
+ additional work to do in terms of escaping and unescaping
+ bytea strings. For example, you may also have to escape
+ line feeds and carriage returns if your interface automatically
+ translates these. Or you may have to double up on backslashes if
+ the parser for your language or choice also treats them as an
+ escape octet.
  Compatibility
- Bytea provides most of the functionality of the SQL99 binary string
- type per SQL99 section 4.3. A comparison of PostgreSQL bytea and SQL99
- Binary Strings is presented in
+ Bytea provides most of the functionality of the binary
+ string type per SQL99 section 4.3. A comparison of SQL99 Binary
+ Strings and PostgreSQL bytea is presented in
  .
- Comparison of SQL99 Binary String and BYTEA types
+ Comparison of SQL99 Binary String and PostgreSQL
+ <type>BYTEA</type> types
  SQL99
- BYTEA
+ BYTEA
- Name of data type BINARY LARGE OBJECT or BLOB
- Name of data type BYTEA
+ Name of data type BINARY LARGE OBJECT
+ or BLOB
+ Name of data type BYTEA
@@ -1242,9 +1245,9 @@ SELECT b, char_length(b) FROM test2;
  A binary string literal is comprised of an even number of
- hexidecimal digits, in single quotes, preceeded by "X",
- e.g. X'1a43fe'
- A binary string literal is comprised of ASCII characters
+ hexidecimal digits, in single quotes, preceeded by X,
+ e.g. X'1a43fe'
+ A binary string literal is comprised of octets
  escaped according to the rules shown in