mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-24 18:55:04 +08:00
In an effort to reduce the total number of chapters, combine the small
chapters on extending types, operators, and aggregates into the extending functions chapter. Move the information on how to call table functions into the queries chapter. Remove some outdated information that is already present in a better form in other parts of the documentation.
parent
730840c9b6
commit
a6554df4f7
@ -2,7 +2,7 @@
|
||||
#
|
||||
# PostgreSQL documentation makefile
|
||||
#
|
||||
# $Header: /cvsroot/pgsql/doc/src/sgml/Makefile,v 1.56 2003/03/25 16:15:35 petere Exp $
|
||||
# $Header: /cvsroot/pgsql/doc/src/sgml/Makefile,v 1.57 2003/04/10 01:22:44 petere Exp $
|
||||
#
|
||||
#----------------------------------------------------------------------------
|
||||
|
||||
@ -77,7 +77,7 @@ all: html
|
||||
|
||||
.PHONY: html
|
||||
|
||||
html: postgres.sgml $(ALLSGML) stylesheet.dsl catalogs.gif connections.gif
|
||||
html: postgres.sgml $(ALLSGML) stylesheet.dsl
|
||||
@rm -f *.html
|
||||
$(JADE) $(JADEFLAGS) $(SGMLINCLUDE) $(CATALOG) -d stylesheet.dsl -i output-html -t sgml $<
|
||||
|
||||
@ -114,8 +114,6 @@ features-unsupported.sgml: $(top_srcdir)/src/backend/catalog/sql_feature_package
|
||||
%.rtf: %.sgml $(ALLSGML) stylesheet.dsl
|
||||
$(JADE) $(JADEFLAGS) $(SGMLINCLUDE) $(CATALOG) -d stylesheet.dsl -t rtf -V rtf-backend -i output-print $<
|
||||
|
||||
postgres.rtf: catalogs.gif connections.gif
|
||||
|
||||
# TeX
|
||||
# Regular TeX and pdfTeX have slightly differing requirements, so we
|
||||
# need to distinguish the path we're taking.
|
||||
@ -123,13 +121,9 @@ postgres.rtf: catalogs.gif connections.gif
|
||||
%.tex-ps: %.sgml $(ALLSGML) stylesheet.dsl
|
||||
$(JADE) $(JADEFLAGS) $(SGMLINCLUDE) $(CATALOG) -d stylesheet.dsl -t tex -V tex-backend -i output-print -V texdvi-output -o $@ $<
|
||||
|
||||
postgres.tex-ps: catalogs.eps connections.eps
|
||||
|
||||
%.tex-pdf: %.sgml $(ALLSGML) stylesheet.dsl
|
||||
$(JADE) $(JADEFLAGS) $(SGMLINCLUDE) $(CATALOG) -d stylesheet.dsl -t tex -V tex-backend -i output-print -V texpdf-output -o $@ $<
|
||||
|
||||
postgres.tex-pdf: catalogs.pdf connections.pdf
|
||||
|
||||
%.dvi: %.tex-ps
|
||||
@rm -f $*.aux $*.log
|
||||
jadetex $<
|
||||
|
@ -1,116 +0,0 @@
|
||||
<Chapter Id="arch-pg">
|
||||
<TITLE>Architecture</TITLE>
|
||||
|
||||
<Sect1 id="arch-pg-concepts">
|
||||
<Title><ProductName>PostgreSQL</ProductName> Architectural Concepts</Title>
|
||||
|
||||
<Para>
|
||||
Before we begin, you should understand the basic
|
||||
<ProductName>PostgreSQL</ProductName> system architecture. Understanding how the
|
||||
parts of <ProductName>PostgreSQL</ProductName> interact will make the next chapter
|
||||
somewhat clearer.
|
||||
In database jargon, <ProductName>PostgreSQL</ProductName> uses a simple "process
|
||||
per-user" client/server model. A <ProductName>PostgreSQL</ProductName> session
|
||||
consists of the following cooperating Unix processes (programs):
|
||||
|
||||
<ItemizedList>
|
||||
<ListItem>
|
||||
<Para>
|
||||
A supervisory daemon process (the <Application>postmaster</Application>),
|
||||
</Para>
|
||||
</ListItem>
|
||||
<ListItem>
|
||||
<Para>
|
||||
the user's frontend application (e.g., the <Application>psql</Application> program), and
|
||||
</Para>
|
||||
</ListItem>
|
||||
<ListItem>
|
||||
<Para>
|
||||
one or more backend database servers (the <Application>postgres</Application> process itself).
|
||||
</Para>
|
||||
</ListItem>
|
||||
</ItemizedList>
|
||||
</para>
|
||||
<Para>
|
||||
A single <Application>postmaster</Application> manages a given collection of
|
||||
databases on a single host. Such a collection of
|
||||
databases is called a cluster (of databases). A frontend
|
||||
application that wishes to access a given database
|
||||
within a cluster makes calls to an interface library (e.g., <application>libpq</>)
|
||||
that is linked into the application.
|
||||
The library sends user requests over the network to the
|
||||
<Application>postmaster</Application>
|
||||
(<XRef LinkEnd="PGARCH-CONNECTIONS">(a)),
|
||||
which in turn starts a new backend server process
|
||||
(<XRef LinkEnd="PGARCH-CONNECTIONS">(b))
|
||||
|
||||
<figure id="PGARCH-CONNECTIONS">
|
||||
<title>How a connection is established</title>
|
||||
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata align="center" fileref="connections">
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
|
||||
and connects the frontend process to the new server
|
||||
(<XRef LinkEnd="PGARCH-CONNECTIONS">(c)).
|
||||
From that point on, the frontend process and the backend
|
||||
server communicate without intervention by the
|
||||
<Application>postmaster</Application>. Hence, the <Application>postmaster</Application> is always running, waiting
|
||||
for connection requests, whereas frontend and backend processes
|
||||
come and go. The <FileName>libpq</FileName> library allows a single
|
||||
frontend to make multiple connections to backend processes.
|
||||
However, each backend process is a single-threaded process that can
|
||||
only execute one query at a time; so the communication over any one
|
||||
frontend-to-backend connection is single-threaded.
|
||||
</Para>
|
||||
|
||||
<Para>
|
||||
One implication of this architecture is that the
|
||||
<Application>postmaster</Application> and the backend always run on the
|
||||
same machine (the database server), while the frontend
|
||||
application may run anywhere. You should keep this
|
||||
in mind,
|
||||
because the files that can be accessed on a client
|
||||
machine may not be accessible (or may only be accessed
|
||||
using a different path name) on the database server
|
||||
machine.
|
||||
</Para>
|
||||
|
||||
<Para>
|
||||
You should also be aware that the <Application>postmaster</Application> and
|
||||
<application>postgres</> servers run with the user ID of the <ProductName>PostgreSQL</ProductName>
|
||||
<quote>superuser</>.
|
||||
Note that the <ProductName>PostgreSQL</ProductName> superuser does not
|
||||
have to be any particular user (e.g., a user named
|
||||
<literal>postgres</literal>), although many systems are installed that way.
|
||||
Furthermore, the <ProductName>PostgreSQL</ProductName> superuser should
|
||||
definitely not be the Unix superuser, <literal>root</literal>!
|
||||
It is safest if the <ProductName>PostgreSQL</ProductName> superuser is an
|
||||
ordinary, unprivileged user so far as the surrounding Unix system is
|
||||
concerned.
|
||||
In any case, all files relating to a database should belong to
|
||||
this <ProductName>Postgres</ProductName> superuser.
|
||||
</Para>
|
||||
</sect1>
|
||||
</Chapter>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode:sgml
|
||||
sgml-omittag:nil
|
||||
sgml-shorttag:t
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-tabs-mode:nil
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-default-dtd-file:"./reference.ced"
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:("/usr/share/sgml/catalog")
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/dfunc.sgml,v 1.24 2003/03/25 16:15:35 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/dfunc.sgml,v 1.25 2003/04/10 01:22:44 petere Exp $
|
||||
-->
|
||||
|
||||
<sect2 id="dfunc">
|
||||
@ -14,7 +14,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/dfunc.sgml,v 1.24 2003/03/25 16:15:35 peter
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more information you should read the documentation of your
|
||||
For information beyond what is contained in this section
|
||||
you should read the documentation of your
|
||||
operating system, in particular the manual pages for the C compiler,
|
||||
<command>cc</command>, and the link editor, <command>ld</command>.
|
||||
In addition, the <productname>PostgreSQL</productname> source code
|
||||
@ -47,13 +48,10 @@ $Header: /cvsroot/pgsql/doc/src/sgml/dfunc.sgml,v 1.24 2003/03/25 16:15:35 peter
|
||||
here.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
|
||||
<!--
|
||||
Note: Reading GNU Libtool sources is generally a good way of figuring out
|
||||
this information. The methods used within
|
||||
<productname>PostgreSQL</> source code are not
|
||||
necessarily ideal.
|
||||
Note: Reading GNU Libtool sources is generally a good way of
|
||||
figuring out this information. The methods used within PostgreSQL
|
||||
source code are not necessarily ideal.
|
||||
-->
|
||||
|
||||
<variablelist>
|
||||
@ -160,7 +158,7 @@ cc -shared -o foo.so foo.o
|
||||
<indexterm><primary>MacOS X</></>
|
||||
<listitem>
|
||||
<para>
|
||||
Here is a sample. It assumes the developer tools are installed.
|
||||
Here is an example. It assumes the developer tools are installed.
|
||||
<programlisting>
|
||||
cc -c foo.c
|
||||
cc -bundle -flat_namespace -undefined suppress -o foo.so foo.o
|
||||
@ -271,17 +269,13 @@ gcc -shared -o foo.so foo.o
|
||||
</varlistentry>
|
||||
|
||||
</variablelist>
|
||||
</para>
|
||||
|
||||
<tip>
|
||||
<para>
|
||||
If you want to package your extension modules for wide distribution
|
||||
you should consider using <ulink
|
||||
url="http://www.gnu.org/software/libtool/"><productname>GNU
|
||||
Libtool</productname></ulink> for building shared libraries. It
|
||||
encapsulates the platform differences into a general and powerful
|
||||
interface. Serious packaging also requires considerations about
|
||||
library versioning, symbol resolution methods, and other issues.
|
||||
If this is too complicated for you, you should consider using
|
||||
<ulink url="http://www.gnu.org/software/libtool/"><productname>GNU
|
||||
Libtool</productname></ulink>, which hides the platform differences
|
||||
behind a uniform interface.
|
||||
</para>
|
||||
</tip>
|
||||
|
||||
|
@ -1,9 +1,9 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.20 2003/03/25 16:15:36 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.21 2003/04/10 01:22:44 petere Exp $
|
||||
-->
|
||||
|
||||
<chapter id="extend">
|
||||
<title>Extending <acronym>SQL</acronym>: An Overview</title>
|
||||
<title>Extending <acronym>SQL</acronym></title>
|
||||
|
||||
<indexterm zone="extend">
|
||||
<primary>extending SQL</primary>
|
||||
@ -17,22 +17,22 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.20 2003/03/25 16:15:36 pete
|
||||
<itemizedlist spacing="compact" mark="bullet">
|
||||
<listitem>
|
||||
<para>
|
||||
functions
|
||||
functions (starting in <xref linkend="xfunc">)
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
data types
|
||||
data types (starting in <xref linkend="xtypes">)
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
operators
|
||||
operators (starting in <xref linkend="xoper">)
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
aggregates
|
||||
aggregates (starting in <xref linkend="xaggr">)
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
@ -44,30 +44,29 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.20 2003/03/25 16:15:36 pete
|
||||
<para>
|
||||
<productname>PostgreSQL</productname> is extensible because its operation is
|
||||
catalog-driven. If you are familiar with standard
|
||||
relational systems, you know that they store information
|
||||
relational database systems, you know that they store information
|
||||
about databases, tables, columns, etc., in what are
|
||||
commonly known as system catalogs. (Some systems call
|
||||
this the data dictionary). The catalogs appear to the
|
||||
user as tables like any other, but the <acronym>DBMS</acronym> stores
|
||||
its internal bookkeeping in them. One key difference
|
||||
between <productname>PostgreSQL</productname> and standard relational systems is
|
||||
between <productname>PostgreSQL</productname> and standard relational database systems is
|
||||
that <productname>PostgreSQL</productname> stores much more information in its
|
||||
catalogs -- not only information about tables and columns,
|
||||
but also information about its types, functions, access
|
||||
catalogs: not only information about tables and columns,
|
||||
but also information about data types, functions, access
|
||||
methods, and so on. These tables can be modified by
|
||||
the user, and since <productname>PostgreSQL</productname> bases its internal operation
|
||||
the user, and since <productname>PostgreSQL</productname> bases its operation
|
||||
on these tables, this means that <productname>PostgreSQL</productname> can be
|
||||
extended by users. By comparison, conventional
|
||||
database systems can only be extended by changing hardcoded
|
||||
procedures within the <acronym>DBMS</acronym> or by loading modules
|
||||
procedures in the source code or by loading modules
|
||||
specially written by the <acronym>DBMS</acronym> vendor.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<productname>PostgreSQL</productname> is also unlike most other data managers in
|
||||
that the server can incorporate user-written code into
|
||||
The PostgreSQL server can moreover incorporate user-written code into
|
||||
itself through dynamic loading. That is, the user can
|
||||
specify an object code file (e.g., a shared library) that implements a new type or function
|
||||
specify an object code file (e.g., a shared library) that implements a new type or function,
|
||||
and <productname>PostgreSQL</productname> will load it as required. Code written
|
||||
in <acronym>SQL</acronym> is even more trivial to add to the server.
|
||||
This ability to modify its operation <quote>on the fly</quote> makes
|
||||
@ -89,195 +88,25 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.20 2003/03/25 16:15:36 pete
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
The <productname>PostgreSQL</productname> type system
|
||||
can be broken down in several ways.
|
||||
Types are divided into base types and composite types.
|
||||
Data types are divided into base types and composite types.
|
||||
Base types are those, like <type>int4</type>, that are implemented
|
||||
in a language such as C. They generally correspond to
|
||||
what are often known as <firstterm>abstract data types</firstterm>; <productname>PostgreSQL</productname>
|
||||
what are often known as abstract data types. <productname>PostgreSQL</productname>
|
||||
can only operate on such types through methods provided
|
||||
by the user and only understands the behavior of such
|
||||
types to the extent that the user describes them.
|
||||
Composite types are created whenever the user creates a
|
||||
table.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<productname>PostgreSQL</productname> stores these types
|
||||
in only one way (within the
|
||||
file that stores all rows of a table) but the
|
||||
table. The
|
||||
user can <quote>look inside</quote> at the attributes of these types
|
||||
from the query language and optimize their retrieval by
|
||||
(for example) defining indexes on the attributes.
|
||||
<productname>PostgreSQL</productname> base types are further
|
||||
divided into built-in
|
||||
types and user-defined types. Built-in types (like
|
||||
<type>int4</type>) are those that are compiled
|
||||
into the system.
|
||||
User-defined types are those created by the user in the
|
||||
manner to be described later.
|
||||
from the query language.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="pg-system-catalogs">
|
||||
<title>About the <productname>PostgreSQL</productname> System Catalogs</title>
|
||||
&xfunc;
|
||||
&xtypes;
|
||||
&xoper;
|
||||
&xaggr;
|
||||
|
||||
<indexterm zone="pg-system-catalogs">
|
||||
<primary>catalogs</primary>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
Having introduced the basic extensibility concepts, we
|
||||
can now take a look at how the catalogs are actually
|
||||
laid out. You can skip this section for now, but some
|
||||
later sections will be incomprehensible without the
|
||||
information given here, so mark this page for later
|
||||
reference.
|
||||
All system catalogs have names that begin with
|
||||
<literal>pg_</literal>.
|
||||
The following tables contain information that may be
|
||||
useful to the end user. (There are many other system
|
||||
catalogs, but there should rarely be a reason to query
|
||||
them directly.)
|
||||
|
||||
<table tocentry="1">
|
||||
<title>PostgreSQL System Catalogs</title>
|
||||
<titleabbrev>Catalogs</titleabbrev>
|
||||
<tgroup cols="2">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Catalog Name</entry>
|
||||
<entry>Description</entry>
|
||||
</row>
|
||||
</thead>
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><structname>pg_database</></entry>
|
||||
<entry> databases</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_class</></entry>
|
||||
<entry> tables</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_attribute</></entry>
|
||||
<entry> table columns</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_index</></entry>
|
||||
<entry> indexes</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_proc</></entry>
|
||||
<entry> procedures/functions </entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_type</></entry>
|
||||
<entry> data types (both base and complex)</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_operator</></entry>
|
||||
<entry> operators</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_aggregate</></entry>
|
||||
<entry> aggregate functions</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_am</></entry>
|
||||
<entry> access methods</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_amop</></entry>
|
||||
<entry> access method operators</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_amproc</></entry>
|
||||
<entry> access method support functions</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><structname>pg_opclass</></entry>
|
||||
<entry> access method operator classes</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<figure float="1" id="EXTEND-CATALOGS">
|
||||
<title>The major <productname>PostgreSQL</productname> system catalogs</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="catalogs" align="center">
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
|
||||
<xref linkend="catalogs"> gives a more detailed explanation of these
|
||||
catalogs and their columns. However,
|
||||
<xref linkend="EXTEND-CATALOGS">
|
||||
shows the major entities and their relationships
|
||||
in the system catalogs. (Columns that do not refer
|
||||
to other entities are not shown unless they are part of
|
||||
a primary key.)
|
||||
This diagram is more or less incomprehensible until you
|
||||
actually start looking at the contents of the catalogs
|
||||
and see how they relate to each other. For now, the
|
||||
main things to take away from this diagram are as follows:
|
||||
|
||||
<itemizedlist spacing="compact" mark="bullet">
|
||||
<listitem>
|
||||
<para>
|
||||
In several of the sections that follow, we will
|
||||
present various join queries on the system
|
||||
catalogs that display information we need to extend
|
||||
the system. Looking at this diagram should make
|
||||
some of these join queries (which are often
|
||||
three- or four-way joins) more understandable,
|
||||
because you will be able to see that the
|
||||
columns used in the queries form foreign keys
|
||||
in other tables.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Many different features (tables, columns,
|
||||
functions, types, access methods, etc.) are
|
||||
tightly integrated in this schema. A simple
|
||||
create command may modify many of these catalogs.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Types and procedures
|
||||
are central to the schema.
|
||||
|
||||
<note>
|
||||
<para>
|
||||
We use the words <firstterm>procedure</firstterm>
|
||||
and <firstterm>function</firstterm> more or less interchangeably.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
Nearly every catalog contains some reference to
|
||||
rows in one or both of these tables. For
|
||||
example, <productname>PostgreSQL</productname> frequently uses type
|
||||
signatures (e.g., of functions and operators) to
|
||||
identify unique rows of other catalogs.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
There are many columns and relationships that
|
||||
have obvious meanings, but there are many
|
||||
(particularly those that have to do with access
|
||||
methods) that do not.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.27 2003/03/25 16:15:36 petere Exp $ -->
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.28 2003/04/10 01:22:44 petere Exp $ -->
|
||||
|
||||
<!entity history SYSTEM "history.sgml">
|
||||
<!entity info SYSTEM "info.sgml">
|
||||
@ -57,7 +57,6 @@
|
||||
<!entity wal SYSTEM "wal.sgml">
|
||||
|
||||
<!-- programmer's guide -->
|
||||
<!entity arch-pg SYSTEM "arch-pg.sgml">
|
||||
<!entity dfunc SYSTEM "dfunc.sgml">
|
||||
<!entity ecpg SYSTEM "ecpg.sgml">
|
||||
<!entity extend SYSTEM "extend.sgml">
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.49 2003/03/25 16:15:38 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.50 2003/04/10 01:22:44 petere Exp $
|
||||
-->
|
||||
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
|
||||
@ -210,15 +210,10 @@ $Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.49 2003/03/25 16:15:38 pe
|
||||
</para>
|
||||
</partintro>
|
||||
|
||||
&arch-pg;
|
||||
&extend;
|
||||
&xfunc;
|
||||
&xtypes;
|
||||
&xoper;
|
||||
&xaggr;
|
||||
&rules;
|
||||
&xindex;
|
||||
&indexcost;
|
||||
&rules;
|
||||
&trigger;
|
||||
&spi;
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.20 2003/03/13 01:30:29 petere Exp $ -->
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.21 2003/04/10 01:22:44 petere Exp $ -->
|
||||
|
||||
<chapter id="queries">
|
||||
<title>Queries</title>
|
||||
@ -550,6 +550,78 @@ FROM (SELECT * FROM table1) AS alias_name
|
||||
grouping or aggregation.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="queries-tablefunctions">
|
||||
<title>Table Functions</title>
|
||||
|
||||
<indexterm zone="queries-tablefunctions"><primary>table function</></>
|
||||
|
||||
<para>
|
||||
Table functions are functions that produce a set of rows, made up
|
||||
of either base data types (scalar types) or composite data types
|
||||
(table rows). They are used like a table, view, or subquery in
|
||||
the <literal>FROM</> clause of a query. Columns returned by table
|
||||
functions may be included in <literal>SELECT</>,
|
||||
<literal>JOIN</>, or <literal>WHERE</> clauses in the same manner
|
||||
as a table, view, or subquery column.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a table function returns a base data type, the single result
|
||||
column is named like the function. If the function returns a
|
||||
composite type, the result columns get the same names as the
|
||||
individual attributes of the type.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A table function may be aliased in the <literal>FROM</> clause,
|
||||
but it also may be left unaliased. If a function is used in the
|
||||
<literal>FROM</> clause with no alias, the function name is used
|
||||
as the resulting table name.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Some examples:
|
||||
<programlisting>
|
||||
CREATE TABLE foo (fooid int, foosubid int, fooname text);
|
||||
|
||||
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS '
|
||||
SELECT * FROM foo WHERE fooid = $1;
|
||||
' LANGUAGE SQL;
|
||||
|
||||
SELECT * FROM getfoo(1) AS t1;
|
||||
|
||||
SELECT * FROM foo
|
||||
WHERE foosubid IN (select foosubid from getfoo(foo.fooid) z
|
||||
where z.fooid = foo.fooid);
|
||||
|
||||
CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
|
||||
SELECT * FROM vw_getfoo;
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In some cases it is useful to define table functions that can
|
||||
return different column sets depending on how they are invoked.
|
||||
To support this, the table function can be declared as returning
|
||||
the pseudotype <type>record</>. When such a function is used in
|
||||
a query, the expected row structure must be specified in the
|
||||
query itself, so that the system can know how to parse and plan
|
||||
the query. Consider this example:
|
||||
<programlisting>
|
||||
SELECT *
|
||||
FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
|
||||
AS t1(proname name, prosrc text)
|
||||
WHERE proname LIKE 'bytea%';
|
||||
</programlisting>
|
||||
The <literal>dblink</> function executes a remote query (see
|
||||
<filename>contrib/dblink</>). It is declared to return
|
||||
<type>record</> since it might be used for any kind of query.
|
||||
The actual column set must be specified in the calling query so
|
||||
that the parser knows, for example, what <literal>*</> should
|
||||
expand to.
|
||||
</para>
|
||||
</sect3>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="queries-where">
|
||||
@ -951,7 +1023,7 @@ SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceab
|
||||
The <literal>DISTINCT ON</> clause is not part of the SQL standard
|
||||
and is sometimes considered bad style because of the potentially
|
||||
indeterminate nature of its results. With judicious use of
|
||||
<literal>GROUP BY</> and subselects in <literal>FROM</> the
|
||||
<literal>GROUP BY</> and subqueries in <literal>FROM</> the
|
||||
construct can be avoided, but it is often the most convenient
|
||||
alternative.
|
||||
</para>
|
||||
|
@ -1,9 +1,9 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/xaggr.sgml,v 1.19 2003/03/25 16:15:38 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/xaggr.sgml,v 1.20 2003/04/10 01:22:44 petere Exp $
|
||||
-->
|
||||
|
||||
<chapter id="xaggr">
|
||||
<title>Extending <acronym>SQL</acronym>: Aggregates</title>
|
||||
<sect1 id="xaggr">
|
||||
<title>User-Defined Aggregates</title>
|
||||
|
||||
<indexterm zone="xaggr">
|
||||
<primary>aggregate functions</primary>
|
||||
@ -22,38 +22,36 @@ $Header: /cvsroot/pgsql/doc/src/sgml/xaggr.sgml,v 1.19 2003/03/25 16:15:38 peter
|
||||
function. The state transition function is just an
|
||||
ordinary function that could also be used outside the
|
||||
context of the aggregate. A <firstterm>final function</firstterm>
|
||||
can also be specified, in case the desired output of the aggregate
|
||||
can also be specified, in case the desired result of the aggregate
|
||||
is different from the data that needs to be kept in the running
|
||||
state value.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Thus, in addition to the input and result data types seen by a user
|
||||
Thus, in addition to the argument and result data types seen by a user
|
||||
of the aggregate, there is an internal state-value data type that
|
||||
may be different from both the input and result types.
|
||||
may be different from both the argument and result types.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If we define an aggregate that does not use a final function,
|
||||
we have an aggregate that computes a running function of
|
||||
the column values from each row. <function>Sum</> is an
|
||||
example of this kind of aggregate. <function>Sum</> starts at
|
||||
the column values from each row. <function>sum</> is an
|
||||
example of this kind of aggregate. <function>sum</> starts at
|
||||
zero and always adds the current row's value to
|
||||
its running total. For example, if we want to make a <function>sum</>
|
||||
aggregate to work on a data type for complex numbers,
|
||||
we only need the addition function for that data type.
|
||||
The aggregate definition is:
|
||||
The aggregate definition would be:
|
||||
|
||||
<programlisting>
|
||||
<screen>
|
||||
CREATE AGGREGATE complex_sum (
|
||||
sfunc = complex_add,
|
||||
basetype = complex,
|
||||
stype = complex,
|
||||
initcond = '(0,0)'
|
||||
);
|
||||
</programlisting>
|
||||
|
||||
<screen>
|
||||
SELECT complex_sum(a) FROM test_complex;
|
||||
|
||||
complex_sum
|
||||
@ -61,43 +59,43 @@ SELECT complex_sum(a) FROM test_complex;
|
||||
(34,53.9)
|
||||
</screen>
|
||||
|
||||
(In practice, we'd just name the aggregate <function>sum</function>, and rely on
|
||||
(In practice, we'd just name the aggregate <function>sum</function> and rely on
|
||||
<productname>PostgreSQL</productname> to figure out which kind
|
||||
of sum to apply to a column of type <type>complex</type>.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The above definition of <function>sum</function> will return zero (the initial
|
||||
state condition) if there are no non-null input values.
|
||||
Perhaps we want to return NULL in that case instead --- the SQL standard
|
||||
state condition) if there are no nonnull input values.
|
||||
Perhaps we want to return null in that case instead --- the SQL standard
|
||||
expects <function>sum</function> to behave that way. We can do this simply by
|
||||
omitting the <literal>initcond</literal> phrase, so that the initial state
|
||||
condition is NULL. Ordinarily this would mean that the <literal>sfunc</literal>
|
||||
would need to check for a NULL state-condition input, but for
|
||||
condition is null. Ordinarily this would mean that the <literal>sfunc</literal>
|
||||
would need to check for a null state-condition input, but for
|
||||
<function>sum</function> and some other simple aggregates like <function>max</> and <function>min</>,
|
||||
it's sufficient to insert the first non-null input value into
|
||||
it would be sufficient to insert the first nonnull input value into
|
||||
the state variable and then start applying the transition function
|
||||
at the second non-null input value. <productname>PostgreSQL</productname>
|
||||
will do that automatically if the initial condition is NULL and
|
||||
at the second nonnull input value. <productname>PostgreSQL</productname>
|
||||
will do that automatically if the initial condition is null and
|
||||
the transition function is marked <quote>strict</> (i.e., not to be called
|
||||
for NULL inputs).
|
||||
for null inputs).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Another bit of default behavior for a <quote>strict</> transition function
|
||||
is that the previous state value is retained unchanged whenever a
|
||||
NULL input value is encountered. Thus, null values are ignored. If you
|
||||
need some other behavior for NULL inputs, just define your transition
|
||||
function as non-strict, and code it to test for NULL inputs and do
|
||||
null input value is encountered. Thus, null values are ignored. If you
|
||||
need some other behavior for null inputs, just do not define your transition
|
||||
function as strict, and code it to test for null inputs and do
|
||||
whatever is needed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<function>Avg</> (average) is a more complex example of an aggregate. It requires
|
||||
<function>avg</> (average) is a more complex example of an aggregate. It requires
|
||||
two pieces of running state: the sum of the inputs and the count
|
||||
of the number of inputs. The final result is obtained by dividing
|
||||
these quantities. Average is typically implemented by using a
|
||||
two-element array as the transition state value. For example,
|
||||
two-element array as the state value. For example,
|
||||
the built-in implementation of <function>avg(float8)</function>
|
||||
looks like:
|
||||
|
||||
@ -116,7 +114,7 @@ CREATE AGGREGATE avg (
|
||||
For further details see the description of the <command>CREATE
|
||||
AGGREGATE</command> command in <xref linkend="reference">.
|
||||
</para>
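The idea of carrying two pieces of running state for an average can be illustrated in plain C. This is only a sketch of the technique, not the built-in implementation: `AvgState`, `avg_accum`, and `avg_final` are hypothetical names, and a struct with two fields stands in for the two-element array mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-element running state: sum and count of inputs. */
typedef struct AvgState
{
    double sum;
    double count;
} AvgState;

/* Transition step: fold one input value into the state. */
static AvgState
avg_accum(AvgState state, double input)
{
    state.sum += input;
    state.count += 1;
    return state;
}

/* Final step: divide sum by count.  Returns 0 (no result) when no
 * inputs were seen, 1 otherwise. */
static int
avg_final(AvgState state, double *result)
{
    assert(result != NULL);
    if (state.count == 0)
        return 0;
    *result = state.sum / state.count;
    return 1;
}
```

Starting from a state of zero sum and zero count and accumulating 1, 2, 3, and 6 gives a sum of 12 and a count of 4, so the final step returns 3.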
|
||||
</chapter>
|
||||
</sect1>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
|
File diff suppressed because it is too large
@ -1,25 +1,9 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.22 2003/01/15 19:35:35 tgl Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.23 2003/04/10 01:22:45 petere Exp $
|
||||
-->
|
||||
|
||||
<Chapter Id="xoper">
|
||||
<Title>Extending <Acronym>SQL</Acronym>: Operators</Title>
|
||||
|
||||
<sect1 id="xoper-intro">
|
||||
<title>Introduction</title>
|
||||
|
||||
<Para>
|
||||
<ProductName>PostgreSQL</ProductName> supports left unary,
|
||||
right unary, and binary
|
||||
operators. Operators can be overloaded; that is,
|
||||
the same operator name can be used for different operators
|
||||
that have different numbers and types of operands. If
|
||||
there is an ambiguous situation and the system cannot
|
||||
determine the correct operator to use, it will return
|
||||
an error. You may have to type-cast the left and/or
|
||||
right operands to help it understand which operator you
|
||||
meant to use.
|
||||
</Para>
|
||||
<sect1 id="xoper">
|
||||
<title>User-defined Operators</title>
|
||||
|
||||
<Para>
|
||||
Every operator is <quote>syntactic sugar</quote> for a call to an
|
||||
@ -28,13 +12,18 @@ $Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.22 2003/01/15 19:35:35 tgl E
|
||||
the operator. However, an operator is <emphasis>not merely</emphasis>
|
||||
syntactic sugar, because it carries additional information
|
||||
that helps the query planner optimize queries that use the
|
||||
operator. Much of this chapter will be devoted to explaining
|
||||
operator. The next section will be devoted to explaining
|
||||
that additional information.
|
||||
</Para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="xoper-example">
|
||||
<title>Example</title>
|
||||
<Para>
|
||||
<productname>PostgreSQL</productname> supports left unary, right
|
||||
unary, and binary operators. Operators can be overloaded; that is,
|
||||
the same operator name can be used for different operators that
|
||||
have different numbers and types of operands. When a query is
|
||||
executed, the system determines the operator to call from the
|
||||
number and types of the provided operands.
|
||||
</Para>
|
||||
|
||||
<Para>
|
||||
Here is an example of creating an operator for adding two complex
|
||||
@ -45,7 +34,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.22 2003/01/15 19:35:35 tgl E
|
||||
<ProgramListing>
|
||||
CREATE FUNCTION complex_add(complex, complex)
|
||||
RETURNS complex
|
||||
AS '<replaceable>PGROOT</replaceable>/tutorial/complex'
|
||||
AS '<replaceable>filename</replaceable>', 'complex_add'
|
||||
LANGUAGE C;
|
||||
|
||||
CREATE OPERATOR + (
|
||||
@ -58,7 +47,7 @@ CREATE OPERATOR + (
|
||||
</Para>
|
||||
|
||||
<Para>
|
||||
Now we can do:
|
||||
Now we could execute a query like this:
|
||||
|
||||
<screen>
|
||||
SELECT (a + b) AS c FROM test_complex;
|
||||
@ -78,20 +67,13 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
<command>CREATE OPERATOR</command>. The <literal>commutator</>
|
||||
clause shown in the example is an optional hint to the query
|
||||
optimizer. Further details about <literal>commutator</> and other
|
||||
optimizer hints appear below.
|
||||
optimizer hints appear in the next section.
|
||||
</Para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="xoper-optimization">
|
||||
<title>Operator Optimization Information</title>
|
||||
|
||||
<note>
|
||||
<title>Author</title>
|
||||
<para>
|
||||
Written by Tom Lane.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<para>
|
||||
A <ProductName>PostgreSQL</ProductName> operator definition can include
|
||||
several optional clauses that tell the system useful things about how
|
||||
@ -99,7 +81,7 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
appropriate, because they can make for considerable speedups in execution
|
||||
of queries that use the operator. But if you provide them, you must be
|
||||
sure that they are right! Incorrect use of an optimization clause can
|
||||
result in backend crashes, subtly wrong output, or other Bad Things.
|
||||
result in server process crashes, subtly wrong output, or other Bad Things.
|
||||
You can always leave out an optimization clause if you are not sure
|
||||
about it; the only consequence is that queries might run slower than
|
||||
they need to.
|
||||
@ -112,7 +94,7 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>COMMUTATOR</title>
|
||||
<title><literal>COMMUTATOR</></title>
|
||||
|
||||
<para>
|
||||
The <literal>COMMUTATOR</> clause, if provided, names an operator that is the
|
||||
@ -155,7 +137,7 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
<para>
|
||||
The other, more straightforward way is just to include <literal>COMMUTATOR</> clauses
|
||||
in both definitions. When <ProductName>PostgreSQL</ProductName> processes
|
||||
the first definition and realizes that <literal>COMMUTATOR</> refers to a non-existent
|
||||
the first definition and realizes that <literal>COMMUTATOR</> refers to a nonexistent
|
||||
operator, the system will make a dummy entry for that operator in the
|
||||
system catalog. This dummy entry will have valid data only
|
||||
for the operator name, left and right operand types, and result type,
|
||||
@ -164,9 +146,7 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
dummy entry. Later, when you define the second operator, the system
|
||||
updates the dummy entry with the additional information from the second
|
||||
definition. If you try to use the dummy operator before it's been filled
|
||||
in, you'll just get an error message. (Note: This procedure did not work
|
||||
reliably in <ProductName>PostgreSQL</ProductName> versions before 6.5,
|
||||
but it is now the recommended way to do things.)
|
||||
in, you'll just get an error message.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
@ -174,7 +154,7 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>NEGATOR</title>
|
||||
<title><literal>NEGATOR</></title>
|
||||
|
||||
<para>
|
||||
The <literal>NEGATOR</> clause, if provided, names an operator that is the
|
||||
@ -194,14 +174,14 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
|
||||
<para>
|
||||
An operator's negator must have the same left and/or right operand types
|
||||
as the operator itself, so just as with <literal>COMMUTATOR</>, only the operator
|
||||
as the operator to be defined, so just as with <literal>COMMUTATOR</>, only the operator
|
||||
name need be given in the <literal>NEGATOR</> clause.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Providing a negator is very helpful to the query optimizer since
|
||||
it allows expressions like <literal>NOT (x = y)</> to be simplified into
|
||||
x <> y. This comes up more often than you might think, because
|
||||
<literal>x <> y</>. This comes up more often than you might think, because
|
||||
<literal>NOT</> operations can be inserted as a consequence of other rearrangements.
|
||||
</para>
|
||||
|
||||
@ -213,12 +193,12 @@ SELECT (a + b) AS c FROM test_complex;
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>RESTRICT</title>
|
||||
<title><literal>RESTRICT</></title>
|
||||
|
||||
<para>
|
||||
The <literal>RESTRICT</> clause, if provided, names a restriction selectivity
|
||||
estimation function for the operator (note that this is a function
|
||||
name, not an operator name). <literal>RESTRICT</> clauses only make sense for
|
||||
estimation function for the operator. (Note that this is a function
|
||||
name, not an operator name.) <literal>RESTRICT</> clauses only make sense for
|
||||
binary operators that return <type>boolean</>. The idea behind a restriction
|
||||
selectivity estimator is to guess what fraction of the rows in a
|
||||
table will satisfy a <literal>WHERE</literal>-clause condition of the form
|
||||
@ -269,15 +249,15 @@ column OP constant
|
||||
You can use <function>scalarltsel</> and <function>scalargtsel</> for comparisons on data types that
|
||||
have some sensible means of being converted into numeric scalars for
|
||||
range comparisons. If possible, add the data type to those understood
|
||||
by the routine <function>convert_to_scalar()</function> in <filename>src/backend/utils/adt/selfuncs.c</filename>.
|
||||
(Eventually, this routine should be replaced by per-data-type functions
|
||||
by the function <function>convert_to_scalar()</function> in <filename>src/backend/utils/adt/selfuncs.c</filename>.
|
||||
(Eventually, this function should be replaced by per-data-type functions
|
||||
identified through a column of the <classname>pg_type</> system catalog; but that hasn't happened
|
||||
yet.) If you do not do this, things will still work, but the optimizer's
|
||||
estimates won't be as good as they could be.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
There are additional selectivity functions designed for geometric
|
||||
There are additional selectivity estimation functions designed for geometric
|
||||
operators in <filename>src/backend/utils/adt/geo_selfuncs.c</filename>: <function>areasel</function>, <function>positionsel</function>,
|
||||
and <function>contsel</function>. At this writing these are just stubs, but you may want
|
||||
to use them (or even better, improve them) anyway.
|
||||
@ -285,12 +265,12 @@ column OP constant
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>JOIN</title>
|
||||
<title><literal>JOIN</></title>
|
||||
|
||||
<para>
|
||||
The <literal>JOIN</> clause, if provided, names a join selectivity
|
||||
estimation function for the operator (note that this is a function
|
||||
name, not an operator name). <literal>JOIN</> clauses only make sense for
|
||||
estimation function for the operator. (Note that this is a function
|
||||
name, not an operator name.) <literal>JOIN</> clauses only make sense for
|
||||
binary operators that return <type>boolean</type>. The idea behind a join
|
||||
selectivity estimator is to guess what fraction of the rows in a
|
||||
pair of tables will satisfy a <literal>WHERE</>-clause condition of the form
|
||||
@ -319,13 +299,13 @@ table1.column1 OP table2.column2
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>HASHES</title>
|
||||
<title><literal>HASHES</></title>
|
||||
|
||||
<para>
|
||||
The <literal>HASHES</literal> clause, if present, tells the system that
|
||||
it is permissible to use the hash join method for a join based on this
|
||||
operator. <literal>HASHES</> only makes sense for binary operators that
|
||||
return <literal>boolean</>, and in practice the operator had better be
|
||||
operator. <literal>HASHES</> only makes sense for a binary operator that
|
||||
returns <literal>boolean</>, and in practice the operator had better be
|
||||
equality for some data type.
|
||||
</para>
|
||||
|
||||
@ -340,33 +320,35 @@ table1.column1 OP table2.column2
|
||||
|
||||
<para>
|
||||
In fact, logical equality is not good enough either; the operator
|
||||
had better represent pure bitwise equality, because the hash function
|
||||
will be computed on the memory representation of the values regardless
|
||||
of what the bits mean. For example, equality of
|
||||
time intervals is not bitwise equality; the interval equality operator
|
||||
considers two time intervals equal if they have the same
|
||||
duration, whether or not their endpoints are identical. What this means
|
||||
is that a join using <literal>=</literal> between interval fields would yield different
|
||||
results if implemented as a hash join than if implemented another way,
|
||||
because a large fraction of the pairs that should match will hash to
|
||||
different values and will never be compared by the hash join. But
|
||||
if the optimizer chose to use a different kind of join, all the pairs
|
||||
that the equality operator says are equal will be found.
|
||||
We don't want that kind of inconsistency, so we don't mark interval
|
||||
equality as hashable.
|
||||
had better represent pure bitwise equality, because the hash
|
||||
function will be computed on the memory representation of the
|
||||
values regardless of what the bits mean. For example, the
|
||||
polygon operator <literal>~=</literal>, which checks whether two
|
||||
polygons are the same, is not bitwise equality, because two
|
||||
polygons can be considered the same even if their vertices are
|
||||
specified in a different order. What this means is that a join
|
||||
using <literal>~=</literal> between polygon fields would yield
|
||||
different results if implemented as a hash join than if
|
||||
implemented another way, because a large fraction of the pairs
|
||||
that should match will hash to different values and will never be
|
||||
compared by the hash join. But if the optimizer chooses to use a
|
||||
different kind of join, all the pairs that the operator
|
||||
<literal>~=</literal> says are the same will be found. We don't
|
||||
want that kind of inconsistency, so we don't mark the polygon
|
||||
operator <literal>~=</literal> as hashable.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
There are also machine-dependent ways in which a hash join might fail
|
||||
to do the right thing. For example, if your data type
|
||||
is a structure in which there may be uninteresting pad bits, it's unsafe
|
||||
to mark the equality operator <literal>HASHES</>. (Unless, perhaps, you write
|
||||
your other operators to ensure that the unused bits are always zero.)
|
||||
to mark the equality operator <literal>HASHES</>. (Unless you write
|
||||
your other operators and functions to ensure that the unused bits are always zero, which is the recommended strategy.)
|
||||
Another example is that the floating-point data types are unsafe for hash
|
||||
joins. On machines that meet the <acronym>IEEE</> floating-point standard, minus
|
||||
zero and plus zero are different values (different bit patterns) but
|
||||
joins. On machines that meet the <acronym>IEEE</> floating-point standard, negative
|
||||
zero and positive zero are different values (different bit patterns) but
|
||||
they are defined to compare equal. So, if the equality operator on floating-point data types were marked
|
||||
<literal>HASHES</>, a minus zero and a plus zero would probably not be matched up
|
||||
<literal>HASHES</>, a negative zero and a positive zero would probably not be matched up
|
||||
by a hash join, but they would be matched up by any other join process.
|
||||
</para>
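The floating-point case can be checked with a few lines of standalone C. This is a sketch under the assumption of a machine with IEEE floating point and a 64-bit <type>double</type>; the helper names `double_bits`, `equal_by_value`, and `equal_by_bits` are invented for illustration and are not part of PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw memory representation of a double. */
static uint64_t
double_bits(double d)
{
    uint64_t bits;

    assert(sizeof(bits) == sizeof(d));
    memcpy(&bits, &d, sizeof(bits));
    return bits;
}

/* Logical equality, as a floating-point equality operator sees it. */
static int
equal_by_value(double a, double b)
{
    return a == b;
}

/* Bitwise equality, as a hash over the memory representation sees it. */
static int
equal_by_bits(double a, double b)
{
    return double_bits(a) == double_bits(b);
}
```

On such a machine `equal_by_value(0.0, -0.0)` is true while `equal_by_bits(0.0, -0.0)` is false, which is exactly the mismatch that would make a hash join disagree with other join methods.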
|
||||
|
||||
@ -403,9 +385,9 @@ table1.column1 OP table2.column2
|
||||
|
||||
<para>
|
||||
The <literal>MERGES</literal> clause, if present, tells the system that
|
||||
it is permissible to use the merge join method for a join based on this
|
||||
operator. <literal>MERGES</> only makes sense for binary operators that
|
||||
return <literal>boolean</>, and in practice the operator must represent
|
||||
it is permissible to use the merge-join method for a join based on this
|
||||
operator. <literal>MERGES</> only makes sense for a binary operator that
|
||||
returns <literal>boolean</>, and in practice the operator must represent
|
||||
equality for some data type or pair of data types.
|
||||
</para>
|
||||
|
||||
@ -420,7 +402,7 @@ table1.column1 OP table2.column2
|
||||
data types had better be the same (or at least bitwise equivalent),
|
||||
it is possible to merge-join two
|
||||
distinct data types so long as they are logically compatible. For
|
||||
example, the <type>int2</type>-versus-<type>int4</type> equality operator
|
||||
example, the <type>smallint</type>-versus-<type>integer</type> equality operator
|
||||
is merge-joinable.
|
||||
We only need sorting operators that will bring both data types into a
|
||||
logically compatible sequence.
|
||||
@ -429,11 +411,11 @@ table1.column1 OP table2.column2
|
||||
<para>
|
||||
Execution of a merge join requires that the system be able to identify
|
||||
four operators related to the merge-join equality operator: less-than
|
||||
comparison for the left input data type, less-than comparison for the
|
||||
right input data type, less-than comparison between the two data types, and
|
||||
comparison for the left operand data type, less-than comparison for the
|
||||
right operand data type, less-than comparison between the two data types, and
|
||||
greater-than comparison between the two data types. (These are actually
|
||||
four distinct operators if the merge-joinable operator has two different
|
||||
input data types; but when the input types are the same the three
|
||||
operand data types; but when the operand types are the same the three
|
||||
less-than operators are all the same operator.)
|
||||
It is possible to
|
||||
specify these operators individually by name, as the <literal>SORT1</>,
|
||||
@ -447,8 +429,8 @@ table1.column1 OP table2.column2
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The input data types of the four comparison operators can be deduced
|
||||
from the input types of the merge-joinable operator, so just as with
|
||||
The operand data types of the four comparison operators can be deduced
|
||||
from the operand types of the merge-joinable operator, so just as with
|
||||
<literal>COMMUTATOR</>, only the operator names need be given in these
|
||||
clauses. Unless you are using peculiar choices of operator names,
|
||||
it's sufficient to write <literal>MERGES</> and let the system fill in
|
||||
@ -469,7 +451,7 @@ table1.column1 OP table2.column2
|
||||
<listitem>
|
||||
<para>
|
||||
A merge-joinable equality operator must have a merge-joinable
|
||||
commutator (itself if the two data types are the same, or a related
|
||||
commutator (itself if the two operand data types are the same, or a related
|
||||
equality operator if they are different).
|
||||
</para>
|
||||
</listitem>
|
||||
@ -523,11 +505,8 @@ table1.column1 OP table2.column2
|
||||
<literal><</> and <literal>></> respectively.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
</Chapter>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
|
@ -1,5 +1,9 @@
|
||||
<chapter id="xtypes">
|
||||
<title>Extending <acronym>SQL</acronym>: Types</title>
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/xtypes.sgml,v 1.17 2003/04/10 01:22:45 petere Exp $
|
||||
-->
|
||||
|
||||
<sect1 id="xtypes">
|
||||
<title>User-Defined Types</title>
|
||||
|
||||
<indexterm zone="xtypes">
|
||||
<primary>data types</primary>
|
||||
@ -7,22 +11,20 @@
|
||||
</indexterm>
|
||||
|
||||
<comment>
|
||||
This chapter needs to be updated for the version-1 function manager
|
||||
This section needs to be updated for the version-1 function manager
|
||||
interface.
|
||||
</comment>
|
||||
|
||||
<para>
|
||||
As previously mentioned, there are two kinds of types in
|
||||
<productname>PostgreSQL</productname>: base types (defined in a
|
||||
programming language) and composite types. This chapter describes
|
||||
how to define new base types.
|
||||
As described above, there are two kinds of data types in
|
||||
<productname>PostgreSQL</productname>: base types and composite
|
||||
types. This section describes how to define new base types.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The examples in this section can be found in
|
||||
<filename>complex.sql</filename> and <filename>complex.c</filename>
|
||||
in the tutorial directory. Composite examples are in
|
||||
<filename>funcs.sql</filename>.
|
||||
in the tutorial directory.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -36,15 +38,15 @@
|
||||
These functions determine how the type appears in strings (for input
|
||||
by the user and output to the user) and how the type is organized in
|
||||
memory. The input function takes a null-terminated character string
|
||||
as its input and returns the internal (in memory) representation of
|
||||
as its argument and returns the internal (in memory) representation of
|
||||
the type. The output function takes the internal representation of
|
||||
the type and returns a null-terminated character string.
|
||||
the type as argument and returns a null-terminated character string.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Suppose we want to define a complex type which represents complex
|
||||
numbers. Naturally, we would choose to represent a complex in memory
|
||||
as the following <acronym>C</acronym> structure:
|
||||
Suppose we want to define a type <type>complex</> that represents
|
||||
    complex numbers.  A natural way to represent a complex number in
|
||||
memory would be the following C structure:
|
||||
|
||||
<programlisting>
|
||||
typedef struct Complex {
|
||||
@ -53,24 +55,16 @@ typedef struct Complex {
|
||||
} Complex;
|
||||
</programlisting>
|
||||
|
||||
and a string of the form <literal>(x,y)</literal> as the external string
|
||||
representation.
|
||||
As the external string representation of the type, we choose a
|
||||
string of the form <literal>(x,y)</literal>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The functions are usually not hard to write, especially the output
|
||||
function. However, there are a number of points to remember:
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
When defining your external (string) representation, remember
|
||||
that you must eventually write a complete and robust parser for
|
||||
that representation as your input function!
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For instance:
|
||||
The input and output functions are usually not hard to write,
|
||||
especially the output function. But when defining the external
|
||||
string representation of the type, remember that you must eventually
|
||||
write a complete and robust parser for that representation as your
|
||||
input function. For instance:
|
||||
|
||||
<programlisting>
|
||||
Complex *
|
||||
@ -78,48 +72,42 @@ complex_in(char *str)
|
||||
{
|
||||
double x, y;
|
||||
Complex *result;
|
||||
if (sscanf(str, " ( %lf , %lf )", &x, &y) != 2) {
|
||||
|
||||
if (sscanf(str, " ( %lf , %lf )", &x, &y) != 2)
|
||||
{
|
||||
elog(ERROR, "complex_in: error in parsing %s", str);
|
||||
return NULL;
|
||||
}
|
||||
result = (Complex *)palloc(sizeof(Complex));
|
||||
result = (Complex *) palloc(sizeof(Complex));
|
||||
result->x = x;
|
||||
result->y = y;
|
||||
return (result);
|
||||
return result;
|
||||
}
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The output function can simply be:
|
||||
The output function can simply be:
|
||||
|
||||
<programlisting>
|
||||
char *
|
||||
complex_out(Complex *complex)
|
||||
{
|
||||
char *result;
|
||||
|
||||
if (complex == NULL)
|
||||
return(NULL);
|
||||
result = (char *) palloc(60);
|
||||
sprintf(result, "(%g,%g)", complex->x, complex->y);
|
||||
return(result);
|
||||
return result;
|
||||
}
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
You should try to make the input and output functions inverses of
|
||||
each other. If you do not, you will have severe problems when
|
||||
you need to dump your data into a file and then read it back in
|
||||
(say, into someone else's database on another computer). This is
|
||||
a particularly common problem when floating-point numbers are
|
||||
involved.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>
|
||||
You should try to make the input and output functions inverses of
|
||||
each other. If you do not, you will have severe problems when you
|
||||
need to dump your data into a file and then read it back in. This
|
||||
is a particularly common problem when floating-point numbers are
|
||||
involved.
|
||||
</para>
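The floating-point pitfall can be demonstrated without a server. The sketch below is standalone C, not the tutorial code: `complex_parse`, `complex_format`, and `round_trips` are hypothetical helpers, and the number of significant digits is made a parameter so that the effect of the `%g` default (six digits) can be compared with 17 digits, which is enough to reproduce an IEEE <type>double</type> exactly when the C library converts correctly.

```c
#include <assert.h>
#include <stdio.h>

typedef struct Complex
{
    double x;
    double y;
} Complex;

/* Parse "(x,y)"; returns 1 on success, 0 on a parse failure. */
static int
complex_parse(const char *str, Complex *out)
{
    double x, y;

    assert(str != NULL && out != NULL);
    if (sscanf(str, " ( %lf , %lf )", &x, &y) != 2)
        return 0;
    out->x = x;
    out->y = y;
    return 1;
}

/* Format as "(x,y)" with the given number of significant digits. */
static void
complex_format(const Complex *c, int digits, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "(%.*g,%.*g)", digits, c->x, digits, c->y);
}

/* Returns 1 if formatting and re-parsing yields the identical value. */
static int
round_trips(Complex c, int digits)
{
    char    buf[128];
    Complex back;

    complex_format(&c, digits, buf, sizeof(buf));
    if (!complex_parse(buf, &back))
        return 0;
    return back.x == c.x && back.y == c.y;
}
```

For a value such as (0.1, 1.0/3.0), `round_trips` with 6 digits fails because the second component is truncated to 0.333333, whereas 17 digits preserves both components. An output function that prints fewer digits than the input function needs is therefore not an inverse of it.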
|
||||
|
||||
<para>
|
||||
@ -130,14 +118,18 @@ complex_out(Complex *complex)
|
||||
<programlisting>
|
||||
CREATE FUNCTION complex_in(cstring)
|
||||
RETURNS complex
|
||||
AS '<replaceable>PGROOT</replaceable>/tutorial/complex'
|
||||
AS '<replaceable>filename</replaceable>'
|
||||
LANGUAGE C;
|
||||
|
||||
CREATE FUNCTION complex_out(complex)
|
||||
RETURNS cstring
|
||||
AS '<replaceable>PGROOT</replaceable>/tutorial/complex'
|
||||
AS '<replaceable>filename</replaceable>'
|
||||
LANGUAGE C;
|
||||
</programlisting>
|
||||
|
||||
Notice that the declarations of the input and output functions must
|
||||
reference the not-yet-defined type. This is allowed, but will draw
|
||||
warning messages that may be ignored.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -149,49 +141,36 @@ CREATE TYPE complex (
|
||||
output = complex_out
|
||||
);
|
||||
</programlisting>
|
||||
|
||||
Notice that the declarations of the input and output functions must
|
||||
reference the not-yet-defined type. This is allowed, but will draw
|
||||
warning messages that may be ignored.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>arrays</primary>
|
||||
</indexterm>
|
||||
As discussed earlier, <productname>PostgreSQL</productname> fully
|
||||
supports arrays of base types. Additionally,
|
||||
<productname>PostgreSQL</productname> supports arrays of
|
||||
user-defined types as well. When you define a type,
|
||||
When you define a new base type,
|
||||
<productname>PostgreSQL</productname> automatically provides support
|
||||
for arrays of that type. For historical reasons, the array type has
|
||||
the same name as the user-defined type with the underscore character
|
||||
<literal>_</> prepended.
|
||||
for arrays of that
|
||||
type.<indexterm><primary>array</primary><secondary>of user-defined
|
||||
type</secondary></indexterm> For historical reasons, the array type
|
||||
has the same name as the base type with the underscore character
|
||||
(<literal>_</>) prepended.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Composite types do not need any function defined on them, since the
|
||||
system already understands what they look like inside.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>TOAST</primary>
|
||||
<secondary>and user-defined types</secondary>
|
||||
</indexterm>
|
||||
If the values of your data type might exceed a few hundred bytes in
|
||||
size (in internal form), you should be careful to mark them
|
||||
TOAST-able. To do this, the internal representation must follow the
|
||||
standard layout for variable-length data: the first four bytes must
|
||||
be an <type>int32</type> containing the total length in bytes of the
|
||||
datum (including itself). Then, all your functions that accept
|
||||
values of the type must be careful to call
|
||||
<function>pg_detoast_datum()</function> on the supplied values ---
|
||||
after checking that the value is not NULL, if your function is not
|
||||
strict. Finally, select the appropriate storage option when giving
|
||||
the <command>CREATE TYPE</command> command.
|
||||
size (in internal form), you should mark them
|
||||
TOAST-able.<indexterm><primary>TOAST</primary><secondary>and
|
||||
user-defined types</secondary></indexterm> To do this, the internal
|
||||
representation must follow the standard layout for variable-length
|
||||
data: the first four bytes must be an <type>int32</type> containing
|
||||
the total length in bytes of the datum (including itself). Also,
|
||||
when running the <command>CREATE TYPE</command> command, specify the
|
||||
internal length as <literal>variable</> and select the appropriate
|
||||
storage option.
|
||||
</para>
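The variable-length layout described above, a four-byte length word that counts itself followed by the data, might be sketched as follows. This is an illustration in standalone C only: `VarDatum`, `var_datum_make`, and `var_datum_payload_len` are hypothetical names, `malloc` stands in for the server's memory allocator, and a real data type should use the definitions from the PostgreSQL header files rather than a hand-written struct.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length datum: length word, then payload. */
typedef struct VarDatum
{
    int32_t length;     /* total size in bytes, including this field */
    char    data[];     /* payload bytes follow the length word */
} VarDatum;

/* Build a datum holding a copy of the payload; NULL if out of memory. */
static VarDatum *
var_datum_make(const void *payload, size_t payload_len)
{
    size_t    total = sizeof(int32_t) + payload_len;
    VarDatum *d;

    assert(total <= INT32_MAX);
    d = malloc(total);
    if (d == NULL)
        return NULL;
    d->length = (int32_t) total;
    memcpy(d->data, payload, payload_len);
    return d;
}

/* Payload size is the total length minus the length word itself. */
static size_t
var_datum_payload_len(const VarDatum *d)
{
    return (size_t) d->length - sizeof(int32_t);
}
```

A five-byte payload therefore produces a datum whose length word is 9, not 5, because the length includes the four bytes of the length word itself.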
|
||||
</chapter>
|
||||
|
||||
<para>
|
||||
For further details see the description of the <command>CREATE
|
||||
TYPE</command> command in <xref linkend="reference">.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
|