mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-12 18:34:36 +08:00
431 lines
17 KiB
HTML
431 lines
17 KiB
HTML
<HTML>
|
|
<HEAD>
|
|
<TITLE>The POSTGRES95 User Manual - THE QUERY LANGUAGE</TITLE>
|
|
</HEAD>
|
|
|
|
<BODY>
|
|
|
|
<font size=-1>
|
|
<A HREF="pg95user.html">[ TOC ]</A>
|
|
<A HREF="xaggr.html">[ Previous ]</A>
|
|
<A HREF="libpq.html">[ Next ]</A>
|
|
</font>
|
|
<HR>
|
|
<H1>11. INTERFACING EXTENSIONS TO INDICES</H1>
|
|
<HR>
|
|
The procedures described thus far let you define a new
|
|
type, new functions and new operators. However, we
|
|
cannot yet define a secondary index (such as a <B>B-tree</B>,
|
|
<B>R-tree</B> or hash access method) over a new type or its
|
|
operators.<p>
|
|
|
|
<A HREF="extend.html#about-the-postgres-system-catalogs">Look back at Figure 3</A>.
|
|
The right half shows the catalogs
|
|
that we must modify in order to tell POSTGRES how
|
|
to use a user-defined type and/or user-defined operators
|
|
with an index (i.e., <CODE>pg_am, pg_amop, pg_amproc</CODE> and
|
|
<CODE>pg_opclass</CODE>). Unfortunately, there is no simple command
|
|
to do this. We will demonstrate how to modify these
|
|
catalogs through a running example: a new operator
|
|
class for the <B>B-tree</B> access method that sorts integers
|
|
in ascending absolute value order.<p>
|
|
|
|
The <CODE>pg_am</CODE> class contains one instance for every user
|
|
defined access method. Support for the heap access
|
|
method is built into POSTGRES, but every other access
|
|
method is described here. The schema is
|
|
<p>
|
|
<center>
|
|
<table border=1>
|
|
<tr>
|
|
<td>amname </td><td> name of the access method </td>
|
|
</tr>
|
|
<td>amowner </td><td> object id of the owner's instance in pg_user </td>
|
|
</tr>
|
|
<tr>
|
|
<td>amkind </td><td> not used at present, but set to 'o' as a place holder </td>
|
|
</tr>
|
|
<tr>
|
|
<td>amstrategies </td><td> number of strategies for this access method (see below) </td>
|
|
</tr>
|
|
<tr>
|
|
<td>amsupport </td><td> number of support routines for this access method (see below) </td>
|
|
</tr>
|
|
<tr>
|
|
<td>amgettuple<br>
|
|
aminsert<br>
|
|
...</td>
|
|
<td>procedure identifiers for interface routines to the access
|
|
method. For example, regproc ids for opening, closing, and
|
|
getting instances from the access method appear here. </td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
<p>
|
|
|
|
The <B>object ID</B> of the instance in <CODE>pg_am</CODE> is used as a
|
|
foreign key in lots of other classes. You don't need
|
|
to add a new instance to this class; all you're interested in
|
|
is the <B>object ID</B> of the access method instance
|
|
you want to extend:
|
|
|
|
<pre> SELECT oid FROM pg_am WHERE amname = 'btree'
|
|
|
|
+----+
|
|
|oid |
|
|
+----+
|
|
|403 |
|
|
+----+
|
|
</pre>
|
|
|
|
The <CODE>amstrategies</CODE> attribute exists to standardize
|
|
comparisons across data types. For example, <B>B-tree</B>s
|
|
impose a strict ordering on keys, lesser to greater.
|
|
Since POSTGRES allows the user to define operators,
|
|
POSTGRES cannot look at the name of an operator (eg, >
|
|
or <) and tell what kind of comparison it is. In fact,
|
|
some access methods don't impose any ordering at all.
|
|
For example, <B>R-tree</B>s express a rectangle-containment
|
|
relationship, whereas a hashed data structure expresses
|
|
only bitwise similarity based on the value of a hash
|
|
function. POSTGRES needs some consistent way of taking
|
|
a qualification in your query, looking at the operator
|
|
and then deciding if a usable index exists. This
|
|
implies that POSTGRES needs to know, for example, that
|
|
the <= and > operators partition a <B>B-tree</B>. POSTGRES
|
|
uses strategies to express these relationships between
|
|
operators and the way they can be used to scan indices.<p>
|
|
|
|
Defining a new set of strategies is beyond the scope of
|
|
this discussion, but we'll explain how <B>B-tree</B> strategies
|
|
work because you'll need to know that to add a new
|
|
operator class. In the <CODE>pg_am</CODE> class, the amstrategies
|
|
attribute is the number of strategies defined for this
|
|
access method. For <B>B-tree</B>s, this number is 5. These
|
|
strategies correspond to
|
|
<p>
|
|
|
|
<center>
|
|
<table border=1>
|
|
<tr>
|
|
<td>less than </td><td> 1 </td>
|
|
</tr>
|
|
<tr>
|
|
<td>less than or equal </td><td> 2 </td>
|
|
</tr>
|
|
<tr>
|
|
<td>equal </td><td> 3 </td>
|
|
</tr>
|
|
<tr>
|
|
<td>greater than or equal </td><td> 4 </td>
|
|
</tr>
|
|
<tr>
|
|
<td>greater than </td><td> 5 </td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
<p>
|
|
|
|
The idea is that you'll need to add procedures corresponding
|
|
to the comparisons above to the <CODE>pg_amop</CODE> relation
|
|
(see below). The access method code can use these
|
|
strategy numbers, regardless of data type, to figure
|
|
out how to partition the <B>B-tree</B>, compute selectivity,
|
|
and so on. Don't worry about the details of adding
|
|
procedures yet; just understand that there must be a
|
|
set of these procedures for <CODE>int2, int4, oid,</CODE> and every
|
|
other data type on which a <B>B-tree</B> can operate.
|
|
<p>
|
|
Sometimes, strategies aren't enough information for the
|
|
system to figure out how to use an index. Some access
|
|
methods require other support routines in order to
|
|
work. For example, the <B>B-tree</B> access method must be
|
|
able to compare two keys and determine whether one is
|
|
greater than, equal to, or less than the other.
|
|
Similarly, the <B>R-tree</B> access method must be able to compute
|
|
intersections, unions, and sizes of rectangles. These
|
|
operations do not correspond to user qualifications in
|
|
SQL queries; they are administrative routines used by
|
|
the access methods, internally.<p>
|
|
|
|
In order to manage diverse support routines
|
|
consistently across all POSTGRES access methods, <CODE>pg_am</CODE>
|
|
includes an attribute called <CODE>amsupport</CODE>. This attribute
|
|
records the number of support routines used by an
|
|
access method. For <B>B-tree</B>s, this number is one -- the
|
|
routine to take two keys and return -1, 0, or +1,
|
|
depending on whether the first key is less than, equal
|
|
to, or greater than the second.<A HREF="#8"><font size=-1>[8]</font></A><p>
|
|
|
|
The <CODE>amstrategies</CODE> entry in pg_am is just the number of
|
|
strategies defined for the access method in question.
|
|
The procedures for less than, less equal, and so on
|
|
don't appear in <CODE>pg_am</CODE>. Similarly, <CODE>amsupport</CODE> is just
|
|
the number of support routines required by the access
|
|
method. The actual routines are listed elsewhere.<p>
|
|
|
|
The next class of interest is pg_opclass. This class
|
|
exists only to associate a name with an oid. In
|
|
pg_amop, every <B>B-tree</B> operator class has a set of
|
|
procedures, one through five, above. Some existing
|
|
opclasses are <CODE>int2_ops, int4_ops, and oid_ops</CODE>. You
|
|
need to add an instance with your opclass name (for
|
|
example, <CODE>complex_abs_ops</CODE>) to <CODE>pg_opclass</CODE>. The <CODE>oid</CODE> of
|
|
this instance is a foreign key in other classes.
|
|
|
|
<pre> INSERT INTO pg_opclass (opcname) VALUES ('complex_abs_ops');
|
|
|
|
SELECT oid, opcname
|
|
FROM pg_opclass
|
|
WHERE opcname = 'complex_abs_ops';
|
|
|
|
+------+--------------+
|
|
|oid | opcname |
|
|
+------+--------------+
|
|
|17314 | int4_abs_ops |
|
|
+------+--------------+
|
|
</pre>
|
|
|
|
Note that the oid for your <CODE>pg_opclass</CODE> instance will be
|
|
different! You should substitute your value for 17314
|
|
wherever it appears in this discussion.<p>
|
|
|
|
So now we have an access method and an operator class.
|
|
We still need a set of operators; the procedure for
|
|
defining operators was discussed earlier in this manual.
|
|
For the complex_abs_ops operator class on Btrees,
|
|
the operators we require are:
|
|
|
|
<pre> absolute value less-than
|
|
absolute value less-than-or-equal
|
|
absolute value equal
|
|
absolute value greater-than-or-equal
|
|
absolute value greater-than
|
|
</pre>
|
|
|
|
Suppose the code that implements the functions defined
|
|
is stored in the file
|
|
|
|
<pre>
|
|
/usr/local/postgres95/src/tutorial/complex.c
|
|
</pre>
|
|
|
|
Part of the code look like this: (note that we will
|
|
only show the equality operator for the rest of the
|
|
examples. The other four operators are very similar.
|
|
Refer to <CODE>complex.c</CODE> or <CODE>complex.sql</CODE> for the details.)
|
|
|
|
<pre> #define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)
|
|
|
|
bool
|
|
complex_abs_eq(Complex *a, Complex *b)
|
|
{
|
|
double amag = Mag(a), bmag = Mag(b);
|
|
return (amag==bmag);
|
|
}
|
|
</pre>
|
|
|
|
There are a couple of important things that are happening below.<p>
|
|
|
|
First, note that operators for less-than, less-than-or
|
|
equal, equal, greater-than-or-equal, and greater-than
|
|
for <CODE>int4</CODE> are being defined. All of these operators are
|
|
already defined for <CODE>int4</CODE> under the names <, <=, =, >=,
|
|
and >. The new operators behave differently, of
|
|
course. In order to guarantee that POSTGRES uses these
|
|
new operators rather than the old ones, they need to be
|
|
named differently from the old ones. This is a key
|
|
point: you can overload operators in POSTGRES, but only
|
|
if the operator isn't already defined for the argument
|
|
types. That is, if you have < defined for (int4,
|
|
int4), you can't define it again. POSTGRES does not
|
|
check this when you define your operator, so be careful.
|
|
To avoid this problem, odd names will be used for
|
|
the operators. If you get this wrong, the access methods
|
|
are likely to crash when you try to do scans.<p>
|
|
|
|
The other important point is that all the operator
|
|
functions return Boolean values. The access methods
|
|
rely on this fact. (On the other hand, the support
|
|
function returns whatever the particular access method
|
|
expects -- in this case, a signed integer.)
|
|
The final routine in the file is the "support routine"
|
|
mentioned when we discussed the amsupport attribute of
|
|
the <CODE>pg_am</CODE> class. We will use this later on. For now,
|
|
ignore it.
|
|
|
|
<pre> CREATE FUNCTION complex_abs_eq(complex, complex)
|
|
RETURNS bool
|
|
AS '/usr/local/postgres95/tutorial/obj/complex.so'
|
|
LANGUAGE 'c';
|
|
</pre>
|
|
|
|
Now define the operators that use them. As noted, the
|
|
operator names must be unique among all operators that
|
|
take two <CODE>int4</CODE> operands. In order to see if the
|
|
operator names listed below are taken, we can do a query on
|
|
<CODE>pg_operator</CODE>:
|
|
|
|
<pre> /*
|
|
* this query uses the regular expression operator (~)
|
|
* to find three-character operator names that end in
|
|
* the character &
|
|
*/
|
|
SELECT *
|
|
FROM pg_operator
|
|
WHERE oprname ~ '^..&$'::text;
|
|
</pre>
|
|
|
|
to see if your name is taken for the types you want.
|
|
The important things here are the procedure (which are
|
|
the <B>C</B> functions defined above) and the restriction and
|
|
join selectivity functions. You should just use the
|
|
ones used below--note that there are different such
|
|
functions for the less-than, equal, and greater-than
|
|
cases. These must be supplied, or the access method
|
|
will crash when it tries to use the operator. You
|
|
should copy the names for restrict and join, but use
|
|
the procedure names you defined in the last step.
|
|
|
|
<pre> CREATE OPERATOR = (
|
|
leftarg = complex, rightarg = complex, procedure = complex_abs_eq,
|
|
restrict = eqsel, join = eqjoinsel
|
|
)
|
|
</pre>
|
|
|
|
Notice that five operators corresponding to less, less
|
|
equal, equal, greater, and greater equal are defined.<p>
|
|
|
|
We're just about finished. the last thing we need to do
|
|
is to update the <CODE>pg_amop</CODE> relation. To do this, we need
|
|
the following attributes:
|
|
<p>
|
|
|
|
<center>
|
|
<table border=1>
|
|
<td>amopid </td><td> the <CODE>oid</CODE> of the <CODE>pg_am</CODE> instance for B-tree
|
|
(== 403, see above) </td>
|
|
<tr>
|
|
</tr>
|
|
<td>amopclaid </td><td> the <CODE>oid</CODE> of the
|
|
<CODE>pg_opclass</CODE> instance for <CODE>int4_abs_ops</CODE> (==
|
|
whatever you got instead of <CODE>17314</CODE>, see above)</td>
|
|
<tr>
|
|
</tr>
|
|
<td>amopopr </td><td> the <CODE>oid</CODE>s of the operators for the opclass (which we'll
|
|
get in just a minute) </td>
|
|
<tr>
|
|
</tr>
|
|
<td>amopselect, amopnpages </td><td> cost functions.</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
<p>
|
|
The cost functions are used by the query optimizer to
|
|
decide whether or not to use a given index in a scan.
|
|
Fortunately, these already exist. The two functions
|
|
we'll use are <CODE>btreesel</CODE>, which estimates the selectivity
|
|
of the <B>B-tree</B>, and <CODE>btreenpage</CODE>, which estimates the
|
|
number of pages a search will touch in the tree.<p>
|
|
|
|
So we need the <CODE>oid</CODE>s of the operators we just defined.
|
|
We'll look up the names of all the operators that take
|
|
two <CODE>int4</CODE>s, and pick ours out:
|
|
|
|
<pre> SELECT o.oid AS opoid, o.oprname
|
|
INTO TABLE complex_ops_tmp
|
|
FROM pg_operator o, pg_type t
|
|
WHERE o.oprleft = t.oid and o.oprright = t.oid
|
|
and t.typname = 'complex';
|
|
|
|
which returns:
|
|
|
|
+------+---------+
|
|
|oid | oprname |
|
|
+------+---------+
|
|
|17321 | < |
|
|
+------+---------+
|
|
|17322 | <= |
|
|
+------+---------+
|
|
|17323 | = |
|
|
+------+---------+
|
|
|17324 | >= |
|
|
+------+---------+
|
|
|17325 | > |
|
|
+------+---------+
|
|
</pre>
|
|
|
|
(Again, some of your <CODE>oid</CODE> numbers will almost certainly
|
|
be different.) The operators we are interested in are
|
|
those with <CODE>oid</CODE>s 17321 through 17325. The values you
|
|
get will probably be different, and you should
|
|
substitute them for the values below. We can look at the
|
|
operator names and pick out the ones we just added.<p>
|
|
|
|
Now we're ready to update <CODE>pg_amop</CODE> with our new operator
|
|
class. The most important thing in this entire
|
|
discussion is that the operators are ordered, from less equal
|
|
through greater equal, in <CODE>pg_amop</CODE>. We add the
|
|
instances we need:
|
|
|
|
<pre> INSERT INTO pg_amop (amopid, amopclaid, amopopr, amopstrategy,
|
|
amopselect, amopnpages)
|
|
SELECT am.oid, opcl.oid, c.opoid, 3,
|
|
'btreesel'::regproc, 'btreenpage'::regproc
|
|
FROM pg_am am, pg_opclass opcl, complex_ops_tmp c
|
|
WHERE amname = 'btree' and opcname = 'complex_abs_ops'
|
|
and c.oprname = '=';
|
|
</pre>
|
|
|
|
Note the order: "less than" is 1, "less than or equal"
|
|
is 2, "equal" is 3, "greater than or equal" is 4, and
|
|
"greater than" is 5.<p>
|
|
|
|
The last step (finally!) is registration of the
|
|
"support routine" previously described in our discussion of
|
|
<CODE>pg_am</CODE>. The <CODE>oid</CODE> of this support routine is stored in
|
|
the <CODE>pg_amproc</CODE> class, keyed by the access method <CODE>oid</CODE> and
|
|
the operator class <CODE>oid</CODE>. First, we need to register the
|
|
function in POSTGRES (recall that we put the <B>C</B> code
|
|
that implements this routine in the bottom of the file
|
|
in which we implemented the operator routines):
|
|
|
|
<pre> CREATE FUNCTION int4_abs_cmp(int4, int4)
|
|
RETURNS int4
|
|
AS '/usr/local/postgres95/tutorial/obj/complex.so'
|
|
LANGUAGE 'c';
|
|
|
|
SELECT oid, proname FROM pg_proc WHERE prname = 'int4_abs_cmp';
|
|
|
|
+------+--------------+
|
|
|oid | proname |
|
|
+------+--------------+
|
|
|17328 | int4_abs_cmp |
|
|
+------+--------------+
|
|
</pre>
|
|
(Again, your <CODE>oid</CODE> number will probably be different and
|
|
you should substitute the value you see for the value
|
|
below.) Recalling that the <B>B-tree</B> instance's oid is
|
|
403 and that of <CODE>int4_abs_ops</CODE> is 17314, we can add the
|
|
new instance as follows:
|
|
|
|
<pre> INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
|
|
VALUES ('403'::oid, -- btree oid
|
|
'17314'::oid, -- pg_opclass tuple
|
|
'17328'::oid, -- new pg_proc oid
|
|
'1'::int2);
|
|
</pre>
|
|
<p>
|
|
<HR>
|
|
<A NAME="8"><B>[8]</B></A> Strictly speaking, this routine can return a negative
|
|
number (< 0), 0, or a non-zero positive number (> 0).
|
|
<HR>
|
|
<font size=-1>
|
|
<A HREF="pg95user.html">[ TOC ]</A>
|
|
<A HREF="xaggr.html">[ Previous ]</A>
|
|
<A HREF="libpq.html">[ Next ]</A>
|
|
</font>
|
|
</BODY>
|
|
</HTML>
|