Add some notes about the basic mathematical laws that the system presumes
hold true for operators in a btree operator family.  This is mostly to
clarify my own thinking about what the planner can assume for optimization
purposes.  (blowing dust off an old abstract-algebra textbook...)
Tom Lane 2007-01-12 17:04:54 +00:00
parent fc568b9d8f
commit d83235415b

@@ -1,4 +1,4 @@
$PostgreSQL: pgsql/src/backend/access/nbtree/README,v 1.16 2007/01/09 02:14:10 tgl Exp $
$PostgreSQL: pgsql/src/backend/access/nbtree/README,v 1.17 2007/01/12 17:04:54 tgl Exp $
This directory contains a correct implementation of Lehman and Yao's
high-concurrency B-tree management algorithm (P. Lehman and S. Yao,
@@ -485,4 +485,47 @@ datatypes to supply us with a comparison procedure via pg_amproc.
This procedure must take two nonnull values A and B and return an int32 < 0,
0, or > 0 if A < B, A = B, or A > B, respectively. The procedure must
not return INT_MIN for "A < B", since the value may be negated before
being tested for sign. See nbtcompare.c for examples.
being tested for sign. A null result is disallowed, too. See nbtcompare.c
for examples.
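As a rough illustration of the INT_MIN restriction (a Python sketch, not
code from nbtcompare.c; all function names here are made up for this note):
negating the most negative int32 wraps back to itself on two's-complement
hardware, so a caller that flips the sign to reverse the sort order would
still see a negative result.  The sketch simulates 32-bit wraparound with
ctypes; in C itself the overflow is undefined behavior.

```python
import ctypes

INT32_MIN = -2**31


def negate_int32(x: int) -> int:
    """Negate x with simulated 32-bit two's-complement wraparound."""
    return ctypes.c_int32(-x).value


def cmp_int(a: int, b: int) -> int:
    """Hypothetical three-way comparison that only returns -1, 0, or 1."""
    return (a > b) - (a < b)


def reversed_sign(result: int) -> int:
    """What a caller sees after negating a comparison result, for
    example to implement a descending sort."""
    return negate_int32(result)


# A well-behaved result flips sign when negated: -1 becomes 1.
ok = reversed_sign(cmp_int(1, 2))

# INT_MIN does not: it is still negative after negation.
bad = reversed_sign(INT32_MIN)
```

Returning only small values such as -1, 0, and 1 is one simple way to stay
clear of the problem.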
There are some basic assumptions that a btree operator family must satisfy:
An = operator must be an equivalence relation; that is, for all non-null
values A,B,C of the datatype:
    A = A is true                                     reflexive law
    if A = B, then B = A                              symmetric law
    if A = B and B = C, then A = C                    transitive law
A < operator must be a strong ordering relation; that is, for all non-null
values A,B,C:
    A < A is false                                    irreflexive law
    if A < B and B < C, then A < C                    transitive law
Furthermore, the ordering is total; that is, for all non-null values A,B:
    exactly one of A < B, A = B, and B < A is true    trichotomy law
(The trichotomy law justifies the definition of the comparison support
procedure, of course.)
The other three operators are defined in terms of these two in the obvious way,
and must act consistently with them.
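The laws above can be spot-checked over a finite sample of values.  The
following is a minimal Python sketch, not anything PostgreSQL provides;
check_btree_laws and the deliberately broken lt_cyclic operator are
hypothetical names.  Passing on a sample proves nothing, but any reported
violation is a real counterexample.

```python
import operator
from itertools import product


def check_btree_laws(values, eq, lt):
    """Return the sorted names of laws violated by eq/lt on the sample."""
    values = list(values)
    violations = set()
    for a in values:
        if not eq(a, a):
            violations.add("reflexive")
        if lt(a, a):
            violations.add("irreflexive")
    for a, b in product(values, repeat=2):
        if eq(a, b) and not eq(b, a):
            violations.add("symmetric")
        # Exactly one of A < B, A = B, B < A must be true.
        if [lt(a, b), eq(a, b), lt(b, a)].count(True) != 1:
            violations.add("trichotomy")
    for a, b, c in product(values, repeat=3):
        if eq(a, b) and eq(b, c) and not eq(a, c):
            violations.add("transitive (=)")
        if lt(a, b) and lt(b, c) and not lt(a, c):
            violations.add("transitive (<)")
    return sorted(violations)


def lt_cyclic(a: int, b: int) -> bool:
    """Broken '<' on {0, 1, 2}: 0 < 1, 1 < 2, but also 2 < 0."""
    return (b - a) % 3 == 1


# Ordinary integer comparison satisfies every law on this sample.
int_violations = check_btree_laws(range(-2, 3), operator.eq, operator.lt)

# The cyclic operator is irreflexive and trichotomous but not transitive.
cyclic_violations = check_btree_laws([0, 1, 2], operator.eq, lt_cyclic)
```

The cyclic case shows that trichotomy alone is not enough: every pair of
values compares sanely, yet there is no consistent order in which to
arrange the three keys.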
For an operator family supporting multiple datatypes, the above laws must hold
when A,B,C are taken from any datatypes in the family. The transitive laws
are the trickiest to ensure, as in cross-type situations they represent
statements that the behaviors of two or three different operators are
consistent. As an example, it would not work to put float8 and numeric into
an opfamily, at least not with the current semantics that numerics are
converted to float8 for comparison to a float8. Because of the limited
accuracy of float8, this means there are distinct numeric values that will
compare equal to the same float8 value, and thus the transitive law fails.
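The float8/numeric failure can be reproduced outside the database.  This
is a Python sketch under stated assumptions: Decimal stands in for numeric,
float for float8, and eq_numeric_float8 is a hypothetical cross-type "="
that mimics the convert-to-float8 semantics described above.

```python
from decimal import Decimal


def eq_numeric_float8(n: Decimal, f: float) -> bool:
    """Cross-type '=' that converts the numeric to float8 first (lossy)."""
    return float(n) == f


n1 = Decimal("0.1")
n2 = Decimal("0.1000000000000000001")  # distinct numeric, differs by 1e-19
f = 0.1

n1_eq_f = eq_numeric_float8(n1, f)     # True
f_eq_n2 = eq_numeric_float8(n2, f)     # True: n2 rounds to the same float8
n1_eq_n2 = (n1 == n2)                  # False under exact numeric '='

# Transitivity would require: n1 = f and f = n2 implies n1 = n2.
transitive_holds = (not (n1_eq_f and f_eq_n2)) or n1_eq_n2
```

Here n1 = f and f = n2 both hold, yet n1 = n2 does not, so
transitive_holds comes out false.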
It should be fairly clear why a btree index requires these laws to hold within
a single datatype: without them there is no ordering to arrange the keys with.
Also, index searches using a key of a different datatype require comparisons
to behave sanely across two datatypes. The extensions to three or more
datatypes within a family are not strictly required by the btree index
mechanism itself, but the planner relies on them for optimization purposes.