Remove TODO.detail files that contained useless or very old information.

Update TODO accordingly.
2025-02-23 19:39:53 +08:00 · 2004-02-12 18:11:54 +00:00 · 2b721d3d41
commit 2b721d3d41
parent 5de02e283f
10 changed files with 102 additions and 14561 deletions
--- a/doc/TODO.detail/foreign
+++ b/doc/TODO.detail/foreign
@@ -1,542 +0,0 @@
-From fjoe@iclub.nsu.ru Tue Jan 23 03:38:45 2001
-Received: from mx.nsu.ru (root@mx.nsu.ru [193.124.215.71])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA14458
-	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 03:38:24 -0500 (EST)
-Received: from iclub.nsu.ru (root@iclub.nsu.ru [193.124.222.66])
-	by mx.nsu.ru (8.9.1/8.9.0) with ESMTP id OAA29153;
-	Tue, 23 Jan 2001 14:31:27 +0600 (NOVT)
-Received: from localhost (fjoe@localhost)
-	by iclub.nsu.ru (8.11.1/8.11.1) with ESMTP id f0N8VOr15273;
-	Tue, 23 Jan 2001 14:31:25 +0600 (NS)
-	(envelope-from fjoe@iclub.nsu.ru)
-Date: Tue, 23 Jan 2001 14:31:24 +0600 (NS)
-From: Max Khon <fjoe@iclub.nsu.ru>
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
-Subject: Re: [HACKERS] Bug in FOREIGN KEY
-In-Reply-To: <200101230416.XAA04293@candle.pha.pa.us>
-Message-ID: <Pine.BSF.4.21.0101231429310.12474-100000@iclub.nsu.ru>
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=US-ASCII
-Status: RO
-
-hi, there!
-
-On Mon, 22 Jan 2001, Bruce Momjian wrote:
-
-> 
-> > This problem with foreign keys has been reported to me, and I have confirmed
-> > the bug exists in current sources.  The DELETE should succeed:
-> > 
-> > ---------------------------------------------------------------------------
-> > 
-> > CREATE TABLE primarytest2 (
-> >                            col1 INTEGER, 
-> >                            col2 INTEGER, 
-> >                            PRIMARY KEY(col1, col2)
-> >                           );
-> > 
-> > CREATE TABLE foreigntest2 (col3 INTEGER, 
-> >                            col4 INTEGER,
-> >                            FOREIGN KEY (col3, col4) REFERENCES primarytest2
-> >                          );
-> > test=> BEGIN;
-> > BEGIN
-> > test=> INSERT INTO primarytest2 VALUES (5,5);
-> > INSERT 27618 1
-> > test=> DELETE FROM primarytest2 WHERE col1 = 5 AND col2 = 5;
-> > ERROR:  triggered data change violation on relation "primarytest2"
-
-I have another (slightly different) example:
--- cut here ---
-test=> CREATE TABLE pr(obj_id int PRIMARY KEY);
-NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'pr_pkey' for
-table 'pr'
-CREATE
-test=> CREATE TABLE fr(obj_id int REFERENCES pr ON DELETE CASCADE);
-NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
-check(s)
-CREATE
-test=> BEGIN;
-BEGIN
-test=> INSERT INTO pr (obj_id) VALUES (1);
-INSERT 200539 1
-test=> INSERT INTO fr (obj_id) SELECT obj_id FROM pr;
-INSERT 200540 1
-test=> DELETE FROM fr;
-ERROR:  triggered data change violation on relation "fr"
-test=> 
--- cut here ---
-
-we are running postgresql 7.1 beta3
-
-/fjoe
-
-
-From sszabo@megazone23.bigpanda.com Tue Jan 23 13:41:55 2001
-Received: from megazone23.bigpanda.com (rfx-64-6-210-138.users.reflexcom.com [64.6.210.138])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA19924
-	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 13:41:54 -0500 (EST)
-Received: from localhost (sszabo@localhost)
-	by megazone23.bigpanda.com (8.11.1/8.11.1) with ESMTP id f0NIfLa41018;
-	Tue, 23 Jan 2001 10:41:21 -0800 (PST)
-Date: Tue, 23 Jan 2001 10:41:21 -0800 (PST)
-From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-cc: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
-        PostgreSQL-development <pgsql-hackers@postgresql.org>
-Subject: Re: [HACKERS] Bug in FOREIGN KEY
-In-Reply-To: <200101230417.XAA04332@candle.pha.pa.us>
-Message-ID: <Pine.BSF.4.21.0101231031290.40955-100000@megazone23.bigpanda.com>
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=US-ASCII
-Status: RO
-
-
-> >     Think  I misinterpreted the SQL3 specs WR to this detail. The
-> >     checks must be made per statement,  not  at  the  transaction
-> >     level.  I'll  try  to fix it, but we need to define what will
-> >     happen with referential actions in the  case  of  conflicting
-> >     actions on the same key - there are some possible conflicts:
-> > 
-> >     1.  DEFERRED ON DELETE NO ACTION or RESTRICT
-> > 
-> >         Do  the referencing rows reference to the new PK row with
-> >         the  same  key  now,  or  is  this  still  a   constraint
-> >         violation?  I  would say it's not, because the constraint
-> >         condition is satisfied at the end of the transaction. How
-> >         do other databases behave?
-> > 
-> >     2.  DEFERRED ON DELETE CASCADE, SET NULL or SET DEFAULT
-> > 
-> >         Again  I'd  say  that  the  action  should  be suppressed
-> >         because a matching PK row is present at transaction end -
-> >         it's  not  the same old row, but the constraint itself is
-> >         still satisfied.
-
-I'm not actually sure on the cascade, set null and set default.  The
-way they are written seems to imply to me that it's based on the state
-of the database before/after the command in question as opposed to the
-deferred state of the database because of the stuff about updating the
-state of partially matching rows immediately after the delete/update of
-the row which wouldn't really make sense when deferred.  Does anyone know
-what other systems do with a case something like this all in a
-transaction:
-
-create table a (a int primary key);
-create table b (b int references a match full on update cascade
-		 on delete cascade deferrable initially deferred);
-insert into a values (1);
-insert into a values (2);
-insert into b values (1);
-delete from a where a=1;
-select * from b;
-commit;
-
-
-From pgsql-hackers-owner+M3901@postgresql.org Fri Jan 26 17:00:24 2001
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10576
-	for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 17:00:24 -0500 (EST)
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLtVq53019;
-	Fri, 26 Jan 2001 16:55:31 -0500 (EST)
-	(envelope-from pgsql-hackers-owner+M3901@postgresql.org)
-Received: from smtp1b.mail.yahoo.com (smtp3.mail.yahoo.com [128.11.68.135])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLqmq52691
-	for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 16:52:48 -0500 (EST)
-	(envelope-from janwieck@yahoo.com)
-Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
-  by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 22:49:57 -0000
-X-Apparently-From: <janwieck@yahoo.com>
-Received: (from janwieck@localhost)
-	by jupiter.greatbridge.com (8.9.3/8.9.3) id RAA04701;
-	Fri, 26 Jan 2001 17:02:32 -0500
-From: Jan Wieck <janwieck@Yahoo.com>
-Message-Id: <200101262202.RAA04701@jupiter.greatbridge.com>
-Subject: Re: [HACKERS] Bug in FOREIGN KEY
-In-Reply-To: <200101262110.QAA06902@candle.pha.pa.us> from Bruce Momjian at "Jan
-	26, 2001 04:10:22 pm"
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-Date: Fri, 26 Jan 2001 17:02:32 -0500 (EST)
-CC: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
-        PostgreSQL-development <pgsql-hackers@postgresql.org>
-X-Mailer: ELM [version 2.4ME+ PL68 (25)]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Sender: pgsql-hackers-owner@postgresql.org
-Status: RO
-
-Bruce Momjian wrote:
-> Here is another bug:
->
-> test=> begin;
-> BEGIN
-> test=> INSERT INTO primarytest2 VALUES (5,5);
-> INSERT 18757 1
-> test=> UPDATE primarytest2 SET col2=1 WHERE col1 = 5 AND col2 = 5;
-> ERROR:  deferredTriggerGetPreviousEvent: event for tuple (0,10) not
-> found
-
-    Schema?
-
-
-Jan
-
--
-
-#======================================================================#
-# It's easier to get forgiveness for being wrong than for being right. #
-# Let's break this rule - forgive me.                                  #
-#================================================== JanWieck@Yahoo.com #
-
-
-
-_________________________________________________________
-Do You Yahoo!?
-Get your free @yahoo.com address at http://mail.yahoo.com
-
-
-From pgsql-hackers-owner+M3864@postgresql.org Fri Jan 26 10:07:36 2001
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA17732
-	for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 10:07:35 -0500 (EST)
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QF3lq12782;
-	Fri, 26 Jan 2001 10:03:47 -0500 (EST)
-	(envelope-from pgsql-hackers-owner+M3864@postgresql.org)
-Received: from mailout00.sul.t-online.com (mailout00.sul.t-online.com [194.25.134.16])
-	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0QF0Yq12614
-	for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 10:00:34 -0500 (EST)
-	(envelope-from peter_e@gmx.net)
-Received: from fwd01.sul.t-online.com 
-	by mailout00.sul.t-online.com with smtp 
-	id 14MALp-0006Im-00; Fri, 26 Jan 2001 15:59:45 +0100
-Received: from peter.localdomain (520083510237-0001@[212.185.245.73]) by fmrl01.sul.t-online.com
-	with esmtp id 14MALQ-1Z0gkaC; Fri, 26 Jan 2001 15:59:20 +0100
-Date: Fri, 26 Jan 2001 16:07:27 +0100 (CET)
-From: Peter Eisentraut <peter_e@gmx.net>
-To: Hiroshi Inoue <Inoue@tpf.co.jp>
-cc: Bruce Momjian <pgman@candle.pha.pa.us>,
-        PostgreSQL-development <pgsql-hackers@postgresql.org>
-Subject: Re: [HACKERS] Open 7.1 items
-In-Reply-To: <3A70FA87.933B3D51@tpf.co.jp>
-Message-ID: <Pine.LNX.4.30.0101261604030.769-100000@peter.localdomain>
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=US-ASCII
-X-Sender: 520083510237-0001@t-dialin.net
-Precedence: bulk
-Sender: pgsql-hackers-owner@postgresql.org
-Status: RO
-
-Hiroshi Inoue writes:
-
-> What does this item mean ?
-> Is it the following ?
->
-> 	begin;
-> 	insert into pk (id) values (1);
-> 	update(delete from) pk where id=1;
-> 	ERROR:  triggered data change violation on relation pk"
->
-> If so, isn't it a simple bug ?
-
-Depends on the definition of "bug".  It's not spec compliant and it's not
-documented and it's annoying.  But it's been like this for a year and the
-issue is well known and can normally be avoided.  It looks like a
-documentation to-do to me.
-
-- 
-Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
-
-
-From pgsql-hackers-owner+M3876@postgresql.org Fri Jan 26 13:07:10 2001
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA26086
-	for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 13:07:09 -0500 (EST)
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI4Vq30248;
-	Fri, 26 Jan 2001 13:04:31 -0500 (EST)
-	(envelope-from pgsql-hackers-owner+M3876@postgresql.org)
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI3Aq30098
-	for <pgsql-hackers@postgreSQL.org>; Fri, 26 Jan 2001 13:03:11 -0500 (EST)
-	(envelope-from vmikheev@SECTORBASE.COM)
-Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
-	id <D49FAF71>; Fri, 26 Jan 2001 09:41:23 -0800
-Message-ID: <8F4C99C66D04D4118F580090272A7A234D32C1@sectorbase1.sectorbase.com>
-From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
-To: "'Jan Wieck'" <janwieck@Yahoo.com>,
-        PostgreSQL HACKERS
-  <pgsql-hackers@postgresql.org>,
-        Bruce Momjian <root@candle.pha.pa.us>
-Subject: RE: [HACKERS] Open 7.1 items
-Date: Fri, 26 Jan 2001 10:02:59 -0800
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2653.19)
-Content-Type: text/plain;
-	charset="iso-8859-1"
-Precedence: bulk
-Sender: pgsql-hackers-owner@postgresql.org
-Status: RO
-
-> > FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
-> 
->     A well known issue, and I've asked multiple times how exactly
->     we want to define the behaviour for deferred constraints.  Do
->     foreign keys reference just to a key value and are happy with
->     it's existance, or do they refer to a particular row?
-
-I think first. The last is closer to OODBMS world, not to [O]RDBMS one.
-
->     Consider you have a deferred "ON DELETE  CASCADE"  constraint
->     and  do  a  DELETE, INSERT of a PK. Do the FK rows need to be
->     deleted or not?
-
-Good example. I think FK should not be deleted. If someone really
-want to delete "old" FK then he can do 
-
-DELETE PK;
-SET CONSTRAINT ... IMMEDIATE; -- FK need to be deleted here
-INSERT PK;
-
->     Consider you have a deferred "ON  DELETE  RESTRICT"  and  "ON
->     UPDATE  CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
->     to PK1, the FK2 rows need to follow, but does PK2 inherit all
->     FK1 rows now so it's the master of both groups?
-
-Yes. Again one can use SET CONSTRAINT to achieve desirable results.
-It seems that SET CONSTRAINT was designed for these purposes - ie
-for better flexibility.
-
-Though, it would be better to look how other DBes handle all these
-cases -:)
-
-Vadim
-
-From janwieck@yahoo.com Fri Jan 26 12:20:27 2001
-Received: from smtp6.mail.yahoo.com (smtp6.mail.yahoo.com [128.11.69.103])
-	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA22158
-	for <root@candle.pha.pa.us>; Fri, 26 Jan 2001 12:20:27 -0500 (EST)
-Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
-  by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 17:20:26 -0000
-X-Apparently-From: <janwieck@yahoo.com>
-Received: (from janwieck@localhost)
-	by jupiter.greatbridge.com (8.9.3/8.9.3) id MAA03196;
-	Fri, 26 Jan 2001 12:30:05 -0500
-From: Jan Wieck <janwieck@yahoo.com>
-Message-Id: <200101261730.MAA03196@jupiter.greatbridge.com>
-Subject: Re: [HACKERS] Open 7.1 items
-To: PostgreSQL HACKERS <pgsql-hackers@postgreSQL.org>,
-        Bruce Momjian <root@candle.pha.pa.us>
-Date: Fri, 26 Jan 2001 12:30:05 -0500 (EST)
-X-Mailer: ELM [version 2.4ME+ PL68 (25)]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-Bruce Momjian wrote:
-> Here are my open 7.1 items.  Thanks for shrinking the list so far.
->
-> ---------------------------------------------------------------------------
->
-> FreeBSD locale bug
-> Reorder INSERT firing in rules
-
-    I  don't  recall  why this is wanted. AFAIK there's no reason
-    NOT to do so, except for the actual state of beeing  far  too
-    close to a release candidate.
-
-> Philip Warner UPDATE crash
-> JDBC LargeObject short read return value missing
-> SELECT cash_out(1) crashes all backends
-> LAZY VACUUM
-> FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
-
-    A well known issue, and I've asked multiple times how exactly
-    we want to define the behaviour for deferred constraints.  Do
-    foreign keys reference just to a key value and are happy with
-    it's existance, or do they refer to a particular row?
-
-    Consider you have a deferred "ON DELETE  CASCADE"  constraint
-    and  do  a  DELETE, INSERT of a PK. Do the FK rows need to be
-    deleted or not?
-
-    Consider you have a deferred "ON  DELETE  RESTRICT"  and  "ON
-    UPDATE  CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
-    to PK1, the FK2 rows need to follow, but does PK2 inherit all
-    FK1 rows now so it's the master of both groups?
-
-    These  are  only two possible combinations. There are many to
-    think of.  As said, I've asked before, but noone  voted  yet.
-    Move  the item to 7.2 anyway, because changing this behaviour
-    would require massive changes in the trigger queue *and*  the
-    generic  RI triggers, which cannot be tested enough any more.
-
-
-Jan
-
-> Usernames limited in length
-> Does pg_dump preserve COMMENTs?
-> Failure of nested cursors in JDBC
-> JDBC setMaxRows() is global variable affecting other objects
-> Does JDBC Makefile need current dir?
-> Fix for pg_dump of bad system tables
-> Steve Howe failure query with rules
-> ODBC/JDBC not disconnecting properly?
-> Magnus Hagander ODBC issues?
-> Merge MySQL/PgSQL translation scripts
-> Fix ipcclean on Linux
-> Merge global and template BKI files?
->
->
-> --
->   Bruce Momjian                        |  http://candle.pha.pa.us
->   pgman@candle.pha.pa.us               |  (610) 853-3000
->   +  If your life is a hard drive,     |  830 Blythe Avenue
->   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
->
-
-
--
-
-#======================================================================#
-# It's easier to get forgiveness for being wrong than for being right. #
-# Let's break this rule - forgive me.                                  #
-#================================================== JanWieck@Yahoo.com #
-
-
-_________________________________________________________
-Do You Yahoo!?
-Get your free @yahoo.com address at http://mail.yahoo.com
-
-
-From pgsql-general-owner+M590@postgresql.org Tue Nov 14 16:30:40 2000
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA22313
-	for <pgman@candle.pha.pa.us>; Tue, 14 Nov 2000 17:30:39 -0500 (EST)
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAEMSJs66979;
-	Tue, 14 Nov 2000 17:28:21 -0500 (EST)
-	(envelope-from pgsql-general-owner+M590@postgresql.org)
-Received: from megazone23.bigpanda.com (138.210.6.64.reflexcom.com [64.6.210.138])
-	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAEMREs66800
-	for <pgsql-general@postgresql.org>; Tue, 14 Nov 2000 17:27:14 -0500 (EST)
-	(envelope-from sszabo@megazone23.bigpanda.com)
-Received: from localhost (sszabo@localhost)
-	by megazone23.bigpanda.com (8.11.1/8.11.0) with ESMTP id eAEMPpH69059;
-	Tue, 14 Nov 2000 14:25:51 -0800 (PST)
-Date: Tue, 14 Nov 2000 14:25:51 -0800 (PST)
-From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
-To: "Beth K. Gatewood" <bethg@mbt.washington.edu>
-cc: pgsql-general@postgresql.org
-Subject: Re: [GENERAL] a request for some experienced input.....
-In-Reply-To: <3A11ACA1.E5D847DD@mbt.washington.edu>
-Message-ID: <Pine.BSF.4.21.0011141403380.68986-100000@megazone23.bigpanda.com>
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=US-ASCII
-Precedence: bulk
-Sender: pgsql-general-owner@postgresql.org
-Status: OR
-
-
-On Tue, 14 Nov 2000, Beth K. Gatewood wrote:
-
-> >
-> 
-> Stephan-
-> 
-> Thank you so much for taking the effort to answer this these questions.  You
-> help is truly appreciated....
-> 
-> I just have a few points for clarification.
-> 
-> >
-> > MATCH PARTIAL is a specific match type which describes which rows are
-> > considered matching rows for purposes of meeting or failing the
-> > constraint.  (In match partial, a fktable (NULL, 2) would match a pk
-> > table (1,2) as well as a pk table (2,2).  It's different from match
-> > full in which case (NULL,2) would be invalid or match unspecified
-> > in which case it would match due to the existance of the NULL in any
-> > case).  There are some bizarre implementation details involved with
-> > it and it's different from the others in ways that make it difficult.
-> > It's in my list of things to do, but I haven't come up with an acceptable
-> > mechanism in my head yet.
-> 
-> Does this mean, currently that I can not have foreign keys with null values?
-
-Not exactly...
-
-Match full = In FK row, all columns must be NULL or the value of each
-	column must not be null and there is a row in the PK table where
-	each referencing column equals the corresponding referenced
-	column.
-
-Unspecified = In FK row, at least one column must be NULL or each
-	referencing column shall be equal to the corresponding referenced
-	column in some row of the referenced table
-
-Match partial is similar to match full except we ignore the null columns
- for purposes of the each referencing column equals bit.
-
-For example:
-           PK Table Key values: (1,2), (1,3), (3,3)
- Attempted FK Table Key values: (1,2), (1,NULL), (5,NULL), (NULL, NULL)
- (hopefully I get this right)...
- In match full, only the 1st and 4th fk values are valid.
- In match partial, the 1st, 2nd, and 4th fk values are valid.
- In match unspecified, all the fk values are valid.
-
-The other note is that generally speaking, all three are basically the
-same for the single column key.  If you're only doing references on one
-column, the match type is mostly meaningless.
-
-> > PENDANT adds that for each row of the referenced table the values of
-> > the specified column(s) are the same as the values of the specified
-> > column(s) in some row of the referencing tables.
-> 
-> I am not sure I know what you mean here.....Are you saying that the value for
-> the FK column must match the value for the PK column?
-
-I haven't really looked at PENDANT, the above was just a small rewrite of
-some descriptive text in the sql99 draft I have.  There's a whole bunch
-of rules in the actual text of the referential constraint definition.
-
-The base stuff seems to be: (Rf is the referencing columns, T is the
-referenced table)
-
-      3) If PENDANT is specified, then:
-         a) For a given row in the referencing table, let pendant
-           reference designate an instance in which all Rf are
-           non-null.
-
-         b) Let number of pendant paths be the number of pendant
-           references to the same referenced row in a referenced table
-           from all referencing rows in all base tables.
-
-         c) For every row in T, the number of pendant paths is equal to
-	   or greater than 1.
-
-So, I'd read it as every row in T must have at least one referencing row
-in some base table.
-
-There are some details about updates and that you can't mix PENDANT and
-MATCH PARTIAL or SET DEFAULT actions.
-
-> > The main issues in 7.0 are that older versions (might be fixed in
-> > 7.0.3) would fail very badly if you used alter table to rename tables that
-> > were referenced in a fk constraint and that you need to give update
-> > permission to the referenced table.  For the former, 7.1 will (and 7.0.3
-> > may) give an elog(ERROR) to you rather than crashing the backend and the
-> > latter should be fixed for 7.1 (although you still need to have write
-> > perms to the referencing table for referential actions to work properly)
-> 
-> Are the steps to this outlined somewhere then?
-
-The permissions stuff is just a matter of using GRANT and REVOKE to set
-the permissions that a user has to a table.  
-
-
--- a/doc/TODO.detail/fsync
+++ b/doc/TODO.detail/fsync
@@ -1,129 +0,0 @@
-From pgsql-hackers-owner+M908@postgresql.org Sun Nov 19 14:27:43 2000
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA10885
-	for <pgman@candle.pha.pa.us>; Sun, 19 Nov 2000 14:27:42 -0500 (EST)
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAJJSMs83653;
-	Sun, 19 Nov 2000 14:28:22 -0500 (EST)
-	(envelope-from pgsql-hackers-owner+M908@postgresql.org)
-Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46] (may be forged))
-	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAJJQns83565
-	for <pgsql-hackers@postgreSQL.org>; Sun, 19 Nov 2000 14:26:49 -0500 (EST)
-	(envelope-from pgman@candle.pha.pa.us)
-Received: (from pgman@localhost)
-	by candle.pha.pa.us (8.9.0/8.9.0) id OAA06790;
-	Sun, 19 Nov 2000 14:23:06 -0500 (EST)
-From: Bruce Momjian <pgman@candle.pha.pa.us>
-Message-Id: <200011191923.OAA06790@candle.pha.pa.us>
-Subject: Re: [HACKERS] WAL fsync scheduling
-In-Reply-To: <002101c0525e$2d964480$b97a30d0@sectorbase.com> "from Vadim Mikheev
-	at Nov 19, 2000 11:23:19 am"
-To: Vadim Mikheev <vmikheev@sectorbase.com>
-Date: Sun, 19 Nov 2000 14:23:06 -0500 (EST)
-CC: Tom Samplonius <tom@sdf.com>, Alfred@candle.pha.pa.us,
-        Perlstein <bright@wintelcom.net>, Larry@candle.pha.pa.us,
-        Rosenman <ler@lerctr.org>,
-        PostgreSQL-development <pgsql-hackers@postgresql.org>
-X-Mailer: ELM [version 2.4ME+ PL77 (25)]
-MIME-Version: 1.0
-Content-Transfer-Encoding: 7bit
-Content-Type: text/plain; charset=US-ASCII
-Precedence: bulk
-Sender: pgsql-hackers-owner@postgresql.org
-Status: OR
-
-[ Charset ISO-8859-1 unsupported, converting... ]
-> > There are two parts to transaction commit.  The first is writing all
-> > dirty buffers or log changes to the kernel, and second is fsync of the
->    ^^^^^^^^^^^^
-> Backend doesn't write any dirty buffer to the kernel at commit time.
-
-Yes, I suspected that.
-
-> 
-> > log file.
-> 
-> The first part is writing commit record into WAL buffers in shmem.
-> This is what XLogInsert does.  After that XLogFlush is called to ensure
-> that  entire commit record is on disk. XLogFlush does *both* write() and
-> fsync() (single slock is used for both writing and fsyncing) if it needs to
-> do it at all.
-
-Yes, I realize there are new steps in WAL.
-
-> 
-> > I suggest having a per-backend shared memory byte that has the following
-> > values:
-> > 
-> > START_LOG_WRITE
-> > WAIT_ON_FSYNC
-> > NOT_IN_COMMIT
-> > backend_number_doing_fsync
-> > 
-> > I suggest that when each backend starts a commit, it sets its byte to
-> > START_LOG_WRITE. 
->   ^^^^^^^^^^^^^^^^^^^^^^^
-> Isn't START_COMMIT more meaningful?
-
-Yes.
-
-> 
-> > When it gets ready to fsync, it checks all backends. 
->    ^^^^^^^^^^^^^^^^^^^^^^^^^^
-> What do you mean by this? The moment just after XLogInsert?
-
-Just before it calls fsync().
-
-> 
-> > If all are NOT_IN_COMMIT, it does fsync and continues.
-> 
-> 1st edition:
-> > If one or more are in START_LOG_WRITE, it waits until no one is in
-> > START_LOG_WRITE.  It then checks all WAIT_ON_FSYNC, and if it is the
-> > lowest backend in WAIT_ON_FSYNC, marks all others with its backend
-> > number, and does fsync.  It then clears all backends with its number to
-> > NOT_IN_COMMIT.  Other backend will see they are not the lowest
-> > WAIT_ON_FSYNC and will wait for their byte to be set to NOT_IN_COMMIT
-> > so they can then continue, knowing their data was synced.
-> 
-> 2nd edition:
-> > I have another idea.  If a backend gets to the point that it needs
-> > fsync, and there is another backend in START_LOG_WRITE, it can go to an
-> > interuptable sleep, knowing another backend will perform the fsync and
-> > wake it up.  Therefore, there is no busy-wait or timed sleep.
-> > 
-> > Of course, a backend must set its status to WAIT_ON_FSYNC to avoid a
-> > race condition.
-> 
-> The 2nd edition is much better. But I'm not sure do we really need in
-> these per-backend bytes in shmem. Why not just have some counters?
-> We can use a semaphore to wake-up all waiters at once.
-
-Yes, that is much better and clearer.  My idea was just to say, "if no
-one is entering commit phase, do the commit.  If someone else is coming,
-sleep and wait for them to do the fsync and wake me up with a singal."  
-
-> 
-> > This allows a single backend not to sleep, and allows multiple backends
-> > to bunch up only when they are all about to commit.
-> > 
-> > The reason backend numbers are written is so other backends entering the
-> > commit code will not interfere with the backends performing fsync.
-> 
-> Being waked-up backend can check what's written/fsynced by calling XLogFlush.
-
-Seems that may not be needed anymore with a counter.  The only issue is
-that other backends may enter commit while fsync() is happening.  The
-process that did the fsync must be sure to wake up only the backends
-that were waiting for it, and not other backends that may be also be
-doing fsync as a group while the first fsync was happening.  I leave
-those details to people more experienced.  :-)
-
-I am just glad people liked my idea.
-
-- 
-  Bruce Momjian                        |  http://candle.pha.pa.us
-  pgman@candle.pha.pa.us               |  (610) 853-3000
-  +  If your life is a hard drive,     |  830 Blythe Avenue
-  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
-
--- a/doc/TODO.detail/optimizer
+++ b/doc/TODO.detail/optimizer
--- a/doc/TODO.detail/persistent
+++ b/doc/TODO.detail/persistent
@@ -1,102 +0,0 @@
-From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
-Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
-	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
-	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
-Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
-Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
-	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
-	Mon, 11 May 1998 11:14:43 -0400 (EDT)
-To: Brett McCormick <brett@work.chicken.org>
-cc: hackers@postgreSQL.org
-Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
-In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
-             <13655.4384.345723.466046@abraxas.scene.com> 
-Date: Mon, 11 May 1998 11:14:43 -0400
-Message-ID: <24913.894899683@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Sender: owner-pgsql-hackers@hub.org
-Precedence: bulk
-Status: RO
-
-Brett McCormick <brett@work.chicken.org> writes:
-> same way that the current network socket is passed -- through an execv
-> argument.  hopefully, however, the non-execv()ing fork will be in 6.4.
-
-Um, you missed the point, Brett.  David was hoping to transfer a client
-connection from the postmaster to an *already existing* backend process.
-Fork, with or without exec, solves the problem for a backend that's
-started after the postmaster has accepted the client socket.
-
-This does lead to a different line of thought, however.  Pre-started
-backends would have access to the "master" connection socket on which
-the postmaster listens for client connections, right?  Suppose that we
-fire the postmaster as postmaster, and demote it to being simply a
-manufacturer of new backend processes as old ones get used up.  Have
-one of the idle backend processes be the one doing the accept() on the
-master socket.  Once it has a client connection, it performs the
-authentication handshake and then starts serving the client (or just
-quits if authentication fails).  Meanwhile the next idle backend process
-has executed accept() on the master socket and is waiting for the next
-client; and shortly the postmaster/factory/whateverwecallitnow notices
-that it needs to start another backend to add to the idle-backend pool.
-
-This'd probably need some interlocking among the backends.  I have no
-idea whether it'd be safe to have all the idle backends trying to
-do accept() on the master socket simultaneously, but it sounds risky.
-Better to use a mutex so that only one gets to do it while the others
-sleep.
-
-			regards, tom lane
-
-
-From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
-Received: from hub.org (hub.org [209.47.148.200])
-	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
-	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
-Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
-Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
-	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
-	Mon, 11 May 1998 11:26:44 -0400 (EDT)
-To: Brett McCormick <brett@work.chicken.org>
-cc: hackers@postgreSQL.org
-Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
-In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
-             <13655.4384.345723.466046@abraxas.scene.com> 
-Date: Mon, 11 May 1998 11:26:44 -0400
-Message-ID: <25004.894900404@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Sender: owner-pgsql-hackers@hub.org
-Precedence: bulk
-Status: RO
-
-Meanwhile, *I* missed the point about Brett's second comment :-(
-
-Brett McCormick <brett@work.chicken.org> writes:
-> There will have to be some sort of arg parsing in any case,
-> considering that you can pass configurable arguments to the backend..
-
-If we do the sort of change David and I were just discussing, then the
-pre-spawned backend would become responsible for parsing and dealing
-with the PGOPTIONS portion of the client's connection request message.
-That's just part of shifting the authentication handshake code from
-postmaster to backend, so it shouldn't be too hard.
-
-BUT: the whole point is to be able to initialize the backend before it
-is connected to a client.  How much of the expensive backend startup
-work depends on having the client connection options available?
-Any work that needs to know the options will have to wait until after
-the client connects.  If that means most of the startup work can't
-happen in advance anyway, then we're out of luck; a pre-started backend
-won't save enough time to be worth the effort.  (Unless we are willing
-to eliminate or redefine the troublesome options...)
-
-			regards, tom lane
-
-
--- a/doc/TODO.detail/pool
+++ b/doc/TODO.detail/pool
@ -1319,3 +1319,105 @@ DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
 ---------------------------(end of broadcast)---------------------------
 TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

+From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
+	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
+	Mon, 11 May 1998 11:14:43 -0400 (EDT)
+To: Brett McCormick <brett@work.chicken.org>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
+In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
+             <13655.4384.345723.466046@abraxas.scene.com> 
+Date: Mon, 11 May 1998 11:14:43 -0400
+Message-ID: <24913.894899683@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Brett McCormick <brett@work.chicken.org> writes:
+> same way that the current network socket is passed -- through an execv
+> argument.  hopefully, however, the non-execv()ing fork will be in 6.4.
+
+Um, you missed the point, Brett.  David was hoping to transfer a client
+connection from the postmaster to an *already existing* backend process.
+Fork, with or without exec, solves the problem for a backend that's
+started after the postmaster has accepted the client socket.
+
+This does lead to a different line of thought, however.  Pre-started
+backends would have access to the "master" connection socket on which
+the postmaster listens for client connections, right?  Suppose that we
+fire the postmaster as postmaster, and demote it to being simply a
+manufacturer of new backend processes as old ones get used up.  Have
+one of the idle backend processes be the one doing the accept() on the
+master socket.  Once it has a client connection, it performs the
+authentication handshake and then starts serving the client (or just
+quits if authentication fails).  Meanwhile the next idle backend process
+has executed accept() on the master socket and is waiting for the next
+client; and shortly the postmaster/factory/whateverwecallitnow notices
+that it needs to start another backend to add to the idle-backend pool.
+
+This'd probably need some interlocking among the backends.  I have no
+idea whether it'd be safe to have all the idle backends trying to
+do accept() on the master socket simultaneously, but it sounds risky.
+Better to use a mutex so that only one gets to do it while the others
+sleep.
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
+	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
+	Mon, 11 May 1998 11:26:44 -0400 (EDT)
+To: Brett McCormick <brett@work.chicken.org>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
+In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
+             <13655.4384.345723.466046@abraxas.scene.com> 
+Date: Mon, 11 May 1998 11:26:44 -0400
+Message-ID: <25004.894900404@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Meanwhile, *I* missed the point about Brett's second comment :-(
+
+Brett McCormick <brett@work.chicken.org> writes:
+> There will have to be some sort of arg parsing in any case,
+> considering that you can pass configurable arguments to the backend..
+
+If we do the sort of change David and I were just discussing, then the
+pre-spawned backend would become responsible for parsing and dealing
+with the PGOPTIONS portion of the client's connection request message.
+That's just part of shifting the authentication handshake code from
+postmaster to backend, so it shouldn't be too hard.
+
+BUT: the whole point is to be able to initialize the backend before it
+is connected to a client.  How much of the expensive backend startup
+work depends on having the client connection options available?
+Any work that needs to know the options will have to wait until after
+the client connects.  If that means most of the startup work can't
+happen in advance anyway, then we're out of luck; a pre-started backend
+won't save enough time to be worth the effort.  (Unless we are willing
+to eliminate or redefine the troublesome options...)
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/prepare
+++ b/doc/TODO.detail/prepare
--- a/doc/TODO.detail/replication
+++ b/doc/TODO.detail/replication
--- a/doc/TODO.detail/typeconv
+++ b/doc/TODO.detail/typeconv
@ -1,916 +0,0 @@
-From pgsql-hackers-owner+M1833@hub.org Sat May 13 22:49:26 2000
-Received: from news.tht.net (news.hub.org [216.126.91.242])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07394
-	for <pgman@candle.pha.pa.us>; Sat, 13 May 2000 22:49:24 -0400 (EDT)
-Received: from hub.org (majordom@hub.org [216.126.84.1])
-	by news.tht.net (8.9.3/8.9.3) with ESMTP id WAB99859;
-	Sat, 13 May 2000 22:44:15 -0400 (EDT)
-	(envelope-from pgsql-hackers-owner+M1833@hub.org)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
-	by hub.org (8.9.3/8.9.3) with ESMTP id WAA51058
-	for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:41:16 -0400 (EDT)
-	(envelope-from tgl@sss.pgh.pa.us)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA18343
-	for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:40:38 -0400 (EDT)
-To: pgsql-hackers@postgresql.org
-Subject: [HACKERS] Proposal for fixing numeric type-resolution issues
-Date: Sat, 13 May 2000 22:40:38 -0400
-Message-ID: <18340.958272038@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-X-Mailing-List: pgsql-hackers@postgresql.org
-Precedence: bulk
-Sender: pgsql-hackers-owner@hub.org
-Status: ORr
-
-We've got a collection of problems that are related to the parser's
-inability to make good type-resolution choices for numeric constants.
-In some cases you get a hard error; for example "NumericVar + 4.4"
-yields
-ERROR:  Unable to identify an operator '+' for types 'numeric' and 'float8'
-        You will have to retype this query using an explicit cast
-because "4.4" is initially typed as float8 and the system can't figure
-out whether to use numeric or float8 addition.  A more subtle problem
-is that a query like "... WHERE Int2Var < 42" is unable to make use of
-an index on the int2 column: 42 is resolved as int4, so the operator
-is int24lt, which works but is not in the opclass of an int2 index.
-
-Here is a proposal for fixing these problems.  I think we could get this
-done for 7.1 if people like it.
-
-The basic problem is that there's not enough smarts in the type resolver
-about the interrelationships of the numeric datatypes.  All it has is
-a concept of a most-preferred type within the category of numeric types.
-(We are abusing the most-preferred-type mechanism, BTW, because both
-FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
-category!  This is in fact why the resolver can't make a choice for
-"numeric+float8".)  We need more intelligence than that.
-
-I propose that we set up a strictly-ordered hierarchy of numeric
-datatypes, running from least preferred to most preferred:
-	int2, int4, int8, numeric, float4, float8.
-Rather than simply considering coercions to the most-preferred type,
-the type resolver should use the following rules:
-
-1. No value will be down-converted (eg int4 to int2) except by an
-explicit conversion.
-
-2. If there is not an exact matching operator, numeric values will be
-up-converted to the highest numeric datatype present among the operator
-or function's arguments.  For example, given "int2 + int8" we'd up-
-convert the int2 to int8 and apply int8 addition.
-
-The final piece of the puzzle is that the type initially assigned to
-an undecorated numeric constant should be NUMERIC if it contains a
-decimal point or exponent, and otherwise the smallest of int2, int4,
-int8, NUMERIC that will represent it.  This is a considerable change
-from the current lexer behavior, where you get either int4 or float8.
-
-For example, given "NumericVar + 4.4", the constant 4.4 will initially
-be assigned type NUMERIC, we will resolve the operator as numeric plus,
-and everything's fine.  Given "Float8Var + 4.4", the constant is still
-initially numeric, but will be up-converted to float8 so that float8
-addition can be used.  The end result is the same as in traditional
-Postgres: you get float8 addition.  Given "Int2Var < 42", the constant
-is initially typed as int2, since it fits, and we end up selecting
-int2lt, thereby allowing use of an int2 index.  (On the other hand,
-given "Int2Var < 100000", we'd end up using int4lt, which is correct
-to avoid overflow.)
-
-A couple of crucial subtleties here:
-
-1. We are assuming that the parser or optimizer will constant-fold
-any conversion functions that are introduced.  Thus, in the
-"Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
-time execution begins, so there's no performance loss.
-
-2. We cannot lose precision by initially representing a constant as
-numeric and later converting it to float.  Nor can we exceed NUMERIC's
-range (the default 1000-digit limit is more than the range of IEEE
-float8 data).  It would not work as well to start out by representing
-a constant as float and then converting it to numeric.
-
-Presently, the pg_proc and pg_operator tables contain a pretty fair
-collection of cross-datatype numeric operators, such as int24lt,
-float48pl, etc.  We could perhaps leave these in, but I believe that
-it is better to remove them.  For example, if int42lt is left in place,
-then it would capture cases like "Int4Var < 42", whereas we need that
-to be translated to int4lt so that an int4 index can be used.  Removing
-these operators will eliminate some code bloat and system-catalog bloat
-to boot.
-
-As far as I can tell, this proposal is almost compatible with the rules
-given in SQL92: in particular, SQL92 specifies that an operator having
-both "approximate numeric" (float) and "exact numeric" (int or numeric)
-inputs should deliver an approximate-numeric result.  I propose
-deviating from SQL92 in a single respect: SQL92 specifies that a
-constant containing an exponent (eg 1.2E34) is approximate numeric,
-which implies that the result of an operator using it is approximate
-even if the other operand is exact.  I believe it's better to treat
-such a constant as exact (ie, type NUMERIC) and only convert it to
-float if the other operand is float.  Without doing that, an assignment
-like
-	UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
-will not work as desired because the constant will be prematurely
-coerced to float, causing precision loss.
-
-Comments?
-
-			regards, tom lane
-
-From tgl@sss.pgh.pa.us Sun May 14 17:30:56 2000
-Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05808
-	for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:30:52 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.4 $) with ESMTP id RAA16657 for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:29:52 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA20914;
-	Sun, 14 May 2000 17:29:30 -0400 (EDT)
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
-Subject: Re: [HACKERS] type conversion discussion 
-In-reply-to: <200005141950.PAA04636@candle.pha.pa.us> 
-References: <200005141950.PAA04636@candle.pha.pa.us>
-Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
-	message dated "Sun, 14 May 2000 15:50:20 -0400"
-Date: Sun, 14 May 2000 17:29:30 -0400
-Message-ID: <20911.958339770@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Status: OR
-
-Bruce Momjian <pgman@candle.pha.pa.us> writes:
-> At some point, it seems we need to get all the PostgreSQL minds together
-> to discuss type conversion issues.  These problems continue to come up
-> from release to release.  We are getting better, but it seems a full
-> discussion could help solidify our strategy.
-
-OK, here are a few things that bug me about the current type-resolution
-code:
-
-1. Poor choice of type to attribute to numeric literals.  (A possible
-   solution is sketched in my earlier message, but do we need similar
-   mechanisms for other type categories?)
-
-2. Tensions between treating string literals as "unknown" type and
-   as "text" type, per this thread so far.
-
-3. IS_BINARY_COMPATIBLE seems like a bogus concept.  Do we really want a
-   fully symmetrical ring of types in each group?  I'd prefer to see a
-   one-way equivalence, which allows eg. OID to be silently converted
-   to INT4, but *not* vice versa (except perhaps by specific user cast).
-   This'd be more like a traditional "is-a" or inheritance relationship
-   between datatypes, which has well-understood semantics.
-
-4. I'm also concerned that the behavior of IS_BINARY_COMPATIBLE isn't
-   very predictable because it will happily go either way.  For example,
-   if I do 
-	select * from pg_class where oid = 1234;
-   it's unclear whether I will get an oideq or an int4eq operator ---
-   and that's a rather critical point since only one of them can exploit
-   an index on the oid column.  Currently, there is some klugery in the
-   planner that works around this by overriding the parser's choice of
-   operator to substitute one that is compatible with an available index.
-   That's a pretty ugly solution ... I'm not sure I know a better one,
-   but as long as we're discussing type resolution issues ...
-
-5. Lack of extensibility.  There's way too much knowledge hard-wired
-   into the parser about type categories, preferred types, binary
-   compatibility, etc.  All of it falls down when faced with
-   user-defined datatypes.  If we do something like I suggested with
-   a hardwired hierarchy of numeric datatypes, it'll get even worse.
-   All this stuff ought to be driven off fields in pg_type rather than
-   be hardwired into the code, so that the same concepts can be extended
-   to user-defined types.
-
-I don't have worked-out proposals for any of these but the first,
-but they've all been bothering me for a while.
-
-			regards, tom lane
-
-From tgl@sss.pgh.pa.us Sun May 14 21:02:31 2000
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07700
-	for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 21:02:28 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA21261;
-	Sun, 14 May 2000 21:03:17 -0400 (EDT)
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
-Subject: Re: [HACKERS] type conversion discussion 
-In-reply-to: <20911.958339770@sss.pgh.pa.us> 
-References: <200005141950.PAA04636@candle.pha.pa.us> <20911.958339770@sss.pgh.pa.us>
-Comments: In-reply-to Tom Lane <tgl@sss.pgh.pa.us>
-	message dated "Sun, 14 May 2000 17:29:30 -0400"
-Date: Sun, 14 May 2000 21:03:17 -0400
-Message-ID: <21258.958352597@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Status: OR
-
-Here are the results of some further thoughts about type-conversion
-issues.  This is not a complete proposal yet, but a sketch of an
-approach that might solve several of the gripes in my previous proposal.
-
-While thinking about this, I realized that my numeric-types proposal
-of yesterday would break at least a few cases that work nicely now.
-For example, I frequently do things like
-	select * from pg_class where oid = 1234;
-whilst poking around in system tables and querytree dumps.  If that
-constant is initially resolved as int2, as I suggested yesterday,
-then we have "oid = int2" for which there is no operator.  To succeed
-we must decide to promote the constant to int4 --- but with no int4
-visible among the operands of the "=", it will not work to just "promote
-numerics to the highest type seen in the operands" as I suggested
-yesterday.  So there has to be some more interaction in there.
-
-Anyway, I was complaining about the looseness of the concept of
-binary-compatible types and the fact that the parser's type conversion
-knowledge is mostly hardwired.  These might be resolved by generalizing
-the numeric type hierarchy idea into a "type promotion lattice", which
-would work like this:
-
-* Add a "typpromote" column to pg_type, which contains either zero or
-  the OID of another type that the parser is allowed to promote this
-  type to when searching for usable functions/operators.  For example,
-  my numeric-types hierarchy of yesterday would be expressed by making
-  int2 promote to int4, int4 to int8, int8 to numeric, numeric to
-  float4, and float4 to float8.  The promotion idea also replaces the
-  current concept of binary-compatible types: for example, OID would
-  link to int4 and varchar would link to text (but not vice versa!).
-
-* Also add a "typpromotebin" boolean column to pg_type, which contains
-  't' if the type conversion indicated by typpromote is "free", ie,
-  no conversion function need be executed before regarding a value as
-  belonging to the promoted type.  This distinguishes binary-compatible
-  from non-binary-compatible cases.  If "typpromotebin" is 'f' and the
-  parser decides it needs to apply the conversion, then it has to look
-  up the appropriate conversion function in pg_proc.  (More about this
-  below.)
-
-Now, if the parser fails to find an exact match for a given function
-or operator name and the exact set of input data types, it proceeds by
-chasing up the promotion chains for the input data types and trying to
-locate a set of types for which there is a matching function/operator.
-If there are multiple possibilities, we choose the one which is the
-"least promoted" by some yet-to-be-determined metric.  (This metric
-would probably favor "free" conversions over non-free ones, but other
-than that I'm not quite sure how it should work.  The metric would
-replace a whole bunch of ad-hoc heuristics that are currently applied
-in the type resolver, so even if it seems rather ad-hoc it'd still be
-cleaner than what we have ;-).)
-
-In a situation like the "oid = int2" example above, this mechanism would
-presumably settle on "int4 = int4" as being the least-promoted
-equivalent operator.  (It could not find "oid = oid" since there is
-no promotion path from int2 to oid.)  That looks bad since it isn't
-compatible with an oidops index --- but I have a solution for that!
-I don't think we need the oid opclass at all; why shouldn't indexes
-on oid be expressed as int4 indexes to begin with?  In general, if
-two types are considered binary-equivalent under the old scheme, then
-the one that is considered the subtype probably shouldn't have separate
-index operators under this new scheme.  Instead it should just rely on
-the index operators of the promoted type.
-
-The point of the proposed typpromotebin field is to save a pg_proc
-lookup when trying to determine whether a particular promotion is "free"
-or not.  We could save even more lookups if we didn't store the boolean
-but instead the actual OID of the conversion function, or zero if the
-promotion is "free".  The trouble with that is that it creates a
-circularity problem when trying to define a new user type --- you can't
-define the conversion function if its input type doesn't exist yet.
-In any case, we want the parser to do a function lookup if we've
-advanced more than one step in the promotion hierarchy: if we've decided
-to promote int4 to float8 (which will be a four-step chain through int8,
-numeric, float4) we sure want the thing to use a direct int4tofloat8
-conversion function if available, not a chain of four conversion
-functions.  So on balance I think we want to look in pg_proc once we've
-decided which conversion to perform.  The only reason for having
-typpromotebin is that the promotion metric will want to know which
-conversions are free, and we don't want to have to do a lookup in
-pg_proc for each alternative we consider, only the ones that are finally
-selected to be used.
-
-I can think of at least one special case that still isn't cleanly
-handled under this scheme, and that is bpchar vs. varchar comparison.
-Currently, we have
-
-regression=# select 'a'::bpchar = 'a '::bpchar;
- ?column?
----------
- t
-(1 row)
-
-This is correct since trailing blanks are insignificant in bpchar land,
-so the two values should be considered equal.  If we try
-
-regression=# select 'a'::bpchar = 'a '::varchar;
-ERROR:  Unable to identify an operator '=' for types 'bpchar' and 'varchar'
-        You will have to retype this query using an explicit cast
-
-which is pretty bogus but at least it saves the system from making some
-random choice about whether bpchar or varchar comparison rules apply.
-On the other hand,
-
-regression=# select 'a'::bpchar = 'a '::text;
- ?column?
----------
- f
-(1 row)
-
-Here the bpchar value has been promoted to text and then text comparison
-(where trailing blanks *are* significant) is applied.  I'm not sure that
-we can really justify doing this in this case when we reject the bpchar
-vs varchar case, but maybe someone wants to argue that that's correct.
-
-The natural setup in my type-promotion scheme would be that both bpchar
-and varchar link to 'text' as their promoted type.  If we do nothing
-special then text-style comparison would be used in a bpchar vs varchar
-comparison, which is arguably wrong.
-
-One way to deal with this without introducing kluges into the type
-resolver is to provide a full set of bpchar vs text and text vs bpchar
-operators, and make sure that the promotion metric is such that these
-will be used in place of text vs text operators if they apply (which
-should hold, I think, for any reasonable metric).  This is probably
-the only way to get the "right" behavior in any case --- I think that
-the "right" behavior for such comparisons is to strip trailing blanks
-from the bpchar side but not the text/varchar side.  (I haven't checked
-to see if SQL92 agrees, though.)
-
-Another issue is how to fit resolution of "unknown" literals into this
-scheme.  We could probably continue to handle them more or less as we
-do now, but they might complicate the promotion metric.
-
-I am not clear yet on whether we'd still need the concept of "type
-categories" as they presently exist in the resolver.  It's possible
-that we wouldn't, which would be a nice simplification.  (If we do
-still need them, we should have a column in pg_type that defines the
-category of a type, instead of hard-wiring category assignments.)
-
-			regards, tom lane
-
-From e99re41@DoCS.UU.SE Mon May 15 07:39:03 2000
-Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA10251
-	for <pgman@candle.pha.pa.us>; Mon, 15 May 2000 07:39:01 -0400 (EDT)
-Received: from Zebra.DoCS.UU.SE (e99re41@Zebra.DoCS.UU.SE [130.238.9.158])
-	by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id NAA10849;
-	Mon, 15 May 2000 13:39:45 +0200 (MET DST)
-Received: from localhost (e99re41@localhost) by Zebra.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id NAA26523; Mon, 15 May 2000 13:39:44 +0200
-X-Authentication-Warning: Zebra.DoCS.UU.SE: e99re41 owned process doing -bs
-Date: Mon, 15 May 2000 13:39:44 +0200 (MET DST)
-From: Peter Eisentraut <e99re41@DoCS.UU.SE>
-Reply-To: Peter Eisentraut <peter_e@gmx.net>
-To: Tom Lane <tgl@sss.pgh.pa.us>
-cc: Bruce Momjian <pgman@candle.pha.pa.us>,
-        PostgreSQL-development <pgsql-hackers@postgresql.org>
-Subject: Re: [HACKERS] type conversion discussion 
-In-Reply-To: <20911.958339770@sss.pgh.pa.us>
-Message-ID: <Pine.GSO.4.02A.10005151309020.26399-100000@Zebra.DoCS.UU.SE>
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=iso-8859-1
-Content-Transfer-Encoding: 8bit
-X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id HAA10251
-Status: OR
-
-On Sun, 14 May 2000, Tom Lane wrote:
-
-> 1. Poor choice of type to attribute to numeric literals.  (A possible
->    solution is sketched in my earlier message, but do we need similar
->    mechanisms for other type categories?)
-
-I think your plan looks good for the numerical land. (I'll ponder the oid
-issues in a second.) For other type categories, perhaps not. Should a line
-be promoted to a polygon so you can check if it contains a point? Or a
-polygon to a box? Higher dimensions? :-)
-
-
-> 2. Tensions between treating string literals as "unknown" type and
->    as "text" type, per this thread so far.
-
-Yes, while we're at it, let's look at this in detail. I claim that
-something of the form 'xxx' should always be text (or char or whatever),
-period. Let's consider the cases where this could potentially clash with
-the current behaviour:
-
-a) The target type is unambiguously clear, e.g., UPDATE ... SET. Then you
-cast text to the target type. The effect is identical.
-
-b) The target type is completely unspecified, e.g. CREATE TABLE AS SELECT
-'xxx'; This will currently create an "unknown" column. It should arguably
-create a "text" column.
-
-Function argument resolution:
-
-c) There is only one function and it has a "text" argument. No-brainer.
-
-d) There is only one function and it has an argument other than text. Try
-to cast text to that type. (This is what's done in general, isn't it?)
-
-e) The function is overloaded for many types, amongst which is text. Then
-call the text version. I believe this would currently fail, which I'd
-consider a deficiency.
-
-f) The function is overloaded for many types, none of which is text. In
-that case you have to cast anyway, so you don't lose anything.
-
-One thing to also keep in mind regarding required casting for (b) and (f)
-is that SQL never allowed literals of "fancy" types (e.g., DATE) to have
-undecorated 'yyyy-mm-dd' constants, you always have to say DATE
-'yyyy-mm-dd'. What Postgres allows is a convenience where DATE would be
-obvious or implied. In the end it's a win-win situation: you tell the
-system what you want, and your code is clearer.
-
- 
-> 3. IS_BINARY_COMPATIBLE seems like a bogus concept.
-
-At least it's bogus when used for types which are not actually binary
-compatible, e.g. int4 and oid. The result of the current implementation is
-that you can perfectly happily insert and retrieve negative numbers from
-oid fields.
-
-I'm not so sure about the value of this particular equivalency anyway.
-AFAICS the only functions that make sense for oids are comparisons (incl.
-min, max), adding integers to them, subtracting one oid from another.
-Silent mangling with int4 means that you can multiply them, square them,
-add floating point numbers to them (doesn't really work in practice
-though), all things that have no business with oids.
-
-I'd say define the operators that are useful for oids explicitly for oids
-and require casts for all others, so the users know what they're doing.
-The fact that an oid is also a number should be an implementation detail.
-
-In my mind oids are like pointers in C. Indiscriminate mangling of
-pointers and integers in C has long been dismissed as questionable coding.
-
-
-Of course I'd be very willing to consider counterexamples to these
-theories ...
-
-- 
-Peter Eisentraut                  Sernanders väg 10:115
-peter_e@gmx.net                   75262 Uppsala
-http://yi.org/peter-e/            Sweden
-
-
-From tgl@sss.pgh.pa.us Tue Jun 13 04:58:20 2000
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24281
-	for <pgman@candle.pha.pa.us>; Tue, 13 Jun 2000 03:58:18 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA02571;
-	Tue, 13 Jun 2000 03:58:43 -0400 (EDT)
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-cc: pgsql-hackers@postgresql.org
-Subject: Re: [HACKERS] Proposal for fixing numeric type-resolution issues 
-In-reply-to: <200006130741.DAA23502@candle.pha.pa.us> 
-References: <200006130741.DAA23502@candle.pha.pa.us>
-Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
-	message dated "Tue, 13 Jun 2000 03:41:56 -0400"
-Date: Tue, 13 Jun 2000 03:58:43 -0400
-Message-ID: <2568.960883123@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Status: OR
-
-Bruce Momjian <pgman@candle.pha.pa.us> writes:
-> Again, anything to add to the TODO here?
-
-IIRC, there was some unhappiness with the proposal you quote, so I'm
-not sure we've quite agreed what to do... but clearly something must
-be done.
-
-			regards, tom lane
-
-
->> We've got a collection of problems that are related to the parser's
->> inability to make good type-resolution choices for numeric constants.
->> In some cases you get a hard error; for example "NumericVar + 4.4"
->> yields
->> ERROR:  Unable to identify an operator '+' for types 'numeric' and 'float8'
->> You will have to retype this query using an explicit cast
->> because "4.4" is initially typed as float8 and the system can't figure
->> out whether to use numeric or float8 addition.  A more subtle problem
->> is that a query like "... WHERE Int2Var < 42" is unable to make use of
->> an index on the int2 column: 42 is resolved as int4, so the operator
->> is int24lt, which works but is not in the opclass of an int2 index.
->> 
->> Here is a proposal for fixing these problems.  I think we could get this
->> done for 7.1 if people like it.
->> 
->> The basic problem is that there's not enough smarts in the type resolver
->> about the interrelationships of the numeric datatypes.  All it has is
->> a concept of a most-preferred type within the category of numeric types.
->> (We are abusing the most-preferred-type mechanism, BTW, because both
->> FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
->> category!  This is in fact why the resolver can't make a choice for
->> "numeric+float8".)  We need more intelligence than that.
->> 
->> I propose that we set up a strictly-ordered hierarchy of numeric
->> datatypes, running from least preferred to most preferred:
->> int2, int4, int8, numeric, float4, float8.
->> Rather than simply considering coercions to the most-preferred type,
->> the type resolver should use the following rules:
->> 
->> 1. No value will be down-converted (eg int4 to int2) except by an
->> explicit conversion.
->> 
->> 2. If there is not an exact matching operator, numeric values will be
->> up-converted to the highest numeric datatype present among the operator
->> or function's arguments.  For example, given "int2 + int8" we'd up-
->> convert the int2 to int8 and apply int8 addition.
->> 
->> The final piece of the puzzle is that the type initially assigned to
->> an undecorated numeric constant should be NUMERIC if it contains a
->> decimal point or exponent, and otherwise the smallest of int2, int4,
->> int8, NUMERIC that will represent it.  This is a considerable change
->> from the current lexer behavior, where you get either int4 or float8.
->> 
->> For example, given "NumericVar + 4.4", the constant 4.4 will initially
->> be assigned type NUMERIC, we will resolve the operator as numeric plus,
->> and everything's fine.  Given "Float8Var + 4.4", the constant is still
->> initially numeric, but will be up-converted to float8 so that float8
->> addition can be used.  The end result is the same as in traditional
->> Postgres: you get float8 addition.  Given "Int2Var < 42", the constant
->> is initially typed as int2, since it fits, and we end up selecting
->> int2lt, thereby allowing use of an int2 index.  (On the other hand,
->> given "Int2Var < 100000", we'd end up using int4lt, which is correct
->> to avoid overflow.)
->> 
->> A couple of crucial subtleties here:
->> 
->> 1. We are assuming that the parser or optimizer will constant-fold
->> any conversion functions that are introduced.  Thus, in the
->> "Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
->> time execution begins, so there's no performance loss.
->> 
->> 2. We cannot lose precision by initially representing a constant as
->> numeric and later converting it to float.  Nor can we exceed NUMERIC's
->> range (the default 1000-digit limit is more than the range of IEEE
->> float8 data).  It would not work as well to start out by representing
->> a constant as float and then converting it to numeric.
->> 
->> Presently, the pg_proc and pg_operator tables contain a pretty fair
->> collection of cross-datatype numeric operators, such as int24lt,
->> float48pl, etc.  We could perhaps leave these in, but I believe that
->> it is better to remove them.  For example, if int42lt is left in place,
->> then it would capture cases like "Int4Var < 42", whereas we need that
->> to be translated to int4lt so that an int4 index can be used.  Removing
->> these operators will eliminate some code bloat and system-catalog bloat
->> to boot.
->> 
->> As far as I can tell, this proposal is almost compatible with the rules
->> given in SQL92: in particular, SQL92 specifies that an operator having
->> both "approximate numeric" (float) and "exact numeric" (int or numeric)
->> inputs should deliver an approximate-numeric result.  I propose
->> deviating from SQL92 in a single respect: SQL92 specifies that a
->> constant containing an exponent (eg 1.2E34) is approximate numeric,
->> which implies that the result of an operator using it is approximate
->> even if the other operand is exact.  I believe it's better to treat
->> such a constant as exact (ie, type NUMERIC) and only convert it to
->> float if the other operand is float.  Without doing that, an assignment
->> like
->> UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
->> will not work as desired because the constant will be prematurely
->> coerced to float, causing precision loss.
->> 
->> Comments?
->> 
->> regards, tom lane
->> 
-
-
-> -- 
->   Bruce Momjian                        |  http://www.op.net/~candle
->   pgman@candle.pha.pa.us               |  (610) 853-3000
->   +  If your life is a hard drive,     |  830 Blythe Avenue
->   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
-
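The proposal quoted above can be sketched in a few lines. This is only an illustration under stated assumptions, not parser code: the helper names `literal_type` and `resolve_operand_type` are hypothetical, and the sketch skips the "exact matching operator" check and goes straight to the up-conversion rule.

```python
# Strict ordering from the proposal: least preferred to most preferred.
HIERARCHY = ["int2", "int4", "int8", "numeric", "float4", "float8"]
RANK = {name: i for i, name in enumerate(HIERARCHY)}

# Integer types in increasing size, with their representable ranges.
INT_RANGES = [
    ("int2", -2**15, 2**15 - 1),
    ("int4", -2**31, 2**31 - 1),
    ("int8", -2**63, 2**63 - 1),
]


def literal_type(text):
    """Initial type for an undecorated numeric constant (hypothetical helper)."""
    # A decimal point or exponent makes the constant NUMERIC, never float.
    if any(ch in text for ch in ".eE"):
        return "numeric"
    value = int(text)
    # Otherwise pick the smallest integer type that can represent it.
    for name, lo, hi in INT_RANGES:
        if lo <= value <= hi:
            return name
    return "numeric"


def resolve_operand_type(left, right):
    """Up-convert to the highest type present; nothing is down-converted."""
    return max(left, right, key=RANK.__getitem__)


# "NumericVar + 4.4" stays numeric; "Float8Var + 4.4" becomes float8.
print(resolve_operand_type("numeric", literal_type("4.4")))
print(resolve_operand_type("float8", literal_type("4.4")))
# "Int2Var < 42" stays int2; "Int2Var < 100000" needs int4.
print(resolve_operand_type("int2", literal_type("42")))
print(resolve_operand_type("int2", literal_type("100000")))
```

Because the proposal also removes the cross-datatype operators, the resolved type is what would select the operator (for example int2lt for the third case).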
-From tgl@sss.pgh.pa.us Mon Jun 12 14:09:45 2000
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01993
-	for <pgman@candle.pha.pa.us>; Mon, 12 Jun 2000 13:09:43 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA01515;
-	Mon, 12 Jun 2000 13:10:01 -0400 (EDT)
-To: Peter Eisentraut <peter_e@gmx.net>
-cc: Bruce Momjian <pgman@candle.pha.pa.us>,
-        "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>,
-        PostgreSQL-development <pgsql-hackers@postgresql.org>
-Subject: Re: [HACKERS] Adding time to DATE type 
-In-reply-to: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain> 
-References: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
-Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
-	message dated "Sun, 11 Jun 2000 13:41:24 +0200"
-Date: Mon, 12 Jun 2000 13:10:00 -0400
-Message-ID: <1512.960829800@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Status: ORr
-
-Peter Eisentraut <peter_e@gmx.net> writes:
-> Bruce Momjian writes:
->> Can someone give me a TODO summary for this issue?
-
-> * make 'text' constants default to text type (not unknown)
-
-> (I think not everyone's completely convinced on this issue, but I don't
-> recall anyone being firmly opposed to it.)
-
-It would be a mistake to eliminate the distinction between unknown and
-text.  See for example my just-posted response to John Cochran on
-pgsql-general about why 'BOULEVARD'::text behaves differently from
-'BOULEVARD'::char.  If string literals are immediately assigned type
-text then we will have serious problems with char(n) fields.
-
-I think it's fine to assign string literals a type of 'unknown'
-initially.  What we need to do is add a phase of type resolution that
-considers treating them as text, but only after the existing logic fails
-to deduce a type.
-
-(BTW it might be better to treat string literals as defaulting to char(n)
-instead of text, allowing the normal promotion rules to replace char(n)
-with text if necessary.  Not sure if that would make things more or less
-confusing for operations that intermix fixed- and variable-width char
-types.)
-
-			regards, tom lane
-
-From pgsql-hackers-owner+M1936@postgresql.org Sun Dec 10 13:17:54 2000
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA20676
-	for <pgman@candle.pha.pa.us>; Sun, 10 Dec 2000 13:17:54 -0500 (EST)
-Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eBAIGvZ40566;
-	Sun, 10 Dec 2000 13:16:57 -0500 (EST)
-	(envelope-from pgsql-hackers-owner+M1936@postgresql.org)
-Received: from sss.pgh.pa.us (sss.pgh.pa.us [209.114.132.154])
-	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eBAI8HZ39820
-	for <pgsql-hackers@postgreSQL.org>; Sun, 10 Dec 2000 13:08:17 -0500 (EST)
-	(envelope-from tgl@sss.pgh.pa.us)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss.pgh.pa.us (8.11.1/8.11.1) with ESMTP id eBAI82o28682;
-	Sun, 10 Dec 2000 13:08:02 -0500 (EST)
-To: Thomas Lockhart <lockhart@alumni.caltech.edu>
-cc: pgsql-hackers@postgresql.org
-Subject: [HACKERS] Unknown-type resolution rules, redux
-Date: Sun, 10 Dec 2000 13:08:02 -0500
-Message-ID: <28679.976471682@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Precedence: bulk
-Sender: pgsql-hackers-owner@postgresql.org
-Status: OR
-
-parse_coerce.c contains the following conversation --- I believe the
-first XXX comment is from me and the second from you:
-
-    /*
-     * Still too many candidates? Try assigning types for the unknown
-     * columns.
-     *
-     * We do this by examining each unknown argument position to see if all
-     * the candidates agree on the type category of that slot.  If so, and
-     * if some candidates accept the preferred type in that category,
-     * eliminate the candidates with other input types.  If we are down to
-     * one candidate at the end, we win.
-     *
-     * XXX It's kinda bogus to do this left-to-right, isn't it?  If we
-     * eliminate some candidates because they are non-preferred at the
-     * first slot, we won't notice that they didn't have the same type
-     * category for a later slot.
-     * XXX Hmm. How else would you do this? These candidates are here because
-     * they all have the same number of matches on arguments with explicit
-     * types, so from here on left-to-right resolution is as good as any.
-     * Need a counterexample to see otherwise...
-     */
-
-The comment is out of date anyway because it fails to mention the new
-rule about preferring STRING category.  But to answer your request for
-a counterexample: consider
-
-	SELECT foo('bar', 'baz')
-
-First, suppose the available candidates are
-
-	foo(float8, int4)
-	foo(float8, point)
-
-In this case, we examine the first argument position, see that all the
-candidates agree on NUMERIC category, so we consider resolving the first
-unknown input to float8.  That eliminates neither candidate so we move
-on to the second argument position.  Here there is a conflict of
-categories so we can't eliminate anything, and we decide the call is
-ambiguous.  That's correct (or at least Operating As Designed ;-)).
-
-But now suppose we have
-
-	foo(float8, int4)
-	foo(float4, point)
-
-Here, at the first position we will still see that all candidates agree
-on NUMERIC category, and then we will eliminate candidate 2 because it
-isn't the preferred type in that category.  Now when we come to the
-second argument position, there's only one candidate left so there's
-no category conflict.  Result: this call is considered non-ambiguous.
-
-This means there is a left-to-right bias in the algorithm.  For example,
-the exact same call *would* be considered ambiguous if the candidates'
-argument orders were reversed:
-
-	foo(int4, float8)
-	foo(point, float4)
-
-I do not like that.  You could maybe argue that earlier arguments are
-more important than later ones for functions, but it's harder to make
-that case for binary operators --- and in any case this behavior is
-extremely difficult to explain in prose.
-
-To fix this, I think we need to split the loop into two passes.
-The first pass does *not* remove any candidates.  What it does is to
-look separately at each UNKNOWN-argument position and attempt to deduce
-a probable category for it, using the following rules:
-
-* If any candidate has an input type of STRING category, use STRING
-category; else if all candidates agree on the category, use that
-category; else fail because no resolution can be made.
-
-* The first pass must also remember whether any candidates are of a
-preferred type within the selected category.
-
-The probable categories and exists-preferred-type booleans are saved in
-local arrays.  (Note this has to be done this way because
-IsPreferredType currently allows more than one type to be considered
-preferred in a category ... so the first pass cannot try to determine a
-unique type, only a category.)
-
-If we find a category for every UNKNOWN arg, then we enter a second loop
-in which we discard candidates.  In this pass we discard a candidate if
-(a) it is of the wrong category, or (b) it is of the right category but
-is not of preferred type in that category, *and* we found candidate(s)
-of preferred type at this slot.
-
-If we end with exactly one candidate then we win.
-
-It is clear in this algorithm that there is no order dependency: the
-conditions for keeping or discarding a candidate are fixed before we
-start the second pass, and do not vary depending on which other
-candidates were discarded before it.
-
-Comments?
-
-			regards, tom lane
-
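The two-pass scheme described in the message above might look roughly like the sketch below. It is an illustration only: `TYPE_CATEGORY`, `PREFERRED` and `resolve_unknowns` are hypothetical names, the category table covers just the types used in the message's examples, and the real logic lives in parse_coerce.c.

```python
# Assumed, heavily reduced category table (illustration only).
TYPE_CATEGORY = {
    "float8": "NUMERIC",
    "float4": "NUMERIC",
    "int4": "NUMERIC",
    "point": "GEOMETRIC",
    "text": "STRING",
}
# Assumed preferred types within their categories.
PREFERRED = {"float8", "text"}


def resolve_unknowns(candidates, unknown_slots):
    """Return the single surviving candidate, or None if ambiguous."""
    # Pass 1: choose a category per UNKNOWN slot; discard nothing yet.
    slot_category = {}
    slot_has_preferred = {}
    for slot in unknown_slots:
        categories = {TYPE_CATEGORY[cand[slot]] for cand in candidates}
        if "STRING" in categories:
            category = "STRING"
        elif len(categories) == 1:
            category = next(iter(categories))
        else:
            return None  # categories conflict: no resolution can be made
        slot_category[slot] = category
        slot_has_preferred[slot] = any(
            TYPE_CATEGORY[cand[slot]] == category and cand[slot] in PREFERRED
            for cand in candidates
        )

    # Pass 2: discard candidates using only the facts fixed in pass 1.
    survivors = []
    for cand in candidates:
        keep = True
        for slot in unknown_slots:
            if TYPE_CATEGORY[cand[slot]] != slot_category[slot]:
                keep = False  # wrong category
            elif slot_has_preferred[slot] and cand[slot] not in PREFERRED:
                keep = False  # right category, but a preferred type exists
        if keep:
            survivors.append(cand)
    return survivors[0] if len(survivors) == 1 else None


# foo('bar', 'baz') with the candidate sets from the message:
print(resolve_unknowns([("float8", "int4"), ("float4", "point")], [0, 1]))
print(resolve_unknowns([("int4", "float8"), ("point", "float4")], [0, 1]))
```

Under these assumptions both orderings of the argument lists are reported as ambiguous, which is the order independence the message asks for.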
-From pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 15:47:47 2001
-Return-path: <pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org>
-Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
-	by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBTKlkT05111
-	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 15:47:46 -0500 (EST)
-Received: from postgresql.org (postgresql.org [64.49.215.8])
-	by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTKhZN74322
-	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 14:43:35 -0600 (CST)
-	(envelope-from pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org)
-Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
-	by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTKaem38452
-	for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 15:36:40 -0500 (EST)
-	(envelope-from pgman@candle.pha.pa.us)
-Received: (from pgman@localhost)
-	by candle.pha.pa.us (8.11.6/8.10.1) id fBTKaTg04256;
-	Sat, 29 Dec 2001 15:36:29 -0500 (EST)
-From: Bruce Momjian <pgman@candle.pha.pa.us>
-Message-ID: <200112292036.fBTKaTg04256@candle.pha.pa.us>
-Subject: Re: [GENERAL] Casting Varchar to Numeric
-In-Reply-To: <20011206150158.O28880-100000@megazone23.bigpanda.com>
-To: Stephan Szabo <sszabo@megazone23.bigpanda.com>
-Date: Sat, 29 Dec 2001 15:36:29 -0500 (EST)
-cc: Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
-X-Mailer: ELM [version 2.4ME+ PL96 (25)]
-MIME-Version: 1.0
-Content-Transfer-Encoding: 7bit
-Content-Type: text/plain; charset=US-ASCII
-Precedence: bulk
-Sender: pgsql-general-owner@postgresql.org
-Status: OR
-
-> On Mon, 3 Dec 2001, Andy Marden wrote:
-> 
-> > Martijn,
-> >
-> > It does work (believe it or not). I've now tried the method you mention
-> > below - that also works and is much nicer. I can't believe that PostgreSQL
-> > can't work this out. Surely implementing an algorithm that understands that
-> > if you can go from a ->b and b->c then you can certainly go from a->c. If
-> 
-> It's more complicated than that (and postgres does some of this but not
-> all), for example the cast text->float8->numeric potentially loses
-> precision and should probably not be an automatic cast for that reason.
-> 
-> > this is viewed as too complex a task for the internals - at least a diagram
-> > or some way of understanding how you should go from a->c would be immensely
-> > helpful wouldn't it! Daunting for anyone picking up the database and trying
-> > to do something simple(!)
-> 
-> There may be a need for documentation on this.  Would you like to write
-> some ;)
-
-OK, I ran some tests:
-	
-	test=> create table test (x text);
-	CREATE
-	test=> insert into test values ('323');
-	INSERT 5122745 1
-	test=> select cast (x as numeric) from test;
-	ERROR:  Cannot cast type 'text' to 'numeric'
-
-I can see problems with automatically casting numeric to text because
-you have to guess the desired format, but going from text to numeric
-seems quite easy to do.  Is there a reason we don't do it?
-
-I can cast to integer and float8 fine:
-	
-	test=> select cast ( x as integer) from test;
-	 ?column? 
-	----------
-	      323
-	(1 row)
-
-	test=> select cast ( x as float8) from test;
-	 ?column? 
-	----------
-	      323
-	(1 row)
-
-- 
-  Bruce Momjian                        |  http://candle.pha.pa.us
-  pgman@candle.pha.pa.us               |  (610) 853-3000
-  +  If your life is a hard drive,     |  830 Blythe Avenue
-  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
-
---------------------------(end of broadcast)---------------------------
-TIP 2: you can get off all lists at once with the unregister command
-    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
-
-From pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 19:10:38 2001
-Return-path: <pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org>
-Received: from west.navpoint.com (west.navpoint.com [207.106.42.13])
-	by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBU0AbT23972
-	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 19:10:37 -0500 (EST)
-Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
-	by west.navpoint.com (8.11.6/8.10.1) with ESMTP id fBTNVj008959
-	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 18:31:45 -0500 (EST)
-Received: from postgresql.org (postgresql.org [64.49.215.8])
-	by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTNQrN78655
-	for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 17:26:53 -0600 (CST)
-	(envelope-from pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org)
-Received: from sss.pgh.pa.us ([192.204.191.242])
-	by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTN8Fm47978
-	for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 18:08:15 -0500 (EST)
-	(envelope-from tgl@sss.pgh.pa.us)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id fBTN7vg20245;
-	Sat, 29 Dec 2001 18:07:57 -0500 (EST)
-To: Bruce Momjian <pgman@candle.pha.pa.us>
-cc: Stephan Szabo <sszabo@megazone23.bigpanda.com>,
-   Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
-Subject: Re: [GENERAL] Casting Varchar to Numeric 
-In-Reply-To: <200112292036.fBTKaTg04256@candle.pha.pa.us> 
-References: <200112292036.fBTKaTg04256@candle.pha.pa.us>
-Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
-	message dated "Sat, 29 Dec 2001 15:36:29 -0500"
-Date: Sat, 29 Dec 2001 18:07:57 -0500
-Message-ID: <20242.1009667277@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Precedence: bulk
-Sender: pgsql-general-owner@postgresql.org
-Status: OR
-
-Bruce Momjian <pgman@candle.pha.pa.us> writes:
-> I can see problems with automatically casting numeric to text because
-> you have to guess the desired format, but going from text to numeric
-> seems quite easy to do.  Is there a reason we don't do it?
-
-I do not think it's a good idea to have implicit casts between text and
-everything under the sun, because that essentially destroys the type
-checking system.  What we need (see previous discussion) is a flag in
-pg_proc that says whether a type conversion function may be invoked
-implicitly or not.  I've got no problem with offering text(numeric) and
-numeric(text) functions that are invoked by explicit function calls or
-casts --- I just don't want the system trying to use them to make
-sense of a bogus query.
-
-> I can cast to integer and float8 fine:
-
-I don't believe that those should be available as implicit casts either.
-They are, at the moment:
-
-regression=# select 33 || 44.0;
- ?column?
----------
- 3344
-(1 row)
-
-Ugh.
-
-			regards, tom lane
-
---------------------------(end of broadcast)---------------------------
-TIP 6: Have you searched our list archives?
-
-http://archives.postgresql.org
-
--- a/doc/TODO.detail/vacuum
+++ b/doc/TODO.detail/vacuum
--- a/doc/TODO.detail/yacc
+++ b/doc/TODO.detail/yacc
@ -1,402 +0,0 @@
-From selkovjr@mcs.anl.gov Sat Jul 25 05:31:05 1998
-Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
-	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA16564
-	for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:31:03 -0400 (EDT)
-Received: from antares.mcs.anl.gov (mcs.anl.gov [140.221.9.6]) by renoir.op.net (o1/$ Revision: 1.18 $) with SMTP id FAA01775 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:28:22 -0400 (EDT)
-Received: from mcs.anl.gov (wit.mcs.anl.gov [140.221.5.148]) by antares.mcs.anl.gov (8.6.10/8.6.10)  with ESMTP
-	id EAA28698 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 04:27:05 -0500
-Sender: selkovjr@mcs.anl.gov
-Message-ID: <35B9968D.21CF60A2@mcs.anl.gov>
-Date: Sat, 25 Jul 1998 08:25:49 +0000
-From: "Gene Selkov, Jr." <selkovjr@mcs.anl.gov>
-Organization: MCS, Argonne Natl. Lab
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.32 i586)
-MIME-Version: 1.0
-To: Bruce Momjian <maillist@candle.pha.pa.us>
-Subject: position-aware scanners
-References: <199807250524.BAA07296@candle.pha.pa.us>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-Bruce,
-
-I attached here (through the web links) a couple of examples, totally
-irrelevant to postgres but good enough to discuss token locations. I
-might as well try to patch the backend parser, though not sure how soon.
-
-
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-1. 
-
-The first c parser I wrote,
-http://wit.mcs.anl.gov/~selkovjr/unit-troff.tgz, is not very
-sophisticated, so token locations reported by yyerror() may be slightly
-incorrect (+/- one position depending on the existence and type of the
-lookahead token). It is a filter used to typeset the units of measurement
-with eqn. To use it, unpack the tar file and run make. The Makefile is
-not too generic but I built it on various systems including linux,
-freebsd and sunos 4.3. The invocation can be something like this:
-
-./check 0 parse "l**3/(mmoll*min)"
-parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
-`'(''
-
-l**3/(mmoll*min)
-      ^^^^^
-
-Now to the guts. As far as I can imagine, the only way to consistently
-keep track of each character read by the scanner (regardless of the
-length of expressions it will match) is to redefine its YY_INPUT like
-this:
-
-#undef YY_INPUT
-#define YY_INPUT(buf,result,max_size) \
-{ \
-	int c	= (int) buffer[pos++]; \
-	result = (c == '\0') ?	YY_NULL	: (buf[0] = c, 1); \
-}
-
-Here, buffer is the pointer to the origin of the string being scanned
-and pos is a global variable, similar in usage to a file pointer (you
-can both read and manipulate it at will). The buffer and the pointer are
-initialized by the function 
-
-void setString(char *s)
-{
-   buffer = s;
-   pos = 0;
-}
-
-each time the new string is to be parsed. This (exportable) function is
-part of the interface. 
-
-In this simplistic design, yyerror() is part of the scanner module and
-it uses the pos variable to report the location of unexpected tokens.
-The downside of such an arrangement is that, in case of an error, you
-can't easily tell whether your context is the current or the lookahead
-token; it just reports the position of the last token read (be it $ (end
-of buffer) or something else):
-
-./check 0 convert "mol/foo"
-parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
-`'(''
-
-mol/foo
-       ^^^
-
-(should be at the beginning of "foo")
-
-./check 0 convert "mmol//l"        
-parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
-`'(''
-
-mmol//l
-    ^
-
-(should be at the second '/')
-
-
-I believe this is why most simple parsers made with yacc would report
-parse errors being "at or near" some token, which is fair enough if the
-expression is not too complex.
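
The off-by-one carets above come from reporting the scanner's read position instead of the offending token's own start. A minimal sketch of the alternative, recording a start column per token, is shown here; it is not the code from the tarball, and the token pattern and the names `tokenize` and `caret_line` are assumptions made for illustration.

```python
import re

# Assumed token shapes for the unit expressions in the examples:
# '**', a run of letters, a run of digits, or any single character.
TOKEN_RE = re.compile(r"\*\*|[A-Za-z]+|\d+|.", re.DOTALL)


def tokenize(text):
    """Return (token_text, start_column) pairs, one per token."""
    return [(m.group(0), m.start()) for m in TOKEN_RE.finditer(text)]


def caret_line(token_text, start):
    """Build the marker line that underlines one token."""
    return " " * start + "^" * len(token_text)


# Underline "foo" in "mol/foo" using its recorded start column.
source = "mol/foo"
text, start = tokenize(source)[2]
print(source)
print(caret_line(text, start))
```

With the start column stored alongside each token, the marker no longer depends on how far the scanner had read when the error was raised.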
-
-
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-2. The second version of the same scanner,
-http://wit.mcs.anl.gov/~selkovjr/scanner-example.tgz, addresses this
-problem by recording exact locations of the tokens in each instance of
-the token semantic data structure. The global,
-
-UNIT_YYSTYPE unit_yylval;
-
-would be normally used to export the token semantics (including its
-original or modified text and location data) to the parser.
-Unfortunately, I cannot show you the parser part in c, because that's
-about when I stopped writing parsers in c. Instead, I included a small
-test program, test.c, that mimics the parser's expectations for the
-scanner data pretty well. I am assuming here that you are not interested
-in digging through someone else's ugly guts for a relatively small bit of
-information; let me know if I am wrong and I will send you the complete
-perl code (also generated with bison).
-
-To run this example, unpack the tar file and run make. Then do
-
-  gcc test.c scanner.o
-
-and run a.out
-
-Note the line
-
-    yylval = unit_getyylval();
-
-in test.c. You will not normally need it in a c parser. It is enough to
-define yylval as an external variable and link it to yylval in yylex().
-
-In the bison-generated parser, yylval gets pushed into a stack (pointed
-to by yylsp) each time a new token is read. For each syntax rule, the
-bison macros @1, @2, ... are just shortcuts to locations in the stack 1,
-2, ... levels deep. In the following code fragment, @3 refers to the
-location info for the third term in the rule (INTEGER):
-
-(sorry about perl, but I think you can do the same things in c without
-significant changes to your existing parser)
-
-term:           base    {
-                        $$ = $1;
-                        $$->{'order'} = 1;
-                }
-        |       base EXP INTEGER {
-                        $$ = $1;
-                        $$->{'order'} = @3->{'text'};
-                        $$->{'scale'} = $$->{'scale'} ** $$->{'order'};
-                        if ( $$->{'order'} == 0 ) {
-                                yyerror("Error: expecting a non-zero integer exponent");
-                                YYERROR;
-                        }
-                }
-
-
-which translates to:
-
-  ($yyn == 10)    && do {
-          $yyval = $yyvsa[-1];
-          $yyval->{'order'} = 1;
-          last SWITCH;
-  };
-
-  ($yyn == 11)    && do {
-          $yyval = $yyvsa[-3];
-          $yyval->{'order'} = $yylsa[-1]->{'text'};
-          $yyval->{'scale'} = $yyval->{'scale'} ** $yyval->{'order'};
-          if ( $yyval->{'order'} == 0 ) {
-                   yyerror("Error: expecting a non-zero integer exponent");
-                   goto yyerrlab1 ;
-          }
-          last SWITCH;
-  };
-
-In c, you will have a bit more complicated pointer arithmetic to address
-the stack, but the usage of objects will be the same. Note here that it
-is convenient to keep all information about the token in its location
-info, (yylsa, yylsp, yylval, @n), while everything relating to the value
-of the expression, or to the parse tree, is better placed in the
-semantic stack (yyvsa, yyvsp, yyval, $n). Also note that in some cases
-you can do semantic checks inside rules and report useful messages
-before or instead of invoking yyerror().
-
-Finally, it is useful to make the following wrapper function around
-external yylex() in order to maintain your own token stack. Unlike the
-parser's internal stack which is only as deep as the rule being reduced,
-this one can hold all tokens recognized during the current run, and that
-can be extremely helpful for error reporting and any transformations you
-may need. In this way, you can even scan (tokenize) the whole buffer
-before handing it off to the parser (who knows, you may need a token
-ahead of what is currently seen by the parser):
-
-
-sub tokenize {
-    undef @tokenTable;
-    my ($tok, $text, $name, $unit, $first_line, $first_column,
-        $last_line, $last_column);
-    
-    # this is where the c-coded yylex is called;
-    # UnitLex is the perl extension encapsulating it
-    while ( ($tok = &UnitLex::yylex()) > 0 ) {
-       ( $text, $name, $unit, $first_line, $first_column, $last_line,
-         $last_column ) = &UnitLex::getyylval;
-       push(@tokenTable, 
-           Unit::yyltype->new (
-              'token'         => $tok,
-              'text'          => $text,
-              'name'          => $name,
-              'unit'          => $unit,
-              'first_line'    => $first_line,
-              'first_column'  => $first_column,
-              'last_line'     => $last_line,
-              'last_column'   => $last_column,
-           )
-       )
-    }
-
-}
-
-
-It is now a lot easier to handle various state-related problems, such as
-backtracking and error reporting. The yylex() function as seen by the
-parser might be constructed somewhat like this:
-
-sub yylex {
-    # $tokenNo is a global; now instead of a "file pointer", as in the
-    # first example, we have a "token pointer"
-    $yylloc = $tokenTable[$tokenNo];
-    undef $yylval;
-
-
-    # disregard this; name this block "computing semantic values"       
-    if ( $yylloc->{'token'} == UNIT) {
-        $yylval = Unit::Operand->new(
-        'unit'  => Unit::Dict::unit($yylloc->{'unit'}),
-        'base'  => Unit::Dict::base($yylloc->{'unit'}),
-        'scale' => Unit::Dict::scale($yylloc->{'unit'}),
-        'scaleToBase' => Unit::Dict::scaleToBase($yylloc->{'unit'}),
-        'loc'   => $yylloc,
-       );    
-    }
-    elsif ( ($yylloc->{'token'} == INTEGER ) ||
-            ($yylloc->{'token'} == POSITIVE_NUMBER) ) {
-        $yylval = Unit::Operand->new(
-          'unit' => '1',
-          'base' => '1',
-          'scale' => 1,
-          'scaleToBase' => 1,
-          'loc'   => $yylloc,
-        );
-    }
-
-    $tokenNo++;
-    # This is all the parser needs to know about this token.
-    # But we already made sure we saved everything we need to know.
-    return($yylloc->{'token'});
-}
-
-
-Now the most interesting part, the error reporting routine:
-
-
-sub yyerror {
-    my ($str) = @_;
-    my ($message, $start, $end, $loc);
-
-    $loc = $tokenTable[$tokenNo-1]; # This is the same as to say, 
-                                    # "obtain the location info for the
-current token"
-  
-    # You may call this routine for your own purposes or let the
-    # parser call it.
-    if( $str ne 'parse error' ) {
-        $message = "$str instead of `" . $loc->{'name'} . "' <" .
-            $loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\n";
-    }
-    else {
-        $message = "unexpected token `" . $loc->{'name'} . "' <" .
-            $loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\n";
-    }
-
-    # $parseBuffer is the original string that was used to set the
-    # parser buffer.
-    $message .= $parseBuffer . "\n";
-
-    $message .= ( ' ' x ($loc->{'first_column'} + 1) ) .
-                ( '^' x length($loc->{'text'}) ) . "\n";
-    if( $str ne 'parse error' ) {
-        print STDERR "$str instead of `", $loc->{'name'}, "' {",
-            $loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
-    }
-    else {
-        print STDERR "unexpected token `", $loc->{'name'}, "' {",
-            $loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
-    }
-
-    print STDERR "$parseBuffer\n";
-    print STDERR ' ' x ($loc->{'first_column'} + 1),
-        '^' x length($loc->{'text'}), "\n";
-}
-
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The scanners in these examples assume a single line of input text (the
-first_line and last_line elements of yylloc are simply ignored). To
-parse multi-line buffers, add a lex rule for '\n' that increments the
-line count and resets the pos variable to zero.
-
-
-Ugly as it may seem, I find this approach extremely liberating. If the
-grammar becomes too complicated for a LALR(1) parser, I can cascade
-multiple parsers. The token table can then be used to reassemble parts
-of the original expression for subordinate parsers, preserving the
-location info all the way down, so that subordinate parsers can report
-their problems consistently. You probably don't need this, as SQL is
-very well thought out and has a parsable grammar. But it may be of some
-help for error reporting.
-
-
--Gene
-
-From pgsql-patches-owner+M1499@postgresql.org Sat Aug  4 13:11:53 2001
-Return-path: <pgsql-patches-owner+M1499@postgresql.org>
-Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
-	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f74HBrh11339
-	for <pgman@candle.pha.pa.us>; Sat, 4 Aug 2001 13:11:53 -0400 (EDT)
-Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
-	by postgresql.org (8.11.3/8.11.4) with SMTP id f74H89655183;
-	Sat, 4 Aug 2001 13:08:09 -0400 (EDT)
-	(envelope-from pgsql-patches-owner+M1499@postgresql.org)
-Received: from sss.pgh.pa.us ([192.204.191.242])
-	by postgresql.org (8.11.3/8.11.4) with ESMTP id f74Gxb653074
-	for <pgsql-patches@postgresql.org>; Sat, 4 Aug 2001 12:59:37 -0400 (EDT)
-	(envelope-from tgl@sss.pgh.pa.us)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
-	by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f74GtPC29183;
-	Sat, 4 Aug 2001 12:55:25 -0400 (EDT)
-To: Dave Page <dpage@vale-housing.co.uk>
-cc: "'Fernando Nasser'" <fnasser@cygnus.com>,
-   Bruce Momjian <pgman@candle.pha.pa.us>, Neil Padgett <npadgett@redhat.com>,
-   pgsql-patches@postgresql.org
-Subject: Re: [PATCHES] Patch for Improved Syntax Error Reporting 
-In-Reply-To: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk> 
-References: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk>
-Comments: In-reply-to Dave Page <dpage@vale-housing.co.uk>
-	message dated "Sat, 04 Aug 2001 12:37:23 +0100"
-Date: Sat, 04 Aug 2001 12:55:24 -0400
-Message-ID: <29180.996944124@sss.pgh.pa.us>
-From: Tom Lane <tgl@sss.pgh.pa.us>
-Precedence: bulk
-Sender: pgsql-patches-owner@postgresql.org
-Status: OR
-
-Dave Page <dpage@vale-housing.co.uk> writes:
-> Oh, I quite agree. I'm not adverse to updating my code, I just want to avoid
-> users getting misleading messages until I come up with those updates.
-
-Hmm ... if they were actively misleading then I'd share your concern.
-
-I guess what you're thinking is that the error offset reported by the
-backend won't correspond directly to what the user typed, and if the
-user tries to use the offset to manually count off characters, he may
-arrive at the wrong place?  Good point.  I'm not sure whether a message
-like
-
-	ERROR:  parser: parse error at or near 'frum';
-	POSITION: 42
-
-would be likely to encourage people to try that.  Thoughts?  (I do think
-this is a good argument for not embedding the position straight into the
-main error message though...)
-
-One possible compromise is to combine the straight character-offset
-approach with a simplistic context display:
-
-	ERROR:  parser: parse error at or near 'frum';
-	POSITION: 42  ... oid,relname FRUM ...
-
-The idea is to define the "POSITION" field as an integer offset possibly
-followed by whitespace and noise words.  An updated client would grab
-the offset, ignore the rest of the field, and do the right thing.  A
-not-updated client would display the entire message, and with any luck
-the user would read it correctly.
-
-			regards, tom lane
-
-