mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-02-23 19:39:53 +08:00
Remove TODO.detail files that contained useless or very old information.
Update TODO accordingly.
parent
5de02e283f
commit
2b721d3d41
@ -1,542 +0,0 @@
|
||||
From fjoe@iclub.nsu.ru Tue Jan 23 03:38:45 2001
|
||||
Received: from mx.nsu.ru (root@mx.nsu.ru [193.124.215.71])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA14458
|
||||
for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 03:38:24 -0500 (EST)
|
||||
Received: from iclub.nsu.ru (root@iclub.nsu.ru [193.124.222.66])
|
||||
by mx.nsu.ru (8.9.1/8.9.0) with ESMTP id OAA29153;
|
||||
Tue, 23 Jan 2001 14:31:27 +0600 (NOVT)
|
||||
Received: from localhost (fjoe@localhost)
|
||||
by iclub.nsu.ru (8.11.1/8.11.1) with ESMTP id f0N8VOr15273;
|
||||
Tue, 23 Jan 2001 14:31:25 +0600 (NS)
|
||||
(envelope-from fjoe@iclub.nsu.ru)
|
||||
Date: Tue, 23 Jan 2001 14:31:24 +0600 (NS)
|
||||
From: Max Khon <fjoe@iclub.nsu.ru>
|
||||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
Subject: Re: [HACKERS] Bug in FOREIGN KEY
|
||||
In-Reply-To: <200101230416.XAA04293@candle.pha.pa.us>
|
||||
Message-ID: <Pine.BSF.4.21.0101231429310.12474-100000@iclub.nsu.ru>
|
||||
MIME-Version: 1.0
|
||||
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||||
Status: RO
|
||||
|
||||
hi, there!
|
||||
|
||||
On Mon, 22 Jan 2001, Bruce Momjian wrote:
|
||||
|
||||
>
|
||||
> > This problem with foreign keys has been reported to me, and I have confirmed
|
||||
> > the bug exists in current sources. The DELETE should succeed:
|
||||
> >
|
||||
> > ---------------------------------------------------------------------------
|
||||
> >
|
||||
> > CREATE TABLE primarytest2 (
> >     col1 INTEGER,
> >     col2 INTEGER,
> >     PRIMARY KEY(col1, col2)
> > );
> >
> > CREATE TABLE foreigntest2 (col3 INTEGER,
> >     col4 INTEGER,
> >     FOREIGN KEY (col3, col4) REFERENCES primarytest2
> > );
> > test=> BEGIN;
> > BEGIN
> > test=> INSERT INTO primarytest2 VALUES (5,5);
> > INSERT 27618 1
> > test=> DELETE FROM primarytest2 WHERE col1 = 5 AND col2 = 5;
> > ERROR: triggered data change violation on relation "primarytest2"
|
||||
|
||||
I have another (slightly different) example:
|
||||
--- cut here ---
|
||||
test=> CREATE TABLE pr(obj_id int PRIMARY KEY);
NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'pr_pkey' for table 'pr'
CREATE
test=> CREATE TABLE fr(obj_id int REFERENCES pr ON DELETE CASCADE);
NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
CREATE
test=> BEGIN;
BEGIN
test=> INSERT INTO pr (obj_id) VALUES (1);
INSERT 200539 1
test=> INSERT INTO fr (obj_id) SELECT obj_id FROM pr;
INSERT 200540 1
test=> DELETE FROM fr;
ERROR: triggered data change violation on relation "fr"
test=>
|
||||
--- cut here ---
|
||||
|
||||
we are running postgresql 7.1 beta3
|
||||
|
||||
/fjoe
|
||||
|
||||
|
||||
From sszabo@megazone23.bigpanda.com Tue Jan 23 13:41:55 2001
|
||||
Received: from megazone23.bigpanda.com (rfx-64-6-210-138.users.reflexcom.com [64.6.210.138])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA19924
|
||||
for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 13:41:54 -0500 (EST)
|
||||
Received: from localhost (sszabo@localhost)
|
||||
by megazone23.bigpanda.com (8.11.1/8.11.1) with ESMTP id f0NIfLa41018;
|
||||
Tue, 23 Jan 2001 10:41:21 -0800 (PST)
|
||||
Date: Tue, 23 Jan 2001 10:41:21 -0800 (PST)
|
||||
From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
|
||||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
cc: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
|
||||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
Subject: Re: [HACKERS] Bug in FOREIGN KEY
|
||||
In-Reply-To: <200101230417.XAA04332@candle.pha.pa.us>
|
||||
Message-ID: <Pine.BSF.4.21.0101231031290.40955-100000@megazone23.bigpanda.com>
|
||||
MIME-Version: 1.0
|
||||
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||||
Status: RO
|
||||
|
||||
|
||||
> > Think I misinterpreted the SQL3 specs WR to this detail. The
|
||||
> > checks must be made per statement, not at the transaction
|
||||
> > level. I'll try to fix it, but we need to define what will
|
||||
> > happen with referential actions in the case of conflicting
|
||||
> > actions on the same key - there are some possible conflicts:
|
||||
> >
|
||||
> > 1. DEFERRED ON DELETE NO ACTION or RESTRICT
|
||||
> >
|
||||
> > Do the referencing rows reference to the new PK row with
|
||||
> > the same key now, or is this still a constraint
|
||||
> > violation? I would say it's not, because the constraint
|
||||
> > condition is satisfied at the end of the transaction. How
|
||||
> > do other databases behave?
|
||||
> >
|
||||
> > 2. DEFERRED ON DELETE CASCADE, SET NULL or SET DEFAULT
|
||||
> >
|
||||
> > Again I'd say that the action should be suppressed
|
||||
> > because a matching PK row is present at transaction end -
|
||||
> > it's not the same old row, but the constraint itself is
|
||||
> > still satisfied.
|
||||
|
||||
I'm not actually sure on the cascade, set null and set default. The
|
||||
way they are written seems to imply to me that it's based on the state
|
||||
of the database before/after the command in question as opposed to the
|
||||
deferred state of the database because of the stuff about updating the
|
||||
state of partially matching rows immediately after the delete/update of
|
||||
the row which wouldn't really make sense when deferred. Does anyone know
|
||||
what other systems do with a case something like this all in a
|
||||
transaction:
|
||||
|
||||
create table a (a int primary key);
|
||||
create table b (b int references a match full on update cascade
|
||||
on delete cascade deferrable initially deferred);
|
||||
insert into a values (1);
|
||||
insert into a values (2);
|
||||
insert into b values (1);
|
||||
delete from a where a=1;
|
||||
select * from b;
|
||||
commit;
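
[Editor's sketch, not part of the original mail.] One way to probe the question above on another system is a small script against SQLite through Python's standard sqlite3 module. This is only an illustration under stated assumptions: it shows how SQLite treats a deferred foreign key, not how PostgreSQL 7.1 behaved, and it uses the default NO ACTION instead of the cascade actions in the example above. The table names a and b follow the mail; the helper names open_db, delete_then_reinsert and delete_without_reinsert are hypothetical.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a deferred foreign key from b to a."""
    # isolation_level=None lets the script issue BEGIN/COMMIT itself.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE a (a INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE b (b INTEGER REFERENCES a(a) DEFERRABLE INITIALLY DEFERRED)"
    )
    return conn


def delete_then_reinsert(conn: sqlite3.Connection) -> int:
    """Delete a referenced key and re-insert the same key before COMMIT.

    Returns the number of rows left in b after the transaction commits.
    """
    conn.execute("BEGIN")
    conn.execute("INSERT INTO a VALUES (1)")
    conn.execute("INSERT INTO b VALUES (1)")
    conn.execute("DELETE FROM a WHERE a = 1")  # b temporarily dangles
    conn.execute("INSERT INTO a VALUES (1)")   # a new row with the same key
    conn.execute("COMMIT")                     # deferred check runs here
    return conn.execute("SELECT count(*) FROM b").fetchone()[0]


def delete_without_reinsert(conn: sqlite3.Connection) -> bool:
    """Delete a referenced key and try to commit; return True if COMMIT succeeds."""
    conn.execute("BEGIN")
    conn.execute("INSERT INTO a VALUES (2)")
    conn.execute("INSERT INTO b VALUES (2)")
    conn.execute("DELETE FROM a WHERE a = 2")
    try:
        conn.execute("COMMIT")
    except sqlite3.IntegrityError:
        # The failed COMMIT leaves the transaction open, so roll it back.
        conn.execute("ROLLBACK")
        return False
    return True
```

In this sketch the deferred check is satisfied by any row carrying the key value at commit time, which corresponds to the "key value, not a particular row" reading discussed in this thread.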
|
||||
|
||||
|
||||
From pgsql-hackers-owner+M3901@postgresql.org Fri Jan 26 17:00:24 2001
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10576
|
||||
for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 17:00:24 -0500 (EST)
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLtVq53019;
|
||||
Fri, 26 Jan 2001 16:55:31 -0500 (EST)
|
||||
(envelope-from pgsql-hackers-owner+M3901@postgresql.org)
|
||||
Received: from smtp1b.mail.yahoo.com (smtp3.mail.yahoo.com [128.11.68.135])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QLqmq52691
|
||||
for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 16:52:48 -0500 (EST)
|
||||
(envelope-from janwieck@yahoo.com)
|
||||
Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
|
||||
by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 22:49:57 -0000
|
||||
X-Apparently-From: <janwieck@yahoo.com>
|
||||
Received: (from janwieck@localhost)
|
||||
by jupiter.greatbridge.com (8.9.3/8.9.3) id RAA04701;
|
||||
Fri, 26 Jan 2001 17:02:32 -0500
|
||||
From: Jan Wieck <janwieck@Yahoo.com>
|
||||
Message-Id: <200101262202.RAA04701@jupiter.greatbridge.com>
|
||||
Subject: Re: [HACKERS] Bug in FOREIGN KEY
|
||||
In-Reply-To: <200101262110.QAA06902@candle.pha.pa.us> from Bruce Momjian at "Jan
|
||||
26, 2001 04:10:22 pm"
|
||||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
Date: Fri, 26 Jan 2001 17:02:32 -0500 (EST)
|
||||
CC: Jan Wieck <janwieck@Yahoo.com>, Peter Eisentraut <peter_e@gmx.net>,
|
||||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
|
||||
MIME-Version: 1.0
|
||||
Content-Type: text/plain; charset=US-ASCII
|
||||
Content-Transfer-Encoding: 7bit
|
||||
Precedence: bulk
|
||||
Sender: pgsql-hackers-owner@postgresql.org
|
||||
Status: RO
|
||||
|
||||
Bruce Momjian wrote:
|
||||
> Here is another bug:
|
||||
>
|
||||
> test=> begin;
|
||||
> BEGIN
|
||||
> test=> INSERT INTO primarytest2 VALUES (5,5);
|
||||
> INSERT 18757 1
|
||||
> test=> UPDATE primarytest2 SET col2=1 WHERE col1 = 5 AND col2 = 5;
|
||||
> ERROR: deferredTriggerGetPreviousEvent: event for tuple (0,10) not
|
||||
> found
|
||||
|
||||
Schema?
|
||||
|
||||
|
||||
Jan
|
||||
|
||||
--
|
||||
|
||||
#======================================================================#
|
||||
# It's easier to get forgiveness for being wrong than for being right. #
|
||||
# Let's break this rule - forgive me. #
|
||||
#================================================== JanWieck@Yahoo.com #
|
||||
|
||||
|
||||
|
||||
_________________________________________________________
|
||||
Do You Yahoo!?
|
||||
Get your free @yahoo.com address at http://mail.yahoo.com
|
||||
|
||||
|
||||
From pgsql-hackers-owner+M3864@postgresql.org Fri Jan 26 10:07:36 2001
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA17732
|
||||
for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 10:07:35 -0500 (EST)
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QF3lq12782;
|
||||
Fri, 26 Jan 2001 10:03:47 -0500 (EST)
|
||||
(envelope-from pgsql-hackers-owner+M3864@postgresql.org)
|
||||
Received: from mailout00.sul.t-online.com (mailout00.sul.t-online.com [194.25.134.16])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0QF0Yq12614
|
||||
for <pgsql-hackers@postgresql.org>; Fri, 26 Jan 2001 10:00:34 -0500 (EST)
|
||||
(envelope-from peter_e@gmx.net)
|
||||
Received: from fwd01.sul.t-online.com
|
||||
by mailout00.sul.t-online.com with smtp
|
||||
id 14MALp-0006Im-00; Fri, 26 Jan 2001 15:59:45 +0100
|
||||
Received: from peter.localdomain (520083510237-0001@[212.185.245.73]) by fmrl01.sul.t-online.com
|
||||
with esmtp id 14MALQ-1Z0gkaC; Fri, 26 Jan 2001 15:59:20 +0100
|
||||
Date: Fri, 26 Jan 2001 16:07:27 +0100 (CET)
|
||||
From: Peter Eisentraut <peter_e@gmx.net>
|
||||
To: Hiroshi Inoue <Inoue@tpf.co.jp>
|
||||
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
Subject: Re: [HACKERS] Open 7.1 items
|
||||
In-Reply-To: <3A70FA87.933B3D51@tpf.co.jp>
|
||||
Message-ID: <Pine.LNX.4.30.0101261604030.769-100000@peter.localdomain>
|
||||
MIME-Version: 1.0
|
||||
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||||
X-Sender: 520083510237-0001@t-dialin.net
|
||||
Precedence: bulk
|
||||
Sender: pgsql-hackers-owner@postgresql.org
|
||||
Status: RO
|
||||
|
||||
Hiroshi Inoue writes:
|
||||
|
||||
> What does this item mean ?
|
||||
> Is it the following ?
|
||||
>
|
||||
> begin;
|
||||
> insert into pk (id) values (1);
|
||||
> update(delete from) pk where id=1;
|
||||
> ERROR: triggered data change violation on relation "pk"
|
||||
>
|
||||
> If so, isn't it a simple bug ?
|
||||
|
||||
Depends on the definition of "bug". It's not spec compliant and it's not
|
||||
documented and it's annoying. But it's been like this for a year and the
|
||||
issue is well known and can normally be avoided. It looks like a
|
||||
documentation to-do to me.
|
||||
|
||||
--
|
||||
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
|
||||
|
||||
|
||||
From pgsql-hackers-owner+M3876@postgresql.org Fri Jan 26 13:07:10 2001
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA26086
|
||||
for <pgman@candle.pha.pa.us>; Fri, 26 Jan 2001 13:07:09 -0500 (EST)
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI4Vq30248;
|
||||
Fri, 26 Jan 2001 13:04:31 -0500 (EST)
|
||||
(envelope-from pgsql-hackers-owner+M3876@postgresql.org)
|
||||
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0QI3Aq30098
|
||||
for <pgsql-hackers@postgreSQL.org>; Fri, 26 Jan 2001 13:03:11 -0500 (EST)
|
||||
(envelope-from vmikheev@SECTORBASE.COM)
|
||||
Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
|
||||
id <D49FAF71>; Fri, 26 Jan 2001 09:41:23 -0800
|
||||
Message-ID: <8F4C99C66D04D4118F580090272A7A234D32C1@sectorbase1.sectorbase.com>
|
||||
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||||
To: "'Jan Wieck'" <janwieck@Yahoo.com>,
|
||||
PostgreSQL HACKERS
|
||||
<pgsql-hackers@postgresql.org>,
|
||||
Bruce Momjian <root@candle.pha.pa.us>
|
||||
Subject: RE: [HACKERS] Open 7.1 items
|
||||
Date: Fri, 26 Jan 2001 10:02:59 -0800
|
||||
MIME-Version: 1.0
|
||||
X-Mailer: Internet Mail Service (5.5.2653.19)
|
||||
Content-Type: text/plain;
|
||||
charset="iso-8859-1"
|
||||
Precedence: bulk
|
||||
Sender: pgsql-hackers-owner@postgresql.org
|
||||
Status: RO
|
||||
|
||||
> > FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
|
||||
>
|
||||
> A well known issue, and I've asked multiple times how exactly
|
||||
> we want to define the behaviour for deferred constraints. Do
|
||||
> foreign keys reference just to a key value and are happy with
|
||||
> its existence, or do they refer to a particular row?
|
||||
|
||||
I think the first. The latter is closer to the OODBMS world, not to the [O]RDBMS one.
|
||||
|
||||
> Consider you have a deferred "ON DELETE CASCADE" constraint
|
||||
> and do a DELETE, INSERT of a PK. Do the FK rows need to be
|
||||
> deleted or not?
|
||||
|
||||
Good example. I think FK should not be deleted. If someone really
|
||||
wants to delete the "old" FK rows, he can do
|
||||
|
||||
DELETE PK;
|
||||
SET CONSTRAINTS ... IMMEDIATE; -- FK rows need to be deleted here
|
||||
INSERT PK;
|
||||
|
||||
> Consider you have a deferred "ON DELETE RESTRICT" and "ON
|
||||
> UPDATE CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
|
||||
> to PK1, the FK2 rows need to follow, but does PK2 inherit all
|
||||
> FK1 rows now so it's the master of both groups?
|
||||
|
||||
Yes. Again, one can use SET CONSTRAINTS to achieve the desired results.
|
||||
It seems that SET CONSTRAINTS was designed for this purpose, i.e.
|
||||
for better flexibility.
|
||||
|
||||
Though it would be better to look at how other databases handle all these
|
||||
cases -:)
|
||||
|
||||
Vadim
|
||||
|
||||
From janwieck@yahoo.com Fri Jan 26 12:20:27 2001
|
||||
Received: from smtp6.mail.yahoo.com (smtp6.mail.yahoo.com [128.11.69.103])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA22158
|
||||
for <root@candle.pha.pa.us>; Fri, 26 Jan 2001 12:20:27 -0500 (EST)
|
||||
Received: from j13.us.greatbridge.com (HELO jupiter.greatbridge.com) (216.54.52.153)
|
||||
by smtp.mail.vip.suc.yahoo.com with SMTP; 26 Jan 2001 17:20:26 -0000
|
||||
X-Apparently-From: <janwieck@yahoo.com>
|
||||
Received: (from janwieck@localhost)
|
||||
by jupiter.greatbridge.com (8.9.3/8.9.3) id MAA03196;
|
||||
Fri, 26 Jan 2001 12:30:05 -0500
|
||||
From: Jan Wieck <janwieck@yahoo.com>
|
||||
Message-Id: <200101261730.MAA03196@jupiter.greatbridge.com>
|
||||
Subject: Re: [HACKERS] Open 7.1 items
|
||||
To: PostgreSQL HACKERS <pgsql-hackers@postgreSQL.org>,
|
||||
Bruce Momjian <root@candle.pha.pa.us>
|
||||
Date: Fri, 26 Jan 2001 12:30:05 -0500 (EST)
|
||||
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
|
||||
MIME-Version: 1.0
|
||||
Content-Type: text/plain; charset=US-ASCII
|
||||
Content-Transfer-Encoding: 7bit
|
||||
Status: RO
|
||||
|
||||
Bruce Momjian wrote:
|
||||
> Here are my open 7.1 items. Thanks for shrinking the list so far.
|
||||
>
|
||||
> ---------------------------------------------------------------------------
|
||||
>
|
||||
> FreeBSD locale bug
|
||||
> Reorder INSERT firing in rules
|
||||
|
||||
I don't recall why this is wanted. AFAIK there's no reason
|
||||
NOT to do so, except for the actual state of being far too
|
||||
close to a release candidate.
|
||||
|
||||
> Philip Warner UPDATE crash
|
||||
> JDBC LargeObject short read return value missing
|
||||
> SELECT cash_out(1) crashes all backends
|
||||
> LAZY VACUUM
|
||||
> FOREIGN KEY INSERT & UPDATE/DELETE in transaction "change violation"
|
||||
|
||||
A well known issue, and I've asked multiple times how exactly
|
||||
we want to define the behaviour for deferred constraints. Do
|
||||
foreign keys reference just to a key value and are happy with
|
||||
its existence, or do they refer to a particular row?
|
||||
|
||||
Consider you have a deferred "ON DELETE CASCADE" constraint
|
||||
and do a DELETE, INSERT of a PK. Do the FK rows need to be
|
||||
deleted or not?
|
||||
|
||||
Consider you have a deferred "ON DELETE RESTRICT" and "ON
|
||||
UPDATE CASCADE" constraint. If you DELETE PK1 and UPDATE PK2
|
||||
to PK1, the FK2 rows need to follow, but does PK2 inherit all
|
||||
FK1 rows now so it's the master of both groups?
|
||||
|
||||
These are only two possible combinations. There are many to
|
||||
think of. As said, I've asked before, but no one has voted yet.
|
||||
Move the item to 7.2 anyway, because changing this behaviour
|
||||
would require massive changes in the trigger queue *and* the
|
||||
generic RI triggers, which cannot be tested enough any more.
|
||||
|
||||
|
||||
Jan
|
||||
|
||||
> Usernames limited in length
|
||||
> Does pg_dump preserve COMMENTs?
|
||||
> Failure of nested cursors in JDBC
|
||||
> JDBC setMaxRows() is global variable affecting other objects
|
||||
> Does JDBC Makefile need current dir?
|
||||
> Fix for pg_dump of bad system tables
|
||||
> Steve Howe failure query with rules
|
||||
> ODBC/JDBC not disconnecting properly?
|
||||
> Magnus Hagander ODBC issues?
|
||||
> Merge MySQL/PgSQL translation scripts
|
||||
> Fix ipcclean on Linux
|
||||
> Merge global and template BKI files?
|
||||
>
|
||||
>
|
||||
> --
|
||||
> Bruce Momjian | http://candle.pha.pa.us
|
||||
> pgman@candle.pha.pa.us | (610) 853-3000
|
||||
> + If your life is a hard drive, | 830 Blythe Avenue
|
||||
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||||
>
|
||||
|
||||
|
||||
--
|
||||
|
||||
#======================================================================#
|
||||
# It's easier to get forgiveness for being wrong than for being right. #
|
||||
# Let's break this rule - forgive me. #
|
||||
#================================================== JanWieck@Yahoo.com #
|
||||
|
||||
|
||||
_________________________________________________________
|
||||
Do You Yahoo!?
|
||||
Get your free @yahoo.com address at http://mail.yahoo.com
|
||||
|
||||
|
||||
From pgsql-general-owner+M590@postgresql.org Tue Nov 14 16:30:40 2000
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA22313
|
||||
for <pgman@candle.pha.pa.us>; Tue, 14 Nov 2000 17:30:39 -0500 (EST)
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAEMSJs66979;
|
||||
Tue, 14 Nov 2000 17:28:21 -0500 (EST)
|
||||
(envelope-from pgsql-general-owner+M590@postgresql.org)
|
||||
Received: from megazone23.bigpanda.com (138.210.6.64.reflexcom.com [64.6.210.138])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAEMREs66800
|
||||
for <pgsql-general@postgresql.org>; Tue, 14 Nov 2000 17:27:14 -0500 (EST)
|
||||
(envelope-from sszabo@megazone23.bigpanda.com)
|
||||
Received: from localhost (sszabo@localhost)
|
||||
by megazone23.bigpanda.com (8.11.1/8.11.0) with ESMTP id eAEMPpH69059;
|
||||
Tue, 14 Nov 2000 14:25:51 -0800 (PST)
|
||||
Date: Tue, 14 Nov 2000 14:25:51 -0800 (PST)
|
||||
From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
|
||||
To: "Beth K. Gatewood" <bethg@mbt.washington.edu>
|
||||
cc: pgsql-general@postgresql.org
|
||||
Subject: Re: [GENERAL] a request for some experienced input.....
|
||||
In-Reply-To: <3A11ACA1.E5D847DD@mbt.washington.edu>
|
||||
Message-ID: <Pine.BSF.4.21.0011141403380.68986-100000@megazone23.bigpanda.com>
|
||||
MIME-Version: 1.0
|
||||
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||||
Precedence: bulk
|
||||
Sender: pgsql-general-owner@postgresql.org
|
||||
Status: OR
|
||||
|
||||
|
||||
On Tue, 14 Nov 2000, Beth K. Gatewood wrote:
|
||||
|
||||
> >
|
||||
>
|
||||
> Stephan-
|
||||
>
|
||||
> Thank you so much for taking the effort to answer these questions. Your
|
||||
> help is truly appreciated....
|
||||
>
|
||||
> I just have a few points for clarification.
|
||||
>
|
||||
> >
|
||||
> > MATCH PARTIAL is a specific match type which describes which rows are
|
||||
> > considered matching rows for purposes of meeting or failing the
|
||||
> > constraint. (In match partial, a fktable (NULL, 2) would match a pk
|
||||
> > table (1,2) as well as a pk table (2,2). It's different from match
|
||||
> > full in which case (NULL,2) would be invalid or match unspecified
|
||||
> > in which case it would match due to the existence of the NULL in any
|
||||
> > case). There are some bizarre implementation details involved with
|
||||
> > it and it's different from the others in ways that make it difficult.
|
||||
> > It's in my list of things to do, but I haven't come up with an acceptable
|
||||
> > mechanism in my head yet.
|
||||
>
|
||||
> Does this mean, currently that I can not have foreign keys with null values?
|
||||
|
||||
Not exactly...
|
||||
|
||||
Match full = In FK row, all columns must be NULL or the value of each
  column must not be null and there is a row in the PK table where
  each referencing column equals the corresponding referenced
  column.

Unspecified = In FK row, at least one column must be NULL or each
  referencing column shall be equal to the corresponding referenced
  column in some row of the referenced table

Match partial is similar to match full except we ignore the null columns
  for purposes of the each referencing column equals bit.

For example:
  PK Table Key values: (1,2), (1,3), (3,3)
  Attempted FK Table Key values: (1,2), (1,NULL), (5,NULL), (NULL, NULL)
  (hopefully I get this right)...
  In match full, only the 1st and 4th fk values are valid.
  In match partial, the 1st, 2nd, and 4th fk values are valid.
  In match unspecified, all the fk values are valid.
|
||||
|
||||
The other note is that generally speaking, all three are basically the
|
||||
same for the single column key. If you're only doing references on one
|
||||
column, the match type is mostly meaningless.
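
[Editor's sketch, not part of the original mail.] The three match types described above can be written down as a small predicate. This is a minimal sketch that follows the wording of this mail, not the text of the SQL standard and not PostgreSQL's trigger code; the function name fk_row_is_valid and the use of None for NULL are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# A key is a tuple of column values; None stands in for SQL NULL.
Key = Tuple[Optional[int], ...]


def fk_row_is_valid(fk: Key, pk_rows: Sequence[Key], match: str) -> bool:
    """Return True if the referencing key `fk` satisfies the constraint.

    `match` is one of "full", "partial" or "unspecified".
    """
    nulls = [value is None for value in fk]
    if all(nulls):
        # An all-NULL referencing key is accepted by every match type.
        return True
    if match == "full":
        # No mixing of NULL and non-NULL columns; the whole key must exist.
        return not any(nulls) and fk in pk_rows
    if match == "unspecified":
        # Any NULL column is enough to satisfy the constraint.
        return any(nulls) or fk in pk_rows
    if match == "partial":
        # Ignore NULL columns; the remaining columns must match some PK row.
        return any(
            all(f is None or f == p for f, p in zip(fk, pk)) for pk in pk_rows
        )
    raise ValueError(f"unknown match type: {match!r}")
```

Applied to the key values from the example above, the predicate accepts the 1st and 4th rows under match full, the 1st, 2nd and 4th under match partial, and all four under match unspecified.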
|
||||
|
||||
> > PENDANT adds that for each row of the referenced table the values of
|
||||
> > the specified column(s) are the same as the values of the specified
|
||||
> > column(s) in some row of the referencing tables.
|
||||
>
|
||||
> I am not sure I know what you mean here.....Are you saying that the value for
|
||||
> the FK column must match the value for the PK column?
|
||||
|
||||
I haven't really looked at PENDANT, the above was just a small rewrite of
|
||||
some descriptive text in the sql99 draft I have. There's a whole bunch
|
||||
of rules in the actual text of the referential constraint definition.
|
||||
|
||||
The base stuff seems to be: (Rf is the referencing columns, T is the
|
||||
referenced table)
|
||||
|
||||
3) If PENDANT is specified, then:
|
||||
a) For a given row in the referencing table, let pendant
|
||||
reference designate an instance in which all Rf are
|
||||
non-null.
|
||||
|
||||
b) Let number of pendant paths be the number of pendant
|
||||
references to the same referenced row in a referenced table
|
||||
from all referencing rows in all base tables.
|
||||
|
||||
c) For every row in T, the number of pendant paths is equal to
|
||||
or greater than 1.
|
||||
|
||||
So, I'd read it as every row in T must have at least one referencing row
|
||||
in some base table.
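
[Editor's sketch, not part of the original mail.] Read this way, the PENDANT rule can be checked with a few lines of code. The sketch below only encodes the reading given in this mail (every referenced row needs at least one referencing row whose columns are all non-null); the function name pendant_satisfied and the use of None for NULL are hypothetical.

```python
from typing import Iterable, Optional, Sequence, Tuple

Key = Tuple[Optional[int], ...]


def pendant_satisfied(
    referenced_keys: Iterable[Key], referencing_rows: Sequence[Key]
) -> bool:
    """Return True if every referenced key has at least one pendant reference."""
    # Rule a): a pendant reference is a referencing row with no NULL columns.
    pendant_refs = [
        row for row in referencing_rows if all(v is not None for v in row)
    ]
    # Rules b) and c): count the pendant paths per referenced row and
    # require the count to be at least 1.
    return all(
        sum(1 for row in pendant_refs if row == key) >= 1
        for key in referenced_keys
    )
```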
|
||||
|
||||
There are some details about updates and that you can't mix PENDANT and
|
||||
MATCH PARTIAL or SET DEFAULT actions.
|
||||
|
||||
> > The main issues in 7.0 are that older versions (might be fixed in
|
||||
> > 7.0.3) would fail very badly if you used alter table to rename tables that
|
||||
> > were referenced in a fk constraint and that you need to give update
|
||||
> > permission to the referenced table. For the former, 7.1 will (and 7.0.3
|
||||
> > may) give an elog(ERROR) to you rather than crashing the backend and the
|
||||
> > latter should be fixed for 7.1 (although you still need to have write
|
||||
> > perms to the referencing table for referential actions to work properly)
|
||||
>
|
||||
> Are the steps to this outlined somewhere then?
|
||||
|
||||
The permissions stuff is just a matter of using GRANT and REVOKE to set
|
||||
the permissions that a user has to a table.
|
||||
|
||||
|
@ -1,129 +0,0 @@
|
||||
From pgsql-hackers-owner+M908@postgresql.org Sun Nov 19 14:27:43 2000
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA10885
|
||||
for <pgman@candle.pha.pa.us>; Sun, 19 Nov 2000 14:27:42 -0500 (EST)
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAJJSMs83653;
|
||||
Sun, 19 Nov 2000 14:28:22 -0500 (EST)
|
||||
(envelope-from pgsql-hackers-owner+M908@postgresql.org)
|
||||
Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46] (may be forged))
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAJJQns83565
|
||||
for <pgsql-hackers@postgreSQL.org>; Sun, 19 Nov 2000 14:26:49 -0500 (EST)
|
||||
(envelope-from pgman@candle.pha.pa.us)
|
||||
Received: (from pgman@localhost)
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) id OAA06790;
|
||||
Sun, 19 Nov 2000 14:23:06 -0500 (EST)
|
||||
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
Message-Id: <200011191923.OAA06790@candle.pha.pa.us>
|
||||
Subject: Re: [HACKERS] WAL fsync scheduling
|
||||
In-Reply-To: <002101c0525e$2d964480$b97a30d0@sectorbase.com> "from Vadim Mikheev
|
||||
at Nov 19, 2000 11:23:19 am"
|
||||
To: Vadim Mikheev <vmikheev@sectorbase.com>
|
||||
Date: Sun, 19 Nov 2000 14:23:06 -0500 (EST)
|
||||
CC: Tom Samplonius <tom@sdf.com>, Alfred@candle.pha.pa.us,
|
||||
Perlstein <bright@wintelcom.net>, Larry@candle.pha.pa.us,
|
||||
Rosenman <ler@lerctr.org>,
|
||||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
|
||||
MIME-Version: 1.0
|
||||
Content-Transfer-Encoding: 7bit
|
||||
Content-Type: text/plain; charset=US-ASCII
|
||||
Precedence: bulk
|
||||
Sender: pgsql-hackers-owner@postgresql.org
|
||||
Status: OR
|
||||
|
||||
[ Charset ISO-8859-1 unsupported, converting... ]
|
||||
> > There are two parts to transaction commit. The first is writing all
|
||||
> > dirty buffers or log changes to the kernel, and second is fsync of the
|
||||
> ^^^^^^^^^^^^
|
||||
> Backend doesn't write any dirty buffer to the kernel at commit time.
|
||||
|
||||
Yes, I suspected that.
|
||||
|
||||
>
|
||||
> > log file.
|
||||
>
|
||||
> The first part is writing commit record into WAL buffers in shmem.
|
||||
> This is what XLogInsert does. After that XLogFlush is called to ensure
|
||||
> that entire commit record is on disk. XLogFlush does *both* write() and
|
||||
> fsync() (single slock is used for both writing and fsyncing) if it needs to
|
||||
> do it at all.
|
||||
|
||||
Yes, I realize there are new steps in WAL.
|
||||
|
||||
>
|
||||
> > I suggest having a per-backend shared memory byte that has the following
|
||||
> > values:
|
||||
> >
|
||||
> > START_LOG_WRITE
|
||||
> > WAIT_ON_FSYNC
|
||||
> > NOT_IN_COMMIT
|
||||
> > backend_number_doing_fsync
|
||||
> >
|
||||
> > I suggest that when each backend starts a commit, it sets its byte to
|
||||
> > START_LOG_WRITE.
|
||||
> ^^^^^^^^^^^^^^^^^^^^^^^
|
||||
> Isn't START_COMMIT more meaningful?
|
||||
|
||||
Yes.
|
||||
|
||||
>
|
||||
> > When it gets ready to fsync, it checks all backends.
|
||||
> ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
> What do you mean by this? The moment just after XLogInsert?
|
||||
|
||||
Just before it calls fsync().
|
||||
|
||||
>
|
||||
> > If all are NOT_IN_COMMIT, it does fsync and continues.
|
||||
>
|
||||
> 1st edition:
|
||||
> > If one or more are in START_LOG_WRITE, it waits until no one is in
|
||||
> > START_LOG_WRITE. It then checks all WAIT_ON_FSYNC, and if it is the
|
||||
> > lowest backend in WAIT_ON_FSYNC, marks all others with its backend
|
||||
> > number, and does fsync. It then clears all backends with its number to
|
||||
> > NOT_IN_COMMIT. Other backend will see they are not the lowest
|
||||
> > WAIT_ON_FSYNC and will wait for their byte to be set to NOT_IN_COMMIT
|
||||
> > so they can then continue, knowing their data was synced.
|
||||
>
|
||||
> 2nd edition:
|
||||
> > I have another idea. If a backend gets to the point that it needs
|
||||
> > fsync, and there is another backend in START_LOG_WRITE, it can go to an
|
||||
> > interruptible sleep, knowing another backend will perform the fsync and
|
||||
> > wake it up. Therefore, there is no busy-wait or timed sleep.
|
||||
> >
|
||||
> > Of course, a backend must set its status to WAIT_ON_FSYNC to avoid a
|
||||
> > race condition.
|
||||
>
|
||||
> The 2nd edition is much better. But I'm not sure we really need
|
||||
> these per-backend bytes in shmem. Why not just have some counters?
|
||||
> We can use a semaphore to wake-up all waiters at once.
|
||||
|
||||
Yes, that is much better and clearer. My idea was just to say, "if no
|
||||
one is entering commit phase, do the commit. If someone else is coming,
|
||||
sleep and wait for them to do the fsync and wake me up with a signal."
|
||||
|
||||
>
|
||||
> > This allows a single backend not to sleep, and allows multiple backends
|
||||
> > to bunch up only when they are all about to commit.
|
||||
> >
|
||||
> > The reason backend numbers are written is so other backends entering the
|
||||
> > commit code will not interfere with the backends performing fsync.
|
||||
>
|
||||
> A woken-up backend can check what's written/fsynced by calling XLogFlush.
|
||||
|
||||
Seems that may not be needed anymore with a counter. The only issue is
|
||||
that other backends may enter commit while fsync() is happening. The
|
||||
process that did the fsync must be sure to wake up only the backends
|
||||
that were waiting for it, and not other backends that may also be
|
||||
doing fsync as a group while the first fsync was happening. I leave
|
||||
those details to people more experienced. :-)
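
[Editor's sketch, not part of the original mail.] The counter-based variant discussed above can be sketched with a condition variable. This is an illustration only, using Python threads in place of backends; it is not the PostgreSQL WAL code, and the class name GroupCommitLog, its fields, and the injected fsync callable are all assumptions. Each commit records a position, one waiter becomes the "leader" and calls fsync for everything written so far, and the others sleep until the flushed counter covers their position. A commit that arrives while an fsync is running is not covered by it and waits for the next round, which is the issue raised in the paragraph above.

```python
import threading
from typing import Callable


class GroupCommitLog:
    """Toy group commit: one fsync can cover several concurrent commits."""

    def __init__(self, fsync: Callable[[], None]) -> None:
        self._fsync = fsync                # stand-in for the real fsync() call
        self._cond = threading.Condition()
        self._written = 0                  # last commit record written to the log
        self._flushed = 0                  # last commit record known to be on disk
        self._flush_in_progress = False

    def commit(self) -> int:
        """Write a commit record and return once it is durable."""
        with self._cond:
            self._written += 1
            my_record = self._written
            while True:
                if self._flushed >= my_record:
                    return my_record       # someone else's fsync covered us
                if not self._flush_in_progress:
                    # Become the leader for everything written so far.
                    self._flush_in_progress = True
                    target = self._written
                    break
                self._cond.wait()          # sleep until a leader wakes us
        self._fsync()                      # done outside the lock
        with self._cond:
            self._flushed = target
            self._flush_in_progress = False
            self._cond.notify_all()        # wake waiters; late arrivals re-check
        return my_record
```

A single committer never sleeps in this sketch, and several committers only bunch up when an fsync is already in flight.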
|
||||
|
||||
I am just glad people liked my idea.
|
||||
|
||||
--
|
||||
Bruce Momjian | http://candle.pha.pa.us
|
||||
pgman@candle.pha.pa.us | (610) 853-3000
|
||||
+ If your life is a hard drive, | 830 Blythe Avenue
|
||||
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||||
|
File diff suppressed because it is too large
@ -1,102 +0,0 @@
From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
Mon, 11 May 1998 11:14:43 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:14:43 -0400
Message-ID: <24913.894899683@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO

Brett McCormick <brett@work.chicken.org> writes:
> same way that the current network socket is passed -- through an execv
> argument. hopefully, however, the non-execv()ing fork will be in 6.4.

Um, you missed the point, Brett. David was hoping to transfer a client
connection from the postmaster to an *already existing* backend process.
Fork, with or without exec, solves the problem for a backend that's
started after the postmaster has accepted the client socket.

This does lead to a different line of thought, however. Pre-started
backends would have access to the "master" connection socket on which
the postmaster listens for client connections, right? Suppose that we
fire the postmaster as postmaster, and demote it to being simply a
manufacturer of new backend processes as old ones get used up. Have
one of the idle backend processes be the one doing the accept() on the
master socket. Once it has a client connection, it performs the
authentication handshake and then starts serving the client (or just
quits if authentication fails). Meanwhile the next idle backend process
has executed accept() on the master socket and is waiting for the next
client; and shortly the postmaster/factory/whateverwecallitnow notices
that it needs to start another backend to add to the idle-backend pool.

This'd probably need some interlocking among the backends. I have no
idea whether it'd be safe to have all the idle backends trying to
do accept() on the master socket simultaneously, but it sounds risky.
Better to use a mutex so that only one gets to do it while the others
sleep.
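A minimal sketch of the accept() interlock described above, under loud assumptions: threads stand in for pre-started backend processes, a loopback TCP socket stands in for the master socket, and the function name `run_prestarted_pool` is hypothetical. Each idle worker takes the mutex, accepts exactly one client, releases the mutex so the next idle worker can accept, and then "serves" the client by sending back its own id.

```python
import socket
import threading


def run_prestarted_pool(num_backends):
    """Start idle workers, connect one client per worker, return who served each."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    listener.listen(num_backends)
    port = listener.getsockname()[1]
    accept_mutex = threading.Lock()        # only one idle worker sits in accept()

    def backend(worker_id):
        with accept_mutex:                 # the other idle workers sleep here
            conn, _addr = listener.accept()
        # Mutex released: the next idle worker may now call accept().
        with conn:
            conn.sendall(str(worker_id).encode("ascii"))  # stand-in for serving

    workers = [threading.Thread(target=backend, args=(i,))
               for i in range(num_backends)]
    for worker in workers:
        worker.start()

    served_by = []
    for _ in range(num_backends):
        with socket.create_connection(("127.0.0.1", port)) as client:
            data = b""
            while True:                    # read until the worker closes the socket
                chunk = client.recv(64)
                if not chunk:
                    break
                data += chunk
            served_by.append(int(data.decode("ascii")))

    for worker in workers:
        worker.join()
    listener.close()
    return served_by
```

Which worker answers which client depends on scheduling, so only the set of worker ids is predictable, not their order.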

regards, tom lane


From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
Received: from hub.org (hub.org [209.47.148.200])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
Mon, 11 May 1998 11:26:44 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:26:44 -0400
Message-ID: <25004.894900404@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO

Meanwhile, *I* missed the point about Brett's second comment :-(

Brett McCormick <brett@work.chicken.org> writes:
> There will have to be some sort of arg parsing in any case,
> considering that you can pass configurable arguments to the backend..

If we do the sort of change David and I were just discussing, then the
pre-spawned backend would become responsible for parsing and dealing
with the PGOPTIONS portion of the client's connection request message.
That's just part of shifting the authentication handshake code from
postmaster to backend, so it shouldn't be too hard.

BUT: the whole point is to be able to initialize the backend before it
is connected to a client. How much of the expensive backend startup
work depends on having the client connection options available?
Any work that needs to know the options will have to wait until after
the client connects. If that means most of the startup work can't
happen in advance anyway, then we're out of luck; a pre-started backend
won't save enough time to be worth the effort. (Unless we are willing
to eliminate or redefine the troublesome options...)

regards, tom lane

@ -1319,3 +1319,105 @@ DDI: +64(4)916-7201 MOB: +64(21)635-694 OFFICE: +64(4)499-2267
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.17 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
Mon, 11 May 1998 11:14:43 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:14:43 -0400
Message-ID: <24913.894899683@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO

Brett McCormick <brett@work.chicken.org> writes:
> same way that the current network socket is passed -- through an execv
> argument. hopefully, however, the non-execv()ing fork will be in 6.4.

Um, you missed the point, Brett. David was hoping to transfer a client
connection from the postmaster to an *already existing* backend process.
Fork, with or without exec, solves the problem for a backend that's
started after the postmaster has accepted the client socket.

This does lead to a different line of thought, however. Pre-started
backends would have access to the "master" connection socket on which
the postmaster listens for client connections, right? Suppose that we
fire the postmaster as postmaster, and demote it to being simply a
manufacturer of new backend processes as old ones get used up. Have
one of the idle backend processes be the one doing the accept() on the
master socket. Once it has a client connection, it performs the
authentication handshake and then starts serving the client (or just
quits if authentication fails). Meanwhile the next idle backend process
has executed accept() on the master socket and is waiting for the next
client; and shortly the postmaster/factory/whateverwecallitnow notices
that it needs to start another backend to add to the idle-backend pool.

This'd probably need some interlocking among the backends. I have no
idea whether it'd be safe to have all the idle backends trying to
do accept() on the master socket simultaneously, but it sounds risky.
Better to use a mutex so that only one gets to do it while the others
sleep.

regards, tom lane


From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
Received: from hub.org (hub.org [209.47.148.200])
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
Mon, 11 May 1998 11:26:44 -0400 (EDT)
To: Brett McCormick <brett@work.chicken.org>
cc: hackers@postgreSQL.org
Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh]
In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT)
<13655.4384.345723.466046@abraxas.scene.com>
Date: Mon, 11 May 1998 11:26:44 -0400
Message-ID: <25004.894900404@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Sender: owner-pgsql-hackers@hub.org
Precedence: bulk
Status: RO

Meanwhile, *I* missed the point about Brett's second comment :-(

Brett McCormick <brett@work.chicken.org> writes:
> There will have to be some sort of arg parsing in any case,
> considering that you can pass configurable arguments to the backend..

If we do the sort of change David and I were just discussing, then the
pre-spawned backend would become responsible for parsing and dealing
with the PGOPTIONS portion of the client's connection request message.
That's just part of shifting the authentication handshake code from
postmaster to backend, so it shouldn't be too hard.

BUT: the whole point is to be able to initialize the backend before it
is connected to a client. How much of the expensive backend startup
work depends on having the client connection options available?
Any work that needs to know the options will have to wait until after
the client connects. If that means most of the startup work can't
happen in advance anyway, then we're out of luck; a pre-started backend
won't save enough time to be worth the effort. (Unless we are willing
to eliminate or redefine the troublesome options...)

regards, tom lane


File diff suppressed because it is too large
File diff suppressed because it is too large
@ -1,916 +0,0 @@
From pgsql-hackers-owner+M1833@hub.org Sat May 13 22:49:26 2000
Received: from news.tht.net (news.hub.org [216.126.91.242])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07394
for <pgman@candle.pha.pa.us>; Sat, 13 May 2000 22:49:24 -0400 (EDT)
Received: from hub.org (majordom@hub.org [216.126.84.1])
by news.tht.net (8.9.3/8.9.3) with ESMTP id WAB99859;
Sat, 13 May 2000 22:44:15 -0400 (EDT)
(envelope-from pgsql-hackers-owner+M1833@hub.org)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by hub.org (8.9.3/8.9.3) with ESMTP id WAA51058
for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:41:16 -0400 (EDT)
(envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA18343
for <pgsql-hackers@postgreSQL.org>; Sat, 13 May 2000 22:40:38 -0400 (EDT)
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Proposal for fixing numeric type-resolution issues
Date: Sat, 13 May 2000 22:40:38 -0400
Message-ID: <18340.958272038@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ORr

We've got a collection of problems that are related to the parser's
inability to make good type-resolution choices for numeric constants.
In some cases you get a hard error; for example "NumericVar + 4.4"
yields
ERROR: Unable to identify an operator '+' for types 'numeric' and 'float8'
You will have to retype this query using an explicit cast
because "4.4" is initially typed as float8 and the system can't figure
out whether to use numeric or float8 addition. A more subtle problem
is that a query like "... WHERE Int2Var < 42" is unable to make use of
an index on the int2 column: 42 is resolved as int4, so the operator
is int24lt, which works but is not in the opclass of an int2 index.

Here is a proposal for fixing these problems. I think we could get this
done for 7.1 if people like it.

The basic problem is that there's not enough smarts in the type resolver
about the interrelationships of the numeric datatypes. All it has is
a concept of a most-preferred type within the category of numeric types.
(We are abusing the most-preferred-type mechanism, BTW, because both
FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
category! This is in fact why the resolver can't make a choice for
"numeric+float8".) We need more intelligence than that.

I propose that we set up a strictly-ordered hierarchy of numeric
datatypes, running from least preferred to most preferred:
int2, int4, int8, numeric, float4, float8.
Rather than simply considering coercions to the most-preferred type,
the type resolver should use the following rules:

1. No value will be down-converted (eg int4 to int2) except by an
explicit conversion.

2. If there is not an exact matching operator, numeric values will be
up-converted to the highest numeric datatype present among the operator
or function's arguments. For example, given "int2 + int8" we'd
up-convert the int2 to int8 and apply int8 addition.
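The two rules above can be sketched as a small function. This is an illustration only, not the parser's code: the names `NUMERIC_HIERARCHY` and `resolve_numeric_operator` are hypothetical, and the sketch does not check that an operator actually exists for the up-converted types.

```python
# Least preferred to most preferred, in the order given in the proposal.
NUMERIC_HIERARCHY = ["int2", "int4", "int8", "numeric", "float4", "float8"]


def rank(type_name):
    """Position of a type in the hierarchy; a higher rank is more preferred."""
    return NUMERIC_HIERARCHY.index(type_name)


def resolve_numeric_operator(op, arg_types, exact_operators):
    """Return the argument types that the operator should be applied with.

    exact_operators is a set of (op, (type, ...)) pairs assumed to exist.
    """
    arg_types = tuple(arg_types)
    if (op, arg_types) in exact_operators:
        return arg_types                     # exact match: no conversion at all
    highest = max(arg_types, key=rank)       # rule 2: highest type present wins
    return (highest,) * len(arg_types)       # rule 1: nothing is down-converted
```

With an empty operator set, `resolve_numeric_operator("+", ("int2", "int8"), set())` yields `("int8", "int8")`, matching the "int2 + int8" example.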

The final piece of the puzzle is that the type initially assigned to
an undecorated numeric constant should be NUMERIC if it contains a
decimal point or exponent, and otherwise the smallest of int2, int4,
int8, NUMERIC that will represent it. This is a considerable change
from the current lexer behavior, where you get either int4 or float8.
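A minimal sketch of the proposed literal-typing rule, assuming the usual two's-complement ranges for int2, int4, and int8. The function name `initial_literal_type` is hypothetical, and the input is assumed to be an unsigned, well-formed numeric constant.

```python
import re

# Exclusive upper bounds of the signed integer types, smallest type first.
INTEGER_LIMITS = [("int2", 2 ** 15), ("int4", 2 ** 31), ("int8", 2 ** 63)]


def initial_literal_type(literal):
    """Type initially assigned to an undecorated numeric constant."""
    if re.search(r"[.eE]", literal):
        return "numeric"                 # decimal point or exponent present
    value = int(literal)
    for type_name, limit in INTEGER_LIMITS:
        if -limit <= value < limit:
            return type_name             # smallest integer type that fits
    return "numeric"                     # too large even for int8
```

For instance, "42" comes out as int2 and "100000" as int4, while "4.4" and "1.2E34" both start out as numeric.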

For example, given "NumericVar + 4.4", the constant 4.4 will initially
be assigned type NUMERIC, we will resolve the operator as numeric plus,
and everything's fine. Given "Float8Var + 4.4", the constant is still
initially numeric, but will be up-converted to float8 so that float8
addition can be used. The end result is the same as in traditional
Postgres: you get float8 addition. Given "Int2Var < 42", the constant
is initially typed as int2, since it fits, and we end up selecting
int2lt, thereby allowing use of an int2 index. (On the other hand,
given "Int2Var < 100000", we'd end up using int4lt, which is correct
to avoid overflow.)

A couple of crucial subtleties here:

1. We are assuming that the parser or optimizer will constant-fold
any conversion functions that are introduced. Thus, in the
"Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
time execution begins, so there's no performance loss.

2. We cannot lose precision by initially representing a constant as
numeric and later converting it to float. Nor can we exceed NUMERIC's
range (the default 1000-digit limit is more than the range of IEEE
float8 data). It would not work as well to start out by representing
a constant as float and then converting it to numeric.

Presently, the pg_proc and pg_operator tables contain a pretty fair
collection of cross-datatype numeric operators, such as int24lt,
float48pl, etc. We could perhaps leave these in, but I believe that
it is better to remove them. For example, if int42lt is left in place,
then it would capture cases like "Int4Var < 42", whereas we need that
to be translated to int4lt so that an int4 index can be used. Removing
these operators will eliminate some code bloat and system-catalog bloat
to boot.

As far as I can tell, this proposal is almost compatible with the rules
given in SQL92: in particular, SQL92 specifies that an operator having
both "approximate numeric" (float) and "exact numeric" (int or numeric)
inputs should deliver an approximate-numeric result. I propose
deviating from SQL92 in a single respect: SQL92 specifies that a
constant containing an exponent (eg 1.2E34) is approximate numeric,
which implies that the result of an operator using it is approximate
even if the other operand is exact. I believe it's better to treat
such a constant as exact (ie, type NUMERIC) and only convert it to
float if the other operand is float. Without doing that, an assignment
like
UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
will not work as desired because the constant will be prematurely
coerced to float, causing precision loss.
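The precision loss from premature coercion can be checked outside the database. In this sketch Python's `Decimal` stands in for NUMERIC and Python's `float` (an IEEE double on common platforms) stands in for float8; the helper name `survives_float8` is hypothetical.

```python
from decimal import Decimal


def survives_float8(literal):
    """True if the literal's exact value is unchanged by a trip through float8."""
    exact = Decimal(literal)               # arbitrary precision, like NUMERIC
    via_float = Decimal(float(literal))    # exact value of the nearest double
    return exact == via_float


# The constant from the UPDATE above has 25 significant digits,
# more than a double can hold.
lossy = not survives_float8("1.234567890123456789012345E34")
```

A value such as 4.5, which is exactly representable in binary, survives the round trip, so the damage is specific to constants that need more precision than float8 offers.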

Comments?

regards, tom lane

From tgl@sss.pgh.pa.us Sun May 14 17:30:56 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05808
for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:30:52 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.4 $) with ESMTP id RAA16657 for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 17:29:52 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA20914;
Sun, 14 May 2000 17:29:30 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] type conversion discussion
In-reply-to: <200005141950.PAA04636@candle.pha.pa.us>
References: <200005141950.PAA04636@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Sun, 14 May 2000 15:50:20 -0400"
Date: Sun, 14 May 2000 17:29:30 -0400
Message-ID: <20911.958339770@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: OR

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> At some point, it seems we need to get all the PostgreSQL minds together
> to discuss type conversion issues. These problems continue to come up
> from release to release. We are getting better, but it seems a full
> discussion could help solidify our strategy.

OK, here are a few things that bug me about the current type-resolution
code:

1. Poor choice of type to attribute to numeric literals. (A possible
solution is sketched in my earlier message, but do we need similar
mechanisms for other type categories?)

2. Tensions between treating string literals as "unknown" type and
as "text" type, per this thread so far.

3. IS_BINARY_COMPATIBLE seems like a bogus concept. Do we really want a
fully symmetrical ring of types in each group? I'd prefer to see a
one-way equivalence, which allows eg. OID to be silently converted
to INT4, but *not* vice versa (except perhaps by specific user cast).
This'd be more like a traditional "is-a" or inheritance relationship
between datatypes, which has well-understood semantics.

4. I'm also concerned that the behavior of IS_BINARY_COMPATIBLE isn't
very predictable because it will happily go either way. For example,
if I do
select * from pg_class where oid = 1234;
it's unclear whether I will get an oideq or an int4eq operator ---
and that's a rather critical point since only one of them can exploit
an index on the oid column. Currently, there is some klugery in the
planner that works around this by overriding the parser's choice of
operator to substitute one that is compatible with an available index.
That's a pretty ugly solution ... I'm not sure I know a better one,
but as long as we're discussing type resolution issues ...

5. Lack of extensibility. There's way too much knowledge hard-wired
into the parser about type categories, preferred types, binary
compatibility, etc. All of it falls down when faced with
user-defined datatypes. If we do something like I suggested with
a hardwired hierarchy of numeric datatypes, it'll get even worse.
All this stuff ought to be driven off fields in pg_type rather than
be hardwired into the code, so that the same concepts can be extended
to user-defined types.

I don't have worked-out proposals for any of these but the first,
but they've all been bothering me for a while.

regards, tom lane

From tgl@sss.pgh.pa.us Sun May 14 21:02:31 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07700
for <pgman@candle.pha.pa.us>; Sun, 14 May 2000 21:02:28 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA21261;
Sun, 14 May 2000 21:03:17 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] type conversion discussion
In-reply-to: <20911.958339770@sss.pgh.pa.us>
References: <200005141950.PAA04636@candle.pha.pa.us> <20911.958339770@sss.pgh.pa.us>
Comments: In-reply-to Tom Lane <tgl@sss.pgh.pa.us>
message dated "Sun, 14 May 2000 17:29:30 -0400"
Date: Sun, 14 May 2000 21:03:17 -0400
Message-ID: <21258.958352597@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: OR

Here are the results of some further thoughts about type-conversion
issues. This is not a complete proposal yet, but a sketch of an
approach that might solve several of the gripes in my previous proposal.

While thinking about this, I realized that my numeric-types proposal
of yesterday would break at least a few cases that work nicely now.
For example, I frequently do things like
select * from pg_class where oid = 1234;
whilst poking around in system tables and querytree dumps. If that
constant is initially resolved as int2, as I suggested yesterday,
then we have "oid = int2" for which there is no operator. To succeed
we must decide to promote the constant to int4 --- but with no int4
visible among the operands of the "=", it will not work to just "promote
numerics to the highest type seen in the operands" as I suggested
yesterday. So there has to be some more interaction in there.

Anyway, I was complaining about the looseness of the concept of
binary-compatible types and the fact that the parser's type conversion
knowledge is mostly hardwired. These might be resolved by generalizing
the numeric type hierarchy idea into a "type promotion lattice", which
would work like this:

* Add a "typpromote" column to pg_type, which contains either zero or
the OID of another type that the parser is allowed to promote this
type to when searching for usable functions/operators. For example,
my numeric-types hierarchy of yesterday would be expressed by making
int2 promote to int4, int4 to int8, int8 to numeric, numeric to
float4, and float4 to float8. The promotion idea also replaces the
current concept of binary-compatible types: for example, OID would
link to int4 and varchar would link to text (but not vice versa!).

* Also add a "typpromotebin" boolean column to pg_type, which contains
't' if the type conversion indicated by typpromote is "free", ie,
no conversion function need be executed before regarding a value as
belonging to the promoted type. This distinguishes binary-compatible
from non-binary-compatible cases. If "typpromotebin" is 'f' and the
parser decides it needs to apply the conversion, then it has to look
up the appropriate conversion function in pg_proc. (More about this
below.)

Now, if the parser fails to find an exact match for a given function
or operator name and the exact set of input data types, it proceeds by
chasing up the promotion chains for the input data types and trying to
locate a set of types for which there is a matching function/operator.
If there are multiple possibilities, we choose the one which is the
"least promoted" by some yet-to-be-determined metric. (This metric
would probably favor "free" conversions over non-free ones, but other
than that I'm not quite sure how it should work. The metric would
replace a whole bunch of ad-hoc heuristics that are currently applied
in the type resolver, so even if it seems rather ad-hoc it'd still be
cleaner than what we have ;-).)
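The promotion-chain search can be sketched as follows. Everything here is illustrative: the `TYPPROMOTE` table is a hypothetical stand-in for the proposed pg_type column, and because the message leaves the "least promoted" metric undetermined, the sketch simply assumes the total number of promotion steps as the cost (it does not yet favor "free" conversions).

```python
import itertools

# Hypothetical typpromote contents; None plays the role of zero (no promotion).
TYPPROMOTE = {
    "int2": "int4", "int4": "int8", "int8": "numeric",
    "numeric": "float4", "float4": "float8", "float8": None,
    "oid": "int4", "varchar": "text", "text": None,
}


def promotion_chain(type_name):
    """The type itself followed by every type it can be promoted to, in order."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = TYPPROMOTE[type_name]
    return chain


def resolve_operator(op, arg_types, operators):
    """Pick the least-promoted signature for op, or None if nothing matches.

    operators is a set of (op, (type, ...)) pairs assumed to exist.
    Assumed metric: the total number of promotion steps over all arguments.
    """
    chains = [promotion_chain(t) for t in arg_types]
    best_cost, best_signature = None, None
    for candidate in itertools.product(*chains):
        if (op, candidate) not in operators:
            continue
        cost = sum(chain.index(t) for chain, t in zip(chains, candidate))
        if best_cost is None or cost < best_cost:
            best_cost, best_signature = cost, candidate
    return best_signature
```

Given operators for "oid = oid", "int4 = int4", and "int8 = int8", resolving "=" for (oid, int2) settles on (int4, int4), because int2 has no promotion path to oid and int8 costs more steps.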

In a situation like the "oid = int2" example above, this mechanism would
presumably settle on "int4 = int4" as being the least-promoted
equivalent operator. (It could not find "oid = oid" since there is
no promotion path from int2 to oid.) That looks bad since it isn't
compatible with an oidops index --- but I have a solution for that!
I don't think we need the oid opclass at all; why shouldn't indexes
on oid be expressed as int4 indexes to begin with? In general, if
two types are considered binary-equivalent under the old scheme, then
the one that is considered the subtype probably shouldn't have separate
index operators under this new scheme. Instead it should just rely on
the index operators of the promoted type.

The point of the proposed typpromotebin field is to save a pg_proc
lookup when trying to determine whether a particular promotion is "free"
or not. We could save even more lookups if we didn't store the boolean
but instead the actual OID of the conversion function, or zero if the
promotion is "free". The trouble with that is that it creates a
circularity problem when trying to define a new user type --- you can't
define the conversion function if its input type doesn't exist yet.
In any case, we want the parser to do a function lookup if we've
advanced more than one step in the promotion hierarchy: if we've decided
to promote int4 to float8 (which will be a four-step chain through int8,
numeric, float4) we sure want the thing to use a direct int4tofloat8
conversion function if available, not a chain of four conversion
functions. So on balance I think we want to look in pg_proc once we've
decided which conversion to perform. The only reason for having
typpromotebin is that the promotion metric will want to know which
conversions are free, and we don't want to have to do a lookup in
pg_proc for each alternative we consider, only the ones that are finally
selected to be used.
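A sketch of the conversion lookup described above, with several assumptions stated up front: the `TYPPROMOTE` and `TYPPROMOTEBIN` tables and every function name in `conversion_funcs` are hypothetical stand-ins for catalog contents, and falling back to a step-by-step chain when no direct function exists is this sketch's own choice, not something the message specifies.

```python
# Hypothetical catalog contents for illustration only.
TYPPROMOTE = {"int2": "int4", "int4": "int8", "int8": "numeric",
              "numeric": "float4", "float4": "float8", "oid": "int4"}
TYPPROMOTEBIN = {"oid": True}   # promotions that are "free"; all others are not


def plan_conversion(src, dst, conversion_funcs):
    """List the conversion functions to apply, in order, to turn src into dst.

    conversion_funcs maps (source type, target type) to a function name.
    """
    if (src, dst) in conversion_funcs:
        return [conversion_funcs[(src, dst)]]     # a direct function wins
    plan, current = [], src
    while current != dst:
        promoted = TYPPROMOTE.get(current)
        if promoted is None:
            raise LookupError("no promotion path from %s to %s" % (src, dst))
        if not TYPPROMOTEBIN.get(current, False):
            # Non-free step: a conversion function must exist for it.
            plan.append(conversion_funcs[(current, promoted)])
        current = promoted
    return plan
```

With only single-step functions registered, int4 to float8 needs four calls; adding one direct int4-to-float8 entry collapses the plan to a single call, and the free oid-to-int4 promotion needs none.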

I can think of at least one special case that still isn't cleanly
handled under this scheme, and that is bpchar vs. varchar comparison.
Currently, we have

regression=# select 'a'::bpchar = 'a '::bpchar;
?column?
----------
t
(1 row)

This is correct since trailing blanks are insignificant in bpchar land,
so the two values should be considered equal. If we try

regression=# select 'a'::bpchar = 'a '::varchar;
ERROR: Unable to identify an operator '=' for types 'bpchar' and 'varchar'
You will have to retype this query using an explicit cast

which is pretty bogus but at least it saves the system from making some
random choice about whether bpchar or varchar comparison rules apply.
On the other hand,

regression=# select 'a'::bpchar = 'a '::text;
?column?
----------
f
(1 row)

Here the bpchar value has been promoted to text and then text comparison
(where trailing blanks *are* significant) is applied. I'm not sure that
we can really justify doing this in this case when we reject the bpchar
vs varchar case, but maybe someone wants to argue that that's correct.

The natural setup in my type-promotion scheme would be that both bpchar
and varchar link to 'text' as their promoted type. If we do nothing
special then text-style comparison would be used in a bpchar vs varchar
comparison, which is arguably wrong.

One way to deal with this without introducing kluges into the type
resolver is to provide a full set of bpchar vs text and text vs bpchar
operators, and make sure that the promotion metric is such that these
will be used in place of text vs text operators if they apply (which
should hold, I think, for any reasonable metric). This is probably
the only way to get the "right" behavior in any case --- I think that
the "right" behavior for such comparisons is to strip trailing blanks
from the bpchar side but not the text/varchar side. (I haven't checked
to see if SQL92 agrees, though.)
|
||||
|
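The comparison rule suggested above (strip trailing blanks from the bpchar side only) can be sketched in a few lines of C. This is only an illustration under stated assumptions, not PostgreSQL source: the function names rtrim_len, bpchar_text_eq and bpchar_bpchar_eq are made up here, values are plain NUL-terminated strings, and locale and multibyte issues are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of s with trailing blanks ignored (bpchar semantics). */
size_t rtrim_len(const char *s)
{
    size_t n;

    assert(s != NULL);
    n = strlen(s);
    while (n > 0 && s[n - 1] == ' ')
        n--;
    return n;
}

/*
 * Hypothetical bpchar = text operator: trailing blanks are insignificant
 * on the bpchar side but significant on the text side.
 */
bool bpchar_text_eq(const char *bp, const char *txt)
{
    size_t bplen = rtrim_len(bp);

    return bplen == strlen(txt) && memcmp(bp, txt, bplen) == 0;
}

/* bpchar = bpchar: trailing blanks are insignificant on both sides. */
bool bpchar_bpchar_eq(const char *a, const char *b)
{
    size_t alen = rtrim_len(a);

    return alen == rtrim_len(b) && memcmp(a, b, alen) == 0;
}
```

With these definitions bpchar_bpchar_eq("a", "a ") is true, bpchar_text_eq("a ", "a") is true, and bpchar_text_eq("a", "a ") is false, because the blank on the text side still counts.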
||||
Another issue is how to fit resolution of "unknown" literals into this
|
||||
scheme. We could probably continue to handle them more or less as we
|
||||
do now, but they might complicate the promotion metric.
|
||||
|
||||
I am not clear yet on whether we'd still need the concept of "type
|
||||
categories" as they presently exist in the resolver. It's possible
|
||||
that we wouldn't, which would be a nice simplification. (If we do
|
||||
still need them, we should have a column in pg_type that defines the
|
||||
category of a type, instead of hard-wiring category assignments.)
|
||||
|
||||
regards, tom lane
|
||||
|
||||
From e99re41@DoCS.UU.SE Mon May 15 07:39:03 2000
|
||||
Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA10251
|
||||
for <pgman@candle.pha.pa.us>; Mon, 15 May 2000 07:39:01 -0400 (EDT)
|
||||
Received: from Zebra.DoCS.UU.SE (e99re41@Zebra.DoCS.UU.SE [130.238.9.158])
|
||||
by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id NAA10849;
|
||||
Mon, 15 May 2000 13:39:45 +0200 (MET DST)
|
||||
Received: from localhost (e99re41@localhost) by Zebra.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id NAA26523; Mon, 15 May 2000 13:39:44 +0200
|
||||
X-Authentication-Warning: Zebra.DoCS.UU.SE: e99re41 owned process doing -bs
|
||||
Date: Mon, 15 May 2000 13:39:44 +0200 (MET DST)
|
||||
From: Peter Eisentraut <e99re41@DoCS.UU.SE>
|
||||
Reply-To: Peter Eisentraut <peter_e@gmx.net>
|
||||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||||
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
Subject: Re: [HACKERS] type conversion discussion
|
||||
In-Reply-To: <20911.958339770@sss.pgh.pa.us>
|
||||
Message-ID: <Pine.GSO.4.02A.10005151309020.26399-100000@Zebra.DoCS.UU.SE>
|
||||
MIME-Version: 1.0
|
||||
Content-Type: TEXT/PLAIN; charset=iso-8859-1
|
||||
Content-Transfer-Encoding: 8bit
|
||||
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id HAA10251
|
||||
Status: OR
|
||||
|
||||
On Sun, 14 May 2000, Tom Lane wrote:
|
||||
|
||||
> 1. Poor choice of type to attribute to numeric literals. (A possible
|
||||
> solution is sketched in my earlier message, but do we need similar
|
||||
> mechanisms for other type categories?)
|
||||
|
||||
I think your plan looks good for the numerical land. (I'll ponder the oid
|
||||
issues in a second.) For other type categories, perhaps not. Should a line
|
||||
be promoted to a polygon so you can check if it contains a point? Or a
|
||||
polygon to a box? Higher dimensions? :-)
|
||||
|
||||
|
||||
> 2. Tensions between treating string literals as "unknown" type and
|
||||
> as "text" type, per this thread so far.
|
||||
|
||||
Yes, while we're at it, let's look at this in detail. I claim that
|
||||
something of the form 'xxx' should always be text (or char or whatever),
|
||||
period. Let's consider the cases where this could potentially clash with
|
||||
the current behaviour:
|
||||
|
||||
a) The target type is unambiguously clear, e.g., UPDATE ... SET. Then you
|
||||
cast text to the target type. The effect is identical.
|
||||
|
||||
b) The target type is completely unspecified, e.g. CREATE TABLE AS SELECT
|
||||
'xxx'; This will currently create an "unknown" column. It should arguably
|
||||
create a "text" column.
|
||||
|
||||
Function argument resolution:
|
||||
|
||||
c) There is only one function and it has a "text" argument. No-brainer.
|
||||
|
||||
d) There is only one function and it has an argument other than text. Try
|
||||
to cast text to that type. (This is what's done in general, isn't it?)
|
||||
|
||||
e) The function is overloaded for many types, amongst which is text. Then
|
||||
call the text version. I believe this would currently fail, which I'd
|
||||
consider a deficiency.
|
||||
|
||||
f) The function is overloaded for many types, none of which is text. In
|
||||
that case you have to cast anyway, so you don't lose anything.
|
||||
|
||||
One thing to also keep in mind regarding required casting for (b) and (f)
|
||||
is that SQL never allowed literals of "fancy" types (e.g., DATE) to have
|
||||
undecorated 'yyyy-mm-dd' constants; you always have to say DATE
|
||||
'yyyy-mm-dd'. What Postgres allows is a convenience where DATE would be
|
||||
obvious or implied. In the end it's a win-win situation: you tell the
|
||||
system what you want, and your code is clearer.
|
||||
|
||||
|
||||
> 3. IS_BINARY_COMPATIBLE seems like a bogus concept.
|
||||
|
||||
At least it's bogus when used for types which are not actually binary
|
||||
compatible, e.g. int4 and oid. The result of the current implementation is
|
||||
that you can perfectly happily insert and retrieve negative numbers from
|
||||
oid fields.
|
||||
|
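The effect of treating int4 and oid as interchangeable bit patterns can be shown outside the database. The following C sketch assumes that an oid is an unsigned 32-bit value and an int4 a signed one; OidValue, Int4Value and both function names are local stand-ins invented for this illustration, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t OidValue;   /* stand-in for an unsigned 32-bit oid */
typedef int32_t  Int4Value;  /* stand-in for a signed 32-bit int4 */

/*
 * "Binary compatible" reinterpretation: no range check, the value is
 * simply reduced modulo 2^32, so -1 becomes 4294967295.
 */
OidValue int4_as_oid(Int4Value v)
{
    return (OidValue) v;
}

/* Range-checked conversion: returns 0 on success, -1 if v is negative. */
int int4_to_oid_checked(Int4Value v, OidValue *out)
{
    assert(out != NULL);
    if (v < 0)
        return -1;
    *out = (OidValue) v;
    return 0;
}
```

Here int4_as_oid(-1) silently yields 4294967295, while int4_to_oid_checked(-1, &o) reports an error. An explicit conversion function of the second kind is one way the negative-oid case could be rejected.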
||||
I'm not so sure about the value of this particular equivalency anyway.
|
||||
AFAICS the only functions that make sense for oids are comparisons (incl.
|
||||
min, max), adding integers to them, subtracting one oid from another.
|
||||
Silent mangling with int4 means that you can multiply them, square them,
|
||||
add floating point numbers to them (doesn't really work in practice
|
||||
though), all things that have no business with oids.
|
||||
|
||||
I'd say define the operators that are useful for oids explicitly for oids
|
||||
and require casts for all others, so the users know what they're doing.
|
||||
The fact that an oid is also a number should be an implementation detail.
|
||||
|
||||
In my mind oids are like pointers in C. Indiscriminate mangling of
|
||||
pointers and integers in C has long been dismissed as questionable coding.
|
||||
|
||||
|
||||
Of course I'd be very willing to consider counterexamples to these
|
||||
theories ...
|
||||
|
||||
--
|
||||
Peter Eisentraut Sernanders väg 10:115
|
||||
peter_e@gmx.net 75262 Uppsala
|
||||
http://yi.org/peter-e/ Sweden
|
||||
|
||||
|
||||
From tgl@sss.pgh.pa.us Tue Jun 13 04:58:20 2000
|
||||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24281
|
||||
for <pgman@candle.pha.pa.us>; Tue, 13 Jun 2000 03:58:18 -0400 (EDT)
|
||||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA02571;
|
||||
Tue, 13 Jun 2000 03:58:43 -0400 (EDT)
|
||||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
cc: pgsql-hackers@postgresql.org
|
||||
Subject: Re: [HACKERS] Proposal for fixing numeric type-resolution issues
|
||||
In-reply-to: <200006130741.DAA23502@candle.pha.pa.us>
|
||||
References: <200006130741.DAA23502@candle.pha.pa.us>
|
||||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
message dated "Tue, 13 Jun 2000 03:41:56 -0400"
|
||||
Date: Tue, 13 Jun 2000 03:58:43 -0400
|
||||
Message-ID: <2568.960883123@sss.pgh.pa.us>
|
||||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||||
Status: OR
|
||||
|
||||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||||
> Again, anything to add to the TODO here?
|
||||
|
||||
IIRC, there was some unhappiness with the proposal you quote, so I'm
|
||||
not sure we've quite agreed what to do... but clearly something must
|
||||
be done.
|
||||
|
||||
regards, tom lane
|
||||
|
||||
|
||||
>> We've got a collection of problems that are related to the parser's
|
||||
>> inability to make good type-resolution choices for numeric constants.
|
||||
>> In some cases you get a hard error; for example "NumericVar + 4.4"
|
||||
>> yields
|
||||
>> ERROR: Unable to identify an operator '+' for types 'numeric' and 'float8'
|
||||
>> You will have to retype this query using an explicit cast
|
||||
>> because "4.4" is initially typed as float8 and the system can't figure
|
||||
>> out whether to use numeric or float8 addition. A more subtle problem
|
||||
>> is that a query like "... WHERE Int2Var < 42" is unable to make use of
|
||||
>> an index on the int2 column: 42 is resolved as int4, so the operator
|
||||
>> is int24lt, which works but is not in the opclass of an int2 index.
|
||||
>>
|
||||
>> Here is a proposal for fixing these problems. I think we could get this
|
||||
>> done for 7.1 if people like it.
|
||||
>>
|
||||
>> The basic problem is that there's not enough smarts in the type resolver
|
||||
>> about the interrelationships of the numeric datatypes. All it has is
|
||||
>> a concept of a most-preferred type within the category of numeric types.
|
||||
>> (We are abusing the most-preferred-type mechanism, BTW, because both
|
||||
>> FLOAT8 and NUMERIC claim to be the most-preferred type in the numeric
|
||||
>> category! This is in fact why the resolver can't make a choice for
|
||||
>> "numeric+float8".) We need more intelligence than that.
|
||||
>>
|
||||
>> I propose that we set up a strictly-ordered hierarchy of numeric
|
||||
>> datatypes, running from least preferred to most preferred:
|
||||
>> int2, int4, int8, numeric, float4, float8.
|
||||
>> Rather than simply considering coercions to the most-preferred type,
|
||||
>> the type resolver should use the following rules:
|
||||
>>
|
||||
>> 1. No value will be down-converted (eg int4 to int2) except by an
|
||||
>> explicit conversion.
|
||||
>>
|
||||
>> 2. If there is not an exact matching operator, numeric values will be
|
||||
>> up-converted to the highest numeric datatype present among the operator
|
||||
>> or function's arguments. For example, given "int2 + int8" we'd up-
|
||||
>> convert the int2 to int8 and apply int8 addition.
|
||||
>>
|
||||
>> The final piece of the puzzle is that the type initially assigned to
|
||||
>> an undecorated numeric constant should be NUMERIC if it contains a
|
||||
>> decimal point or exponent, and otherwise the smallest of int2, int4,
|
||||
>> int8, NUMERIC that will represent it. This is a considerable change
|
||||
>> from the current lexer behavior, where you get either int4 or float8.
|
||||
>>
|
||||
>> For example, given "NumericVar + 4.4", the constant 4.4 will initially
|
||||
>> be assigned type NUMERIC, we will resolve the operator as numeric plus,
|
||||
>> and everything's fine. Given "Float8Var + 4.4", the constant is still
|
||||
>> initially numeric, but will be up-converted to float8 so that float8
|
||||
>> addition can be used. The end result is the same as in traditional
|
||||
>> Postgres: you get float8 addition. Given "Int2Var < 42", the constant
|
||||
>> is initially typed as int2, since it fits, and we end up selecting
|
||||
>> int2lt, thereby allowing use of an int2 index. (On the other hand,
|
||||
>> given "Int2Var < 100000", we'd end up using int4lt, which is correct
|
||||
>> to avoid overflow.)
|
||||
>>
|
||||
>> A couple of crucial subtleties here:
|
||||
>>
|
||||
>> 1. We are assuming that the parser or optimizer will constant-fold
|
||||
>> any conversion functions that are introduced. Thus, in the
|
||||
>> "Float8Var + 4.4" case, the 4.4 is represented as a float8 4.4 by the
|
||||
>> time execution begins, so there's no performance loss.
|
||||
>>
|
||||
>> 2. We cannot lose precision by initially representing a constant as
|
||||
>> numeric and later converting it to float. Nor can we exceed NUMERIC's
|
||||
>> range (the default 1000-digit limit is more than the range of IEEE
|
||||
>> float8 data). It would not work as well to start out by representing
|
||||
>> a constant as float and then converting it to numeric.
|
||||
>>
|
||||
>> Presently, the pg_proc and pg_operator tables contain a pretty fair
|
||||
>> collection of cross-datatype numeric operators, such as int24lt,
|
||||
>> float48pl, etc. We could perhaps leave these in, but I believe that
|
||||
>> it is better to remove them. For example, if int42lt is left in place,
|
||||
>> then it would capture cases like "Int4Var < 42", whereas we need that
|
||||
>> to be translated to int4lt so that an int4 index can be used. Removing
|
||||
>> these operators will eliminate some code bloat and system-catalog bloat
|
||||
>> to boot.
|
||||
>>
|
||||
>> As far as I can tell, this proposal is almost compatible with the rules
|
||||
>> given in SQL92: in particular, SQL92 specifies that an operator having
|
||||
>> both "approximate numeric" (float) and "exact numeric" (int or numeric)
|
||||
>> inputs should deliver an approximate-numeric result. I propose
|
||||
>> deviating from SQL92 in a single respect: SQL92 specifies that a
|
||||
>> constant containing an exponent (eg 1.2E34) is approximate numeric,
|
||||
>> which implies that the result of an operator using it is approximate
|
||||
>> even if the other operand is exact. I believe it's better to treat
|
||||
>> such a constant as exact (ie, type NUMERIC) and only convert it to
|
||||
>> float if the other operand is float. Without doing that, an assignment
|
||||
>> like
|
||||
>> UPDATE tab SET NumericVar = 1.234567890123456789012345E34;
|
||||
>> will not work as desired because the constant will be prematurely
|
||||
>> coerced to float, causing precision loss.
|
||||
>>
|
||||
>> Comments?
|
||||
>>
|
||||
>> regards, tom lane
|
||||
>>
|
||||
|
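The two rules and the literal-typing rule in the quoted proposal can be sketched as follows. This is a minimal C model under stated assumptions, not parser code: the NumType enum and the functions common_type and literal_type are names invented for the illustration, and a literal is taken to be an unsigned digit string, optionally with a decimal point or exponent.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Ordered from least to most preferred, as in the proposal. */
typedef enum { T_INT2, T_INT4, T_INT8, T_NUMERIC, T_FLOAT4, T_FLOAT8 } NumType;

/*
 * Rule 2: with no exactly matching operator, up-convert to the highest
 * type present among the arguments. Taking the maximum never
 * down-converts, which also satisfies rule 1.
 */
NumType common_type(NumType a, NumType b)
{
    return a > b ? a : b;
}

/* Initial type of an undecorated numeric constant. */
NumType literal_type(const char *lit)
{
    char *end;
    long long v;

    assert(lit != NULL);
    if (strpbrk(lit, ".eE") != NULL)
        return T_NUMERIC;           /* decimal point or exponent */

    errno = 0;
    v = strtoll(lit, &end, 10);
    if (end == lit || *end != '\0' || errno == ERANGE)
        return T_NUMERIC;           /* does not fit in int8 */
    if (v >= INT16_MIN && v <= INT16_MAX)
        return T_INT2;
    if (v >= INT32_MIN && v <= INT32_MAX)
        return T_INT4;
    return T_INT8;
}
```

In this model "Int2Var < 42" resolves to int2 (both sides are T_INT2), "Int2Var < 100000" resolves to int4, "NumericVar + 4.4" stays numeric, and "Float8Var + 4.4" is up-converted to float8.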
||||
|
||||
> --
|
||||
> Bruce Momjian | http://www.op.net/~candle
|
||||
> pgman@candle.pha.pa.us | (610) 853-3000
|
||||
> + If your life is a hard drive, | 830 Blythe Avenue
|
||||
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||||
|
||||
From tgl@sss.pgh.pa.us Mon Jun 12 14:09:45 2000
|
||||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01993
|
||||
for <pgman@candle.pha.pa.us>; Mon, 12 Jun 2000 13:09:43 -0400 (EDT)
|
||||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA01515;
|
||||
Mon, 12 Jun 2000 13:10:01 -0400 (EDT)
|
||||
To: Peter Eisentraut <peter_e@gmx.net>
|
||||
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||||
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu>,
|
||||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||||
Subject: Re: [HACKERS] Adding time to DATE type
|
||||
In-reply-to: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
|
||||
References: <Pine.LNX.4.21.0006110322150.9195-100000@localhost.localdomain>
|
||||
Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
|
||||
message dated "Sun, 11 Jun 2000 13:41:24 +0200"
|
||||
Date: Mon, 12 Jun 2000 13:10:00 -0400
|
||||
Message-ID: <1512.960829800@sss.pgh.pa.us>
|
||||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||||
Status: ORr
|
||||
|
||||
Peter Eisentraut <peter_e@gmx.net> writes:
|
||||
> Bruce Momjian writes:
|
||||
>> Can someone give me a TODO summary for this issue?
|
||||
|
||||
> * make 'text' constants default to text type (not unknown)
|
||||
|
||||
> (I think not everyone's completely convinced on this issue, but I don't
|
||||
> recall anyone being firmly opposed to it.)
|
||||
|
||||
It would be a mistake to eliminate the distinction between unknown and
|
||||
text. See for example my just-posted response to John Cochran on
|
||||
pgsql-general about why 'BOULEVARD'::text behaves differently from
|
||||
'BOULEVARD'::char. If string literals are immediately assigned type
|
||||
text then we will have serious problems with char(n) fields.
|
||||
|
||||
I think it's fine to assign string literals a type of 'unknown'
|
||||
initially. What we need to do is add a phase of type resolution that
|
||||
considers treating them as text, but only after the existing logic fails
|
||||
to deduce a type.
|
||||
|
||||
(BTW it might be better to treat string literals as defaulting to char(n)
|
||||
instead of text, allowing the normal promotion rules to replace char(n)
|
||||
with text if necessary. Not sure if that would make things more or less
|
||||
confusing for operations that intermix fixed- and variable-width char
|
||||
types.)
|
||||
|
||||
regards, tom lane
|
||||
|
||||
From pgsql-hackers-owner+M1936@postgresql.org Sun Dec 10 13:17:54 2000
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA20676
|
||||
for <pgman@candle.pha.pa.us>; Sun, 10 Dec 2000 13:17:54 -0500 (EST)
|
||||
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eBAIGvZ40566;
|
||||
Sun, 10 Dec 2000 13:16:57 -0500 (EST)
|
||||
(envelope-from pgsql-hackers-owner+M1936@postgresql.org)
|
||||
Received: from sss.pgh.pa.us (sss.pgh.pa.us [209.114.132.154])
|
||||
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eBAI8HZ39820
|
||||
for <pgsql-hackers@postgreSQL.org>; Sun, 10 Dec 2000 13:08:17 -0500 (EST)
|
||||
(envelope-from tgl@sss.pgh.pa.us)
|
||||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||||
by sss.pgh.pa.us (8.11.1/8.11.1) with ESMTP id eBAI82o28682;
|
||||
Sun, 10 Dec 2000 13:08:02 -0500 (EST)
|
||||
To: Thomas Lockhart <lockhart@alumni.caltech.edu>
|
||||
cc: pgsql-hackers@postgresql.org
|
||||
Subject: [HACKERS] Unknown-type resolution rules, redux
|
||||
Date: Sun, 10 Dec 2000 13:08:02 -0500
|
||||
Message-ID: <28679.976471682@sss.pgh.pa.us>
|
||||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||||
Precedence: bulk
|
||||
Sender: pgsql-hackers-owner@postgresql.org
|
||||
Status: OR
|
||||
|
||||
parse_coerce.c contains the following conversation --- I believe the
|
||||
first XXX comment is from me and the second from you:
|
||||
|
||||
/*
|
||||
* Still too many candidates? Try assigning types for the unknown
|
||||
* columns.
|
||||
*
|
||||
* We do this by examining each unknown argument position to see if all
|
||||
* the candidates agree on the type category of that slot. If so, and
|
||||
* if some candidates accept the preferred type in that category,
|
||||
* eliminate the candidates with other input types. If we are down to
|
||||
* one candidate at the end, we win.
|
||||
*
|
||||
* XXX It's kinda bogus to do this left-to-right, isn't it? If we
|
||||
* eliminate some candidates because they are non-preferred at the
|
||||
* first slot, we won't notice that they didn't have the same type
|
||||
* category for a later slot.
|
||||
* XXX Hmm. How else would you do this? These candidates are here because
|
||||
* they all have the same number of matches on arguments with explicit
|
||||
* types, so from here on left-to-right resolution is as good as any.
|
||||
* Need a counterexample to see otherwise...
|
||||
*/
|
||||
|
||||
The comment is out of date anyway because it fails to mention the new
|
||||
rule about preferring STRING category. But to answer your request for
|
||||
a counterexample: consider
|
||||
|
||||
SELECT foo('bar', 'baz')
|
||||
|
||||
First, suppose the available candidates are
|
||||
|
||||
foo(float8, int4)
|
||||
foo(float8, point)
|
||||
|
||||
In this case, we examine the first argument position, see that all the
|
||||
candidates agree on NUMERIC category, so we consider resolving the first
|
||||
unknown input to float8. That eliminates neither candidate so we move
|
||||
on to the second argument position. Here there is a conflict of
|
||||
categories so we can't eliminate anything, and we decide the call is
|
||||
ambiguous. That's correct (or at least Operating As Designed ;-)).
|
||||
|
||||
But now suppose we have
|
||||
|
||||
foo(float8, int4)
|
||||
foo(float4, point)
|
||||
|
||||
Here, at the first position we will still see that all candidates agree
|
||||
on NUMERIC category, and then we will eliminate candidate 2 because it
|
||||
isn't the preferred type in that category. Now when we come to the
|
||||
second argument position, there's only one candidate left so there's
|
||||
no category conflict. Result: this call is considered non-ambiguous.
|
||||
|
||||
This means there is a left-to-right bias in the algorithm. For example,
|
||||
the exact same call *would* be considered ambiguous if the candidates'
|
||||
argument orders were reversed:
|
||||
|
||||
foo(int4, float8)
|
||||
foo(point, float4)
|
||||
|
||||
I do not like that. You could maybe argue that earlier arguments are
|
||||
more important than later ones for functions, but it's harder to make
|
||||
that case for binary operators --- and in any case this behavior is
|
||||
extremely difficult to explain in prose.
|
||||
|
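The left-to-right bias can be reproduced with a small model of the behavior described above. This C sketch is an assumption-laden simplification, not the parse_coerce.c code: every argument is an unknown literal, float8 is the only preferred type, a category conflict at any slot makes the call ambiguous, and the names Candidate, category_of, is_preferred and resolve_left_to_right are invented here.

```c
#include <assert.h>
#include <stdbool.h>

#define NARGS 2        /* every argument is an UNKNOWN literal */
#define MAX_CANDS 8

typedef enum { TY_INT4, TY_FLOAT4, TY_FLOAT8, TY_POINT } Type;
typedef enum { CAT_NUMERIC, CAT_GEOMETRIC } Category;
typedef struct { Type arg[NARGS]; } Candidate;

Category category_of(Type t)
{
    return t == TY_POINT ? CAT_GEOMETRIC : CAT_NUMERIC;
}

/* Assumption for this sketch: float8 is the only preferred type. */
bool is_preferred(Type t)
{
    return t == TY_FLOAT8;
}

/* Returns the index of the single surviving candidate, or -1 if ambiguous. */
int resolve_left_to_right(const Candidate *cands, int n)
{
    bool alive[MAX_CANDS];
    int i, slot, winner = -1, survivors = 0;

    assert(n > 0 && n <= MAX_CANDS);
    for (i = 0; i < n; i++)
        alive[i] = true;

    for (slot = 0; slot < NARGS; slot++)
    {
        Category cat = CAT_NUMERIC;
        bool seen = false, have_pref = false;

        /* Only candidates that survived earlier slots are examined. */
        for (i = 0; i < n; i++)
        {
            Category k;

            if (!alive[i])
                continue;
            k = category_of(cands[i].arg[slot]);
            if (!seen)
            {
                cat = k;
                seen = true;
            }
            else if (k != cat)
                return -1;          /* category conflict: ambiguous */
            if (is_preferred(cands[i].arg[slot]))
                have_pref = true;
        }
        if (!have_pref)
            continue;
        /* Eliminate candidates that are non-preferred at this slot. */
        for (i = 0; i < n; i++)
            if (alive[i] && !is_preferred(cands[i].arg[slot]))
                alive[i] = false;
    }

    for (i = 0; i < n; i++)
        if (alive[i])
        {
            winner = i;
            survivors++;
        }
    return survivors == 1 ? winner : -1;
}
```

Given foo(float8, int4) and foo(float4, point) the model picks the first candidate, but with the argument orders reversed it returns -1 (ambiguous), which is the order dependency complained about above.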
||||
To fix this, I think we need to split the loop into two passes.
|
||||
The first pass does *not* remove any candidates. What it does is to
|
||||
look separately at each UNKNOWN-argument position and attempt to deduce
|
||||
a probable category for it, using the following rules:
|
||||
|
||||
* If any candidate has an input type of STRING category, use STRING
|
||||
category; else if all candidates agree on the category, use that
|
||||
category; else fail because no resolution can be made.
|
||||
|
||||
* The first pass must also remember whether any candidates are of a
|
||||
preferred type within the selected category.
|
||||
|
||||
The probable categories and exists-preferred-type booleans are saved in
|
||||
local arrays. (Note this has to be done this way because
|
||||
IsPreferredType currently allows more than one type to be considered
|
||||
preferred in a category ... so the first pass cannot try to determine a
|
||||
unique type, only a category.)
|
||||
|
||||
If we find a category for every UNKNOWN arg, then we enter a second loop
|
||||
in which we discard candidates. In this pass we discard a candidate if
|
||||
(a) it is of the wrong category, or (b) it is of the right category but
|
||||
is not of preferred type in that category, *and* we found candidate(s)
|
||||
of preferred type at this slot.
|
||||
|
||||
If we end with exactly one candidate then we win.
|
||||
|
||||
It is clear in this algorithm that there is no order dependency: the
|
||||
conditions for keeping or discarding a candidate are fixed before we
|
||||
start the second pass, and do not vary depending on which other
|
||||
candidates were discarded before it.
|
||||
|
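The two-pass scheme described above can be sketched in C as follows. This is a toy model under stated assumptions, not a patch: all arguments are unknown literals, float8 and text are the only preferred types, and the names Candidate, category_of, is_preferred and resolve_two_pass are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NARGS 2        /* every argument is an UNKNOWN literal */

typedef enum { TY_INT4, TY_FLOAT4, TY_FLOAT8, TY_POINT, TY_TEXT } Type;
typedef enum { CAT_NUMERIC, CAT_GEOMETRIC, CAT_STRING } Category;
typedef struct { Type arg[NARGS]; } Candidate;

Category category_of(Type t)
{
    if (t == TY_POINT)
        return CAT_GEOMETRIC;
    if (t == TY_TEXT)
        return CAT_STRING;
    return CAT_NUMERIC;
}

/* Assumption for this sketch: float8 and text are the preferred types. */
bool is_preferred(Type t)
{
    return t == TY_FLOAT8 || t == TY_TEXT;
}

/* Returns the index of the single surviving candidate, or -1 if ambiguous. */
int resolve_two_pass(const Candidate *cands, int n)
{
    Category slot_cat[NARGS];
    bool slot_has_pref[NARGS];
    int i, slot, winner = -1, survivors = 0;

    assert(n > 0);

    /* Pass 1: deduce a category per slot; no candidate is removed. */
    for (slot = 0; slot < NARGS; slot++)
    {
        Category first = category_of(cands[0].arg[slot]);
        bool any_string = false, agree = true;

        for (i = 0; i < n; i++)
        {
            Category k = category_of(cands[i].arg[slot]);

            if (k == CAT_STRING)
                any_string = true;
            if (k != first)
                agree = false;
        }
        if (any_string)
            slot_cat[slot] = CAT_STRING;
        else if (agree)
            slot_cat[slot] = first;
        else
            return -1;              /* no resolution can be made */

        /* Remember whether a preferred type exists in that category. */
        slot_has_pref[slot] = false;
        for (i = 0; i < n; i++)
            if (category_of(cands[i].arg[slot]) == slot_cat[slot] &&
                is_preferred(cands[i].arg[slot]))
                slot_has_pref[slot] = true;
    }

    /* Pass 2: discard candidates using only the facts fixed in pass 1. */
    for (i = 0; i < n; i++)
    {
        bool keep = true;

        for (slot = 0; slot < NARGS; slot++)
        {
            Type t = cands[i].arg[slot];

            if (category_of(t) != slot_cat[slot])
                keep = false;       /* (a) wrong category */
            else if (slot_has_pref[slot] && !is_preferred(t))
                keep = false;       /* (b) non-preferred, preferred exists */
        }
        if (keep)
        {
            winner = i;
            survivors++;
        }
    }
    return survivors == 1 ? winner : -1;
}
```

In this model foo(float8, int4) versus foo(float4, point) is ambiguous in either argument order, while foo(float8, text) versus foo(float4, int4) resolves to the first candidate in either argument order.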
||||
Comments?
|
||||
|
||||
regards, tom lane
|
||||
|
||||
From pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 15:47:47 2001
|
||||
Return-path: <pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org>
|
||||
Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
|
||||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBTKlkT05111
|
||||
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 15:47:46 -0500 (EST)
|
||||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||||
by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTKhZN74322
|
||||
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 14:43:35 -0600 (CST)
|
||||
(envelope-from pgsql-general-owner+M18949=candle.pha.pa.us=pgman@postgresql.org)
|
||||
Received: from candle.pha.pa.us (216-55-132-35.dsl.san-diego.abac.net [216.55.132.35])
|
||||
by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTKaem38452
|
||||
for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 15:36:40 -0500 (EST)
|
||||
(envelope-from pgman@candle.pha.pa.us)
|
||||
Received: (from pgman@localhost)
|
||||
by candle.pha.pa.us (8.11.6/8.10.1) id fBTKaTg04256;
|
||||
Sat, 29 Dec 2001 15:36:29 -0500 (EST)
|
||||
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
Message-ID: <200112292036.fBTKaTg04256@candle.pha.pa.us>
|
||||
Subject: Re: [GENERAL] Casting Varchar to Numeric
|
||||
In-Reply-To: <20011206150158.O28880-100000@megazone23.bigpanda.com>
|
||||
To: Stephan Szabo <sszabo@megazone23.bigpanda.com>
|
||||
Date: Sat, 29 Dec 2001 15:36:29 -0500 (EST)
|
||||
cc: Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
|
||||
X-Mailer: ELM [version 2.4ME+ PL96 (25)]
|
||||
MIME-Version: 1.0
|
||||
Content-Transfer-Encoding: 7bit
|
||||
Content-Type: text/plain; charset=US-ASCII
|
||||
Precedence: bulk
|
||||
Sender: pgsql-general-owner@postgresql.org
|
||||
Status: OR
|
||||
|
||||
> On Mon, 3 Dec 2001, Andy Marden wrote:
|
||||
>
|
||||
> > Martijn,
|
||||
> >
|
||||
> > It does work (believe it or not). I've now tried the method you mention
|
||||
> > below - that also works and is much nicer. I can't believe that PostgreSQL
|
||||
> > can't work this out. Surely implementing an algorithm that understands that
|
||||
> > if you can go from a->b and b->c then you can certainly go from a->c. If
|
||||
>
|
||||
> It's more complicated than that (and postgres does some of this but not
|
||||
> all), for example the cast text->float8->numeric potentially loses
|
||||
> precision and should probably not be an automatic cast for that reason.
|
||||
>
|
||||
> > this is viewed as too complex a task for the internals - at least a diagram
|
||||
> > or some way of understanding how you should go from a->c would be immensely
|
||||
> > helpful wouldn't it! Daunting for anyone picking up the database and trying
|
||||
> > to do something simple(!)
|
||||
>
|
||||
> There may be a need for documentation on this. Would you like to write
|
||||
> some ;)
|
||||
|
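The precision concern with a chained text->float8->numeric cast can be demonstrated in plain C, using double as a stand-in for float8. This is only a sketch: survives_float8 is a made-up helper that parses a decimal string into a double, prints it back with the same number of fractional digits, and reports whether the text survived the round trip.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if the decimal string comes back unchanged after a trip
 * through a double, 0 if digits were lost or altered on the way.
 * frac_digits must be the number of digits after the decimal point.
 */
int survives_float8(const char *decimal, int frac_digits)
{
    char buf[128];
    double d;

    assert(decimal != NULL);
    d = strtod(decimal, NULL);
    snprintf(buf, sizeof buf, "%.*f", frac_digits, d);
    return strcmp(buf, decimal) == 0;
}
```

A short value such as "323" or "323.5" survives, but a 25-digit fraction such as "0.1234567890123456789012345" does not, since a double carries only about 15 to 17 significant decimal digits. That is the kind of silent loss an automatic a->b->c cast path could introduce.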
||||
OK, I ran some tests:
|
||||
|
||||
test=> create table test (x text);
|
||||
CREATE
|
||||
test=> insert into test values ('323');
|
||||
INSERT 5122745 1
|
||||
test=> select cast (x as numeric) from test;
|
||||
ERROR: Cannot cast type 'text' to 'numeric'
|
||||
|
||||
I can see problems with automatically casting numeric to text because
|
||||
you have to guess the desired format, but going from text to numeric
|
||||
seems quite easy to do. Is there a reason we don't do it?
|
||||
|
||||
I can cast to integer and float8 fine:
|
||||
|
||||
test=> select cast ( x as integer) from test;
|
||||
?column?
|
||||
----------
|
||||
323
|
||||
(1 row)
|
||||
|
||||
test=> select cast ( x as float8) from test;
|
||||
?column?
|
||||
----------
|
||||
323
|
||||
(1 row)
|
||||
|
||||
--
|
||||
Bruce Momjian | http://candle.pha.pa.us
|
||||
pgman@candle.pha.pa.us | (610) 853-3000
|
||||
+ If your life is a hard drive, | 830 Blythe Avenue
|
||||
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||||
|
||||
---------------------------(end of broadcast)---------------------------
|
||||
TIP 2: you can get off all lists at once with the unregister command
|
||||
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
||||
|
||||
From pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org Sat Dec 29 19:10:38 2001
|
||||
Return-path: <pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org>
|
||||
Received: from west.navpoint.com (west.navpoint.com [207.106.42.13])
|
||||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id fBU0AbT23972
|
||||
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 19:10:37 -0500 (EST)
|
||||
Received: from rs.postgresql.org (server1.pgsql.org [64.39.15.238] (may be forged))
|
||||
by west.navpoint.com (8.11.6/8.10.1) with ESMTP id fBTNVj008959
|
||||
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 18:31:45 -0500 (EST)
|
||||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||||
by rs.postgresql.org (8.11.6/8.11.6) with ESMTP id fBTNQrN78655
|
||||
for <pgman@candle.pha.pa.us>; Sat, 29 Dec 2001 17:26:53 -0600 (CST)
|
||||
(envelope-from pgsql-general-owner+M18951=candle.pha.pa.us=pgman@postgresql.org)
|
||||
Received: from sss.pgh.pa.us ([192.204.191.242])
|
||||
by postgresql.org (8.11.3/8.11.4) with ESMTP id fBTN8Fm47978
|
||||
for <pgsql-general@postgresql.org>; Sat, 29 Dec 2001 18:08:15 -0500 (EST)
|
||||
(envelope-from tgl@sss.pgh.pa.us)
|
||||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||||
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id fBTN7vg20245;
|
||||
Sat, 29 Dec 2001 18:07:57 -0500 (EST)
|
||||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
cc: Stephan Szabo <sszabo@megazone23.bigpanda.com>,
|
||||
Andy Marden <amarden@usa.net>, pgsql-general@postgresql.org
|
||||
Subject: Re: [GENERAL] Casting Varchar to Numeric
|
||||
In-Reply-To: <200112292036.fBTKaTg04256@candle.pha.pa.us>
|
||||
References: <200112292036.fBTKaTg04256@candle.pha.pa.us>
|
||||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||||
message dated "Sat, 29 Dec 2001 15:36:29 -0500"
|
||||
Date: Sat, 29 Dec 2001 18:07:57 -0500
|
||||
Message-ID: <20242.1009667277@sss.pgh.pa.us>
|
||||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||||
Precedence: bulk
|
||||
Sender: pgsql-general-owner@postgresql.org
|
||||
Status: OR
|
||||
|
||||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||||
> I can see problems with automatically casting numeric to text because
|
||||
> you have to guess the desired format, but going from text to numeric
|
||||
> seems quite easy to do. Is there a reason we don't do it?
|
||||
|
||||
I do not think it's a good idea to have implicit casts between text and
|
||||
everything under the sun, because that essentially destroys the type
|
||||
checking system. What we need (see previous discussion) is a flag in
|
||||
pg_proc that says whether a type conversion function may be invoked
|
||||
implicitly or not. I've got no problem with offering text(numeric) and
|
||||
numeric(text) functions that are invoked by explicit function calls or
|
||||
casts --- I just don't want the system trying to use them to make
|
||||
sense of a bogus query.
|
||||
|
||||
> I can cast to integer and float8 fine:
|
||||
|
||||
I don't believe that those should be available as implicit casts either.
|
||||
They are, at the moment:
|
||||
|
||||
regression=# select 33 || 44.0;
|
||||
?column?
|
||||
----------
|
||||
3344
|
||||
(1 row)
|
||||
|
||||
Ugh.
|
||||
|
||||
regards, tom lane
|
||||
|
||||
|
File diff suppressed because it is too large
@ -1,402 +0,0 @@
|
||||
From selkovjr@mcs.anl.gov Sat Jul 25 05:31:05 1998
|
||||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA16564
|
||||
for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:31:03 -0400 (EDT)
|
||||
Received: from antares.mcs.anl.gov (mcs.anl.gov [140.221.9.6]) by renoir.op.net (o1/$ Revision: 1.18 $) with SMTP id FAA01775 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:28:22 -0400 (EDT)
|
||||
Received: from mcs.anl.gov (wit.mcs.anl.gov [140.221.5.148]) by antares.mcs.anl.gov (8.6.10/8.6.10) with ESMTP
|
||||
id EAA28698 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 04:27:05 -0500
|
||||
Sender: selkovjr@mcs.anl.gov
|
||||
Message-ID: <35B9968D.21CF60A2@mcs.anl.gov>
|
||||
Date: Sat, 25 Jul 1998 08:25:49 +0000
|
||||
From: "Gene Selkov, Jr." <selkovjr@mcs.anl.gov>
|
||||
Organization: MCS, Argonne Natl. Lab
|
||||
X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.32 i586)
|
||||
MIME-Version: 1.0
|
||||
To: Bruce Momjian <maillist@candle.pha.pa.us>
|
||||
Subject: position-aware scanners
|
||||
References: <199807250524.BAA07296@candle.pha.pa.us>
|
||||
Content-Type: text/plain; charset=us-ascii
|
||||
Content-Transfer-Encoding: 7bit
|
||||
Status: RO
|
||||
|
||||
Bruce,
|
||||
|
||||
I attached here (through the web links) a couple of examples, totally
|
||||
irrelevant to postgres but good enough to discuss token locations. I
|
||||
might as well try to patch the backend parser, though not sure how soon.
|
||||
|
||||
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
1.
|
||||
|
||||
The first C parser I wrote,
|
||||
http://wit.mcs.anl.gov/~selkovjr/unit-troff.tgz, is not very
|
||||
sophisticated, so token locations reported by yyerr() may be slightly
|
||||
incorrect (+/- one position depending on the existence and type of the
|
||||
lookahead token). It is a filter used to typeset the units of measurement
|
||||
with eqn. To use it, unpack the tar file and run make. The Makefile is
|
||||
not too generic but I built it on various systems including linux,
|
||||
freebsd and sunos 4.3. The invocation can be something like this:
|
||||
|
||||
    ./check 0 parse "l**3/(mmoll*min)"
    parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or `'(''

    l**3/(mmoll*min)
    ^^^^^

Now to the guts. As far as I can imagine, the only way to consistently
keep track of each character read by the scanner (regardless of the
length of expressions it will match) is to redefine its YY_INPUT like
this:

    #undef YY_INPUT
    #define YY_INPUT(buf,result,max_size) \
    { \
        int c = (int) buffer[pos++]; \
        result = (c == '\0') ? YY_NULL : (buf[0] = c, 1); \
    }

Here, buffer is the pointer to the origin of the string being scanned
and pos is a global variable, similar in usage to a file pointer (you
can both read and manipulate it at will). The buffer and the pointer are
initialized by the function

    void setString(char *s)
    {
        buffer = s;
        pos = 0;
    }

each time a new string is to be parsed. This (exportable) function is
part of the interface.

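To make the buffer/pos bookkeeping concrete outside of flex, here is a
minimal sketch in plain C. It is only an illustration under assumptions:
readChar() and currentPos() are hypothetical names that do not appear in
the original sources, setString() takes a const pointer here, and unlike
the macro above the position is not advanced past the terminator.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the scanner module's globals. */
static const char *buffer = "";   /* origin of the string being scanned */
static size_t pos = 0;            /* index of the next unread character */

/* Point the reader at a new string and rewind the position. */
void setString(const char *s)
{
    buffer = s;
    pos = 0;
}

/*
 * Same contract as the YY_INPUT macro: store one character in buf[0]
 * and return 1, or return 0 (what flex calls YY_NULL) at the terminator.
 * Assumption: pos stays put once the terminator has been reached.
 */
int readChar(char *buf)
{
    char c = buffer[pos];

    if (c == '\0')
        return 0;
    buf[0] = c;
    pos++;
    return 1;
}

/* Number of characters handed to the scanner so far. */
size_t currentPos(void)
{
    return pos;
}
```

Because every character goes through readChar(), currentPos() always
equals the offset just past the last character consumed, which is the
number an error routine would report.
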
In this simplistic design, yyerror() is part of the scanner module and
it uses the pos variable to report the location of unexpected tokens.
The downside of such an arrangement is that, in case of an error, you
can't easily tell whether your context is the current token or the
lookahead token; it just reports the position of the last token read
(be it $ (end of buffer) or something else):

    ./check 0 convert "mol/foo"
    parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or `'(''

    mol/foo
    ^^^

(should be at the beginning of "foo")

    ./check 0 convert "mmol//l"
    parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or `'(''

    mmol//l
    ^

(should be at the second '/')

I believe this is why most simple parsers made with yacc report
parse errors as being "at or near" some token, which is fair enough if
the expression is not too complex.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2. The second version of the same scanner,
http://wit.mcs.anl.gov/~selkovjr/scanner-example.tgz, addresses this
problem by recording the exact locations of the tokens in each instance
of the token semantic data structure. The global,

    UNIT_YYSTYPE unit_yylval;

would normally be used to export the token semantics (including its
original or modified text and location data) to the parser.
Unfortunately, I cannot show you the parser part in C, because that's
about when I stopped writing parsers in C. Instead, I included a small
test program, test.c, that mimics the parser's expectations for the
scanner data pretty well. I am assuming here that you are not interested
in digging through someone else's ugly guts for a relatively small bit
of information; let me know if I am wrong and I will send you the
complete Perl code (also generated with bison).

To run this example, unpack the tar file and run make. Then do

    gcc test.c scanner.o

and run a.out.

Note the line

    yylval = unit_getyylval();

in test.c. You will not normally need it in a C parser. It is enough to
define yylval as an external variable and link it to yylval in yylex().

In the bison-generated parser, the location record (yylloc) gets pushed
onto a stack (pointed to by yylsp) each time a new token is read. For
each syntax rule, the bison macros @1, @2, ... are just shortcuts to
locations in the stack 1, 2, ... levels deep. In the following code
fragment, @3 refers to the location info for the third term in the rule
(INTEGER):

(sorry about Perl, but I think you can do the same things in C without
significant changes to your existing parser)

    term: base {
              $$ = $1;
              $$->{'order'} = 1;
          }
        | base EXP INTEGER {
              $$ = $1;
              $$->{'order'} = @3->{'text'};
              $$->{'scale'} = $$->{'scale'} ** $$->{'order'};
              if ( $$->{'order'} == 0 ) {
                  yyerror("Error: expecting a non-zero integer exponent");
                  YYERROR;
              }
          }

which translates to:

    ($yyn == 10) && do {
        $yyval = $yyvsa[-1];
        $yyval->{'order'} = 1;
        last SWITCH;
    };

    ($yyn == 11) && do {
        $yyval = $yyvsa[-3];
        $yyval->{'order'} = $yylsa[-1]->{'text'};
        $yyval->{'scale'} = $yyval->{'scale'} ** $yyval->{'order'};
        if ( $yyval->{'order'} == 0 ) {
            yyerror("Error: expecting a non-zero integer exponent");
            goto yyerrlab1 ;
        }
        last SWITCH;
    };

In C, you will have slightly more complicated pointer arithmetic to
address the stack, but the usage of objects will be the same. Note here
that it is convenient to keep all information about the token in its
location info (yylsa, yylsp, yylloc, @n), while everything relating to
the value of the expression, or to the parse tree, is better placed in
the semantic stack (yyvsa, yyvsp, yylval, $n). Also note that in some
cases you can do semantic checks inside rules and report useful messages
before or instead of invoking yyerror().

Finally, it is useful to make the following wrapper function around the
external yylex() in order to maintain your own token stack. Unlike the
parser's internal stack, which is only as deep as the rule being
reduced, this one can hold all tokens recognized during the current run,
and that can be extremely helpful for error reporting and any
transformations you may need. In this way, you can even scan (tokenize)
the whole buffer before handing it off to the parser (who knows, you may
need a token ahead of what is currently seen by the parser):

    sub tokenize {
        undef @tokenTable;
        my ($tok, $text, $name, $unit, $first_line, $first_column,
            $last_line, $last_column);

        # This is where the C-coded yylex is called;
        # UnitLex is the Perl extension encapsulating it.
        while ( ($tok = &UnitLex::yylex()) > 0 ) {
            ( $text, $name, $unit, $first_line, $first_column, $last_line,
              $last_column ) = &UnitLex::getyylval;
            push(@tokenTable,
                 Unit::yyltype->new (
                     'token'        => $tok,
                     'text'         => $text,
                     'name'         => $name,
                     'unit'         => $unit,
                     'first_line'   => $first_line,
                     'first_column' => $first_column,
                     'last_line'    => $last_line,
                     'last_column'  => $last_column,
                 )
            );
        }
    }

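For comparison, a token table of this kind could be kept in C roughly as
follows. This is a sketch under assumptions, not code from the tarballs:
TokenLoc, TokenTable, tokenTablePush() and tokenTableFree() are
hypothetical names, only the column fields are kept, and the matched
text is stored in a fixed-size field for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical C counterpart of the Perl yyltype record. */
typedef struct {
    int  token;          /* token code returned by the scanner */
    char text[32];       /* matched text, truncated if longer */
    int  first_column;   /* zero-based column of the first character */
    int  last_column;    /* zero-based column of the last character */
} TokenLoc;

/* Growable array holding every token seen during the current run. */
typedef struct {
    TokenLoc *items;
    size_t    count;
    size_t    capacity;
} TokenTable;

/* Append one record, growing the array as needed; returns 0 on success. */
int tokenTablePush(TokenTable *t, int token, const char *text,
                   int first_column)
{
    if (t->count == t->capacity) {
        size_t ncap = t->capacity ? t->capacity * 2 : 16;
        TokenLoc *p = realloc(t->items, ncap * sizeof *p);

        if (p == NULL)
            return -1;
        t->items = p;
        t->capacity = ncap;
    }

    TokenLoc *loc = &t->items[t->count++];
    size_t full = strlen(text);
    size_t kept = full < sizeof loc->text ? full : sizeof loc->text - 1;

    loc->token = token;
    memcpy(loc->text, text, kept);
    loc->text[kept] = '\0';
    loc->first_column = first_column;
    loc->last_column = first_column + (int) full - 1;
    return 0;
}

/* Release the table and reset it to the empty state. */
void tokenTableFree(TokenTable *t)
{
    free(t->items);
    t->items = NULL;
    t->count = 0;
    t->capacity = 0;
}
```

A table initialized with `TokenTable t = {0};` can be filled once per
input string and then indexed by a "token pointer", in the same way the
Perl code indexes @tokenTable.
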
It is now a lot easier to handle various state-related problems, such as
backtracking and error reporting. The yylex() function as seen by the
parser might be constructed somewhat like this:

    sub yylex {
        # $tokenNo is a global; now instead of a "file pointer",
        # as in the first example, we have a "token pointer"
        $yylloc = $tokenTable[$tokenNo];
        undef $yylval;

        # disregard this; name this block "computing semantic values"
        if ( $yylloc->{'token'} == UNIT ) {
            $yylval = Unit::Operand->new(
                'unit'        => Unit::Dict::unit($yylloc->{'unit'}),
                'base'        => Unit::Dict::base($yylloc->{'unit'}),
                'scale'       => Unit::Dict::scale($yylloc->{'unit'}),
                'scaleToBase' => Unit::Dict::scaleToBase($yylloc->{'unit'}),
                'loc'         => $yylloc,
            );
        }
        elsif ( ($yylloc->{'token'} == INTEGER ) ||
                ($yylloc->{'token'} == POSITIVE_NUMBER) ) {
            $yylval = Unit::Operand->new(
                'unit'        => '1',
                'base'        => '1',
                'scale'       => 1,
                'scaleToBase' => 1,
                'loc'         => $yylloc,
            );
        }

        $tokenNo++;

        # This is all the parser needs to know about this token.
        # But we already made sure we saved everything we need to know.
        return($yylloc->{'token'});
    }

Now the most interesting part, the error reporting routine:

    sub yyerror {
        my ($str) = @_;
        my ($message, $start, $end, $loc);

        # This is the same as to say,
        # "obtain the location info for the current token"
        $loc = $tokenTable[$tokenNo-1];

        # You may use this routine for your own purposes or let the
        # parser use it
        if( $str ne 'parse error' ) {
            $message = "$str instead of `" . $loc->{'name'} . "' <" .
                $loc->{'text'} . ">, at line " . $loc->{'first_line'} . ":\n\n";
        }
        else {
            $message = "unexpected token `" . $loc->{'name'} . "' <" .
                $loc->{'text'} . ">, at line " . $loc->{'first_line'} . ":\n\n";
        }

        # that's the original string that was used to set the parser buffer
        $message .= $parseBuffer . "\n";

        $message .= ( ' ' x ($loc->{'first_column'} + 1) ) .
            ( '^' x length($loc->{'text'}) ) . "\n";
        if( $str ne 'parse error' ) {
            print STDERR "$str instead of `", $loc->{'name'}, "' {",
                $loc->{'text'}, "}, at line ", $loc->{'first_line'}, ":\n\n";
        }
        else {
            print STDERR "unexpected token `", $loc->{'name'}, "' {",
                $loc->{'text'}, "}, at line ", $loc->{'first_line'}, ":\n\n";
        }

        print STDERR "$parseBuffer\n";
        print STDERR ' ' x ($loc->{'first_column'} + 1), '^' x
            length($loc->{'text'}), "\n";
    }

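The caret-underlining step of this routine might look as follows in C.
This is a rough sketch, not a port: formatCaretLine() is a hypothetical
name, and it assumes zero-based columns with no extra padding, whereas
the Perl routine pads with first_column + 1 spaces.

```c
#include <stddef.h>
#include <string.h>

/*
 * Write the source line, a newline, then `length` carets starting under
 * column `first_column` (zero-based, an assumption), then a newline.
 * Returns the number of characters written, or -1 if the arguments are
 * negative or the output buffer is too small.
 */
int formatCaretLine(char *out, size_t outsz, const char *source,
                    int first_column, int length)
{
    if (first_column < 0 || length < 0)
        return -1;

    size_t srclen = strlen(source);
    size_t need = srclen + 1 + (size_t) first_column + (size_t) length + 1;

    if (need + 1 > outsz)               /* +1 for the terminating NUL */
        return -1;

    char *p = out;

    memcpy(p, source, srclen);          /* the original input string */
    p += srclen;
    *p++ = '\n';
    memset(p, ' ', (size_t) first_column);  /* pad up to the token */
    p += first_column;
    memset(p, '^', (size_t) length);        /* underline the token */
    p += length;
    *p++ = '\n';
    *p = '\0';
    return (int) (p - out);
}
```

With the token "foo" recorded at column 4 of "mol/foo", the carets land
under "foo", which is where the first example above says they should be.
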
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Scanners used in these examples assume there is a single line of text on
the input (the first_line and last_line elements of yylloc are simply
ignored). If you want to be able to parse multi-line buffers, just add a
lex rule for '\n' that will increment the line count and reset the pos
variable to zero.

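The bookkeeping such a '\n' rule would perform can be sketched in plain
C, independent of lex. The names Position, advancePosition() and
positionAt() are hypothetical, and the choice of 1-based lines with
0-based columns is an assumption made for this illustration.

```c
#include <stddef.h>

/* Hypothetical position record: 1-based line, 0-based column. */
typedef struct {
    int line;
    int column;
} Position;

/* Advance over one character; a newline bumps the line count and
 * resets the column to zero, as the '\n' rule would. */
void advancePosition(Position *p, char c)
{
    if (c == '\n') {
        p->line++;
        p->column = 0;
    } else {
        p->column++;
    }
}

/* Position of the character at byte offset `offset` within text.
 * Offsets past the end of the string stop at the terminator. */
Position positionAt(const char *text, size_t offset)
{
    Position p = { 1, 0 };

    for (size_t i = 0; i < offset && text[i] != '\0'; i++)
        advancePosition(&p, text[i]);
    return p;
}
```
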
Ugly as it may seem, I find this approach extremely liberating. If the
grammar becomes too complicated for a LALR(1) parser, I can cascade
multiple parsers. The token table can then be used to reassemble parts
of the original expression for subordinate parsers, preserving the
location info all the way down, so that subordinate parsers can report
their problems consistently. You probably don't need this, as SQL is
very well thought out and has a parsable grammar. But it may be of some
help for error reporting.

--Gene

From pgsql-patches-owner+M1499@postgresql.org Sat Aug 4 13:11:53 2001
Return-path: <pgsql-patches-owner+M1499@postgresql.org>
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
    by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f74HBrh11339
    for <pgman@candle.pha.pa.us>; Sat, 4 Aug 2001 13:11:53 -0400 (EDT)
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
    by postgresql.org (8.11.3/8.11.4) with SMTP id f74H89655183;
    Sat, 4 Aug 2001 13:08:09 -0400 (EDT)
    (envelope-from pgsql-patches-owner+M1499@postgresql.org)
Received: from sss.pgh.pa.us ([192.204.191.242])
    by postgresql.org (8.11.3/8.11.4) with ESMTP id f74Gxb653074
    for <pgsql-patches@postgresql.org>; Sat, 4 Aug 2001 12:59:37 -0400 (EDT)
    (envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id f74GtPC29183;
    Sat, 4 Aug 2001 12:55:25 -0400 (EDT)
To: Dave Page <dpage@vale-housing.co.uk>
cc: "'Fernando Nasser'" <fnasser@cygnus.com>,
    Bruce Momjian <pgman@candle.pha.pa.us>, Neil Padgett <npadgett@redhat.com>,
    pgsql-patches@postgresql.org
Subject: Re: [PATCHES] Patch for Improved Syntax Error Reporting
In-Reply-To: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk>
References: <8568FC767B4AD311AC33006097BCD3D61A2D70@woody.vale-housing.co.uk>
Comments: In-reply-to Dave Page <dpage@vale-housing.co.uk>
    message dated "Sat, 04 Aug 2001 12:37:23 +0100"
Date: Sat, 04 Aug 2001 12:55:24 -0400
Message-ID: <29180.996944124@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Precedence: bulk
Sender: pgsql-patches-owner@postgresql.org
Status: OR

Dave Page <dpage@vale-housing.co.uk> writes:
> Oh, I quite agree. I'm not adverse to updating my code, I just want to avoid
> users getting misleading messages until I come up with those updates.

Hmm ... if they were actively misleading then I'd share your concern.

I guess what you're thinking is that the error offset reported by the
backend won't correspond directly to what the user typed, and if the
user tries to use the offset to manually count off characters, he may
arrive at the wrong place? Good point. I'm not sure whether a message
like

    ERROR: parser: parse error at or near 'frum';
    POSITION: 42

would be likely to encourage people to try that. Thoughts? (I do think
this is a good argument for not embedding the position straight into the
main error message though...)

One possible compromise is to combine the straight character-offset
approach with a simplistic context display:

    ERROR: parser: parse error at or near 'frum';
    POSITION: 42 ... oid,relname FRUM ...

The idea is to define the "POSITION" field as an integer offset possibly
followed by whitespace and noise words. An updated client would grab
the offset, ignore the rest of the field, and do the right thing. A
not-updated client would display the entire message, and with any luck
the user would read it correctly.

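What "grab the offset, ignore the rest of the field" could mean on the
client side is sketched below in C. This is purely illustrative: the
"POSITION:" prefix and field layout are taken from the proposal above,
parsePositionField() is a hypothetical helper, and nothing here is an
existing libpq or backend interface.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the leading integer from a field such as
 * "POSITION: 42 ... oid,relname FRUM ...".  Anything after the number
 * (whitespace and noise words) is ignored.  Returns 0 and stores the
 * offset on success, -1 if the field does not start with "POSITION:"
 * followed by a non-negative integer.
 */
int parsePositionField(const char *field, long *offset)
{
    static const char prefix[] = "POSITION:";
    size_t plen = sizeof prefix - 1;

    if (strncmp(field, prefix, plen) != 0)
        return -1;

    const char *start = field + plen;
    char *end;

    errno = 0;
    long value = strtol(start, &end, 10);   /* skips leading whitespace */

    if (end == start || errno == ERANGE || value < 0)
        return -1;

    *offset = value;
    return 0;
}
```

A client that calls this on the field gets 42 for both forms of the
message shown above, while a client that does not know about the field
can still print it verbatim.
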
            regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html