mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-12-27 08:39:28 +08:00
1557 lines
59 KiB
Plaintext
1557 lines
59 KiB
Plaintext
From daybee@bellatlantic.net Sun Aug 23 20:21:48 1998
|
||
Received: from iconmail.bellatlantic.net (iconmail.bellatlantic.net [199.173.162.30])
|
||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id UAA26688
|
||
for <maillist@candle.pha.pa.us>; Sun, 23 Aug 1998 20:21:46 -0400 (EDT)
|
||
Received: from bellatlantic.net (client196-126-169.bellatlantic.net [151.196.126.169])
|
||
by iconmail.bellatlantic.net (IConNet Sendmail) with ESMTP id UAA09478;
|
||
Sun, 23 Aug 1998 20:18:35 -0400 (EDT)
|
||
Message-ID: <35E0ABF0.578694C8@bellatlantic.net>
|
||
Date: Sun, 23 Aug 1998 19:55:29 -0400
|
||
From: David Hartwig <daybee@bellatlantic.net>
|
||
Organization: Home
|
||
X-Mailer: Mozilla 4.04 [en] (Win95; I)
|
||
MIME-Version: 1.0
|
||
To: Bruce Momjian <maillist@candle.pha.pa.us>
|
||
CC: hannu@trust.ee, pgsql-interfaces@postgreSQL.org, hackers@postgreSQL.org
|
||
Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
|
||
References: <199808220353.XAA04528@candle.pha.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
Status: ROr
|
||
|
||
|
||
|
||
Bruce Momjian wrote:
|
||
|
||
> >
|
||
> > Hannu Krosing wrote:
|
||
> >
|
||
> > > > The days where every release fixed server crashes, or added a feature
|
||
> > > > that users were 'screaming for' may be a thing of the past.
|
||
> > >
|
||
> > > Is anyone working on fixing the exploding optimisations for many OR-s,
|
||
> > > at least the canonic case used by access?
|
||
> > >
|
||
> > > My impression is that this has fallen somewhere between
|
||
> > > insightdist and Vadim.
|
||
> >
|
||
> > This is really big for the ODBCers. (And I suspect for JDBCers too.) Many
|
||
> > desktop libraries and end-user tools depend on this "record set" strategy to
|
||
> > operate effectively.
|
||
> >
|
||
> > I have put together a workable hack that runs just before cnfify(). The
|
||
> > option is activated through the SET command. Once activated, it identifies
|
||
> > queries with this particular multi-OR pattern generated by these RECORD SET
|
||
> > strategies. Qualified query trees are rewritten as multiple UNIONs. (One
|
||
> > for each OR grouping).
|
||
> >
|
||
> > The results are profound. Queries that used to scan tables because of the
|
||
> > ORs, now make use of any indexes. Thus, the size of the table has virtually
|
||
> > no effect on performance. Furthermore, queries that used to crash the
|
||
> > backend, now run in under a second.
|
||
> >
|
||
> > Currently the down sides are:
|
||
> > 1. If there is no usable index, performance is significantly worse. The
|
||
> > patch does not check to make sure that there is a usable index. I could use
|
||
> > some pointers on this.
|
||
> >
|
||
> > 2. Small tables are actually a bit slower than without the patch.
|
||
> >
|
||
> > 3. Not very elegant. I am looking for a more generalized solution.
|
||
> > I have lots of ideas, but I would need to know the backend much better before
|
||
> > attempting any of them. My favorite idea is before cnfify(), to factor the
|
||
> > OR terms and pull out the constants into a virtual (temporary) table spaces.
|
||
> > Then rewrite the query as a join. The optimizer will (should) treat the new
|
||
> > query accordingly. This assumes that an efficient factoring algorithm exists
|
||
> > and that temporary tables can exist in the heap.
|
||
> >
|
||
> > Illustration:
|
||
> > SELECT ... FROM tab WHERE
|
||
> > (var1 = const1 AND var2 = const2) OR
|
||
> > (var1 = const3 AND var2 = const4) OR
|
||
> > (var1 = const5 AND var2 = const6)
|
||
> >
|
||
> > SELECT ... FROM tab, tmp WHERE
|
||
> > (var1 = var_x AND var2 = var_y)
|
||
> >
|
||
> > tmp
|
||
> > var_x | var_y
|
||
> > --------------
|
||
> > const1|const2
|
||
> > const3|const4
|
||
> > const5|const6
|
||
>
|
||
> David, where are we on this? I know we have OR's using indexes. Do we
|
||
> still need to look this as a fix, or are we OK. I have not gotten far
|
||
> enough in the optimizer to know how to fix the
|
||
|
||
Bruce,
|
||
|
||
If the question is, have I come up with a solution for the cnf'ify problem: No
|
||
|
||
If the question is, is it still important: Very much yes.
|
||
|
||
It is essential for many RAD tools using remote data objects which make use of key
|
||
sets. Your recent optimization of the OR list goes a long way, but inevitably
|
||
users are confronted with multi-part keys.
|
||
|
||
When I look at the problem my head spins. I do not have the experience (yet?)
|
||
with the backend to be mucking around in the optimizer. As I see it, cnf'ify is
|
||
doing just what it is supposed to do. Boundless boolean logic.
|
||
|
||
I think hope may lay though, in identifying each AND'ed group associated with a key
|
||
and tagging it as a special sub-root node which cnf'ify does not penetrate. This
|
||
node would be allowed to pass to the later stages of the optimizer where it will be
|
||
used to plan index scans. Easy for me to say.
|
||
|
||
In the meantime, I still have the patch that I described in prior email. It has
|
||
worked well for us. Let me restate that. We could not survive without it!
|
||
However, I do not feel that is a sufficiently functional approach that should be
|
||
incorporated as a final solution. I will submit the patch if you, (anyone) does
|
||
not come up with a better solution. It is coded to be activated by a SET KSQO to
|
||
minimize its reach.
|
||
|
||
|
||
From daybee@bellatlantic.net Sun Aug 30 12:06:24 1998
|
||
Received: from iconmail.bellatlantic.net (iconmail.bellatlantic.net [199.173.162.30])
|
||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA12860
|
||
for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 12:06:22 -0400 (EDT)
|
||
Received: from bellatlantic.net (client196-126-73.bellatlantic.net [151.196.126.73])
|
||
by iconmail.bellatlantic.net (IConNet Sendmail) with ESMTP id MAA18468;
|
||
Sun, 30 Aug 1998 12:03:33 -0400 (EDT)
|
||
Message-ID: <35E9726E.C6E73049@bellatlantic.net>
|
||
Date: Sun, 30 Aug 1998 11:40:31 -0400
|
||
From: David Hartwig <daybee@bellatlantic.net>
|
||
Organization: Home
|
||
X-Mailer: Mozilla 4.06 [en] (Win98; I)
|
||
MIME-Version: 1.0
|
||
To: Bruce Momjian <maillist@candle.pha.pa.us>
|
||
CC: hannu@trust.ee, pgsql-interfaces@postgreSQL.org, hackers@postgreSQL.org
|
||
Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
|
||
References: <199808290344.XAA28089@candle.pha.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
Status: RO
|
||
|
||
|
||
|
||
Bruce Momjian wrote:
|
||
|
||
> OK, let me try this one.
|
||
>
|
||
> Why is the system cnf'ifying the query. Because it wants to have a
|
||
> list of qualifications that are AND'ed, so it can just pick the most
|
||
> restrictive/cheapest, and evaluate that one first.
|
||
>
|
||
> If you have:
|
||
>
|
||
> (a=b and c=d) or e=1
|
||
>
|
||
> In this case, without cnf'ify, it has to evaluate both of them, because
|
||
> if one is false, you can't be sure another would be true. In the
|
||
> cnf'ify case,
|
||
>
|
||
> (a=b or e=1) and (c=d or e=1)
|
||
>
|
||
> In this case, it can choose either, and act on just one, if a row fails
|
||
> to meet it, it can stop and not evaluate it using the other restriction.
|
||
>
|
||
> The fact is that it is only going to use fancy join/index in one of the
|
||
> two cases, so it tries to pick the best one, and does a brute-force
|
||
> qualification test on the remaining item if the first one tried is true.
|
||
>
|
||
> The problem is of course large where clauses can exponentially expand
|
||
> this. What it really trying to do is to pick a cheapest restriction,
|
||
> but the memory explosion and query failure are serious problems.
|
||
>
|
||
> The issue is that it thinks it is doing something to help things, while
|
||
> it is actually hurting things.
|
||
>
|
||
> In the ODBC case of:
|
||
>
|
||
> (x=3 and y=4) or
|
||
> (x=3 and y=5) or
|
||
> (x=3 and y=6) or ...
|
||
>
|
||
> it clearly is not going to gain anything by choosing any CHEAPEST path,
|
||
> because they are all the same in terms of cost, and the use by ODBC
|
||
> clients is hurting reliability.
|
||
>
|
||
> I am inclined to agree with David's solution of breaking apart the query
|
||
> into separate UNION queries in certain cases. It seems to be the most
|
||
> logical solution, because the cnf'ify code is working counter to its
|
||
> purpose in these cases.
|
||
>
|
||
> Now, the question is how/where to implement this. I see your idea of
|
||
> making the OR a join to a temp table that holds all the constants.
|
||
> Another idea would be to do actual UNION queries:
|
||
>
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 and y=4)
|
||
> UNION
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 and y=5)
|
||
> UNION
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 and y=6) ...
|
||
>
|
||
> This would work well for tables with indexes, but for a sequential scan,
|
||
> you are doing a sequential scan for each UNION.
|
||
|
||
Practically speaking, the lack of an index concern, may not be justified. The reason
|
||
these queries are being generated, with this shape, is because remote data objects on the
|
||
client side are being told that a primary key exists on these tables. The object is told
|
||
about these keys in one of two ways.
|
||
|
||
1. It queries the database for the primary key of the table. The ODBC driver serviced
|
||
this request by querying for the attributes used in {table_name}_pkey.
|
||
|
||
2. The user manually specifies the primary key. In this case an actual index may not
|
||
exist. (i.e. MS Access asks the user for this information if a primary key is not found
|
||
in a table)
|
||
|
||
The second case is the only one that would cause a problem. Fortunately, the solution is
|
||
simple. Add a primary key index!
|
||
|
||
My only concern is to be able to accurately identify a query with the proper signature
|
||
before rewriting it as a UNION. To what degree should this inspection be taken?
|
||
|
||
BTW, I would not do the rewrite on OR's without AND's since you have fixed the OR's use
|
||
of the index.
|
||
|
||
There is one other potential issue. My experience with using arrays in tables and UNIONS
|
||
creates problems. There are missing array comparison operators which are used by the
|
||
implied DISTINCT.
|
||
|
||
> Another idea is
|
||
> subselects. Also, you have to make sure you return the proper rows,
|
||
> keeping duplicates where they are in the base table, but not returning
|
||
> them when the meet more than one qualification.
|
||
>
|
||
> SELECT * FROM tab
|
||
> WHERE (x,y) IN (SELECT 3, 4
|
||
> UNION
|
||
> SELECT 3, 5
|
||
> UNION
|
||
> SELECT 3, 6)
|
||
>
|
||
> I believe we actually support this. This is not going to use an index
|
||
> on tab, so it may be slow if x and y are indexed.
|
||
>
|
||
> Another more bizarre solution is:
|
||
>
|
||
> SELECT * FROM tab
|
||
> WHERE (x,y) = (SELECT 3, 4) OR
|
||
> (x,y) = (SELECT 3, 5) OR
|
||
> (x,y) = (SELECT 3, 6)
|
||
>
|
||
> Again, I think we do this too. I don't think cnf'ify does anything with
|
||
> this. I also believe "=" uses indexes on subselects, while IN does not
|
||
> because IN could return lots of rows, and an index is slower than a
|
||
> non-index join on lots of rows. Of course, now that we index OR's.
|
||
>
|
||
> Let me ask another question. If I do:
|
||
>
|
||
> SELECT * FROM tab WHERE x=3 OR x=4
|
||
>
|
||
> it works, and uses indexes. Why can't the optimizer just not cnf'ify
|
||
> things sometimes, and just do:
|
||
>
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 AND y=4) OR
|
||
> (x=3 AND y=5) OR
|
||
> (x=3 AND y=6)
|
||
>
|
||
> Why can it handle x=3 OR x=4, but not the more complicated case above,
|
||
> without trying to be too smart? If x,y is a multi-key index, it could
|
||
> use that quite easily. If not, it can do a sequentail scan and run the
|
||
> tests.
|
||
>
|
||
> Another issue. To the optimizer, x=3 and x=y are totally different. In
|
||
> x=3, it is a column compared to a constant, while in x=y, it is a join.
|
||
> That makes a huge difference.
|
||
>
|
||
> In the case of (a=b and c=d) or e=1, you pick the best path and do the
|
||
> a=b join, and throw in the e=1 entries. You can't easily do both joins,
|
||
> because you also need the e=1 stuff.
|
||
>
|
||
> I wounder what would happen if we prevent cnf'ifying of cases where the
|
||
> OR represent only column = constant restrictions.
|
||
>
|
||
> I meant to really go through the optimizer this month, but other backend
|
||
> items took my time.
|
||
>
|
||
> Can someone run some tests on disabling the cnf'ify calls. It is my
|
||
> understanding that with the non-cnf-ify'ed query, it can't choose an
|
||
> optimial path, and starts to do either straight index matches,
|
||
> sequential scans, or cartesian products where it joins every row to
|
||
> every other row looking for a match.
|
||
>
|
||
> Let's say we turn off cnf-ify just for non-join queries. Does that
|
||
> help?
|
||
>
|
||
> I am not sure of the ramifications of telling the optimizer it no longer
|
||
> has a variety of paths to choose for evaluating the query.
|
||
|
||
I did not try this earlier because I thought it was too good to be true. I was right.
|
||
I tried commenting out the normalize() function in the cnfify(). The EXPLAIN showed a
|
||
sequential scan and the resulting tuple set was empty. Time will not allow me to dig
|
||
into this further this weekend.
|
||
|
||
Unless you come up with a better solution, I am going to submit my patch on Monday to
|
||
make the Sept. 1st deadline. It includes a SET switch to activate the rewrite so as not
|
||
to cause problems outside the ODBC users. We can either improve, it or yank it, by the
|
||
Oct. 1st deadline.
|
||
|
||
|
||
From infotecn@tin.it Mon Aug 31 03:01:51 1998
|
||
Received: from mail.tol.it (mail.tin.it [194.243.154.49])
|
||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id DAA09740
|
||
for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 03:01:48 -0400 (EDT)
|
||
Received: from Server.InfoTecna.com (a-mz6-50.tin.it [212.216.9.113])
|
||
by mail.tol.it (8.8.4/8.8.4) with ESMTP
|
||
id JAA16451; Mon, 31 Aug 1998 09:00:35 +0200 (MET DST)
|
||
Received: from tm3.InfoTecna.com (Tm1.InfoTecna.com [192.168.1.1])
|
||
by Server.InfoTecna.com (8.8.5/8.8.5) with SMTP id IAA18678;
|
||
Mon, 31 Aug 1998 08:53:13 +0200
|
||
Message-Id: <3.0.5.32.19980831085312.00986cc0@MBox.InfoTecna.com>
|
||
X-Sender: denis@MBox.InfoTecna.com
|
||
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.5 (32)
|
||
Date: Mon, 31 Aug 1998 08:53:12 +0200
|
||
To: David Hartwig <daybee@bellatlantic.net>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>
|
||
From: Sbragion Denis <infotecn@tin.it>
|
||
Subject: Re: [INTERFACES] Re: [HACKERS] changes in 6.4
|
||
Cc: hannu@trust.ee, pgsql-interfaces@postgreSQL.org, hackers@postgreSQL.org
|
||
In-Reply-To: <35E9726E.C6E73049@bellatlantic.net>
|
||
References: <199808290344.XAA28089@candle.pha.pa.us>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: RO
|
||
|
||
Hello,
|
||
|
||
At 11.40 30/08/98 -0400, David Hartwig wrote:
|
||
>> Why is the system cnf'ifying the query. Because it wants to have a
|
||
>> list of qualifications that are AND'ed, so it can just pick the most
|
||
>> restrictive/cheapest, and evaluate that one first.
|
||
|
||
Just a small question about all this optimizations stuff. I'm not a
|
||
database expert but I think we are talking about a NP-complete problem.
|
||
Could'nt we convert this optimization problem into another NP one that is
|
||
known to have a good solution ? For example for the traveling salesman
|
||
problem there's an alghoritm that provide a solution that's never more than
|
||
two times the optimal one an provides results that are *really* near the
|
||
optimal one most of the times. The simplex alghoritm may be another
|
||
example. I think that this kind of alghoritm would be better than a
|
||
collection ot tricks for special cases, and this tricks could be used
|
||
anyway when special cases are detected. Furthermore I also know that exists
|
||
a free program I used in the past that provides this kind of optimizations
|
||
for chip design. I don't remember the exact name of the program but I
|
||
remember it came from Berkeley university. Of course may be I'm totally
|
||
missing the point.
|
||
|
||
Hope it helps !
|
||
|
||
Bye!
|
||
|
||
Dr. Sbragion Denis
|
||
InfoTecna
|
||
Tel, Fax: +39 39 2324054
|
||
URL: http://space.tin.it/internet/dsbragio
|
||
|
||
From andreas.zeugswetter@telecom.at Mon Aug 31 06:31:13 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id GAA14231
|
||
for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 06:31:12 -0400 (EDT)
|
||
Received: from gandalf.telecom.at (gandalf.telecom.at [194.118.26.84]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id GAA21099 for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 06:23:41 -0400 (EDT)
|
||
Received: from zeugswettera.user.lan.at (zeugswettera.user.lan.at [10.4.123.227]) by gandalf.telecom.at (A.B.C.Delta4/8.8.8) with SMTP id MAA38132; Mon, 31 Aug 1998 12:22:07 +0200
|
||
Received: by zeugswettera.user.lan.at with Microsoft Mail
|
||
id <01BDD4DA.C7F5B690@zeugswettera.user.lan.at>; Mon, 31 Aug 1998 12:27:55 +0200
|
||
Message-ID: <01BDD4DA.C7F5B690@zeugswettera.user.lan.at>
|
||
From: Andreas Zeugswetter <andreas.zeugswetter@telecom.at>
|
||
To: "'maillist@candle.pha.pa.us'" <maillist@candle.pha.pa.us>
|
||
Cc: "hackers@postgreSQL.org" <hackers@postgreSQL.org>
|
||
Subject: AW: [INTERFACES] Re: [HACKERS] changes in 6.4
|
||
Date: Mon, 31 Aug 1998 12:22:05 +0200
|
||
Encoding: 31 TEXT
|
||
Status: RO
|
||
|
||
|
||
>Another idea would be to do actual UNION queries:
|
||
>
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 and y=4)
|
||
> UNION
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 and y=5)
|
||
> UNION
|
||
> SELECT * FROM tab
|
||
> WHERE (x=3 and y=6) ...
|
||
>
|
||
>This would work well for tables with indexes, but for a sequential scan,
|
||
>you are doing a sequential scan for each UNION.
|
||
|
||
The most important Application for this syntax will be M$ Access
|
||
because it uses this syntax to display x rows from a table in a particular
|
||
sort order. In this case x and y will be the primary key and therefore have a
|
||
unique index. So I think this special case should work good.
|
||
|
||
The strategy could be something like:
|
||
iff x, y is a unique index
|
||
do the union access path
|
||
else
|
||
do something else
|
||
done
|
||
|
||
I think hand written SQL can always be rewritten if it is not fast enough
|
||
using this syntax.
|
||
|
||
Andreas
|
||
|
||
|
||
From owner-pgsql-patches@hub.org Tue Sep 1 02:01:10 1998
|
||
Received: from hub.org (hub.org [209.47.148.200])
|
||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA28687
|
||
for <maillist@candle.pha.pa.us>; Tue, 1 Sep 1998 02:01:06 -0400 (EDT)
|
||
Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA02180; Tue, 1 Sep 1998 01:48:43 -0400 (EDT)
|
||
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 01 Sep 1998 01:47:48 +0000 (EDT)
|
||
Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA02160 for pgsql-patches-outgoing; Tue, 1 Sep 1998 01:47:46 -0400 (EDT)
|
||
Received: from iconmail.bellatlantic.net (iconmail.bellatlantic.net [199.173.162.30]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA02147 for <pgsql-patches@postgreSQL.org>; Tue, 1 Sep 1998 01:47:42 -0400 (EDT)
|
||
Received: from bellatlantic.net (client196-126-3.bellatlantic.net [151.196.126.3])
|
||
by iconmail.bellatlantic.net (IConNet Sendmail) with ESMTP id XAA27530
|
||
for <pgsql-patches@postgreSQL.org>; Mon, 31 Aug 1998 23:24:07 -0400 (EDT)
|
||
Message-ID: <35EB2B33.EBF1E9AA@bellatlantic.net>
|
||
Date: Mon, 31 Aug 1998 19:01:07 -0400
|
||
From: David Hartwig <daybee@bellatlantic.net>
|
||
Organization: Insight Distribution Systems
|
||
X-Mailer: Mozilla 4.04 [en] (X11; I; Linux 2.0.29 i586)
|
||
MIME-Version: 1.0
|
||
To: patches <pgsql-patches@postgreSQL.org>
|
||
Subject: [PATCHES] Interim AND/OR memory exaustion fix.
|
||
Content-Type: multipart/mixed; boundary="------------BEFD1E6DA78A2DC20B524E32"
|
||
Sender: owner-pgsql-patches@hub.org
|
||
Precedence: bulk
|
||
Status: ROr
|
||
|
||
This is a multi-part message in MIME format.
|
||
--------------BEFD1E6DA78A2DC20B524E32
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
|
||
I will be cleaning this up more before the Oct 1 deadline.
|
||
|
||
--------------BEFD1E6DA78A2DC20B524E32
|
||
Content-Type: text/plain; charset=us-ascii; name="keyset.patch"
|
||
Content-Transfer-Encoding: 7bit
|
||
Content-Disposition: inline; filename="keyset.patch"
|
||
|
||
*** ./backend/commands/variable.c.orig Thu Jul 30 19:25:26 1998
|
||
--- ./backend/commands/variable.c Mon Aug 31 17:23:32 1998
|
||
***************
|
||
*** 24,29 ****
|
||
--- 24,30 ----
|
||
extern bool _use_geqo_;
|
||
extern int32 _use_geqo_rels_;
|
||
extern bool _use_right_sided_plans_;
|
||
+ extern bool _use_keyset_query_optimizer;
|
||
|
||
/*-----------------------------------------------------------------------*/
|
||
static const char *
|
||
***************
|
||
*** 559,564 ****
|
||
--- 560,568 ----
|
||
},
|
||
#endif
|
||
{
|
||
+ "ksqo", parse_ksqo, show_ksqo, reset_ksqo
|
||
+ },
|
||
+ {
|
||
NULL, NULL, NULL, NULL
|
||
}
|
||
};
|
||
***************
|
||
*** 611,615 ****
|
||
--- 615,663 ----
|
||
|
||
elog(NOTICE, "Unrecognized variable %s", name);
|
||
|
||
+ return TRUE;
|
||
+ }
|
||
+
|
||
+
|
||
+ /*-----------------------------------------------------------------------
|
||
+ KSQO code will one day be unnecessary when the optimizer makes use of
|
||
+ indexes when multiple ORs are specified in the where clause.
|
||
+ See optimizer/prep/prepkeyset.c for more on this.
|
||
+ daveh@insightdist.com 6/16/98
|
||
+ -----------------------------------------------------------------------*/
|
||
+ bool
|
||
+ parse_ksqo(const char *value)
|
||
+ {
|
||
+ if (value == NULL)
|
||
+ {
|
||
+ reset_ksqo();
|
||
+ return TRUE;
|
||
+ }
|
||
+
|
||
+ if (strcasecmp(value, "on") == 0)
|
||
+ _use_keyset_query_optimizer = true;
|
||
+ else if (strcasecmp(value, "off") == 0)
|
||
+ _use_keyset_query_optimizer = false;
|
||
+ else
|
||
+ elog(ERROR, "Bad value for Key Set Query Optimizer (%s)", value);
|
||
+
|
||
+ return TRUE;
|
||
+ }
|
||
+
|
||
+ bool
|
||
+ show_ksqo()
|
||
+ {
|
||
+
|
||
+ if (_use_keyset_query_optimizer)
|
||
+ elog(NOTICE, "Key Set Query Optimizer is ON");
|
||
+ else
|
||
+ elog(NOTICE, "Key Set Query Optimizer is OFF");
|
||
+ return TRUE;
|
||
+ }
|
||
+
|
||
+ bool
|
||
+ reset_ksqo()
|
||
+ {
|
||
+ _use_keyset_query_optimizer = false;
|
||
return TRUE;
|
||
}
|
||
*** ./backend/optimizer/plan/planner.c.orig Sun Aug 30 04:28:02 1998
|
||
--- ./backend/optimizer/plan/planner.c Mon Aug 31 17:23:32 1998
|
||
***************
|
||
*** 69,74 ****
|
||
--- 69,75 ----
|
||
PlannerInitPlan = NULL;
|
||
PlannerPlanId = 0;
|
||
|
||
+ transformKeySetQuery(parse);
|
||
result_plan = union_planner(parse);
|
||
|
||
Assert(PlannerQueryLevel == 1);
|
||
*** ./backend/optimizer/prep/Makefile.orig Sun Apr 5 20:23:48 1998
|
||
--- ./backend/optimizer/prep/Makefile Mon Aug 31 17:23:32 1998
|
||
***************
|
||
*** 13,19 ****
|
||
|
||
CFLAGS += -I../..
|
||
|
||
! OBJS = prepqual.o preptlist.o prepunion.o
|
||
|
||
# not ready yet: predmig.o xfunc.o
|
||
|
||
--- 13,19 ----
|
||
|
||
CFLAGS += -I../..
|
||
|
||
! OBJS = prepqual.o preptlist.o prepunion.o prepkeyset.o
|
||
|
||
# not ready yet: predmig.o xfunc.o
|
||
|
||
*** ./backend/optimizer/prep/prepkeyset.c.orig Mon Aug 31 17:23:32 1998
|
||
--- ./backend/optimizer/prep/prepkeyset.c Mon Aug 31 18:30:58 1998
|
||
***************
|
||
*** 0 ****
|
||
--- 1,213 ----
|
||
+ /*-------------------------------------------------------------------------
|
||
+ *
|
||
+ * prepkeyset.c--
|
||
+ * Special preperation for keyset queries.
|
||
+ *
|
||
+ * Copyright (c) 1994, Regents of the University of California
|
||
+ *
|
||
+ *-------------------------------------------------------------------------
|
||
+ */
|
||
+ #include <stdio.h>
|
||
+ #include <string.h>
|
||
+
|
||
+ #include "postgres.h"
|
||
+ #include "nodes/pg_list.h"
|
||
+ #include "nodes/parsenodes.h"
|
||
+ #include "utils/elog.h"
|
||
+
|
||
+ #include "nodes/nodes.h"
|
||
+ #include "nodes/execnodes.h"
|
||
+ #include "nodes/plannodes.h"
|
||
+ #include "nodes/primnodes.h"
|
||
+ #include "nodes/relation.h"
|
||
+
|
||
+ #include "catalog/pg_type.h"
|
||
+ #include "lib/stringinfo.h"
|
||
+ #include "optimizer/planmain.h"
|
||
+ /*
|
||
+ * Node_Copy--
|
||
+ * a macro to simplify calling of copyObject on the specified field
|
||
+ */
|
||
+ #define Node_Copy(from, newnode, field) newnode->field = copyObject(from->field)
|
||
+
|
||
+ /***** DEBUG stuff
|
||
+ #define TABS {int i; printf("\n"); for (i = 0; i<level; i++) printf("\t"); }
|
||
+ static int level = 0;
|
||
+ ******/
|
||
+
|
||
+ bool _use_keyset_query_optimizer = FALSE;
|
||
+
|
||
+ static int inspectOpNode(Expr *expr);
|
||
+ static int inspectAndNode(Expr *expr);
|
||
+ static int inspectOrNode(Expr *expr);
|
||
+
|
||
+ /**********************************************************************
|
||
+ * This routine transforms query trees with the following form:
|
||
+ * SELECT a,b, ... FROM one_table WHERE
|
||
+ * (v1 = const1 AND v2 = const2 [ vn = constn ]) OR
|
||
+ * (v1 = const3 AND v2 = const4 [ vn = constn ]) OR
|
||
+ * (v1 = const5 AND v2 = const6 [ vn = constn ]) OR
|
||
+ * ...
|
||
+ * [(v1 = constn AND v2 = constn [ vn = constn ])]
|
||
+ *
|
||
+ * into
|
||
+ *
|
||
+ * SELECT a,b, ... FROM one_table WHERE
|
||
+ * (v1 = const1 AND v2 = const2 [ vn = constn ]) UNION
|
||
+ * SELECT a,b, ... FROM one_table WHERE
|
||
+ * (v1 = const3 AND v2 = const4 [ vn = constn ]) UNION
|
||
+ * SELECT a,b, ... FROM one_table WHERE
|
||
+ * (v1 = const5 AND v2 = const6 [ vn = constn ]) UNION
|
||
+ * ...
|
||
+ * SELECT a,b, ... FROM one_table WHERE
|
||
+ * [(v1 = constn AND v2 = constn [ vn = constn ])]
|
||
+ *
|
||
+ *
|
||
+ * To qualify for transformation the query must not be a sub select,
|
||
+ * a HAVING, or a GROUP BY. It must be a single table and have KSQO
|
||
+ * set to 'on'.
|
||
+ *
|
||
+ * The primary use of this transformation is to avoid the exponrntial
|
||
+ * memory consumption of cnfify() and to make use of index access
|
||
+ * methods.
|
||
+ *
|
||
+ * daveh@insightdist.com 1998-08-31
|
||
+ *
|
||
+ * Needs to better identify the signeture WHERE clause.
|
||
+ * May want to also prune out duplicate where clauses.
|
||
+ **********************************************************************/
|
||
+ void
|
||
+ transformKeySetQuery(Query *origNode)
|
||
+ {
|
||
+ /* Qualify as a key set query candidate */
|
||
+ if (_use_keyset_query_optimizer == FALSE ||
|
||
+ origNode->groupClause ||
|
||
+ origNode->havingQual ||
|
||
+ origNode->hasAggs ||
|
||
+ origNode->utilityStmt ||
|
||
+ origNode->unionClause ||
|
||
+ origNode->unionall ||
|
||
+ origNode->hasSubLinks ||
|
||
+ origNode->commandType != CMD_SELECT)
|
||
+ return;
|
||
+
|
||
+ /* Qualify single table query */
|
||
+
|
||
+ /* Qualify where clause */
|
||
+ if ( ! inspectOrNode((Expr*)origNode->qual)) {
|
||
+ return;
|
||
+ }
|
||
+
|
||
+ /* Copy essential elements into a union node */
|
||
+ /*
|
||
+ elog(NOTICE, "OR_EXPR=%d, OP_EXPR=%d, AND_EXPR=%d", OR_EXPR, OP_EXPR, AND_EXPR);
|
||
+ elog(NOTICE, "T_List=%d, T_Expr=%d, T_Var=%d, T_Const=%d", T_List, T_Expr, T_Var, T_Const);
|
||
+ elog(NOTICE, "opType=%d", ((Expr*)origNode->qual)->opType);
|
||
+ */
|
||
+ while (((Expr*)origNode->qual)->opType == OR_EXPR) {
|
||
+ Query *unionNode = makeNode(Query);
|
||
+
|
||
+ /* Pull up Expr = */
|
||
+ unionNode->qual = lsecond(((Expr*)origNode->qual)->args);
|
||
+
|
||
+ /* Pull up balance of tree */
|
||
+ origNode->qual = lfirst(((Expr*)origNode->qual)->args);
|
||
+
|
||
+ /*
|
||
+ elog(NOTICE, "origNode: opType=%d, nodeTag=%d", ((Expr*)origNode->qual)->opType, nodeTag(origNode->qual));
|
||
+ elog(NOTICE, "unionNode: opType=%d, nodeTag=%d", ((Expr*)unionNode->qual)->opType, nodeTag(unionNode->qual));
|
||
+ */
|
||
+
|
||
+ unionNode->commandType = origNode->commandType;
|
||
+ unionNode->resultRelation = origNode->resultRelation;
|
||
+ unionNode->isPortal = origNode->isPortal;
|
||
+ unionNode->isBinary = origNode->isBinary;
|
||
+
|
||
+ if (origNode->uniqueFlag)
|
||
+ unionNode->uniqueFlag = pstrdup(origNode->uniqueFlag);
|
||
+
|
||
+ Node_Copy(origNode, unionNode, sortClause);
|
||
+ Node_Copy(origNode, unionNode, rtable);
|
||
+ Node_Copy(origNode, unionNode, targetList);
|
||
+
|
||
+ origNode->unionClause = lappend(origNode->unionClause, unionNode);
|
||
+ }
|
||
+ return;
|
||
+ }
|
||
+
|
||
+
|
||
+
|
||
+
|
||
+ static int
|
||
+ inspectOrNode(Expr *expr)
|
||
+ {
|
||
+ int fr = 0, sr = 0;
|
||
+ Expr *firstExpr, *secondExpr;
|
||
+
|
||
+ if ( ! (expr && nodeTag(expr) == T_Expr && expr->opType == OR_EXPR))
|
||
+ return 0;
|
||
+
|
||
+ firstExpr = lfirst(expr->args);
|
||
+ secondExpr = lsecond(expr->args);
|
||
+ if (nodeTag(firstExpr) != T_Expr || nodeTag(secondExpr) != T_Expr)
|
||
+ return 0;
|
||
+
|
||
+ if (firstExpr->opType == OR_EXPR)
|
||
+ fr = inspectOrNode(firstExpr);
|
||
+ else if (firstExpr->opType == OP_EXPR) /* Need to make sure it is last */
|
||
+ fr = inspectOpNode(firstExpr);
|
||
+ else if (firstExpr->opType == AND_EXPR) /* Need to make sure it is last */
|
||
+ fr = inspectAndNode(firstExpr);
|
||
+
|
||
+
|
||
+ if (secondExpr->opType == AND_EXPR)
|
||
+ sr = inspectAndNode(secondExpr);
|
||
+ else if (secondExpr->opType == OP_EXPR)
|
||
+ sr = inspectOpNode(secondExpr);
|
||
+
|
||
+ return (fr && sr);
|
||
+ }
|
||
+
|
||
+
|
||
+ static int
|
||
+ inspectAndNode(Expr *expr)
|
||
+ {
|
||
+ int fr = 0, sr = 0;
|
||
+ Expr *firstExpr, *secondExpr;
|
||
+
|
||
+ if ( ! (expr && nodeTag(expr) == T_Expr && expr->opType == AND_EXPR))
|
||
+ return 0;
|
||
+
|
||
+ firstExpr = lfirst(expr->args);
|
||
+ secondExpr = lsecond(expr->args);
|
||
+ if (nodeTag(firstExpr) != T_Expr || nodeTag(secondExpr) != T_Expr)
|
||
+ return 0;
|
||
+
|
||
+ if (firstExpr->opType == AND_EXPR)
|
||
+ fr = inspectAndNode(firstExpr);
|
||
+ else if (firstExpr->opType == OP_EXPR)
|
||
+ fr = inspectOpNode(firstExpr);
|
||
+
|
||
+ if (secondExpr->opType == OP_EXPR)
|
||
+ sr = inspectOpNode(secondExpr);
|
||
+
|
||
+ return (fr && sr);
|
||
+ }
|
||
+
|
||
+
|
||
+ static int
|
||
+ /******************************************************************
|
||
+ * Return TRUE if T_Var = T_Const, else FALSE
|
||
+ * Actually it does not test for =. Need to do this!
|
||
+ ******************************************************************/
|
||
+ inspectOpNode(Expr *expr)
|
||
+ {
|
||
+ Expr *firstExpr, *secondExpr;
|
||
+
|
||
+ if (nodeTag(expr) != T_Expr || expr->opType != OP_EXPR)
|
||
+ return 0;
|
||
+
|
||
+ firstExpr = lfirst(expr->args);
|
||
+ secondExpr = lsecond(expr->args);
|
||
+ return (firstExpr && secondExpr && nodeTag(firstExpr) == T_Var && nodeTag(secondExpr) == T_Const);
|
||
+ }
|
||
*** ./include/commands/variable.h.orig Thu Jul 30 19:27:05 1998
|
||
--- ./include/commands/variable.h Mon Aug 31 17:23:32 1998
|
||
***************
|
||
*** 54,58 ****
|
||
--- 54,61 ----
|
||
extern bool show_geqo(void);
|
||
extern bool reset_geqo(void);
|
||
extern bool parse_geqo(const char *);
|
||
+ extern bool show_ksqo(void);
|
||
+ extern bool reset_ksqo(void);
|
||
+ extern bool parse_ksqo(const char *);
|
||
|
||
#endif /* VARIABLE_H */
|
||
*** ./include/optimizer/planmain.h.orig Mon Aug 31 18:27:03 1998
|
||
--- ./include/optimizer/planmain.h Mon Aug 31 18:26:04 1998
|
||
***************
|
||
*** 67,71 ****
|
||
--- 67,72 ----
|
||
extern List *check_having_qual_for_aggs(Node *clause,
|
||
List *subplanTargetList, List *groupClause);
|
||
extern List *check_having_qual_for_vars(Node *clause, List *targetlist_so_far);
|
||
+ extern void transformKeySetQuery(Query *origNode);
|
||
|
||
#endif /* PLANMAIN_H */
|
||
|
||
--------------BEFD1E6DA78A2DC20B524E32--
|
||
|
||
|
||
|
||
From daveh@insightdist.com Thu Sep 3 12:34:48 1998
|
||
Received: from u1.abs.net (root@u1.abs.net [207.114.0.131])
|
||
by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA07696
|
||
for <maillist@candle.pha.pa.us>; Thu, 3 Sep 1998 12:34:46 -0400 (EDT)
|
||
Received: from insightdist.com (nobody@localhost)
|
||
by u1.abs.net (8.9.0/8.9.0) with UUCP id MAA23590
|
||
for maillist@candle.pha.pa.us; Thu, 3 Sep 1998 12:17:44 -0400 (EDT)
|
||
X-Authentication-Warning: u1.abs.net: nobody set sender to insightdist.com!daveh using -f
|
||
Received: from ceodev by insightdist.com (AIX 3.2/UCB 5.64/4.03)
|
||
id AA56436; Thu, 3 Sep 1998 11:51:24 -0400
|
||
Received: from daveh by ceodev (AIX 4.1/UCB 5.64/4.03)
|
||
id AA45986; Thu, 3 Sep 1998 11:51:24 -0400
|
||
Message-Id: <35EEBBEF.2158F68A@insightdist.com>
|
||
Date: Thu, 03 Sep 1998 11:55:28 -0400
|
||
From: David Hartwig <daveh@insightdist.com>
|
||
Organization: Insight Distribution Systems
|
||
X-Mailer: Mozilla 4.05 [en] (Win95; I)
|
||
Mime-Version: 1.0
|
||
To: Bruce Momjian <maillist@candle.pha.pa.us>
|
||
Cc: David Hartwig <daybee@bellatlantic.net>, pgsql-patches@postgreSQL.org
|
||
Subject: Re: [PATCHES] Interim AND/OR memory exaustion fix.
|
||
References: <199809030236.WAA22888@candle.pha.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
Status: RO
|
||
|
||
|
||
|
||
Bruce Momjian wrote:
|
||
|
||
> > I will be cleaning this up more before the Oct 1 deadline.
|
||
>
|
||
> > *** ./backend/commands/variable.c.orig Thu Jul 30 19:25:26 1998
|
||
> > --- ./backend/commands/variable.c Mon Aug 31 17:23:32 1998
|
||
>
|
||
> Applied. Let's keep talking to see if we can come up with a nice
|
||
> general solution to this.
|
||
>
|
||
|
||
Agreed.
|
||
|
||
> I have been thinking, and the trouble case is a query that uses only one
|
||
> table, and had only "column = value" statements. I believe this can be
|
||
> easily identified and reworked somehow.
|
||
>
|
||
|
||
If you are referring to the AND'less set of OR's, I do have plans to not let
|
||
that qualify since you have gotten the index scan working with OR's.
|
||
|
||
I also think that the qualification process should be tightened up. For
|
||
example force the number of AND's to be the same in each OR grouping. And
|
||
have at least n OR's to qualify. We just need to head off the memory
|
||
exhaustion.
|
||
|
||
> Your subtable idea may be a good one.
|
||
>
|
||
|
||
This sounds like a 6.5 thing. I needed to stop the bleeding for 6.4.
|
||
|
||
|
||
From bga@mug.org Tue Sep 8 03:39:37 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA06237
|
||
for <maillist@candle.pha.pa.us>; Tue, 8 Sep 1998 03:39:36 -0400 (EDT)
|
||
Received: from bgalli.mug.org (bajor.mug.org [207.158.132.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id DAA03648 for <maillist@candle.pha.pa.us>; Tue, 8 Sep 1998 03:38:52 -0400 (EDT)
|
||
Received: from localhost (bga@localhost) by bgalli.mug.org (8.8.7/SCO5) with SMTP id DAA02895 for <maillist@candle.pha.pa.us>; Tue, 8 Sep 1998 03:31:26 -0400 (EDT)
|
||
Message-Id: <199809080731.DAA02895@bgalli.mug.org>
|
||
X-Authentication-Warning: bgalli.mug.org: bga@localhost didn't use HELO protocol
|
||
X-Mailer: exmh version 2.0.2 2/24/98
|
||
From: "Billy G. Allie" <Bill.Allie@mug.org>
|
||
Reply-To: "Billy G. Allie" <Bill.Allie@mug.org>
|
||
To: Bruce Momjian <maillist@candle.pha.pa.us>
|
||
Subject: Re: [HACKERS] flock patch breaks things here
|
||
In-reply-to: Your message of "Mon, 31 Aug 1998 00:36:34 EDT."
|
||
<199808310436.AAA07618@candle.pha.pa.us>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Date: Tue, 08 Sep 1998 03:31:26 -0400
|
||
Sender: bga@mug.org
|
||
Status: ROr
|
||
|
||
Bruce Momjian writes:
|
||
|
||
> I have been thinking about this. First, we can easily use fopen(r+) to
|
||
> check to see if the file exists, and if it does read the pid and do a
|
||
> kill -0 to see if it is running. If no one else does it, I will take it
|
||
> on.
|
||
|
||
It is better to use open with the O_CREAT and O_EXCL set. If the file does not
|
||
exist it will be created and the PID can be written to it. If the file exists
|
||
then the call will fail, at which point it can be opened with fread, and the
|
||
PID it contains can be checked to see if it still exists with kill. The open
|
||
call has the added advantage that 'The check for the existence of the file and
|
||
the creation of the file if it does not exist is atomic with respect to other
|
||
processes executing open naming the same filename in the same directory with
|
||
O_EXCL and O_CREAT set.' [from the UnixAWare 7 man page, open(2)].
|
||
|
||
Also, you can't just delete the file, create it and write the your PID to it
|
||
and assume that you have the lock, you need to close the file, sleep some
|
||
small amount of time and then open and read the file to see if you still have
|
||
the lock. If you like, I can take this task on.
|
||
|
||
Oh, the postmaster must clear the PID when it exits.
|
||
|
||
>
|
||
> Second, where to put the pid file. There is reason to put in /tmp,
|
||
> because it will get cleared in a reboot, and because it is locking the
|
||
> port number 5432. There is also reason to put it in /data because you
|
||
> can't have more than one postmaster running on a single data directory.
|
||
>
|
||
> So, we really want to lock both places. If this is going to make it
|
||
> easier for people to run more than one postmaster, because it will
|
||
> prevent/warn administrators when they try and put two postmasters in the
|
||
> same data dir or port, I say create the pid lock files both places, and
|
||
> give the admin a clear description of what he is doing wrong in each
|
||
> case.
|
||
|
||
IHMO, the pid should be put in the data directory. The reasoning that it will get cleared in a reboot is not sufficent since the logic used to create the PID file will delete it if the PID it contains is not a running process. Besides, I have used systems where /tmp was not cleared out on a re-boot (for various reasons). Also, I would rather have a script that explicitly removes the PID locking file at system statup (if it exists), in which case, it doesn't matter where it resides.
|
||
--
|
||
____ | Billy G. Allie | Domain....: Bill.Allie@mug.org
|
||
| /| | 7436 Hartwell | Compuserve: 76337,2061
|
||
|-/-|----- | Dearborn, MI 48126| MSN.......: B_G_Allie@email.msn.com
|
||
|/ |LLIE | (313) 582-1540 |
|
||
|
||
|
||
|
||
From owner-pgsql-general@hub.org Thu Oct 1 14:00:57 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA12443
|
||
for <maillist@candle.pha.pa.us>; Thu, 1 Oct 1998 14:00:56 -0400 (EDT)
|
||
Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id NAA07930 for <maillist@candle.pha.pa.us>; Thu, 1 Oct 1998 13:57:47 -0400 (EDT)
|
||
Received: from localhost (majordom@localhost)
|
||
by hub.org (8.8.8/8.8.8) with SMTP id NAA26913;
|
||
Thu, 1 Oct 1998 13:56:29 -0400 (EDT)
|
||
(envelope-from owner-pgsql-general@hub.org)
|
||
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 01 Oct 1998 13:55:56 +0000 (EDT)
|
||
Received: (from majordom@localhost)
|
||
by hub.org (8.8.8/8.8.8) id NAA26856
|
||
for pgsql-general-outgoing; Thu, 1 Oct 1998 13:55:54 -0400 (EDT)
|
||
(envelope-from owner-pgsql-general@postgreSQL.org)
|
||
X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-general@postgreSQL.org using -f
|
||
Received: from mail.utexas.edu (wb3-a.mail.utexas.edu [128.83.126.138])
|
||
by hub.org (8.8.8/8.8.8) with SMTP id NAA26840
|
||
for <pgsql-general@hub.org>; Thu, 1 Oct 1998 13:55:49 -0400 (EDT)
|
||
(envelope-from taral@mail.utexas.edu)
|
||
Received: (qmail 1198 invoked by uid 0); 1 Oct 1998 17:55:40 -0000
|
||
Received: from dial-24-13.ots.utexas.edu (HELO taral) (128.83.128.157)
|
||
by umbs-smtp-3 with SMTP; 1 Oct 1998 17:55:40 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: <pgsql-general@hub.org>
|
||
Subject: [GENERAL] CNF vs DNF
|
||
Date: Thu, 1 Oct 1998 12:55:39 -0500
|
||
Message-ID: <000001bded64$b34b2200$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
In-Reply-To: <F10BB1FAF801D111829B0060971D839F445B3D@cpsmail>
|
||
Importance: Normal
|
||
Sender: owner-pgsql-general@postgreSQL.org
|
||
Precedence: bulk
|
||
Status: RO
|
||
|
||
> select * from aa where (bb = 2 and ff = 3) or (bb = 4 and ff = 5);
|
||
|
||
I've been told that the system restructures these in CNF (conjunctive normal
|
||
form)... i.e. the above query turns into:
|
||
|
||
select * from aa where (bb = 2 or bb = 4) and (ff = 3 or bb = 4) and (bb = 2
|
||
or ff = 5) and (ff = 3 or ff = 5);
|
||
|
||
Much longer and much less efficient, AFAICT. Isn't it more efficient to do a
|
||
union of many queries (DNF) than an intersection of many subqueries (CNF)?
|
||
Certainly remembering the subqueries takes less memory... Also, queries
|
||
already in DNF are probably more common than queries in CNF, requiring less
|
||
rewrite.
|
||
|
||
Can someone clarify this?
|
||
|
||
Taral
|
||
|
||
|
||
|
||
From taral@mail.utexas.edu Fri Oct 2 01:35:42 1998
|
||
Received: from mail.utexas.edu (wb1-a.mail.utexas.edu [128.83.126.134])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id BAA28231
|
||
for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 01:35:27 -0400 (EDT)
|
||
Received: (qmail 16318 invoked by uid 0); 2 Oct 1998 05:35:13 -0000
|
||
Received: from dial-42-8.ots.utexas.edu (HELO taral) (128.83.111.216)
|
||
by umbs-smtp-1 with SMTP; 2 Oct 1998 05:35:13 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: "Bruce Momjian" <maillist@candle.pha.pa.us>
|
||
Cc: <pgsql-general@postgreSQL.org>
|
||
Subject: RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
Date: Fri, 2 Oct 1998 00:35:12 -0500
|
||
Message-ID: <000001bdedc6$6cf75d20$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <199810020218.WAA23299@candle.pha.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
Status: ROr
|
||
|
||
> It currently convert to CNF so it can select the most restrictive
|
||
> restriction and join, and use those first. However, the CNF conversion
|
||
> is a memory exploder for some queries, and we certainly need to have
|
||
> another method to split up those queries into UNIONS. I think we need
|
||
> to code to identify those queries capable of being converted to UNIONS,
|
||
> and do that before the query gets to the CNF section. That would be
|
||
> great, and David Hartwig has implemented a limited capability of doing
|
||
> this, but we really need a general routine to do this with 100%
|
||
> reliability.
|
||
|
||
Well, if you're talking about a routine to generate a heuristic for CNF vs.
|
||
DNF, it is possible to precalculate the query sizes for CNF and DNF
|
||
rewrites...
|
||
|
||
For conversion to CNF:
|
||
|
||
At every node:
|
||
|
||
if nodeType = AND then f(node) = f(left) + f(right)
|
||
if nodeType = OR then f(node) = f(left) * f(right)
|
||
|
||
f(root) = a reasonably (but not wonderful) metric
|
||
|
||
For DNF just switch AND and OR in the above. You may want to compute both
|
||
metrics and compare... take the smaller one and use that path.
|
||
|
||
How to deal with other operators depends on their implementation...
|
||
|
||
Taral
|
||
|
||
|
||
From taral@mail.utexas.edu Fri Oct 2 12:48:27 1998
|
||
Received: from mail.utexas.edu (wb4-a.mail.utexas.edu [128.83.126.140])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA11438
|
||
for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 12:48:25 -0400 (EDT)
|
||
Received: (qmail 15628 invoked by uid 0); 2 Oct 1998 16:47:50 -0000
|
||
Received: from dial-42-8.ots.utexas.edu (HELO taral) (128.83.111.216)
|
||
by umbs-smtp-4 with SMTP; 2 Oct 1998 16:47:50 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: "Bruce Momjian" <maillist@candle.pha.pa.us>
|
||
Cc: <hackers@postgreSQL.org>
|
||
Subject: RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
Date: Fri, 2 Oct 1998 11:47:48 -0500
|
||
Message-ID: <000301bdee24$63308740$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-reply-to: <199810021640.MAA10925@candle.pha.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
Status: RO
|
||
|
||
> > Create a temporary oid hash? (for each table selected on, I guess)
|
||
>
|
||
> What I did with indexes was to run the previous OR clause index
|
||
> restrictions through the qualification code, and make sure it failed,
|
||
> but I am not sure how that is going to work with a more complex WHERE
|
||
> clause. Perhaps I need to restrict this to just simple cases of
|
||
> constants, which are easy to pick out an run through. Doing this with
|
||
> joins would be very hard, I think.
|
||
|
||
Actually, I was thinking more of an index of returned rows... After each
|
||
subquery, the backend would check each row to see if it was already in the
|
||
index... Simple duplicate check, in other words. Of course, I don't know how
|
||
well this would behave with large tables being returned...
|
||
|
||
Anyone else have some ideas they want to throw in?
|
||
|
||
Taral
|
||
|
||
|
||
From taral@mail.utexas.edu Fri Oct 2 17:13:01 1998
|
||
Received: from mail.utexas.edu (wb1-a.mail.utexas.edu [128.83.126.134])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id RAA20838
|
||
for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 17:12:27 -0400 (EDT)
|
||
Received: (qmail 17418 invoked by uid 0); 2 Oct 1998 21:12:19 -0000
|
||
Received: from dial-46-30.ots.utexas.edu (HELO taral) (128.83.112.158)
|
||
by umbs-smtp-1 with SMTP; 2 Oct 1998 21:12:19 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: "Bruce Momjian" <maillist@candle.pha.pa.us>, <jwieck@debis.com>
|
||
Cc: <hackers@postgreSQL.org>
|
||
Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
Date: Fri, 2 Oct 1998 16:12:19 -0500
|
||
Message-ID: <000001bdee49$56c7cd40$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-reply-to: <199810021758.NAA15524@candle.pha.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
Status: ROr
|
||
|
||
> Another idea is that we rewrite queries such as:
|
||
>
|
||
> SELECT *
|
||
> FROM tab
|
||
> WHERE (a=1 AND b=2 AND c=3) OR
|
||
> (a=1 AND b=2 AND c=4) OR
|
||
> (a=1 AND b=2 AND c=5) OR
|
||
> (a=1 AND b=2 AND c=6)
|
||
>
|
||
> into:
|
||
>
|
||
> SELECT *
|
||
> FROM tab
|
||
> WHERE (a=1 AND b=2) AND (c=3 OR c=4 OR c=5 OR c=6)
|
||
|
||
Very nice, but that's like trying to code factorization of numbers... not
|
||
pretty, and very CPU intensive on complex queries...
|
||
|
||
Taral
|
||
|
||
|
||
From taral@mail.utexas.edu Fri Oct 2 17:49:59 1998
|
||
Received: from mail.utexas.edu (wb2-a.mail.utexas.edu [128.83.126.136])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id RAA21488
|
||
for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 17:49:52 -0400 (EDT)
|
||
Received: (qmail 23729 invoked by uid 0); 2 Oct 1998 21:49:27 -0000
|
||
Received: from dial-2-6.ots.utexas.edu (HELO taral) (128.83.204.22)
|
||
by umbs-smtp-2 with SMTP; 2 Oct 1998 21:49:27 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: "Bruce Momjian" <maillist@candle.pha.pa.us>
|
||
Cc: <jwieck@debis.com>, <hackers@postgreSQL.org>
|
||
Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
Date: Fri, 2 Oct 1998 16:49:26 -0500
|
||
Message-ID: <000001bdee4e$86688b20$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <199810022139.RAA21082@candle.pha.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
Status: ROr
|
||
|
||
> > Very nice, but that's like trying to code factorization of
|
||
> numbers... not
|
||
> > pretty, and very CPU intensive on complex queries...
|
||
>
|
||
> Yes, but how large are the WHERE clauses going to be? Considering the
|
||
> cost of cnfify() and UNION, it seems like a clear win. Is it general
|
||
> enough to solve our problems?
|
||
|
||
Could be... the examples I received where the cnfify() was really bad were
|
||
cases where the query was submitted alredy in DNF... and where the UNION was
|
||
a simple one. However, I don't know of any algorithms for generic
|
||
simplification of logical constraints. One problem is resolution/selection
|
||
of factors:
|
||
|
||
SELECT * FROM a WHERE (a = 1 AND b = 2 AND c = 3) OR (a = 4 AND b = 2 AND c
|
||
= 3) OR (a = 1 AND b = 5 AND c = 3) OR (a = 1 AND b = 2 AND c = 6);
|
||
|
||
Try that on for size. You can understand why that code gets ugly, fast.
|
||
Somebody could try coding it, but it's not a clear win to me.
|
||
|
||
My original heuristic was missing one thing: "Where the heuristic fails to
|
||
process or decide, default to CNF." Since that's the current behavior, we're
|
||
less likely to break things.
|
||
|
||
Taral
|
||
|
||
|
||
From owner-pgsql-hackers@hub.org Fri Oct 2 19:28:09 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA23341
|
||
for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 19:28:08 -0400 (EDT)
|
||
Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id SAA18003 for <maillist@candle.pha.pa.us>; Fri, 2 Oct 1998 18:21:37 -0400 (EDT)
|
||
Received: from localhost (majordom@localhost)
|
||
by hub.org (8.8.8/8.8.8) with SMTP id SAA01250;
|
||
Fri, 2 Oct 1998 18:08:02 -0400 (EDT)
|
||
(envelope-from owner-pgsql-hackers@hub.org)
|
||
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 02 Oct 1998 18:04:37 +0000 (EDT)
|
||
Received: (from majordom@localhost)
|
||
by hub.org (8.8.8/8.8.8) id SAA00847
|
||
for pgsql-hackers-outgoing; Fri, 2 Oct 1998 18:04:35 -0400 (EDT)
|
||
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
||
X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f
|
||
Received: from mail.utexas.edu (wb2-a.mail.utexas.edu [128.83.126.136])
|
||
by hub.org (8.8.8/8.8.8) with SMTP id SAA00806
|
||
for <hackers@postgreSQL.org>; Fri, 2 Oct 1998 18:04:26 -0400 (EDT)
|
||
(envelope-from taral@mail.utexas.edu)
|
||
Received: (qmail 29662 invoked by uid 0); 2 Oct 1998 22:04:25 -0000
|
||
Received: from dial-2-6.ots.utexas.edu (HELO taral) (128.83.204.22)
|
||
by umbs-smtp-2 with SMTP; 2 Oct 1998 22:04:25 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: "Bruce Momjian" <maillist@candle.pha.pa.us>
|
||
Cc: <jwieck@debis.com>, <hackers@postgreSQL.org>
|
||
Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
Date: Fri, 2 Oct 1998 17:04:24 -0500
|
||
Message-ID: <000201bdee50$9d9c4320$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <199810022157.RAA21769@candle.pha.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
Sender: owner-pgsql-hackers@postgreSQL.org
|
||
Precedence: bulk
|
||
Status: ROr
|
||
|
||
> How do we do that with UNION, and return the right rows. Seems the
|
||
> _join_ happending multiple times would be much worse than the factoring.
|
||
|
||
Ok... We have two problems:
|
||
|
||
1) DNF for unjoined queries.
|
||
2) Factorization for the rest.
|
||
|
||
I have some solutions for (1). Not for (2). Remember that unjoined queries
|
||
are quite common. :)
|
||
|
||
For (1), we can always try to parallel the multiple queries... especially in
|
||
the case where a sequential search is required.
|
||
|
||
Taral
|
||
|
||
|
||
|
||
From owner-pgsql-hackers@hub.org Sat Oct 3 23:32:35 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06644
|
||
for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 23:31:13 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id XAA26912 for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 23:14:01 -0400 (EDT)
|
||
Received: from localhost (majordom@localhost)
|
||
by hub.org (8.8.8/8.8.8) with SMTP id WAA04407;
|
||
Sat, 3 Oct 1998 22:07:05 -0400 (EDT)
|
||
(envelope-from owner-pgsql-hackers@hub.org)
|
||
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 03 Oct 1998 22:02:00 +0000 (EDT)
|
||
Received: (from majordom@localhost)
|
||
by hub.org (8.8.8/8.8.8) id WAA04010
|
||
for pgsql-hackers-outgoing; Sat, 3 Oct 1998 22:01:59 -0400 (EDT)
|
||
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
||
X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f
|
||
Received: from candle.pha.pa.us (maillist@s5-03.ppp.op.net [209.152.195.67])
|
||
by hub.org (8.8.8/8.8.8) with ESMTP id WAA03968
|
||
for <hackers@postgreSQL.org>; Sat, 3 Oct 1998 22:00:37 -0400 (EDT)
|
||
(envelope-from maillist@candle.pha.pa.us)
|
||
Received: (from maillist@localhost)
|
||
by candle.pha.pa.us (8.9.0/8.9.0) id VAA04640;
|
||
Sat, 3 Oct 1998 21:57:30 -0400 (EDT)
|
||
From: Bruce Momjian <maillist@candle.pha.pa.us>
|
||
Message-Id: <199810040157.VAA04640@candle.pha.pa.us>
|
||
Subject: Re: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
In-Reply-To: <000201bdee50$9d9c4320$3b291f0a@taral> from Taral at "Oct 2, 1998 5: 4:24 pm"
|
||
To: taral@mail.utexas.edu (Taral)
|
||
Date: Sat, 3 Oct 1998 21:57:30 -0400 (EDT)
|
||
Cc: jwieck@debis.com, hackers@postgreSQL.org
|
||
X-Mailer: ELM [version 2.4ME+ PL47 (25)]
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain; charset=US-ASCII
|
||
Content-Transfer-Encoding: 7bit
|
||
Sender: owner-pgsql-hackers@postgreSQL.org
|
||
Precedence: bulk
|
||
Status: RO
|
||
|
||
|
||
I have another idea.
|
||
|
||
When we cnfify, this:
|
||
|
||
(A AND B) OR (C AND D)
|
||
|
||
becomes
|
||
|
||
(A OR C) AND (A OR D) AND (B OR C) AND (B OR D)
|
||
|
||
however if A and C are identical, this could become:
|
||
|
||
(A OR A) AND (A OR D) AND (B OR A) AND (B OR D)
|
||
|
||
and A OR A is A:
|
||
|
||
A AND (A OR D) AND (B OR A) AND (B OR D)
|
||
|
||
and since we are now saying A has to be true, we can remove OR's with A:
|
||
|
||
A AND (B OR D)
|
||
|
||
Much smaller, and a big win for queries like:
|
||
|
||
SELECT *
|
||
FROM tab
|
||
WHERE (a=1 AND b=2) OR
|
||
(a=1 AND b=3)
|
||
|
||
This becomes:
|
||
|
||
(a=1) AND (b=2 OR b=3)
|
||
|
||
which is accurate, and uses our OR indexing.
|
||
|
||
Seems I could code cnfify() to look for identical qualifications in two
|
||
joined OR clauses and remove the duplicates.
|
||
|
||
Sound like big win, and fairly easy and inexpensive in processing time.
|
||
|
||
Comments?
|
||
|
||
--
|
||
Bruce Momjian | http://www.op.net/~candle
|
||
maillist@candle.pha.pa.us | (610) 853-3000
|
||
+ If your life is a hard drive, | 830 Blythe Avenue
|
||
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||
|
||
|
||
|
||
From taral@mail.utexas.edu Sat Oct 3 22:43:41 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA05961
|
||
for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 22:42:18 -0400 (EDT)
|
||
Received: from mail.utexas.edu (wb2-a.mail.utexas.edu [128.83.126.136]) by renoir.op.net (o1/$Revision: 1.6 $) with SMTP id WAA25111 for <maillist@candle.pha.pa.us>; Sat, 3 Oct 1998 22:27:34 -0400 (EDT)
|
||
Received: (qmail 25622 invoked by uid 0); 4 Oct 1998 02:26:21 -0000
|
||
Received: from dial-42-9.ots.utexas.edu (HELO taral) (128.83.111.217)
|
||
by umbs-smtp-2 with SMTP; 4 Oct 1998 02:26:21 -0000
|
||
From: "Taral" <taral@mail.utexas.edu>
|
||
To: "Bruce Momjian" <maillist@candle.pha.pa.us>
|
||
Cc: <jwieck@debis.com>, <hackers@postgreSQL.org>
|
||
Subject: RE: [HACKERS] RE: [GENERAL] Long update query ? (also Re: [GENERAL] CNF vs. DNF)
|
||
Date: Sat, 3 Oct 1998 21:26:20 -0500
|
||
Message-ID: <000501bdef3e$5f5293a0$3b291f0a@taral>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <199810040157.VAA04640@candle.pha.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3155.0
|
||
Status: ROr
|
||
|
||
> however if A and C are identical, this could become:
|
||
>
|
||
> (A OR A) AND (A OR D) AND (B OR A) AND (B OR D)
|
||
>
|
||
> and A OR A is A:
|
||
>
|
||
> A AND (A OR D) AND (B OR A) AND (B OR D)
|
||
>
|
||
> and since we are now saying A has to be true, we can remove OR's with A:
|
||
>
|
||
> A AND (B OR D)
|
||
|
||
Very nice... and you could do that after each iteration of the rewrite,
|
||
preventing the size from getting too big. :)
|
||
|
||
I have a symbolic expression tree evaluator that would be perfect for
|
||
this... I'll see if I can't adapt it.
|
||
|
||
Can someone mail me the structures for expression trees? I don't want to
|
||
have to excise them from the source. Please?
|
||
|
||
Taral
|
||
|
||
|
||
From daveh@insightdist.com Mon Nov 9 13:31:07 1998
|
||
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA00997
|
||
for <maillist@candle.pha.pa.us>; Mon, 9 Nov 1998 13:31:00 -0500 (EST)
|
||
Received: from u1.abs.net (root@u1.abs.net [207.114.0.131]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id NAA26657 for <maillist@candle.pha.pa.us>; Mon, 9 Nov 1998 13:10:14 -0500 (EST)
|
||
Received: from insightdist.com (nobody@localhost)
|
||
by u1.abs.net (8.9.0/8.9.0) with UUCP id MAA17710
|
||
for maillist@candle.pha.pa.us; Mon, 9 Nov 1998 12:52:05 -0500 (EST)
|
||
X-Authentication-Warning: u1.abs.net: nobody set sender to insightdist.com!daveh using -f
|
||
Received: from ceodev by insightdist.com (AIX 3.2/UCB 5.64/4.03)
|
||
id AA43498; Mon, 9 Nov 1998 12:38:24 -0500
|
||
Received: from daveh by ceodev (AIX 4.1/UCB 5.64/4.03)
|
||
id AA54446; Mon, 9 Nov 1998 12:38:23 -0500
|
||
Message-Id: <3647296F.6F7FDDD2@insightdist.com>
|
||
Date: Mon, 09 Nov 1998 12:42:07 -0500
|
||
From: David Hartwig <daveh@insightdist.com>
|
||
Organization: Insight Distribution Systems
|
||
X-Mailer: Mozilla 4.5 [en] (Win98; I)
|
||
X-Accept-Language: en
|
||
Mime-Version: 1.0
|
||
To: Bob Kruger <bkruger@mindspring.com>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>
|
||
Cc: pgsql-general@postgreSQL.org, Byron Nikolaidis <byronn@insightdist.com>
|
||
Subject: Re: [GENERAL] Incrementing a Serial Field
|
||
References: <3.0.5.32.19981109110757.0082c950@mindspring.com>
|
||
Content-Type: multipart/mixed;
|
||
boundary="------------3D3EE7F67DFC542D3928BB7E"
|
||
Status: ROr
|
||
|
||
This is a multi-part message in MIME format.
|
||
--------------3D3EE7F67DFC542D3928BB7E
|
||
Content-Type: multipart/alternative;
|
||
boundary="------------43E2CC34278FA08EFC9E0611"
|
||
|
||
|
||
--------------43E2CC34278FA08EFC9E0611
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
|
||
|
||
|
||
Bob Kruger wrote:
|
||
|
||
> The second question is that I noticed the ODBC bug (feature?) when linking
|
||
> Postgres to MS Access still exists. This bug occurs when linking a MS
|
||
> Access table to a Postgres table, and identifying more than one field as
|
||
> the unique record identifier. This makes Postgres run until it exhausts
|
||
> all available memory. Does anyone know a way around this? Enabling read
|
||
> only ODBC is a feature I would like to make available, but I do not want
|
||
> the possibility of postgres crashing because of an error on the part of a
|
||
> MS Access user.
|
||
>
|
||
> BTW - Having capability to be linked to an Access database is not an
|
||
> option. The current project I am working on calls for that, so it is a
|
||
> necessary evil that I hav to live with.
|
||
>
|
||
|
||
In the driver connection settings add the following line.
|
||
|
||
SET ksql TO 'on';
|
||
|
||
Stands for: keyset query optimization. This is not considered a final
|
||
solution. As such, it is undocumented. Some time in the next day or so, we
|
||
will be releasing a version of the driver which will automatically SET ksqo.
|
||
|
||
You will most likely be satisfied with the results. One problem with this
|
||
solution, however, is that it does not work if you have any (some kinds of?)
|
||
arrays in the table you are browsing. This is a sideffect of the rewrite to a
|
||
UNION which performs an internal sort unique.
|
||
|
||
Also, if you are using row versioning you may need to overload some operators
|
||
for xid and int4. I have included a script that will take care of this.
|
||
|
||
Bruce, can I get these operators hardcoded into 6.4.1- assuming there will be
|
||
one. The operators necessitated by the UNION sideffects.
|
||
|
||
|
||
--------------43E2CC34278FA08EFC9E0611
|
||
Content-Type: text/html; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
|
||
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
|
||
<html>
|
||
|
||
<p>Bob Kruger wrote:
|
||
<blockquote TYPE=CITE>The second question is that I noticed the ODBC bug
|
||
(feature?) when linking
|
||
<br>Postgres to MS Access still exists. This bug occurs when linking
|
||
a MS
|
||
<br>Access table to a Postgres table, and identifying more than one field
|
||
as
|
||
<br>the unique record identifier. This makes Postgres run until it
|
||
exhausts
|
||
<br>all available memory. Does anyone know a way around this?
|
||
Enabling read
|
||
<br>only ODBC is a feature I would like to make available, but I do not
|
||
want
|
||
<br>the possibility of postgres crashing because of an error on the part
|
||
of a
|
||
<br>MS Access user.
|
||
<p>BTW - Having capability to be linked to an Access database is not an
|
||
<br>option. The current project I am working on calls for that, so
|
||
it is a
|
||
<br>necessary evil that I hav to live with.
|
||
<br> </blockquote>
|
||
In the driver connection settings add the following line.
|
||
<p> <tt> SET ksql TO 'on';</tt><tt></tt>
|
||
<p>Stands for: keyset query optimization. This is not considered
|
||
a final solution. As such, it is undocumented. Some time
|
||
in the next day or so, we will be releasing a version of the driver which
|
||
will automatically SET ksqo.
|
||
<p>You will most likely be satisfied with the results. One
|
||
problem with this solution, however, is that it does not work if
|
||
you have any (some kinds of?) arrays in the table you are browsing.
|
||
This is a sideffect of the rewrite to a UNION which performs an internal
|
||
sort unique.
|
||
<p>Also, if you are using row versioning you may need to overload some
|
||
operators for xid and int4. I have included a script that will take
|
||
care of this.
|
||
<p>Bruce, can I get these operators hardcoded into 6.4.1- assuming there
|
||
will be one. The operators necessitated by the UNION
|
||
sideffects.
|
||
<br> </html>
|
||
|
||
--------------43E2CC34278FA08EFC9E0611--
|
||
|
||
--------------3D3EE7F67DFC542D3928BB7E
|
||
Content-Type: text/plain; charset=us-ascii;
|
||
name="xidint4.sql"
|
||
Content-Transfer-Encoding: 7bit
|
||
Content-Disposition: inline;
|
||
filename="xidint4.sql"
|
||
|
||
-- Insight Distribution Systems - System V - Apr 1998
|
||
-- @(#)xidint4.sql 1.2 :/sccs/sql/extend/s.xidint4.sql 10/2/98 13:40:19"
|
||
|
||
create function int4eq(xid,int4)
|
||
returns bool
|
||
as ''
|
||
language 'internal';
|
||
|
||
create operator = (
|
||
leftarg=xid,
|
||
rightarg=int4,
|
||
procedure=int4eq,
|
||
commutator='=',
|
||
negator='<>',
|
||
restrict=eqsel,
|
||
join=eqjoinsel
|
||
);
|
||
|
||
create function int4lt(xid,xid)
|
||
returns bool
|
||
as ''
|
||
language 'internal';
|
||
|
||
create operator < (
|
||
leftarg=xid,
|
||
rightarg=xid,
|
||
procedure=int4lt,
|
||
commutator='=',
|
||
negator='<>',
|
||
restrict=eqsel,
|
||
join=eqjoinsel
|
||
);
|
||
|
||
|
||
|
||
--------------3D3EE7F67DFC542D3928BB7E--
|
||
|
||
|