mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-12 18:34:36 +08:00
11370 lines
488 KiB
Plaintext
From pgsql-hackers-owner+M174@hub.org Sun Mar 12 22:31:11 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA25886
    for <pgman@candle.pha.pa.us>; Sun, 12 Mar 2000 23:31:10 -0500 (EST)
Received: from news.tht.net (news.hub.org [216.126.91.242]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA04589 for <pgman@candle.pha.pa.us>; Sun, 12 Mar 2000 23:19:33 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1])
    by news.tht.net (8.9.3/8.9.3) with SMTP id XAA42854;
    Sun, 12 Mar 2000 23:05:05 -0500 (EST)
    (envelope-from pgsql-hackers-owner+M174@hub.org)
Received: from candle.pha.pa.us (root@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id XAA95917
    for <pgsql-hackers@postgreSQL.org>; Sun, 12 Mar 2000 23:00:56 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id WAA25403
    for pgsql-hackers@postgreSQL.org; Sun, 12 Mar 2000 22:59:56 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200003130359.WAA25403@candle.pha.pa.us>
Subject: [HACKERS] Fix for RENAME
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: Sun, 12 Mar 2000 22:59:56 -0500 (EST)
X-Mailer: ELM [version 2.4ME+ PL72 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

I have thought about the issue with ALTER TABLE RENAME and keeping the
file system in sync with the database.

It seems there are three commands that can cause these to get out of
sync:

    CREATE TABLE/INDEX
    DROP TABLE/INDEX
    ALTER TABLE RENAME

Now, if we had file names based only on the oid, we could eliminate file
renaming for RENAME, but the others are still a problem.

Seems there are three ways to get out of sync:

    ABORT transaction
    backend crash
    OS crash

The last two are the same, except the backend crash restarts the
postmaster, while the OS crash has the postmaster starting up normally.

Here is my idea.  Create a C List of file names to unlink on transaction
commit or abort.  For CREATE, unlink created files on transaction ABORT.
For DROP, unlink dropped files on COMMIT.  For RENAME, create a hard
link for the new table linked to old table, and unlink the old file name
on COMMIT or the new file on ABORT.
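
The commit/abort list described above can be sketched in C roughly as
follows. This is only an illustration under stated assumptions: the names
PendingUnlink, register_unlink, at_xact_end and fake_unlink are hypothetical
and do not come from the PostgreSQL sources, and the removal step is passed
in as a callback (unlink() in real use) so the logic can be tried without
touching the file system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not PostgreSQL code. */
typedef struct PendingUnlink
{
    char       *path;           /* file to remove */
    bool        at_commit;      /* true: remove on COMMIT; false: on ABORT */
    struct PendingUnlink *next;
} PendingUnlink;

static PendingUnlink *pending = NULL;

/* Remember a file that must be removed when the transaction ends. */
static void
register_unlink(const char *path, bool at_commit)
{
    PendingUnlink *p = malloc(sizeof(PendingUnlink));

    assert(p != NULL);
    p->path = malloc(strlen(path) + 1);
    assert(p->path != NULL);
    strcpy(p->path, path);
    p->at_commit = at_commit;
    p->next = pending;
    pending = p;
}

/*
 * Run at transaction end.  Calls remove_fn for every entry registered for
 * this outcome, then empties the list.  Returns the number of files that
 * were handed to remove_fn.
 */
static int
at_xact_end(bool committed, int (*remove_fn) (const char *))
{
    int         removed = 0;

    while (pending != NULL)
    {
        PendingUnlink *p = pending;

        pending = p->next;
        if (p->at_commit == committed)
        {
            remove_fn(p->path);
            removed++;
        }
        free(p->path);
        free(p);
    }
    return removed;
}

/* Stand-in for unlink() that only counts calls, for demonstration. */
static int  fake_unlink_calls = 0;

static int
fake_unlink(const char *path)
{
    (void) path;
    fake_unlink_calls++;
    return 0;
}
```

Under this sketch, CREATE would call register_unlink(newfile, false), DROP
would call register_unlink(oldfile, true), and RENAME would create the hard
link and then register the old name with true and the new name with false.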

That takes care of COMMIT and ABORT.  For backend crash or OS crash, add
a postgres command-line flag for recovery.  Have the postmaster on
startup or shared memory refresh start up a postgres backend on every
database with the recovery flag set.  Have the postgres backend find all
the oids in the pg_class table, and have it go through every file in the
database directory and remove all files that don't match the oids/names
in pg_class.  Also, remove all old sort, noname, and temp files at the
same time.  Seems we should be doing this anyway.
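
The recovery pass described above boils down to comparing the directory
listing with the names pg_class knows about. A minimal sketch in C, assuming
both lists have already been read into arrays of strings; the function names
is_known and find_orphans are hypothetical, and a real pass would also have
to deal with details this ignores, such as reading the directory and deciding
what counts as a temp file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name appears in the list of file names the catalog knows about. */
static bool
is_known(const char *name, const char *const *known, size_t nknown)
{
    for (size_t i = 0; i < nknown; i++)
        if (strcmp(name, known[i]) == 0)
            return true;
    return false;
}

/*
 * Collect the directory entries that match no catalog entry.  Pointers to
 * the orphaned names are stored in orphans[], which must have room for
 * nfiles entries.  Returns the number of orphans found.  A real recovery
 * pass would unlink() each of them.
 */
static size_t
find_orphans(const char *const *files, size_t nfiles,
             const char *const *known, size_t nknown,
             const char **orphans)
{
    size_t      n = 0;

    assert(orphans != NULL);
    for (size_t i = 0; i < nfiles; i++)
        if (!is_known(files[i], known, nknown))
            orphans[n++] = files[i];
    return n;
}
```

The linear search is enough for a sketch; with many relations a sorted array
or hash table of known names would be the obvious refinement.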

Care would have to be taken that a corrupted database that caused a
postgres crash on connection would not get the postmaster startup into
an infinite loop.

Comments?

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

From reedstrm@wallace.ece.rice.edu Tue Mar 14 12:33:31 2000
Received: from wallace.ece.rice.edu (root@wallace.ece.rice.edu [128.42.12.154])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA23826
    for <pgman@candle.pha.pa.us>; Tue, 14 Mar 2000 13:33:29 -0500 (EST)
Received: by wallace.ece.rice.edu
    via sendmail from stdin
    id <m12Uw8K-000LELC@wallace.ece.rice.edu> (Debian Smail3.2.0.102)
    for pgman@candle.pha.pa.us; Tue, 14 Mar 2000 12:33:32 -0600 (CST)
Date: Tue, 14 Mar 2000 12:33:32 -0600
From: "Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu>
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Cc: Bruce Momjian <pgman@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Fix for RENAME
Message-ID: <20000314123331.A6094@rice.edu>
References: <200003140317.WAA27733@candle.pha.pa.us> <000c01bf8d75$a0016800$2801007e@tpf.co.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.0i
In-Reply-To: <000c01bf8d75$a0016800$2801007e@tpf.co.jp>; from Inoue@tpf.co.jp on Tue, Mar 14, 2000 at 02:24:52PM +0900
Status: RO

Hiroshi -
I've just about finished working up a patch to store the physical
file name in the pg_class table. There are only two places that
require a rule for generating the filename, and one of them is
only used for bootstrapping. For the initial cut, I used the rule:

    The filename consists of the TABLENAME, an underscore, and the OID.
    If this is longer than NAMEDATALEN, shorten the TABLENAME.

I implemented this rule by exporting Tom's makeObjectName function
from analyze.c, which is used to make other system-generated names
that are required to be human readable. Replacing this
rule with any other in the future would be straightforward, except
for bootstrap. There are a number of places in bootstrap that need to
know the filename. I've factored them out into yet another set of
#defines (in catname.h) to make that easier.
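
The naming rule quoted above can be illustrated with a few lines of C. This
is a hedged sketch, not the patch itself: it does not call or reproduce
makeObjectName, the function name make_physical_name is hypothetical, and
the NAMEDATALEN value of 32 is assumed here only for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 32          /* value assumed for illustration */

/*
 * Write "<relname>_<oid>" into buf, which must hold NAMEDATALEN bytes.
 * If the full name would not fit (including the terminating NUL), the
 * relname part is shortened and the "_<oid>" suffix is kept whole.
 */
static void
make_physical_name(char *buf, const char *relname, unsigned int oid)
{
    char        suffix[16];
    int         suffix_len = snprintf(suffix, sizeof(suffix), "_%u", oid);
    int         room = NAMEDATALEN - 1 - suffix_len;    /* bytes for relname */

    assert(room > 0);
    snprintf(buf, NAMEDATALEN, "%.*s%s", room, relname, suffix);
}
```

Keeping the OID suffix intact is what preserves uniqueness when two long
table names share the same leading characters.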

I'm working through the regression tests right now: this is a relatively
extensive change, since it modifies the low-level access routines and the
buffer cache (which I indexed on physical filename, rather than relname,
as it is now). Hopefully, I caught all the places that assume relname ==
filename == unique name within a single database (see, I want schemas...)

Ross
-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

On Tue, Mar 14, 2000 at 02:24:52PM +0900, Hiroshi Inoue wrote:
> > -----Original Message-----
> > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> >
> > > > They use the existing table file.  It is only when
> > > > adding/removing/renaming file system files that this
> > > > out-of-sync problem happens.
> > > >
> >
> > Not sure.  I was going to get the CREATE/DROP/RENAME working as it
> > should; then, as we add more features, we can implement this solution
> > for them too.
> >
>
> Hmm, is a general solution difficult?
> Is a more flexible naming rule bad?
>
> This is the 3rd or 4th time that I have mentioned the following.
>
> PostgreSQL doesn't itself keep the information about where tables are
> allocated. So we need a naming rule to find where existing tables
> are allocated. Don't you wonder about the spec?
>
> Regards.
>
> Hiroshi Inoue
> Inoue@tpf.co.jp
>
>

From mascarm@mascari.com Tue Mar 14 16:34:04 2000
Received: from corvette.mascari.com (dhcp26136016.columbus.rr.com [24.26.136.16])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04395
    for <pgman@candle.pha.pa.us>; Tue, 14 Mar 2000 17:32:14 -0500 (EST)
Received: from mascari.com (ferrari.mascari.com [192.168.2.1])
    by corvette.mascari.com (8.9.3/8.9.3) with ESMTP id RAA09562;
    Tue, 14 Mar 2000 17:27:22 -0500
Message-ID: <38CEBD0A.52ADB37E@mascari.com>
Date: Tue, 14 Mar 2000 17:28:26 -0500
From: Mike Mascari <mascarm@mascari.com>
X-Mailer: Mozilla 4.7 [en] (Win95; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: Hiroshi Inoue <Inoue@tpf.co.jp>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Fix for RENAME
References: <200003141545.KAA17518@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: RO

Bruce Momjian wrote:
>
> > Hmm, is a general solution difficult?
> > Is a more flexible naming rule bad?
> >
> > This is the 3rd or 4th time that I have mentioned the following.
>
> That's because I didn't understand.
>
> >
> > PostgreSQL doesn't itself keep the information about where tables are
> > allocated. So we need a naming rule to find where existing tables
> > are allocated. Don't you wonder about the spec?
>
> How does naming the files in the database help our DROP/CREATE problem?
> It would help RENAME a little bit.  Not sure about the others because
> currently they don't have a problem.

I've been thinking about this somewhat, and I think the first
step necessary in correctly supporting ROLLBACK-able DDL
statements in transactions is the change to <relname>_<oid>.
Imagine the scenario:

CREATE TABLE test (key int4);

a) Session #1:

    BEGIN;

b) Session #2:

    BEGIN;
    DROP TABLE test;
    CREATE TABLE test (value varchar(32));

c) Session #1:

    DROP TABLE test;
    COMMIT;

d) Session #2:

    COMMIT;

What's clear to me is that, if DDL statements are to be
ROLLBACK-able, either (1) an AccessExclusive lock is held on the
relation until transaction commit (like Phillip Warner stated was
Dec/Rdb's behavior) or (2) PostgreSQL must be capable of
supporting "multi-versioned schema" as well as tuples. Before
step 'c' is executed, both tables must simultaneously exist in
the database with the same name, which works fine in the catalog
thanks to MVCC, but requires that, on disk, there exist:

    test_01231 - Session #1's table, available for ROLLBACK
    test_13421 - Session #2's table, available for COMMIT

Now, I believe it was Andreas who suggested that VACUUM be
modified to perform cleanup. I agree with this. VACUUM will need
to check for aborted relation tuples in pg_class and remove the
associated file from the filesystem in the event, for example,
that Session #2 aborted -or- Session #1 aborted leaving the
original pg_class tuple the "active" one and Session #2 attempted
to COMMIT, which violates the UNIQUE constraint on the relname of
pg_class. In addition, for "active" relation entries, VACUUM
should verify the filename is <relname>_<oid> for the given oid.
If it is not, it should rename the file on the filesystem. Again,
this is purely cosmetic, for administrative purposes only, but
would allow for lack of atomicity only with respect to the label
of the relation file, until the next VACUUM is run.

For the case of ALTER TABLE RENAME, ALTER TABLE DROP COLUMN,
etc., the same functionality would apply. But, as in previous
discussions regarding ALTER TABLE DROP COLUMN, PostgreSQL MUST be
capable of allowing multiple tuples with different attribute
counts and types within the same relation:

CREATE TABLE test (key int4);

a) Session #1:

    BEGIN;

b) Session #2:

    BEGIN;
    ALTER TABLE test ADD COLUMN value int4;
    INSERT INTO test values (1, 1);

c) Session #1:

    INSERT INTO test values (0);
    COMMIT;

d) Session #2:

    COMMIT;

This also means that Hiroshi's plan to suppress the visibility of
attributes for ALTER TABLE DROP COLUMN would be required anyway,
to allow for "multi-versioning" of attributes within a single
tuple (i.e., like multi-versioning of tuples within relations):
an attribute is either visible or not, but the tuple should
always grow, until, of course, the next VACUUM.

So, to support rollback-able DDL statements ("multi-versioning
schema", if you will), PostgreSQL needs:

1) relation file names of the form <relname>_<oid>
2) support for "multi-versioning" of attributes within a single tuple
3) VACUUM modified to:

    A) Remove filesystem files whose pg_class tuples are no longer
       valid
    B) Rename filesystem files to match the relname in pg_class when
       the <relname>_<oid> doesn't match
    C) Reconstruct relations after attributes have been
       added/dropped.

4) All DDL statements should perform their non-create filesystem
functions in the now infamous "post-transaction-commit" trigger.
If the backend should crash between the time the transaction
committed and the rename() or unlink(), no adverse effects would
be encountered with the database WRT data, VACUUM would clean up
the rename() problem, and, worst-case scenario, an old
<relname>_<oid> file would lie around unused. But at least it
would no longer prohibit the creation of a table by the same
name....
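
Items 3A and 3B above amount to a per-file decision that VACUUM would make.
A minimal sketch in C, assuming the caller already knows whether the pg_class
tuple for the file is still valid; the FileAction enum and the function name
vacuum_file_action are hypothetical and only illustrate the decision, not
how VACUUM would actually be changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef enum
{
    FILE_KEEP,                  /* name already matches <relname>_<oid> */
    FILE_RENAME,                /* tuple valid, but the file label is stale */
    FILE_REMOVE                 /* pg_class tuple no longer valid */
} FileAction;

/*
 * Decide what to do with one relation file.  tuple_valid says whether the
 * pg_class tuple that owns the file is still valid; relname and oid come
 * from that tuple; actual_name is the name found on disk.
 */
static FileAction
vacuum_file_action(bool tuple_valid, const char *relname,
                   unsigned int oid, const char *actual_name)
{
    char        expected[128];

    assert(relname != NULL && actual_name != NULL);
    if (!tuple_valid)
        return FILE_REMOVE;
    snprintf(expected, sizeof(expected), "%s_%u", relname, oid);
    return strcmp(expected, actual_name) == 0 ? FILE_KEEP : FILE_RENAME;
}
```

Because the OID part of the name never changes, a stale label after a crash
between commit and rename() is harmless until the next pass fixes it.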

Just my humble opinion,

Mike Mascari

From Inoue@tpf.co.jp Tue Mar 14 20:31:35 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA08792
    for <pgman@candle.pha.pa.us>; Tue, 14 Mar 2000 21:30:35 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id LAA00515; Wed, 15 Mar 2000 11:29:09 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu>,
    "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Subject: RE: [HACKERS] Fix for RENAME
Date: Wed, 15 Mar 2000 11:35:46 +0900
Message-ID: <000c01bf8e27$2b3c3ce0$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <20000314123331.A6094@rice.edu>
Importance: Normal
Status: ROr

> -----Original Message-----
> From: Ross J. Reedstrom [mailto:reedstrm@wallace.ece.rice.edu]
>
> Hiroshi -
> I've just about finished working up a patch to store the physical
> file name in the pg_class table. There are only two places that
> require a Rule for generating the filename, and one of them is
> only used for bootstrapping.

Thanks for giving it a try.
It's nice that only two places require a naming rule.

I'm not attached to any one naming rule.
The only requirement is uniqueness, and the rule
could be changed according to the situation.
For example, we could change the naming rule according to
the kind of relation, such as system/user relations.

I'm now inclined to introduce a new system relation to store
the physical path name. It could also hold table(data)space
information in the (near?) future.
It seems better to separate it from pg_class because table(data?)space
may change the concept of table allocation.

Comments?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From Inoue@tpf.co.jp Wed Mar 15 02:00:58 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA17887
    for <pgman@candle.pha.pa.us>; Wed, 15 Mar 2000 03:00:57 -0500 (EST)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id CAA02974 for <pgman@candle.pha.pa.us>; Wed, 15 Mar 2000 02:54:44 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id QAA00734; Wed, 15 Mar 2000 16:53:56 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu>,
    "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Subject: RE: [HACKERS] Fix for RENAME
Date: Wed, 15 Mar 2000 17:00:35 +0900
Message-ID: <001101bf8e54$8b941cc0$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <200003150433.XAA13256@candle.pha.pa.us>
Importance: Normal
Status: ROr

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > I'm now inclined to introduce a new system relation to store
> > the physical path name. It could also hold table(data)space
> > information in the (near?) future.
> > It seems better to separate it from pg_class because table(data?)space
> > may change the concept of table allocation.
>
> Why not just put it in pg_class?
>

Not sure, it's only my feeling.
Comments please, everyone.

In this thread we have taken a practical approach that doesn't break the
file-per-table assumption, and it wouldn't be so difficult to implement.
In fact Ross has already tried it.

However, there was a discussion about data(table)spaces months ago,
and a new discussion is under way now.
Judging from the previous discussion, I can't expect it to reach
a practical consensus (there were so many opinions).
We can take a practical step toward the future by encapsulating
the information about table allocation. Separating table allocation
info from pg_class seems to be one way to do that.
There may be more essential things to encapsulate.

Comments?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From pgsql-hackers-owner+M196@hub.org Thu Mar 16 03:02:35 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA05789
    for <pgman@candle.pha.pa.us>; Thu, 16 Mar 2000 04:02:29 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1])
    by hub.org (8.9.3/8.9.3) with SMTP id CAA27302;
    Thu, 16 Mar 2000 02:58:55 -0500 (EST)
    (envelope-from pgsql-hackers-owner+M196@hub.org)
Received: from downtown.oche.de (root@downtown.oche.de [194.94.253.3])
    by hub.org (8.9.3/8.9.3) with ESMTP id CAA23907
    for <pgsql-hackers@postgresql.org>; Thu, 16 Mar 2000 02:37:54 -0500 (EST)
    (envelope-from mne@darwin.oche.de)
Received: from darwin.oche.de (uucp@localhost)
    by downtown.oche.de (8.9.3/8.9.3/Debian/GNU) with SMTP id IAA30654
    for <pgsql-hackers@postgresql.org>; Thu, 16 Mar 2000 08:40:04 +0100
Received: from mne by darwin.oche.de with local (Exim 3.12 #1 (Debian))
    id 12VUhX-0003Vz-00
    for <pgsql-hackers@postgreSQL.org>; Thu, 16 Mar 2000 08:28:11 +0100
Date: Thu, 16 Mar 2000 08:28:11 +0100 (CET)
From: Martin Neumann <mne@mne.de>
Subject: [HACKERS] RfD: Design of tablespaces
To: pgsql-hackers@postgresql.org
MIME-Version: 1.0
Content-Type: TEXT/plain; CHARSET=US-ASCII
Message-Id: <E12VUhX-0003Vz-00@darwin.oche.de>
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

I have written down some thoughts on the concept of tablespaces.
I would be happy to get some comments on them.

-----------------------------------------------------------------
Implementation of tablespaces within PostgreSQL
- a brainstorming paper designed for general discussion -

by Martin Neumann, 2000/3/15


1. What are tablespaces?
-------------------------

Tablespaces make it possible to distribute storage objects
over multiple points of storage (POS). Therefore one could
say a tablespace can be a POS.

Example:

    tablespace_a  ----->  /mnt/raid/arena0/
    tablespace_b  ----->  /mnt/raid/emc0/

Tablespaces can also store their data on other tablespaces:

    tablespace_c  ----->  tablespace_b

This is quite interesting for administration purposes.
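
A tablespace that points at another tablespace has to be resolved to a real
point of storage before anything can be written. A minimal sketch in C of
that lookup, assuming tablespaces are kept in a simple array; the Tablespace
struct and the functions find_tablespace and resolve_tablespace are
hypothetical names, and the cycle guard is an addition the paper itself does
not discuss.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Tablespace
{
    const char *name;
    const char *directory;      /* non-NULL: has its own point of storage */
    const char *link_target;    /* non-NULL: stores its data on another one */
} Tablespace;

/* Look a tablespace up by name; NULL if there is no such tablespace. */
static const Tablespace *
find_tablespace(const Tablespace *all, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(all[i].name, name) == 0)
            return &all[i];
    return NULL;
}

/*
 * Follow links until a tablespace with a directory is reached and return
 * that directory.  Returns NULL if a name is unknown or if more hops are
 * needed than there are tablespaces, which can only mean a cycle.
 */
static const char *
resolve_tablespace(const Tablespace *all, size_t n, const char *name)
{
    assert(name != NULL);
    for (size_t hops = 0; hops <= n; hops++)
    {
        const Tablespace *ts = find_tablespace(all, n, name);

        if (ts == NULL)
            return NULL;
        if (ts->directory != NULL)
            return ts->directory;
        if (ts->link_target == NULL)
            return NULL;
        name = ts->link_target;
    }
    return NULL;
}
```

With the example above, resolving tablespace_c would yield /mnt/raid/emc0/,
the directory of tablespace_b.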


2. What are its advantages?
----------------------------

As you can choose a different tablespace for every storage
object (table, index etc.) it is easy to improve the following
aspects of your system:

- Reliability

  You can put storage objects (mostly tables) you strongly depend
  on onto a more reliable tablespace (mirrored RAID or perhaps
  simply a directory which gets backed up more often than others).

- Speed

  You can put storage objects you rarely need onto a rather slow
  tablespace and keep your fast tablespaces free of them.

  A fast, but more expensive RAID stripe set can be used more
  efficiently as it doesn't get filled with non-performance-sensitive
  data.

  Distributing storage objects with equal speed requirements
  across different tablespaces also makes sense, as you gain
  speed by spreading data over more than one hard disk spindle.

- Manageability

  You can grant and revoke rights on a per-tablespace basis.

  As every storage object belongs to exactly one tablespace,
  you can easily group storage objects using a tablespace.


3. What about disk I/O?
------------------------

Tablespaces tell the storage manager only where to store
the data, not how. This is the reasonable way.


4. Usage
---------

CREATE TABLESPACE tsname TYPE storage_type storage_options

Examples:

    CREATE TABLESPACE tsemc0
        TYPE classic DIRECTORY /mnt/raid/emc0 NOFSYNC

    CREATE TABLESPACE tsarena0 TYPE raw DEVICE /dev/araid/0
        MINSIZE 128 MAXSIZE 4096 GROW 4 32 SHRINK 2 32
        BLOCKSIZE 16384

    CREATE TABLESPACE quick0 TYPE link TABLESPACE tsarena0;

--

CREATE TABLE tbname ( ... ) TABLESPACE tsname;

Examples:

    CREATE TABLE foo (
        id int4 NOT NULL UNIQUE,
        name text NOT NULL
    ) TABLESPACE tsemc0;

    CREATE TABLE bar (
        id int4 NOT NULL UNIQUE,
        name text NOT NULL
    ) TABLESPACE default;

If the tablespace isn't given, the storage object gets created
in the "default" tablespace.

"default" is PostgreSQL's default tablespace and the only one
which has to exist on each system.

--

ALTER TABLESPACE tsname tssettings

Examples:

    ALTER TABLESPACE tsemc0 DIRECTORY /mnt/raid/emc1


NOTE: altering tablespaces without recreating the contained
      storage objects introduces many problems.
      Realisation is difficult and won't be my first goal.

--

DROP TABLESPACE tsname [FORCE]

Examples:

    DROP TABLESPACE tsarena0

    This will immediately remove the tablespace tsarena0
    if it contains no storage objects.

    If it still contains some, the tablespace is marked for
    deletion.

    This means:
    1. you can't create new storage objects in the tablespace
    2. if the last storage object inside gets dropped, the
       tablespace will be removed.


    DROP TABLESPACE tsarena0 FORCE

    This will remove the tablespace including all contained
    storage objects immediately.
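
The DROP TABLESPACE behaviour described above is a small state machine. A
minimal sketch in C, tracking only an object count and two flags; the
TablespaceState struct and the functions drop_tablespace, create_object and
drop_object are hypothetical names used for illustration, not proposed code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct TablespaceState
{
    int         nobjects;       /* storage objects currently contained */
    bool        drop_pending;   /* marked for deletion by a plain DROP */
    bool        removed;        /* tablespace no longer exists */
} TablespaceState;

/* DROP TABLESPACE tsname [FORCE] */
static void
drop_tablespace(TablespaceState *ts, bool force)
{
    assert(ts != NULL);
    if (force)
        ts->nobjects = 0;       /* FORCE removes all contained objects too */
    if (ts->nobjects == 0)
        ts->removed = true;
    else
        ts->drop_pending = true;
}

/* Creating an object in the tablespace; returns false if it is refused. */
static bool
create_object(TablespaceState *ts)
{
    assert(ts != NULL);
    if (ts->removed || ts->drop_pending)
        return false;
    ts->nobjects++;
    return true;
}

/* Dropping one object; the last drop completes a pending DROP TABLESPACE. */
static void
drop_object(TablespaceState *ts)
{
    assert(ts != NULL);
    if (ts->removed || ts->nobjects == 0)
        return;
    ts->nobjects--;
    if (ts->nobjects == 0 && ts->drop_pending)
        ts->removed = true;
}
```

The sketch leaves out what FORCE has to do to the contained objects
themselves; it only shows when the tablespace disappears.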

--

VACUUM tsname

Example:

    VACUUM tsemc1

    This will vacuum a single tablespace with all contained
    storage objects.
-----------------------------------------------------------------

-- 
Martin Neumann, Welkenrather Str. 118c, 52074 Aachen, Germany
mne@mne.de - http://www.mne.de/mne/ - sms@mne.de [eMail2SMS]
Tel. 0241 / 8876-080 - Mobil: 0173 / 27 69 632
.------.---------------------------------------------------------
|  at  | Inform GmbH - Abteilung Airport Logistics
| work | Pascalstr. 23 - 52076 Aachen - Tel. 02408 / 9456-0
|______| martin.neumann@inform-ac.com - http://www.inform-ac.com

From JanWieck@t-online.de Wed Jun 14 19:01:01 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA21372
    for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 19:00:59 -0400 (EDT)
Received: from mailout02.sul.t-online.com (mailout02.sul.t-online.com [194.25.134.17]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id SAA01930 for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 18:51:11 -0400 (EDT)
Received: from fwd01.sul.t-online.de
    by mailout02.sul.t-online.com with smtp
    id 132Lz6-0004ec-01; Thu, 15 Jun 2000 00:50:08 +0200
Received: from hot.jw.home (340000654369-0001@[62.224.107.172]) by fwd01.sul.t-online.de
    with esmtp id 132Lyy-0tYyi9C; Thu, 15 Jun 2000 00:50:00 +0200
Received: (from wieck@localhost)
    by hot.jw.home (8.8.5/8.8.5) id WAA07887;
    Wed, 14 Jun 2000 22:43:39 +0200
From: JanWieck@t-online.de (Jan Wieck)
Message-Id: <200006142043.WAA07887@hot.jw.home>
Subject: Re: [HACKERS] Big 7.1 open items
In-Reply-To: <14752.960996980@sss.pgh.pa.us> from Tom Lane at "Jun 14, 2000 11:36:20 am"
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed, 14 Jun 2000 22:43:39 +0200 (MEST)
CC: Oliver Elphick <olly@lfix.co.uk>, Bruce Momjian <pgman@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Reply-To: Jan Wieck <JanWieck@Yahoo.com>
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Sender: 340000654369-0001@t-dialin.net
Status: ROr

Tom Lane wrote:
> "Oliver Elphick" <olly@lfix.co.uk> writes:
> > I suggest that DROP TABLE in a transaction should not be allowed.
>
> I had actually made it do that for a short time early this year,
> and was shouted down.  On reflection I have to agree; it's too useful
> to be able to do
>
>     begin;
>     drop table foo;
>     create table foo(new schema);
>     ...
>     end;
>
> You do indeed lose big if you suffer an error partway through, but
> the answer to that is to fix our file naming conventions so that we
> can support rollback of drop table.

IMHO this belongs to the discussion about keeping separate what is
separate (having indices/toast-relations/etc. in separate
directories and whatnot).

I've never been really happy with the file naming
conventions. Requiring a filesystem entry to have the same
name as the DB object associated with it isn't right.
I know, some people love to be able to easily identify the
files with ls(1). OTOH what is that good for?

Well, someone can easily see how big the disk footprint of
his data is. Wow - what a piece of information. Anything else?

Why not change the naming to be something like this:

    <dbroot>/catalog_tables/pg_...
    <dbroot>/catalog_index/pg_...
    <dbroot>/user_tables/oid_...
    <dbroot>/user_index/oid_...
    <dbroot>/temp_tables/oid_...
    <dbroot>/temp_index/oid_...
    <dbroot>/toast_tables/oid_...
    <dbroot>/toast_index/oid_...
    <dbroot>/whatnot_???/...

This way, it would be much easier to separate all the
different object types to different physical media. We would
lose some transparency, but I've always wondered what
people USE that for (other than just wanting to know). For
convenience we could implement another little utility that
reports the object size, like

    DESCRIBE TABLE/VIEW/whatnot <object-name>

that returns the physical location and storage details of the
object. And psql could use it to print this info additionally
in the \d commands. That would give unprivileged users access to
this info, but so be it; it's not a security issue IMHO.

The subdirectory an object goes into has to be controlled by
the relkind. So we need to tidy up that a little too. I think
it's worth it.

The object's storage location (the bare file) would now
contain the OID. So we avoid naming conflicts for temp
tables, naming conflicts during DROP/CREATE in a transaction,
and the like.
|
||
|
||
Comments?
|
||
|
||
|
||
Jan
|
||
|
||
--
|
||
|
||
#======================================================================#
|
||
# It's easier to get forgiveness for being wrong than for being right. #
|
||
# Let's break this rule - forgive me. #
|
||
#================================================== JanWieck@Yahoo.com #
|
||
|
||
|
||
|
||
From tgl@sss.pgh.pa.us Wed Jun 14 22:06:54 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA02821
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 22:06:52 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16609;
Wed, 14 Jun 2000 22:07:16 -0400 (EDT)
To: Jan Wieck <JanWieck@Yahoo.com>
cc: Oliver Elphick <olly@lfix.co.uk>, Bruce Momjian <pgman@candle.pha.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006142043.WAA07887@hot.jw.home>
References: <200006142043.WAA07887@hot.jw.home>
Comments: In-reply-to JanWieck@t-online.de (Jan Wieck)
message dated "Wed, 14 Jun 2000 22:43:39 +0200"
Date: Wed, 14 Jun 2000 22:07:15 -0400
Message-ID: <16606.961034835@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

JanWieck@t-online.de (Jan Wieck) writes:
> I've never been really happy with the file naming
> conventions. The need of a filesystem entry to have the same
> name of the DB object that is associated with it isn't right.
> I know, some people love to be able to easily identify the
> files with ls(1). OTOH what is that good for?

I agree with Jan on this: let's just change the file names over to
be OIDs. Then we can have rollbackable DROP and RENAME TABLE easily.
Naming the files after the logical names of the tables is nice if it
doesn't cost anything, but it is *not* worth the trouble to preserve
a relationship between filename and tablename when it is costing us.
And it's costing us big time. That single feature is hurting us on
functionality, robustness, and portability, and for what benefit?
Not nearly enough. It's time to just let go of it.

> Why not changing the naming to be something like this:

> <dbroot>/catalog_tables/pg_...
> <dbroot>/catalog_index/pg_...
> <dbroot>/user_tables/oid_...
> <dbroot>/user_index/oid_...
> <dbroot>/temp_tables/oid_...
> <dbroot>/temp_index/oid_...
> <dbroot>/toast_tables/oid_...
> <dbroot>/toast_index/oid_...
> <dbroot>/whatnot_???/...

I don't see a lot of value in that. Better to do something like
tablespaces:

<dbroot>/<oidoftablespace>/<oidofobject>
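
A minimal sketch of the path layout suggested above, written in Python for brevity rather than as backend code. The function name relation_path, the sample root directory, and the sample OID values are hypothetical; only the <dbroot>/<oidoftablespace>/<oidofobject> shape comes from the message.

```python
from pathlib import PurePosixPath


def relation_path(dbroot: str, tablespace_oid: int, relation_oid: int) -> PurePosixPath:
    """Build <dbroot>/<oidoftablespace>/<oidofobject> from numeric OIDs.

    The path depends only on OIDs, so renaming a table never changes it.
    """
    for oid in (tablespace_oid, relation_oid):
        # OIDs are treated here as unsigned 32-bit values (zero excluded).
        if not 0 < oid < 2**32:
            raise ValueError(f"OID out of range: {oid}")
    return PurePosixPath(dbroot) / str(tablespace_oid) / str(relation_oid)


# Hypothetical values, chosen only for illustration.
print(relation_path("/usr/local/pgsql/data/base/mydb", 1663, 16384))
```

Because the logical table name appears nowhere in the result, any name-to-file lookup has to go through the catalog, which is the trade-off debated in the rest of this thread.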

regards, tom lane

From tgl@sss.pgh.pa.us Wed Jun 14 22:20:59 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA25561
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 22:20:56 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16708;
Wed, 14 Jun 2000 22:21:30 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <JanWieck@Yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006142313.TAA22904@candle.pha.pa.us>
References: <200006142313.TAA22904@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Wed, 14 Jun 2000 19:13:47 -0400"
Date: Wed, 14 Jun 2000 22:21:30 -0400
Message-ID: <16705.961035690@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> You need something that works from the command line, and something that
> works if PostgreSQL is not running. How would you restore one file from
> a tape.

"Restore one file from a tape"? How are you going to do that anyway?
You can't save and restore portions of a database like that, because
of transaction commit status problems. To restore table X correctly,
you'd have to restore pg_log as well, and then your other tables are
hosed --- unless you also restore all of them from the backup. Only
a complete database restore from tape would work, and for that you
don't need to tell which file is which. So the above argument is a
red herring.

I realize it's nice to be able to tell which table file is which by
eyeball, but the price we are paying for that small convenience is
just too high. Give that up, and we can have rollbackable DROP and
RENAME now (I'll personally commit to making it happen for 7.1).
Continue to insist on it, and I don't think we'll *ever* have those
features in a really robust form. It's just not possible to do
multiple file renames atomically.
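
To illustrate the point, here is a toy model in Python (not backend code; every class and method name is hypothetical). It assumes files are named only by OID, so RENAME changes nothing but the name-to-OID mapping, and DROP defers the actual unlink until commit. Under those assumptions a rollback never has to undo a file rename.

```python
class Catalog:
    """Committed state: logical names map to OIDs; files are named by OID."""

    def __init__(self):
        self.name_to_oid = {}
        self.files = set()      # stands in for the data directory
        self.next_oid = 16384   # arbitrary starting value for this sketch

    def allocate_oid(self):
        oid = self.next_oid
        self.next_oid += 1
        return oid

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, catalog):
        self.catalog = catalog
        self.names = dict(catalog.name_to_oid)  # private view of the catalog
        self.created = set()   # files to remove if we roll back
        self.dropped = set()   # files to unlink only when we commit

    def create_table(self, name):
        if name in self.names:
            raise ValueError(f"table {name!r} already exists")
        oid = self.catalog.allocate_oid()
        filename = str(oid)
        self.catalog.files.add(filename)  # fresh OID, so no name collision
        self.created.add(filename)
        self.names[name] = oid
        return oid

    def drop_table(self, name):
        # The file stays on disk until commit, so rollback is trivial.
        self.dropped.add(str(self.names.pop(name)))

    def rename_table(self, old, new):
        if new in self.names:
            raise ValueError(f"table {new!r} already exists")
        self.names[new] = self.names.pop(old)  # no file is touched

    def commit(self):
        self.catalog.name_to_oid = self.names
        self.catalog.files -= self.dropped

    def rollback(self):
        self.catalog.files -= self.created  # only brand-new files go away


catalog = Catalog()
setup = catalog.begin()
old_oid = setup.create_table("foo")
setup.commit()

# begin; drop table foo; create table foo(new schema); ... then an error
tx = catalog.begin()
tx.drop_table("foo")
tx.create_table("foo")
tx.rollback()
print(catalog.name_to_oid, sorted(catalog.files))
```

After the rollback the original "foo" and its file are intact, because the dropped file was never unlinked and the replacement lived under a different OID.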

regards, tom lane

From pgsql-hackers-owner+M3382@hub.org Wed Jun 14 22:31:42 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA10091
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 22:31:41 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5F2UI853244;
Wed, 14 Jun 2000 22:30:18 -0400 (EDT)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
by hub.org (8.10.1/8.10.1) with ESMTP id e5F2Th852641
for <pgsql-hackers@postgresql.org>; Wed, 14 Jun 2000 22:29:43 -0400 (EDT)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id WAA06576;
Wed, 14 Jun 2000 22:28:53 -0400 (EDT)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200006150228.WAA06576@candle.pha.pa.us>
Subject: Re: [HACKERS] Big 7.1 open items
In-Reply-To: <16705.961035690@sss.pgh.pa.us> "from Tom Lane at Jun 14, 2000 10:21:30
pm"
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed, 14 Jun 2000 22:28:53 -0400 (EDT)
CC: Jan Wieck <JanWieck@Yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > You need something that works from the command line, and something that
> > works if PostgreSQL is not running. How would you restore one file from
> > a tape.
>
> "Restore one file from a tape"? How are you going to do that anyway?
> You can't save and restore portions of a database like that, because
> of transaction commit status problems. To restore table X correctly,
> you'd have to restore pg_log as well, and then your other tables are
> hosed --- unless you also restore all of them from the backup. Only
> a complete database restore from tape would work, and for that you
> don't need to tell which file is which. So the above argument is a
> red herring.
>
> I realize it's nice to be able to tell which table file is which by
> eyeball, but the price we are paying for that small convenience is
> just too high. Give that up, and we can have rollbackable DROP and
> RENAME now (I'll personally commit to making it happen for 7.1).
> Continue to insist on it, and I don't think we'll *ever* have those
> features in a really robust form. It's just not possible to do
> multiple file renames atomically.
>

OK, I am flexible. (Yea, right.) :-)

But seriously, let me give some background. I used Ingres, that used
the VMS file system, but used strange sequential AAAF324 numbers for
tables. When someone deleted a table, or we were looking at what tables
were using disk space, it was impossible to find the Ingres table names
that went with the file. There was a system table that showed it, but
it was poorly documented, and if you deleted the table, there was no way
to look on the tape to find out which file to restore.

As far as pg_log, you certainly would not expect to get any information
back from the time of the backup table to current, so the current pg_log
would be just fine.

Basically, I guess we have to do it, but we have to print the proper
error messages for cases in the backend where we just print the file name.
Also, we have to now replace the 'ls -l' command with something that
will be meaningful.

Right now, we use 'ps' with args to display backend information, and ls
-l to show disk information. We are going to lose that here.

--
Bruce Momjian | http://www.op.net/~candle
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

From tgl@sss.pgh.pa.us Wed Jun 14 22:31:01 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA09340
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 22:31:00 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16783
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 22:31:34 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006150223.WAA06516@candle.pha.pa.us>
References: <200006150223.WAA06516@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Wed, 14 Jun 2000 22:23:58 -0400"
Date: Wed, 14 Jun 2000 22:31:33 -0400
Message-ID: <16780.961036293@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

> Can I phone you?

Sure, I'm here.

regards, tom lane

From pgsql-hackers-owner+M3383@hub.org Wed Jun 14 22:38:29 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA27501
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 22:38:28 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5F2bD870244;
Wed, 14 Jun 2000 22:37:13 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by hub.org (8.10.1/8.10.1) with ESMTP id e5F2af869743
for <pgsql-hackers@postgresql.org>; Wed, 14 Jun 2000 22:36:41 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16814;
Wed, 14 Jun 2000 22:36:19 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <JanWieck@Yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006150228.WAA06576@candle.pha.pa.us>
References: <200006150228.WAA06576@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Wed, 14 Jun 2000 22:28:53 -0400"
Date: Wed, 14 Jun 2000 22:36:19 -0400
Message-ID: <16810.961036579@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> But seriously, let me give some background. I used Ingres, that used
> the VMS file system, but used strange sequential AAAF324 numbers for
> tables. When someone deleted a table, or we were looking at what tables
> were using disk space, it was impossible to find the Ingres table names
> that went with the file. There was a system table that showed it, but
> it was poorly documented, and if you deleted the table, there was no way
> to look on the tape to find out which file to restore.

Fair enough, but it seems to me that the answer is to expend some effort
on system admin support tools. We could do a lot in that line with less
effort than trying to make a fundamentally mismatched filesystem
representation do what we need.

regards, tom lane

From tgl@sss.pgh.pa.us Wed Jun 14 23:13:35 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06306
for <pgman@candle.pha.pa.us>; Wed, 14 Jun 2000 23:13:26 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA16988;
Wed, 14 Jun 2000 23:13:53 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <JanWieck@Yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006150244.WAA27741@candle.pha.pa.us>
References: <200006150244.WAA27741@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Wed, 14 Jun 2000 22:44:16 -0400"
Date: Wed, 14 Jun 2000 23:13:52 -0400
Message-ID: <16985.961038832@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> That was my point --- that in doing this change, we are taking on more
> TODO items, that may detract from our main TODO items.

True, but they are also TODO items that could be handled by people other
than the inner circle of key developers. The actual rejiggering of
table-to-filename mapping is going to have to be done by one of the
small number of people who are fully up to speed on backend internals.
But we've got a lot more folks who would be able (and, hopefully,
willing) to design and code whatever tools are needed to make the
dbadmin's job easier in the face of the new filesystem layout. I'd
rather not expend a lot of core time to avoid needing those tools,
especially when I feel the old approach is fatally flawed anyway.

> Even gdb shows us the filename/tablename in backtraces. We are never
> going to be able to reproduce that.

Backtraces from *what*, exactly? 99% of the backend is still going
to be dealing with the same data as ever. It might be that poking
around in fd.c will be a little harder, but considering that fd.c
doesn't really know or care what the files it's manipulating are
anyway, I'm not convinced that this is a real issue.

> I guess I don't consider table schema commands inside transactions and
> such to be as big an item as the utility features we will need to
> build.

You've *got* to be kidding. We're constantly seeing complaints about
the fact that rolling back DROP or RENAME TABLE fails --- and worse,
leaves the table in a corrupted/inconsistent state. As far as I can
tell, that's one of the worst robustness problems we've got left to
fix. This is a big deal IMHO, and I want it to be fixed and fixed
right. I don't see how to fix it right if we try to keep physical
filenames tied to logical tablenames.

Moreover, that restriction will continue to hurt us if we try to
preserve it while implementing tablespaces, ANSI schemas, etc.

regards, tom lane

From pgsql-hackers-owner+M3397@hub.org Thu Jun 15 03:03:33 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24286
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 03:03:32 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5F72T815284;
Thu, 15 Jun 2000 03:02:29 -0400 (EDT)
Received: from mailo.vtcif.telstra.com.au (mailo.vtcif.telstra.com.au [202.12.144.17])
by hub.org (8.10.1/8.10.1) with ESMTP id e5F721814963
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 03:02:01 -0400 (EDT)
Received: (from uucp@localhost) by mailo.vtcif.telstra.com.au (8.8.2/8.6.9) id RAA01186; Thu, 15 Jun 2000 17:01:48 +1000 (EST)
Received: from maili.vtcif.telstra.com.au(202.12.142.17)
via SMTP by mailo.vtcif.telstra.com.au, id smtpd0SbI.z; Thu Jun 15 17:00:39 2000
Received: (from uucp@localhost) by maili.vtcif.telstra.com.au (8.8.2/8.6.9) id RAA21419; Thu, 15 Jun 2000 17:00:37 +1000 (EST)
Received: from localhost(127.0.0.1), claiming to be "mail.cdn.telstra.com.au"
via SMTP by localhost, id smtpdWTHrU_; Thu Jun 15 16:59:34 2000
Received: from lunitari.nimrod.itg.telecom.com.au (lunitari.nimrod.itg.telecom.com.au [192.53.254.48]) by mail.cdn.telstra.com.au (8.8.2/8.6.9) with ESMTP id QAA04796; Thu, 15 Jun 2000 16:59:33 +1000 (EST)
Received: from nimrod.itg.telecom.com.au (majere [192.53.254.45])
by lunitari.nimrod.itg.telecom.com.au (8.9.1/8.9.3) with ESMTP id QAA18056;
Thu, 15 Jun 2000 16:58:17 +1000 (EST)
Message-ID: <39487E0C.970680AB@nimrod.itg.telecom.com.au>
Date: Thu, 15 Jun 2000 16:56:12 +1000
From: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
Organization: IBM Global Services
X-Mailer: Mozilla 4.6 [en] (X11; I; SunOS 5.6 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: "Ross J. Reedstrom" <reedstrm@rice.edu>
CC: PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

"Ross J. Reedstrom" wrote:

> Any strong objections to the mixed relname_oid solution? It gets us
> everything oids does, and still lets Bruce use 'ls -l' to find the big
> tables, putting off writing any admin tools that'll need to be rewritten,
> anyway.

Doesn't relname_oid defeat the purpose of oid file names, which is that
they don't change when the table is renamed? Wasn't it going to be oids
with a tool to create a symlink of relname -> oid ?
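
The symlink tool mentioned here could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the function name refresh_symlinks and the directory names are hypothetical, and the relname-to-OID mapping is passed in as a plain dict, whereas a real tool would have to read it from the system catalog.

```python
import os
import tempfile


def refresh_symlinks(data_dir, link_dir, relname_to_oid):
    """Rebuild link_dir so that each relname is a symlink to its OID file.

    The data files themselves are never renamed; only the links change.
    """
    os.makedirs(link_dir, exist_ok=True)
    # Drop stale links first, so renamed or dropped tables disappear.
    for entry in os.listdir(link_dir):
        path = os.path.join(link_dir, entry)
        if os.path.islink(path):
            os.unlink(path)
    for relname, oid in relname_to_oid.items():
        target = os.path.join(data_dir, str(oid))
        os.symlink(target, os.path.join(link_dir, relname))


with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data")
    link_dir = os.path.join(root, "by_name")
    os.makedirs(data_dir)
    open(os.path.join(data_dir, "16384"), "w").close()  # hypothetical OID file

    refresh_symlinks(data_dir, link_dir, {"foo": 16384})
    refresh_symlinks(data_dir, link_dir, {"bar": 16384})  # after a rename
    print(os.listdir(link_dir))
```

The links are purely advisory: they can go stale between runs, so anything that needs an authoritative answer still has to consult the catalog.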

From pgsql-hackers-owner+M3400@hub.org Thu Jun 15 03:31:16 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24604
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 03:31:15 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA01191 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 03:15:28 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5F7CP835301;
Thu, 15 Jun 2000 03:12:25 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by hub.org (8.10.1/8.10.1) with ESMTP id e5F7Bt833744
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 03:11:55 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA18801;
Thu, 15 Jun 2000 03:11:53 -0400 (EDT)
To: "Ross J. Reedstrom" <reedstrm@rice.edu>
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <20000615010312.A995@rice.edu>
References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu>
Comments: In-reply-to "Ross J. Reedstrom" <reedstrm@rice.edu>
message dated "Thu, 15 Jun 2000 01:03:12 -0500"
Date: Thu, 15 Jun 2000 03:11:52 -0400
Message-ID: <18798.961053112@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

"Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> Any strong objections to the mixed relname_oid solution?

Yes!

You cannot make it work reliably unless the relname part is the original
relname and does not track ALTER TABLE RENAME. IMHO having an obsolete
relname in the filename is worse than not having the relname at all;
it's a recipe for confusion, it means you still need admin tools to tell
which end is really up, and what's worst is you might think you don't.

Furthermore it requires an additional column in pg_class to keep track
of the original relname, which is a waste of space and effort.

It also creates a portability risk, or at least fails to remove one,
since you are critically dependent on the assumption that the OS
supports long filenames --- on a filesystem that truncates names to less
than about 45 characters you're in very deep trouble. An OID-only
approach still works on traditional 14-char-filename Unix filesystems
(it'd mostly even work on DOS 8+3, though I doubt we care about that).
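
The filename-length argument can be checked with a little arithmetic, sketched here in Python. Two figures are assumptions not stated in the message: a maximum relation name of 31 characters, and a segment suffix of up to three characters (such as ".99") for tables split across several files. The 10-digit bound follows from treating OIDs as unsigned 32-bit numbers written in decimal.

```python
MAX_OID = 2**32 - 1                 # largest unsigned 32-bit value
OID_DIGITS = len(str(MAX_OID))      # decimal digits needed for any OID

MAX_RELNAME = 31                    # assumption: longest relation name
SEGMENT_SUFFIX = len(".99")         # assumption: suffix on extra segments


def worst_case_length(with_relname: bool) -> int:
    """Longest filename either scheme could produce under these assumptions."""
    length = OID_DIGITS + SEGMENT_SUFFIX
    if with_relname:
        length += MAX_RELNAME + 1   # relname plus the "_" separator
    return length


print("OID only:   ", worst_case_length(False))   # 13
print("relname_oid:", worst_case_length(True))    # 45
```

Under these assumptions an OID-only name fits within a 14-character limit, while relname_oid needs the roughly 45 characters cited above.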

Finally, one of the reasons I want to go to filenames based only on OID
is that that'll make life easier for mdblindwrt. Original relname + OID
doesn't help, in fact it makes life harder (more shmem space needed to
keep track of the filename for each buffer).

Can we *PLEASE JUST LET GO* of this bad idea? No relname in the
filename. Period.

regards, tom lane

From tgl@sss.pgh.pa.us Thu Jun 15 03:31:11 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24592
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 03:31:10 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA01213 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 03:15:46 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA18833;
Thu, 15 Jun 2000 03:14:30 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <JanWieck@Yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006150321.XAA09510@candle.pha.pa.us>
References: <200006150321.XAA09510@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Wed, 14 Jun 2000 23:21:15 -0400"
Date: Thu, 15 Jun 2000 03:14:30 -0400
Message-ID: <18830.961053270@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Well, we did have someone do a test implementation of oid file names,
> and their report was that it looked pretty ugly. However, if people are
> convinced it has to be done, we can get started. I guess I was waiting
> for Vadim's storage manager, where the whole idea of separate files is
> going to go away anyway, I suspect. We would then have to re-write all
> our admin tools for the new format.

I seem to recall him saying that he wanted to go to filename == OID
just like I'm suggesting. But I agree we probably ought to hold off
doing anything until he gets back from Russia and can let us know
whether that's still his plan. If he is planning one-huge-file or
something like that, we might as well let these issues go unfixed
for one more release cycle.

regards, tom lane

From ZeugswetterA@wien.spardat.at Thu Jun 15 03:30:59 2000
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24584
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 03:30:56 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id JAA29140;
|
||
Thu, 15 Jun 2000 09:31:12 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F99QGS>; Thu, 15 Jun 2000 09:31:12 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE4@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>, Bruce Momjian <pgman@candle.pha.pa.us>
|
||
Cc: Jan Wieck <JanWieck@Yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 15 Jun 2000 09:31:11 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
|
||
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> > You need something that works from the command line, and
|
||
> something that
|
||
> > works if PostgreSQL is not running. How would you restore
|
||
> one file from
|
||
> > a tape.
|
||
>
|
||
> "Restore one file from a tape"? How are you going to do that anyway?
|
||
> You can't save and restore portions of a database like that, because
|
||
> of transaction commit status problems. To restore table X correctly,
|
||
> you'd have to restore pg_log as well, and then your other tables are
|
||
> hosed --- unless you also restore all of them from the backup. Only
|
||
> a complete database restore from tape would work, and for that you
|
||
> don't need to tell which file is which. So the above argument is a
|
||
> red herring.
|
||
|
||
>From what I know it is possible to simply restore one table file
|
||
since pg_log keeps all tid's. Of course it cannot guarantee integrity
|
||
and does not work if the table was altered.
|
||
|
||
> I realize it's nice to be able to tell which table file is which by
|
||
> eyeball, but the price we are paying for that small convenience is
|
||
> just too high. Give that up, and we can have rollbackable DROP and
|
||
> RENAME now (I'll personally commit to making it happen for 7.1).
|
||
> Continue to insist on it, and I don't think we'll *ever* have those
|
||
> features in a really robust form. It's just not possible to do
|
||
> multiple file renames atomically.
|
||
|
||
In the last proposal Bruce and I had it all laid out for tabname + oid
|
||
with no overhead in the normal situation, and little overhead if a rename
|
||
table crashed or was not rolled back or committed properly
|
||
which imho had all advantages combined.
|
||
|
||
Andreas
|
||
|
||
From ZeugswetterA@wien.spardat.at Thu Jun 15 04:31:04 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA25144
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 04:31:03 -0400 (EDT)
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id EAA03225 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 04:05:41 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA100894;
|
||
Thu, 15 Jun 2000 10:04:52 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F99QWD>; Thu, 15 Jun 2000 10:04:52 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE7@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Don Baccus'" <dhogaza@pacifier.com>,
|
||
Bruce Momjian
|
||
<pgman@candle.pha.pa.us>, Tom Lane <tgl@sss.pgh.pa.us>
|
||
Cc: Jan Wieck <JanWieck@yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 15 Jun 2000 10:04:51 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="windows-1252"
|
||
Status: RO
|
||
|
||
|
||
> In reality, very few people are going to be interested in restoring
|
||
> a table in a way that breaks referential integrity and other
|
||
> normal assumptions about what exists in the database.
|
||
|
||
This is not true. In my DBA history it would have saved me manweeks
|
||
of work if an easy and efficient restore of one single table from backup
|
||
would have been available in Informix and Oracle.
|
||
We always had to restore most of the whole system to another machine only
|
||
to get back at some table info that would then be manually re-added
|
||
to the production system.
|
||
A restore of one table to a different/new tablename would have been
|
||
very convenient, and this is currently possible in PostgreSQL.
|
||
(create new table with same schema, then replace new table data file
|
||
with file from backup)
|
||
|
||
> The reality
|
||
> is that most people are going to engage in a little time travel
|
||
> to a past, consistent backup rather than do as you suggest.
|
||
|
||
No, this is what is done most of the time, but it is very inconvenient
|
||
to tell people that they lose all work from past days, so it is usually
|
||
done as I noted above if possible. We once had a situation where all data
|
||
was deleted from a table, but the problem was only noticed 3 weeks later.
|
||
|
||
> This is going to be more and more true as Postgres gains more and
|
||
> more acceptance in (no offense intended) the real world.
|
||
>
|
||
> >Right now, we use 'ps' with args to display backend
|
||
> information, and ls
|
||
> >-l to show disk information. We are going to lose that here.
|
||
>
|
||
> Dependence on "ls -l" is, IMO, a very weak argument.
|
||
|
||
In normal situations where everything works I agree, it is the
|
||
error situations where it really helps if you see what data is where.
|
||
debugging, lsof, Bruce already named them.
|
||
|
||
Andreas
|
||
|
||
From pgsql-hackers-owner+M3405@hub.org Thu Jun 15 04:31:09 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA25151
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 04:31:07 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id EAA04151 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 04:30:23 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5F8RI883087;
|
||
Thu, 15 Jun 2000 04:27:18 -0400 (EDT)
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5F8Qx881928
|
||
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 04:27:00 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA79848;
|
||
Thu, 15 Jun 2000 10:26:13 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F99Q8A>; Thu, 15 Jun 2000 10:26:14 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE8@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
"Ross J. Reedstrom"
|
||
<reedstrm@rice.edu>
|
||
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 15 Jun 2000 10:26:12 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
|
||
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
|
||
> > Any strong objections to the mixed relname_oid solution?
|
||
>
|
||
> Yes!
|
||
>
|
||
> You cannot make it work reliably unless the relname part is
|
||
> the original
|
||
> relname and does not track ALTER TABLE RENAME.
|
||
|
||
It does, or at least it should. The only problem case is where the db crashes
during the alter or during commit/rollback. This could be fixed by the first
open that fails to find the file, or by vacuum, or by some other utility.
|
||
|
||
> IMHO having
|
||
> an obsolete
|
||
> relname in the filename is worse than not having the relname at all;
|
||
> it's a recipe for confusion, it means you still need admin
|
||
> tools to tell
|
||
> which end is really up, and what's worst is you might think you don't.
|
||
>
|
||
> Furthermore it requires an additional column in pg_class to keep track
|
||
> of the original relname, which is a waste of space and effort.
|
||
|
||
it does not.
|
||
|
||
> Finally, one of the reasons I want to go to filenames based
|
||
> only on OID
|
||
> is that that'll make life easier for mdblindwrt. Original
|
||
> relname + OID
|
||
> doesn't help, in fact it makes life harder (more shmem space needed to
|
||
> keep track of the filename for each buffer).
|
||
|
||
I do not see this. filename is constructed from relname+oid.
|
||
if not found, do directory scan for *_<OID>.dat, if found --> rename.
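
A minimal sketch of the lookup described above, written in Python rather
than backend C. It assumes data files are named <relname>_<OID>.dat and
live in one directory. The function name open_relation_file and the file
names in the demo are hypothetical and do not come from the PostgreSQL
source.

```python
import os
import tempfile


def open_relation_file(dirpath, relname, oid):
    """Return the path of the relation's data file.

    Try the name built from the current relname first. If it is missing
    (for example after a crash during a rename), scan the directory for any
    file ending in _<OID>.dat and rename it to the expected name.
    """
    expected = os.path.join(dirpath, f"{relname}_{oid}.dat")
    if os.path.exists(expected):
        return expected
    suffix = f"_{oid}.dat"
    for entry in os.listdir(dirpath):
        if entry.endswith(suffix):
            # Stale relname prefix: the OID suffix still identifies the file.
            os.rename(os.path.join(dirpath, entry), expected)
            return expected
    raise FileNotFoundError(expected)


# Demo: the file on disk still carries the old relname "old".
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "old_42.dat"), "w").close()
    found = os.path.basename(open_relation_file(d, "new", 42))
    remaining = sorted(os.listdir(d))
```

The sketch only shows the fallback path. It says nothing about locking or
about what another backend sees while the rename is in progress, which is
the part of the proposal under dispute in this thread.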
|
||
|
||
Andreas
|
||
|
||
From pgsql-hackers-owner+M3407@hub.org Thu Jun 15 05:01:03 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA25462
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 05:01:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id EAA04667 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 04:45:51 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5F8gr817124;
|
||
Thu, 15 Jun 2000 04:42:53 -0400 (EDT)
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5F8gX815763
|
||
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 04:42:34 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA29072;
|
||
Thu, 15 Jun 2000 10:41:51 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F99RCR>; Thu, 15 Jun 2000 10:41:51 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE9@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>
|
||
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 15 Jun 2000 10:41:50 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> It's just not possible to do
|
||
> multiple file renames atomically.
|
||
|
||
This is not necessary, since *_<OID> is unique regardless of relname prefix.
|
||
|
||
Andreas
|
||
|
||
From scrappy@hub.org Thu Jun 15 08:30:59 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03846
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 08:30:58 -0400 (EDT)
|
||
Received: from thelab.hub.org (nat193.152.mpoweredpc.net [142.177.193.152]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id IAA14167 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 08:16:58 -0400 (EDT)
|
||
Received: from localhost (scrappy@localhost)
|
||
by thelab.hub.org (8.9.3/8.9.3) with ESMTP id JAA74856;
|
||
Thu, 15 Jun 2000 09:14:29 -0300 (ADT)
|
||
(envelope-from scrappy@hub.org)
|
||
X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
|
||
Date: Thu, 15 Jun 2000 09:14:29 -0300 (ADT)
|
||
From: The Hermit Hacker <scrappy@hub.org>
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Tom Lane <tgl@sss.pgh.pa.us>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
Oliver Elphick <olly@lfix.co.uk>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <200006150321.XAA09510@candle.pha.pa.us>
|
||
Message-ID: <Pine.BSF.4.21.0006150909030.722-100000@thelab.hub.org>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||
Status: RO
|
||
|
||
On Wed, 14 Jun 2000, Bruce Momjian wrote:
|
||
|
||
> > Backtraces from *what*, exactly? 99% of the backend is still going
|
||
> > to be dealing with the same data as ever. It might be that poking
|
||
> > around in fd.c will be a little harder, but considering that fd.c
|
||
> > doesn't really know or care what the files it's manipulating are
|
||
> > anyway, I'm not convinced that this is a real issue.
|
||
>
|
||
> I was just throwing gdb out as an example. The bigger ones are ls,
|
||
> lsof/fstat, and tar.
|
||
|
||
You've lost me on this one ... if someone does an lsof of the process, it
|
||
will still provide them a list of open files ... are you complaining about
|
||
the extra step required to translate the file name to a "valid table"?
|
||
|
||
Oh, one point here ... this whole 'filenaming issue' ... as far as ls is
|
||
concerned, at least, only affects the superuser, since he's the only one
|
||
that can go 'ls'ing around in the directories ...
|
||
|
||
And, ummm, how hard would it be to have \d in psql display the "physical
|
||
table name" as part of its output?
|
||
|
||
Slight tangent here:
|
||
|
||
One thing that I think would be great if we could add is some sort of:
|
||
|
||
SELECT db_name, disk_space;
|
||
|
||
query where a database owner, not the superuser, could see how much disk
|
||
space their tables are using up ... possible?
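
As a rough illustration of what such a query would have to compute, here is
a Python sketch that sums the sizes of all files under one directory. It
assumes each database keeps its files under a directory of its own. The
function name database_disk_space and the demo file names are hypothetical.

```python
import os
import tempfile


def database_disk_space(db_dir):
    """Return the total size in bytes of all files under db_dir."""
    total = 0
    for root, _dirs, names in os.walk(db_dir):
        for name in names:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Demo: two fake table files of 8192 and 100 bytes.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "16384"), "wb") as f:
        f.write(b"\0" * 8192)
    with open(os.path.join(d, "16385"), "wb") as f:
        f.write(b"\0" * 100)
    usage = database_disk_space(d)
```

A real implementation would run inside the backend so that a database owner
without filesystem access could use it. This sketch only shows the
arithmetic.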
|
||
|
||
|
||
From pgsql-hackers-owner+M3412@hub.org Thu Jun 15 08:30:55 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03842
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 08:30:54 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id IAA15241 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 08:31:29 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5FCSM877572;
|
||
Thu, 15 Jun 2000 08:28:22 -0400 (EDT)
|
||
Received: from zrtps06s.us.nortel.com ([47.140.48.50])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5FCRS877255
|
||
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 08:27:28 -0400 (EDT)
|
||
Received: from ertpg15e1.nortelnetworks.com (actually zrtph06n.us.nortel.com)
|
||
by zrtps06s.us.nortel.com; Thu, 15 Jun 2000 08:26:34 -0400
|
||
Received: from zrtpd004.us.nortel.com (actually zrtpd004)
|
||
by ertpg15e1.nortelnetworks.com; Thu, 15 Jun 2000 08:26:11 -0400
|
||
Received: from zrtpd003.us.nortel.com ([47.140.224.137])
|
||
by zrtpd004.us.nortel.com
|
||
with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
|
||
id MPQCZWMM; Thu, 15 Jun 2000 08:26:10 -0400
|
||
Received: from americasm01.nt.com (hrtpp28d.us.nortel.com [47.190.110.250])
|
||
by zrtpd003.us.nortel.com
|
||
with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
|
||
id L1N0XG78; Thu, 15 Jun 2000 08:26:12 -0400
|
||
Message-ID: <3948CBDC.5A4F5705@americasm01.nt.com>
|
||
Date: Thu, 15 Jun 2000 08:28:12 -0400
|
||
X-Sybari-Space: 00000000 00000000 00000000
|
||
From: "Mark Hollomon" <mhh@nortelnetworks.com>
|
||
Reply-To: "Mark Hollomon" <mhh@nortelnetworks.com>
|
||
Organization: Nortel Networks
|
||
X-Mailer: Mozilla 4.04 [en] (Win95; U)
|
||
MIME-Version: 1.0
|
||
To: "Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
CC: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Orig: <mhh@nortelnetworks.com>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Ross J. Reedstrom wrote:
|
||
>
|
||
> Any strong objections to the mixed relname_oid solution? It gets us
|
||
> everything oids does, and still lets Bruce use 'ls -l' to find the big
|
||
> tables, putting off writing any admin tools that'll need to be rewritten,
|
||
> anyway.
|
||
|
||
I would object to the mixed name.
|
||
|
||
Consider:
|
||
|
||
CREATE TABLE FOO ....
|
||
ALTER TABLE FOO RENAME FOO_OLD;
|
||
CREATE TABLE FOO ....
|
||
|
||
For the same atomicity reason, rename can't change the
|
||
name of the files. So, which foo_<oid> is the FOO_OLD
|
||
and which is FOO?
|
||
|
||
In other words, in the presence of rename, putting
|
||
relname in the filename is misleading at best.
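
The scenario above can be sketched as a small Python simulation. It assumes
the file name is fixed at creation time as <original relname>_<oid> and
that rename touches only the catalog. The dictionaries, function names, and
starting OID are hypothetical and chosen for illustration.

```python
import itertools

catalog = {}  # current relname -> oid
files = {}    # oid -> file name on disk, fixed at creation time
oid_source = itertools.count(16384)  # hypothetical starting OID


def create_table(relname):
    oid = next(oid_source)
    catalog[relname] = oid
    files[oid] = f"{relname}_{oid}"
    return oid


def rename_table(old, new):
    # Only the catalog entry changes; the file keeps its original name.
    catalog[new] = catalog.pop(old)


create_table("foo")
rename_table("foo", "foo_old")
create_table("foo")

# Both files on disk now start with "foo_". Only the catalog can say which
# one belongs to FOO_OLD and which to the new FOO.
on_disk = sorted(files.values())
```

After these three statements the directory holds foo_16384 and foo_16385,
and the one with the lower OID belongs to the table now called foo_old.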
|
||
|
||
--
|
||
|
||
Mark Hollomon
|
||
mhh@nortelnetworks.com
|
||
ESN 451-9008 (302)454-9008
|
||
|
||
From pgsql-hackers-owner+M3413@hub.org Thu Jun 15 08:30:47 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03837
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 08:30:45 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5FCTb883200;
|
||
Thu, 15 Jun 2000 08:29:37 -0400 (EDT)
|
||
Received: from smtp1.andrew.cmu.edu (SMTP1.ANDREW.CMU.EDU [128.2.10.81])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5FCT7881265
|
||
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 08:29:07 -0400 (EDT)
|
||
Received: from export.andrew.cmu.edu (EXPORT.ANDREW.CMU.EDU [128.2.23.2])
|
||
by smtp1.andrew.cmu.edu (8.9.3/8.9.3) with ESMTP id IAA02782
|
||
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 08:29:02 -0400 (EDT)
|
||
Date: Thu, 15 Jun 2000 08:29:02 -0400 (EDT)
|
||
Message-Id: <200006151229.IAA02782@smtp1.andrew.cmu.edu>
|
||
From: Brian E Gallew <geek+@cmu.edu>
|
||
X-Mailer: BatIMail version 3.2
|
||
To: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
|
||
In-reply-to: <16810.961036579@sss.pgh.pa.us>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
References: <200006150228.WAA06576@candle.pha.pa.us> <16810.961036579@sss.pgh.pa.us>
|
||
Mime-Version: 1.0 (generated by tm-edit 7.106)
|
||
Content-Type: multipart/signed; protocol="application/pgp-signature";
|
||
boundary="pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1"; micalg=pgp-md5
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
|
||
--pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1
|
||
Content-Type: text/plain; charset=US-ASCII
|
||
|
||
Then <tgl@sss.pgh.pa.us> spoke up and said:
|
||
> Precedence: bulk
|
||
>
|
||
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> > But seriously, let me give some background. I used Ingres, that used
|
||
> > the VMS file system, but used strange sequential AAAF324 numbers for
|
||
> > tables. When someone deleted a table, or we were looking at what tables
|
||
> > were using disk space, it was impossible to find the Ingres table names
|
||
> > that went with the file. There was a system table that showed it, but
|
||
> > it was poorly documented, and if you deleted the table, there was no way
|
||
> > to look on the tape to find out which file to restore.
|
||
>
|
||
> Fair enough, but it seems to me that the answer is to expend some effort
|
||
> on system admin support tools. We could do a lot in that line with less
|
||
> effort than trying to make a fundamentally mismatched filesystem
|
||
> representation do what we need.
|
||
|
||
We've been an Ingres shop as long as there's been an Ingres. While
|
||
we've also had the problem Bruce noticed with table names, we've
|
||
*also* used the trivial fix of running a (simple) Report Writer job
|
||
each night, immediately before the backup, that lists all of the
|
||
database tables/indices and the underlying files.
|
||
|
||
True, if someone drops/recreates a table twice between backups we
|
||
can't find the intermediate file name, but since we also haven't
|
||
backed up that filename, this isn't an issue.
|
||
|
||
Also, the consistency issue is really not as important as you would
|
||
think. If you are restoring a table, you want the information in it,
|
||
whether or not it's consistent with anything else. I've done hundreds
|
||
of table restores (can you say "modify table to heap"?) and never once
|
||
has inconsistency been an issue. Oh, yeah, and we don't shut the
|
||
database down for this, either. (That last isn't my choice, BTW.)
|
||
|
||
--
|
||
=====================================================================
|
||
| JAVA must have been developed in the wilds of West Virginia. |
|
||
| After all, why else would it support only single inheritance?? |
|
||
=====================================================================
|
||
| Finger geek@cmu.edu for my public key. |
|
||
=====================================================================
|
||
|
||
--pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1
|
||
Content-Type: application/pgp-signature
|
||
Content-Transfer-Encoding: 7bit
|
||
|
||
-----BEGIN PGP MESSAGE-----
|
||
Version: 2.6.2
|
||
Comment: Processed by Mailcrypt 3.3, an Emacs/PGP interface
|
||
|
||
iQBVAwUBOUjMDYdzVnzma+gdAQHUowH+JglNasUWT5RKSnF3pzNdy5nyrGmLhbWa
|
||
Oom1oUqToxcyfjVFL34dXpnIlvNHO0K2Di4NKZ9HykwOHzrnExf15w==
|
||
=yXoe
|
||
-----END PGP MESSAGE-----
|
||
|
||
--pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1--
|
||
|
||
|
||
From dhogaza@pacifier.com Thu Jun 15 09:31:05 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA04418
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 09:31:04 -0400 (EDT)
|
||
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id JAA20080 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 09:22:36 -0400 (EDT)
|
||
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
|
||
by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id GAA05755;
|
||
Thu, 15 Jun 2000 06:21:54 -0700 (PDT)
|
||
Message-Id: <3.0.1.32.20000615054049.011bcec0@mail.pacifier.com>
|
||
X-Sender: dhogaza@mail.pacifier.com
|
||
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
|
||
Date: Thu, 15 Jun 2000 05:40:49 -0700
|
||
To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>, Tom Lane <tgl@sss.pgh.pa.us>
|
||
From: Don Baccus <dhogaza@pacifier.com>
|
||
Subject: Re: AW: [HACKERS] Big 7.1 open items
|
||
Cc: Jan Wieck <JanWieck@yahoo.com>, Oliver Elphick <olly@lfix.co.uk>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
In-Reply-To: <219F68D65015D011A8E000006F8590C604AF7DE7@sdexcsrv1.f000.d0
|
||
188.sd.spardat.at>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: RO
|
||
|
||
At 10:04 AM 6/15/00 +0200, Zeugswetter Andreas SB wrote:
|
||
>
|
||
>> In reality, very few people are going to be interested in restoring
|
||
>> a table in a way that breaks referential integrity and other
|
||
>> normal assumptions about what exists in the database.
|
||
>
|
||
>This is not true. In my DBA history it would have saved me manweeks
|
||
>of work if an easy and efficient restore of one single table from backup
|
||
>would have been available in Informix and Oracle.
|
||
>We always had to restore most of the whole system to another machine only
|
||
>to get back at some table info that would then be manually re-added
|
||
>to the production system.
|
||
|
||
I'm missing something, I guess. You would do a createdb, do a filesystem
|
||
copy of pg_log and one file into it, and then read data from the table
|
||
without having to restore the other tables in the database?
|
||
|
||
I'm just curious - when was the last time you restored a Postgres
|
||
database in this piecemeal manner, and how often do you do it?
|
||
|
||
|
||
|
||
- Don Baccus, Portland OR <dhogaza@pacifier.com>
|
||
Nature photos, on-line guides, Pacific Northwest
|
||
Rare Bird Alert Service and other goodies at
|
||
http://donb.photo.net.
|
||
|
||
From pgsql-hackers-owner+M3440@hub.org Thu Jun 15 14:46:22 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA04607
|
||
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 14:46:21 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA12695 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 12:48:58 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5FGjXI40370;
|
||
Thu, 15 Jun 2000 12:45:33 -0400 (EDT)
|
||
Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5FGjJI39359
|
||
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 12:45:20 -0400 (EDT)
|
||
Received: by rice.edu
|
||
via sendmail from stdin
|
||
id <m132clb-000LEEC@wallace.ece.rice.edu> (Debian Smail3.2.0.102)
|
||
for pgsql-hackers@postgresql.org; Thu, 15 Jun 2000 11:45:19 -0500 (CDT)
|
||
Date: Thu, 15 Jun 2000 11:45:19 -0500
|
||
From: "Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Message-ID: <20000615114519.B3939@rice.edu>
|
||
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> <18798.961053112@sss.pgh.pa.us>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset=us-ascii
|
||
User-Agent: Mutt/1.0i
|
||
In-Reply-To: <18798.961053112@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Thu, Jun 15, 2000 at 03:11:52AM -0400
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> > Any strong objections to the mixed relname_oid solution?
>
> Yes!
>
> You cannot make it work reliably unless the relname part is the original
> relname and does not track ALTER TABLE RENAME. IMHO having an obsolete
> relname in the filename is worse than not having the relname at all;
> it's a recipe for confusion, it means you still need admin tools to tell
> which end is really up, and what's worst is you might think you don't.

The plan here was to let VACUUM handle renaming the file, since it
will already have all the necessary locks. This shortens the window
of confusion. ALTER TABLE RENAME doesn't happen that often, really -
the relname is there just for human consumption, then.

>
> Furthermore it requires an additional column in pg_class to keep track
> of the original relname, which is a waste of space and effort.
>

I actually started down this path thinking about implementing SCHEMA:
since tables in the same DB but in different schemas can have the same
relname, I figured I needed to change that. We'll need something in
pg_class to keep track of what schema a relation is in, instead.

> It also creates a portability risk, or at least fails to remove one,
> since you are critically dependent on the assumption that the OS
> supports long filenames --- on a filesystem that truncates names to less
> than about 45 characters you're in very deep trouble. An OID-only
> approach still works on traditional 14-char-filename Unix filesystems
> (it'd mostly even work on DOS 8+3, though I doubt we care about that).

Actually, no. Since I store the filename in a name attribute, I used this
nifty function somebody wrote, makeObjectName, to trim the relname part,
but leave the oid. (Yes, I know it's yours ;-)
||
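
To illustrate the trimming Ross describes, here is a minimal Python
sketch. It is not makeObjectName itself: the helper name physical_name,
the "_" separator, and the 31-character limit (an assumed NAMEDATALEN of
32, less the terminator) are all assumptions made for illustration.

```python
NAMEDATALEN = 32  # assumed size of a `name` attribute, including the terminator


def physical_name(relname: str, oid: int, limit: int = NAMEDATALEN - 1) -> str:
    """Hypothetical helper: build "relname_oid", trimming only the relname part.

    The OID suffix is always kept whole, so the result stays unique even
    when two long relnames share the same leading characters.
    """
    suffix = f"_{oid}"
    room = limit - len(suffix)  # characters left over for the relname part
    if room < 1:
        raise ValueError("limit too small to hold the OID suffix")
    return relname[:room] + suffix
```

A short relname passes through unchanged, while a long one is cut so
that the whole result still fits in the limit and still ends in the OID.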

>
> Finally, one of the reasons I want to go to filenames based only on OID
> is that that'll make life easier for mdblindwrt. Original relname + OID
> doesn't help, in fact it makes life harder (more shmem space needed to
> keep track of the filename for each buffer).

Can you explain in more detail how this helps? Not by letting the bufmgr
know that oid == filename, I hope. We need to improve the abstraction
of the smgr, not add another violation. Ah, sorry, mdblindwrt _is_
in the smgr.

Hmm, grovelling through that code, I see how it could be simpler if reloid
== filename. Heck, we even get to save shmem in the buffdesc.blind part,
since we only need the dbname in there, now.

Hmm, I see I missed the relpath_blind() in my patch - oops. (relpath()
is always called with RelationGetPhysicalRelationName(), and that's
where I was putting in the relphysname)

Hmm, what's all this with functions in catalog.c that are only called by
smgr/md.c? seems to me that anything having to do with physical storage
(like the path!) belongs in the smgr abstraction.

>
> Can we *PLEASE JUST LET GO* of this bad idea? No relname in the
> filename. Period.
>

Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at
all_ when I first put up patches two months ago. O.K., I'll do the oids
only version (and fix up relpath_blind)

Ross

--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

From Inoue@tpf.co.jp Thu Jun 15 17:45:40 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA27548
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 17:45:37 -0400 (EDT)
Received: from mcadnote1 (ppm122.noc.fukui.nsk.ne.jp [210.161.188.41])
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id GAA07248; Fri, 16 Jun 2000 06:45:30 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>,
"Ross J. Reedstrom" <reedstrm@rice.edu>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>,
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Fri, 16 Jun 2000 06:48:21 +0900
Message-ID: <EKEJJICOHDIEMGPNIFIJKEPCCBAA.Inoue@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
In-Reply-To: <200006151935.PAA17512@candle.pha.pa.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
Status: ROr

> -----Original Message-----
> From: pgsql-hackers-owner@hub.org
> [mailto:pgsql-hackers-owner@hub.org]On Behalf Of Bruce Momjian
>
> > > Can we *PLEASE JUST LET GO* of this bad idea? No relname in the
> > > filename. Period.
> > >
> >
> > Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at
> > all_ when I first put up patches two months ago. O.K., I'll do the oids
> > only version (and fix up relpath_blind)
>
> Hold on. I don't think we want that work done yet. Seems even Tom is
> thinking that if Vadim is going to re-do everything later anyway, we may
> be better with a relname/oid solution that does require additional
> administration apps.
>

Hmm, why does the naming rule come first?

I've never emphasized the naming rule, except that it should be unique.
My main point has been to reduce the need for a naming rule as much
as possible. IIRC, by keeping the storage location in pg_class, Ross's
trial patch leaves only 2 places where a naming rule is required.
So wouldn't we be free of the naming rule (it would not be so difficult
to change the naming rule if it is found to be bad)?

I've also mentioned many times that neither relname nor oid is sufficient
for uniqueness. In addition, neither relname nor oid is necessary
for uniqueness.
IMHO, it's bad to rely on an item which is neither necessary nor
sufficient.
I proposed relname+unique_id naming once. The unique_id is
independent of the oid. The relname is only for the convenience of
the DBA, and so we don't have to change it on RENAME.
The DB's consistency is much more important than the DBA's
satisfaction.

Comments?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


From pgsql-hackers-owner+M3448@hub.org Thu Jun 15 19:01:03 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00764
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 19:01:02 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id SAA17328 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 18:57:32 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5FMsMI97744;
Thu, 15 Jun 2000 18:54:22 -0400 (EDT)
Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
by hub.org (8.10.1/8.10.1) with ESMTP id e5FMs0I94252
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 18:54:00 -0400 (EDT)
Received: by rice.edu
via sendmail from stdin
id <m132iWN-000LEEC@wallace.ece.rice.edu> (Debian Smail3.2.0.102)
for pgsql-hackers@postgresql.org; Thu, 15 Jun 2000 17:53:59 -0500 (CDT)
Date: Thu, 15 Jun 2000 17:53:59 -0500
From: "Ross J. Reedstrom" <reedstrm@rice.edu>
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
Message-ID: <20000615175359.A12194@rice.edu>
Mail-Followup-To: PostgreSQL-development <pgsql-hackers@postgresql.org>
References: <EKEJJICOHDIEMGPNIFIJKEPCCBAA.Inoue@tpf.co.jp> <200006152148.RAA27790@candle.pha.pa.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.0i
In-Reply-To: <200006152148.RAA27790@candle.pha.pa.us>; from pgman@candle.pha.pa.us on Thu, Jun 15, 2000 at 05:48:59PM -0400
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

On Thu, Jun 15, 2000 at 05:48:59PM -0400, Bruce Momjian wrote:
> > I've also mentioned many times that neither relname nor oid is sufficient
> > for uniqueness. In addition, neither relname nor oid is necessary
> > for uniqueness.
> > IMHO, it's bad to rely on an item which is neither necessary nor
> > sufficient.
> > I proposed relname+unique_id naming once. The unique_id is
> > independent of the oid. The relname is only for the convenience of
> > the DBA, and so we don't have to change it on RENAME.
> > The DB's consistency is much more important than the DBA's
> > satisfaction.
> >
> > Comments?
>
> I am happy not to rename the file on 'RENAME', but seems no one likes
> that.

Good, 'cause that's how I've implemented it so far. Actually, all
I've done is port my previous patch to current, with one little
change: I added a macro RelationGetRealRelationName which does what
RelationGetPhysicalRelationName used to do: i.e. return the relname with
no temptable funny business, and used that for the relcache macros. It
passes all the serial regression tests: I haven't run the parallel tests
yet. ALTER TABLE RENAME rolls back nicely. I'll need to learn some more
about xacts to get DROP TABLE rolling back.

I'll drop it on PATCHES right now, for comment.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

From pgsql-hackers-owner+M3451@hub.org Thu Jun 15 20:01:00 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA01651
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 20:00:59 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA20985 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 19:57:49 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5FNsgI25402;
Thu, 15 Jun 2000 19:54:42 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
by hub.org (8.10.1/8.10.1) with ESMTP id e5FNsCI22412
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 19:54:12 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA02263;
Thu, 15 Jun 2000 19:53:52 -0400 (EDT)
To: "Ross J. Reedstrom" <reedstrm@rice.edu>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <20000615114519.B3939@rice.edu>
References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> <18798.961053112@sss.pgh.pa.us> <20000615114519.B3939@rice.edu>
Comments: In-reply-to "Ross J. Reedstrom" <reedstrm@rice.edu>
message dated "Thu, 15 Jun 2000 11:45:19 -0500"
Date: Thu, 15 Jun 2000 19:53:52 -0400
Message-ID: <2260.961113232@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

"Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
>> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
>>>> Any strong objections to the mixed relname_oid solution?
>>
>> Yes!

> The plan here was to let VACUUM handle renaming the file, since it
> will already have all the necessary locks. This shortens the window
> of confusion. ALTER TABLE RENAME doesn't happen that often, really -
> the relname is there just for human consumption, then.

Yeah, I've seen tons of discussion of how if we do this, that, and
the other thing, and be prepared to fix up some other things in case
of crash recovery, we can make it work with filename == relname + OID
(where relname tracks logical name, at least at some remove).

Probably. Assuming nobody forgets anything.

I'm just trying to point out that that's a huge amount of pretty
delicate mechanism. The amount of work required to make it trustworthy
looks to me to dwarf the admin tools that Bruce is complaining about.
And we only have a few people competent to do the work. (With all
due respect, Ross, if you weren't already aware of the implications
for mdblindwrt, I have to wonder what else you missed.)

Filename == OID is so simple, reliable, and straightforward by
comparison that I think the decision is a no-brainer.

If we could afford to sink unlimited time into this one issue then
it might make sense to do it the hard way, but we have enough
important stuff on our TODO list to keep us all busy for years ---
I cannot believe that it's an effective use of our time to do this.


> Hmm, what's all this with functions in catalog.c that are only called by
> smgr/md.c? seems to me that anything having to do with physical storage
> (like the path!) belongs in the smgr abstraction.

Yeah, there's a bunch of stuff that should have been implemented by
adding new smgr entry points, but wasn't. It should be pushed down.
(I can't resist pointing out that one of those things is physical
relation rename, which will go away and not *need* to be pushed down
if we do it the way I want.)

regards, tom lane

From tgl@sss.pgh.pa.us Thu Jun 15 20:00:59 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA01647
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 20:00:58 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA21034 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 19:58:30 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA02283;
Thu, 15 Jun 2000 19:57:05 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: "Ross J. Reedstrom" <reedstrm@rice.edu>,
PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006151935.PAA17512@candle.pha.pa.us>
References: <200006151935.PAA17512@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
message dated "Thu, 15 Jun 2000 15:35:45 -0400"
Date: Thu, 15 Jun 2000 19:57:05 -0400
Message-ID: <2280.961113425@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at
>> all_ when I first put up patches two months ago. O.K., I'll do the oids
>> only version (and fix up relpath_blind)

> Hold on. I don't think we want that work done yet. Seems even Tom is
> thinking that if Vadim is going to re-do everything later anyway, we may
> be better with a relname/oid solution that does require additional
> administration apps.

Don't put words in my mouth, please. If we are going to throw the
work away later, it'd be foolish to do the much greater amount of
work needed to make filename=relname+OID fly than is needed for
filename=OID.

However, I'm pretty sure I recall Vadim stating that he thought
filename=OID would be required for his smgr changes anyway...

regards, tom lane

From pgsql-hackers-owner+M3453@hub.org Thu Jun 15 21:01:01 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA02731
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 21:01:01 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id UAA23469 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 20:36:36 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5G0WDI97134;
Thu, 15 Jun 2000 20:32:13 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by hub.org (8.10.1/8.10.1) with ESMTP id e5G0VsI97003
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 20:31:54 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id JAA07328; Fri, 16 Jun 2000 09:26:04 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <maillist@candle.pha.pa.us>,
"Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>,
"Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Fri, 16 Jun 2000 09:28:14 +0900
Message-ID: <000d01bfd729$c24b29c0$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-Reply-To: <2260.961113232@sss.pgh.pa.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> -----Original Message-----
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> Behalf Of Tom Lane
>
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> > On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
> >> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> >>>> Any strong objections to the mixed relname_oid solution?
> >>
> >> Yes!
>
> > The plan here was to let VACUUM handle renaming the file, since it
> > will already have all the necessary locks. This shortens the window
> > of confusion. ALTER TABLE RENAME doesn't happen that often, really -
> > the relname is there just for human consumption, then.
>
> Yeah, I've seen tons of discussion of how if we do this, that, and
> the other thing, and be prepared to fix up some other things in case
> of crash recovery, we can make it work with filename == relname + OID
> (where relname tracks logical name, at least at some remove).
>

I've seen little discussion of how to avoid depending on a naming rule.
I've proposed many times that we should keep the information about
where a table is stored in the database itself. I've never seen
clear objections to it, so may I take it that my proposal is OK?
Isn't that much more important than the naming rule? With that
mechanism, we could easily replace a bad naming rule.
And I believe that Ross's work is mostly about the mechanism,
not the naming rule.

Now I like neither relname nor oid because it's not sufficient
for my purpose.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From tgl@sss.pgh.pa.us Thu Jun 15 22:01:02 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA03637
for <maillist@candle.pha.pa.us>; Thu, 15 Jun 2000 22:01:01 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id VAA28521 for <maillist@candle.pha.pa.us>; Thu, 15 Jun 2000 21:58:46 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA02730;
Thu, 15 Jun 2000 21:57:27 -0400 (EDT)
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
cc: "Bruce Momjian" <maillist@candle.pha.pa.us>,
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
"Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <000d01bfd729$c24b29c0$2801007e@tpf.co.jp>
References: <000d01bfd729$c24b29c0$2801007e@tpf.co.jp>
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
message dated "Fri, 16 Jun 2000 09:28:14 +0900"
Date: Thu, 15 Jun 2000 21:57:27 -0400
Message-ID: <2727.961120647@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Now I like neither relname nor oid because it's not sufficient
> for my purpose.

We should probably not do much of anything with this issue until
we have a clearer understanding of what we want to do about
tablespaces and schemas.

My gut feeling is that we will end up with pathnames that look
something like

.../data/base/DBNAME/TABLESPACE/OIDOFRELATION

(with .N attached if a segment of a large relation, of course).

The TABLESPACE "name" should likely be an OID itself, but it wouldn't
have to be if you are willing to say that tablespaces aren't renamable.
(Come to think of it, does anyone care about being able to rename
databases? ;-)) Note that the TABLESPACE will often be a symlink
to storage on another drive, rather than a plain subdirectory of the
DBNAME, but that shouldn't be an issue at this level of discussion.

I think that schemas probably don't enter into this. We should instead
rely on the uniqueness of OIDs to prevent filename collisions. However,
OIDs aren't really unique: different databases in an installation will
use the same OIDs for their system tables. My feeling is that we can
live with a restriction like "you can't store the system tables of
different databases in the same tablespace". Alternatively we could
avoid that issue by inverting the pathname order:

.../data/base/TABLESPACE/DBNAME/OIDOFRELATION

Note that in any case, system tables will have to live in a
predetermined tablespace, since you can't very well look in pg_class
to find out which tablespace pg_class lives in. Perhaps we should
just reserve a tablespace per database for system tables and forget
the whole issue. If we do that, there's not really any need for
the database in the path! Just

.../data/base/TABLESPACE/OIDOFRELATION

would do fine and help reduce lookup overhead.
||
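
The last layout above can be illustrated with a few lines of Python.
This is only a sketch: the function name relation_path and its
parameters are hypothetical, the tablespace is assumed to be identified
by an OID, and segment 0 is assumed to carry no ".N" suffix.

```python
from pathlib import PurePosixPath


def relation_path(data_dir: str, tablespace_oid: int, rel_oid: int,
                  segment: int = 0) -> PurePosixPath:
    """Hypothetical sketch of .../data/base/TABLESPACE/OIDOFRELATION[.N]."""
    if segment < 0:
        raise ValueError("segment number must be non-negative")
    # Assumption: only the second and later segments get a ".N" suffix.
    filename = str(rel_oid) if segment == 0 else f"{rel_oid}.{segment}"
    return PurePosixPath(data_dir) / "base" / str(tablespace_oid) / filename
```

Because the database does not appear in the path, two databases whose
system tables share an OID would map to the same file if they shared a
tablespace, which is why this layout depends on reserving a tablespace
per database for system tables.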

BTW, schemas do make things interesting for the other camp:
is it possible for the same table to be referenced by different
names in different schemas? If so, just how useful is it to pick
one of those names arbitrarily for the filename? This is an advanced
version of the main objection to using the original relname and not
updating it at RENAME TABLE --- sooner or later, the filenames are
going to be more confusing than helpful.

Comments? Have I missed something important about schemas?

regards, tom lane

From pgsql-hackers-owner+M3457@hub.org Thu Jun 15 22:27:45 2000
Received: from hub.org (root@hub.org [216.126.84.1])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA04586
for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 22:27:44 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5G2POI23418;
Thu, 15 Jun 2000 22:25:24 -0400 (EDT)
Received: from candle.pha.pa.us (pgman@nav-43.dsl.navpoint.com [162.33.245.46])
by hub.org (8.10.1/8.10.1) with ESMTP id e5G2P3I23299
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 22:25:04 -0400 (EDT)
Received: (from pgman@localhost)
by candle.pha.pa.us (8.9.0/8.9.0) id WAA04345;
Thu, 15 Jun 2000 22:24:53 -0400 (EDT)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200006160224.WAA04345@candle.pha.pa.us>
Subject: Re: [HACKERS] Big 7.1 open items
In-Reply-To: <2727.961120647@sss.pgh.pa.us> "from Tom Lane at Jun 15, 2000 09:57:27
pm"
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 15 Jun 2000 22:24:52 -0400 (EDT)
CC: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <maillist@candle.pha.pa.us>,
PostgreSQL-development <pgsql-hackers@postgresql.org>,
"Ross J. Reedstrom" <reedstrm@rice.edu>
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Now I like neither relname nor oid because it's not sufficient
> > for my purpose.
>
> We should probably not do much of anything with this issue until
> we have a clearer understanding of what we want to do about
> tablespaces and schemas.

Here is an analysis of our options:

                        Work required     Disadvantages
----------------------------------------------------------------------------

Keep current system     no work           rename/create no rollback

relname/oid but         less work         new pg_class column,
no rename change                          filename not accurate on
                                          rename

relname/oid with        more work         complex code
rename change during
vacuum

oid filename            less work, but    confusing to admins
                        need admin tools

--
Bruce Momjian                      | http://www.op.net/~candle
pgman@candle.pha.pa.us             | (610) 853-3000
+ If your life is a hard drive,    | 830 Blythe Avenue
+ Christ can be your backup.       | Drexel Hill, Pennsylvania 19026

From Inoue@tpf.co.jp Thu Jun 15 22:41:50 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA05230
for <maillist@candle.pha.pa.us>; Thu, 15 Jun 2000 22:41:48 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id LAA07495; Fri, 16 Jun 2000 11:41:43 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Bruce Momjian" <maillist@candle.pha.pa.us>,
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
"Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Fri, 16 Jun 2000 11:43:52 +0900
Message-ID: <000201bfd73c$b52873c0$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-Reply-To: <2727.961120647@sss.pgh.pa.us>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Status: RO

Sorry for my previous mail. It was posted by mistake.

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Now I like neither relname nor oid because it's not sufficient
> > for my purpose.
>
> We should probably not do much of anything with this issue until
> we have a clearer understanding of what we want to do about
> tablespaces and schemas.
>
> My gut feeling is that we will end up with pathnames that look
> something like
>
> .../data/base/DBNAME/TABLESPACE/OIDOFRELATION
>

Schema is a logical concept and irrelevant to physical location.
I strongly object to your suggestion unless the above means the
*default* location.
A tablespace is an encapsulation of table allocation, and its
name should basically be irrelevant to the location. So the above
seems very bad to me.

Anyway, I don't see any advantage in a fixed-mapping implementation.
After the renewal, we should at least be able to allocate a
specific table in an arbitrary separate directory.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From Inoue@tpf.co.jp Thu Jun 15 23:31:00 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06634;
Thu, 15 Jun 2000 23:30:59 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA03227; Thu, 15 Jun 2000 23:18:54 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id MAA07544; Fri, 16 Jun 2000 12:18:06 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>, "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Bruce Momjian" <maillist@candle.pha.pa.us>,
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
"Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Fri, 16 Jun 2000 12:20:16 +0900
Message-ID: <000401bfd741$cabea100$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-Reply-To: <200006160224.WAA04345@candle.pha.pa.us>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Status: RO

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > > Now I like neither relname nor oid because it's not sufficient
> > > for my purpose.
> >
> > We should probably not do much of anything with this issue until
> > we have a clearer understanding of what we want to do about
> > tablespaces and schemas.
>
> Here is an analysis of our options:
>
>                         Work required     Disadvantages
> ----------------------------------------------------------------------------
>
> Keep current system     no work           rename/create no rollback
>
> relname/oid but         less work         new pg_class column,
> no rename change                          filename not accurate on
>                                           rename
>
> relname/oid with        more work         complex code
> rename change during
> vacuum
>
> oid filename            less work, but    confusing to admins
>                         need admin tools
>

Please add my proposal for the naming rule.

relname/unique_id but   need some work    new pg_class column,
no relname change.      for unique-id     filename not relname
                        generation

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From pgsql-hackers-owner+M3465@hub.org Fri Jun 16 00:01:01 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA06924
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 00:01:00 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA05470 for <pgman@candle.pha.pa.us>; Thu, 15 Jun 2000 23:59:46 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
by hub.org (8.10.1/8.10.1) with SMTP id e5G3uaI10809;
Thu, 15 Jun 2000 23:56:36 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
by hub.org (8.10.1/8.10.1) with ESMTP id e5G3uKI10702
for <pgsql-hackers@postgresql.org>; Thu, 15 Jun 2000 23:56:21 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
id MAA07571; Fri, 16 Jun 2000 12:55:33 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Fri, 16 Jun 2000 12:57:44 +0900
Message-ID: <000501bfd747$067f0220$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-Reply-To: <3264.961127021@sss.pgh.pa.us>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> > Please add my opinion for naming rule.
|
||
>
|
||
> >   relname/unique_id but    need some work             new pg_class column,
> >   no relname change.       for unique-id generation   filename not relname
|
||
>
|
||
> Why is a unique ID better than --- or even different from ---
|
||
> using the relation's OID? It seems pointless to me...
|
||
>
|
||
|
||
For example, in the implementation of the CLUSTER command,
|
||
we would need another new file for the target relation in
|
||
order to put sorted rows but don't we want to change the
|
||
OID? It would be needed for table re-construction generally.
|
||
If I remember correctly, you once proposed OID+version
|
||
naming for the cases.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From Inoue@tpf.co.jp Fri Jun 16 02:01:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA08093
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 02:00:59 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA10174 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 01:34:44 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id OAA07656; Fri, 16 Jun 2000 14:33:12 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Fri, 16 Jun 2000 14:35:21 +0900
|
||
Message-ID: <000001bfd754$a9e44f80$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <3238.961126521@sss.pgh.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> > Tablespace is an encapsulation of table allocation and the
|
||
> > name should be irrelevant to the location basically. So above
|
||
> > seems very bad for me.
|
||
> > Anyway I don't see any advantage in fixed mapping
> > implementation. After renewal, we should at least have a possibility to
|
||
> > allocate a specific table in arbitrary separate directory.
|
||
>
|
||
> Call a "directory" a "tablespace" and we're on the same page,
|
||
> aren't we? Actually I'd envision some kind of admin command
|
||
> "CREATE TABLESPACE foo AS /path/to/wherever".
|
||
|
||
Yes, I think 'tablespace -> directory' is the most natural
|
||
extension under current file_per_table storage manager.
|
||
If a many_tables_in_a_file storage manager is introduced, we
may be able to change the definition of TABLESPACE
|
||
to 'tablespace -> files' like Oracle.
|
||
|
||
> That would make
|
||
> appropriate system catalog entries and also create a symlink
|
||
> from ".../data/base/foo" (or some such place) to the target
|
||
> directory.
|
||
> Then when we make a table in that tablespace,
|
||
> it's in the right place. Problem solved, no?
|
||
>
|
||
|
||
I don't like symlinks for DBMS data files. However, it may
be OK if symlinks are limited to the 'tablespace->directory'
correspondence and all tablespaces (including the default
etc.) are symlinks. It is simple, and all debugging would
be done under a tablespace_is_symlink environment.
|
||
|
||
> It gets a little trickier if you want to be able to split
|
||
> multi-gig tables across several tablespaces, though, since
|
||
> you couldn't just append ".N" to the base table path in that
|
||
> scenario.
|
||
>
|
||
|
||
This seems to be not that easy to solve now.
|
||
Ross doesn't change this naming rule for multi-gig
|
||
tables either in his trial.
|
||
|
||
> I'd be interested to know what sort of facilities Oracle
|
||
> provides for managing huge tables...
|
||
>
|
||
|
||
As far as I know about old Oracle, one TABLESPACE
|
||
could have many DATAFILEs which could contain
|
||
many tables.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From pgsql-hackers-owner+M3469@hub.org Fri Jun 16 02:01:03 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA08109
|
||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 02:01:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA11218 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 01:57:33 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5G5tLI49492;
|
||
Fri, 16 Jun 2000 01:55:21 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5G5tAI49395
|
||
for <pgsql-hackers@postgresql.org>; Fri, 16 Jun 2000 01:55:10 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA05749;
|
||
Fri, 16 Jun 2000 01:54:46 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <000501bfd747$067f0220$2801007e@tpf.co.jp>
|
||
References: <000501bfd747$067f0220$2801007e@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Fri, 16 Jun 2000 12:57:44 +0900"
|
||
Date: Fri, 16 Jun 2000 01:54:46 -0400
|
||
Message-ID: <5746.961134886@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
>> Why is a unique ID better than --- or even different from ---
|
||
>> using the relation's OID? It seems pointless to me...
|
||
|
||
> For example,in the implementation of CLUSTER command,
|
||
> we would need another new file for the target relation in
|
||
> order to put sorted rows but don't we want to change the
|
||
> OID ? It would be needed for table re-construction generally.
|
||
> If I remember correctly, you once proposed OID+version
|
||
> naming for the cases.
|
||
|
||
Hmm, so you are thinking that the pg_class row for the table would
|
||
include this uniqueID, and then committing the pg_class update would
|
||
be the atomic action that replaces the old table contents with the
|
||
new? It does have some attraction now that I think about it.
|
||
|
||
But there are other ways we could do the same thing. If we want to
|
||
have tablespaces, there will need to be a tablespace identifier in
|
||
each pg_class row. So we could do CLUSTER in the same way as we'd
|
||
move a table from one tablespace to another: create the new files in
|
||
the new tablespace directory, and the commit of the new pg_class row
|
||
with the new tablespace value is the atomic action that makes the new
|
||
files valid and the old files not.
|
||
|
||
You will probably say "but I didn't want to move my table to a new
|
||
tablespace just to cluster it!" I think we could live with that,
|
||
though. A tablespace doesn't need to have any existence more concrete
|
||
than a subdirectory, in my vision of the way things would work. We
|
||
could do something like making two subdirectories of each place that
|
||
the dbadmin designates as a "tablespace", so that we make two logical
|
||
tablespaces out of what the dbadmin thinks of as one. Then we can
|
||
ping-pong between those directories to do things like clustering "in
|
||
place".
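As an illustration only (not part of the original message), the ping-pong idea can be modelled in a few lines of Python. Everything here is an assumption made for the sketch: the half-directory names "a" and "b", the dict standing in for pg_class, the text-file "table", and the function name `cluster_in_place`. The point it shows is that the sorted copy is written into the other half, and the single catalog update is what makes the new file the valid one.

```python
import os
import tempfile


def cluster_in_place(catalog, tablespace_dir, relation_oid, sort_key):
    """Rewrite a table's rows in sorted order into the other half of its
    tablespace, then switch the catalog entry as the single commit step."""
    old_half = catalog[relation_oid]              # "a" or "b" (hypothetical names)
    new_half = "b" if old_half == "a" else "a"
    old_path = os.path.join(tablespace_dir, old_half, str(relation_oid))
    new_path = os.path.join(tablespace_dir, new_half, str(relation_oid))

    with open(old_path) as src:
        rows = src.read().splitlines()
    with open(new_path, "w") as dst:
        dst.write("\n".join(sorted(rows, key=sort_key)) + "\n")

    catalog[relation_oid] = new_half              # the "commit": new file becomes valid
    os.remove(old_path)                           # the old file is now garbage
    return new_path


# Toy run: one "tablespace" with two halves and one three-row table.
with tempfile.TemporaryDirectory() as ts:
    for half in ("a", "b"):
        os.mkdir(os.path.join(ts, half))
    with open(os.path.join(ts, "a", "42"), "w") as f:
        f.write("pear\napple\nfig\n")
    catalog = {42: "a"}
    path = cluster_in_place(catalog, ts, 42, sort_key=str)
    with open(path) as f:
        print(catalog[42], f.read().split())
```

If the process dies before the catalog update, the entry still points at the untouched old file, which is the property the proposal relies on.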
|
||
|
||
Basically I want to keep the bottom-level mechanisms as simple and
|
||
reliable as we possibly can. The fewer concepts are known down at
|
||
the bottom, the better. If we can keep the pathname constituents
|
||
to just "tablespace" and "relation OID" we'll be in great shape ---
|
||
but each additional concept that has to be known down there is
|
||
another potential problem.
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3471@hub.org Fri Jun 16 03:31:05 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA12816
|
||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 03:31:04 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA14405 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 03:03:38 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5G71YI83633;
|
||
Fri, 16 Jun 2000 03:01:34 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5G713I82023
|
||
for <pgsql-hackers@postgresql.org>; Fri, 16 Jun 2000 03:01:04 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id QAA07731; Fri, 16 Jun 2000 16:00:57 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Fri, 16 Jun 2000 16:03:06 +0900
|
||
Message-ID: <000101bfd760$ebcee3e0$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <5746.961134886@sss.pgh.pa.us>
|
||
Importance: Normal
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> >> Why is a unique ID better than --- or even different from ---
|
||
> >> using the relation's OID? It seems pointless to me...
|
||
>
|
||
> > For example,in the implementation of CLUSTER command,
|
||
> > we would need another new file for the target relation in
|
||
> > order to put sorted rows but don't we want to change the
|
||
> > OID ? It would be needed for table re-construction generally.
|
||
> > If I remember correctly, you once proposed OID+version
|
||
> > naming for the cases.
|
||
>
|
||
> Hmm, so you are thinking that the pg_class row for the table would
|
||
> include this uniqueID,
|
||
|
||
No, I just include the place where the table is stored (the pathname under
the current file_per_table storage manager) in the pg_class row, because
I don't want to rely on the table allocation rule (currently the naming rule)
to access existing relation files. This has always been my main point.
A many_tables_in_a_file storage manager wouldn't be able to live without
keeping this kind of information.
This information (where it is stored) is different from tablespace (where
to store) information. There was an idea to keep the information in an
opaque entry in pg_class which only a specific storage manager
could handle. There was also an idea to have a new system table which
keeps the information, and so on...
|
||
|
||
> and then committing the pg_class update would
|
||
> be the atomic action that replaces the old table contents with the
|
||
> new? It does have some attraction now that I think about it.
|
||
>
|
||
> But there are other ways we could do the same thing. If we want to
|
||
> have tablespaces, there will need to be a tablespace identifier in
|
||
> each pg_class row. So we could do CLUSTER in the same way as we'd
|
||
> move a table from one tablespace to another: create the new files in
|
||
> the new tablespace directory, and the commit of the new pg_class row
|
||
> with the new tablespace value is the atomic action that makes the new
|
||
> files valid and the old files not.
|
||
>
|
||
> You will probably say "but I didn't want to move my table to a new
|
||
> tablespace just to cluster it!"
|
||
|
||
Yes.
|
||
|
||
> I think we could live with that,
|
||
> though. A tablespace doesn't need to have any existence more concrete
|
||
> than a subdirectory, in my vision of the way things would work. We
|
||
> could do something like making two subdirectories of each place that
|
||
> the dbadmin designates as a "tablespace", so that we make two logical
|
||
> tablespaces out of what the dbadmin thinks of as one.
|
||
|
||
Certainly we could design TABLESPACE (where to store) as above.
|
||
|
||
> Then we can
|
||
> ping-pong between those directories to do things like clustering "in
|
||
> place".
|
||
>
|
||
|
||
But maybe we must keep the directory information where the table was
|
||
*ping-ponged* in (e.g.) pg_class. Is such an implementation cleaner or
|
||
more extensible than mine (keeping the stored place exactly)?
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From pgsql-hackers-owner+M3473@hub.org Fri Jun 16 04:01:12 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA13087
|
||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 04:01:11 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA16002 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 03:37:24 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5G7ZZI51521;
|
||
Fri, 16 Jun 2000 03:35:35 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5G7ZEI51350
|
||
for <pgsql-hackers@postgresql.org>; Fri, 16 Jun 2000 03:35:14 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA06103;
|
||
Fri, 16 Jun 2000 03:34:47 -0400 (EDT)
|
||
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
|
||
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <3949BCC4.8424A58F@nimrod.itg.telecom.com.au>
|
||
References: <200006142043.WAA07887@hot.jw.home> <16606.961034835@sss.pgh.pa.us> <3949BCC4.8424A58F@nimrod.itg.telecom.com.au>
|
||
Comments: In-reply-to Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
|
||
message dated "Fri, 16 Jun 2000 15:36:04 +1000"
|
||
Date: Fri, 16 Jun 2000 03:34:47 -0400
|
||
Message-ID: <6100.961140887@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
|
||
> Tom Lane wrote:
|
||
>> I don't see a lot of value in that. Better to do something like
|
||
>> tablespaces:
|
||
>>
|
||
>> <dbroot>/<oidoftablespace>/<oidofobject>
|
||
|
||
> What is the benefit of having oidoftablespace in the directory path?
|
||
> Isn't tablespace an idea so you can store it somewhere completely
|
||
> different?
|
||
> Or is there some symlink idea or something?
|
||
|
||
Exactly --- I'm assuming that the tablespace "directory" is likely
|
||
to be a symlink to some other mounted volume. The point here is
|
||
to keep the low-level file access routines from having to know very
|
||
much about tablespaces or file organization. In the above proposal,
|
||
all they need to know is the relation's OID and the name (or OID)
|
||
of the tablespace the relation's assigned to; then they can form
|
||
a valid path using a hardwired rule. There's still plenty of
|
||
flexibility of organization, but it's not necessary to know that
|
||
where the rubber meets the road (eg, when you're down inside mdblindwrt
|
||
trying to dump a dirty buffer to disk with no spare resources to find
|
||
out anything about the relation the page belongs to...)
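As an illustration only (not part of the original message), the hardwired rule can be sketched in Python; the backend itself is C. The function name `relation_path` and the example root directory and OIDs are hypothetical. Only the `<dbroot>/<oidoftablespace>/<oidofobject>` layout comes from the proposal.

```python
import posixpath


def relation_path(dbroot, tablespace_oid, relation_oid):
    """Form a relation's file path from the two OIDs alone.

    No catalog lookup is needed: the layout is fixed as
    <dbroot>/<oidoftablespace>/<oidofobject>.
    """
    return posixpath.join(dbroot, str(tablespace_oid), str(relation_oid))


# Hypothetical root directory and OIDs, for illustration only.
print(relation_path("/usr/local/pgsql/data", 16384, 18201))
```

Whether the tablespace directory is a real directory or a symlink to another volume makes no difference to this function, which is the separation of concerns being argued for.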
|
||
|
||
regards, tom lane
|
||
|
||
From JanWieck@t-online.de Fri Jun 16 11:01:06 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA28913
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 11:01:05 -0400 (EDT)
|
||
Received: from mailout05.sul.t-online.com (mailout05.sul.t-online.com [194.25.134.82]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id KAA01818 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 10:46:42 -0400 (EDT)
|
||
Received: from fwd06.sul.t-online.de
|
||
by mailout05.sul.t-online.com with smtp
|
||
id 132xN9-0006ze-03; Fri, 16 Jun 2000 16:45:27 +0200
|
||
Received: from hot.jw.home (340000654369-0001@[62.158.179.251]) by fwd06.sul.t-online.de
|
||
with esmtp id 132xMx-0E54HQC; Fri, 16 Jun 2000 16:45:15 +0200
|
||
Received: (from wieck@localhost)
|
||
by hot.jw.home (8.8.5/8.8.5) id OAA15163;
|
||
Fri, 16 Jun 2000 14:42:12 +0200
|
||
From: JanWieck@t-online.de (Jan Wieck)
|
||
Message-Id: <200006161242.OAA15163@hot.jw.home>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <3238.961126521@sss.pgh.pa.us> from Tom Lane at "Jun 15, 2000 11:35:21
|
||
pm"
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Date: Fri, 16 Jun 2000 14:42:12 +0200 (MEST)
|
||
CC: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Reply-To: Jan Wieck <JanWieck@Yahoo.com>
|
||
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain; charset=US-ASCII
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Sender: 340000654369-0001@t-dialin.net
|
||
Status: ROr
|
||
|
||
Tom Lane wrote:
|
||
>
|
||
> It gets a little trickier if you want to be able to split
|
||
> multi-gig tables across several tablespaces, though, since
|
||
> you couldn't just append ".N" to the base table path in that
|
||
> scenario.
|
||
>
|
||
> I'd be interested to know what sort of facilities Oracle
|
||
> provides for managing huge tables...
|
||
|
||
Oracle tablespaces are a collection of 1...n preallocated
|
||
files. Each table then is bound to a tablespace and
|
||
allocates extents (chunks) from those files.
|
||
|
||
There are some per table attributes that control the extent
|
||
sizes with default values coming from the tablespace. The
|
||
initial extent size, the nextextent and the pctincrease.
|
||
There is a hardcoded limit for the number of extents a table
|
||
can have at all. In Oracle7 it was 512 (or somewhat below -
|
||
don't recall exactly). Maybe that's gone with Oracle8, don't
|
||
know.
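As an illustration only (not part of the original message), here is a simplified Python model of how these three attributes interact. It assumes the first extent has the initial size, the second has the nextextent size, and each later extent is pctincrease percent larger than the one before it; rounding to whole blocks is ignored. The function names are hypothetical.

```python
def extent_sizes(initial, next_extent, pctincrease, count):
    """Sizes of the first `count` extents of a table, in any consistent unit."""
    sizes = [initial] if count > 0 else []
    size = next_extent
    while len(sizes) < count:
        sizes.append(size)
        size = size * (100 + pctincrease) / 100   # grow the following extent
    return sizes


def extents_needed(total, initial, next_extent, pctincrease, maxextents):
    """Number of extents needed to hold `total`, or None if the
    maxextents limit would be reached first."""
    allocated = 0
    sizes = extent_sizes(initial, next_extent, pctincrease, maxextents)
    for n, size in enumerate(sizes, 1):
        allocated += size
        if allocated >= total:
            return n
    return None


# A table that needs 2000 units with fixed 10-unit extents and a limit of 121.
print(extents_needed(2000, 10, 10, 0, 121))
```

With a zero pctincrease and small extents, the limit is reached long before the data fits, which is the kind of situation where only a reorganisation helps.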
|
||
|
||
This storage concept has IMHO a couple of advantages over
|
||
ours.
|
||
|
||
The tablespace files are preallocated, so there will
|
||
never be a change in block allocation during runtime and
|
||
that's the base for fdatasync() being sufficient at
syncpoints. All that might be inaccurate after a crash is
|
||
the last modified time in the inode, and that's totally
|
||
irrelevant for Oracle. The fsck will never fail, and
|
||
anything is up to Oracle's recovery.
|
||
|
||
The number of total tablespace files is limited to a
|
||
value that ensures, that the backends can keep them all
|
||
open all the time. It's hard to exceed that limit. A
|
||
typical SAP installation with more than 20,000
|
||
tables/indices doesn't need more than 30 or 40 of them.
|
||
|
||
It is perfectly prepared for raw devices, since a
|
||
tablespace in a raw device installation is simply an area
|
||
of blocks on a disk.
|
||
|
||
There are also disadvantages.
|
||
|
||
You can run out of space even if there are plenty GB's
|
||
free on your disks. You have to create tablespaces
|
||
explicitly.
|
||
|
||
If you've chosen inadequate extent size parameters, you
end up with highly fragmented tables (slowing things down) or get
|
||
stuck with running against maxextents, where only a reorg
|
||
(export/import) helps.
|
||
|
||
|
||
Jan
|
||
|
||
--
|
||
|
||
#======================================================================#
|
||
# It's easier to get forgiveness for being wrong than for being right. #
|
||
# Let's break this rule - forgive me. #
|
||
#================================================== JanWieck@Yahoo.com #
|
||
|
||
|
||
|
||
From tgl@sss.pgh.pa.us Fri Jun 16 11:00:40 2000
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA28898
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 11:00:39 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA07184;
|
||
Fri, 16 Jun 2000 11:00:35 -0400 (EDT)
|
||
To: Jan Wieck <JanWieck@Yahoo.com>
|
||
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006161242.OAA15163@hot.jw.home>
|
||
References: <200006161242.OAA15163@hot.jw.home>
|
||
Comments: In-reply-to JanWieck@t-online.de (Jan Wieck)
|
||
message dated "Fri, 16 Jun 2000 14:42:12 +0200"
|
||
Date: Fri, 16 Jun 2000 11:00:35 -0400
|
||
Message-ID: <7181.961167635@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
JanWieck@t-online.de (Jan Wieck) writes:
|
||
> There are also disadvantages.
|
||
|
||
> You can run out of space even if there are plenty GB's
|
||
> free on your disks. You have to create tablespaces
|
||
> explicitly.
|
||
|
||
Not to mention the reverse: if I read this right, you have to suck
|
||
up your GB's long in advance of actually needing them. That's OK
|
||
for a machine that's dedicated to Oracle ... not so OK for smaller
|
||
installations, playpens, etc.
|
||
|
||
I'm not convinced that there's anything fundamentally wrong with
|
||
doing storage allocation in Unix files the way we have been.
|
||
|
||
(At least not when we're sitting atop a well-done filesystem,
|
||
which may leave the Linux folk out in the cold ;-).)
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Fri Jun 16 12:01:03 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA29853
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 12:01:02 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA08255 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 11:48:10 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA07461;
|
||
Fri, 16 Jun 2000 11:46:41 -0400 (EDT)
|
||
To: Jan Wieck <JanWieck@Yahoo.com>
|
||
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006161242.OAA15163@hot.jw.home>
|
||
References: <200006161242.OAA15163@hot.jw.home>
|
||
Comments: In-reply-to JanWieck@t-online.de (Jan Wieck)
|
||
message dated "Fri, 16 Jun 2000 14:42:12 +0200"
|
||
Date: Fri, 16 Jun 2000 11:46:41 -0400
|
||
Message-ID: <7458.961170401@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
JanWieck@t-online.de (Jan Wieck) writes:
|
||
> Tom Lane wrote:
|
||
>> It gets a little trickier if you want to be able to split
|
||
>> multi-gig tables across several tablespaces, though, since
|
||
>> you couldn't just append ".N" to the base table path in that
|
||
>> scenario.
|
||
>>
|
||
>> I'd be interested to know what sort of facilities Oracle
|
||
>> provides for managing huge tables...
|
||
|
||
> Oracle tablespaces are a collection of 1...n preallocated
|
||
> files. Each table then is bound to a tablespace and
|
||
> allocates extents (chunks) from those files.
|
||
|
||
OK, to get back to the point here: so in Oracle, tables can't cross
|
||
tablespace boundaries, but a tablespace itself could span multiple
|
||
disks?
|
||
|
||
Not sure if I like that better or worse than equating a tablespace
|
||
with a directory (so, presumably, all the files within it live on
|
||
one filesystem) and then trying to make tables able to span
|
||
tablespaces. We will need to do one or the other though, if we want
|
||
to have any significant improvement over the current state of affairs
|
||
for large tables.
|
||
|
||
One way is to play the flip-the-path-ordering game some more,
|
||
and access multiple-segment tables with pathnames like this:
|
||
|
||
.../TABLESPACE/RELATION -- first or only segment
|
||
.../TABLESPACE/N/RELATION -- N'th extension segment
|
||
|
||
This isn't any harder for md.c to deal with than what we do now,
|
||
but by making the /N subdirectories be symlinks, the dbadmin could
|
||
easily arrange for extension segments to go on different filesystems.
|
||
Also, since /N subdirectory symlinks can be added as needed,
|
||
expanding available space by attaching more disks isn't hard.
|
||
(If the admin hasn't pre-made a /N symlink when it's needed,
|
||
I'd envision the backend just automatically creating a plain
|
||
subdirectory so that it can extend the table.)
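As an illustration only (not part of the original message), the fallback described in the parenthesis could look like this in Python. The function name `ensure_segment_dir` is hypothetical, and the temporary directories stand in for a tablespace directory and a second disk.

```python
import os
import tempfile


def ensure_segment_dir(tablespace_dir, segment_number):
    """Return the directory that holds extension segment N.

    If the admin has pre-made <tablespace>/<N> (typically as a symlink to
    another filesystem), it is used as-is; otherwise a plain subdirectory
    is created so the table can still be extended.
    """
    path = os.path.join(tablespace_dir, str(segment_number))
    if not os.path.isdir(path):       # isdir() follows symlinks
        os.mkdir(path)
    return path


with tempfile.TemporaryDirectory() as ts, tempfile.TemporaryDirectory() as other_disk:
    os.symlink(other_disk, os.path.join(ts, "1"))      # admin pre-made /1
    print(os.path.islink(ensure_segment_dir(ts, 1)))   # pre-made symlink is reused
    print(os.path.islink(ensure_segment_dir(ts, 2)))   # plain directory is created
```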
|
||
|
||
A limitation is that the N'th extension segments of all the relations
|
||
in a given tablespace have to be in the same place, but I don't see
|
||
that as a major objection. Worst case is you make a separate tablespace
|
||
for each of your multi-gig relations ... you're probably not going to
|
||
have a very large number of such relations, so this doesn't seem like
|
||
unmanageable admin complexity.
|
||
|
||
We'd still want to create some tools to help the dbadmin with slinging
|
||
all these symlinks around, of course. But I think it's critical to keep
|
||
the low-level file access protocol simple and reliable, which really
|
||
means minimizing the amount of information the backend needs to know to
|
||
figure out which file to write a page in. With something like the above
|
||
you only need to know the tablespace name (or more likely OID), the
|
||
relation OID (+name or not, depending on outcome of other argument),
|
||
and the offset in the table. No worse than now from the software's
|
||
point of view.
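As an illustration only (not part of the original message), the lookup described here can be sketched in Python. The page size and segment size below are assumed values chosen for the example, and the function name `segment_path` is hypothetical. Only the two path shapes come from the proposal.

```python
import posixpath

BLOCK_SIZE = 8192              # assumed page size in bytes
BLOCKS_PER_SEGMENT = 131072    # assumed: 1 GB segments of 8 KB pages


def segment_path(root, tablespace, relation_oid, block_number):
    """Return (file path, block offset within that file) for a table block.

    Segment 0 lives at    <root>/<tablespace>/<relation>
    Segment N > 0 lives at <root>/<tablespace>/<N>/<relation>
    """
    segment, block_in_segment = divmod(block_number, BLOCKS_PER_SEGMENT)
    base = posixpath.join(root, str(tablespace))
    if segment > 0:
        base = posixpath.join(base, str(segment))
    return posixpath.join(base, str(relation_oid)), block_in_segment


# Hypothetical tablespace and relation OIDs.
path, block = segment_path("/data", 16384, 18201, 300000)
print(path, block * BLOCK_SIZE)
```

The inputs are exactly the three items listed: the tablespace, the relation OID, and the offset in the table.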
|
||
|
||
Comments?
|
||
|
||
regards, tom lane
|
||
|
||
From lockhart@alumni.caltech.edu Fri Jun 16 12:31:50 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA00649
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 12:31:49 -0400 (EDT)
|
||
Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA13118 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 12:31:52 -0400 (EDT)
|
||
Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203])
|
||
by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id JAA15007;
|
||
Fri, 16 Jun 2000 09:27:18 -0700 (PDT)
|
||
Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1])
|
||
by golem.jpl.nasa.gov (Postfix) with ESMTP
|
||
id DD8426F51; Fri, 16 Jun 2000 16:27:22 +0000 (UTC)
|
||
Sender: lockhart@mythos.jpl.nasa.gov
|
||
Message-ID: <394A556A.4EAC8B9A@alumni.caltech.edu>
|
||
Date: Fri, 16 Jun 2000 16:27:22 +0000
|
||
From: Thomas Lockhart <lockhart@alumni.caltech.edu>
|
||
Organization: Yes
|
||
X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686)
|
||
X-Accept-Language: en
|
||
MIME-Version: 1.0
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Cc: Jan Wieck <JanWieck@Yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
Status: RO
|
||
|
||
> ... But I think it's critical to keep
|
||
> the low-level file access protocol simple and reliable, which really
|
||
> means minimizing the amount of information the backend needs to know
|
||
> to figure out which file to write a page in. With something like the
|
||
> above you only need to know the tablespace name (or more likely OID),
|
||
> the relation OID (+name or not, depending on outcome of other
|
||
> argument), and the offset in the table. No worse than now from the
|
||
> software's point of view.
|
||
> Comments?
|
||
|
||
I'm probably missing the context a bit, but imho we should try hard to
|
||
stay away from symlinks as the general solution for anything.
|
||
|
||
Sorry for being behind here, but to make sure I'm on the right page:
|
||
o tablespaces decouple storage from logical tables
|
||
o a database lives in a default tablespace, unless specified
|
||
o by default, a table will live in the default tablespace
|
||
o (eventually) a table can be split across tablespaces
|
||
|
||
Some thoughts:
|
||
o the ability to split single tables across disks was essential for
|
||
scalability when disks were small. But with RAID, NAS, etc etc isn't
|
||
that a smaller issue now?
|
||
o "tablespaces" would implement our less-developed "with location"
|
||
feature, right? Splitting databases, whole indices and whole tables
|
||
across storage is the biggest win for this work since more users will
|
||
use the feature.
|
||
o location information needs to travel with individual tables anyway.
|
||
|
||
From scrappy@hub.org Fri Jun 16 13:01:02 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01191;
|
||
Fri, 16 Jun 2000 13:01:01 -0400 (EDT)
|
||
Received: from thelab.hub.org (nat193.152.mpoweredpc.net [142.177.193.152]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA15282; Fri, 16 Jun 2000 12:53:23 -0400 (EDT)
|
||
Received: from localhost (scrappy@localhost)
|
||
by thelab.hub.org (8.9.3/8.9.3) with ESMTP id NAA28326;
|
||
Fri, 16 Jun 2000 13:50:37 -0300 (ADT)
|
||
(envelope-from scrappy@hub.org)
|
||
X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
|
||
Date: Fri, 16 Jun 2000 13:50:37 -0300 (ADT)
|
||
From: The Hermit Hacker <scrappy@hub.org>
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Tom Lane <tgl@sss.pgh.pa.us>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <200006160224.WAA04345@candle.pha.pa.us>
|
||
Message-ID: <Pine.BSF.4.21.0006161349140.722-100000@thelab.hub.org>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
||
Status: RO
|
||
|
||
On Thu, 15 Jun 2000, Bruce Momjian wrote:
|
||
|
||
> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> > > Now I like neither relname nor oid because it's not sufficient
|
||
> > > for my purpose.
|
||
> >
|
||
> > We should probably not do much of anything with this issue until
|
||
> > we have a clearer understanding of what we want to do about
|
||
> > tablespaces and schemas.
|
||
>
|
||
> Here is an analysis of our options:
|
||
>
|
||
>                            Work required         Disadvantages
> ----------------------------------------------------------------------------
>
> Keep current system        no work               rename/create no rollback
>
> relname/oid but            less work             new pg_class column,
> no rename change                                 filename not accurate on
>                                                  rename
>
> relname/oid with           more work             complex code
> rename change during
> vacuum
>
> oid filename               less work, but        confusing to admins
>                            need admin tools
|
||
|
||
My vote is with Tom on this one ... oid only ... the admin should be able
|
||
to do a quick SELECT on a table to find out the OID->table mapping, and I
|
||
believe its already been pointed out that you cant' just restore one file
|
||
anyway, so it kinda negates the "server isn't running problem" ...
|
||
|
||
|
||
|
||
|
||
From tgl@sss.pgh.pa.us Fri Jun 16 13:01:01 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01188
    for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 13:01:01 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA15530 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 12:55:38 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA07750;
    Fri, 16 Jun 2000 12:54:00 -0400 (EDT)
To: Thomas Lockhart <lockhart@alumni.caltech.edu>
cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <394A556A.4EAC8B9A@alumni.caltech.edu>
References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> <394A556A.4EAC8B9A@alumni.caltech.edu>
Comments: In-reply-to Thomas Lockhart <lockhart@alumni.caltech.edu>
    message dated "Fri, 16 Jun 2000 16:27:22 -0000"
Date: Fri, 16 Jun 2000 12:54:00 -0400
Message-ID: <7747.961174440@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> ... But I think it's critical to keep
>> the low-level file access protocol simple and reliable, which really
>> means minimizing the amount of information the backend needs to know
>> to figure out which file to write a page in. With something like the
>> above you only need to know the tablespace name (or more likely OID),
>> the relation OID (+name or not, depending on outcome of other
>> argument), and the offset in the table. No worse than now from the
>> software's point of view.
>> Comments?

> I'm probably missing the context a bit, but imho we should try hard to
> stay away from symlinks as the general solution for anything.

Why?

regards, tom lane

From dhogaza@pacifier.com Fri Jun 16 14:55:00 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02086
    for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 14:54:59 -0400 (EDT)
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id OAA26430 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 14:40:00 -0400 (EDT)
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
    by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id LAA08661;
    Fri, 16 Jun 2000 11:38:36 -0700 (PDT)
Message-Id: <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com>
X-Sender: dhogaza@mail.pacifier.com
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
Date: Fri, 16 Jun 2000 10:50:23 -0700
To: Tom Lane <tgl@sss.pgh.pa.us>, Jan Wieck <JanWieck@yahoo.com>
From: Don Baccus <dhogaza@pacifier.com>
Subject: Re: [HACKERS] Big 7.1 open items
Cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
In-Reply-To: <7458.961170401@sss.pgh.pa.us>
References: <200006161242.OAA15163@hot.jw.home>
    <200006161242.OAA15163@hot.jw.home>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Status: RO

At 11:46 AM 6/16/00 -0400, Tom Lane wrote:

>OK, to get back to the point here: so in Oracle, tables can't cross
>tablespace boundaries,

Right, the construct AFAIK is "create table/index foo on tablespace ..."

> but a tablespace itself could span multiple
>disks?

Right.

>Not sure if I like that better or worse than equating a tablespace
>with a directory (so, presumably, all the files within it live on
>one filesystem) and then trying to make tables able to span
>tablespaces. We will need to do one or the other though, if we want
>to have any significant improvement over the current state of affairs
>for large tables.

Oracle's way does a reasonable job of isolating the datamodel
from the details of the physical layout.

Take the OpenACS web toolkit, for instance. We could take
each module's tables and indices and assign them appropriately
to various dataspaces, then provide a separate .sql file with
only "create tablespace" statements in there.

By modifying that one central file, the toolkit installation
could be customized to run anything from a small site (one
disk with everything on it, a la my own personal webserver at
birdnotes.net) to a very large site with many spindles, with
various index and table structures spread out widely hither
and thither.

Given that the OpenACS datamodel is nearly 10K lines long (including
many comments, of course), being able to customize an installation
to such a degree by modifying a single file filled with "create
tablespace" statements would be very attractive.

>One way is to play the flip-the-path-ordering game some more,
>and access multiple-segment tables with pathnames like this:
>
>    .../TABLESPACE/RELATION      -- first or only segment
>    .../TABLESPACE/N/RELATION    -- N'th extension segment
>
>This isn't any harder for md.c to deal with than what we do now,
>but by making the /N subdirectories be symlinks, the dbadmin could
>easily arrange for extension segments to go on different filesystems.

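[Sketch] A minimal C illustration of the pathname scheme quoted above. It
assumes segment 0 lives directly in the tablespace directory and extension
segment N lives in a subdirectory named N (which the dbadmin may have made a
symlink). The function name segment_path and its parameters are hypothetical,
chosen for illustration only; this is not code from md.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path of one segment of a relation:
 *   BASE/TABLESPACE/RELATION      -- segment 0 (first or only segment)
 *   BASE/TABLESPACE/N/RELATION    -- N'th extension segment, N >= 1
 * Returns the path length, or -1 if buf is too small.
 */
int
segment_path(char *buf, size_t buflen, const char *base,
             const char *tablespace, const char *relation, unsigned segno)
{
    int n;

    assert(buf != NULL && base != NULL);
    assert(tablespace != NULL && relation != NULL);

    if (segno == 0)
        n = snprintf(buf, buflen, "%s/%s/%s", base, tablespace, relation);
    else
        n = snprintf(buf, buflen, "%s/%s/%u/%s",
                     base, tablespace, segno, relation);

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```
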
I personally dislike depending on symlinks to move stuff around.
Among other things, a pg_dump/restore (and presumably future
backup tools?) can't recreate the disk layout automatically.

>We'd still want to create some tools to help the dbadmin with slinging
>all these symlinks around, of course.

OK, if symlinks are simply an implementation detail hidden from the
dbadmin, and if the physical structure is kept in the db so it can
be rebuilt if necessary automatically, then I don't mind symlinks.

> But I think it's critical to keep
>the low-level file access protocol simple and reliable, which really
>means minimizing the amount of information the backend needs to know to
>figure out which file to write a page in. With something like the above
>you only need to know the tablespace name (or more likely OID), the
>relation OID (+name or not, depending on outcome of other argument),
>and the offset in the table. No worse than now from the software's
>point of view.

Make the code that creates and otherwise manipulates tablespaces
do the work, while keeping the low-level file access protocol simple.

Yes, this approach sounds very good to me.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

From pgsql-hackers-owner+M3500@hub.org Fri Jun 16 14:55:10 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02107
    for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 14:55:09 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id OAA26943 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 14:44:12 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5GIelM05972;
    Fri, 16 Jun 2000 14:40:47 -0400 (EDT)
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5GIe5M05692
    for <pgsql-hackers@postgresql.org>; Fri, 16 Jun 2000 14:40:05 -0400 (EDT)
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
    by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id LAA08667;
    Fri, 16 Jun 2000 11:38:41 -0700 (PDT)
Message-Id: <3.0.1.32.20000616111435.01a17a10@mail.pacifier.com>
X-Sender: dhogaza@mail.pacifier.com
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
Date: Fri, 16 Jun 2000 11:14:35 -0700
To: Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Tom Lane <tgl@sss.pgh.pa.us>
From: Don Baccus <dhogaza@pacifier.com>
Subject: Re: [HACKERS] Big 7.1 open items
Cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
In-Reply-To: <394A556A.4EAC8B9A@alumni.caltech.edu>
References: <200006161242.OAA15163@hot.jw.home>
    <7458.961170401@sss.pgh.pa.us>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

At 04:27 PM 6/16/00 +0000, Thomas Lockhart wrote:

>Sorry for being behind here, but to make sure I'm on the right page:
>o tablespaces decouple storage from logical tables
>o a database lives in a default tablespace, unless specified
>o by default, a table will live in the default tablespace
>o (eventually) a table can be split across tablespaces

Or tablespaces across filesystems/mountpoints, whatever.

>Some thoughts:
>o the ability to split single tables across disks was essential for
>scalability when disks were small. But with RAID, NAS, etc etc isn't
>that a smaller issue now?

Yes for size issues, I should think, especially if you have the
money for a large RAID subsystem. But for throughput performance,
control over which spindles particularly busy tables and indices
go on would still seem to be pretty relevant, when they're being
updated a lot, in order to minimize seek times.

I really can't say how important this is in reality. Oracle-world
folks still talk about this kind of optimization being important,
but I'm not personally running any kind of database-backed website
that's busy enough or contains enough storage to worry about it.

>o "tablespaces" would implement our less-developed "with location"
>feature, right? Splitting databases, whole indices and whole tables
>across storage is the biggest win for this work since more users will
>use the feature.
>o location information needs to travel with individual tables anyway.

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

From tgl@sss.pgh.pa.us Fri Jun 16 15:00:55 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA02397
    for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 15:00:54 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id PAA08247;
    Fri, 16 Jun 2000 15:00:11 -0400 (EDT)
To: Don Baccus <dhogaza@pacifier.com>
cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com>
References: <200006161242.OAA15163@hot.jw.home> <200006161242.OAA15163@hot.jw.home> <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com>
Comments: In-reply-to Don Baccus <dhogaza@pacifier.com>
    message dated "Fri, 16 Jun 2000 10:50:23 -0700"
Date: Fri, 16 Jun 2000 15:00:10 -0400
Message-ID: <8244.961182010@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Don Baccus <dhogaza@pacifier.com> writes:
>> This isn't any harder for md.c to deal with than what we do now,
>> but by making the /N subdirectories be symlinks, the dbadmin could
>> easily arrange for extension segments to go on different filesystems.

> I personally dislike depending on symlinks to move stuff around.
> Among other things, a pg_dump/restore (and presumably future
> backup tools?) can't recreate the disk layout automatically.

Good point, we'd need some way of saving/restoring the tablespace
structures.

>> We'd still want to create some tools to help the dbadmin with slinging
>> all these symlinks around, of course.

> OK, if symlinks are simply an implementation detail hidden from the
> dbadmin, and if the physical structure is kept in the db so it can
> be rebuilt if necessary automatically, then I don't mind symlinks.

I'm not sure about keeping it in the db --- creates a bit of a
chicken-and-egg problem doesn't it? Maybe there needs to be a
"system database" that has nailed-down pathnames (no tablespaces
for you baby) and contains the critical installation-wide tables
like pg_database, pg_user, pg_tablespace. A restore would have
to restore these tables first anyway.

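[Sketch] One way to picture the "system database" idea, as a hedged C
example: a short list of installation-wide tables resolves to a fixed
directory without consulting any tablespace, and every other relation uses a
directory looked up from its tablespace. The "system" subdirectory, the
function names, and the convention that the caller passes in the tablespace
directory are all assumptions made for this example, not existing backend
code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The installation-wide tables named in the message above. */
static const char *const nailed_tables[] = {
    "pg_database", "pg_user", "pg_tablespace"
};

/* Return 1 if relname is one of the nailed-down tables, else 0. */
int
is_nailed_table(const char *relname)
{
    size_t i;

    for (i = 0; i < sizeof nailed_tables / sizeof nailed_tables[0]; i++)
        if (strcmp(relname, nailed_tables[i]) == 0)
            return 1;
    return 0;
}

/*
 * Resolve a relation to a file path.  Nailed tables ignore ts_dir and always
 * land in a fixed directory under data_dir, so they can be opened before
 * pg_tablespace itself has been read.  Everything else goes in the directory
 * the caller looked up for the relation's tablespace.
 * Returns the path length, or -1 if buf is too small.
 */
int
resolve_path(char *buf, size_t buflen, const char *data_dir,
             const char *ts_dir, const char *relname)
{
    int n;

    assert(buf != NULL && data_dir != NULL && relname != NULL);

    if (is_nailed_table(relname)) {
        n = snprintf(buf, buflen, "%s/system/%s", data_dir, relname);
    } else {
        assert(ts_dir != NULL);
        n = snprintf(buf, buflen, "%s/%s", ts_dir, relname);
    }
    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```
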
> Make the code that creates and otherwise manipulates tablespaces
> do the work, while keeping the low-level file access protocol simple.

Right, that's the bottom line for me.

regards, tom lane

From reedstrm@rice.edu Fri Jun 16 16:51:50 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA03689
    for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 16:51:49 -0400 (EDT)
Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id PAA03409 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 15:48:40 -0400 (EDT)
Received: by rice.edu
    via sendmail from stdin
    id <m1331to-000LEJC@wallace.ece.rice.edu> (Debian Smail3.2.0.102)
    for maillist@candle.pha.pa.us; Fri, 16 Jun 2000 14:35:28 -0500 (CDT)
Date: Fri, 16 Jun 2000 14:35:28 -0500
From: "Ross J. Reedstrom" <reedstrm@rice.edu>
To: Thomas Lockhart <lockhart@alumni.caltech.edu>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
Message-ID: <20000616143528.A28920@rice.edu>
Mail-Followup-To: Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Tom Lane <tgl@sss.pgh.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> <394A556A.4EAC8B9A@alumni.caltech.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.0i
In-Reply-To: <394A556A.4EAC8B9A@alumni.caltech.edu>; from lockhart@alumni.caltech.edu on Fri, Jun 16, 2000 at 04:27:22PM +0000
Status: RO

On Fri, Jun 16, 2000 at 04:27:22PM +0000, Thomas Lockhart wrote:
> > ... But I think it's critical to keep
> > the low-level file access protocol simple and reliable, which really
> > means minimizing the amount of information the backend needs to know
> > to figure out which file to write a page in. With something like the
> > above you only need to know the tablespace name (or more likely OID),
> > the relation OID (+name or not, depending on outcome of other
> > argument), and the offset in the table. No worse than now from the
> > software's point of view.
> > Comments?

I think the backend needs a per table token that indicates how
to get at the physical bits of the file. Whether that's a filename
alone, filename with path, oid, key to a smgr hash table or something
else, it's opaque above the smgr routines.

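[Sketch] The opaque-token idea can be illustrated in C roughly as follows.
The assumption is that the token is a small struct minted by the storage
manager and handed back unchanged by callers, which never look inside it.
SmgrToken, smgr_make_token and smgr_token_path are hypothetical names, not
existing smgr entry points.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Opaque per-table token.  Code above the storage manager stores and copies
 * it, but only the smgr_* functions interpret the contents.  In this sketch
 * it happens to hold a relative path; it could equally hold an OID or a
 * hash-table key without any caller changing.
 */
typedef struct SmgrToken {
    char opaque[64];
} SmgrToken;

/* Storage-manager side: mint a token for a table (here: tablespace/oid). */
int
smgr_make_token(SmgrToken *tok, const char *tablespace, unsigned long oid)
{
    int n;

    assert(tok != NULL && tablespace != NULL);
    n = snprintf(tok->opaque, sizeof tok->opaque, "%s/%lu", tablespace, oid);
    return (n < 0 || (size_t) n >= sizeof tok->opaque) ? -1 : 0;
}

/* Storage-manager side: turn a token back into a file path under base. */
int
smgr_token_path(const SmgrToken *tok, const char *base,
                char *buf, size_t buflen)
{
    int n;

    assert(tok != NULL && base != NULL && buf != NULL);
    n = snprintf(buf, buflen, "%s/%s", base, tok->opaque);
    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```
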
Hmm, now I'm thinking, since the tablespace discussion has been reopened,
the way to go about coding all this is to reactivate the smgr code: how
about I leave the existing md smgr as is, and clone it, call it md2 or
something, and start messing with adding features there?

>
> I'm probably missing the context a bit, but imho we should try hard to
> stay away from symlinks as the general solution for anything.
>
> Sorry for being behind here, but to make sure I'm on the right page:
> o tablespaces decouple storage from logical tables
> o a database lives in a default tablespace, unless specified
> o by default, a table will live in the default tablespace
> o (eventually) a table can be split across tablespaces
>
> Some thoughts:
> o the ability to split single tables across disks was essential for
> scalability when disks were small. But with RAID, NAS, etc etc isn't
> that a smaller issue now?
> o "tablespaces" would implement our less-developed "with location"
> feature, right? Splitting databases, whole indices and whole tables
> across storage is the biggest win for this work since more users will
> use the feature.
> o location information needs to travel with individual tables anyway.

I was just thinking that that discussion needed some summation.

Some links to historic discussion:

This one is Vadim saying WAL will need oid names:
http://www.postgresql.org/mhonarc/pgsql-hackers/1999-11/msg00809.html

A longer discussion kicked off by Don Baccus:
http://www.postgresql.org/mhonarc/pgsql-hackers/2000-01/msg00510.html

Tom suggesting OIDs to allow rollback:
http://www.postgresql.org/mhonarc/pgsql-hackers/2000-03/msg00119.html

Martin Neumann posted a question on dataspaces:

(can't find it in the official archives: looks like March 2000, 10-29 is
missing. Here's my copy: don't beat on it! In particular, since I threw
it together for local access, it's one _big_ index page)

http://cooker.ir.rice.edu/postgresql/msg20257.html
(in that thread is a post where I mention blindwrites and getting rid
of GetRawDatabaseInfo)

Martin later posted an RFD on tablespaces:

http://cooker.ir.rice.edu/postgresql/msg20490.html

Here's Horák Daniel with a patch for discussion, implementing dataspaces
on a per database level:

http://cooker.ir.rice.edu/postgresql/msg20498.html

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

From dhogaza@pacifier.com Fri Jun 16 16:51:51 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA03692
    for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 16:51:50 -0400 (EDT)
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id PAA02911 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 15:43:13 -0400 (EDT)
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
    by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id MAA11003;
    Fri, 16 Jun 2000 12:41:50 -0700 (PDT)
Message-Id: <3.0.1.32.20000616123736.01a19910@mail.pacifier.com>
X-Sender: dhogaza@mail.pacifier.com
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
Date: Fri, 16 Jun 2000 12:37:36 -0700
To: Tom Lane <tgl@sss.pgh.pa.us>
From: Don Baccus <dhogaza@pacifier.com>
Subject: Re: [HACKERS] Big 7.1 open items
Cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
In-Reply-To: <8244.961182010@sss.pgh.pa.us>
References: <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com>
    <200006161242.OAA15163@hot.jw.home>
    <200006161242.OAA15163@hot.jw.home>
    <3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Status: RO

At 03:00 PM 6/16/00 -0400, Tom Lane wrote:

>> OK, if symlinks are simply an implementation detail hidden from the
>> dbadmin, and if the physical structure is kept in the db so it can
>> be rebuilt if necessary automatically, then I don't mind symlinks.
>
>I'm not sure about keeping it in the db --- creates a bit of a
>chicken-and-egg problem doesn't it?

Not if the tablespace creates precede the tables stored in them.

> Maybe there needs to be a
>"system database" that has nailed-down pathnames (no tablespaces
>for you baby) and contains the critical installation-wide tables
>like pg_database, pg_user, pg_tablespace. A restore would have
>to restore these tables first anyway.

Oh, I see. Yes, when I've looked into this and have thought about
it I've assumed that there would always be a known starting point
which would contain the installation-wide tables.

>From a practical point of view, I don't think that's really a
problem.

I've not looked into how Oracle does this; I assume it builds
a system tablespace on one of the initial mount points you give
it when you install the thing. The paths to the mount points
are stored in specific files known to Oracle, I think. It's
been over a year (not long enough!) since I've set up Oracle...

- Don Baccus, Portland OR <dhogaza@pacifier.com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

From pgsql-hackers-owner+M3512@hub.org Fri Jun 16 17:31:04 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04168
    for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:31:03 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id RAA12122 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 17:09:28 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5GL7WM02231;
    Fri, 16 Jun 2000 17:07:32 -0400 (EDT)
Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5GL7EM02150
    for <pgsql-hackers@postgresql.org>; Fri, 16 Jun 2000 17:07:14 -0400 (EDT)
Received: by rice.edu
    via sendmail from stdin
    id <m1333Kb-000LEJC@wallace.ece.rice.edu> (Debian Smail3.2.0.102)
    for pgsql-hackers@postgresql.org; Fri, 16 Jun 2000 16:07:13 -0500 (CDT)
Date: Fri, 16 Jun 2000 16:07:13 -0500
From: "Ross J. Reedstrom" <reedstrm@rice.edu>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Big 7.1 open items
Message-ID: <20000616160713.A30793@rice.edu>
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
    pgsql-hackers@postgresql.org
References: <16985.961038832@sss.pgh.pa.us> <200006150321.XAA09510@candle.pha.pa.us> <20000615010312.A995@rice.edu> <18798.961053112@sss.pgh.pa.us> <20000615114519.B3939@rice.edu> <2260.961113232@sss.pgh.pa.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.0i
In-Reply-To: <2260.961113232@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Thu, Jun 15, 2000 at 07:53:52PM -0400
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

On Thu, Jun 15, 2000 at 07:53:52PM -0400, Tom Lane wrote:
> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> > On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
> >> "Ross J. Reedstrom" <reedstrm@rice.edu> writes:
> >>>> Any strong objections to the mixed relname_oid solution?
> >>
> >> Yes!
>
> > The plan here was to let VACUUM handle renaming the file, since it
> > will already have all the necessary locks. This shortens the window
> > of confusion. ALTER TABLE RENAME doesn't happen that often, really -
> > the relname is there just for human consumption, then.
>
> Yeah, I've seen tons of discussion of how if we do this, that, and
> the other thing, and be prepared to fix up some other things in case
> of crash recovery, we can make it work with filename == relname + OID
> (where relname tracks logical name, at least at some remove).
>
> Probably. Assuming nobody forgets anything.

I agree, it seems a major undertaking, at first glance. And second. Even
third. Especially for someone who hasn't 'earned his spurs' yet, as
it were.

> I'm just trying to point out that that's a huge amount of pretty
> delicate mechanism. The amount of work required to make it trustworthy
> looks to me to dwarf the admin tools that Bruce is complaining about.
> And we only have a few people competent to do the work. (With all
> due respect, Ross, if you weren't already aware of the implications
> for mdblindwrt, I have to wonder what else you missed.)

Ah, you knew that comment would come back to haunt me (I have a
tendency to think out loud, even if checking and coming back later
would be better ;-) In fact, there's no problem, and never was, since the
buffer->blind.relname is filled in via RelationGetPhysicalRelationName,
just like every other path that requires direct file access. I just
didn't remember that I had in fact checked it (it's been a couple months,
and I just got back from vacation ;-)

Actually, once I re-checked it, the code looked very familiar. I had
spent time looking at the blind write code in the context of getting
rid of the only non-startup use of GetRawDatabaseInfo.

As to missing things: I'm leaning heavily on Bruce's previous
work for temp tables, to separate the two uses of relname, via
RelationGetRelationName and RelationGetPhysicalRelationName. There are
102 uses of the first in the current code (many in elog messages), and
only 11 of the second. If I'd had to do the original work of finding
every use of relname, and categorizing it, I agree I'm not (yet) up to
it, but I have more confidence in Bruce's (already tested) work.

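[Sketch] A small C illustration of the two uses of relname, under the
filename == OID variant argued for in this thread. The struct and functions
(RelSketch, rel_logical_name, rel_physical_name, rel_rename) are hypothetical
stand-ins, not the real Relation struct or the RelationGet... accessors; the
point is only that the name used for file access never depends on the
logical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NAME_LEN 32   /* arbitrary limit for this sketch */

/* Minimal stand-in for a relation descriptor. */
typedef struct RelSketch {
    unsigned long oid;                    /* fixed for the relation's life */
    char logical_name[SKETCH_NAME_LEN];   /* what users see; may be renamed */
} RelSketch;

/* Name for messages and catalog lookups: the logical name. */
const char *
rel_logical_name(const RelSketch *rel)
{
    assert(rel != NULL);
    return rel->logical_name;
}

/*
 * Name used for file access: derived from the OID only.
 * Returns the length written, or -1 if buf is too small.
 */
int
rel_physical_name(const RelSketch *rel, char *buf, size_t buflen)
{
    int n;

    assert(rel != NULL && buf != NULL);
    n = snprintf(buf, buflen, "%lu", rel->oid);
    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}

/* Rename touches only the logical name; no file has to be renamed. */
int
rel_rename(RelSketch *rel, const char *newname)
{
    assert(rel != NULL && newname != NULL);
    if (strlen(newname) >= sizeof rel->logical_name)
        return -1;
    strcpy(rel->logical_name, newname);
    return 0;
}
```
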
>
> Filename == OID is so simple, reliable, and straightforward by
> comparison that I think the decision is a no-brainer.
>

Perhaps. Changing the label of the file on disk still requires finding
all the code that assumes it knows what that name is, and changing it.
Same work.

> If we could afford to sink unlimited time into this one issue then
> it might make sense to do it the hard way, but we have enough
> important stuff on our TODO list to keep us all busy for years ---
> I cannot believe that it's an effective use of our time to do this.
>

The joys of Open Development. You've spent a fair amount of time trying
to convince _me_ not to waste my time. Thanks, but I'm pretty bull-headed
sometimes. Since I've already done some of the work, take a look
at what I've got, and then tell me I'm wasting my time, o.k.?

>
> > Hmm, what's all this with functions in catalog.c that are only called by
> > smgr/md.c? seems to me that anything having to do with physical storage
> > (like the path!) belongs in the smgr abstraction.
>
> Yeah, there's a bunch of stuff that should have been implemented by
> adding new smgr entry points, but wasn't. It should be pushed down.
> (I can't resist pointing out that one of those things is physical
> relation rename, which will go away and not *need* to be pushed down
> if we do it the way I want.)
>

Oh, I agree completely. In fact, as I said to Hiroshi last time this came
up, I think of the field in pg_class as an opaque token, to be filled in
by the smgr, and only used by code further up to hand back to the smgr
routines. Same should be true of the buffer->blind struct.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

From Inoue@tpf.co.jp Fri Jun 16 19:31:00 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05334
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 19:30:59 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA19834 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 19:09:59 -0400 (EDT)
|
||
Received: from mcadnote1 (ppm122.noc.fukui.nsk.ne.jp [210.161.188.41])
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id IAA08210; Sat, 17 Jun 2000 08:08:15 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Jan Wieck" <JanWieck@Yahoo.com>
|
||
Cc: "Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Sat, 17 Jun 2000 08:11:08 +0900
|
||
Message-ID: <EKEJJICOHDIEMGPNIFIJAEADCCAA.Inoue@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-2022-jp"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
|
||
In-Reply-To: <7181.961167635@sss.pgh.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
|
||
Importance: Normal
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> JanWieck@t-online.de (Jan Wieck) writes:
|
||
> > There are also disadvantages.
|
||
>
|
||
> > You can run out of space even if there are plenty GB's
|
||
> > free on your disks. You have to create tablespaces
|
||
> > explicitly.
|
||
>
|
||
> Not to mention the reverse: if I read this right, you have to suck
|
||
> up your GB's long in advance of actually needing them. That's OK
|
||
> for a machine that's dedicated to Oracle ... not so OK for smaller
|
||
> installations, playpens, etc.
|
||
>
|
||
|
||
I've been anxious about an approach like Oracle's preallocation.
|
||
It had not been easy for me to estimate the extent size in
|
||
Oracle. Maybe it would lose the simplicity of environment
|
||
settings, which is one of the biggest advantages of PostgreSQL.
|
||
It seems that we should also provide not_preallocated DATAFILE
|
||
when many_tables_in_a_file storage manager is introduced.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
|
||
|
||
From tgl@sss.pgh.pa.us Fri Jun 16 19:31:01 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05337
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 19:31:00 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA20335 for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 19:18:26 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA09274;
|
||
Fri, 16 Jun 2000 19:16:37 -0400 (EDT)
|
||
To: "Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
cc: Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Jan Wieck <JanWieck@Yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <20000616143528.A28920@rice.edu>
|
||
References: <200006161242.OAA15163@hot.jw.home> <7458.961170401@sss.pgh.pa.us> <394A556A.4EAC8B9A@alumni.caltech.edu> <20000616143528.A28920@rice.edu>
|
||
Comments: In-reply-to "Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
message dated "Fri, 16 Jun 2000 14:35:28 -0500"
|
||
Date: Fri, 16 Jun 2000 19:16:37 -0400
|
||
Message-ID: <9271.961197397@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu> writes:
|
||
> I think the backend needs a per table token that indicates how
|
||
> to get at the physical bits of the file. Whether that's a filename
|
||
> alone, filename with path, oid, key to a smgr hash table or something
|
||
> else, it's opaque above the smgr routines.
|
||
|
||
Except to the commands that provide the user interface for tablespaces
|
||
and so forth. And there aren't all that many places that deal with
|
||
physical filenames anyway. It would be a good idea to try to be a
|
||
little stricter about this, but I'm not sure you can make the separation
|
||
a whole lot cleaner than it is now ... with the exception of the obvious
|
||
bogosities like "rename table" being done above the smgr level. (But,
|
||
as I said, I want to see that code go away, not just get moved into
|
||
smgr...)
|
||
|
||
> Hmm, now I'm thinking, since the tablespace discussion has been reopened,
|
||
> the way to go about coding all this is to reactivate the smgr code: how
|
||
> about I leave the existing md smgr as is, and clone it, call it md2 or
|
||
> something, and start messing with adding features there?
|
||
|
||
Um, well, you can't have it both ways. If you're going to change/fix
|
||
the assumptions of code above the smgr, then you've got to update md
|
||
at the same time to match your new definition of the smgr interface.
|
||
Won't do much good to have a playpen smgr if the "standard" one is
|
||
broken.
|
||
|
||
One thing I have been thinking would be a good idea is to take the
|
||
relcache out of the bufmgr/smgr interfaces. The relcache is a
|
||
higher-level concept and ought not be known to bufmgr or smgr; they
|
||
ought to work with some low-level data structure or token for relations.
|
||
We might be able to eliminate the whole concept of "blind write" if we
|
||
do that. There are other problems with the relcache dependency: entries
|
||
in relcache can get blown away at inopportune times due to shared cache
|
||
inval, and it doesn't provide a good home for tokens for multiple
|
||
"versions" of a relation if we go with the fill-a-new-physical-file
|
||
approach to CLUSTER and so on.
|
||
|
||
Hmm, if you replace relcache in the smgr interfaces with pointers to
|
||
an smgr-maintained data structure, that might be the same thing that
|
||
you are alluding to above about an smgr hash table.
|
||
|
||
One thing *not* to do is add yet a third layer of data structure on
|
||
top of the ones already maintained in fd.c and md.c. Whatever extra
|
||
data might be needed here should be added to md.c's tables, I think,
|
||
and then the tokens used in the smgr interface would be pointers into
|
||
that table.
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Fri Jun 16 19:30:43 2000
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05329
|
||
for <maillist@candle.pha.pa.us>; Fri, 16 Jun 2000 19:30:41 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA09320;
|
||
Fri, 16 Jun 2000 19:30:26 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <EKEJJICOHDIEMGPNIFIJAEADCCAA.Inoue@tpf.co.jp>
|
||
References: <EKEJJICOHDIEMGPNIFIJAEADCCAA.Inoue@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Sat, 17 Jun 2000 08:11:08 +0900"
|
||
Date: Fri, 16 Jun 2000 19:30:25 -0400
|
||
Message-ID: <9317.961198225@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> It seems that we should also provide not_preallocated DATAFILE
|
||
> when many_tables_in_a_file storage manager is introduced.
|
||
|
||
Several people in this thread have been talking like a
|
||
single-physical-file storage manager is in our future, but I can't
|
||
recall anyone saying that they were going to do such a thing or even
|
||
presenting reasons why it'd be a good idea.
|
||
|
||
Seems to me that physical file per relation is considerably better for
|
||
our purposes. It's easier to figure out what's going on for admin and
|
||
debug work, it means less lock contention among different backends
|
||
appending concurrently to different relations, and it gives the OS a
|
||
better shot at doing effective read-ahead on sequential scans.
|
||
|
||
So why all the enthusiasm for multi-tables-per-file?
|
||
|
||
regards, tom lane
|
||
|
||
From chris@bitmead.com Fri Jun 16 21:01:02 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07578;
|
||
Fri, 16 Jun 2000 21:01:00 -0400 (EDT)
|
||
Received: from tech.com.au (IDENT:root@techpt.lnk.telstra.net [139.130.75.122]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id UAA24724; Fri, 16 Jun 2000 20:39:30 -0400 (EDT)
|
||
Received: from bitmead.com (IDENT:chris@tardis [203.41.180.243])
|
||
by tech.com.au (8.9.3/8.9.3) with ESMTP id KAA21388;
|
||
Sat, 17 Jun 2000 10:39:21 +1000
|
||
Sender: chris@tech.com.au
|
||
Message-ID: <394AC8B4.C5B4CCFB@bitmead.com>
|
||
Date: Sat, 17 Jun 2000 10:39:16 +1000
|
||
From: Chris Bitmead <chris@bitmead.com>
|
||
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686)
|
||
X-Accept-Language: en
|
||
MIME-Version: 1.0
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
CC: Tom Lane <tgl@sss.pgh.pa.us>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
Jan Wieck <JanWieck@Yahoo.com>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
References: <200006170008.UAA06798@candle.pha.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
Status: RO
|
||
|
||
|
||
> > So why all the enthusiasm for multi-tables-per-file?
|
||
|
||
It allows you to use raw partitions which stop the OS double buffering
|
||
and wasting half of memory, as well as removing the overhead of indirect
|
||
blocks in the file system.
|
||
|
||
From Inoue@tpf.co.jp Sat Jun 17 06:00:59 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA22177;
|
||
Sat, 17 Jun 2000 06:00:59 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id FAA21759; Sat, 17 Jun 2000 05:36:27 -0400 (EDT)
|
||
Received: from mcadnote1 (ppm130.noc.fukui.nsk.ne.jp [210.161.188.49])
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id SAA08383; Sat, 17 Jun 2000 18:35:36 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Bruce Momjian" <pgman@candle.pha.pa.us>, "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Sat, 17 Jun 2000 18:38:29 +0900
|
||
Message-ID: <EKEJJICOHDIEMGPNIFIJEEAKCCAA.Inoue@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="US-ASCII"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
|
||
In-Reply-To: <200006170008.UAA06798@candle.pha.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
|
||
Importance: Normal
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
|
||
> >
|
||
> > So why all the enthusiasm for multi-tables-per-file?
|
||
>
|
||
> No idea. I thought Vadim mentioned it, but I am not sure anymore. I
|
||
> certainly like our current system.
|
||
>
|
||
|
||
Oops, I'm not so enthusiastic about a multi_tables_per_file smgr.
|
||
I believe that Ross and I have taken a practical way that doesn't
|
||
break current file_per_table smgr.
|
||
|
||
However it seems very natural to take multi_tables_per_file
|
||
smgr into account when we consider TABLESPACE concept.
|
||
Because TABLESPACE is an encapsulation, it should have
|
||
a possibility to handle multi_tables_per_file smgr IMHO.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From tgl@sss.pgh.pa.us Sat Jun 17 12:31:08 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA02794;
|
||
Sat, 17 Jun 2000 12:31:07 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA07194; Sat, 17 Jun 2000 12:12:53 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA18824;
|
||
Sat, 17 Jun 2000 12:11:18 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <EKEJJICOHDIEMGPNIFIJEEAKCCAA.Inoue@tpf.co.jp>
|
||
References: <EKEJJICOHDIEMGPNIFIJEEAKCCAA.Inoue@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Sat, 17 Jun 2000 18:38:29 +0900"
|
||
Date: Sat, 17 Jun 2000 12:11:18 -0400
|
||
Message-ID: <18821.961258278@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> However it seems very natural to take multi_tables_per_file
|
||
> smgr into account when we consider TABLESPACE concept.
|
||
> Because TABLESPACE is an encapsulation, it should have
|
||
> a possibility to handle multi_tables_per_file smgr IMHO.
|
||
|
||
OK, I see: you're just saying that the tablespace stuff should be
|
||
designed in such a way that it would work with a non-file-per-table
|
||
smgr. Agreed, that'd be a good check of a clean design, and someday
|
||
we might need it...
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Sun Jun 18 12:30:59 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA06514
|
||
for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 12:30:58 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA04979 for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 12:07:44 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA12163;
|
||
Sun, 18 Jun 2000 12:06:29 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Jan Wieck <JanWieck@Yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006181333.JAA01648@candle.pha.pa.us>
|
||
References: <200006181333.JAA01648@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Sun, 18 Jun 2000 09:33:44 -0400"
|
||
Date: Sun, 18 Jun 2000 12:06:29 -0400
|
||
Message-ID: <12160.961344389@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> ... We could even get fancy and
|
||
> round-robin through all the extents directories, looping around to the
|
||
> beginning when we run out of them. That sounds nice.
|
||
|
||
That sounds horrible. There's no way to tell which extent directory
|
||
extent N goes into except by scanning the location directory to find
|
||
out how many extent subdirectories there are (so that you can compute
|
||
N modulo number-of-directories). Do you want to pay that price on every
|
||
file open?
|
||
|
||
Worse, what happens when you add another extent directory? You can't
|
||
find your old extents anymore, that's what, because they're not in the
|
||
right place (N modulo number-of-directories just changed). Since the
|
||
extents are presumably on different volumes, you're talking about
|
||
physical file moves to get them where they should be. You probably
|
||
can't add a new extent without shutting down the entire database while
|
||
you reshuffle files --- at the very least you'd need to get exclusive
|
||
locks on all the tables in that tablespace.
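
A minimal sketch of the remapping problem described above, written in Python for brevity (the backend itself is C). The function names are hypothetical and not part of any PostgreSQL code; the sketch only assumes that round-robin placement means extent N lives in directory N modulo the number of extent directories.

```python
from typing import List


def round_robin_dir(extent_no: int, num_dirs: int) -> int:
    """Round-robin placement: directory index for a given extent number."""
    return extent_no % num_dirs


def moved_extents(max_extent: int, old_dirs: int, new_dirs: int) -> List[int]:
    """Extents whose computed directory changes when the directory count changes."""
    return [
        n
        for n in range(max_extent + 1)
        if round_robin_dir(n, old_dirs) != round_robin_dir(n, new_dirs)
    ]


if __name__ == "__main__":
    # Going from 3 to 4 extent directories, for extents 0..7.
    print(moved_extents(7, 3, 4))
```

With three directories, extents 0 through 7 map to directories 0,1,2,0,1,2,0,1. Adding a fourth changes the computed directory for extents 3 through 7, so those files would no longer be found where the formula says they are.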
|
||
|
||
Also, you'll get filename conflicts from multiple extents of a single
|
||
table appearing in one of the recycled extent dirs. You could work
|
||
around it by using the non-modulo'd N as part of the final file name,
|
||
but that just adds more complexity and makes the filename-generation
|
||
machinery that much more closely tied to this specific way of doing
|
||
things.
|
||
|
||
The right way to do this is that extent N goes into extents subdirectory
|
||
N, period. If there's no such subdirectory, create one on-the-fly as a
|
||
plain subdirectory of the location directory. The dbadmin can easily
|
||
create secondary extent symlinks *in advance of their being needed*.
|
||
Reorganizing later is much more painful since it requires moving
|
||
physical files, but I think that'd be true no matter what. At least
|
||
we should see to it that adding more space in advance of needing it is
|
||
painless.
|
||
|
||
It's possible to do it that way (auto-create extent subdir if needed)
|
||
without tying the md.c machinery real closely to a specific filename
|
||
creation procedure: it's just the same sort of thing as install programs
|
||
customarily do. "If you fail to create a file, try creating its
|
||
ancestor directory." We'd have to think about whether it'd be a good
|
||
idea to allow auto-creation of more than one level of directory; offhand
|
||
it seems that needing to make more than one level is probably a sign of
|
||
an erroneous path, not need for another extent subdirectory.
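
The fallback described above could look roughly like this. It is a sketch in Python rather than md.c code; `create_with_parent` is a hypothetical name, and creating at most one missing directory level follows the suggestion in the message rather than any settled design.

```python
import os


def create_with_parent(path: str) -> int:
    """Create `path` exclusively; if its directory is missing, create that
    one directory level and retry once. Returns an open file descriptor."""
    flags = os.O_CREAT | os.O_EXCL | os.O_RDWR
    try:
        return os.open(path, flags, 0o600)
    except FileNotFoundError:
        # os.mkdir (unlike os.makedirs) creates a single level only, so a
        # path missing two or more levels still fails here, which is treated
        # as a sign of an erroneous path.
        try:
            os.mkdir(os.path.dirname(path), 0o700)
        except FileExistsError:
            # Another process created the directory in the meantime.
            pass
        return os.open(path, flags, 0o600)
```

The common case (directory already present) costs no extra system calls; the directory is only created after the first attempt fails.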
|
||
|
||
regards, tom lane
|
||
|
||
From dhogaza@pacifier.com Sun Jun 18 20:01:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA19951
|
||
for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 20:00:59 -0400 (EDT)
|
||
Received: from smtp.pacifier.com (asteroid.pacifier.com [199.2.117.154]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA24345 for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 19:50:06 -0400 (EDT)
|
||
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
|
||
by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id QAA05302;
|
||
Sun, 18 Jun 2000 16:49:27 -0700 (PDT)
|
||
Message-Id: <3.0.1.32.20000618164342.011d2450@mail.pacifier.com>
|
||
X-Sender: dhogaza@mail.pacifier.com
|
||
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
|
||
Date: Sun, 18 Jun 2000 16:43:42 -0700
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>, Tom Lane <tgl@sss.pgh.pa.us>
|
||
From: Don Baccus <dhogaza@pacifier.com>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
In-Reply-To: <200006182250.SAA13436@candle.pha.pa.us>
|
||
References: <12160.961344389@sss.pgh.pa.us>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: ROr
|
||
|
||
At 06:50 PM 6/18/00 -0400, Bruce Momjian wrote:
|
||
>If we eliminate the round-robin idea, what did people think of the rest
|
||
>of the ideas?
|
||
|
||
Why invent new syntax when "create tablespace" is something a lot
|
||
of folks will recognize?
|
||
|
||
And why not use "create table ... using ... "? In other words,
|
||
Oracle-compatible for this construct? Sure, Postgres doesn't
|
||
have to follow Oraclisms but picking an existing construct means
|
||
at least SOME folks can import a datamodel without having to
|
||
edit it.
|
||
|
||
Does your proposal break the smgr abstraction, i.e. does it
|
||
preclude later efforts to (say) implement an (optional)
|
||
raw-device storage manager?
|
||
|
||
|
||
|
||
|
||
- Don Baccus, Portland OR <dhogaza@pacifier.com>
|
||
Nature photos, on-line guides, Pacific Northwest
|
||
Rare Bird Alert Service and other goodies at
|
||
http://donb.photo.net.
|
||
|
||
From pgsql-hackers-owner+M3571@hub.org Sun Jun 18 23:28:13 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA23880
|
||
for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 23:28:12 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA04627 for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 23:24:37 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5J3GQM78526;
|
||
Sun, 18 Jun 2000 23:16:26 -0400 (EDT)
|
||
Received: from candle.pha.pa.us (pgman@nav-43.dsl.navpoint.com [162.33.245.46])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5J3E3M71538
|
||
for <pgsql-hackers@postgresql.org>; Sun, 18 Jun 2000 23:14:03 -0400 (EDT)
|
||
Received: (from pgman@localhost)
|
||
by candle.pha.pa.us (8.9.0/8.9.0) id XAA23541;
|
||
Sun, 18 Jun 2000 23:13:44 -0400 (EDT)
|
||
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
Message-Id: <200006190313.XAA23541@candle.pha.pa.us>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <12160.961344389@sss.pgh.pa.us> "from Tom Lane at Jun 18, 2000 12:06:29
|
||
pm"
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Date: Sun, 18 Jun 2000 23:13:44 -0400 (EDT)
|
||
CC: Jan Wieck <JanWieck@Yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
|
||
MIME-Version: 1.0
|
||
Content-Transfer-Encoding: 7bit
|
||
Content-Type: text/plain; charset=US-ASCII
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
My basic proposal is that we optionally allow symlinks when creating
|
||
tablespace directories, and that we interrogate those symlinks during a
|
||
dump so administrators can move tablespaces around without having to
|
||
modify environment variables or system tables.
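
As a rough illustration of "interrogating" symlinks, the sketch below lists the symlinks found directly under a directory together with their targets. It is written in Python for brevity; `symlinked_locations` and the directory layout are hypothetical, and nothing here reflects how a dump utility actually records locations.

```python
import os


def symlinked_locations(base_dir: str) -> dict:
    """Map each symlink name directly under base_dir to its target path."""
    targets = {}
    with os.scandir(base_dir) as entries:
        for entry in entries:
            if entry.is_symlink():
                # readlink returns the stored target without following it.
                targets[entry.name] = os.readlink(entry.path)
    return targets
```

A dump tool could record the returned targets so that a restore recreates the same links, leaving plain subdirectories alone.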
|
||
|
||
I also suggested creating an extent directory to hold extents, like
|
||
extent/2 and extent/3. This will allow administration for smaller sites
|
||
to be simpler.
|
||
|
||
--
|
||
Bruce Momjian | http://www.op.net/~candle
|
||
pgman@candle.pha.pa.us | (610) 853-3000
|
||
+ If your life is a hard drive, | 830 Blythe Avenue
|
||
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||
|
||
From dhogaza@pacifier.com Mon Jun 19 00:31:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA01941
|
||
for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:31:00 -0400 (EDT)
|
||
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id AAA06881 for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:11:39 -0400 (EDT)
|
||
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
|
||
by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id VAA29138;
|
||
Sun, 18 Jun 2000 21:11:01 -0700 (PDT)
|
||
Message-Id: <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com>
|
||
X-Sender: dhogaza@mail.pacifier.com
|
||
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
|
||
Date: Sun, 18 Jun 2000 21:07:48 -0700
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>, Tom Lane <tgl@sss.pgh.pa.us>
|
||
From: Don Baccus <dhogaza@pacifier.com>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
In-Reply-To: <200006190313.XAA23541@candle.pha.pa.us>
|
||
References: <12160.961344389@sss.pgh.pa.us>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: RO
|
||
|
||
At 11:13 PM 6/18/00 -0400, Bruce Momjian wrote:
|
||
>My basic proposal is that we optionally allow symlinks when creating
|
||
>tablespace directories, and that we interrogate those symlinks during a
|
||
>dump so administrators can move tablespaces around without having to
|
||
>modify environment variables or system tables.
|
||
|
||
If they can move them around from within the db, they'll have no need to
|
||
move them around from outside the db.
|
||
|
||
I don't quite understand your devotion to using filesystem commands
|
||
outside the database to do database administration.
|
||
|
||
|
||
|
||
- Don Baccus, Portland OR <dhogaza@pacifier.com>
|
||
Nature photos, on-line guides, Pacific Northwest
|
||
Rare Bird Alert Service and other goodies at
|
||
http://donb.photo.net.
|
||
|
||
From pgsql-hackers-owner+M3573@hub.org Mon Jun 19 01:31:02 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA01981
|
||
for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 01:31:01 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA09569 for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 01:13:53 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5J4T3M86960;
|
||
Mon, 19 Jun 2000 00:29:04 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5J4RFM80712
|
||
for <pgsql-hackers@postgresql.org>; Mon, 19 Jun 2000 00:27:15 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA09517;
|
||
Mon, 19 Jun 2000 00:25:53 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Jan Wieck <JanWieck@yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006190313.XAA23541@candle.pha.pa.us>
|
||
References: <200006190313.XAA23541@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Sun, 18 Jun 2000 23:13:44 -0400"
|
||
Date: Mon, 19 Jun 2000 00:25:52 -0400
|
||
Message-ID: <9514.961388752@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> I also suggested creating an extent directory to hold extents, like
|
||
> extent/2 and extent/3. This will allow administration for smaller sites
|
||
> to be simpler.
|
||
|
||
I don't see the value in creating an extra level of directory --- seems
|
||
that just adds one more Unix directory-lookup cycle to each file open,
|
||
without any apparent return. What's wrong with extent directory names
|
||
like extent2, extent3, etc?
|
||
|
||
Obviously the extent dirnames must be chosen so they can't conflict
|
||
with table filenames, but that's easily done. For example, if table
|
||
files are named like 'OID_xxx' then 'extentN' will never conflict.
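
A sketch of the naming scheme under discussion, in Python for brevity. `extent_path` and `conflicts_with_extent_dir` are hypothetical helpers; the 'OID_xxx' file name and 'extentN' directory name come from the message above, while keeping the first extent directly in the location directory is an assumption made here for illustration.

```python
import posixpath


def extent_path(location: str, table_file: str, extent_no: int) -> str:
    """Path of one extent of a table under a tablespace location directory."""
    if extent_no < 1:
        raise ValueError("extent numbers start at 1 in this sketch")
    if extent_no == 1:
        # Assumption: the first extent sits directly in the location directory.
        return posixpath.join(location, table_file)
    # Later extents go into a subdirectory named after the extent number.
    return posixpath.join(location, f"extent{extent_no}", table_file)


def conflicts_with_extent_dir(table_file: str) -> bool:
    """True if a table file name could be mistaken for an extent directory."""
    return table_file.startswith("extent")
```

Because the directory is derived from the extent number alone, no directory scan is needed to locate a file, and adding more extent directories later does not change where existing extents live.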
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Mon Jun 19 00:30:58 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA01934
|
||
for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:30:58 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id AAA07814 for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:29:36 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA09535;
|
||
Mon, 19 Jun 2000 00:28:14 -0400 (EDT)
|
||
To: Don Baccus <dhogaza@pacifier.com>
|
||
cc: Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com>
|
||
References: <12160.961344389@sss.pgh.pa.us> <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com>
|
||
Comments: In-reply-to Don Baccus <dhogaza@pacifier.com>
|
||
message dated "Sun, 18 Jun 2000 21:07:48 -0700"
|
||
Date: Mon, 19 Jun 2000 00:28:14 -0400
|
||
Message-ID: <9532.961388894@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
Don Baccus <dhogaza@pacifier.com> writes:
|
||
> If they can move them around from within the db, they'll have no need to
|
||
> move them around from outside the db.
|
||
> I don't quite understand your devotion to using filesystem commands
|
||
> outside the database to do database administration.
|
||
|
||
Being *able* to use filesystem commands to see/fix what's going on is a
|
||
good thing, particularly from a development/debugging standpoint. But
|
||
I agree we want to have within-the-system admin commands to do the same
|
||
things.
|
||
|
||
regards, tom lane
|
||
|
||
From dhogaza@pacifier.com Mon Jun 19 00:58:39 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA00799
|
||
for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:58:38 -0400 (EDT)
|
||
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id AAA08143 for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:37:39 -0400 (EDT)
|
||
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
|
||
by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id VAA00259;
|
||
Sun, 18 Jun 2000 21:36:25 -0700 (PDT)
|
||
Message-Id: <3.0.1.32.20000618213319.011d59c0@mail.pacifier.com>
|
||
X-Sender: dhogaza@mail.pacifier.com
|
||
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
|
||
Date: Sun, 18 Jun 2000 21:33:19 -0700
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
From: Don Baccus <dhogaza@pacifier.com>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
In-Reply-To: <9532.961388894@sss.pgh.pa.us>
|
||
References: <3.0.1.32.20000618210748.011d1c40@mail.pacifier.com>
|
||
<12160.961344389@sss.pgh.pa.us>
|
||
<3.0.1.32.20000618210748.011d1c40@mail.pacifier.com>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: RO
|
||

At 12:28 AM 6/19/00 -0400, Tom Lane wrote:

>Being *able* to use filesystem commands to see/fix what's going on is a
>good thing, particularly from a development/debugging standpoint.

Of course it's a crutch for development, but outside of development
circles few users will know how to use the OS in regard to the
database.

Assuming PG takes off. Of course, if it remains the realm of the
dedicated hard-core hacker, I'm wrong.

I have nothing against preserving the ability to use filesystem
commands if there are no significant costs inherent in this approach.
I'd view the breaking of smgr abstraction as a significant cost (though
I agree with Ross that Bruce's proposal shouldn't require that; I
asked my question to flush Bruce out, if you will, because he's
devoted to a particular outside-the-db management model).

> But
>I agree we want to have within-the-system admin commands to do the same
>things.

MUST have, I should think.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.

From Inoue@tpf.co.jp Mon Jun 19 12:31:17 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA29988
    for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 12:31:16 -0400 (EDT)
Received: from sd.tpf.co.jp (mail.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA21005 for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 12:15:22 -0400 (EDT)
Received: from mcadnote1 (ppm127.noc.fukui.nsk.ne.jp [210.161.188.46])
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id BAA09828; Tue, 20 Jun 2000 01:14:19 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>, "Jan Wieck" <JanWieck@yahoo.com>,
    "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>,
    "Don Baccus" <dhogaza@pacifier.com>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Tue, 20 Jun 2000 01:17:14 +0900
Message-ID: <EKEJJICOHDIEMGPNIFIJGECCCCAA.Inoue@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
In-Reply-To: <200006191330.JAA16908@candle.pha.pa.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Importance: Normal
Status: ROr

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> The fact is that symlink information is already stored in the file
> system. If we store symlink information in the database too, there
> exists the ability for the two to get out of sync. My point is that I
> think we can _not_ store symlink information in the database, and query
> the file system using lstat when required.
>
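
For reference, the lstat approach in the quoted text can be sketched in a
few lines. This is a minimal Python illustration, not PostgreSQL code; the
directory names (private_pgsql, base, dbloc) and the helper symlink_target
are hypothetical. lstat examines the link itself rather than following it,
so the location is read straight from the file system and nothing has to
be kept in a catalog.

```python
import os
import stat
import tempfile


def symlink_target(path):
    """Return the link target if `path` is a symlink, otherwise None."""
    info = os.lstat(path)  # lstat inspects the link itself, not its target
    if stat.S_ISLNK(info.st_mode):
        return os.readlink(path)
    return None


def demo():
    """Build a throwaway layout (hypothetical names) and query it."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "private_pgsql")
        base = os.path.join(root, "base")
        os.makedirs(target)
        os.makedirs(base)
        link = os.path.join(base, "dbloc")
        os.symlink(target, link)
        # The link reports its target; a plain directory reports None.
        return symlink_target(link) == target, symlink_target(base)
```

The trade-off raised in the thread still applies: the answer is only as
good as the state of the file system at the moment of the call.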

Hmm, this seems pretty confusing to me.
I don't understand the necessity of symlinks.
Directory trees, symlinks, hard links ... are OS standards.
But I don't think they are a good fit for DBMS management.

PostgreSQL is a database system, of course. So
couldn't it handle a more flexible structure than the OS's
directory tree by itself?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From Inoue@tpf.co.jp Tue Jun 20 02:01:04 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24419
    for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 02:00:59 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA26090 for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 01:51:00 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id OAA10171; Tue, 20 Jun 2000 14:50:03 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>, "Jan Wieck" <JanWieck@yahoo.com>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>,
    "Don Baccus" <dhogaza@pacifier.com>,
    "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Tue, 20 Jun 2000 14:52:17 +0900
Message-ID: <000001bfda7b$b0dbf160$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-Reply-To: <200006191735.NAA03241@candle.pha.pa.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
Status: ROr

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > > -----Original Message-----
> > > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> > >
> > > The fact is that symlink information is already stored in the file
> > > system. If we store symlink information in the database too, there
> > > exists the ability for the two to get out of sync. My point is that I
> > > think we can _not_ store symlink information in the database, and query
> > > the file system using lstat when required.
> > >
> > Hmm, this seems pretty confusing to me.
> > I don't understand the necessity of symlinks.
> > Directory trees, symlinks, hard links ... are OS standards.
> > But I don't think they are a good fit for DBMS management.
> >
> > PostgreSQL is a database system, of course. So
> > couldn't it handle a more flexible structure than the OS's
> > directory tree by itself?
>
> Yes, but is anyone suggesting a solution that does not work with
> symlinks? If not, why not do it that way?
>

Maybe other solutions have been proposed already, because
there have been so many opinions and proposals.

I've felt the TABLE(DATA)SPACE discussion has always been
divergent. IMHO, one of the main causes is that various factors
have been discussed at once. Shouldn't we reach consensus
step by step in the TABLE(DATA)SPACE discussion?

IMHO, the first step is to decide the syntax of the CREATE TABLE
command, not how to define a TABLE(DATA)SPACE.

Comments?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From tgl@sss.pgh.pa.us Tue Jun 20 10:51:32 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA15181
    for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 10:51:31 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id KAA26466 for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 10:37:20 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA29689;
    Tue, 20 Jun 2000 10:36:04 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Jan Wieck <JanWieck@yahoo.com>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>,
    Don Baccus <dhogaza@pacifier.com>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006201340.JAA10387@candle.pha.pa.us>
References: <200006201340.JAA10387@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Tue, 20 Jun 2000 09:40:03 -0400"
Date: Tue, 20 Jun 2000 10:36:04 -0400
Message-ID: <29686.961511764@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Agreed. Seems we have several issues:

>     filename contents
>     tablespace implementation
>     tablespace directory layout
>     tablespace commands and syntax

I think we've agreed that the filename must depend on tablespace,
file version, and file segment number in some fashion --- plus
the table name/OID of course. Although there's no real consensus
about exactly how to construct the name, agreeing on the components
is still a positive step.

A couple of other areas of contention were:

    revising smgr interface to be cleaner
    exactly what to store in pg_class

I don't think there's any quibble about the idea of cleaning up smgr,
but we don't have a complete proposal on the table yet either.

As for the pg_class issue, I still favor storing
  (a) OID of tablespace --- not for file access, but so that
      associated tablespace-table entry can be looked up
      by tablespace management operations
  (b) pathname of file as a column of type "name", including
      a %d to be replaced by segment #
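
To make item (b) concrete, here is a minimal Python sketch of expanding a
stored pathname template into a per-segment file name. It is only an
illustration under stated assumptions: the function name segment_path and
the sample template are hypothetical, and the thread had not settled what
the template would actually contain.

```python
def segment_path(template, segment):
    """Expand a stored path template for one segment of a table.

    `template` is assumed to hold exactly one "%d" placeholder, as in
    item (b); `segment` is the zero-based segment number.
    """
    if template.count("%d") != 1:
        raise ValueError("template must contain exactly one %d placeholder")
    if segment < 0:
        raise ValueError("segment number must be non-negative")
    # Plain substitution avoids surprises from other '%' characters.
    return template.replace("%d", str(segment), 1)


# Hypothetical template: tablespace directory / segment number / table file.
first_segment = segment_path("ts1/%d/foo", 0)
later_segment = segment_path("ts1/%d/foo", 12)
```

Because the whole name is a single string, the storage manager needs only
that one value plus a segment number to locate any piece of the table.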

I think Peter was holding out for storing purely numeric tablespace OID
and table version in pg_class and having a hardwired mapping to pathname
somewhere in smgr. However, I think that doing it that way gains only
micro-efficiency compared to passing a "name" around, while using the
name approach buys us flexibility that's needed for at least some of
the variants under discussion. Given that the exact filename contents
are still so contentious, I think it'd be a bad idea to pick an
implementation that doesn't allow some leeway as to what the filename
will be. A name also has the advantage that it is a single item that
can be used to identify the table to smgr, which will help in cleaning
up the smgr interface.

As for tablespace layout/implementation, the only real proposal I've
heard is that there be a subdirectory of the database directory for each
tablespace, and that that have a subdirectory for each segment (extent)
of its tables --- where any of these subdirectories could be symlinks
off to a different filesystem. Some unhappiness was raised about
depending on symlinks for this function, but I didn't hear one single
concrete reason not to do it, nor an alternative design. Unless someone
comes up with a counterproposal, I think that that's what the actual
access mechanism will look like. We still need to talk about what we
want to store in the SQL-level representation of a tablespace, and what
sort of tablespace management tools/commands are needed. (Although
"try to make it look like Oracle" seems to be pretty much the consensus
for the command level, not all of us know exactly what that means...)

Comments? Anything else that we do have consensus on?

                        regards, tom lane

From pgsql-hackers-owner+M3615@hub.org Tue Jun 20 12:55:05 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA25768
    for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 12:55:04 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA09949 for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 12:41:15 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5KGcCM19112;
    Tue, 20 Jun 2000 12:38:12 -0400 (EDT)
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5KGbbM18701
    for <pgsql-hackers@postgresql.org>; Tue, 20 Jun 2000 12:37:37 -0400 (EDT)
Received: from regulus.student.UU.SE ([130.238.5.2]:43625 "EHLO
    regulus.its.uu.se") by merganser.its.uu.se with ESMTP
    id <S303230AbQFTQhF>; Tue, 20 Jun 2000 18:37:05 +0200
Received: from peter (helo=localhost)
    by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
    id 134R7f-0003wS-00; Tue, 20 Jun 2000 18:43:35 +0200
Date: Tue, 20 Jun 2000 18:43:35 +0200 (CEST)
From: Peter Eisentraut <peter_e@gmx.net>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <JanWieck@Yahoo.com>, Tom Lane <tgl@sss.pgh.pa.us>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-Reply-To: <200006180316.XAA15410@candle.pha.pa.us>
Message-ID: <Pine.LNX.4.21.0006200034310.353-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ROr

Bruce Momjian writes:

> If we have a new CREATE DATABASE LOCATION command, we can say:
>
>     CREATE DATABASE LOCATION dbloc IN '/var/private/pgsql';
>     CREATE DATABASE newdb IN dbloc;

We kind of have this already, with CREATE DATABASE foo WITH LOCATION =
'bar'; but of course with environment variable kludgery. But it's a start.

>     mkdir /var/private/pgsql/dbloc
>     ln -s /var/private/pgsql/dbloc data/base/dbloc

I think the problem with this was that you'd have to do an extra lookup
into, say, pg_location to resolve this. Some people are talking about
blind writes, this is not really blind.

>     CREATE LOCATION tabloc IN '/var/private/pgsql';
>     CREATE TABLE newtab ... IN tabloc;

Okay, so we'd have "table spaces" and "database spaces". Seems like one
"space" ought to be enough. I was thinking that the database "space" would
serve as a default "space" for tables created within it but you could
still create tables in other "spaces" than where the database really is. In
fact, the database wouldn't show up at all in the file names anymore,
which may or may not be a good thing.

I think Tom suggested something more or less like this:

    $PGDATA/base/tablespace/segment/table

(leaving the details of "table" aside for now). pg_class would get a
column storing the table space somehow, say an oid reference to
pg_location. There would have to be a default tablespace that's created by
initdb and it's indicated by oid 0. So if you create a simple little table
"foo" it ends up in

    $PGDATA/base/0/0/foo

That is pretty manageable. Now to create a table space you do

    CREATE LOCATION "name" AT '/some/where';

which would make an entry in pg_location and, similar to how you
suggested, create a symlink from

    $PGDATA/base/newoid -> /some/where

Then when you create a new table at that new location this gets simply
noted in pg_class with an oid reference, the rest works completely
transparently and no lookup outside of pg_class is required. The system
would create the segment 0 subdirectory automatically.

When tables get segmented the system would simply create subdirectories 1,
2, 3, etc. as needed, just as it created the 0 as needed, no extra code.

pg_dump doesn't need to use lstat or whatever at all because the locations
are catalogued. Administrators don't even need to know about the linking
business, they just make sure the target directory exists.
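
The layout described above can be sketched as follows. This is a rough
Python model of the idea, not an implementation from the thread: the
functions create_location and table_file, the oid 1234, and the directory
names are all hypothetical, and a real CREATE LOCATION would also record
the entry in pg_location.

```python
import os
import tempfile

DEFAULT_TABLESPACE = 0  # per the proposal, created by initdb as oid 0


def create_location(pgdata, oid, target):
    """Model CREATE LOCATION: symlink $PGDATA/base/<oid> -> target."""
    if not os.path.isdir(target):
        raise FileNotFoundError(target)  # the admin must create it first
    link = os.path.join(pgdata, "base", str(oid))
    os.symlink(target, link)
    return link


def table_file(pgdata, tablespace_oid, segment, table):
    """Return $PGDATA/base/<tablespace>/<segment>/<table>.

    The segment subdirectory is created on demand, for 0 as for 1, 2, 3.
    """
    seg_dir = os.path.join(pgdata, "base", str(tablespace_oid), str(segment))
    os.makedirs(seg_dir, exist_ok=True)
    return os.path.join(seg_dir, table)


def demo():
    with tempfile.TemporaryDirectory() as root:
        pgdata = os.path.join(root, "pgdata")
        elsewhere = os.path.join(root, "some_where")
        os.makedirs(os.path.join(pgdata, "base", str(DEFAULT_TABLESPACE)))
        os.makedirs(elsewhere)
        create_location(pgdata, 1234, elsewhere)

        default_path = table_file(pgdata, DEFAULT_TABLESPACE, 0, "foo")
        moved_path = table_file(pgdata, 1234, 1, "bar")
        open(moved_path, "w").close()
        # The file physically lives under the symlink target.
        landed = os.path.exists(os.path.join(elsewhere, "1", "bar"))
        return (os.path.relpath(default_path, pgdata),
                os.path.relpath(moved_path, pgdata),
                landed)
```

Path construction never consults anything but the tablespace oid, which
is the "no lookup outside of pg_class" property; the symlink does the
redirection at the file-system level.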

Two more items to ponder:

* per-location transaction logs

* pg_upgrade


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden

From Inoue@tpf.co.jp Tue Jun 20 17:10:56 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10307
    for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 17:10:55 -0400 (EDT)
Received: from sd.tpf.co.jp (mail.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id QAA08017 for <pgman@candle.pha.pa.us>; Tue, 20 Jun 2000 16:57:44 -0400 (EDT)
Received: from mcadnote1 (ppm127.noc.fukui.nsk.ne.jp [210.161.188.46])
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id FAA00867; Wed, 21 Jun 2000 05:56:44 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Jan Wieck" <JanWieck@yahoo.com>, "Ross J. Reedstrom" <reedstrm@rice.edu>,
    "Don Baccus" <dhogaza@pacifier.com>,
    "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Wed, 21 Jun 2000 05:59:41 +0900
Message-ID: <EKEJJICOHDIEMGPNIFIJIEDDCCAA.Inoue@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
In-Reply-To: <29686.961511764@sss.pgh.pa.us>
Importance: Normal
Status: RO

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Agreed. Seems we have several issues:
>
> >     filename contents
> >     tablespace implementation
> >     tablespace directory layout
> >     tablespace commands and syntax
>

[snip]

>
> Comments? Anything else that we do have consensus on?
>

Before the details of tablespace implementation:

1) How do we change (extend) the syntax of CREATE TABLE?
   Do we only add a table(data)space name with some
   keyword? I.e., do we consider a tablespace to be an
   abstraction?

To confirm our mutual understanding:

2) Is a tablespace defined per PostgreSQL database?
3) Is the default tablespace defined per database/user or
   for all?

AFAIK in Oracle, 2) is global and 3) is per user.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From Inoue@tpf.co.jp Tue Jun 20 20:00:59 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA12668;
    Tue, 20 Jun 2000 20:00:58 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA21016; Tue, 20 Jun 2000 19:54:18 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id IAA00974; Wed, 21 Jun 2000 08:52:38 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Peter Eisentraut" <peter_e@gmx.net>
Cc: "Jan Wieck" <JanWieck@Yahoo.com>, "Tom Lane" <tgl@sss.pgh.pa.us>,
    "Bruce Momjian" <maillist@candle.pha.pa.us>,
    "PostgreSQL-development" <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>,
    "Bruce Momjian" <pgman@candle.pha.pa.us>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Wed, 21 Jun 2000 08:54:51 +0900
Message-ID: <000e01bfdb12$ecc08f00$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <Pine.LNX.4.21.0006200034310.353-100000@localhost.localdomain>
Importance: Normal
Status: ROr

> -----Original Message-----
> From: Peter Eisentraut
>
> Bruce Momjian writes:
>
> > If we have a new CREATE DATABASE LOCATION command, we can say:
> >
> >     CREATE DATABASE LOCATION dbloc IN '/var/private/pgsql';
> >     CREATE DATABASE newdb IN dbloc;
>
> We kind of have this already, with CREATE DATABASE foo WITH LOCATION =
> 'bar'; but of course with environment variable kludgery. But it's a start.
>
> >     mkdir /var/private/pgsql/dbloc
> >     ln -s /var/private/pgsql/dbloc data/base/dbloc
>
> I think the problem with this was that you'd have to do an extra lookup
> into, say, pg_location to resolve this. Some people are talking about
> blind writes, this is not really blind.
>
> >     CREATE LOCATION tabloc IN '/var/private/pgsql';
> >     CREATE TABLE newtab ... IN tabloc;
>
> Okay, so we'd have "table spaces" and "database spaces". Seems like one
> "space" ought to be enough.

Does your "database space" correspond to the current PostgreSQL database?
And is it different from SCHEMA?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From tgl@sss.pgh.pa.us Wed Jun 21 00:23:48 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA18016;
    Wed, 21 Jun 2000 00:23:47 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id AAA05207; Wed, 21 Jun 2000 00:07:58 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA03002;
    Wed, 21 Jun 2000 00:06:42 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Peter Eisentraut <peter_e@gmx.net>,
    Jan Wieck <JanWieck@Yahoo.com>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006210345.XAA15107@candle.pha.pa.us>
References: <200006210345.XAA15107@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Tue, 20 Jun 2000 23:45:13 -0400"
Date: Wed, 21 Jun 2000 00:06:42 -0400
Message-ID: <2999.961560402@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I recommend making a dbname in each directory, then putting the
> location inside there.

This still seems backwards to me. Why is it better than tablespace
directory inside database directory?

One significant problem with it is that there's no longer (AFAICS)
a "default" per-database directory that corresponds to the current
working directory of backends running in that database. Thus,
for example, it's not immediately clear where temporary files and
backend core-dump files will end up. Also, you've just added an
essential extra level (if not two) to the pathnames that backends will
use to address files.

There is a great deal to be said for
    ..../database/tablespace/filename
where .../database/ is the working directory of a backend running in
that database, so that the relative pathname used by that backend to
get to a table is just tablespace/filename. I fail to see any advantage
in reversing the pathname order. If you see one, enlighten me.
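
The working-directory argument can be illustrated with a short Python
sketch. It is a toy model only: the names mydb, ts0 and foo and the
helper relative_table_path are hypothetical, and a real backend would
change directory once at startup rather than per file access.

```python
import os
import tempfile


def relative_table_path(tablespace, filename):
    """Path a backend would use when its cwd is the database directory."""
    return os.path.join(tablespace, filename)


def demo():
    with tempfile.TemporaryDirectory() as root:
        database_dir = os.path.join(root, "base", "mydb")
        os.makedirs(os.path.join(database_dir, "ts0"))

        previous = os.getcwd()
        os.chdir(database_dir)  # stand-in for the backend's working directory
        try:
            rel = relative_table_path("ts0", "foo")
            with open(rel, "w") as handle:  # short relative path, no prefix
                handle.write("x")
            found = os.path.exists(os.path.join(database_dir, "ts0", "foo"))
        finally:
            os.chdir(previous)  # restore before the temp dir is removed
        return rel, found
```

With the order reversed (tablespace above database), no single directory
could serve as that working directory, which is the objection raised
above about temporary files and core dumps.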

                        regards, tom lane

From pgsql-hackers-owner+M3635@hub.org Wed Jun 21 01:00:59 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA19614
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:00:54 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5L4wA125142;
    Wed, 21 Jun 2000 00:58:10 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5L4vp125043
    for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 00:57:51 -0400 (EDT)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id NAA01462; Wed, 21 Jun 2000 13:52:47 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
    "Bruce Momjian" <maillist@candle.pha.pa.us>,
    "PostgreSQL-development" <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Wed, 21 Jun 2000 13:55:01 +0900
Message-ID: <000001bfdb3c$db728760$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-reply-to: <2999.961560402@sss.pgh.pa.us>
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I recommend making a dbname in each directory, then putting the
> > location inside there.
>
> This still seems backwards to me. Why is it better than tablespace
> directory inside database directory?
>
> One significant problem with it is that there's no longer (AFAICS)
> a "default" per-database directory that corresponds to the current
> working directory of backends running in that database. Thus,
> for example, it's not immediately clear where temporary files and
> backend core-dump files will end up. Also, you've just added an
> essential extra level (if not two) to the pathnames that backends will
> use to address files.
>
> There is a great deal to be said for
>     ..../database/tablespace/filename

OK, I seem to have gotten the answer to the question
"Is a tablespace defined per PostgreSQL database?"

You and Bruce
    1) tablespace is per database
Peter seems to have the following idea (not sure)
    2) database = tablespace
My opinion
    3) database and tablespace are largely independent of each other.
       I assume PostgreSQL's database would correspond
       to the concept of SCHEMA.

It seems we differ from the start.
Shouldn't we reach an agreement on this first?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From pgsql-hackers-owner+M3636@hub.org Wed Jun 21 01:31:12 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20523
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:31:12 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA08982 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:15:17 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5L5Bp151546;
    Wed, 21 Jun 2000 01:11:51 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5L5BP151324
    for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 01:11:25 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA03463;
    Wed, 21 Jun 2000 01:09:52 -0400 (EDT)
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <3950484D.417C87E9@nimrod.itg.telecom.com.au>
References: <200006210346.XAA15138@candle.pha.pa.us> <3950484D.417C87E9@nimrod.itg.telecom.com.au>
Comments: In-reply-to Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
    message dated "Wed, 21 Jun 2000 14:45:01 +1000"
Date: Wed, 21 Jun 2000 01:09:52 -0400
Message-ID: <3459.961564192@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> What I meant is, would you still be able to create tablespaces on
> systems without symlinks? That would seem to be a desirable feature.

All else being equal, it'd be nice. Since all else is not equal,
exactly how much sweat are we willing to expend on supporting that
feature on such systems --- to the exclusion of other features we
might expend the same sweat on, with more widely useful results?

Bear in mind that everything will still *work* just fine on such a
platform, you just don't have a way to spread the database across
multiple filesystems. That's only an issue if the platform has a
fairly Unixy notion of filesystems ... but no symlinks.

A few messages back someone was opining that we were wasting our time
thinking about tablespaces at all, because any modern platform can
create disk-spanning filesystems for itself, so applications don't have
to worry. I don't buy that argument in general, but I'm quite willing
to quote it for the *very* few systems that are Unixy enough to run
Postgres in the first place, but not quite Unixy enough to have
symlinks.

You gotta draw the line somewhere at what you will support, and
this particular line seems to me to be entirely reasonable and
justifiable. YMMV...

                        regards, tom lane

From dhogaza@pacifier.com Wed Jun 21 01:31:03 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20492
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:30:58 -0400 (EDT)
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA09401 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:22:50 -0400 (EDT)
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
    by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA22395;
    Tue, 20 Jun 2000 22:21:47 -0700 (PDT)
Message-Id: <3.0.1.32.20000620221248.0150f610@mail.pacifier.com>
X-Sender: dhogaza@mail.pacifier.com
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
Date: Tue, 20 Jun 2000 22:12:48 -0700
To: "Philip J. Warner" <pjw@rhyme.com.au>, "Hiroshi Inoue" <Inoue@tpf.co.jp>,
    "Tom Lane" <tgl@sss.pgh.pa.us>,
    "Bruce Momjian" <pgman@candle.pha.pa.us>
From: Don Baccus <dhogaza@pacifier.com>
Subject: RE: [HACKERS] Big 7.1 open items
Cc: "Jan Wieck" <JanWieck@yahoo.com>, "Ross J. Reedstrom" <reedstrm@rice.edu>,
    "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>
In-Reply-To: <3.0.5.32.20000621112210.01d97680@mail.rhyme.com.au>
References: <EKEJJICOHDIEMGPNIFIJIEDDCCAA.Inoue@tpf.co.jp>
    <29686.961511764@sss.pgh.pa.us>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Status: RO
|
||
At 11:22 AM 6/21/00 +1000, Philip J. Warner wrote:
|
||
|
||
>It may be worth considering leaving the CREATE TABLE statement alone.
|
||
>Dec/RDB uses a new statement entirely to define where a table goes...
|
||
|
||
It's worth considering, but on the other hand Oracle users greatly
|
||
outnumber Compaq/RDB users these days...
|
||
|
||
If there's no SQL92 guidance for implementing a feature, I'm pretty much in
|
||
favor of tracking Oracle, whose SQL dialect is rapidly becoming a
|
||
de-facto standard.
|
||
|
||
I'm not saying I like the fact, Oracle's a pain in the ass. But when
|
||
adopting existing syntax, might as well adopt that of the crushing
|
||
borg.
|
||
|
||
|
||
|
||
- Don Baccus, Portland OR <dhogaza@pacifier.com>
|
||
Nature photos, on-line guides, Pacific Northwest
|
||
Rare Bird Alert Service and other goodies at
|
||
http://donb.photo.net.
|
||
|
||
From lockhart@alumni.caltech.edu Wed Jun 21 01:31:07 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20508;
    Wed, 21 Jun 2000 01:31:06 -0400 (EDT)
Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA09355; Wed, 21 Jun 2000 01:22:03 -0400 (EDT)
Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203])
    by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id WAA00821;
    Tue, 20 Jun 2000 22:18:38 -0700 (PDT)
Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1])
    by golem.jpl.nasa.gov (Postfix) with ESMTP
    id AF4376F51; Wed, 21 Jun 2000 05:19:29 +0000 (UTC)
Sender: lockhart@mythos.jpl.nasa.gov
Message-ID: <39505061.F42334AB@alumni.caltech.edu>
Date: Wed, 21 Jun 2000 05:19:29 +0000
From: Thomas Lockhart <lockhart@alumni.caltech.edu>
Organization: Yes
X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
Cc: Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@yahoo.com>,
    Tom Lane <tgl@sss.pgh.pa.us>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
References: <200006201753.NAA27293@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: ROr

> Yes, I didn't like the environment variable stuff. In fact, I would
> like to not mention the symlink location anywhere in the database, so
> it can be changed without changing it in the database.

Well, as y'all have noticed, I think there are strong reasons to use
environment variables to manage locations, and that symlinks are a
potential portability and robustness problem.

An additional point which has relevance to this whole discussion:

In the future we may allow system resources such as tables to carry names
which use multi-byte encodings. afaik these encodings are not allowed to
be used for physical file names, and even if they were the utility of
using standard operating system utilities like ls goes way down.

istm that from a portability and evolutionary standpoint OID-only file
names (or at least file names *not* based on relation/class names) is a
requirement.

Comments?

- Thomas

From tgl@sss.pgh.pa.us Wed Jun 21 01:31:05 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20503
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:31:05 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA09513 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 01:25:18 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA03557;
    Wed, 21 Jun 2000 01:23:58 -0400 (EDT)
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
    "Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@yahoo.com>,
    "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <000001bfdb3c$db728760$2801007e@tpf.co.jp>
References: <000001bfdb3c$db728760$2801007e@tpf.co.jp>
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
    message dated "Wed, 21 Jun 2000 13:55:01 +0900"
Date: Wed, 21 Jun 2000 01:23:57 -0400
Message-ID: <3554.961565037@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> There is a great deal to be said for
>> ..../database/tablespace/filename

> OK, I seem to have gotten the answer for the question
> Is tablespace defined per PostgreSQL's database ?

Not necessarily --- the tablespace subdirectories could be symlinks
pointing to the same place (assuming you use OIDs or something to keep
the table filenames unique even across databases). This is just an
implementation mechanism; it doesn't foreclose the policy decision
whether tablespaces are database-local or installation-wide.

(OTOH, pathnames like tablespace/database would pretty much force
tablespaces to be installation-wide whether you wanted it that way
or not.)

> My opinion
> 3) database and tablespace are relatively irrelevant.
> I assume PostgreSQL's database would correspond
> to the concept of SCHEMA.

My inclination is that tablespaces should be installation-wide, but
I'm not completely sold on it. In any case I could see wanting a
permissions mechanism that would only allow some databases to have
tables in a particular tablespace.

We do need to think more about how traditional Postgres databases
fit together with SCHEMA. Maybe we wouldn't even need multiple
databases per installation if we had SCHEMA done right.

regards, tom lane

From pgsql-hackers-owner+M3641@hub.org Wed Jun 21 02:31:02 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA25698
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 02:31:00 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id CAA11423 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 02:09:13 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5L5we151226;
    Wed, 21 Jun 2000 01:58:40 -0400 (EDT)
Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5L5wE151030
    for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 01:58:14 -0400 (EDT)
Received: by rice.edu
    via sendmail from stdin
    id <m134dJu-000LGmC@wallace.ece.rice.edu> (Debian Smail3.2.0.102)
    for pgsql-hackers@postgresql.org; Wed, 21 Jun 2000 00:45:02 -0500 (CDT)
Date: Wed, 21 Jun 2000 00:45:02 -0500
From: "Ross J. Reedstrom" <reedstrm@rice.edu>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
Message-ID: <20000621004502.A24387@rice.edu>
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
References: <000001bfdb3c$db728760$2801007e@tpf.co.jp> <3554.961565037@sss.pgh.pa.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.0i
In-Reply-To: <3554.961565037@sss.pgh.pa.us>; from tgl@sss.pgh.pa.us on Wed, Jun 21, 2000 at 01:23:57AM -0400
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: ROr

On Wed, Jun 21, 2000 at 01:23:57AM -0400, Tom Lane wrote:
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>
> > My opinion
> > 3) database and tablespace are relatively irrelevant.
> > I assume PostgreSQL's database would correspond
> > to the concept of SCHEMA.
>
> My inclination is that tablespaces should be installation-wide, but
> I'm not completely sold on it. In any case I could see wanting a
> permissions mechanism that would only allow some databases to have
> tables in a particular tablespace.
>
> We do need to think more about how traditional Postgres databases
> fit together with SCHEMA. Maybe we wouldn't even need multiple
> databases per installation if we had SCHEMA done right.
>

The important point I think is that tablespaces are about physical
storage/namespace, and SCHEMA are about logical namespace: it would make
sense for tables from multiple schema to live in the same tablespace,
as well as tables from one schema to be stored in multiple tablespaces.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

From pgsql-hackers-owner+M3644@hub.org Wed Jun 21 02:31:03 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA25704
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 02:31:02 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id CAA11923 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 02:22:41 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5L6JO196109;
    Wed, 21 Jun 2000 02:19:24 -0400 (EDT)
Received: from mailo.vtcif.telstra.com.au (mailo.vtcif.telstra.com.au [202.12.144.17])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5L6JB196028
    for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 02:19:11 -0400 (EDT)
Received: (from uucp@localhost) by mailo.vtcif.telstra.com.au (8.8.2/8.6.9) id QAA21128 for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 16:19:04 +1000 (EST)
Received: from maili.vtcif.telstra.com.au(202.12.142.17)
    via SMTP by mailo.vtcif.telstra.com.au, id smtpd08EKgu; Wed Jun 21 16:17:56 2000
Received: (from uucp@localhost) by maili.vtcif.telstra.com.au (8.8.2/8.6.9) id QAA02825 for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 16:17:55 +1000 (EST)
Received: from localhost(127.0.0.1), claiming to be "mail.cdn.telstra.com.au"
    via SMTP by localhost, id smtpdnjRBD_; Wed Jun 21 16:17:14 2000
Received: from lunitari.nimrod.itg.telecom.com.au (lunitari.nimrod.itg.telecom.com.au [192.53.254.48]) by mail.cdn.telstra.com.au (8.8.2/8.6.9) with ESMTP id QAA07553 for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 16:17:14 +1000 (EST)
Received: from nimrod.itg.telecom.com.au (majere [192.53.254.45])
    by lunitari.nimrod.itg.telecom.com.au (8.9.1/8.9.3) with ESMTP id QAA05880
    for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 16:15:56 +1000 (EST)
Message-ID: <39505D1B.DA335CD2@nimrod.itg.telecom.com.au>
Date: Wed, 21 Jun 2000 16:13:47 +1000
From: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
Organization: IBM Global Services
X-Mailer: Mozilla 4.6 [en] (X11; I; SunOS 5.6 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Big 7.1 open items
References: <000001bfdb3c$db728760$2801007e@tpf.co.jp> <3554.961565037@sss.pgh.pa.us> <20000621004502.A24387@rice.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

"Ross J. Reedstrom" wrote:

> The important point I think is that tablespaces are about physical
> storage/namespace, and SCHEMA are about logical namespace: it would make
> sense for tables from multiple schema to live in the same tablespace,
> as well as tables from one schema to be stored in multiple tablespaces.

If we accept that argument (which sounds good) then wouldn't we have...

data/base/db1/table1 -> ../../tablespace/ts1/db1.table1
data/base/db1/table2 -> ../../tablespace/ts1/db1.table2
data/tablespace/ts1/db1.table1
data/tablespace/ts1/db1.table2

In other words there is a directory for databases, and a directory for
tablespaces. Database tables are symlinked to the appropriate
tablespace. So there are multiple databases per tablespace and multiple
tablespaces per database.
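
The layout above can be sketched as follows. This is a minimal
illustration under stated assumptions, not PostgreSQL code: place_table is
a hypothetical helper, everything is created in a scratch directory, and
the link target is computed with os.path.relpath instead of being
hard-coded.

```python
import os
import tempfile


def place_table(data_dir, db, table, tablespace):
    """Create the real file under tablespace/ and symlink it from base/<db>/.

    Hypothetical helper for illustration only.
    """
    ts_dir = os.path.join(data_dir, "tablespace", tablespace)
    db_dir = os.path.join(data_dir, "base", db)
    os.makedirs(ts_dir, exist_ok=True)
    os.makedirs(db_dir, exist_ok=True)

    # The physical file is named <db>.<table> so that several databases
    # can share one tablespace directory without colliding.
    real_path = os.path.join(ts_dir, f"{db}.{table}")
    with open(real_path, "ab"):
        pass

    # A relative symlink is resolved from the directory holding the link,
    # so the target is computed relative to base/<db>/.
    link_path = os.path.join(db_dir, table)
    os.symlink(os.path.relpath(real_path, start=db_dir), link_path)
    return link_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        data_dir = os.path.join(scratch, "data")
        link = place_table(data_dir, "db1", "table1", "ts1")
        print(link, "->", os.readlink(link))
```

Calling place_table again with another database name and the same
tablespace puts a second file into the same ts1 directory, which is the
"multiple databases per tablespace" case.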

From pgsql-hackers-owner+M3648@hub.org Wed Jun 21 09:01:01 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA06055
    for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 09:01:00 -0400 (EDT)
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id IAA29647 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 08:52:25 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5LCo0112103;
    Wed, 21 Jun 2000 08:50:00 -0400 (EDT)
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5LCnS112011
    for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 08:49:28 -0400 (EDT)
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
    by gandalf.it-austria.net (xxx/xxx) with ESMTP id OAA27330;
    Wed, 21 Jun 2000 14:48:44 +0200
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
    id <M6F90Z93>; Wed, 21 Jun 2000 14:48:44 +0200
Message-ID: <219F68D65015D011A8E000006F8590C605BA5983@sdexcsrv1.f000.d0188.sd.spardat.at>
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
To: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>
Cc: "'pgsql-hackers@postgresql.org'" <pgsql-hackers@postgresql.org>
Subject: AW: [HACKERS] Big 7.1 open items
Date: Wed, 21 Jun 2000 14:48:43 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain;
    charset="iso-8859-1"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO


> > > CREATE LOCATION tabloc IN '/var/private/pgsql';
> > > CREATE TABLE newtab ... IN tabloc;
> >
> > Okay, so we'd have "table spaces" and "database spaces". Seems like one
> > "space" ought to be enough.

Yes, one space should be enough.

>
> Does your "database space" correspond to current PostgreSQL's
> database ?

I think we should think of the "database space" as the default "table space"
for this database.

> And is it different from SCHEMA ?

Please don't mix schema and database, they are two different issues.
Even Oracle has a database, only in Oracle you are limited to one database
per instance. We do not want to add this limitation to PostgreSQL.

Andreas

From e99re41@DoCS.UU.SE Wed Jun 21 10:01:10 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA06585;
    Wed, 21 Jun 2000 10:01:09 -0400 (EDT)
Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id JAA03592; Wed, 21 Jun 2000 09:38:34 -0400 (EDT)
Received: from Ulv.DoCS.UU.SE (e99re41@Ulv.DoCS.UU.SE [130.238.9.167])
    by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id PAA20520;
    Wed, 21 Jun 2000 15:34:34 +0200 (MET DST)
Received: from localhost (e99re41@localhost) by Ulv.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id PAA10847; Wed, 21 Jun 2000 15:34:27 +0200
X-Authentication-Warning: Ulv.DoCS.UU.SE: e99re41 owned process doing -bs
Date: Wed, 21 Jun 2000 15:34:27 +0200 (MET DST)
From: Peter Eisentraut <e99re41@DoCS.UU.SE>
Reply-To: Peter Eisentraut <peter_e@gmx.net>
To: Hiroshi Inoue <Inoue@tpf.co.jp>
cc: Tom Lane <tgl@sss.pgh.pa.us>, Bruce Momjian <pgman@candle.pha.pa.us>,
    Jan Wieck <JanWieck@yahoo.com>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
In-Reply-To: <000001bfdb3c$db728760$2801007e@tpf.co.jp>
Message-ID: <Pine.GSO.4.02A.10006211521410.10570-100000@Ulv.DoCS.UU.SE>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id KAA06585
Status: RO

On Wed, 21 Jun 2000, Hiroshi Inoue wrote:

> Peter seems to have the following idea(?? not sure)
> 2) database = tablespace

No, I thought that a database would have a table space assigned that would
serve as the default for newly created tables, but could be overridden. So
you could group databases onto disks as you want, but a couple of
particularly big/important/unimportant/etc tables from each database could
be put on a different disk. At least this seems to be the most flexible
and conceptually simple solution.

Ideally, directories per database would go away, but then we'd have the
system tables colliding, since those have the same oid in each database.
But that's not really important. So essentially you'd have

    $PGDATA/base/tablespacesomething/database/tables

In the default tablespace, "tablespacesomething" is an ordinary directory,
for other tablespaces it symlinks somewhere else. For those browsing
$PGDATA/base, it all looks the same (unless you have colour ls). For those
browsing the actual storage location it looks like
/var/foo/elsewhere/database/tables.

I'm sure you can squeeze the extension segments in there, maybe between
tablespace and database.

What I think Bruce is saying is that there should be both database spaces
and table spaces, I think that's too much.

> My opinion
> 3) database and tablespace are relatively irrelevant.
> I assume PostgreSQL's database would correspond
> to the concept of SCHEMA.

A database corresponds to a catalog and a schema corresponds to nothing
yet.


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden


From e99re41@DoCS.UU.SE Wed Jun 21 10:01:09 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA06582;
    Wed, 21 Jun 2000 10:01:08 -0400 (EDT)
Received: from meryl.it.uu.se (root@meryl.it.uu.se [130.238.12.42]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id JAA04510; Wed, 21 Jun 2000 09:43:48 -0400 (EDT)
Received: from Ulv.DoCS.UU.SE (e99re41@Ulv.DoCS.UU.SE [130.238.9.167])
    by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id PAA20730;
    Wed, 21 Jun 2000 15:39:23 +0200 (MET DST)
Received: from localhost (e99re41@localhost) by Ulv.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id PAA10853; Wed, 21 Jun 2000 15:39:16 +0200
X-Authentication-Warning: Ulv.DoCS.UU.SE: e99re41 owned process doing -bs
Date: Wed, 21 Jun 2000 15:39:16 +0200 (MET DST)
From: Peter Eisentraut <e99re41@DoCS.UU.SE>
Reply-To: Peter Eisentraut <peter_e@gmx.net>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Jan Wieck <JanWieck@yahoo.com>, Tom Lane <tgl@sss.pgh.pa.us>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-Reply-To: <200006201753.NAA27293@candle.pha.pa.us>
Message-ID: <Pine.GSO.4.02A.10006211536260.10570-100000@Ulv.DoCS.UU.SE>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id KAA06582
Status: ROr

On Tue, 20 Jun 2000, Bruce Momjian wrote:

> What I was suggesting is not to catalog the symlink locations, but to
> use lstat when dumping, so that admins can move files around using
> symlinks and not have to update the database.

That surely wouldn't make those happy that are calling for smgr
abstraction.
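
The lstat idea quoted above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not how any dump tool
works: find_relocated is a hypothetical helper, and it assumes that a
relocated table shows up as a symlink inside the database directory.

```python
import os
import stat
import tempfile


def find_relocated(db_dir):
    """Return {entry name: link target} for entries that are symlinks.

    Hypothetical helper: nothing is read from a catalog, the locations
    are discovered from the filesystem alone.
    """
    relocated = {}
    for name in sorted(os.listdir(db_dir)):
        path = os.path.join(db_dir, name)
        # lstat() inspects the link itself; stat() would follow it.
        if stat.S_ISLNK(os.lstat(path).st_mode):
            relocated[name] = os.readlink(path)
    return relocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        db_dir = os.path.join(scratch, "db1")
        elsewhere = os.path.join(scratch, "elsewhere")
        os.mkdir(db_dir)
        os.mkdir(elsewhere)
        # One table stored in place, one moved away and symlinked back.
        open(os.path.join(db_dir, "t_local"), "wb").close()
        moved = os.path.join(elsewhere, "t_moved")
        open(moved, "wb").close()
        os.symlink(moved, os.path.join(db_dir, "t_moved"))
        print(find_relocated(db_dir))
```

The sketch also shows the cost Peter points at: the caller has to know
that storage is a directory of files and symlinks, which is exactly what
a storage manager abstraction is meant to hide.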

--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden


From tgl@sss.pgh.pa.us Wed Jun 21 11:31:09 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08120;
    Wed, 21 Jun 2000 11:31:08 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA13232; Wed, 21 Jun 2000 11:08:38 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA04286;
    Wed, 21 Jun 2000 11:07:20 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Peter Eisentraut <peter_e@gmx.net>,
    Jan Wieck <JanWieck@Yahoo.com>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006210433.AAA18343@candle.pha.pa.us>
References: <200006210433.AAA18343@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Wed, 21 Jun 2000 00:33:01 -0400"
Date: Wed, 21 Jun 2000 11:07:20 -0400
Message-ID: <4283.961600040@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Yes, agreed. I was thinking this:
>     CREATE TABLESPACE loc USING '/var/pgsql'
> does:
>     ln -s /var/pgsql/dbname/loc data/base/dbname/loc
> In this way, the database has a view of its main directory, plus a /loc
> subdirectory for the tablespace. In the other location, we have
> /var/pgsql/dbname/loc because this allows different databases to use:
>     CREATE TABLESPACE loc USING '/var/pgsql'
> and they do not collide with each other in /var/pgsql.

But they don't collide anyway, because the dbname is already unique.
Isn't the extra subdirectory a waste?

Because table files will have installation-wide unique names, there's
no really good reason to have either level of subdirectory; you could
just make
    CREATE TABLESPACE loc USING '/var/pgsql'
do
    ln -s /var/pgsql data/base/dbname/loc
and it'd still work even if multiple DBs were using the same tablespace.

However, forcing creation of a subdirectory does give you the chance to
make sure the subdir is owned by postgres and has the right permissions,
so there's something to be said for that. It might be reasonable to do
    mkdir /var/pgsql/dbname
    chmod 700 /var/pgsql/dbname
    ln -s /var/pgsql/dbname data/base/dbname/loc
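
The three shell steps above can be mirrored in a short sketch. It is
illustrative only and makes assumptions that are not in the thread:
create_tablespace is a hypothetical helper, the paths are stand-ins inside
a scratch directory, and ownership by the postgres user is not handled.

```python
import os
import stat
import tempfile


def create_tablespace(location, data_dir, dbname, spacename):
    """Mirror the shell steps: mkdir, chmod 700, ln -s (hypothetical helper)."""
    target = os.path.join(location, dbname)      # e.g. /var/pgsql/dbname
    os.mkdir(target)
    os.chmod(target, 0o700)                      # access for the owner only
    link = os.path.join(data_dir, "base", dbname, spacename)
    os.symlink(target, link)                     # e.g. data/base/dbname/loc
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        location = os.path.join(scratch, "var_pgsql")   # stands in for /var/pgsql
        data_dir = os.path.join(scratch, "data")
        os.makedirs(location)
        os.makedirs(os.path.join(data_dir, "base", "dbname"))
        link = create_tablespace(location, data_dir, "dbname", "loc")
        # os.stat() follows the link, so this is the mode of the new subdir.
        print(oct(stat.S_IMODE(os.stat(link).st_mode)))
```

Because os.mkdir fails if the per-database subdirectory already exists, a
second attempt to create the same tablespace for the same database is
rejected instead of silently reusing a directory with unknown permissions.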

regards, tom lane

From lockhart@alumni.caltech.edu Wed Jun 21 11:31:10 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08135;
    Wed, 21 Jun 2000 11:31:09 -0400 (EDT)
Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA15864; Wed, 21 Jun 2000 11:30:06 -0400 (EDT)
Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203])
    by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id IAA02881;
    Wed, 21 Jun 2000 08:26:40 -0700 (PDT)
Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1])
    by golem.jpl.nasa.gov (Postfix) with ESMTP
    id AB8AE6F51; Wed, 21 Jun 2000 15:27:36 +0000 (UTC)
Sender: lockhart@mythos.jpl.nasa.gov
Message-ID: <3950DEE8.2DB4B401@alumni.caltech.edu>
Date: Wed, 21 Jun 2000 15:27:36 +0000
From: Thomas Lockhart <lockhart@alumni.caltech.edu>
Organization: Yes
X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
Cc: Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    Tom Lane <tgl@sss.pgh.pa.us>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
References: <200006211511.LAA07416@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: RO

> Sorry, disagree. Environment variables are a pain to administer, and
> quite counter-intuitive.

Well, I guess we disagree. But until we have a complete proposed
solution, we should leave environment variables on the table, since they
*do* allow some decoupling of logical and physical storage, and *do*
give the administrator some control over resources *that the admin would
not otherwise have*.

> > istm that from a portability and evolutionary standpoint OID-only
> > file names (or at least file names *not* based on relation/class
> > names) is a requirement.
> Maybe a requirement at some point for some installations, but I hope
> not a general requirement.

If a table name can have characters which are not legal for file names,
then how would you propose to support it? If we are doing a
restructuring of the storage scheme, this should be taken into account.

lockhart=# create table "one/two" (i int);
ERROR: cannot create one/two

Why not? It demonstrates an unfortunate linkage between file systems and
database resources.
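
The linkage can be reproduced outside the database with a small sketch.
It is an illustration under stated assumptions, not the backend's code:
relname_path, oid_path and try_create are hypothetical helpers, and 18720
is a made-up identifier.

```python
import os
import tempfile


def relname_path(db_dir, relname):
    """File named after the table: only works for legal file names."""
    return os.path.join(db_dir, relname)


def oid_path(db_dir, rel_oid):
    """File named after a numeric identifier: independent of the table name."""
    return os.path.join(db_dir, str(rel_oid))


def try_create(path):
    """Try to create a new empty file; report success instead of raising."""
    try:
        with open(path, "xb"):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as db_dir:
        # "one/two" is read as file "two" in a missing directory "one".
        print(try_create(relname_path(db_dir, "one/two")))
        # The numeric name never depends on what the table is called.
        print(try_create(oid_path(db_dir, 18720)))
```

With name-based files the set of legal table names is whatever the
filesystem accepts; with numeric files the mapping from name to file has
to live in the catalogs instead.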

- Thomas

From tgl@sss.pgh.pa.us Wed Jun 21 11:31:18 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08164;
    Wed, 21 Jun 2000 11:31:12 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA15786; Wed, 21 Jun 2000 11:29:30 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA04451;
    Wed, 21 Jun 2000 11:28:09 -0400 (EDT)
To: Thomas Lockhart <lockhart@alumni.caltech.edu>
cc: Bruce Momjian <pgman@candle.pha.pa.us>, Peter Eisentraut <peter_e@gmx.net>,
    Jan Wieck <JanWieck@Yahoo.com>, Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <39505061.F42334AB@alumni.caltech.edu>
References: <200006201753.NAA27293@candle.pha.pa.us> <39505061.F42334AB@alumni.caltech.edu>
Comments: In-reply-to Thomas Lockhart <lockhart@alumni.caltech.edu>
    message dated "Wed, 21 Jun 2000 05:19:29 -0000"
Date: Wed, 21 Jun 2000 11:28:09 -0400
Message-ID: <4448.961601289@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Well, as y'all have noticed, I think there are strong reasons to use
> environment variables to manage locations, and that symlinks are a
> potential portability and robustness problem.

Reasons? Evidence?

> An additional point which has relevance to this whole discussion:
> In the future we may allow system resources such as tables to carry names
> which use multi-byte encodings. afaik these encodings are not allowed to
> be used for physical file names, and even if they were the utility of
> using standard operating system utilities like ls goes way down.

Good point, although in one sense a string is a string --- as long as
we don't allow embedded nulls in server-side encodings, we could use
anything that Postgres thought was a name in a filename, and the OS
should take it. But if your local ls doesn't show it the way you see
in Postgres, the usefulness of having the tablename in the filename
goes way down.

> istm that from a portability and evolutionary standpoint OID-only file
> names (or at least file names *not* based on relation/class names) is a
> requirement.

No argument from me ;-). I've been looking for compromise positions
but I still think that pure numeric filenames are the cleanest solution.

There's something else that should be taken into account: for WAL, the
log will need to record the table file that each insert/delete/update
operation affects. To do that with the smgr-token-is-a-pathname
approach I was suggesting yesterday, I think you have to record the
database name and pathname in each WAL log entry. That's 64 bytes/log
entry which is a *lot*. If we bit the bullet and restricted ourselves
to numeric filenames then the log would need just four numeric values:
    database OID
    tablespace OID
    relation OID
    relation version number
(this set of 4 values would also be an smgr file reference token).
16 bytes/log entry looks much better than 64.
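
The size comparison can be checked with a short sketch. It rests on
assumptions that are not stated in the message: each of the four values is
taken to be a 4-byte unsigned integer, and the 64 bytes are taken to be
two fixed-width 32-byte name fields. The layouts and helper names are
illustrative and are not the WAL record format.

```python
import struct

NAMEDATALEN = 32  # assumption: width of one fixed-size name field

# Four unsigned 32-bit values: database, tablespace, relation, version.
FILE_TOKEN = struct.Struct("<4I")
# Two fixed-width, NUL-padded name fields: database name and pathname.
NAME_TOKEN = struct.Struct(f"<{NAMEDATALEN}s{NAMEDATALEN}s")


def pack_file_token(db_oid, tablespace_oid, rel_oid, rel_version):
    """Numeric file reference token (hypothetical layout)."""
    return FILE_TOKEN.pack(db_oid, tablespace_oid, rel_oid, rel_version)


def pack_name_token(dbname, pathname):
    """Name-based file reference token (hypothetical layout)."""
    return NAME_TOKEN.pack(dbname.encode(), pathname.encode())


if __name__ == "__main__":
    # The identifier values below are made up for the example.
    numeric = pack_file_token(18719, 1, 18720, 0)
    named = pack_name_token("db1", "loc/table1")
    print(len(numeric), len(named))
```

Under these assumptions the numeric token is a quarter of the size of the
name-based one, and its size does not depend on how long the names are.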

At the moment I can recall the following opinions:

Pure OID filenames: Thomas, Tom, Marc, Peter E.

OID+relname filenames: Bruce

Vadim was in the pure-OID camp a few months ago, but I won't presume
to list him there now since he hasn't been involved in this most
recent round of discussions. I'm not sure where anyone else stands...
but at least in terms of the core group it's pretty clear where the
majority opinion is.

regards, tom lane

From lamar.owen@wgcr.org Wed Jun 21 11:51:39 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA09021;
    Wed, 21 Jun 2000 11:51:38 -0400 (EDT)
Received: from www.wgcr.org (IDENT:root@www.wgcr.org [206.74.232.194]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA18613; Wed, 21 Jun 2000 11:51:48 -0400 (EDT)
Received: from wgcr.org ([206.74.232.197])
    by www.wgcr.org (8.9.3/8.9.3/WGCR) with ESMTP id LAA19124;
    Wed, 21 Jun 2000 11:48:25 -0400
Message-ID: <3950E3C3.7322BD70@wgcr.org>
Date: Wed, 21 Jun 2000 11:48:19 -0400
From: Lamar Owen <lamar.owen@wgcr.org>
X-Mailer: Mozilla 4.61 [en] (Win95; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
CC: Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    Bruce Momjian <maillist@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
References: <200006201753.NAA27293@candle.pha.pa.us> <39505061.F42334AB@alumni.caltech.edu> <4448.961601289@sss.pgh.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: ROr

Tom Lane wrote:

> Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> > Well, as y'all have noticed, I think there are strong reasons to use
> > environment variables to manage locations, and that symlinks are a
> > potential portability and robustness problem.

> Reasons? Evidence?

Does Win32 do symlinks these days? I know Win32 does envvars, and Win32
is currently a supported platform.

I'm not thrilled with either solution -- envvars have their problems
just as surely as symlinks do.

> At the moment I can recall the following opinions:

> Pure OID filenames: Thomas, Tom, Marc, Peter E.

FWIW, count me here. I have tried administering my system using the
|
||
filenames -- and have been bitten. Better admin tools in the PostgreSQL
|
||
package beat using standard filesystem tools -- the PostgreSQL tools can
|
||
be WAL-aware, transaction-aware, and can provide consistent results.
|
||
Filesystem tools never will be able to provide consistent results for a
|
||
database system that must remain up 24x7, as many if not most PostgreSQL
|
||
installations must.
|
||
|
||
> OID+relname filenames: Bruce
|
||
|
||
Sorry Bruce -- I understand and am sympathetic to your position, and, at
|
||
one time, I agreed with it. But not any more.
|
||
|
||
--
|
||
Lamar Owen
|
||
WGCR Internet Radio
|
||
1 Peter 4:11
|
||
|
||
From tgl@sss.pgh.pa.us Wed Jun 21 12:10:06 2000
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA09885
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 12:10:04 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04789;
|
||
Wed, 21 Jun 2000 12:10:15 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Peter Eisentraut <peter_e@gmx.net>,
|
||
Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006211545.LAA08773@candle.pha.pa.us>
|
||
References: <200006211545.LAA08773@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Wed, 21 Jun 2000 11:45:12 -0400"
|
||
Date: Wed, 21 Jun 2000 12:10:15 -0400
|
||
Message-ID: <4786.961603815@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> Yes, that is true. My idea is that they may want to create loc1 and
|
||
> loc2 which initially point to the same location, but later may be moved.
|
||
> For example, one tablespace for tables, another for indexes. They may
|
||
> initially point to the same directory, but later be split.
|
||
|
||
Well, that opens up a completely different issue, which is what about
|
||
moving tables from one tablespace to another?
|
||
|
||
I think the way you appear to be implying above (shut down the server
|
||
so that you can rearrange subdirectories by hand) is the wrong way to
|
||
go about it. For one thing, lots of people don't want to shut down
|
||
their servers completely for that long, but it's difficult to avoid
|
||
doing so if you want to move files by filesystem commands. For another
|
||
thing, the above approach requires guessing in advance --- maybe long
|
||
in advance --- how you are going to want to repartition your database
|
||
when it gets too big for your existing storage.
|
||
|
||
The right way to address this problem is to invent a "move table to
|
||
new tablespace" command. This'd be pretty trivial to implement based
|
||
on a file-versioning approach: the new version of the pg_class tuple
|
||
has a new tablespace identifier in it.
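
A minimal sketch of that idea in C, under stated assumptions: the struct
RelFileEntry and the function move_to_tablespace are hypothetical
stand-ins for the relevant pg_class fields, not actual catalog code. The
point is only that a move produces a new entry and leaves the old one
untouched, so the old file stays valid until the new version is
committed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of a pg_class entry that locates a table's file. */
typedef struct RelFileEntry
{
    uint32_t relation;    /* relation OID, unchanged by a move */
    uint32_t tablespace;  /* tablespace OID */
    uint32_t version;     /* file version number */
} RelFileEntry;

/*
 * Build the new catalog entry for a table moved to another tablespace.
 * The old entry is passed by value and is not modified; the new one
 * carries the new tablespace and the next file version.
 */
static RelFileEntry move_to_tablespace(RelFileEntry old, uint32_t new_tablespace)
{
    RelFileEntry next = old;

    next.tablespace = new_tablespace;
    next.version = old.version + 1;
    return next;
}
```

Copying the data into the file named by the new entry, and removing the
old file after commit, are left out of this sketch.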
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3670@hub.org Wed Jun 21 12:30:42 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA10371
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 12:30:41 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA22315 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 12:23:18 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5LGJU175424;
|
||
Wed, 21 Jun 2000 12:19:30 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5LGJJ175359
|
||
for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 12:19:19 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04878;
|
||
Wed, 21 Jun 2000 12:17:38 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Lamar Owen <lamar.owen@wgcr.org>,
|
||
Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006211603.MAA09414@candle.pha.pa.us>
|
||
References: <200006211603.MAA09414@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Wed, 21 Jun 2000 12:03:12 -0400"
|
||
Date: Wed, 21 Jun 2000 12:17:37 -0400
|
||
Message-ID: <4875.961604257@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
>> Sorry Bruce -- I understand and am sympathetic to your position, and, at
|
||
>> one time, I agreed with it. But not any more.
|
||
|
||
> I thought the most recent proposal was to just throw ~16 chars of the
|
||
> file name on the end of the file name, and that should not be used for
|
||
> anything except visibility. WAL would not need to store that. It could
|
||
> just grab the file name that matches the oid/sequence number.
|
||
|
||
But that's extra complexity in WAL, plus extra complexity in renaming
|
||
tables (if you want the filename to track the logical table name, which
|
||
I expect you would), plus extra complexity in smgr and bufmgr and other
|
||
places.
|
||
|
||
I think people are coming around to the notion that it's better to keep
|
||
these low-level operations simple, even if we need to expend more work
|
||
on high-level admin tools as a result.
|
||
|
||
But we do need to remember to expend that effort on tools! Let's not
|
||
drop the ball on that, folks.
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Wed Jun 21 12:30:40 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA10364
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 12:30:38 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA22593 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 12:25:58 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04944;
|
||
Wed, 21 Jun 2000 12:24:44 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Peter Eisentraut <peter_e@gmx.net>,
|
||
Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006211614.MAA09938@candle.pha.pa.us>
|
||
References: <200006211614.MAA09938@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Wed, 21 Jun 2000 12:14:59 -0400"
|
||
Date: Wed, 21 Jun 2000 12:24:44 -0400
|
||
Message-ID: <4941.961604684@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
>> Well, that opens up a completely different issue, which is what about
|
||
>> moving tables from one tablespace to another?
|
||
|
||
> Are you suggesting that doing dbname/locname is somehow harder to do
|
||
> that? If you are, I don't understand why.
|
||
|
||
It doesn't make it harder, but it still seems pointless to have the
|
||
extra directory level. Bear in mind that if we go with all-OID
|
||
filenames then you're not going to be looking at "loc1" and "loc2"
|
||
anyway, but at "5938171" and "8583727". It's not much of a convenience
|
||
to the admin to see that, so we might as well save a level of directory
|
||
lookup.
|
||
|
||
> The general issue of moving tables between tablespaces can be done from
|
||
> in the database. I don't think it is reasonable to shut down the db to
|
||
> do that. However, I can see moving tablespaces to different symlinked
|
||
> locations may require a shutdown.
|
||
|
||
Only if you insist on doing it outside the database using filesystem
|
||
tools. Another way is to create a new tablespace in the desired new
|
||
location, then move the tables one-by-one to that new tablespace.
|
||
|
||
I suppose either one might be preferable depending on your access
|
||
patterns --- locking your most critical tables while they're being moved
|
||
might be as bad as a total shutdown.
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Wed Jun 21 13:01:06 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA11366
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 13:01:05 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA24726 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 12:47:50 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA05112;
|
||
Wed, 21 Jun 2000 12:46:34 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Peter Eisentraut <peter_e@gmx.net>,
|
||
Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006211640.MAA10498@candle.pha.pa.us>
|
||
References: <200006211640.MAA10498@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Wed, 21 Jun 2000 12:40:35 -0400"
|
||
Date: Wed, 21 Jun 2000 12:46:34 -0400
|
||
Message-ID: <5109.961605994@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
>>>> Are you suggesting that doing dbname/locname is somehow harder to do
|
||
>>>> that? If you are, I don't understand why.
|
||
>>
|
||
>> It doesn't make it harder, but it still seems pointless to have the
|
||
>> extra directory level. Bear in mind that if we go with all-OID
|
||
>> filenames then you're not going to be looking at "loc1" and "loc2"
|
||
>> anyway, but at "5938171" and "8583727". It's not much of a convenience
|
||
>> to the admin to see that, so we might as well save a level of directory
|
||
>> lookup.
|
||
|
||
> Just seems easier to have stuff segregated into separate per-db
|
||
> directories for clarity. Also, as directories get bigger, finding a
|
||
> specific file in there becomes harder. Putting 10 databases all in the
|
||
> same directory seems bad in this regard.
|
||
|
||
Huh? I wasn't arguing against making a db-specific directory below the
|
||
tablespace point. I was arguing against making *another* directory
|
||
below that one.
|
||
|
||
> I don't think we want to be using
|
||
> symlinks for tables if we can avoid it.
|
||
|
||
Agreed, but where did that come from? None of these proposals mentioned
|
||
symlinks for anything but directories, AFAIR.
|
||
|
||
regards, tom lane
|
||
|
||
From peter@localhost.its.uu.se Wed Jun 21 14:31:13 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA13233
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 14:31:13 -0400 (EDT)
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id OAA04201 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 14:11:42 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:34923 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S385153AbQFUSJq>; Wed, 21 Jun 2000 20:09:46 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 134p2o-0000Uo-00; Wed, 21 Jun 2000 20:16:10 +0200
|
||
Date: Wed, 21 Jun 2000 20:16:10 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: Bruce Momjian <pgman@candle.pha.pa.us>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
Jan Wieck <JanWieck@yahoo.com>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
Don Baccus <dhogaza@pacifier.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <29686.961511764@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0006201906100.4054-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
Sender: Peter Eisentraut <peter@Op.Net>
|
||
Status: ROr
|
||
|
||
Tom Lane writes:
|
||
|
||
> I think Peter was holding out for storing purely numeric tablespace OID
|
||
> and table version in pg_class and having a hardwired mapping to pathname
|
||
> somewhere in smgr. However, I think that doing it that way gains only
|
||
> micro-efficiency compared to passing a "name" around, while using the
|
||
> name approach buys us flexibility that's needed for at least some of
|
||
> the variants under discussion.
|
||
|
||
But that name can only be a dozen or so characters, contain no slash or
|
||
other funny characters, etc. That's really poor. Then the alternative is
|
||
to have an internal name and an external canonical name. Then you have two
|
||
names to worry about. Also consider that when you store both the table
|
||
space oid and the internal name in pg_class you create redundant data.
|
||
What if you rename the table space? Do you leave the internal name out of
|
||
sync? Then what good is the internal name? I'm just concerned that we are
creating at the table space level problems similar to those we're trying to
get rid of at the relation and database level.
|
||
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
From tgl@sss.pgh.pa.us Wed Jun 21 18:14:19 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA24147
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 18:14:18 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id RAA24649 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 17:40:59 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA06031;
|
||
Wed, 21 Jun 2000 17:39:38 -0400 (EDT)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Peter Eisentraut <peter_e@gmx.net>, Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
Jan Wieck <JanWieck@yahoo.com>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
Don Baccus <dhogaza@pacifier.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <200006211842.OAA13514@candle.pha.pa.us>
|
||
References: <200006211842.OAA13514@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Wed, 21 Jun 2000 14:42:21 -0400"
|
||
Date: Wed, 21 Jun 2000 17:39:38 -0400
|
||
Message-ID: <6028.961623578@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
>> But that name can only be a dozen or so characters, contain no slash or
|
||
>> other funny characters, etc. That's really poor. Then the alternative is
|
||
>> to have an internal name and an external canonical name. Then you have two
|
||
>> names to worry about. Also consider that when you store both the table
|
||
>> space oid and the internal name in pg_class you create redundant data.
|
||
>> What if you rename the table space? Do you leave the internal name out of
|
||
>> sync? Then what good is the internal name? I'm just concerned that we are
>> creating at the table space level problems similar to those we're trying to
>> get rid of at the relation and database level.
|
||
|
||
> Agreed. Having table spaces stored by directories named by oid just
|
||
> seems very complicated for no reason.
|
||
|
||
Huh? He just gave you two very good reasons: avoid Unix-derived
|
||
limitations on the naming of tablespaces (and tables), and avoid
|
||
problems with renaming tablespaces.
|
||
|
||
I'm pretty much firmly back in the "OID and nothing but" camp.
|
||
Or perhaps I should say "OID, file version, and nothing but",
|
||
since we still need a version number to do CLUSTER etc.
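
For illustration, an "OID, file version, and nothing but" scheme could
map to a pathname roughly as follows. This is a sketch only: the layout
<tablespace dir>/<database OID>/<relation OID>.<version> and the
function name build_rel_path are assumptions made for the example, not a
layout anyone in this thread has settled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<tablespace_dir>/<db_oid>/<rel_oid>.<version>" into buf.
 * Returns 0 on success, or -1 if formatting failed or the result did
 * not fit in buflen bytes (including the terminating NUL).
 */
static int build_rel_path(char *buf, size_t buflen,
                          const char *tablespace_dir,
                          uint32_t db_oid, uint32_t rel_oid,
                          uint32_t version)
{
    int n = snprintf(buf, buflen, "%s/%u/%u.%u",
                     tablespace_dir,
                     (unsigned int) db_oid,
                     (unsigned int) rel_oid,
                     (unsigned int) version);

    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return 0;
}
```

Because every component is numeric, renaming a table, database or
tablespace never changes the path, and a rewrite such as CLUSTER only
bumps the version suffix.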
|
||
|
||
regards, tom lane
|
||
|
||
From vmikheev@SECTORBASE.COM Wed Jun 21 22:18:38 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07570;
|
||
Wed, 21 Jun 2000 22:18:36 -0400 (EDT)
|
||
Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA29965; Wed, 21 Jun 2000 19:07:37 -0400 (EDT)
|
||
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
|
||
id <MCTD2WLM>; Wed, 21 Jun 2000 15:58:30 -0700
|
||
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
|
||
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
Thomas Lockhart
|
||
<lockhart@alumni.caltech.edu>
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
Peter Eisentraut
|
||
<peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
Hiroshi Inoue
|
||
<Inoue@tpf.co.jp>,
|
||
Bruce Momjian <maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Wed, 21 Jun 2000 16:00:17 -0700
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2650.21)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
> If we bit the bullet and restricted ourselves to numeric filenames then
|
||
> the log would need just four numeric values:
|
||
> database OID
|
||
> tablespace OID
|
||
|
||
Is someone going to implement it for 7.1?
|
||
|
||
> relation OID
|
||
> relation version number
|
||
|
||
I believe that we can avoid versions using WAL...
|
||
|
||
> (this set of 4 values would also be an smgr file reference token).
|
||
> 16 bytes/log entry looks much better than 64.
|
||
>
|
||
> At the moment I can recall the following opinions:
|
||
>
|
||
> Pure OID filenames: Thomas, Tom, Marc, Peter E.
|
||
|
||
+ me.
|
||
|
||
But what about LOCATIONs? I object to using environment variables and think
that locations must be stored in pg_control..?
|
||
|
||
Vadim
|
||
|
||
From Inoue@tpf.co.jp Wed Jun 21 22:18:39 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07573;
|
||
Wed, 21 Jun 2000 22:18:38 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA01857; Wed, 21 Jun 2000 19:37:04 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id IAA02627; Thu, 22 Jun 2000 08:35:27 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 08:37:42 +0900
|
||
Message-ID: <000201bfdbd9$b1985580$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-2022-jp"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <4448.961601289@sss.pgh.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> No argument from me ;-). I've been looking for compromise positions
|
||
> but I still think that pure numeric filenames are the cleanest solution.
|
||
>
|
||
> There's something else that should be taken into account: for WAL, the
|
||
> log will need to record the table file that each insert/delete/update
|
||
> operation affects. To do that with the smgr-token-is-a-pathname
|
||
> approach I was suggesting yesterday, I think you have to record the
|
||
> database name and pathname in each WAL log entry. That's 64 bytes/log
|
||
> entry which is a *lot*. If we bit the bullet and restricted ourselves
|
||
> to numeric filenames then the log would need just four numeric values:
|
||
> database OID
|
||
> tablespace OID
|
||
|
||
I strongly object to keeping the tablespace OID in the smgr file reference
token, though of course we have to keep it for another purpose. I've said
many times that tablespace (where to store) info should be distinguished
from *where it is stored* info. Generally a tablespace isn't sufficiently
restrictive for this purpose: e.g. there was an idea about round-robin, and
an Oracle tablespace can have multiple files, etc.
IMHO, it is misleading to use the tablespace OID as (a part of) the
reference token.
|
||
|
||
> relation OID
|
||
> relation version number
|
||
> (this set of 4 values would also be an smgr file reference token).
|
||
> 16 bytes/log entry looks much better than 64.
|
||
>
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
|
||
From Inoue@tpf.co.jp Wed Jun 21 22:18:15 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07540;
|
||
Wed, 21 Jun 2000 22:18:11 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id UAA04100; Wed, 21 Jun 2000 20:15:09 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id JAA02691; Thu, 22 Jun 2000 09:14:15 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 09:16:30 +0900
|
||
Message-ID: <000301bfdbdf$1d0dd920$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM]
|
||
>
|
||
> > If we bit the bullet and restricted ourselves to numeric filenames then
|
||
> > the log would need just four numeric values:
|
||
> > database OID
|
||
> > tablespace OID
|
||
>
|
||
> Is someone going to implement it for 7.1?
|
||
>
|
||
> > relation OID
|
||
> > relation version number
|
||
>
|
||
> I believe that we can avoid versions using WAL...
|
||
>
|
||
|
||
How would tables be reconstructed in place?
Is the following right?
1) save the content of the current table somewhere
2) shrink the table and related indexes
3) reload the saved (+ some filtering) content
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From Inoue@tpf.co.jp Wed Jun 21 22:18:16 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07553;
|
||
Wed, 21 Jun 2000 22:18:15 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id UAA05872; Wed, 21 Jun 2000 20:44:21 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id JAA02750; Thu, 22 Jun 2000 09:43:31 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 09:45:46 +0900
|
||
Message-ID: <000401bfdbe3$3420fee0$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2C@SECTORBASE1>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM]
|
||
>
|
||
> > > > relation version number
|
||
> > >
|
||
> > > I believe that we can avoid versions using WAL...
|
||
> > >
|
||
> >
|
||
> > How to re-construct tables in place ?
|
||
> > Is the following right ?
|
||
> > 1) save the content of current table to somewhere
|
||
> > 2) shrink the table and related indexes
|
||
> > 3) reload the saved(+some filtering) content
|
||
>
|
||
> Or - create tmp file and load with new content; log "intent to relink
> table file"; relink table file; log "file is relinked".
|
||
>
|
||
|
||
It seems to me that the whole content of the table should be
logged before relinking or shrinking.
Is my understanding right?
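
For reference, the relink sequence quoted above might look roughly like
this in C. It is a sketch under several assumptions: the in-memory array
wal_log stands in for the real log, the function and file names are
hypothetical, and rename() is assumed to replace the target atomically
as it does on POSIX systems. It does not log the table content itself,
so it does not answer the question of what must be logged first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LOG 8

/* Stand-in for the WAL: an in-memory list of log messages. */
static const char *wal_log[MAX_LOG];
static int wal_len = 0;

static void wal_append(const char *msg)
{
    if (wal_len < MAX_LOG)
        wal_log[wal_len++] = msg;
}

/*
 * Rebuild a table file by writing the new content to a temporary file,
 * logging the intent, renaming the temporary file over the table file,
 * and logging completion. Returns 0 on success, -1 on failure.
 */
static int rebuild_by_relink(const char *table_path, const char *tmp_path,
                             const char *new_content)
{
    FILE *f = fopen(tmp_path, "w");

    if (f == NULL)
        return -1;
    if (fputs(new_content, f) == EOF)
    {
        fclose(f);
        remove(tmp_path);
        return -1;
    }
    if (fclose(f) != 0)
    {
        remove(tmp_path);
        return -1;
    }

    wal_append("intent to relink table file");
    if (rename(tmp_path, table_path) != 0)
        return -1;      /* intent is logged; recovery must resolve it */
    wal_append("file is relinked");
    return 0;
}
```

If a crash happens between the two log records, recovery sees the intent
without the completion and can check which file is in place.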
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From pgsql-hackers-owner+M3700@hub.org Wed Jun 21 22:17:59 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07504
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 22:17:58 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id VAA07914 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 21:23:22 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5M1It194420;
|
||
Wed, 21 Jun 2000 21:18:55 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5M1Ig194334
|
||
for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 21:18:43 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id KAA02808; Thu, 22 Jun 2000 10:12:45 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 10:15:01 +0900
|
||
Message-ID: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-2022-jp"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <4448.961601289@sss.pgh.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> At the moment I can recall the following opinions:
|
||
>
|
||
> Pure OID filenames: Thomas, Tom, Marc, Peter E.
|
||
>
|
||
> OID+relname filenames: Bruce
|
||
>
|
||
|
||
Please add my opinion to the list.
|
||
|
||
Unique-id filename: Hiroshi
|
||
(Unique-id is unrelated to OID/relname).
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From pgsql-hackers-owner+M3701@hub.org Wed Jun 21 22:18:02 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07513
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 22:18:01 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id VAA08502 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 21:33:13 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5M1QS107400;
|
||
Wed, 21 Jun 2000 21:26:28 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5M1QA107223
|
||
for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 21:26:10 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id KAA02831; Thu, 22 Jun 2000 10:25:11 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 10:27:26 +0900
|
||
Message-ID: <000601bfdbe9$0658a980$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2D@SECTORBASE1>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM]
|
||
>
|
||
> > > Or - create tmp file and load with new content;
|
||
> > > log "intent to relink table file";
|
||
> > > relink table file; log "file is relinked".
|
||
> >
|
||
> > It seems to me that whole content of the table should be
|
||
> > logged before relinking or shrinking.
|
||
>
|
||
> Why not just fsync tmp files?
|
||
>
|
||
|
||
I have probably misunderstood *relink*.
Is *relink* different from *rename*?
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From vmikheev@SECTORBASE.COM Wed Jun 21 22:17:52 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07492;
|
||
Wed, 21 Jun 2000 22:17:51 -0400 (EDT)
|
||
Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id VAA08730; Wed, 21 Jun 2000 21:37:44 -0400 (EDT)
|
||
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
|
||
id <MCTD2WWC>; Wed, 21 Jun 2000 18:28:36 -0700
|
||
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C2F@SECTORBASE1>
|
||
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
To: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
Peter Eisentraut
|
||
<peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
Bruce Momjian
|
||
<maillist@candle.pha.pa.us>,
|
||
PostgreSQL-development
|
||
<pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
Thomas Lockhart
|
||
<lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Wed, 21 Jun 2000 18:30:23 -0700
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2650.21)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
> > > > Or - create tmp file and load with new content;
|
||
> > > > log "intent to relink table file";
|
||
> > > > relink table file; log "file is relinked".
|
||
> > >
|
||
> > > It seems to me that whole content of the table should be
|
||
> > > logged before relinking or shrinking.
|
||
> >
|
||
> > Why not just fsync tmp files?
|
||
> >
|
||
>
|
||
> I have probably misunderstood *relink*.
> Is *relink* different from *rename*?
|
||
|
||
I meant something like this: link(table file, tmp2 file); fsync(tmp2 file);
unlink(table file); link(tmp file, table file); fsync(table file);
unlink(tmp file). We can do additional logging (with log flush) of these
steps if required, and postpone on-recovery redo of operations until the
last relink log record, end of log, transaction abort, etc.
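
The relink sequence described in this message can be sketched as follows.
This is only an illustration in Python, not PostgreSQL code: the function
names relink, rollback and fsync_path are invented for the example, and
the rollback step is an assumption about how an abort might restore the
old file from the tmp2 link. The sketch also does not fsync the containing
directory or write any log records.

```python
import os

def fsync_path(path):
    """Flush one file's contents to stable storage."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def relink(table_path, tmp_path, backup_path):
    """Install tmp_path as table_path, keeping the old file at backup_path."""
    os.link(table_path, backup_path)   # link(table file, tmp2 file)
    fsync_path(backup_path)            # fsync(tmp2 file)
    os.unlink(table_path)              # unlink(table file)
    os.link(tmp_path, table_path)      # link(tmp file, table file)
    fsync_path(table_path)             # fsync(table file)
    os.unlink(tmp_path)                # unlink(tmp file)

def rollback(table_path, backup_path):
    """Hypothetical abort path: put the old file back from backup_path."""
    if os.path.exists(table_path):
        os.unlink(table_path)
    os.link(backup_path, table_path)
    os.unlink(backup_path)
```

Because every step is a link or unlink of a directory entry, the table
contents are never copied; the old data stays reachable through the tmp2
name until it is explicitly removed.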
|
||
|
||
Vadim
|
||
|
||
From Inoue@tpf.co.jp Wed Jun 21 23:22:36 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA10350
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 23:22:35 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA13743 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 23:07:50 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id MAA03008; Thu, 22 Jun 2000 12:07:00 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 12:09:15 +0900
|
||
Message-ID: <000801bfdbf7$3f674200$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2F@SECTORBASE1>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Mikheev, Vadim [mailto:vmikheev@SECTORBASE.COM]
|
||
>
|
||
> > > > > Or - create tmp file and load with new content;
|
||
> > > > > log "intent to relink table file";
|
||
> > > > > relink table file; log "file is relinked".
|
||
> > > >
|
||
> > > > It seems to me that whole content of the table should be
|
||
> > > > logged before relinking or shrinking.
|
||
> > >
|
||
> > > Why not just fsync tmp files?
|
||
> > >
|
||
> >
|
||
> > I have probably misunderstood *relink*.
> > Is *relink* different from *rename*?
>
> I meant something like this: link(table file, tmp2 file);
> fsync(tmp2 file);
> unlink(table file); link(tmp file, table file); fsync(table file);
> unlink(tmp file).
|
||
|
||
I see, the old file would be rolled back from the tmp2 file on abort.
This would work on most platforms.
But the cygwin port has a flaw: files cannot be unlinked
while they are open. So *relink* may fail in some cases (including
rollback cases).
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From tgl@sss.pgh.pa.us Wed Jun 21 23:22:38 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA10353
|
||
for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 23:22:36 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA14206 for <pgman@candle.pha.pa.us>; Wed, 21 Jun 2000 23:16:26 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA07099;
|
||
Wed, 21 Jun 2000 23:14:50 -0400 (EDT)
|
||
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
cc: Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
|
||
References: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
|
||
Comments: In-reply-to "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
message dated "Wed, 21 Jun 2000 16:00:17 -0700"
|
||
Date: Wed, 21 Jun 2000 23:14:50 -0400
|
||
Message-ID: <7096.961643690@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
|
||
>> relation OID
|
||
>> relation version number
|
||
|
||
> I believe that we can avoid versions using WAL...
|
||
|
||
I don't think so. You're basically saying that
|
||
1. create file 'new'
|
||
2. delete file 'old'
|
||
3. rename 'new' to 'old'
|
||
is safe as long as you have a redo log to ensure that the rename
|
||
happens even if you crash between steps 2 and 3. But crash is not
|
||
the only hazard. What if step 3 just plain fails? Redo won't help.
|
||
|
||
I'm having a hard time inventing really plausible examples, but a
|
||
slightly implausible example is that someone chmod's the containing
|
||
directory -w between steps 2 and 3. (Maybe it's not so implausible
|
||
if you assume a crash after step 2 ... someone might have left the
|
||
directory nonwritable while restoring the system.)
|
||
|
||
If we use file version numbers, then the *only* thing needed to
|
||
make a valid transition between one set of files and another is
|
||
a commit of the update of pg_class that shows the new version number
|
||
in the rel's pg_class tuple. The worst that can happen to you in
|
||
a crash or other failure is that you are unable to get rid of the
|
||
set of files that you don't want anymore. That might waste disk
|
||
space but it doesn't leave the database corrupted.
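
The version-number scheme argued for above can be sketched as follows. This
is a toy model in Python under stated assumptions, not PostgreSQL code: a
small JSON file named catalog.json stands in for pg_class, an atomic rename
of that file stands in for committing the pg_class update, and the
"OID.version" file naming and all function names are invented for the
example.

```python
import json
import os

def rel_path(datadir, oid, version):
    """Hypothetical on-disk name for one version of a relation."""
    return os.path.join(datadir, f"{oid}.{version}")

def load_catalog(datadir):
    with open(os.path.join(datadir, "catalog.json")) as f:
        return {int(k): v for k, v in json.load(f).items()}

def commit_catalog(datadir, catalog):
    """Stand-in for committing the pg_class update: one atomic rename."""
    tmp = os.path.join(datadir, "catalog.json.tmp")
    with open(tmp, "w") as f:
        json.dump(catalog, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, os.path.join(datadir, "catalog.json"))

def rewrite_relation(datadir, oid, new_contents):
    """Write a new version of the relation and switch to it."""
    catalog = load_catalog(datadir)
    old_version = catalog[oid]
    new_version = old_version + 1
    with open(rel_path(datadir, oid, new_version), "wb") as f:
        f.write(new_contents)
        f.flush()
        os.fsync(f.fileno())
    catalog[oid] = new_version
    commit_catalog(datadir, catalog)  # the only step that changes what readers see
    try:
        os.unlink(rel_path(datadir, oid, old_version))
    except OSError:
        pass  # a leftover old file wastes space but the catalog stays valid
    return new_version

def current_file(datadir, oid):
    return rel_path(datadir, oid, load_catalog(datadir)[oid])
```

A failure before commit_catalog leaves an unreferenced new file; a failure
after it leaves an unreferenced old file. In neither case does the catalog
point at a missing or half-written file, which is the property the message
relies on.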
|
||
|
||
> But what about LOCATIONs? I object using environment and think that
|
||
> locations must be stored in pg_control..?
|
||
|
||
I don't like environment variables for this either; it's just way too
|
||
easy to start the postmaster with wrong environment. It still seems
|
||
to me that relying on subdirectory symlinks is a good way to go.
|
||
pg_control is not so good --- if it gets corrupted, how do you recover?
|
||
symlinks can be recreated by hand if necessary, but...
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3711@hub.org Thu Jun 22 01:01:06 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA22245
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 01:01:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id AAA18310 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 00:43:00 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5M3US167109;
|
||
Wed, 21 Jun 2000 23:30:28 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5M3U0164115
|
||
for <pgsql-hackers@postgresql.org>; Wed, 21 Jun 2000 23:30:00 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA07156;
|
||
Wed, 21 Jun 2000 23:27:10 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp>
|
||
References: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Thu, 22 Jun 2000 10:15:01 +0900"
|
||
Date: Wed, 21 Jun 2000 23:27:10 -0400
|
||
Message-ID: <7153.961644430@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> Please add my opinion to the list.
|
||
> Unique-id filename: Hiroshi
|
||
> (Unique-id is unrelated to OID/relname).
|
||
|
||
"Unique ID" is more or less equivalent to "OID + version number",
|
||
right?
|
||
|
||
I was trying earlier to convince myself that a single unique-ID value
|
||
would be better than OID+version for the smgr interface, because it'd
|
||
certainly be easier to pass around. I failed to convince myself though,
|
||
and the thing that bothered me was this. Suppose you are trying to
|
||
recover a corrupted database manually, and the only information you have
|
||
about which table is which is a somewhat out-of-date listing of OIDs
|
||
versus table names. (Maybe it's out of date because you got it from
|
||
your last backup tape.) If the files are named OID+version you're not
|
||
going to have much trouble seeing which is which, even if some of the
|
||
versions are higher than what was on the tape. But if version-updated
|
||
tables are given entirely new unique IDs, you've got no hope at all of
|
||
telling which one corresponds to what you had in the listing. Maybe
|
||
you can tell by looking through the physical file contents, but
|
||
certainly this way is more fragile from the point of view of data
|
||
recovery.
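
The recovery scenario above can be illustrated with a small sketch. It
assumes, purely for the example, that files are named "OID.version" and
that the stale listing is available as a dictionary from OID to table
name; identify_files is an invented helper, not an existing tool.

```python
import os

def identify_files(datadir, stale_listing):
    """Map each 'OID.version' file to a table name from an old OID listing."""
    found = {}
    for name in sorted(os.listdir(datadir)):
        oid_text, sep, version_text = name.partition(".")
        if not (sep and oid_text.isdigit() and version_text.isdigit()):
            continue  # not an OID.version file
        # The OID part survives version bumps, so an old listing still matches.
        found[name] = stale_listing.get(int(oid_text), "<unknown>")
    return found
```

With entirely new unique IDs per version there would be no stable key to
look up, so every rewritten table would come back as unknown.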
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Thu Jun 22 01:01:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA22232;
|
||
Thu, 22 Jun 2000 01:00:59 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id AAA17842; Thu, 22 Jun 2000 00:31:06 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA07254;
|
||
Thu, 22 Jun 2000 00:29:42 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"Bruce Momjian" <maillist@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <000201bfdbd9$b1985580$2801007e@tpf.co.jp>
|
||
References: <000201bfdbd9$b1985580$2801007e@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Thu, 22 Jun 2000 08:37:42 +0900"
|
||
Date: Thu, 22 Jun 2000 00:29:42 -0400
|
||
Message-ID: <7251.961648182@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> I strongly object to keeping the tablespace OID as the smgr file reference
> token, though we have to keep it for another purpose, of course. I've mentioned
> many times that tablespace (where to store) info should be distinguished from
> *where it is stored* info.
|
||
|
||
Sure. But this proposal assumes that we're relying on symlinks to
|
||
carry the information about physical locations corresponding to
|
||
tablespace OIDs. The backend just needs to know enough to access a
|
||
relation file at a relative pathname like
|
||
	tablespaceOID/relationOID
|
||
(ignoring version and segment numbers for now). Under the hood,
|
||
a symlink for tablespaceOID gets the work done.
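
The symlink arrangement described above can be sketched as follows. This is
an illustration in Python, not PostgreSQL code: create_tablespace and
relation_path are invented names, and version and segment numbers are
ignored, as in the message.

```python
import os

def create_tablespace(datadir, tablespace_oid, physical_dir):
    """Point datadir/<tablespace_oid> at physical_dir via a symlink."""
    os.makedirs(physical_dir, exist_ok=True)
    os.symlink(physical_dir, os.path.join(datadir, str(tablespace_oid)))

def relation_path(datadir, tablespace_oid, relation_oid):
    """Pathname the backend would open: tablespaceOID/relationOID."""
    return os.path.join(datadir, str(tablespace_oid), str(relation_oid))
```

The code that opens relation files never sees the physical location; moving
a tablespace only means recreating one symlink, which can also be done by
hand.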
|
||
|
||
Certainly this is not a perfect mechanism. But it is simple, it
|
||
is reliable, it is portable to most of the platforms we care about
|
||
(yeah, I know we have a Win port, but you wouldn't ever recommend
|
||
someone to run a *serious* database on it would you?), and in general
|
||
I think the bang-for-the-buck ratio is enormous. I do not want to
|
||
have to deal with explicit tablespace bookkeeping in the backend,
|
||
but that seems like what we'd have to do in order to improve on
|
||
symlinks.
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3720@hub.org Thu Jun 22 02:01:02 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24025
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 02:01:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA21392 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 01:56:49 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5M5jp143149;
|
||
Thu, 22 Jun 2000 01:45:51 -0400 (EDT)
|
||
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5M5jT143025
|
||
for <pgsql-hackers@postgreSQL.org>; Thu, 22 Jun 2000 01:45:29 -0400 (EDT)
|
||
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
|
||
by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA11735;
|
||
Wed, 21 Jun 2000 22:44:28 -0700 (PDT)
|
||
Message-Id: <3.0.1.32.20000621224122.035b8c80@mail.pacifier.com>
|
||
X-Sender: dhogaza@mail.pacifier.com
|
||
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
|
||
Date: Wed, 21 Jun 2000 22:41:22 -0700
|
||
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>
|
||
From: Don Baccus <dhogaza@pacifier.com>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
|
||
In-Reply-To: <39518B7C.F76108FD@nimrod.itg.telecom.com.au>
|
||
References: <200006220229.WAA08130@candle.pha.pa.us>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
At 01:43 PM 6/22/00 +1000, Chris Bitmead wrote:
|
||
|
||
>I'm wondering if pg_dump should store the location of the tablespace. If
|
||
>your machine dies, you get a new machine to re-create the database, you
|
||
>may not want the tablespace in the same spot. And text-editing a
|
||
>gigabyte file would be extremely painful.
|
||
|
||
So you don't dump your create tablespace statements, recognizing that on
|
||
a new machine (due to upgrades or crashing) you might assign them to
|
||
different directories/mount points/whatever. That's the reason for
|
||
wanting to hide physical allocation in tablespaces ... the rest of
|
||
your datamodel doesn't need to know.
|
||
|
||
Or you do dump your tablespaces, and knowing the paths assigned
|
||
to various ones set up your new machine accordingly.
|
||
|
||
|
||
|
||
- Don Baccus, Portland OR <dhogaza@pacifier.com>
|
||
Nature photos, on-line guides, Pacific Northwest
|
||
Rare Bird Alert Service and other goodies at
|
||
http://donb.photo.net.
|
||
|
||
From dhogaza@pacifier.com Thu Jun 22 02:00:58 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24005
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 02:00:58 -0400 (EDT)
|
||
Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA21369 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 01:56:18 -0400 (EDT)
|
||
Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
|
||
by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA12121;
|
||
Wed, 21 Jun 2000 22:55:39 -0700 (PDT)
|
||
Message-Id: <3.0.1.32.20000621225149.035bc070@mail.pacifier.com>
|
||
X-Sender: dhogaza@mail.pacifier.com
|
||
X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
|
||
Date: Wed, 21 Jun 2000 22:51:49 -0700
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
|
||
From: Don Baccus <dhogaza@pacifier.com>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
|
||
In-Reply-To: <200006220403.AAA15648@candle.pha.pa.us>
|
||
References: <39518B7C.F76108FD@nimrod.itg.telecom.com.au>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: RO
|
||
|
||
At 12:03 AM 6/22/00 -0400, Bruce Momjian wrote:
|
||
|
||
>If the symlink create fails in CREATE TABLESPACE, it just creates an
|
||
>ordinary directory.
|
||
|
||
Silent surprises - the earmark of truly professional software ...
|
||
|
||
|
||
|
||
- Don Baccus, Portland OR <dhogaza@pacifier.com>
|
||
Nature photos, on-line guides, Pacific Northwest
|
||
Rare Bird Alert Service and other goodies at
|
||
http://donb.photo.net.
|
||
|
||
From Inoue@tpf.co.jp Thu Jun 22 02:01:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24009
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 02:00:59 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id BAA21277 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 01:54:44 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id OAA03303; Thu, 22 Jun 2000 14:53:52 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 14:56:07 +0900
|
||
Message-ID: <000901bfdc0e$8f32fec0$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-2022-jp"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <7251.961648182@sss.pgh.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> > I strongly object to keeping the tablespace OID as the smgr file reference
> > token, though we have to keep it for another purpose, of course. I've mentioned
> > many times that tablespace (where to store) info should be distinguished from
> > *where it is stored* info.
|
||
>
|
||
> Sure. But this proposal assumes that we're relying on symlinks to
|
||
> carry the information about physical locations corresponding to
|
||
> tablespace OIDs. The backend just needs to know enough to access a
|
||
> relation file at a relative pathname like
|
||
> tablespaceOID/relationOID
|
||
> (ignoring version and segment numbers for now). Under the hood,
|
||
> a symlink for tablespaceOID gets the work done.
|
||
>
|
||
|
||
I think tablespaceOID is an easy substitute for the purpose.
I don't like depending on a poor directory tree structure in a DBMS
either.
|
||
|
||
> Certainly this is not a perfect mechanism. But it is simple, it
|
||
> is reliable, it is portable to most of the platforms we care about
|
||
> (yeah, I know we have a Win port, but you wouldn't ever recommend
|
||
> someone to run a *serious* database on it would you?), and in general
|
||
> I think the bang-for-the-buck ratio is enormous. I do not want to
|
||
> have to deal with explicit tablespace bookkeeping in the backend,
|
||
> but that seems like what we'd have to do in order to improve on
|
||
> symlinks.
|
||
>
|
||
|
||
I've already mentioned it 10 times or so, but unfortunately
I see no one on my side yet.
OK, I've given up on this discussion. I don't want to waste
any more of my time.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From tgl@sss.pgh.pa.us Thu Jun 22 03:31:04 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28813
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 03:31:03 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA23901 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 03:06:47 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA07725;
|
||
Thu, 22 Jun 2000 03:05:00 -0400 (EDT)
|
||
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
|
||
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <39518B7C.F76108FD@nimrod.itg.telecom.com.au>
|
||
References: <200006220229.WAA08130@candle.pha.pa.us> <39518B7C.F76108FD@nimrod.itg.telecom.com.au>
|
||
Comments: In-reply-to Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>
|
||
message dated "Thu, 22 Jun 2000 13:43:56 +1000"
|
||
Date: Thu, 22 Jun 2000 03:05:00 -0400
|
||
Message-ID: <7722.961657500@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
|
||
> I'm wondering if pg_dump should store the location of the tablespace. If
|
||
> your machine dies, you get a new machine to re-create the database, you
|
||
> may not want the tablespace in the same spot. And text-editing a
|
||
> gigabyte file would be extremely painful.
|
||
|
||
Might make sense to store the tablespace setup separately from the bulk
|
||
of the data, but certainly you want some way to dump that info in a
|
||
restorable form.
|
||
|
||
I've been thinking lately that the pg_dump shove-it-all-in-one-file
|
||
approach doesn't scale anyway. We ought to start thinking about ways
|
||
to make the standard dump method store schema separately from bulk
|
||
data, for example. That's offtopic for this thread but ought to be
|
||
on the TODO list someplace...
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3727@hub.org Thu Jun 22 03:31:06 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28819
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 03:31:05 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA24751 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 03:29:00 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5M7KP140211;
|
||
Thu, 22 Jun 2000 03:20:25 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5M7Jb139991
|
||
for <pgsql-hackers@postgresql.org>; Thu, 22 Jun 2000 03:19:37 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA07785;
|
||
Thu, 22 Jun 2000 03:17:45 -0400 (EDT)
|
||
To: "Philip J. Warner" <pjw@rhyme.com.au>
|
||
cc: "Hiroshi Inoue" <Inoue@tpf.co.jp>,
|
||
"Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au>
|
||
References: <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> <000501bfdbe7$49fcdd20$2801007e@tpf.co.jp> <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au>
|
||
Comments: In-reply-to "Philip J. Warner" <pjw@rhyme.com.au>
|
||
message dated "Thu, 22 Jun 2000 16:31:33 +1000"
|
||
Date: Thu, 22 Jun 2000 03:17:45 -0400
|
||
Message-ID: <7782.961658265@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
"Philip J. Warner" <pjw@rhyme.com.au> writes:
|
||
>> ... the thing that bothered me was this. Suppose you are trying to
|
||
>> recover a corrupted database manually, and the only information you have
|
||
>> about which table is which is a somewhat out-of-date listing of OIDs
|
||
>> versus table names.
|
||
|
||
> This worries me a little; in the Dec/RDB world it is a very long time since
|
||
> database backups were done by copying the files. There is a database
|
||
> backup/restore utility which runs while the database is on-line and makes
|
||
> sure a valid snapshot is taken. Backing up storage areas (table spaces)
|
||
> can be done separately by the same utility, and again, it records enough
|
||
> information to ensure integrity. Maybe the thing to do is write a pg_backup
|
||
> utility, which in a first pass could, presumably, be synonymous with pg_dump?
|
||
|
||
pg_dump already does the consistent-snapshot trick (it just has to run
|
||
inside a single transaction).
|
||
|
||
> Am I missing something here? Is there a problem with backing up using
|
||
> 'pg_dump | gzip'?
|
||
|
||
None, as long as your ambition extends no further than restoring your
|
||
data to where it was at your last pg_dump. I was thinking about the
|
||
all-too-common-in-the-real-world scenario where you're hoping to recover
|
||
some data more recent than your last backup from the fractured shards
|
||
of your database...
|
||
|
||
regards, tom lane
|
||
|
||
From zeugswettera@wien.spardat.at Thu Jun 22 05:01:11 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29525
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 05:01:09 -0400 (EDT)
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id EAA27070 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 04:38:32 -0400 (EDT)
|
||
Received: from peligor.server.lan.at (peligor.server.lan.at [10.8.32.84])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA23252;
|
||
Thu, 22 Jun 2000 10:37:45 +0200
|
||
Received: from zeus (totalctlh1-port029.f000.d0188.sd.spardat.at [10.8.35.226])
|
||
by peligor.server.lan.at (8.9.1/8.9.1) with SMTP id KAA02457;
|
||
Thu, 22 Jun 2000 10:41:04 GMT
|
||
From: Zeugswetter Andreas SB <zeugswettera@wien.spardat.at>
|
||
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>
|
||
Subject: Re: Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 09:49:07 +0200
|
||
X-Mailer: KMail [version 1.0.29.1]
|
||
Content-Type: text/plain
|
||
Cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
|
||
References: <200006220229.WAA08130@candle.pha.pa.us> <39518B7C.F76108FD@nimrod.itg.telecom.com.au>
|
||
In-Reply-To: <39518B7C.F76108FD@nimrod.itg.telecom.com.au>
|
||
MIME-Version: 1.0
|
||
Message-Id: <00062210055400.00299@zeus>
|
||
Content-Transfer-Encoding: 8bit
|
||
Status: RO
|
||
|
||
|
||
> > pg_dump would recreate a CREATE TABLESPACE command:
|
||
> >
|
||
> > printf("CREATE TABLESPACE %s USING %s", loc, symloc);
|
||
> >
|
||
> > where symloc would be SELECT symloc(loc) and return the value into a
|
||
> > variable that is used by pg_dump. The backend would do the lstat() and
|
||
> > return the value to the client.
|
||
>
|
||
> I'm wondering if pg_dump should store the location of the tablespace. If
|
||
> your machine dies, you get a new machine to re-create the database, you
|
||
> may not want the tablespace in the same spot. And text-editing a
|
||
> gigabyte file would be extremely painful.
|
||
|
||
Yes, that seems like a valid concern that should be kept in mind.
|
||
It should also be possible to restore a pg instance to a different location
|
||
on the same machine.
|
||
Maybe this could be done by adding a utility that dumps all tablespace
|
||
info which could then be altered as desired.
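
A rough sketch of such a utility, for illustration only. It emits one
statement per tablespace in the "CREATE TABLESPACE ... USING ..." form quoted
above (a proposal in this thread, not an existing command) and lets the caller
override locations at restore time. The function name dump_tablespaces and the
sample paths are hypothetical.

```python
def dump_tablespaces(tablespaces, relocate=None):
    """Return one CREATE TABLESPACE statement per tablespace.

    tablespaces: dict mapping tablespace name -> location recorded at dump time.
    relocate:    optional dict mapping tablespace name -> new location.
    The statement syntax follows the proposal quoted in this thread.
    """
    relocate = relocate or {}
    statements = []
    for name, location in sorted(tablespaces.items()):
        # Prefer the override when restoring to a different spot.
        target = relocate.get(name, location)
        statements.append(f"CREATE TABLESPACE {name} USING {target}")
    return statements


if __name__ == "__main__":
    recorded = {"datatbs": "/disk1/pgdata", "indextbs": "/disk2/pgindex"}
    for line in dump_tablespaces(recorded, relocate={"indextbs": "/newdisk/pgindex"}):
        print(line)
```

Keeping the tablespace statements in their own small file would avoid editing
the large data dump when a location has to change.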
|
||
|
||
I still opt for instance-wide tablespaces. People wanting separation can easily
|
||
create different tablespaces for each database, but those that only want to
|
||
separate data and index need only create two tablespaces. A typical installation would
|
||
have 1 to 4 tablespaces (systemtbs, datatbs, indextbs, toasttbs | lobdbs )
|
||
|
||
I would also switch the directory structure between dbname and extent subdir,
|
||
because that allows fewer symlinks/filesystems, and thus less admin.
|
||
|
||
thus you would have:
|
||
tablespace1/extent1/dbname1
|
||
tablespace1/extent2/dbname1
|
||
tablespace1/extent1/dbname2
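
The layout above can be sketched as follows. This is only an illustration:
relation_dir and links_needed are hypothetical names, and the link count
rests on the assumption that each extent directory directly under a
tablespace is what gets its own symlink or filesystem.

```python
from pathlib import PurePosixPath


def relation_dir(tablespace, extent, dbname):
    """Directory for one database's files: tablespace/extent/dbname."""
    return PurePosixPath(tablespace) / extent / dbname


def links_needed(extents, dbnames, extent_first=True):
    """Count directories that would each need their own symlink or filesystem.

    Assumption: one link per extent directory when the extent sits directly
    under the tablespace, otherwise one per (database, extent) pair.
    """
    if extent_first:
        return len(extents)
    return len(extents) * len(dbnames)


if __name__ == "__main__":
    print(relation_dir("tablespace1", "extent2", "dbname1"))
    print(links_needed(["extent1", "extent2"], ["dbname1", "dbname2"]))
```

Under that assumption, the number of links stays constant as databases are
added, which is one way to read the "less admin" argument.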
|
||
|
||
Andreas
|
||
|
||
From pjw@rhyme.com.au Thu Jun 22 04:01:05 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA29060
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 04:01:03 -0400 (EDT)
|
||
Received: from acheron.rime.com.au (root@albatr.lnk.telstra.net [139.130.54.222]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id DAA25604 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 03:50:30 -0400 (EDT)
|
||
Received: from oberon (Oberon.rime.com.au [203.8.195.100])
|
||
by acheron.rime.com.au (8.9.3/8.9.3) with SMTP id RAA08811;
|
||
Thu, 22 Jun 2000 17:43:22 +1000
|
||
Message-Id: <3.0.5.32.20000622175015.00a10160@mail.rhyme.com.au>
|
||
X-Sender: pjw@mail.rhyme.com.au
|
||
X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)
|
||
Date: Thu, 22 Jun 2000 17:50:15 +1000
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
From: "Philip J. Warner" <pjw@rhyme.com.au>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
Cc: "Hiroshi Inoue" <Inoue@tpf.co.jp>,
|
||
"Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
In-Reply-To: <7782.961658265@sss.pgh.pa.us>
|
||
References: <3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au>
|
||
<000501bfdbe7$49fcdd20$2801007e@tpf.co.jp>
|
||
<000501bfdbe7$49fcdd20$2801007e@tpf.co.jp>
|
||
<3.0.5.32.20000622163133.009b1600@mail.rhyme.com.au>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
Status: RO
|
||
|
||
At 03:17 22/06/00 -0400, Tom Lane wrote:
|
||
>
|
||
>> This worries me a little; in the Dec/RDB world it is a very long time since
|
||
>> database backups were done by copying the files. There is a database
|
||
>> backup/restore utility which runs while the database is on-line and makes
|
||
>> sure a valid snapshot is taken. Backing up storage areas (table spaces)
|
||
>> can be done separately by the same utility, and again, it records enough
|
||
>> information to ensure integrity. Maybe the thing to do is write a pg_backup
|
||
>> utility, which in a first pass could, presumably, be synonymous with
>> pg_dump?
|
||
>
|
||
>pg_dump already does the consistent-snapshot trick (it just has to run
|
||
>inside a single transaction).
|
||
>
|
||
>> Am I missing something here? Is there a problem with backing up using
|
||
>> 'pg_dump | gzip'?
|
||
>
|
||
>None, as long as your ambition extends no further than restoring your
|
||
>data to where it was at your last pg_dump. I was thinking about the
|
||
>all-too-common-in-the-real-world scenario where you're hoping to recover
|
||
>some data more recent than your last backup from the fractured shards
|
||
>of your database...
|
||
>
|
||
|
||
pg_dump is a good basis for any pg_backup utility; perhaps as you indicated
|
||
elsewhere, more careful formatting of the dump files would make
|
||
table-based restoration possible. In another response, I also suggested
|
||
allowing overrides of placement information in a restore operation; the
|
||
simplest approach would be an 'ignore-storage-parameters' flag. Does this
|
||
sound reasonable? If so, then discussion of file-id based on OID need not
|
||
be too concerned about how db restoration is done.
|
||
|
||
|
||
|
||
|
||
|
||
----------------------------------------------------------------
|
||
Philip Warner | __---_____
|
||
Albatross Consulting Pty. Ltd. |----/ - \
|
||
(A.C.N. 008 659 498) | /(@) ______---_
|
||
Tel: (+61) 0500 83 82 81 | _________ \
|
||
Fax: (+61) 0500 83 82 82 | ___________ |
|
||
Http://www.rhyme.com.au | / \|
|
||
| --________--
|
||
PGP key available upon request, | /
|
||
and from pgp5.ai.mit.edu:11371 |/
|
||
|
||
From pgsql-hackers-owner+M3730@hub.org Thu Jun 22 05:31:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29741
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 05:31:00 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id FAA28478 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 05:18:37 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5M96W171286;
|
||
Thu, 22 Jun 2000 05:06:32 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5M96A168442
|
||
for <pgsql-hackers@postgresql.org>; Thu, 22 Jun 2000 05:06:10 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id SAA03635; Thu, 22 Jun 2000 18:05:02 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Peter Eisentraut" <peter_e@gmx.net>
|
||
Cc: "Tom Lane" <tgl@sss.pgh.pa.us>, "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 18:07:18 +0900
|
||
Message-ID: <000c01bfdc29$43f717a0$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <Pine.GSO.4.02A.10006211521410.10570-100000@Ulv.DoCS.UU.SE>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Peter Eisentraut [mailto:e99re41@DoCS.UU.SE]
|
||
>
|
||
> > My opinion
|
||
> > 3) database and tablespace are relatively irrelevant.
|
||
> > I assume PostgreSQL's database would correspond
|
||
> > to the concept of SCHEMA.
|
||
>
|
||
> A database corresponds to a catalog and a schema corresponds to nothing
|
||
> yet.
|
||
>
|
||
|
||
Oh, I see your point. However, I have always thought of the current
PostgreSQL database as an incomplete SCHEMA, and in practice I still feel so.
A catalog per database has seemed needless to me from the start.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From Inoue@tpf.co.jp Thu Jun 22 07:31:01 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA07559
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 07:31:00 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id HAA02741 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 07:08:29 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id UAA03834; Thu, 22 Jun 2000 20:06:51 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 20:09:07 +0900
|
||
Message-ID: <000d01bfdc3a$48fb35e0$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-2022-jp"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
Importance: Normal
|
||
In-Reply-To: <7153.961644430@sss.pgh.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|
||
>
|
||
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> > Please add my opinion to the list.
|
||
> > Unique-id filename: Hiroshi
|
||
> > (Unique-id is irrelevant to OID/relname).
|
||
>
|
||
> "Unique ID" is more or less equivalent to "OID + version number",
|
||
> right?
|
||
>
|
||
|
||
Hmm, no one seems to be on my side on this point either.
OK, I change my mind as follows.

    OID except on cygwin, unique-id on cygwin
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From tgl@sss.pgh.pa.us Thu Jun 22 11:31:06 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA10544
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 11:31:05 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA23513 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 11:28:53 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA08851;
|
||
Thu, 22 Jun 2000 11:27:30 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
"Thomas Lockhart" <lockhart@alumni.caltech.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <000d01bfdc3a$48fb35e0$2801007e@tpf.co.jp>
|
||
References: <000d01bfdc3a$48fb35e0$2801007e@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Thu, 22 Jun 2000 20:09:07 +0900"
|
||
Date: Thu, 22 Jun 2000 11:27:30 -0400
|
||
Message-ID: <8848.961687650@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> OK,I change my mind as follows.
|
||
> OID except cygwin,unique-id on cygwin
|
||
|
||
We don't really want to do that, do we? That's a huge difference in
|
||
behavior to have in just one port --- especially a port that none of
|
||
the primary developers use (AFAIK anyway). The cygwin port's normal
|
||
state of existence will be "broken", surely, if we go that way.
|
||
|
||
Besides which, OID alone doesn't give us a possibility of file
|
||
versioning, and as I commented to Vadim I think we will want that,
|
||
WAL or no WAL. So it seems to me the two viable choices are
|
||
unique-id or OID+version-number. Either way, the file-naming behavior
|
||
should be the same across all platforms.
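
To make the two choices concrete, here is a minimal sketch. The name format
"oid.version" and the counter-based allocator are assumptions made for
illustration; the thread does not fix either detail.

```python
import itertools


def oid_version_name(oid, version):
    """File name built from the relation OID plus a version counter."""
    return f"{oid}.{version}"


class UniqueIdAllocator:
    """Hands out file ids that are independent of OID and relation name."""

    def __init__(self, start=1):
        self._ids = itertools.count(start)

    def next_name(self):
        return str(next(self._ids))


if __name__ == "__main__":
    # Rewriting relation 16384 produces a new file name under both schemes.
    print(oid_version_name(16384, 1), "->", oid_version_name(16384, 2))
    alloc = UniqueIdAllocator()
    print(alloc.next_name(), "->", alloc.next_name())
```

With a bare OID the old and new files would have to share one name, which is
what forces the delete-then-rename sequence discussed in this thread.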
|
||
|
||
regards, tom lane
|
||
|
||
From vmikheev@SECTORBASE.COM Thu Jun 22 14:31:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA11892
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 14:30:59 -0400 (EDT)
|
||
Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id OAA10107 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 14:17:04 -0400 (EDT)
|
||
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
|
||
id <MCTD2X5X>; Thu, 22 Jun 2000 11:07:59 -0700
|
||
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C31@SECTORBASE1>
|
||
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>
|
||
Cc: Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian
|
||
<pgman@candle.pha.pa.us>,
|
||
Peter Eisentraut <peter_e@gmx.net>, Jan Wieck
|
||
<JanWieck@Yahoo.com>,
|
||
Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Thu, 22 Jun 2000 11:09:47 -0700
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2650.21)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
> > I believe that we can avoid versions using WAL...
|
||
>
|
||
> I don't think so. You're basically saying that
|
||
> 1. create file 'new'
|
||
> 2. delete file 'old'
|
||
> 3. rename 'new' to 'old'
|
||
> is safe as long as you have a redo log to ensure that the rename
|
||
> happens even if you crash between steps 2 and 3. But crash is not
|
||
> the only hazard. What if step 3 just plain fails? Redo won't help.
|
||
|
||
Ok, ok. Let's use a *unique* file name for each table version.
But after thinking it over, I agree with Hiroshi about using
*some unique id* for file names instead of oid+version: we could use
just the DB's OID + this unique ID in log records to find the table file - just
8 bytes.
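
A small sketch of that 8-byte reference, assuming the unique id is a 32-bit
value like an OID (which is what "8 bytes" suggests). The byte order and the
names FILE_REF, pack_file_ref and unpack_file_ref are hypothetical.

```python
import struct

# Two unsigned 32-bit integers, little-endian: database OID, then file id.
FILE_REF = struct.Struct("<II")


def pack_file_ref(db_oid, file_id):
    """Encode the pair a log record would carry to locate a table file."""
    return FILE_REF.pack(db_oid, file_id)


def unpack_file_ref(raw):
    """Decode the pair back into (db_oid, file_id)."""
    return FILE_REF.unpack(raw)


if __name__ == "__main__":
    raw = pack_file_ref(1, 42)
    print(len(raw), unpack_file_ref(raw))
```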
|
||
|
||
So, add me to Hiroshi's camp... if Hiroshi is ready to implement the new file
|
||
naming -:)
|
||
|
||
> > But what about LOCATIONs? I object using environment and think that
|
||
> > locations must be stored in pg_control..?
|
||
>
|
||
> I don't like environment variables for this either; it's just way too
|
||
> easy to start the postmaster with wrong environment. It still seems
|
||
> to me that relying on subdirectory symlinks is a good way to go.
|
||
|
||
I always thought so.
|
||
|
||
> pg_control is not so good --- if it gets corrupted, how do
|
||
> you recover?
|
||
|
||
Impossible to recover anyway - pg_control keeps last checkpoint pointer,
|
||
required for recovery. That's why Oracle recommends (requires?) at least
|
||
two copies of control file (and log too).
|
||
But what if log gets corrupted? Or file system (lost symlinks etc)?
|
||
One will have to use backup...
|
||
|
||
Vadim
|
||
|
||
From peter@localhost.its.uu.se Thu Jun 22 18:37:35 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA19684
|
||
for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 18:37:34 -0400 (EDT)
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id SAA02841 for <pgman@candle.pha.pa.us>; Thu, 22 Jun 2000 18:31:53 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:37596 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S125060AbQFVW3s>; Fri, 23 Jun 2000 00:29:48 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 135FaG-00062q-00; Fri, 23 Jun 2000 00:36:28 +0200
|
||
Date: Fri, 23 Jun 2000 00:36:28 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <8803.961687343@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0006221913490.4086-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
Sender: Peter Eisentraut <peter@Op.Net>
|
||
Status: RO
|
||
|
||
Tom Lane writes:
|
||
|
||
> In my mind the point of the "database" concept is to provide a domain
|
||
> within which custom datatypes and functions are available.
|
||
|
||
Quoth SQL99:
|
||
|
||
"A user-defined type is a schema object"
|
||
|
||
"An SQL-invoked routine is an element of an SQL-schema"
|
||
|
||
I have yet to see anything in SQL that's a per-catalog object. Some things
|
||
are global, like users, but everything else is per-schema.
|
||
|
||
The way I see it is that schemas are required to be a logical hierarchy,
|
||
whereas implementations may see catalogs as a physical division (as indeed
|
||
this implementation does).
|
||
|
||
> So I think we will still want "database" = "span of applicability of
|
||
> system catalogs"
|
||
|
||
Yes, because the system catalogs would live in a schema of their own.
|
||
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
From ZeugswetterA@wien.spardat.at Mon Jun 26 04:10:01 2000
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA29267
|
||
for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 04:09:59 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA35550;
|
||
Mon, 26 Jun 2000 10:09:14 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F0AJMV>; Mon, 26 Jun 2000 10:09:14 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>, Hiroshi Inoue <Inoue@tpf.co.jp>
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
||
Peter Eisentraut
|
||
<peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>,
|
||
Thomas Lockhart
|
||
<lockhart@alumni.caltech.edu>
|
||
Subject: [HACKERS] File versioning (was: Big 7.1 open items)
|
||
Date: Mon, 26 Jun 2000 10:09:13 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
|
||
> Besides which, OID alone doesn't give us a possibility of file
|
||
> versioning, and as I commented to Vadim I think we will want that,
|
||
> WAL or no WAL. So it seems to me the two viable choices are
|
||
> unique-id or OID+version-number. Either way, the file-naming behavior
|
||
> should be the same across all platforms.
|
||
|
||
I do not think the only problem of a failing rename of "temp" to "new"
|
||
on startup rollforward is issue enough to justify the additional complexity
|
||
a version implies.
Why not simply abort startup of the postmaster in such an event and let the
dba fix it? There can be no data loss.
|
||
|
||
If e.g. the permissions of the directory are insufficient we will want to
abort startup anyway, no?
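
The behaviour suggested here could look roughly like this. It is a sketch
under stated assumptions, not backend code: StartupAborted and redo_rename are
hypothetical names, and os.replace stands in for the rename step replayed
during rollforward.

```python
import os


class StartupAborted(Exception):
    """Raised when redo cannot finish and an administrator must step in."""


def redo_rename(src, dst):
    """Replay a logged rename; stop startup instead of guessing on failure."""
    try:
        os.replace(src, dst)
    except OSError as exc:
        # Covers a missing source file as well as insufficient permissions.
        raise StartupAborted(f"cannot rename {src!r} to {dst!r}: {exc}") from exc
```

Both files are left untouched when the rename fails, which matches the claim
that aborting loses no data.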
|
||
|
||
Andreas
|
||
|
||
From ZeugswetterA@wien.spardat.at Mon Jun 26 05:32:05 2000
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29616
|
||
for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 05:32:03 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id LAA27288;
|
||
Mon, 26 Jun 2000 11:31:08 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F0AKXC>; Mon, 26 Jun 2000 11:31:08 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C605BA598F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>, Peter Eisentraut <peter_e@gmx.net>,
|
||
Tom Lane <tgl@sss.pgh.pa.us>
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Mon, 26 Jun 2000 11:31:06 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
|
||
> > > In my mind the point of the "database" concept is to provide a domain
> > > within which custom datatypes and functions are available.
|
||
> >
|
||
>
|
||
> AFAIK few users understand it and many users have wondered
|
||
> why we couldn't issue cross "database" queries.
|
||
|
||
Imho the same issue is access to tables on another machine.
|
||
If we "fix" that, access to another db on the same instance is just
|
||
a variant of the above.
|
||
|
||
>
|
||
> > Quoth SQL99:
|
||
> >
|
||
> > "A user-defined type is a schema object"
|
||
> >
|
||
> > "An SQL-invoked routine is an element of an SQL-schema"
|
||
> >
|
||
> > I have yet to see anything in SQL that's a per-catalog object. Some things
> > are global, like users, but everything else is per-schema.
|
||
|
||
Yes.
|
||
|
||
> So why is system catalog needed per "database" ?
|
||
|
||
I like to use different databases on a development machine,
|
||
because it makes testing easier. The only thing that
|
||
needs to be changed is the connect statement. All other statements
|
||
including schema qualified tablenames stay exactly the same for
|
||
each developer even though each has his own database,
|
||
and his own version of functions.
|
||
I have yet to see an installation that doesn't have at least one program
|
||
that needs access to more than one schema.
|
||
|
||
On production machines we (using Informix) use different databases
|
||
for different products, because it reduces the possibility of accessing
|
||
the wrong tables, since the syntax for accessing tables in other db's
|
||
is different (dbname[@instancename]:"owner".tabname in Informix)
|
||
The schema does not help us, since most of our programs access
|
||
tables from more than one schema.
|
||
|
||
And again someone wanting Oracle'ish behavior will only create one
|
||
database per instance.
|
||
|
||
Andreas
|
||
|
||
From pgsql-hackers-owner+M4088@hub.org Mon Jul 3 01:57:49 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA08810
|
||
for <pgman@candle.pha.pa.us>; Mon, 3 Jul 2000 01:57:49 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e635u5S69222;
|
||
Mon, 3 Jul 2000 01:56:05 -0400 (EDT)
|
||
Received: from po.seiren.co.jp (po.seiren.co.jp [203.138.223.10])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5QA5d124120
|
||
for <pgsql-hackers@postgresql.org>; Mon, 26 Jun 2000 06:05:41 -0400 (EDT)
|
||
Received: from mcadnote1 ([210.161.188.23]) by po.seiren.co.jp
|
||
(post.office MTA v1.9.3 ID# 0100012-16224) with SMTP id AAA59;
|
||
Mon, 26 Jun 2000 19:04:51 +0900
|
||
From: "Hiroshi Inoue" <Inoue@seiren.co.jp>
|
||
To: "Zeugswetter Andreas SB" <ZeugswetterA@wien.spardat.at>,
|
||
"Peter Eisentraut" <peter_e@gmx.net>, "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>, "Jan Wieck" <JanWieck@Yahoo.com>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: RE: [HACKERS] Big 7.1 open items
|
||
Date: Mon, 26 Jun 2000 19:08:26 +0900
|
||
Message-ID: <EKEJJICOHDIEMGPNIFIJCEFFCCAA.Inoue@seiren.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="Windows-1252"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
|
||
Importance: Normal
|
||
In-Reply-To: <219F68D65015D011A8E000006F8590C605BA598F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: Zeugswetter Andreas SB
|
||
>
|
||
> > > > In my mind the point of the "database" concept is to provide a domain
> > > > within which custom datatypes and functions are available.
|
||
> > >
|
||
> >
|
||
> > AFAIK few users understand it and many users have wondered
|
||
> > why we couldn't issue cross "database" queries.
|
||
>
|
||
> Imho the same issue is access to tables on another machine.
|
||
> If we "fix" that, access to another db on the same instance is just
|
||
> a variant of the above.
|
||
>
|
||
|
||
What is the difference between a SCHEMA and your "database"?
|
||
I myself am confused about them.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From ZeugswetterA@wien.spardat.at Mon Jun 26 06:50:26 2000
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA07354
|
||
for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 06:50:24 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id MAA41146;
|
||
Mon, 26 Jun 2000 12:50:11 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F0ALQN>; Mon, 26 Jun 2000 12:50:11 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C605BA5991@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Hiroshi Inoue'" <Inoue@seiren.co.jp>,
|
||
Peter Eisentraut
|
||
<peter_e@gmx.net>, Tom Lane <tgl@sss.pgh.pa.us>
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Mon, 26 Jun 2000 12:50:10 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="windows-1252"
|
||
Status: RO
|
||
|
||
Hiroshi Inoue [mailto:Inoue@seiren.co.jp] wrote:
|
||
> > > > > In my mind the point of the "database" concept is to provide a domain
> > > > > within which custom datatypes and functions are available.
|
||
> > > >
|
||
> > >
|
||
> > > AFAIK few users understand it and many users have wondered
|
||
> > > why we couldn't issue cross "database" queries.
|
||
> >
|
||
> > Imho the same issue is access to tables on another machine.
|
||
> > If we "fix" that, access to another db on the same instance is just
|
||
> > a variant of the above.
|
||
> >
|
||
>
|
||
> What is the difference between a SCHEMA and your "database"?
|
||
> I myself am confused about them.
|
||
|
||
Think of it as a hierarchy:
|
||
instance -> database -> schema -> object
|
||
|
||
- "instance" corresponds to one postmaster
|
||
- "database" as in current implementation
|
||
- "schema" name corresponds to the owner of the object,
|
||
only that a corresponding db or os user does not need to exist in
|
||
some of the implementations I know.
|
||
- "object" is one of table, index, function ...
|
||
|
||
The database is what you connect to in your connect statement,
|
||
you then see all schemas inside this database only. Access to another
|
||
database would need an explicitly created synonym or different syntax.
|
||
The default "schema" name is usually the logged in user name
|
||
(although I don't like this approach, I like Informix's approach where
|
||
the schema need not be specified if tabname is unique (and tabname
|
||
is unique per db unless you specify database mode ansi)).
|
||
All other schemas have to be explicitly named ("schemaname".tabname).
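
The lookup rule described above can be sketched as follows. The function name
resolve and the sample names are made up for illustration, and the sketch
covers only the "default schema is the login user" variant, not the Informix
behaviour mentioned in parentheses.

```python
def resolve(name, login_user):
    """Split a table reference into (schema, table).

    An unqualified name falls back to the login user's schema; a qualified
    one looks like "schemaname".tabname.
    """
    schema, dot, table = name.rpartition(".")
    if not dot:
        return login_user, table
    return schema.strip('"'), table


if __name__ == "__main__":
    print(resolve("orders", "andreas"))
    print(resolve('"sales".orders', "andreas"))
```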
|
||
|
||
Oracle has exactly this layout, only you are restricted to one database
|
||
per instance.
|
||
(They even have a "create database .." statement, although it is somewhat
analogous to our initdb).
|
||
|
||
Andreas
|
||
|
||
From ZeugswetterA@wien.spardat.at Mon Jun 26 07:51:14 2000
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA07648
|
||
for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 07:51:12 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id NAA40848;
|
||
Mon, 26 Jun 2000 13:50:56 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F0AMHN>; Mon, 26 Jun 2000 13:50:55 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C605BA5993@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Mikheev, Vadim'" <vmikheev@SECTORBASE.COM>,
|
||
"'Tom Lane'"
|
||
<tgl@sss.pgh.pa.us>
|
||
Cc: Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian
|
||
<pgman@candle.pha.pa.us>,
|
||
Peter Eisentraut <peter_e@gmx.net>, Jan Wieck
|
||
<JanWieck@Yahoo.com>,
|
||
Hiroshi Inoue <Inoue@tpf.co.jp>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Mon, 26 Jun 2000 13:50:55 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Status: RO
|
||
|
||
Vadim wrote:
|
||
> Impossible to recover anyway - pg_control keeps last
|
||
> checkpoint pointer, required for recovery.
|
||
|
||
Why not put this info in the tx log itself.
|
||
|
||
> That's why Oracle recommends (requires?) at least
|
||
> two copies of control file ....
|
||
|
||
This is one of the most stupid design issues Oracle has.
|
||
I suggest you look at the tx log design of Informix.
|
||
(No Informix dba fears to pull the power cord on his servers,
|
||
ask the same of an Oracle dba, they even fear
|
||
"shutdown immediate" on a heavily used db)
|
||
|
||
Andreas
|
||
|
||
From ZeugswetterA@wien.spardat.at Mon Jun 26 08:02:07 2000
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA07760
    for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 08:02:05 -0400 (EDT)
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
    by gandalf.it-austria.net (xxx/xxx) with ESMTP id OAA74134;
    Mon, 26 Jun 2000 14:01:17 +0200
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
    id <M6F0AMK3>; Mon, 26 Jun 2000 14:01:17 +0200
Message-ID: <219F68D65015D011A8E000006F8590C605BA5994@sdexcsrv1.f000.d0188.sd.spardat.at>
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
    "'Mikheev, Vadim'" <vmikheev@SECTORBASE.COM>,
    "'Tom Lane'"
    <tgl@sss.pgh.pa.us>
Cc: Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian
    <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck
    <JanWieck@Yahoo.com>,
    Hiroshi Inoue <Inoue@tpf.co.jp>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: AW: [HACKERS] Big 7.1 open items
Date: Mon, 26 Jun 2000 14:01:15 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain;
    charset="iso-8859-1"
Status: RO

I wrote:
> Vadim wrote:
> > Impossible to recover anyway - pg_control keeps last
> > checkpoint pointer, required for recovery.
>
> Why not put this info in the tx log itself.
>
> > That's why Oracle recommends (requires?) at least
> > two copies of control file ....
>
> This is one of the most stupid design issues Oracle has.

The problem is that if you want to switch to a no-fsync environment
(here I also mean the tx log)
but the possibility of losing a write is still there, you cannot sync
writes to two or more different files. Only one file, the tx log itself, is
allowed to carry last-minute information.

Thus you need to txlog changes to pg_control also.

Andreas

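A minimal sketch of the idea Andreas describes, assuming a toy in-memory
log: if checkpoint records live in the tx log itself, recovery can locate
the last checkpoint by scanning the log backwards, so no second file has to
be kept in sync. The LogRecord type and both function names are
hypothetical; this is not PostgreSQL code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogRecord:
    kind: str      # "data" or "checkpoint" (hypothetical record kinds)
    payload: str


def find_last_checkpoint(log: List[LogRecord]) -> Optional[int]:
    """Return the index of the most recent checkpoint record, or None."""
    for index in range(len(log) - 1, -1, -1):
        if log[index].kind == "checkpoint":
            return index
    return None


def records_to_replay(log: List[LogRecord]) -> List[LogRecord]:
    """Everything after the last checkpoint must be rolled forward."""
    start = find_last_checkpoint(log)
    if start is None:
        return list(log)          # no checkpoint yet: replay the whole log
    return log[start + 1:]
```

The only state recovery needs is the log; losing a separate control file
cannot strand it, which is the property argued for in the message.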
From tgl@sss.pgh.pa.us Mon Jun 26 10:42:08 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA11148
    for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 10:42:06 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA17018;
    Mon, 26 Jun 2000 10:42:31 -0400 (EDT)
To: Zeugswetter Andreas SB <ZeugswetterA@Wien.Spardat.at>
cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>
Subject: Re: [HACKERS] File versioning (was: Big 7.1 open items)
In-reply-to: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at>
References: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at>
Comments: In-reply-to Zeugswetter Andreas SB <ZeugswetterA@Wien.Spardat.at>
    message dated "Mon, 26 Jun 2000 10:09:13 +0200"
Date: Mon, 26 Jun 2000 10:42:31 -0400
Message-ID: <17015.962030551@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Zeugswetter Andreas SB <ZeugswetterA@Wien.Spardat.at> writes:
> I do not think the only problem of a failing rename of "temp" to "new"
> on startup rollforward is issue enough to justify the additional complexity
> a version implies.

If that were the only reason for it then I wouldn't feel it was so
essential. However, it will also let us fix CLUSTER, vacuuming of
indexes, ALTER TABLE DROP COLUMN with physical removal of the column,
etc etc. Making the world safe for rollbackable RENAME/DROP/TRUNCATE
TABLE is just one of the benefits.

Versioning also eliminates a whole host of problems at the bufmgr/smgr
level that are caused by having to cope with relation files getting
renamed out from under you. We have painfully eliminated some of these
problems over the past couple of years by ad-hoc, ugly techniques like
flushing the buffer cache when doing a rename. But who's to say there
are not more such bugs left?

In short, I think versioning is far *less* complex, not to mention more
reliable, than the kluges we need to use to work around the lack of it.

regards, tom lane

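A rough sketch of why versioned file names avoid renames, under stated
assumptions: a dict stands in for the catalog entry that records each
relation's current version, and files are named OID.VERSION. The helper
names and the directory layout are hypothetical, not the design that was
eventually implemented.

```python
import os
import tempfile


def relation_path(directory, rel_oid, version):
    """File name is OID.VERSION; nothing is ever renamed."""
    return os.path.join(directory, f"{rel_oid}.{version}")


def rewrite_relation(directory, catalog, rel_oid, new_contents, commit=True):
    """Write a new version beside the old one, then commit or roll back.

    catalog maps relation OID -> current version (a stand-in for a
    catalog column). Returns the version that is current afterwards.
    """
    old_version = catalog[rel_oid]
    new_version = old_version + 1
    new_path = relation_path(directory, rel_oid, new_version)
    with open(new_path, "w") as f:
        f.write(new_contents)

    if commit:
        # Commit: switch the catalog pointer, then drop the old file.
        catalog[rel_oid] = new_version
        os.remove(relation_path(directory, rel_oid, old_version))
    else:
        # Rollback: the old file was never touched; discard the new one.
        os.remove(new_path)
    return catalog[rel_oid]
```

A rolled-back CLUSTER or TRUNCATE in this model just deletes the file for
the version that never became current, and readers of the old version never
see a file renamed underneath them.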
From pgsql-hackers-owner+M3879@hub.org Mon Jun 26 18:30:55 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA02022
    for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 18:30:54 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5QMMa123238;
    Mon, 26 Jun 2000 18:22:37 -0400 (EDT)
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5QMMJ123161
    for <pgsql-hackers@postgresql.org>; Mon, 26 Jun 2000 18:22:19 -0400 (EDT)
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
    id <NWJ7SNMF>; Mon, 26 Jun 2000 15:13:48 -0700
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>
Cc: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Mon, 26 Jun 2000 15:15:39 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
    charset="iso-8859-1"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> > Do we need *both* database & tablespace to find table file ?!
> > Imho, database shouldn't be used...
>
> That'd work fine for me, but I think Bruce was arguing for paths that
> included the database name. We'd end up with paths that go something
> like
> ..../data/tablespaces/TABLESPACEOID/RELATIONOID
> (plus some kind of decoration for segment and version), so you'd have
> a hard time telling which files in a tablespace belong to which
> database. Doesn't bother me a whole lot, personally --- if one wants

We could create /data/databases/DATABASEOID/ and create soft-links to
table-files. This way different tables of the same database could be in
different tablespaces. /data/database path would be used in production
and /data/tablespace path would be used in recovery.

Vadim

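A small sketch of the layout Vadim proposes, assuming a POSIX filesystem
with symlink support: the real file lives under
tablespaces/TABLESPACEOID/RELATIONOID, and databases/DATABASEOID/RELATIONOID
is a soft-link to it. The function name and the OID values used with it are
made up for illustration.

```python
import os
import tempfile


def place_relation(data_dir, tablespace_oid, database_oid, relation_oid):
    """Create the table file in its tablespace and link it into its database."""
    ts_dir = os.path.join(data_dir, "tablespaces", str(tablespace_oid))
    db_dir = os.path.join(data_dir, "databases", str(database_oid))
    os.makedirs(ts_dir, exist_ok=True)
    os.makedirs(db_dir, exist_ok=True)

    real_path = os.path.join(ts_dir, str(relation_oid))
    open(real_path, "w").close()              # empty stand-in for the table file

    link_path = os.path.join(db_dir, str(relation_oid))
    os.symlink(real_path, link_path)          # link_path points at real_path
    return real_path, link_path
```

Two relations of one database can then sit in different tablespaces while
still being listed together under the database directory.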
From tgl@sss.pgh.pa.us Mon Jun 26 18:47:54 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA02118
    for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 18:47:52 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id SAA19579;
    Mon, 26 Jun 2000 18:48:22 -0400 (EDT)
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
cc: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1>
References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1>
Comments: In-reply-to "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
    message dated "Mon, 26 Jun 2000 15:15:39 -0700"
Date: Mon, 26 Jun 2000 18:48:22 -0400
Message-ID: <19576.962059702@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> We could create /data/databases/DATABASEOID/ and create soft-links to
> table-files. This way different tables of the same database could be in
> different tablespaces. /data/database path would be used in production
> and /data/tablespace path would be used in recovery.

Why would you want to do it that way? Having a different access path
for recovery than for normal operation strikes me as just asking for
trouble ;-)

The symlinks wouldn't do any good for what Bruce had in mind anyway
(IIRC, he wanted to get useful per-database numbers from "du").

regards, tom lane

From pgsql-hackers-owner+M3888@hub.org Mon Jun 26 23:37:52 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA04481
    for <pgman@candle.pha.pa.us>; Mon, 26 Jun 2000 23:37:51 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5R1nx169365;
    Mon, 26 Jun 2000 21:50:00 -0400 (EDT)
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5R1mt169094
    for <pgsql-hackers@postgresql.org>; Mon, 26 Jun 2000 21:48:55 -0400 (EDT)
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
    id <NWJ7SNZL>; Mon, 26 Jun 2000 18:40:19 -0700
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C38@SECTORBASE1>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>
Cc: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Mon, 26 Jun 2000 18:42:10 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
    charset="iso-8859-1"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

> > We could create /data/databases/DATABASEOID/ and create
> > soft-links to table-files. This way different tables of
> > the same database could be in different tablespaces.
> > /data/database path would be used in production
> > and /data/tablespace path would be used in recovery.
>
> Why would you want to do it that way? Having a different access path
> for recovery than for normal operation strikes me as just asking for
> trouble ;-)

I just think that *databases* (schemas) must be used for *logical* grouping
of tables, not for *physical* one. "Where to store table" is a
tablespace-related kind of thing!

> The symlinks wouldn't do any good for what Bruce had in mind anyway
> (IIRC, he wanted to get useful per-database numbers from "du").

Imho, ability to put different tables/indices (of the same database)
to different tablespaces (disks) is much more useful than ability to
use du/ls for administration purposes -:)

Also, I think that we *must* go away from OS-driven disk space
allocation anyway. Currently, the way we extend table files breaks WAL
rule (nothing must go to disk until logged). + we have to move tuples
from end of file to top to shrink relation - not perfect way to reuse
empty space. +... +... +...

Vadim

From Inoue@tpf.co.jp Tue Jun 27 00:05:13 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA05264
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 00:05:11 -0400 (EDT)
Received: from tpf.co.jp ([126.0.1.56] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
    id NAA01123; Tue, 27 Jun 2000 13:04:26 +0900
Message-ID: <39582880.7565547@tpf.co.jp>
Date: Tue, 27 Jun 2000 13:07:28 +0900
From: Hiroshi Inoue <Inoue@tpf.co.jp>
X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
X-Accept-Language: ja
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
CC: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> <19576.962059702@sss.pgh.pa.us>
Content-Type: text/plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
Status: ROr

Tom Lane wrote:

>
> The symlinks wouldn't do any good for what Bruce had in mind anyway
> (IIRC, he wanted to get useful per-database numbers from "du").

Our database design seems to be in the opposite direction
if it is restricted for the convenience of command calls.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From pgsql-hackers-owner+M3905@hub.org Tue Jun 27 10:07:49 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21305
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 10:07:48 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5RDUh185923;
    Tue, 27 Jun 2000 09:30:43 -0400 (EDT)
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5RDTB183147
    for <pgsql-hackers@postgresql.org>; Tue, 27 Jun 2000 09:29:12 -0400 (EDT)
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
    by gandalf.it-austria.net (xxx/xxx) with ESMTP id PAA41830;
    Tue, 27 Jun 2000 15:27:07 +0200
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
    id <M6F0AW8N>; Tue, 27 Jun 2000 15:27:06 +0200
Message-ID: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at>
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>,
    "Mikheev, Vadim"
    <vmikheev@SECTORBASE.COM>
Cc: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: AW: [HACKERS] Big 7.1 open items
Date: Tue, 27 Jun 2000 15:27:03 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain;
    charset="iso-8859-1"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO


> That'd work fine for me, but I think Bruce was arguing for paths that
> included the database name. We'd end up with paths that go something
> like
> ..../data/tablespaces/TABLESPACEOID/RELATIONOID
> (plus some kind of decoration for segment and version), so you'd have
> a hard time telling which files in a tablespace belong to which
> database.

Well, as long as we have the file per object layout it probably makes sense
to have "speaking paths". But I see no real problem with:

..../data/tablespacename/dbname/RELATIONOID[.dat|.idx]

RELATIONOID standing for whatever the consensus will be.
I do not really see an argument for using a tablespaceoid instead of
its [maybe mangled] name.

Andreas

From pgsql-hackers-owner+M3912@hub.org Tue Jun 27 10:28:39 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21468
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 10:28:38 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5REOa111784;
    Tue, 27 Jun 2000 10:24:36 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5REOG109445
    for <pgsql-hackers@postgresql.org>; Tue, 27 Jun 2000 10:24:16 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA09575;
    Tue, 27 Jun 2000 10:23:48 -0400 (EDT)
To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: AW: [HACKERS] Big 7.1 open items
In-reply-to: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at>
References: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at>
Comments: In-reply-to Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
    message dated "Tue, 27 Jun 2000 15:27:03 +0200"
Date: Tue, 27 Jun 2000 10:23:48 -0400
Message-ID: <9572.962115828@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
> I do not really see an argument for using a tablespaceoid instead of
> its [maybe mangled] name.

Eliminating filesystem-based restrictions on names, for one.
For example we'd not have to forbid slashes and (probably) backquotes
in tablespace names if we did this, and we'd not have to worry about
filesystem-induced limits on name lengths. Renaming a tablespace
would also be trivial instead of nigh impossible.

It might be that using tablespace names as directory names is worth
enough from the admin point of view to make the above restrictions
acceptable. But it's a tradeoff, and not one with an obvious choice
IMHO.

regards, tom lane

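The tradeoff Tom describes can be sketched as follows, assuming a dict that
maps tablespace names to OIDs as a stand-in for a catalog. All three
function names are hypothetical, and the 255-byte limit is only a common
per-component filesystem limit, not something stated in the thread.

```python
import os


def name_based_dir(data_dir, tablespace_name):
    """Directory named after the tablespace: the filesystem restricts names."""
    if "/" in tablespace_name or "\0" in tablespace_name:
        raise ValueError("tablespace name cannot be used as a directory name")
    if len(tablespace_name.encode()) > 255:   # assumed filesystem limit
        raise ValueError("tablespace name too long for a directory name")
    return os.path.join(data_dir, tablespace_name)


def oid_based_dir(data_dir, catalog, tablespace_name):
    """Directory named after the OID: any name the catalog accepts works."""
    return os.path.join(data_dir, str(catalog[tablespace_name]))


def rename_tablespace(catalog, old_name, new_name):
    """With OID directories a rename only touches the catalog entry."""
    catalog[new_name] = catalog.pop(old_name)
```

With OID directories, renaming "fast/disk" to "archive" leaves every path
on disk unchanged; with name directories the same rename would mean moving
a directory, and the original name would not have been allowed at all.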
From vmikheev@SECTORBASE.COM Tue Jun 27 14:01:08 2000
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA28715
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 14:01:07 -0400 (EDT)
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
    id <NYA9PA2Z>; Tue, 27 Jun 2000 10:53:03 -0700
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C39@SECTORBASE1>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Bruce Momjian'" <pgman@candle.pha.pa.us>,
    Hiroshi Inoue
    <Inoue@tpf.co.jp>
Cc: Tom Lane <tgl@sss.pgh.pa.us>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development
    <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Tue, 27 Jun 2000 10:54:55 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
    charset="iso-8859-1"
Status: ROr

> > > The symlinks wouldn't do any good for what Bruce had in
> > > mind anyway (IIRC, he wanted to get useful per-database
> > > numbers from "du").
> >
> > Our database design seems to be in the opposite direction
> > if it is restricted for the convenience of command calls.
>
> Well, I don't see any reason not to use tablespace/database
> rather than just tablespace. Seems having fewer files in each directory

Once again - ability to use different tablespaces (disks) for tables/indices
in the same schema. Schemas must not dictate where to store objects <-
bad design.

> will be a little faster, and if we can make administration easier,
> why not?

Because you'll not be able to use du/ls once we implement new smgr anyway.

And, btw, - for what are we going to implement tablespaces? Just to have
fewer files in each dir ?!

Vadim

From pgsql-hackers-owner+M3925@hub.org Tue Jun 27 14:03:35 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA28748
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 14:03:34 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5RI1h139788;
    Tue, 27 Jun 2000 14:01:44 -0400 (EDT)
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5RI1I138791
    for <pgsql-hackers@postgresql.org>; Tue, 27 Jun 2000 14:01:18 -0400 (EDT)
Received: from regulus.student.UU.SE ([130.238.5.2]:59174 "EHLO
    regulus.its.uu.se") by merganser.its.uu.se with ESMTP
    id <S145539AbQF0SAu>; Tue, 27 Jun 2000 20:00:50 +0200
Received: from peter (helo=localhost)
    by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
    id 136zlm-0003zn-00; Tue, 27 Jun 2000 20:07:34 +0200
Date: Tue, 27 Jun 2000 20:07:34 +0200 (CEST)
From: Peter Eisentraut <peter_e@gmx.net>
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
cc: "'Hiroshi Inoue'" <Inoue@tpf.co.jp>, "'Tom Lane'" <tgl@sss.pgh.pa.us>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C35@SECTORBASE1>
Message-ID: <Pine.LNX.4.21.0006270326410.9749-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

Mikheev, Vadim writes:

> Do we need *both* database & tablespace to find table file ?!
> Imho, database shouldn't be used...

Then the system tables from different databases would collide.


--
Peter Eisentraut Sernanders väg 10:115
peter_e@gmx.net 75262 Uppsala
http://yi.org/peter-e/ Sweden

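Peter's objection can be illustrated with a short sketch. It assumes, as
his remark implies, that a system table such as pg_class has the same OID
in every database; the value 1259, the database OIDs, and the function
names are used here purely for illustration.

```python
def tablespace_only_path(tablespace_oid, database_oid, relation_oid):
    """Path scheme without a database component (database_oid is ignored)."""
    return f"tablespaces/{tablespace_oid}/{relation_oid}"


def tablespace_database_path(tablespace_oid, database_oid, relation_oid):
    """Path scheme that also includes the database."""
    return f"tablespaces/{tablespace_oid}/{database_oid}/{relation_oid}"


def has_collision(make_path, relations):
    """True if two (tablespace, database, relation) triples share one path."""
    paths = [make_path(*relation) for relation in relations]
    return len(set(paths)) < len(paths)
```

For the same system table in two databases sharing one tablespace, e.g.
(1, 500, 1259) and (1, 501, 1259), the tablespace-only scheme maps both to
one file while the scheme with a database component keeps them apart.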
From vmikheev@SECTORBASE.COM Tue Jun 27 15:28:25 2000
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA04820
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 15:28:24 -0400 (EDT)
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
    id <NYA9PARR>; Tue, 27 Jun 2000 12:20:20 -0700
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3A@SECTORBASE1>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Bruce Momjian'" <pgman@candle.pha.pa.us>
Cc: Hiroshi Inoue <Inoue@tpf.co.jp>, Tom Lane <tgl@sss.pgh.pa.us>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Peter Eisentraut
    <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Tue, 27 Jun 2000 12:22:13 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
    charset="iso-8859-1"
Status: ROr

> > > Well, I don't see any reason not to use tablespace/database
> > > rather than just tablespace. Seems having fewer files in
> > > each directory
> >
> > Once again - ability to use different tablespaces (disks)
> > for tables/indices in the same schema. Schemas must not dictate
> > where to store objects <- bad design.
>
> I am suggesting this symlink:
>
> ln -s data/base/testdb/myspace /var/myspace/testdb
>
> rather than:
>
> ln -s data/base/testdb/myspace /var/myspace
>
> Tablespaces still sit inside database directories, it is just that it
> points to a subdirectory of myspace, rather than myspace itself.
^^^^^^^^^^^

Didn't you mean

ln -s /var/myspace/testdb data/base/testdb/myspace

?

I thought that you don't like symlinks from data/base/... This is
how I understood Tom's words:

> The symlinks wouldn't do any good for what Bruce had in mind anyway
> (IIRC, he wanted to get useful per-database numbers from "du").

Vadim

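Vadim's correction turns on argument order: ln -s takes the existing target
first and the name of the link second, and Python's os.symlink(src, dst)
uses the same order. A minimal sketch, assuming a POSIX filesystem; the
function name and the temporary directory layout are hypothetical, with
only the testdb and myspace names taken from the message.

```python
import os
import tempfile


def link_tablespace(base_dir, database, tablespace, storage_root):
    """Equivalent of: ln -s STORAGE_ROOT/TABLESPACE/DATABASE BASE_DIR/DATABASE/TABLESPACE

    The real directory lives under storage_root; the entry under the
    database directory is the symlink that points at it.
    """
    target = os.path.join(storage_root, tablespace, database)
    link = os.path.join(base_dir, database, tablespace)
    os.makedirs(target, exist_ok=True)
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(target, link)      # first argument: target; second: link name
    return target, link
```

Files created through data/base/testdb/myspace then land physically in
/var/myspace/testdb, which is the direction Vadim's version of the command
expresses.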
From vmikheev@SECTORBASE.COM Tue Jun 27 15:43:31 2000
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA05148
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 15:43:30 -0400 (EDT)
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
    id <NYA9PASW>; Tue, 27 Jun 2000 12:35:41 -0700
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3C@SECTORBASE1>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Bruce Momjian'" <pgman@candle.pha.pa.us>
Cc: "'Peter Eisentraut'" <peter_e@gmx.net>,
    "'Hiroshi Inoue'"
    <Inoue@tpf.co.jp>,
    "'Tom Lane'" <tgl@sss.pgh.pa.us>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Tue, 27 Jun 2000 12:37:34 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
    charset="iso-8859-1"
Status: ROr

> > > Then the system tables from different databases would collide.
> >
> > Actually, if we're going to use unique-ids for file names
> > then we have to know how to get system file names anyway.
> > Hm, OID+VERSION would make our life easier... Hiroshi?
>
> I assume we were going to have a pg_class.relversion to do that, but
^^^^^^^^
PG_CLASS_OID.VERSION_ID...

Just a clarification -:)

> that is per-database because pg_class is per-database.

Vadim

From vmikheev@SECTORBASE.COM Tue Jun 27 15:48:31 2000
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA05452
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 15:48:30 -0400 (EDT)
Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
    id <NYA9PATN>; Tue, 27 Jun 2000 12:40:42 -0700
Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3D@SECTORBASE1>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'Bruce Momjian'" <pgman@candle.pha.pa.us>
Cc: "'Peter Eisentraut'" <peter_e@gmx.net>,
    "'Hiroshi Inoue'"
    <Inoue@tpf.co.jp>,
    "'Tom Lane'" <tgl@sss.pgh.pa.us>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: RE: [HACKERS] Big 7.1 open items
Date: Tue, 27 Jun 2000 12:42:35 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain;
    charset="iso-8859-1"
Status: ROr

> I actually meant I thought we were going to have a pg_class column
> called relversion that held the currently active version for that
> relation.
>
> Yes, the file name will be pg_class_oid.version_id.
>
> Is that OK?

We recently discussed pure *unique-id* file names...

Vadim

From pgsql-hackers-owner+M3939@hub.org Tue Jun 27 17:03:33 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA08565
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 17:03:32 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5RL2B155891;
    Tue, 27 Jun 2000 17:02:11 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5RL10155419
    for <pgsql-hackers@postgresql.org>; Tue, 27 Jun 2000 17:01:00 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11135;
    Tue, 27 Jun 2000 17:00:12 -0400 (EDT)
To: Peter Eisentraut <peter_e@gmx.net>
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <Pine.LNX.4.21.0006270326410.9749-100000@localhost.localdomain>
References: <Pine.LNX.4.21.0006270326410.9749-100000@localhost.localdomain>
Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
    message dated "Tue, 27 Jun 2000 20:07:34 +0200"
Date: Tue, 27 Jun 2000 17:00:11 -0400
Message-ID: <11132.962139611@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

Peter Eisentraut <peter_e@gmx.net> writes:
> Mikheev, Vadim writes:
>> Do we need *both* database & tablespace to find table file ?!
>> Imho, database shouldn't be used...

> Then the system tables from different databases would collide.

I've been assuming that we would create a separate tablespace for
each database, which would be the location of that database's
system tables.  It's probably also the default tablespace for user
tables created in that database, though it wouldn't have to be.

There should also be a known tablespace for the installation-wide tables
(pg_shadow et al).

With this approach tablespace+relation would indeed be a sufficient
identifier.  We could even eliminate the knowledge that certain
tables are installation-wide from the bufmgr and below (currently
that knowledge is hardwired in places that I'd rather didn't know
about it...)

                        regards, tom lane

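A minimal sketch of the tablespace+relation idea in the message above, assuming a simple name-to-directory registry. The tablespace names, directories, and the OIDs used here are hypothetical illustrations, not anything the thread settled on.

```python
from pathlib import PurePosixPath

# Hypothetical registry: tablespace name -> directory (all values are assumptions).
TABLESPACES = {
    "global": PurePosixPath("/pgdata/global"),       # installation-wide tables
    "home_db1": PurePosixPath("/pgdata/home_db1"),   # home tablespace of database 1
    "home_db2": PurePosixPath("/pgdata/home_db2"),   # home tablespace of database 2
    "fastdisk": PurePosixPath("/mnt/fast/pgspace"),  # extra tablespace for user tables
}


def relation_path(tablespace: str, rel_oid: int) -> PurePosixPath:
    """Locate a relation file from (tablespace, relation OID) alone."""
    return TABLESPACES[tablespace] / str(rel_oid)


PG_CLASS_OID = 1259  # illustrative value, used the same way in every database

# The same OID in two home tablespaces maps to two distinct files, so the
# per-database system tables do not collide and no database id is needed.
for space, oid in [("home_db1", PG_CLASS_OID), ("home_db2", PG_CLASS_OID),
                   ("global", 1260), ("fastdisk", 16384)]:
    print(relation_path(space, oid))
```

The lookup never consults a database identifier, which is the property the message argues would let the low-level code treat installation-wide relations like any other.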
From tgl@sss.pgh.pa.us Tue Jun 27 17:18:49 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09638
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 17:18:48 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11377;
    Tue, 27 Jun 2000 17:19:31 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Peter Eisentraut'" <peter_e@gmx.net>,
    "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006271952.PAA05609@candle.pha.pa.us>
References: <200006271952.PAA05609@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Tue, 27 Jun 2000 15:52:40 -0400"
Date: Tue, 27 Jun 2000 17:19:31 -0400
Message-ID: <11374.962140771@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Well, that would allow us to mix database files in the same directory,
> if we wanted to do that.  My opinion it is better to keep databases in
> separate directories in each tablespace for clarity and performance
> reasons.

One reason not to do that is that we'd still have to special-case
the system-wide relations.  If it's just tablespace and OID in the
path, then the system-wide rels look just the same as any other rel
as far as the low-level stuff is concerned.  That would be nice.

My feeling about the "clarity and performance" issue is that if a
dbadmin wants to keep track of database contents separately, he can
put different databases' tables into different tablespaces to start
with.  If he puts several tables into one tablespace, he's saying
he doesn't care about distinguishing their space usage.  There's
no reason for us to force an additional level of directory lookup
to be done whether the admin wants it or not.

                        regards, tom lane

From tgl@sss.pgh.pa.us Tue Jun 27 17:29:35 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09909
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 17:29:33 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA13026;
    Tue, 27 Jun 2000 17:30:18 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Peter Eisentraut'" <peter_e@gmx.net>,
    "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006272123.RAA09720@candle.pha.pa.us>
References: <200006272123.RAA09720@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Tue, 27 Jun 2000 17:23:49 -0400"
Date: Tue, 27 Jun 2000 17:30:17 -0400
Message-ID: <13018.962141417@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Yes, good point about pg_shadow.  They don't have databases.  How do we
> get multiple pg_class tables in the same directory?  Is the
> pg_class.relversion file a number like 1,2,3,4, or does it come out of
> some global counter like oid.  If so, we could put them in the same
> directory.

I think we could get away with insisting that each database store its
pg_class and friends in a separate tablespace (physically distinct
directory) from any other database.  That gets around the OID conflict.

It's still an open question whether OID+version is better than
unique-ID for naming files that belong to different versions of the
same relation.  I can see arguments on both sides.

                        regards, tom lane

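To make the two naming candidates in the message above concrete, here is a small sketch under stated assumptions: the counter start value, the function names, and the exact "OID.VERSION" spelling are hypothetical and only illustrate the trade-off being debated.

```python
import itertools

# Hypothetical installation-wide counter, standing in for "some global counter".
_unique_ids = itertools.count(50000)


def oid_version_name(rel_oid: int, version: int) -> str:
    """OID+version scheme: the file name still reveals which relation it belongs to."""
    return f"{rel_oid}.{version}"


def unique_id_name() -> str:
    """Unique-ID scheme: every physical file gets a fresh, otherwise meaningless id."""
    return str(next(_unique_ids))


rel_oid = 16384  # illustrative relation OID

# A rewrite of the relation produces a second physical file under either scheme.
old_a, new_a = oid_version_name(rel_oid, 1), oid_version_name(rel_oid, 2)
old_b, new_b = unique_id_name(), unique_id_name()

print(old_a, "->", new_a)  # relation OID is recoverable from the name
print(old_b, "->", new_b)  # name must be looked up in a catalog to find the relation
```

Under the first scheme the relation can be recovered from the file name alone; under the second, the mapping lives only in the catalog.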
From Inoue@tpf.co.jp Tue Jun 27 19:13:30 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA12791
    for <pgman@candle.pha.pa.us>; Tue, 27 Jun 2000 19:13:28 -0400 (EDT)
Received: from tpf.co.jp ([126.0.1.56] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
    id IAA01830; Wed, 28 Jun 2000 08:13:26 +0900
Message-ID: <395935CB.2CC10452@tpf.co.jp>
Date: Wed, 28 Jun 2000 08:16:27 +0900
From: Hiroshi Inoue <Inoue@tpf.co.jp>
X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
X-Accept-Language: ja
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
CC: Bruce Momjian <pgman@candle.pha.pa.us>,
    "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Peter Eisentraut'" <peter_e@gmx.net>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
References: <200006272123.RAA09720@candle.pha.pa.us> <13018.962141417@sss.pgh.pa.us>
Content-Type: text/plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
Status: RO

Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Yes, good point about pg_shadow.  They don't have databases.  How do we
> > get multiple pg_class tables in the same directory?  Is the
> > pg_class.relversion file a number like 1,2,3,4, or does it come out of
> > some global counter like oid.  If so, we could put them in the same
> > directory.
>
> I think we could get away with insisting that each database store its
> pg_class and friends in a separate tablespace (physically distinct
> directory) from any other database.  That gets around the OID conflict.
>
> It's still an open question whether OID+version is better than
> unique-ID for naming files that belong to different versions of the
> same relation.  I can see arguments on both sides.
>

I'm not insisting on unique-ID. My main point has always been
transactional control of file allocation changes.
However, *VERSION(_ID)* may be misleading, because it cannot
mean the version of the pg_class tuples.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From tgl@sss.pgh.pa.us Wed Jun 28 12:10:59 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA11316
    for <pgman@candle.pha.pa.us>; Wed, 28 Jun 2000 12:10:58 -0400 (EDT)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA15790;
    Wed, 28 Jun 2000 12:11:40 -0400 (EDT)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Peter Eisentraut'" <peter_e@gmx.net>,
    "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
In-reply-to: <200006281425.KAA05633@candle.pha.pa.us>
References: <200006281425.KAA05633@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Wed, 28 Jun 2000 10:25:21 -0400"
Date: Wed, 28 Jun 2000 12:11:40 -0400
Message-ID: <15787.962208700@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> If we put multiple database tables in the same directory, have we
> considered how to drop databases?  Right now we do rm -rf:

rm -rf will no longer work in a tablespaces environment anyway.
(Even if you kept symlinks underneath the DB directory, rm -rf
wouldn't follow them.)

DROP DATABASE will have to be implemented honestly: run through
pg_class and do a regular DROP on each user table.

Once you've got rid of the user tables, rm -rf should suffice to
get rid of the "home tablespace" as I've been calling it, with
all the system tables therein.

Now that you mention it, this is another reason why system tables for
each database have to live in a separate tablespace directory: there's
no other good way to do that final stage of DROP DATABASE.  The
DROP-each-table approach doesn't work for system tables (somewhere along
about the point where you drop pg_attribute, DROP TABLE itself would
stop working ;-)).

However I do see a bit of a problem here: since DROP DATABASE is
ordinarily executed by a backend that's running in a different database,
how's it going to read pg_class of the target database?  Perhaps it will
be necessary to fire up a sub-backend that runs in the target DB for
long enough to kill all the user tables.  Looking messy...

                        regards, tom lane

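The two-stage DROP DATABASE described above can be sketched on a throwaway directory tree. This is only an illustration under assumptions: the directory names, file names, and the `drop_database` helper are hypothetical, and a real backend would find the user tables by reading the target database's pg_class rather than being handed a list.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def drop_database(home_tablespace: Path, user_table_files: List[Path]) -> None:
    """Two-stage drop: per-table removal first, then wipe the home tablespace."""
    # Stage 1: user tables may live in tablespaces shared with other databases,
    # so each one has to be removed individually.
    for table_file in user_table_files:
        table_file.unlink()
    # Stage 2: system tables cannot be dropped one at a time, so the whole
    # home tablespace directory goes away in one step (the "rm -rf" stage).
    shutil.rmtree(home_tablespace)


# Build a hypothetical layout: one home tablespace plus one shared tablespace.
root = Path(tempfile.mkdtemp())
home = root / "home_testdb"
shared = root / "shared_space"
home.mkdir()
shared.mkdir()
(home / "1259").write_text("system table")
(home / "1249").write_text("system table")
user_table = shared / "16384"
user_table.write_text("testdb user table")
other_db_table = shared / "20000"
other_db_table.write_text("table of another database")

drop_database(home, [user_table])
print(sorted(p.name for p in shared.iterdir()))  # the other database's table survives
```

The sketch also shows why a plain recursive delete of the database directory is not enough once tables can live outside it: the file in the shared tablespace would be left behind.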
From pgsql-hackers-owner+M3998@hub.org Wed Jun 28 19:53:28 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA27612
    for <pgman@candle.pha.pa.us>; Wed, 28 Jun 2000 19:53:27 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5SNqG142069;
    Wed, 28 Jun 2000 19:52:17 -0400 (EDT)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5SNp7137729
    for <pgsql-hackers@postgresql.org>; Wed, 28 Jun 2000 19:51:07 -0400 (EDT)
Received: from tpf.co.jp ([126.0.1.56] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
    id IAA03041; Thu, 29 Jun 2000 08:50:01 +0900
Message-ID: <395A8FDF.1132EC6D@tpf.co.jp>
Date: Thu, 29 Jun 2000 08:53:03 +0900
From: Hiroshi Inoue <Inoue@tpf.co.jp>
X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
X-Accept-Language: ja
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
CC: Bruce Momjian <pgman@candle.pha.pa.us>,
    "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Peter Eisentraut'" <peter_e@gmx.net>,
    Thomas Lockhart <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: Re: [HACKERS] Big 7.1 open items
References: <EKEJJICOHDIEMGPNIFIJGEHCCCAA.Inoue@tpf.co.jp> <16404.962213972@sss.pgh.pa.us>
Content-Type: text/plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

Tom Lane wrote:

> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Why do we have to have system tables per *database* ?
> > Is there anything wrong with global system tables ?
> > And how about adding dbid to pg_class,pg_proc etc ?
>
> We could, but I think I'd vote against it on two grounds:
>
> 1. Reliability.  If something corrupts pg_class, do you want to
> lose your whole installation, or just one database?
>
> 2. Increased locking overhead/loss of concurrency.  Currently, there
> is very little lock contention between backends running in different
> databases.  A shared pg_class will be a single point of locking (as
> well as a single point of failure) for the whole installation.

Isn't the current design of PG's *database* there so that dropdb can
use "rm -rf", rather than for reasons 1 and 2 above?
If we can't rely on our db itself and our locking mechanism is
poor, we could start different postmasters for different *database*s.

> It would solve the DROP DATABASE problem kind of nicely, but really
> it'd just be downgrading DROP DATABASE to a DROP SCHEMA operation...
>

What is our *DATABASE*?
Is it clear to everyone?
At least to me it's a vague concept.
Could you please tell me what kinds of objects are *DATABASE* objects
in our system but could not be schema objects?

Regards.

Hiroshi Inoue

From pgsql-hackers-owner+M4003@hub.org Thu Jun 29 10:41:19 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA28321
    for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 10:39:57 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5T7nr158743;
    Thu, 29 Jun 2000 03:49:53 -0400 (EDT)
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5T7io146030
    for <pgsql-hackers@postgresql.org>; Thu, 29 Jun 2000 03:44:51 -0400 (EDT)
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
    by gandalf.it-austria.net (xxx/xxx) with ESMTP id JAA46266;
    Thu, 29 Jun 2000 09:43:20 +0200
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
    id <M6F0BFSR>; Thu, 29 Jun 2000 09:43:20 +0200
Message-ID: <219F68D65015D011A8E000006F8590C605BA59A8@sdexcsrv1.f000.d0188.sd.spardat.at>
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
To: "'Bruce Momjian'" <pgman@candle.pha.pa.us>
Cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    Hiroshi Inoue
    <Inoue@tpf.co.jp>, Tom Lane <tgl@sss.pgh.pa.us>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Peter Eisentraut <peter_e@gmx.net>, Jan Wieck <JanWieck@Yahoo.com>,
    PostgreSQL-development
    <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: AW: AW: [HACKERS] Big 7.1 open items
Date: Thu, 29 Jun 2000 09:43:14 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain;
    charset="windows-1252"
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO


> > ln -s data/base/testdb/myspace/extent1 /var/myspace/extent1/testdb
>
> The idea was to put the main files in the directory, and create Extent2,
> Extent3 directories for the extents.

The reasoning was that the database subdir should be below the extent dir,
so that creating a different fs for each extent would be easier and would
not depend on the database name.

It is easy to create a fs for:
    /var/myspace
or
    /var/myspace[/extent1]
    /var/myspace/extent2
but not if the path has the dbname in it.

Andreas

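The layout argued for above (database subdirectory below the extent directory) can be sketched as two path helpers. The function names are hypothetical, and the `extentN` naming simply follows the example paths in the message.

```python
from pathlib import PurePosixPath


def extent_mount_point(space_root: str, extent: int) -> PurePosixPath:
    """Directory an admin could back with its own filesystem; no dbname in it."""
    return PurePosixPath(space_root) / f"extent{extent}"


def database_dir(space_root: str, extent: int, dbname: str) -> PurePosixPath:
    """Per-database subdirectory, placed below the extent directory."""
    return extent_mount_point(space_root, extent) / dbname


for extent in (1, 2):
    print(extent_mount_point("/var/myspace", extent),
          "->", database_dir("/var/myspace", extent, "testdb"))
```

Because the mount points are computed without the database name, filesystems can be prepared for each extent before any database exists.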
From ZeugswetterA@wien.spardat.at Thu Jun 29 06:34:49 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA25201
    for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 06:34:44 -0400 (EDT)
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id GAA00379 for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 06:35:30 -0400 (EDT)
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
    by gandalf.it-austria.net (xxx/xxx) with ESMTP id MAA33950;
    Thu, 29 Jun 2000 12:33:42 +0200
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
    id <M6F0BHT4>; Thu, 29 Jun 2000 12:33:42 +0200
Message-ID: <219F68D65015D011A8E000006F8590C605BA59AC@sdexcsrv1.f000.d0188.sd.spardat.at>
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>
Cc: "'Bruce Momjian'" <pgman@candle.pha.pa.us>,
    Peter Eisentraut
    <peter_e@gmx.net>,
    "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
    "'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
    Thomas Lockhart
    <lockhart@alumni.caltech.edu>,
    Jan Wieck <JanWieck@yahoo.com>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>,
    "Ross J. Reedstrom" <reedstrm@rice.edu>
Subject: AW: AW: [HACKERS] Big 7.1 open items
Date: Thu, 29 Jun 2000 12:33:39 +0200
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
Content-Type: text/plain;
    charset="iso-8859-1"
Status: RO


> > > I think I would prefer the ability to place more than one
> > > database into the same tablespace.
> >
> > You can put user tables from multiple databases into the same
> > tablespace, under this proposal.  Just not system tables.
>
> Yes, but then it is only half baked.

Half baked or not, I think I am starting to like it.
I think I would restrict such an automagically created tablespace
(tblspace name = db name) to only contain tables from this database.

Andreas

From pgsql-hackers-owner+M4019@hub.org Thu Jun 29 13:24:36 2000
Received: from hub.org (root@hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA08070
    for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 13:24:35 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
    by hub.org (8.10.1/8.10.1) with SMTP id e5THLf102550;
    Thu, 29 Jun 2000 13:21:41 -0400 (EDT)
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
    by hub.org (8.10.1/8.10.1) with ESMTP id e5THL1197262
    for <pgsql-hackers@postgresql.org>; Thu, 29 Jun 2000 13:21:01 -0400 (EDT)
Received: from regulus.student.UU.SE ([130.238.5.2]:50625 "EHLO
    regulus.its.uu.se") by merganser.its.uu.se with ESMTP
    id <S98439AbQF2RU2>; Thu, 29 Jun 2000 19:20:28 +0200
Received: from peter (helo=localhost)
    by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
    id 137i5r-0000BK-00; Thu, 29 Jun 2000 19:27:15 +0200
Date: Thu, 29 Jun 2000 19:27:15 +0200 (CEST)
From: Peter Eisentraut <peter_e@gmx.net>
To: Hiroshi Inoue <Inoue@tpf.co.jp>
cc: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
    "'Mikheev, Vadim'" <vmikheev@SECTORBASE.COM>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: AW: [HACKERS] Big 7.1 open items
In-Reply-To: <3959D7CF.E447565@tpf.co.jp>
Message-ID: <Pine.LNX.4.21.0006290401170.360-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: RO

Hiroshi Inoue writes:

> According to your another posting,your *database* hierarchy is
>     instance -> database -> schema -> object
> like Oracle.
>
> However SQL92 seems to have another hierarchy:
>     cluster -> catalog -> schema -> object
> and dot notation catalog.schema.object could be used.

FYI:

An "instance" is a "cluster". I don't know where the word instance came
from; the docs sometimes call it "installation" or "site", which is even
worse. I have been using "database cluster" for the latest documentation
work. My dictionary defines a cluster as "a group of things gathered or
occurring closely together", which is what this is. Call it a "data area"
or an "initdb'ed thing", etc.

A "catalog" can be equated with our "database". The method of creating
catalogs is implementation defined, so our CREATE DATABASE command is in
perfect compliance with the standard. We don't support the
catalog.schema.object notation, but that notation only makes sense when you
can access more than one catalog at a time. We don't allow that and SQL
doesn't require it. We could allow that notation and throw an error when
the catalog name doesn't match the current database, but that's mere
cosmetic work.

In entry level SQL 92, a "schema" is essentially the same as table
ownership. You can execute the command CREATE SCHEMA AUTHORIZATION
"peter", which means that user "peter" (where he came from is
"implementation-defined") can now create tables under his name. There is
no such thing as a table owner; there's the "containing schema" and its
owner. The tables "peter" creates can then be referenced by the dotted
notation. But it is not correct to equate this with CREATE USER. Even if
there was no schema for "peter" he could still connect and query other
people's tables.

Moving beyond SQL 92 you can also create schemas with a different name
than your user name. This is merely a little more naming flexibility.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden

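The "allow the notation and throw an error when the catalog name doesn't match the current database" idea from the message above can be sketched as a small resolver. The function name, the tuple result, and the default-schema parameter are hypothetical; quoted identifiers are deliberately not handled.

```python
from typing import Tuple


def resolve_name(name: str, current_db: str, default_schema: str) -> Tuple[str, str]:
    """Resolve object, schema.object or catalog.schema.object to (schema, object)."""
    parts = name.split(".")
    if len(parts) == 3:
        catalog, schema, obj = parts
        # Only one catalog is reachable per connection, so the catalog part is
        # accepted purely as a cross-check against the current database.
        if catalog != current_db:
            raise ValueError(f"cannot reference catalog {catalog!r} from {current_db!r}")
        return schema, obj
    if len(parts) == 2:
        return parts[0], parts[1]
    if len(parts) == 1:
        return default_schema, parts[0]
    raise ValueError(f"too many name components in {name!r}")


print(resolve_name("testdb.peter.orders", "testdb", "peter"))
print(resolve_name("orders", "testdb", "peter"))
try:
    resolve_name("otherdb.peter.orders", "testdb", "peter")
except ValueError as exc:
    print("rejected:", exc)
```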
From peter@localhost.its.uu.se Thu Jun 29 19:25:40 2000
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00202
|
||
for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 19:25:39 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:52854 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S274570AbQF2XZ1>; Fri, 30 Jun 2000 01:25:27 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 137nnA-00023q-00; Fri, 30 Jun 2000 01:32:20 +0200
|
||
Date: Fri, 30 Jun 2000 01:32:20 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
|
||
"'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
|
||
Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <17726.962240702@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0006300041120.397-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
Sender: Peter Eisentraut <peter@candle.pha.pa.us>
|
||
Status: RO
|
||
|
||
Tom Lane writes:
|
||
|
||
> You can put *user* tables from more than one database into a table space.
|
||
> The restriction is just on *system* tables.
|
||
|
||
I think my understanding as a user would be that a table space represents
|
||
a storage location. If I want to put a table/object/entire database on a
|
||
fancy disk somewhere I create a table space for it there. But if I want to
|
||
store all my stuff under /usr/local/pgsql/data then I wouldn't expect to
|
||
have to create more than one table space. So the table spaces become at
|
||
that point affected by the logical hierarchy: I must make sure to have
|
||
enough table spaces to have many databases.
|
||
|
||
More specifically, what would the user interface to this look like?
|
||
Clearly there has to be some sort of CREATE TABLESPACE command. Now does
|
||
CREATE DATABASE imply a CREATE TABLESPACE? I think not. Do you have to
|
||
create a table space before creating each database? I think not.
|
||
|
||
> We could avoid it along the lines you suggest (name table files like
|
||
> DBOID.RELOID.VERSION instead of just RELOID.VERSION) but is it really
|
||
> worth it?
|
||
|
||
I only intended that for pg_class and other bootstrap-sort-of tables,
|
||
maybe all system tables. Normal heap files could look like RELOID.VERSION,
|
||
whereas system tables would look like "name.DBOID". Clearly there's no
|
||
market for renaming system tables or dropping any of their columns. We're
|
||
obviously going to have to treat pg_class special anyway.
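A minimal sketch of the two naming schemes mentioned here, assuming a hypothetical helper proposed_file_name (not an existing PostgreSQL function) and made-up OID values. Ordinary heap files are named RELOID.VERSION, while system tables are named name.DBOID:

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * Writes the file name into buf; returns 0 on success, -1 if it does not fit.
 */
int
proposed_file_name(char *buf, size_t buflen, int is_system,
                   const char *relname, unsigned int dboid,
                   unsigned int reloid, unsigned int version)
{
    int n;

    if (is_system)
        n = snprintf(buf, buflen, "%s.%u", relname, dboid);   /* name.DBOID */
    else
        n = snprintf(buf, buflen, "%u.%u", reloid, version);  /* RELOID.VERSION */

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

In this sketch the caller has to say whether the relation is a system table and has to supply its name, so whichever layer builds file names needs that information.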
|
||
|
||
> Vadim's concerned about every byte that has to go into the WAL log,
|
||
> and I think he's got a good point.
|
||
|
||
True. But if you only do it for the system tables then it might take less
|
||
space than keeping track of lots of table spaces that are unneeded. :-)
|
||
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
|
||
From pgsql-hackers-owner+M4032@hub.org Thu Jun 29 20:12:39 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00852
|
||
for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 20:12:38 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5TNwm184774;
|
||
Thu, 29 Jun 2000 19:58:48 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5TNvD180670
|
||
for <pgsql-hackers@postgresql.org>; Thu, 29 Jun 2000 19:57:14 -0400 (EDT)
|
||
Received: from tpf.co.jp ([126.0.1.56] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
|
||
id IAA04081; Fri, 30 Jun 2000 08:56:46 +0900
|
||
Message-ID: <395BE2F5.687E90B0@tpf.co.jp>
|
||
Date: Fri, 30 Jun 2000 08:59:49 +0900
|
||
From: Hiroshi Inoue <Inoue@tpf.co.jp>
|
||
X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
|
||
X-Accept-Language: ja
|
||
MIME-Version: 1.0
|
||
To: Peter Eisentraut <peter_e@gmx.net>
|
||
CC: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
|
||
"'Mikheev, Vadim'" <vmikheev@SECTORBASE.COM>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: AW: [HACKERS] Big 7.1 open items
|
||
References: <Pine.LNX.4.21.0006290401170.360-100000@localhost.localdomain>
|
||
Content-Type: text/plain; charset=iso-2022-jp
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Peter Eisentraut wrote:
|
||
|
||
> Hiroshi Inoue writes:
|
||
>
|
||
> > According to another posting of yours, your *database* hierarchy is
|
||
> > instance -> database -> schema -> object
|
||
> > like Oracle.
|
||
> >
|
||
> > However SQL92 seems to have another hierarchy:
|
||
> > cluster -> catalog -> schema -> object
|
||
> > and dot notation catalog.schema.object could be used.
|
||
>
|
||
> FYI:
|
||
|
||
Thanks.
|
||
I'm asking everyone what our *DATABASE* is.
|
||
Unlike you, I couldn't see any decisive feature in our *DATABASE*.
|
||
|
||
>
|
||
>
|
||
> An "instance" is a "cluster". I don't know where the word instance came
|
||
|
||
I could find the word in Oracle.
|
||
IMHO, it corresponds to our initdb'ed thing (what a postmaster controls).
|
||
|
||
>
|
||
> from, the docs sometimes call it "installation" or "site", which is even
|
||
> worse. I have been using "database cluster" for the latest documentation
|
||
> work. My dictionary defines a cluster as "a group of things gathered or
|
||
> occurring closely together", which is what this is. Call it a "data area"
|
||
> or an "initdb'ed thing", etc.
|
||
>
|
||
|
||
SQL92 seems to say that a cluster corresponds to a target of connection
|
||
and has no name (after the connection is established). Isn't it the same as our
|
||
*DATABASE* ?
|
||
|
||
>
|
||
> A "catalog" can be equated with our "database". The method of creating
|
||
> catalogs is implementation defined, so our CREATE DATABASE command is in
|
||
> perfect compliance with the standard. We don't support the
|
||
> catalog.schema.object notation but that notation only makes sense when you
|
||
> can access more than one catalog at a time.
|
||
|
||
Yes, the most essential point is that we can't access more than one catalog.
|
||
This means that we have only one (noname) "catalog" per "cluster".
|
||
|
||
> We don't allow that and SQL
|
||
> doesn't require it. We could allow that notation and throw an error when
|
||
> the catalog name doesn't match the current database, but that's mere
|
||
> cosmetic work.
|
||
>
|
||
> In entry level SQL 92, a "schema" is essentially the same as table
|
||
> ownership. You can execute the command CREATE SCHEMA AUTHORIZATION
|
||
> "peter", which means that user "peter" (where he came from is
|
||
> "implementation-defined") can now create tables under his name. There is
|
||
> no such thing as a table owner, there's the "containing schema" and its
|
||
> owner. The tables "peter" creates can then be referenced by the dotted
|
||
> notation. But it is not correct to equate this with CREATE USER. Even if
|
||
> there was no schema for "peter" he could still connect and query other
|
||
> people's tables.
|
||
>
|
||
|
||
I've used *username* "schema"s in Oracle for a long time but I've never
|
||
thought that it's the essence of "schema". If I recognize correctly, the
|
||
concept of "catalog" hasn't necessarily been important while "schema"
|
||
= "user". The conflict of "schema" name is equivalent to the conflict
|
||
of "user" name if "schema" = "user". IMHO, SQL92 has required the
|
||
concept of "catalog" because "schema" has been changed to be
|
||
independent of "user".
|
||
|
||
Anyway in current PG "cluster":"catalog":"schema"=1:1:1(0) and
|
||
our *DATABASE* is just a confusing concept in the hierarchy.
|
||
|
||
Regards,
|
||
|
||
Hiroshi Inoue
|
||
|
||
|
||
|
||
From tgl@sss.pgh.pa.us Thu Jun 29 20:42:56 2000
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00958
|
||
for <pgman@candle.pha.pa.us>; Thu, 29 Jun 2000 20:42:55 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id UAA02520;
|
||
Thu, 29 Jun 2000 20:43:32 -0400 (EDT)
|
||
To: Peter Eisentraut <peter_e@gmx.net>
|
||
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
|
||
"'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
|
||
Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <Pine.LNX.4.21.0006300041120.397-100000@localhost.localdomain>
|
||
References: <Pine.LNX.4.21.0006300041120.397-100000@localhost.localdomain>
|
||
Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
|
||
message dated "Fri, 30 Jun 2000 01:32:20 +0200"
|
||
Date: Thu, 29 Jun 2000 20:43:32 -0400
|
||
Message-ID: <2517.962325812@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
Peter Eisentraut <peter_e@gmx.net> writes:
|
||
> Tom Lane writes:
|
||
>> You can put *user* tables from more than one database into a table space.
|
||
>> The restriction is just on *system* tables.
|
||
|
||
> More specifically, what would the user interface to this look like?
|
||
> Clearly there has to be some sort of CREATE TABLESPACE command. Now does
|
||
> CREATE DATABASE imply a CREATE TABLESPACE? I think not. Do you have to
|
||
> create a table space before creating each database? I think not.
|
||
|
||
I would say that CREATE DATABASE just implicitly creates a new
|
||
tablespace that's physically located right under the toplevel data
|
||
directory of the installation, no symlink. What's wrong with that?
|
||
You need not keep anything except the system tables of the DB there
|
||
if you don't want to. In practice, for someone who doesn't need to
|
||
worry about tablespaces (because they put the installation on a disk
|
||
with enough room for their purposes), the whole thing acts exactly
|
||
the same as it does now.
|
||
|
||
>> We could avoid it along the lines you suggest (name table files like
|
||
>> DBOID.RELOID.VERSION instead of just RELOID.VERSION) but is it really
|
||
>> worth it?
|
||
|
||
> I only intended that for pg_class and other bootstrap-sort-of tables,
|
||
> maybe all system tables. Normal heap files could look like RELOID.VERSION,
|
||
> whereas system tables would look like "name.DBOID".
|
||
|
||
That would imply that the very bottom levels of the system know all
|
||
about which tables are system tables and which are not (and, if you
|
||
are really going to insist on the "name" part of that, that they
|
||
know what name goes with each system-table OID). I'd prefer to avoid
|
||
that. The less the smgr knows about the upper levels of the system,
|
||
the better.
|
||
|
||
> Clearly there's no market for renaming system tables or dropping any
|
||
> of their columns.
|
||
|
||
No, but there is a market for compacting indexes on system relations,
|
||
and I haven't heard a good proposal for doing index compaction in place.
|
||
So we need versioning for system indexes.
|
||
|
||
>> Vadim's concerned about every byte that has to go into the WAL log,
|
||
>> and I think he's got a good point.
|
||
|
||
> True. But if you only do it for the system tables then it might take less
|
||
> space than keeping track of lots of table spaces that are unneeded. :-)
|
||
|
||
Again, WAL should not need to distinguish system and user tables.
|
||
|
||
And as for the keeping track, the tablespace OID will simply replace the
|
||
database OID in the log and in the smgr interfaces. There's no "extra"
|
||
cost, except maybe by comparison to a system with neither tablespaces
|
||
nor multiple databases.
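A minimal sketch of that point, assuming 32-bit OIDs and two hypothetical tag structs (these are not the actual smgr or WAL structures). The tablespace OID takes the slot the database OID would otherwise occupy, so the tag stays the same size:

```c
#include <stdint.h>

typedef uint32_t Oid;           /* assumed 32-bit object identifier */

/* Hypothetical tag when files are located by database. */
typedef struct
{
    Oid database;
    Oid relation;
} DbRelTag;

/* Hypothetical tag with the tablespace OID in the database OID's slot. */
typedef struct
{
    Oid tablespace;
    Oid relation;
} SpcRelTag;

/* Two tags name the same file only if both members match. */
int
spc_tag_equal(SpcRelTag a, SpcRelTag b)
{
    return a.tablespace == b.tablespace && a.relation == b.relation;
}
```

Both structs hold two OIDs, so in this sketch neither a log record nor an smgr call would carry more bytes than before.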
|
||
|
||
regards, tom lane
|
||
|
||
From peter@localhost.its.uu.se Sat Jul 1 10:39:11 2000
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA02996
|
||
for <pgman@candle.pha.pa.us>; Sat, 1 Jul 2000 10:39:10 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:50862 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S110734AbQGAO4t>; Sat, 1 Jul 2000 16:56:49 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 138Oo3-0003UQ-00; Sat, 01 Jul 2000 17:03:43 +0200
|
||
Date: Sat, 1 Jul 2000 17:03:42 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
|
||
"'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
|
||
Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <2517.962325812@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0007011653280.13037-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
Sender: Peter Eisentraut <peter@candle.pha.pa.us>
|
||
Status: RO
|
||
|
||
Tom Lane writes:
|
||
|
||
> In practice, for someone who doesn't need to worry about tablespaces
|
||
> (because they put the installation on a disk with enough room for
|
||
> their purposes), the whole thing acts exactly the same as it does now.
|
||
|
||
But I'd venture the guess that for someone who wants to use tablespaces it
|
||
wouldn't work as expected. Table spaces should represent a physical
|
||
storage location. Creation of table spaces should be a restricted
|
||
operation, possibly more than, but at least differently from, databases.
|
||
Eventually, table spaces probably will have attributes, such as
|
||
optimization parameters (random_page_cost). This will not work as expected
|
||
if you intermix them with the databases.
|
||
|
||
I'd expect that if I have three disks and 50 databases, then I make three
|
||
tablespaces and assign the databases to them. I'll bet lunch that if we
|
||
don't do it that way that before long people will come along and ask for
|
||
something that does work this way.
|
||
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
From pgsql-hackers-owner+M4066@hub.org Sat Jul 1 13:21:39 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA03777
|
||
for <pgman@candle.pha.pa.us>; Sat, 1 Jul 2000 13:21:38 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e61He8S63312;
|
||
Sat, 1 Jul 2000 13:40:08 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e61Hd7S58820
|
||
for <pgsql-hackers@postgresql.org>; Sat, 1 Jul 2000 13:39:07 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA22822;
|
||
Sat, 1 Jul 2000 13:37:21 -0400 (EDT)
|
||
To: Peter Eisentraut <peter_e@gmx.net>
|
||
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
|
||
"'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
|
||
Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-reply-to: <Pine.LNX.4.21.0007011653280.13037-100000@localhost.localdomain>
|
||
References: <Pine.LNX.4.21.0007011653280.13037-100000@localhost.localdomain>
|
||
Comments: In-reply-to Peter Eisentraut <peter_e@gmx.net>
|
||
message dated "Sat, 01 Jul 2000 17:03:42 +0200"
|
||
Date: Sat, 01 Jul 2000 13:37:21 -0400
|
||
Message-ID: <22819.962473041@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Peter Eisentraut <peter_e@gmx.net> writes:
|
||
> I'd expect that if I have three disks and 50 databases, then I make three
|
||
> tablespaces and assign the databases to them.
|
||
|
||
In our last installment, you were complaining that you didn't want to
|
||
be bothered with that ;-)
|
||
|
||
But I don't see any reason why CREATE DATABASE couldn't take optional
|
||
parameters indicating where to create the new DB's default tablespace.
|
||
We already have a LOCATION option for it that does something close to
|
||
that.
|
||
|
||
Come to think of it, it would probably make sense to adapt the existing
|
||
notion of "location" (cf initlocation script) into something meaning
|
||
"directory that users are allowed to create tablespaces (including
|
||
databases) in". If there were an explicit table of allowed locations,
|
||
it could be used to address the protection issues you raise --- for
|
||
example, a location could be restricted so that only some users could
|
||
create tablespaces/databases in it. $PGDATA/data would be just the
|
||
first location in every installation.
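A minimal sketch of such a table of allowed locations, assuming a hypothetical Location struct and may_create_in check; the paths and user names used with it are made up for illustration, and nothing here is existing PostgreSQL code:

```c
#include <stddef.h>
#include <string.h>

/* One row of the hypothetical "allowed locations" table. */
typedef struct
{
    const char *path;       /* directory registered as a location */
    const char *only_user;  /* NULL means any user may create here */
} Location;

/*
 * Returns 1 if user may create a tablespace/database in path, else 0.
 * Directories that are not listed are never allowed.
 */
int
may_create_in(const Location *locs, size_t nlocs,
              const char *path, const char *user)
{
    size_t i;

    for (i = 0; i < nlocs; i++)
    {
        if (strcmp(locs[i].path, path) != 0)
            continue;
        return locs[i].only_user == NULL ||
               strcmp(locs[i].only_user, user) == 0;
    }
    return 0;
}
```

A single owner per location is the simplest possible restriction; a real table would presumably need a list of users or some broader privilege scheme.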
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M4078@hub.org Sun Jul 2 11:16:52 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA14294
|
||
for <pgman@candle.pha.pa.us>; Sun, 2 Jul 2000 11:16:51 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e62FGqS51200;
|
||
Sun, 2 Jul 2000 11:16:52 -0400 (EDT)
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e62FGaS50925
|
||
for <pgsql-hackers@postgresql.org>; Sun, 2 Jul 2000 11:16:36 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:52424 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S53395AbQGBPP5>; Sun, 2 Jul 2000 17:15:57 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 138lZz-0001VD-00; Sun, 02 Jul 2000 17:22:43 +0200
|
||
Date: Sun, 2 Jul 2000 17:22:43 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>,
|
||
"'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
|
||
Thomas Lockhart <lockhart@alumni.caltech.edu>,
|
||
Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: Re: [HACKERS] Big 7.1 open items
|
||
In-Reply-To: <22819.962473041@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0007021716140.351-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Tom Lane writes:
|
||
|
||
> Come to think of it, it would probably make sense to adapt the existing
|
||
> notion of "location" (cf initlocation script) into something meaning
|
||
> "directory that users are allowed to create tablespaces (including
|
||
> databases) in".
|
||
|
||
This is what I've been trying to push all along. But note that this
|
||
mechanism does allow multiple databases per location. :)
|
||
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
From ZeugswetterA@wien.spardat.at Mon Jul 3 04:30:07 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA16088
|
||
for <pgman@candle.pha.pa.us>; Mon, 3 Jul 2000 04:30:05 -0400 (EDT)
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id EAA19031 for <pgman@candle.pha.pa.us>; Mon, 3 Jul 2000 04:30:07 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA28416;
|
||
Mon, 3 Jul 2000 10:28:06 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F0B65Y>; Mon, 3 Jul 2000 10:28:06 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C605BA59B0@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Hiroshi Inoue'" <Inoue@seiren.co.jp>,
|
||
Peter Eisentraut
|
||
<peter_e@gmx.net>, Tom Lane <tgl@sss.pgh.pa.us>
|
||
Cc: Bruce Momjian <pgman@candle.pha.pa.us>, Jan Wieck <JanWieck@Yahoo.com>,
|
||
PostgreSQL-development <pgsql-hackers@postgresql.org>,
|
||
"Ross J. Reedstrom" <reedstrm@rice.edu>
|
||
Subject: AW: [HACKERS] Big 7.1 open items
|
||
Date: Mon, 3 Jul 2000 10:28:05 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="windows-1252"
|
||
Status: RO
|
||
|
||
|
||
> > > > > In my mind the point of the "database" concept is to
|
||
> > > provide a domain
|
||
> > > > > within which custom datatypes and functions are available.
|
||
> > > >
|
||
> > >
|
||
> > > AFAIK few users understand it and many users have wondered
|
||
> > > why we couldn't issue cross "database" queries.
|
||
> >
|
||
> > Imho the same issue is access to tables on another machine.
|
||
> > If we "fix" that, access to another db on the same instance is just
|
||
> > a variant of the above.
|
||
> >
|
||
>
|
||
> What is the difference between SCHEMA and your "database"?
|
||
> I myself am confused about them.
|
||
|
||
"my *database*" corresponds to the current database, which is created with
|
||
"create database" in postgresql. It corresponds to the catalog concept in
|
||
SQL99.
|
||
|
||
The schema is below the database. Access to different schemas with one
connection is mandatory. Access to different catalogs (databases) with one
connection is not mandatory, but should imho be solved analogous to access
to another catalog on a different (SQL99) cluster. This would be a very
nifty feature.
|
||
|
||
Andreas
|
||
|
||
From pgsql-hackers-owner+M3496@hub.org Fri Jun 16 15:55:14 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02116
|
||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 14:55:13 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id NAA21581 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 13:53:58 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5GHpqN06086;
|
||
Fri, 16 Jun 2000 13:51:52 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5GHpcN05946
|
||
for <pgsql-hackers@postgreSQL.org>; Fri, 16 Jun 2000 13:51:39 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA07945
|
||
for <pgsql-hackers@postgreSQL.org>; Fri, 16 Jun 2000 13:51:38 -0400 (EDT)
|
||
To: pgsql-hackers@postgresql.org
|
||
Subject: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
|
||
Date: Fri, 16 Jun 2000 13:51:37 -0400
|
||
Message-ID: <7942.961177897@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
After further thought I think there's a lot of merit in Hiroshi's
|
||
opinion that physical file names should not be tied to relation OID.
|
||
If we use a separately generated value for the file name, we can
|
||
solve a lot of problems pretty nicely by means of "table versioning".
|
||
|
||
For example: VACUUM can't compact indexes at the moment, and what it
|
||
does do (scan the index and delete unused entries) is really slow.
|
||
The right thing to do is for it to generate an all-new index file,
|
||
but how do we do that without creating a risk of leaving the index
|
||
corrupted if we crash partway through? The answer is to build the
|
||
new index in a new physical file. But how do we install the new
|
||
file as the real index atomically, when it might span multiple
|
||
segments? If the physical file name is decoupled from the relation's
|
||
name *and* OID then there is no problem: the atomic event that makes
|
||
the new file(s) the real table contents is the commit of the new
|
||
pg_class row with the new value for the physical filename.
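The idea can be sketched with a toy in-memory catalog row. Everything below is an assumption made for illustration: CatalogRow, install_new_version and the two sample builders are hypothetical, and the final strcpy only stands in for committing the new pg_class row, which in the real system would be made atomic by transaction commit:

```c
#include <stdio.h>
#include <string.h>

#define PHYSNAME_LEN 64

/* Toy stand-in for a pg_class row; the field names are illustrative. */
typedef struct
{
    unsigned int oid;                     /* logical identity, never changes */
    char         relname[32];             /* logical name */
    char         physname[PHYSNAME_LEN];  /* physical file currently in use */
} CatalogRow;

/* Sample builders: pretend to write the new file; nonzero means failure. */
int build_ok(const char *path)   { (void) path; return 0; }
int build_fail(const char *path) { (void) path; return 1; }

/*
 * Build a replacement file under a freshly generated number, then point
 * the catalog row at it.  The row is modified only in the last step, so a
 * failure before that leaves the old file as the relation's contents.
 */
int
install_new_version(CatalogRow *row, unsigned int new_file_id,
                    int (*build)(const char *path))
{
    char newname[PHYSNAME_LEN];
    int  n = snprintf(newname, sizeof newname, "%u_%s",
                      new_file_id, row->relname);

    if (n < 0 || (size_t) n >= sizeof newname)
        return -1;
    if (build(newname) != 0)
        return -1;                        /* old physname is still in place */

    strcpy(row->physname, newname);       /* stands in for the pg_class commit */
    return 0;
}
```

However many segments the new file spans, the switch is a single change to one row, and the relation's OID is untouched.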
|
||
|
||
Aside from possible improvements in VACUUM, this would let us do a
|
||
robust implementation of CLUSTER, and we could do the "really change
|
||
the table" variant of ALTER TABLE DROP COLUMN the same way if anyone
|
||
wants to do it.
|
||
|
||
The only cost is that we need an additional column in pg_class to
|
||
hold the physical file name. That's not so bad, especially when
|
||
you remember that we'd surely need to add something to pg_class for
|
||
tablespace support anyway.
|
||
|
||
If we bite that bullet, then we could also do something to satisfy
|
||
Bruce about having legible file names ;-). The column in pg_class
|
||
could perfectly well be a string, not a pure number, and that means
|
||
that we can throw in the relname (truncated to fit of course). So
|
||
the thing would act a lot like the original-relname-plus-OID variant
|
||
that's been discussed so far. (Original relname because ALTER TABLE
|
||
RENAME would *not* change the physical file name. But we could
|
||
think about a form of VACUUM that creates a whole new table by
|
||
versioning, and that would presumably bring the physical name back
|
||
in sync with the logical relname.)
|
||
|
||
Here is a sketch of a concrete proposal. I see no need to have
|
||
separate pg_class columns for tablespace and physical relname;
|
||
instead, I suggest there be a column of type NAME that is the
|
||
file pathname (relative to the database directory). Further,
|
||
instead of the existing convention of appending .N to the base
|
||
file name to make extension segment names, I propose that we
|
||
always have a segment number in the physical file name, and that
|
||
the pg_class entry be required to contain a "%d" somewhere that
|
||
indicates where. The actual filename is manufactured by
|
||
sprintf(tempbuf, value_from_pg_class_column, segment_number);
|
||
|
||
As an example, the arrangement I was suggesting earlier today
|
||
about segments in different subdirectories of a tablespace
|
||
could be implemented by assigning physical filenames like
|
||
|
||
tablespace/%d/12345_relname
|
||
|
||
where the 12345 is a value generated separately from the table's OID.
|
||
(We would still use the OID counter to produce these numbers, and
|
||
in fact there's no reason not to use the table's OID as the initial
|
||
unique ID for the physical filename. The point is just that the
|
||
physical filename doesn't have to remain forever equal to the
|
||
relation's OID.)
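A minimal sketch of expanding such a template into a segment file name, assuming a hypothetical helper segment_path. Unlike the bare sprintf above, this version does not pass the catalog string as the format itself; it splices the segment number in at the "%d" marker and checks the buffer size:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: expand a path template containing "%d" into the
 * file name for one segment.  Returns 0 on success, -1 if the template
 * has no "%d" or the result does not fit in buf.
 */
int
segment_path(char *buf, size_t buflen, const char *tmpl, int segno)
{
    const char *mark = strstr(tmpl, "%d");
    int         n;

    if (mark == NULL)
        return -1;

    /* prefix before "%d", the segment number, then the rest of the template */
    n = snprintf(buf, buflen, "%.*s%d%s",
                 (int) (mark - tmpl), tmpl, segno, mark + 2);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

With the template tablespace/%d/12345_relname, segment 0 becomes tablespace/0/12345_relname and segment 3 becomes tablespace/3/12345_relname.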
|
||
|
||
If we use type NAME for this string then the tablespace part of the path
|
||
would have to be kept to no more than ~ 15 characters, but that seems
|
||
workable enough. (Anybody who really didn't like that could recompile
|
||
with larger NAMEDATALEN. Doesn't seem worth inventing a separate type.)
|
||
|
||
As Hiroshi pointed out, one of the best aspects of this approach
|
||
is that the physical table layout policy doesn't have to be hard-wired
|
||
into low-level file access routines. The low-level routines don't
|
||
need to know much of anything about the format of the pathname,
|
||
they just stuff in the right segment number and use the name. The
|
||
layout policy need only be known to one single routine that generates
|
||
the strings that go into pg_class. So it'd be really easy to change.
|
||
|
||
One thing we'd have to work out is that the critical system tables
|
||
(eg, pg_class itself, as well as its indexes) would have to have
|
||
predictable physical names. Otherwise there's no way for a new
|
||
backend to bootstrap itself up ... it can't very well read pg_class
|
||
to find out where pg_class is. A brute-force solution is to forbid
|
||
reversioning of the critical tables, but I suspect we can find a
|
||
less restrictive answer.
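A minimal sketch of the brute-force variant, assuming a small compiled-in table of fixed names. The helper bootstrap_physname is hypothetical, and the relation names and path templates in the table are illustrative rather than a statement of what the real list would contain:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compiled-in mapping for relations needed before pg_class is readable. */
typedef struct
{
    const char *relname;
    const char *physname;   /* path template with "%d" for the segment number */
} FixedName;

static const FixedName fixed_names[] = {
    {"pg_class",           "pg_class.%d"},
    {"pg_class_oid_index", "pg_class_oid_index.%d"}
};

/*
 * Returns the fixed path template for a critical relation, or NULL when
 * the relation is not critical and its path should be read from pg_class.
 */
const char *
bootstrap_physname(const char *relname)
{
    size_t i;

    for (i = 0; i < sizeof fixed_names / sizeof fixed_names[0]; i++)
    {
        if (strcmp(fixed_names[i].relname, relname) == 0)
            return fixed_names[i].physname;
    }
    return NULL;
}
```

Relations in such a table could never be reversioned, which is the restriction a less brute-force answer would need to lift.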
|
||
|
||
This seems like it'd satisfy all the concerns that have been raised.
|
||
Comments?
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3524@hub.org Fri Jun 16 22:30:59 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07796
|
||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 21:30:58 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id VAA26393 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 21:16:37 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5H1EeM94683;
|
||
Fri, 16 Jun 2000 21:14:40 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5H1D0M94365
|
||
for <pgsql-hackers@postgreSQL.org>; Fri, 16 Jun 2000 21:13:00 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA10209;
|
||
Fri, 16 Jun 2000 21:12:30 -0400 (EDT)
|
||
To: Chris Bitmead <chris@bitmead.com>
|
||
cc: pgsql-hackers@postgreSQL.org
|
||
Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
|
||
In-reply-to: <394ACB42.C87C59B8@bitmead.com>
|
||
References: <7942.961177897@sss.pgh.pa.us> <394ACB42.C87C59B8@bitmead.com>
|
||
Comments: In-reply-to Chris Bitmead <chris@bitmead.com>
|
||
message dated "Sat, 17 Jun 2000 10:50:10 +1000"
|
||
Date: Fri, 16 Jun 2000 21:12:29 -0400
|
||
Message-ID: <10206.961204349@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Chris Bitmead <chris@bitmead.com> writes:
|
||
> At least on UNIX, couldn't you use a hard-link and change the name in
|
||
> pg_class immediately? Let the brain-dead operating systems use the
|
||
> vacuum method.
|
||
|
||
Hmm ... maybe, but it doesn't seem worth the portability headache to
|
||
me. We do have an NT port that we don't want to break, and I don't
|
||
think RENAME TABLE is worth the trouble of testing/supporting two
|
||
implementations.
|
||
|
||
Even on Unix, aren't there filesystems that don't do hard links?
|
||
Not that I'd recommend running Postgres on such a volume, but...
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3525@hub.org Sat Jun 17 07:01:03 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA22194
|
||
for <pgman@candle.pha.pa.us>; Sat, 17 Jun 2000 06:01:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id FAA21836 for <pgman@candle.pha.pa.us>; Sat, 17 Jun 2000 05:39:21 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5H9bSM88777;
|
||
Sat, 17 Jun 2000 05:37:28 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5H9anM88603
|
||
for <pgsql-hackers@postgreSQL.org>; Sat, 17 Jun 2000 05:36:49 -0400 (EDT)
|
||
Received: from mcadnote1 (ppm130.noc.fukui.nsk.ne.jp [210.161.188.49])
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id SAA08384; Sat, 17 Jun 2000 18:36:00 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: <pgsql-hackers@postgresql.org>
|
||
Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
|
||
Date: Sat, 17 Jun 2000 18:38:53 +0900
|
||
Message-ID: <EKEJJICOHDIEMGPNIFIJIEAKCCAA.Inoue@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-2022-jp"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
|
||
In-Reply-To: <7942.961177897@sss.pgh.pa.us>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
|
||
Importance: Normal
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
|
||
> Behalf Of Tom Lane
|
||
>
|
||
> After further thought I think there's a lot of merit in Hiroshi's
|
||
> opinion that physical file names should not be tied to relation OID.
|
||
> If we use a separately generated value for the file name, we can
|
||
> solve a lot of problems pretty nicely by means of "table versioning".
|
||
>
|
||
> For example: VACUUM can't compact indexes at the moment, and what it
|
||
> does do (scan the index and delete unused entries) is really slow.
|
||
> The right thing to do is for it to generate an all-new index file,
|
||
> but how do we do that without creating a risk of leaving the index
|
||
> corrupted if we crash partway through? The answer is to build the
|
||
> new index in a new physical file. But how do we install the new
|
||
> file as the real index atomically, when it might span multiple
|
||
> segments? If the physical file name is decoupled from the relation's
|
||
> name *and* OID then there is no problem: the atomic event that makes
|
||
> the new file(s) the real table contents is the commit of the new
|
||
> pg_class row with the new value for the physical filename.
|
||
>
|
||
> Aside from possible improvements in VACUUM, this would let us do a
|
||
> robust implementation of CLUSTER, and we could do the "really change
|
||
> the table" variant of ALTER TABLE DROP COLUMN the same way if anyone
|
||
> wants to do it.
|
||
>
|
||
|
||
Yes, I've wondered how we could implement the column_is_really_dropped
variant of ALTER TABLE DROP COLUMN without this kind of mechanism.
|
||
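The versioning idea quoted above can be sketched as follows. This is only an illustration under stated assumptions: the `catalog` dict stands in for pg_class, the `_vN` file-name suffix and the function name `swap_in_new_version` are hypothetical, and a single dict assignment stands in for committing the new pg_class row.

```python
import os


def swap_in_new_version(datadir, catalog, rel_oid, build_contents):
    """Write a rebuilt relation under a fresh file name, then repoint the catalog.

    `catalog` maps a relation OID to its current physical file name
    (a hypothetical stand-in for a pg_class column).
    """
    old_name = catalog[rel_oid]
    stem, version = old_name.rsplit("_v", 1)
    new_name = f"{stem}_v{int(version) + 1}"

    # Build the replacement completely before anyone can see it.
    new_path = os.path.join(datadir, new_name)
    with open(new_path, "w") as f:
        f.write(build_contents())
        f.flush()
        os.fsync(f.fileno())

    # The "atomic event": one catalog update makes the new file the real one.
    # A crash before this line leaves the old file untouched and still current.
    catalog[rel_oid] = new_name

    # Only after the switch is the old version removed.
    os.remove(os.path.join(datadir, old_name))
    return new_name
```

If the process dies while the new file is being written, the catalog still names the old file, so the worst case is an orphaned partial file rather than a corrupted relation.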
|
||
> The only cost is that we need an additional column in pg_class to
|
||
> hold the physical file name. That's not so bad, especially when
|
||
> you remember that we'd surely need to add something to pg_class for
|
||
> tablespace support anyway.
|
||
>
|
||
> If we bite that bullet, then we could also do something to satisfy
|
||
> Bruce about having legible file names ;-). The column in pg_class
|
||
> could perfectly well be a string, not a pure number, and that means
|
||
> that we can throw in the relname (truncated to fit of course). So
|
||
> the thing would act a lot like the original-relname-plus-OID variant
|
||
> that's been discussed so far. (Original relname because ALTER TABLE
|
||
> RENAME would *not* change the physical file name. But we could
|
||
> think about a form of VACUUM that creates a whole new table by
|
||
> versioning, and that would presumably bring the physical name back
|
||
> in sync with the logical relname.)
|
||
>
|
||
> As Hiroshi pointed out, one of the best aspects of this approach
|
||
> is that the physical table layout policy doesn't have to be hard-wired
|
||
> into low-level file access routines. The low-level routines don't
|
||
> need to know much of anything about the format of the pathname,
|
||
> they just stuff in the right segment number and use the name. The
|
||
> layout policy need only be known to one single routine that generates
|
||
> the strings that go into pg_class. So it'd be really easy to change.
|
||
>
|
||
|
||
Ross's approach is fundamentally the same, though he is using a relname+OID
naming rule. I've said that his attempt is the most practical one.
|
||
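As a rough sketch of the point about low-level routines: if the catalog stores an opaque name, the file access layer only has to attach a segment number to it. The `.N` suffix for segments after the first and the function name `segment_path` are assumptions made for this illustration, not something fixed by the proposal.

```python
def segment_path(stored_name: str, segno: int) -> str:
    """Return the path of one segment of a relation.

    `stored_name` is whatever string the (hypothetical) naming routine put
    into the catalog; this function never looks inside it.
    """
    if segno < 0:
        raise ValueError("segment number must be non-negative")
    # Assumed convention: segment 0 uses the bare name, later ones add ".N".
    return stored_name if segno == 0 else f"{stored_name}.{segno}"
```

Because the name is treated as opaque, changing the layout policy would mean changing only the routine that generates the stored strings.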
|
||
> One thing we'd have to work out is that the critical system tables
|
||
> (eg, pg_class itself, as well as its indexes) would have to have
|
||
> predictable physical names.
|
||
|
||
The only requirement on a relation's file name is uniqueness,
so it doesn't introduce any inconsistency for system tables to
have fixed names.
For system relations that wouldn't be so bad, because CLUSTER and
ALTER TABLE DROP COLUMN ... would (probably) be unnecessary.
But for system indexes, it is preferable that VACUUM/REINDEX
be able to rebuild them safely. System indexes currently never shrink.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
From pgsql-hackers-owner+M3529@hub.org Sat Jun 17 10:01:24 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA24004
|
||
for <pgman@candle.pha.pa.us>; Sat, 17 Jun 2000 09:01:23 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id IAA28633 for <pgman@candle.pha.pa.us>; Sat, 17 Jun 2000 08:57:47 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5HCtxM77095;
|
||
Sat, 17 Jun 2000 08:55:59 -0400 (EDT)
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5HCtoM77026
|
||
for <pgsql-hackers@postgresql.org>; Sat, 17 Jun 2000 08:55:50 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:57716 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S276602AbQFQMzZ>; Sat, 17 Jun 2000 14:55:25 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 133IET-0002Y3-00; Sat, 17 Jun 2000 15:01:53 +0200
|
||
Date: Sat, 17 Jun 2000 15:01:53 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: pgsql-hackers@postgreSQL.org
|
||
Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
|
||
filename
|
||
In-Reply-To: <7942.961177897@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0006170403000.17284-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Tom Lane writes:
|
||
|
||
> tablespace/%d/12345_relname
|
||
|
||
Throwing table spaces and relation names into one pot doesn't excite me
|
||
very much. For example, before long people will want to
|
||
|
||
* Query what tables are in what space (without using string operations)
|
||
Consider for example creating a new table and choosing where to put it.
|
||
|
||
* Rename table spaces
|
||
|
||
* Assign attributes of some sort to table spaces (permissions, etc.)
|
||
|
||
* Use table space names with more than 15 characters. :)
|
||
|
||
Somehow table spaces need to be catalogued. You could still make the
|
||
physical file name 'tablespaceoid/rest' without actually having to look up
|
||
anything, although that depends on your symlink idea which is still under
|
||
discussion.
|
||
|
||
Then, why are all nth segments of tables in one directory in that
|
||
proposal?
|
||
|
||
Also, you said before that an old relname (after rename) is worse than
|
||
none at all. I couldn't agree more.
|
||
|
||
Why not use OID.[SEGMENT.]VERSION for the physical relname (different
|
||
order possible)? That way you at least have some guaranteed correspondence
|
||
between files and tables. Version could probably be an INT2, so you save
|
||
some space.
|
||
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
From pgsql-hackers-owner+M3534@hub.org Sat Jun 17 13:31:11 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA02801
|
||
for <pgman@candle.pha.pa.us>; Sat, 17 Jun 2000 12:31:10 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id MAA07848 for <pgman@candle.pha.pa.us>; Sat, 17 Jun 2000 12:27:14 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5HGPJM95074;
|
||
Sat, 17 Jun 2000 12:25:19 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5HGP1M94990
|
||
for <pgsql-hackers@postgreSQL.org>; Sat, 17 Jun 2000 12:25:01 -0400 (EDT)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA18939;
|
||
Sat, 17 Jun 2000 12:24:56 -0400 (EDT)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: pgsql-hackers@postgreSQL.org
|
||
Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
|
||
In-reply-to: <EKEJJICOHDIEMGPNIFIJIEAKCCAA.Inoue@tpf.co.jp>
|
||
References: <EKEJJICOHDIEMGPNIFIJIEAKCCAA.Inoue@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Sat, 17 Jun 2000 18:38:53 +0900"
|
||
Date: Sat, 17 Jun 2000 12:24:56 -0400
|
||
Message-ID: <18936.961259096@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
>> One thing we'd have to work out is that the critical system tables
|
||
>> (eg, pg_class itself, as well as its indexes) would have to have
|
||
>> predictable physical names.
|
||
|
||
> The only limitation of the relation filename is the uniqueness.
|
||
> So it doesn't introduce any inconsistency that system tables
|
||
> have fixed name.
|
||
> As for system relations it wouldn't be so bad because CLUSTER/
|
||
> ALTER TABLE DROP COLUMN ... would be unnecessary(maybe).
|
||
> But as for system indexes,it is preferable that VACUUM/REINDEX
|
||
> could rebuild them safely. System indexes never shrink currently.
|
||
|
||
Right, it's the index-shrinking business that has me worried.
|
||
Most of the other reasons for swapping in a new file don't apply
|
||
to system tables, but that one does.
|
||
|
||
One possibility is to say that system *tables* can't be reversioned
|
||
(at least not the critical ones) but system *indexes* can be.
|
||
Then we'd have to use your ignore-system-indexes stuff during backend
|
||
startup, until we'd found out where the indexes are. Might be too big
|
||
a time penalty however... not sure. Shared cache inval of a system
|
||
index could be a little tricky too; I don't think the catcache routines
|
||
are prepared to fall back to a non-index scan, are they?
|
||
|
||
On the whole it might be better to cheat by using a side data structure
|
||
like the pg_internal.init file, that a backend could consult to find out
|
||
where the indexes are now.
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M3553@hub.org Sun Jun 18 18:31:03 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA08740
|
||
for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 17:31:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id RAA18332 for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 17:21:51 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5ILJcM11720;
|
||
Sun, 18 Jun 2000 17:19:38 -0400 (EDT)
|
||
Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5ILILM09628
|
||
for <pgsql-hackers@postgresql.org>; Sun, 18 Jun 2000 17:18:21 -0400 (EDT)
|
||
Received: from regulus.student.UU.SE ([130.238.5.2]:40239 "EHLO
|
||
regulus.its.uu.se") by merganser.its.uu.se with ESMTP
|
||
id <S436346AbQFRVRt>; Sun, 18 Jun 2000 23:17:49 +0200
|
||
Received: from peter (helo=localhost)
|
||
by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
|
||
id 133mYM-0000Ns-00; Sun, 18 Jun 2000 23:24:26 +0200
|
||
Date: Sun, 18 Jun 2000 23:24:26 +0200 (CEST)
|
||
From: Peter Eisentraut <peter_e@gmx.net>
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: PostgreSQL Development <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
|
||
filename
|
||
In-Reply-To: <19045.961260445@sss.pgh.pa.us>
|
||
Message-ID: <Pine.LNX.4.21.0006181657280.562-100000@localhost.localdomain>
|
||
MIME-Version: 1.0
|
||
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
|
||
Content-Transfer-Encoding: 8BIT
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Tom Lane writes:
|
||
|
||
> I don't think it's a good idea to have to consult pg_tablespace to find
|
||
> out where a table actually is --- I think the pathname (or smgr access
|
||
> token as Ross would call it ;-)) ought to be determinable from just the
|
||
> pg_class entry.
|
||
|
||
That's why I suggested the table space oid. That would be readily
|
||
available from pg_class.
|
||
|
||
|
||
> Tablespaces can have logical names stored in pg_tablespace; they just
|
||
> can't contribute more than a dozen or so characters to file pathnames
|
||
> under the implementation I'm proposing. That doesn't seem too
|
||
> unreasonable; the pathname part can be some sort of abbreviated name.
|
||
|
||
Since the abbreviated name is really only used internally it might as well
|
||
be the oid. Otherwise you create a weird functional dependency like the
|
||
pg_shadow.usesysid field that's just an extra layer of maintenance.
|
||
|
||
|
||
> this implementation mechanism will support either policy choice ---
|
||
> original relname in the filename, or just a numeric ID for the
|
||
> filename
|
||
|
||
But when you look at a file name `12345_accounts_recei' you know neither
|
||
|
||
* whether the table name was really `accounts_recei' or whether the name
|
||
was truncated
|
||
|
||
* whether the table still has that name, whatever it was
|
||
|
||
* what table this is at all
|
||
|
||
So in the aggregate you really know less than nothing. :-)
|
||
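The ambiguity described above is easy to reproduce. In this sketch the 14-character limit is an assumption chosen only so that the output matches the `12345_accounts_recei` example; the function name `physical_name` is hypothetical.

```python
def physical_name(unique_id: int, relname: str, keep: int = 14) -> str:
    """Build a file name from a numeric ID plus a truncated relation name."""
    return f"{unique_id}_{relname[:keep]}"


# Three different logical names that all leave the same suffix on disk.
candidates = ["accounts_receivable", "accounts_received", "accounts_recei"]
suffixes = {physical_name(12345, name).split("_", 1)[1] for name in candidates}
```

Here `suffixes` ends up with a single element, so the file name alone cannot say which of the three tables it belonged to, let alone whether the table has since been renamed.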
|
||
|
||
> > Why not use OID.[SEGMENT.]VERSION for the physical relname (different
|
||
> > order possible)?
|
||
>
|
||
> Doesn't give you a manageable way to split segments across different
|
||
> disks.
|
||
|
||
Okay, so maybe ${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION.
|
||
|
||
This doesn't need any catalog lookup outside of pg_class, yet it's still
|
||
easy to resolve to human-readable names by simple admin tools (SELECT *
|
||
FROM pg_foo WHERE oid = xxx). VERSION would be unique within a conceptual
|
||
relation, so you could even see how many times the relation was altered in
|
||
major ways (kind of).
|
||
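A minimal sketch of the ${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION layout suggested above, showing that a path can be built from, and resolved back to, numeric identifiers without any catalog lookup. The function names and the example OID values are made up for illustration.

```python
from pathlib import PurePosixPath


def relation_path(base, tablespace_oid, segment, rel_oid, version):
    """Compose BASE/TABLESPACEOID/SEGMENT/RELOID.VERSION."""
    return str(
        PurePosixPath(base)
        / str(tablespace_oid)
        / str(segment)
        / f"{rel_oid}.{version}"
    )


def parse_relation_path(path):
    """Recover the numeric components from a path built by relation_path."""
    p = PurePosixPath(path)
    rel_oid, version = p.name.split(".")
    return {
        "tablespace_oid": int(p.parent.parent.name),
        "segment": int(p.parent.name),
        "rel_oid": int(rel_oid),
        "version": int(version),
    }
```

An admin tool could feed the recovered `rel_oid` into a query such as the SELECT mentioned above to get the human-readable name.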
|
||
|
||
--
|
||
Peter Eisentraut Sernanders väg 10:115
|
||
peter_e@gmx.net 75262 Uppsala
|
||
http://yi.org/peter-e/ Sweden
|
||
|
||
|
||
From pgsql-hackers-owner+M3561@hub.org Sun Jun 18 21:31:03 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA20523
|
||
for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 20:31:02 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id UAA25719 for <pgman@candle.pha.pa.us>; Sun, 18 Jun 2000 20:26:49 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5J0OLM53050;
|
||
Sun, 18 Jun 2000 20:24:21 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5J0NmM50883
|
||
for <pgsql-hackers@postgreSQL.org>; Sun, 18 Jun 2000 20:23:49 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id JAA09003; Mon, 19 Jun 2000 09:22:45 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Chris Bitmead" <chris@bitmead.com>, "Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Cc: "Peter Eisentraut" <peter_e@gmx.net>, <pgsql-hackers@postgresql.org>
|
||
Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
|
||
Date: Mon, 19 Jun 2000 09:24:56 +0900
|
||
Message-ID: <000901bfd984$cbf1dfc0$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="ISO-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <394C20C6.9580A8A9@bitmead.com>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Importance: Normal
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
|
||
> Behalf Of Chris Bitmead
|
||
>
|
||
> Tom Lane wrote:
|
||
>
|
||
> > > Also, you said before that an old relname (after rename) is worse than
|
||
> > > none at all. I couldn't agree more.
|
||
> >
|
||
> > I'm not the one who wants relnames in the physical names ;-). However,
|
||
> > this implementation mechanism will support either policy choice ---
|
||
> > original relname in the filename, or just a numeric ID for the filename
|
||
> > --- and that seems like a good sign to me.
|
||
> >
|
||
> > > Why not use OID.[SEGMENT.]VERSION for the physical relname (different
|
||
> > > order possible)?
|
||
>
|
||
> Unless VERSION is globally unique like an oid is, having RELNAME.VERSION
|
||
> would be a problem if you created a table with the same name as a
|
||
> recently renamed table.
|
||
>
|
||
|
||
In my proposal (relname+unique-id), the unique-id is globally unique
and the relname is only there for the DBA's convenience. I've said many times that
we should be as free as possible from file naming rules.
I myself don't care what the relation files are called, as long as the names
are globally unique. I had to put forward my own file naming proposal
because people have been so enthusiastic about file naming schemes
that are not globally unique.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
|
||
From pgsql-hackers-owner+M3523@hub.org Fri Jun 16 22:01:00 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07568
|
||
for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 21:00:59 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id UAA25354 for <pgman@candle.pha.pa.us>; Fri, 16 Jun 2000 20:54:02 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5H0q3M53458;
|
||
Fri, 16 Jun 2000 20:52:03 -0400 (EDT)
|
||
Received: from tech.com.au (IDENT:root@techpt.lnk.telstra.net [139.130.75.122])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5H0oRM47761
|
||
for <pgsql-hackers@postgreSQL.org>; Fri, 16 Jun 2000 20:50:28 -0400 (EDT)
|
||
Received: from bitmead.com (IDENT:chris@tardis [203.41.180.243])
|
||
by tech.com.au (8.9.3/8.9.3) with ESMTP id KAA21482;
|
||
Sat, 17 Jun 2000 10:50:14 +1000
|
||
Message-ID: <394ACB42.C87C59B8@bitmead.com>
|
||
Date: Sat, 17 Jun 2000 10:50:10 +1000
|
||
From: Chris Bitmead <chris@bitmead.com>
|
||
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686)
|
||
X-Accept-Language: en
|
||
MIME-Version: 1.0
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
CC: pgsql-hackers@postgreSQL.org
|
||
Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
|
||
filename
|
||
References: <7942.961177897@sss.pgh.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
Tom Lane wrote:
|
||
> So
|
||
> the thing would act a lot like the original-relname-plus-OID variant
|
||
> that's been discussed so far. (Original relname because ALTER TABLE
|
||
> RENAME would *not* change the physical file name. But we could
|
||
> think about a form of VACUUM that creates a whole new table by
|
||
> versioning, and that would presumably bring the physical name back
|
||
> in sync with the logical relname.)
|
||
|
||
At least on UNIX, couldn't you use a hard-link and change the name in
|
||
pg_class immediately? Let the brain-dead operating systems use the
|
||
vacuum method.
|
||
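The hard-link suggestion can be sketched like this for a single-segment relation. The `catalog` dict is a hypothetical stand-in for pg_class, and the sketch ignores multi-segment tables and crash recovery, both of which a real implementation would have to handle.

```python
import os


def rename_with_hard_link(datadir, catalog, rel_oid, new_name):
    """Give a relation file a new name without copying or moving its data."""
    old_name = catalog[rel_oid]
    old_path = os.path.join(datadir, old_name)
    new_path = os.path.join(datadir, new_name)

    # After os.link both names refer to the same underlying file.
    os.link(old_path, new_path)

    # Stand-in for updating the physical file name in pg_class.
    catalog[rel_oid] = new_name

    # Once the catalog points at the new name, the old one can go.
    os.unlink(old_path)
```

At every point in this sequence at least one valid name for the data exists, which is what makes the approach attractive on systems that support hard links.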
|
||
From pgsql-hackers-owner+M3576@hub.org Mon Jun 19 01:58:35 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA00789
|
||
for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 00:58:34 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5J4qfM87650;
|
||
Mon, 19 Jun 2000 00:52:41 -0400 (EDT)
|
||
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5J4oUM77400
|
||
for <pgsql-hackers@postgresql.org>; Mon, 19 Jun 2000 00:50:30 -0400 (EDT)
|
||
Received: from cadzone ([126.0.1.40] (may be forged))
|
||
by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
|
||
id NAA09265; Mon, 19 Jun 2000 13:50:22 +0900
|
||
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
To: "Peter Eisentraut" <peter_e@gmx.net>
|
||
Cc: "PostgreSQL Development" <pgsql-hackers@postgresql.org>,
|
||
"Tom Lane" <tgl@sss.pgh.pa.us>
|
||
Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generatedfilename
|
||
Date: Mon, 19 Jun 2000 13:52:34 +0900
|
||
Message-ID: <001201bfd9aa$2f1c1320$2801007e@tpf.co.jp>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="ISO-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
|
||
In-Reply-To: <Pine.LNX.4.21.0006181657280.562-100000@localhost.localdomain>
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
||
Importance: Normal
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
> -----Original Message-----
|
||
> From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
|
||
> Behalf Of Peter Eisentraut
|
||
>
|
||
> Tom Lane writes:
|
||
>
|
||
> > I don't think it's a good idea to have to consult pg_tablespace to find
|
||
> > out where a table actually is --- I think the pathname (or smgr access
|
||
> > token as Ross would call it ;-)) ought to be determinable from just the
|
||
> > pg_class entry.
|
||
>
|
||
> That's why I suggested the table space oid. That would be readily
|
||
> available from pg_class.
|
||
>
|
||
|
||
It seems to me that the following two questions, 1) and 2), have always been mixed up.
IMHO, they should be clearly distinguished.
|
||
|
||
1) Where the table is stored
|
||
Currently PostgreSQL relies on a relname -> filename mapping
rule to access *existing* relations and doesn't keep this
information in its database. Our (Tom, Ross, me) proposal is to
keep the information (a token) in pg_class and provide a standard
transactional control mechanism for changes to table file
allocation. That would free us from any fixed table
allocation (naming) rule.
Isn't this the kind of thing we should have had from the start?
|
||
|
||
2) Where to store the table
|
||
Yes, TABLE(DATA)SPACE should encapsulate this concept.
|
||
|
||
I would like a decision about 1) first. Ross has already tried it without
2).
|
||
|
||
Comments?
|
||
|
||
As for 2), everyone seems to have their own opinion and the discussion
has never converged. Please don't discard 1) along with it.
|
||
|
||
Regards.
|
||
|
||
Hiroshi Inoue
|
||
Inoue@tpf.co.jp
|
||
|
||
|
||
From pgsql-hackers-owner+M3591@hub.org Mon Jun 19 11:01:19 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21409
|
||
for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 10:01:18 -0400 (EDT)
|
||
Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id JAA05383 for <pgman@candle.pha.pa.us>; Mon, 19 Jun 2000 09:56:59 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e5JDsVM91574;
|
||
Mon, 19 Jun 2000 09:54:31 -0400 (EDT)
|
||
Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e5JDldM77267
|
||
for <pgsql-hackers@postgreSQL.org>; Mon, 19 Jun 2000 09:48:05 -0400 (EDT)
|
||
Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
|
||
by gandalf.it-austria.net (xxx/xxx) with ESMTP id PAA80686;
|
||
Mon, 19 Jun 2000 15:46:24 +0200
|
||
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
|
||
id <M6F90H0A>; Mon, 19 Jun 2000 15:46:24 +0200
|
||
Message-ID: <219F68D65015D011A8E000006F8590C605BA5978@sdexcsrv1.f000.d0188.sd.spardat.at>
|
||
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
||
To: "'Tom Lane'" <tgl@sss.pgh.pa.us>, Peter Eisentraut <peter_e@gmx.net>
|
||
Cc: pgsql-hackers@postgresql.org
|
||
Subject: AW: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
|
||
filename
|
||
Date: Mon, 19 Jun 2000 15:46:22 +0200
|
||
MIME-Version: 1.0
|
||
X-Mailer: Internet Mail Service (5.5.2448.0)
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
|
||
> It's better than *all* segments of tables in one directory, which is
|
||
> what you get if the segment number is just a component of a flat file
|
||
> name. We have to have a better answer than that for people who need
|
||
> to cope with tables bigger than a disk. Perhaps someone can think of a
> better answer than subdirectory-per-segment-number, but I think that
|
||
> will work well enough; and it doesn't add any complexity for file
|
||
> access.
|
||
|
||
I do not see this connection between a filesystem and a disk?
Modern systems can join more than one disk into
one filesystem.
|
||
|
||
Also, if we think about separating large tables into smaller parts,
we IMHO want something where the optimizer knows
what data it will find in which part of the table.
|
||
|
||
Andreas
|
||
|
||
From pgsql-hackers-owner+M4680@hub.org Mon Jul 10 11:16:07 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA28153
|
||
for <pgman@candle.pha.pa.us>; Mon, 10 Jul 2000 10:16:06 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e6AEG5W83419;
|
||
Mon, 10 Jul 2000 10:16:05 -0400 (EDT)
|
||
Received: from corvette.mascari.com (dhcp160176144.columbus.rr.com [24.160.176.144])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e6AE7FW63372
|
||
for <pgsql-hackers@postgreSQL.org>; Mon, 10 Jul 2000 10:07:24 -0400 (EDT)
|
||
Received: from mascari.com (ferrari.mascari.com [192.168.2.1])
|
||
by corvette.mascari.com (8.9.3/8.9.3) with ESMTP id KAA10768;
|
||
Mon, 10 Jul 2000 10:03:27 -0400
|
||
Message-ID: <3969D7CA.8AF9573C@mascari.com>
|
||
Date: Mon, 10 Jul 2000 10:03:54 -0400
|
||
From: Mike Mascari <mascarm@mascari.com>
|
||
Organization: Mascari Development Inc
|
||
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.5-15 i586)
|
||
X-Accept-Language: en
|
||
MIME-Version: 1.0
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
CC: Tom Lane <tgl@sss.pgh.pa.us>, Philip Warner <pjw@rhyme.com.au>,
|
||
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>,
|
||
"pgsql-hackers@postgreSQL.org" <pgsql-hackers@postgreSQL.org>
|
||
Subject: Re: [HACKERS] Re: [GENERAL] PostgreSQL vs. MySQL
|
||
References: <200007101310.JAA26260@candle.pha.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
Bruce Momjian wrote:
|
||
>
|
||
> > And of course the major problem with *that* is how do you get the
|
||
> > connection request to arrive at a backend that's been prestarted in
|
||
> > the right database? If you don't commit to a database then there's
|
||
> > not a whole lot of prestarting that can be done.
|
||
> >
|
||
> > It occurs to me that this'd get a whole lot more feasible if one
|
||
> > postmaster == one database, which is something we *could* do if we
|
||
> > implemented schemas. Hiroshi's been arguing that the current hard
|
||
> > separation between databases in an installation should be done away
|
||
> > with in favor of schemas, and I'm starting to see his point...
|
||
>
|
||
> This is interesting. You believe schema's would allow a pool of
|
||
> backends to connect to any database? That would clearly be a win.
|
||
|
||
I'm just curious, but did a consensus ever develop on schemas? It
|
||
seemed that the schemas/tablespace thread just ran out of steam.
|
||
For what it's worth, I like the idea of:
|
||
|
||
1. PostgreSQL installation -> SQL cluster of catalogs
|
||
2. PostgreSQL database -> SQL catalog
|
||
3. PostgreSQL schema -> SQL schema
|
||
|
||
This correlates nicely with the current representation of
|
||
DATABASE. People can run multiple SQL clusters by running
|
||
multiple postmasters on different ports. Today, most people
|
||
achieve a logical separation of data by issuing multiple CREATE
|
||
DATABASE commands. But under the above, most sites would run with
|
||
a single PostgreSQL database (SQL catalog), since:
|
||
|
||
"Catalogs are named collections of schemas in an SQL-environment"
|
||
|
||
This would mirror the behavior of Oracle, where most people run
with a single Oracle SID. The logical separation would be
achieved with SCHEMAs a level under the current DATABASE (a.k.a.
catalog). This eliminates the problem of using softlinks and
creating various subdirectories to mirror *logical* partitioning
of data. It also alleviates the problem people currently
encounter when they've built their data model around multiple
DATABASEs but learn later that they need access to more than one
simultaneously. Instead, they'll model their design around
multiple SCHEMAs which exist within a single DATABASE instance.
|
||
|
||
It seems that the discussion of tablespaces shouldn't be mixed
with SCHEMAs except to note that a DATABASE (catalog) should
have a default TABLESPACE whose path matches the current one:
|
||
|
||
../pgsql/data/base/<mydatabase>
|
||
|
||
Later, users might be able to create a hierarchy of default
TABLESPACEs where the location of the object is found with logic
like:
|
||
|
||
1. Is there an object-specified tablespace?
   (ex: CREATE TABLE payroll IN TABLESPACE...)
2. Is there a user-specified default tablespace?
   (ex: CREATE USER mike DEFAULT TABLESPACE...)
3. Is there a schema-specified default tablespace?
   (ex: CREATE SCHEMA accounting DEFAULT TABLESPACE...)
4. Use the catalog-default tablespace
   (ex: CREATE DATABASE postgres DEFAULT LOCATION '/home/pgsql')
|
||
|
||
with the last example creating the system tablespace,
|
||
'system_tablespace', with '/home/pgsql' as the location.
|
||
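The lookup order in the list above amounts to a simple first-match fallback. The following is a sketch only; the function and parameter names are hypothetical, and `None` is used to mean "no tablespace was specified at this level".

```python
from typing import Optional


def resolve_tablespace(
    object_ts: Optional[str],
    user_default: Optional[str],
    schema_default: Optional[str],
    catalog_default: str,
) -> str:
    """Pick the tablespace for a new object, most specific setting first."""
    for candidate in (object_ts, user_default, schema_default):
        if candidate is not None:
            return candidate
    # The catalog (database) always has a default, so the search cannot fail.
    return catalog_default
```

For example, a table created without an explicit tablespace by a user with no default, inside a schema whose default is `acct_ts`, would land in `acct_ts`.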
|
||
Anyways, it seems a consensus should be developed on the whole
|
||
Cluster/Catalog/Schema scenario.
|
||
|
||
Mike Mascari
|
||
|
||
From Albert.Langer@Directory-Designs.org Sun Apr 15 12:57:07 2001
|
||
Received: from relay1.pair.com (relay1.pair.com [209.68.1.20])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id MAA22644
|
||
for <pgman@candle.pha.pa.us>; Sun, 15 Apr 2001 12:57:06 -0400 (EDT)
|
||
Received: (qmail 16730 invoked from network); 15 Apr 2001 16:56:26 -0000
|
||
Received: from cpe-144-132-70-18.vic.bigpond.net.au (HELO w98) (144.132.70.18)
|
||
by relay1.pair.com with SMTP; 15 Apr 2001 16:56:26 -0000
|
||
X-pair-Authenticated: 144.132.70.18
|
||
Reply-To: <Albert.Langer@Directory-Designs.org>
|
||
From: "Albert Langer" <Albert.Langer@Directory-Designs.org>
|
||
To: "'Bruce Momjian'" <pgman@candle.pha.pa.us>,
|
||
"'Hiroshi Inoue'" <Inoue@tpf.co.jp>,
|
||
"'Ross J. Reedstrom'" <reedstrm@wallace.ece.rice.edu>,
|
||
"'Mike Mascari'" <mascarm@mascari.com>, <JanWieck@t-online.de>,
|
||
"'Tom Lane'" <tgl@sss.pgh.pa.us>,
|
||
"'Zeugswetter Andreas SB'" <ZeugswetterA@wien.spardat.at>,
|
||
"'The Hermit Hacker'" <scrappy@hub.org>,
|
||
"'Oliver Elphick'" <olly@lfix.co.uk>,
|
||
"'Don Baccus'" <dhogaza@pacifier.com>,
|
||
"'Thomas Lockhart'" <lockhart@alumni.caltech.edu>,
|
||
"'Chris Bitmead'" <chrisb@nimrod.itg.telstra.com.au>,
|
||
"'Philip J. Warner'" <pjw@rhyme.com.au>,
|
||
"'Peter Eisentraut'" <peter_e@gmx.net>,
|
||
"'Lamar Owen'" <lamar.owen@wgcr.org>,
|
||
"'Vadim Mikheev'" <vmikheev@SECTORBASE.COM>
|
||
Subject: Tablespaces - checkout SAP DB
|
||
Date: Mon, 16 Apr 2001 02:56:04 +1000
|
||
Message-ID: <000001c0c5cc$f5fd6ac0$6628a8c0@nowhere.com>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Priority: 3 (Normal)
|
||
X-MSMail-Priority: Normal
|
||
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2910.0)
|
||
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
|
||
Importance: Normal
|
||
Status: RO
|
||
|
||
Hi everyone,
|
||
|
||
Sorry about the long To list - this is to everyone I noticed commenting in:
|
||
http://www.postgresql.org/docs/pgsql/doc/TODO.detail/tablespaces
|
||
|
||
I strongly recommend checking out the approach used in SAP DB:
|
||
|
||
http://www.sap.com/solutions/technology/sapdb/sap_db_documentation.htm
|
||
|
||
Their glossy 2 page brochure emphasizes the way they handle
|
||
tablespaces as their strongest point for ease of administration:
|
||
|
||
http://www.sap.com/solutions/technology/sapdb/pdf/50033321.pdf
|
||
|
||
Directory distribution explained in:
|
||
http://www.sap.com/solutions/technology/sapdb/pdf/directorydistrib_72eng.pdf
|
||
|
||
Architecture and tablespace/devspace concepts explained in:
|
||
|
||
http://www.sap.com/solutions/technology/sapdb/pdf/dbmgui_73eng.pdf
|
||
(721K)
|
||
|
||
A good short overview can be obtained from the Glossary:
|
||
|
||
http://www.sap.com/solutions/technology/sapdb/sap_db_glossary.htm
|
||
(not .pdf - ordinary html)
|
||
|
||
vvvvvvv
|
||
data devspace
|
||
|
||
The user data (tables, indexes) and the SQL catalog are stored in the data
|
||
devspaces. A table or an index needs one page (minimum); a table can use all
|
||
the data devspaces that is the whole database (maximum). A table increases
|
||
or decreases in size automatically without administrative intervention.
|
||
|
||
As a rule, a database internal striping algorithm distributes the data
|
||
belonging to a table evenly across all the data devspaces. An assignment of
|
||
tables to data devspaces is not possible nor is it necessary.
|
||
|
||
When installing the database instance you can configure one or more data
|
||
devspaces and while the database is running you can also add new data
|
||
devspaces. The disk storage space defined by all the data devspaces is the
|
||
total size of the database.
|
||
|
||
devspace
|
||
|
||
This term denotes a physical disk or part of a physical disk. This can be a
|
||
raw device or a file.
|
||
|
||
log devspace
|
||
|
||
What is recorded in a log devspace is all the changes in the contents of the
|
||
database, to enable the contents to be recovered or restored after hardware
|
||
faults. The complete log can consist of a number of devspaces. You can
|
||
define the number of log devspaces required when installing the database
|
||
instance and can add new log devspaces even while the database is operating.
|
||
To ensure that the data on the database is kept safe, you have the option of
|
||
mirroring the log devspace(s) (set parameter LOG_MODE to DUAL).
|
||
|
||
In log backups the contents of the log devspace(s) is copied to a file and
|
||
the space originally occupied by it is released for log data. The backup
|
||
files are numbered by the system in sequence. The selected size of the
|
||
archive log devspace should therefore be sufficient for all the changes
|
||
occurring between two backups to be recorded there.
|
||
|
||
serverdb
|
||
|
||
A Serverdb consists of the system devspace, one or more log devspaces, and
|
||
one or more data devspaces.
|
||
|
||
For security and performance reasons, each devspace type should be kept on a
|
||
different disk. The log devspaces of a serverdb can also be mirrored to
|
||
obtain a higher degree of availability. The disks used should present
|
||
uniform performance data (especially access speeds) because this is the only
|
||
way that equal usage of the devspaces can be achieved. If necessary, a
|
||
database instance can be expanded by additional data devspaces while the
|
||
database is running.
|
||
|
||
The devspace usage level of a database instance is therefore a critical
|
||
parameter of database operation and must be monitored. If the data devspaces
|
||
become full, database operation stops. Further data devspaces can be defined
|
||
in this state to allow database operation to continue.
|
||
|
||
system devspace
|
||
|
||
The restart information and the mapping of the logical page numbers to
|
||
physical page addresses are administered in the system devspace. The size of
|
||
the system devspace therefore depends directly on the database size and is
|
||
determined by the database kernel.
|
||
^^^^^^^^^^^^^^^^^
|
||
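The glossary does not say how the striping algorithm works. Purely as an illustration of "distributes the data evenly across all the data devspaces", here is a round-robin placement sketch. The names `PageAddress` and `stripe_page` are hypothetical, and the modulo rule is an assumption, not SAP DB's actual algorithm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical physical address: which devspace, and which page slot in it. */
typedef struct PageAddress
{
    uint32_t devspace;   /* index of the data devspace holding the page */
    uint64_t slot;       /* page slot within that devspace */
} PageAddress;

/* Round-robin placement: consecutive logical pages land on consecutive devspaces. */
PageAddress
stripe_page(uint64_t logical_page, uint32_t n_devspaces)
{
    PageAddress addr;

    assert(n_devspaces > 0);
    addr.devspace = (uint32_t) (logical_page % n_devspaces);
    addr.slot = logical_page / n_devspaces;
    return addr;
}
```

A pure modulo rule like this would move existing pages whenever a devspace is added. The glossary instead describes an explicit map from logical page numbers to physical page addresses, kept in the system devspace.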
|
||
The concept of just flexibly assigning space to databases,
|
||
with only two types of space that should be kept on
|
||
different spindlesets, plus the ability to add space
|
||
*while running* is what justifies their claim to much
|
||
easier admin than Oracle.
|
||
|
||
Many PostgreSQL sites run with far too few spindles anyway
|
||
and don't have DBAs with a clue what to do with tablespaces.
|
||
Now that SAP DB is also open source, making it easy for them
|
||
could be critically important.
|
||
|
||
I'm not even subscribed to pgsql-hackers and don't understand
|
||
the internals enough to have any view on whether it's possible
|
||
or how.
|
||
|
||
But if it is possible to present similar *concepts* to DBAs
|
||
from the "outside", with whatever actually goes on internally,
|
||
that would be really *great*.
|
||
|
||
Once the internals are done, others could more easily add
|
||
admin tools and documentation comparable to SAP DB. Given
|
||
the overwhelming advantages of PostgreSQL from all other
|
||
points of view, this could be critically important.
|
||
|
||
I was surprised to find no discussion of comparisons with
|
||
SAP DB and what could be learned from its source release
|
||
in a quick search of the web site and mailing lists.
|
||
|
||
Seeya, Albert
|
||
|
||
|
||
From pgsql-general-owner+M14288@postgresql.org Mon Aug 27 10:31:19 2001
|
||
Return-path: <pgsql-general-owner+M14288@postgresql.org>
|
||
Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f7REVIF27112
|
||
for <pgman@candle.pha.pa.us>; Mon, 27 Aug 2001 10:31:18 -0400 (EDT)
|
||
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
||
by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f7REVkq86991;
|
||
Mon, 27 Aug 2001 09:31:47 -0500 (CDT)
|
||
(envelope-from pgsql-general-owner+M14288@postgresql.org)
|
||
Received: from svana.org (svana.org [210.9.66.30])
|
||
by postgresql.org (8.11.3/8.11.4) with ESMTP id f7RDcEf82291
|
||
for <pgsql-general@postgresql.org>; Mon, 27 Aug 2001 09:38:15 -0400 (EDT)
|
||
(envelope-from kleptog@svana.org)
|
||
Received: from kleptog by svana.org with local (Exim 3.12 #1 (Debian))
|
||
id 15bMal-0000Ac-00; Mon, 27 Aug 2001 23:38:15 +1000
|
||
Date: Mon, 27 Aug 2001 23:38:15 +1000
|
||
From: Martijn van Oosterhout <kleptog@svana.org>
|
||
To: newsreader@mediaone.net
|
||
cc: Jeff Davis <list-pgsql-general@dynworks.com>, pgsql-general@postgresql.org
|
||
Subject: Re: [GENERAL] raw partition
|
||
Message-ID: <20010827233815.B32309@svana.org>
|
||
Reply-To: Martijn van Oosterhout <kleptog@svana.org>
|
||
References: <20010826125450.A11535@dragon.universe> <0GIP004VTV1MTO@mta7.pltn13.pbi.net> <20010827091141.A3208@dragon.universe>
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Disposition: inline
|
||
User-Agent: Mutt/1.2.5i
|
||
In-Reply-To: <20010827091141.A3208@dragon.universe>; from newsreader@mediaone.net on Mon, Aug 27, 2001 at 09:11:41AM -0400
|
||
Precedence: bulk
|
||
Sender: pgsql-general-owner@postgresql.org
|
||
Status: OR
|
||
|
||
On Mon, Aug 27, 2001 at 09:11:41AM -0400, newsreader@mediaone.net wrote:
|
||
> On Mon, Aug 27, 2001 at 12:46:16AM -0700, Jeff Davis wrote:
|
||
> > On Sunday 26 August 2001 09:54 am, you wrote:
|
||
> >
|
||
> > Obviously, if done properly, it couldn't hurt. However, is it really worth
|
||
> > the extra trouble to set it up, and more so, to debug an extra form that disk
|
||
>
|
||
> I think it's only a matter of getting rid
|
||
> of file system layer.
|
||
|
||
But that won't work. Postgres currently stores each table in its own file.
|
||
Thus, to implement raw access postgres would have to implement its own
|
||
filesystem within the raw partition.
|
||
|
||
By using the filesystems built into the OS, it can take advantage of
|
||
filesystem smarts already there. Not to mention people just being able to use
|
||
normal system commands to view what's there e.g. symlinks to relocate
|
||
tables. I believe that filesystem technology within the OS will advance much
|
||
faster than anything the postgres developers could come up with.
|
||
|
||
For example, by running your database on an ext3 partition, all file
|
||
metadata is automatically journalled, with no additional effort from the
|
||
postgres developers. You could even choose to journal all database access
|
||
(though I have no idea how that interacts with WAL).
|
||
|
||
> > marginal utility for integrated functionality? Consider this: should postgres
|
||
> > be its own OS; bootable and everything (get rid of all that OS overhead)? I
|
||
>
|
||
> file system overhead is all, I think.
|
||
> The only thing I am sure about is that
|
||
> whether pg (and developers) will have to be
|
||
> aware of the disk technology since it is
|
||
> evolving continuously. Or is there another
|
||
> layer provided by the OS: a layer
|
||
> between physical disk and the filesystem?
|
||
> That layer will have to understand UDMA technology,
|
||
> SCSI technology? I have no idea.
|
||
|
||
Well, a raw partition provided by the OS would hide such details. However,
|
||
postgres would have to make assumptions about what kind of access patterns
|
||
are optimal. The kernel is in a much better position to make such decisions
|
||
about resource usage. Which is precisely why we have OS's in the first place.
|
||
|
||
> > oracle allows this behaviour you speak of, but I have never used it. Does
|
||
> > someone have experience (or benchmarks or whatever) with oracle's
|
||
> > implementation?
|
||
>
|
||
> I have never used an oracle
|
||
|
||
I believe (someone correct me if I'm wrong) that even when used on a
|
||
filesystem, oracle still places all its tables in a single file, i.e. it has
|
||
a filesystem layer built in. I think that's why it's a clear win for oracle
|
||
because you *are* actually removing a layer.
|
||
|
||
IMHO it's something postgres should stay well away from.
|
||
--
|
||
Martijn van Oosterhout <kleptog@svana.org>
|
||
http://svana.org/kleptog/
|
||
> It would be nice if someone came up with a certification system that
|
||
> actually separated those who can barely regurgitate what they crammed over
|
||
> the last few weeks from those who command secret ninja networking powers.
|
||
|
||
---------------------------(end of broadcast)---------------------------
|
||
TIP 2: you can get off all lists at once with the unregister command
|
||
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
||
|
||
From jim@buttafuoco.net Sun Mar 3 14:34:59 2002
|
||
Return-path: <jim@buttafuoco.net>
|
||
Received: from dual.buttafuoco.net (vsat-148-63-214-126.c004.g4.mrt.starband.net [148.63.214.126])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g23KYjM24547
|
||
for <pgman@candle.pha.pa.us>; Sun, 3 Mar 2002 15:34:52 -0500 (EST)
|
||
Received: from buttafuoco.net (dual [127.0.0.1])
|
||
by dual.buttafuoco.net (8.11.2/8.11.2) with ESMTP id g23KYaF05729;
|
||
Sun, 3 Mar 2002 15:34:36 -0500
|
||
From: "Jim Buttafuoco" <jim@buttafuoco.net>
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>, jim@buttafuoco.net
|
||
cc: Vadim Mikheev <vmikheev@sectorbase.com>,
|
||
pgsql-hackers <pgsql-hackers@postgresql.org>
|
||
Reply-To: jim@buttafuoco.net
|
||
Subject: Re: [HACKERS] Status of index location patch
|
||
Date: Sun, 3 Mar 2002 15:34:36 -0500
|
||
Message-ID: <20020303153436.M48726@buttafuoco.net>
|
||
In-Reply-To: <200202221805.g1MI5o429319@candle.pha.pa.us>
|
||
References: <200109151754.f8FHsdB08189@dual.buttafuoco.net> <200202221805.g1MI5o429319@candle.pha.pa.us>
|
||
X-Mailer: Open WebMail 1.62 20020220
|
||
X-OriginatingIP: 192.1.3.22 (jim)
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain; charset=iso-8859-1
|
||
Status: ORr
|
||
|
||
Bruce,
|
||
|
||
I stopped all work on this since people seemed confused about the
|
||
tablespace/location words. I don't think enough of the "core" team likes
|
||
this idea. Am I wrong here? Did I explain the patch well enough?
|
||
|
||
Please let me know, I still am planning on doing it for internal use. I
|
||
would prefer that it was a standard feature. If you think I should still
|
||
pursue this, let me know what I need to do to get it off the ground.
|
||
|
||
Thanks for your help
|
||
Jim
|
||
|
||
|
||
|
||
> Jim, do you have an updated patch that you would like applied for 7.3?
|
||
>
|
||
> ---------------------------------------------------------------------------
|
||
>
|
||
> Jim Buttafuoco wrote:
|
||
> > Vadim,
|
||
> >
|
||
> > I guess I am still confused...
|
||
> >
|
||
> > In dbcommands.c resolve_alt_dbpath() takes the db oid as an argument.
|
||
> > This number is used to "find" the directory where the data files live.
|
||
> > All the patch does is put the indexes into a "db oid"_index directory
|
||
> > instead of "db oid"
|
||
> >
|
||
> >
|
||
> > This is for tables snprintf(ret, len, "%s/base/%u", prefix, dboid);
|
||
> > This is for indexes snprintf(ret, len, "%s/base/%u_index", prefix,
|
||
> > dboid);
|
||
> >
|
||
> > And in catalog.c
|
||
> > tables: sprintf(path, "%s/base/%u/%u", DataDir, rnode.tblNode,
|
||
> > rnode.relNode);
|
||
> > indexes: sprintf(path, "%s/base/%u_index/%u", DataDir,
|
||
> > rnode.tblNode,rnode.relNode);
|
||
> >
|
||
> > Can you explain how I would get the tblNode for an existing database
|
||
> > index files if it doesn't have the same OID as the database entry in
|
||
> > pg_database.
|
||
> >
|
||
> > Jim
|
||
> >
|
||
> >
|
||
> > > > Just wondering what is the status of this patch. It seems from
|
||
> > comments
|
||
> > > > that people like the idea. I have also looked in the archives for
|
||
> > other
|
||
> > > > people looking for this kind of feature and have found a lot of
|
||
> > interest.
|
||
> > > >
|
||
> > > > If you think it is a good idea for 7.2, let me know what needs to be
|
||
> > > > changed and I will work on it this weekend.
|
||
> > >
|
||
> > > Just change index' dir naming as was already discussed.
|
||
> > >
|
||
> > > Vadim
|
||
> > >
|
||
> > >
|
||
> >
|
||
> >
|
||
> >
|
||
> >
|
||
>
|
||
> --
|
||
> Bruce Momjian | http://candle.pha.pa.us
|
||
> pgman@candle.pha.pa.us | (610) 853-3000
|
||
> + If your life is a hard drive, | 830 Blythe Avenue
|
||
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
|
||
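The two path formats quoted above (tables under "base/<db oid>", indexes under "base/<db oid>_index") can be sketched as one bounded helper. This is an illustration only: `build_rel_path` is a hypothetical name, the OIDs in the usage note are made up, and `snprintf` is used throughout even though the quoted catalog.c lines use `sprintf`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;

/*
 * Write "<datadir>/base/<dboid>/<relnode>" for tables, or
 * "<datadir>/base/<dboid>_index/<relnode>" for indexes, into buf.
 * Returns what snprintf returns: the length the full path needs,
 * so a result >= len means the path was truncated.
 */
int
build_rel_path(char *buf, size_t len, const char *datadir,
               Oid dboid, Oid relnode, bool is_index)
{
    assert(buf != NULL && datadir != NULL);
    return snprintf(buf, len, "%s/base/%u%s/%u",
                    datadir, dboid, is_index ? "_index" : "", relnode);
}
```

For example, `build_rel_path(buf, sizeof buf, "/usr/local/pgsql/data", 100, 200, true)` fills `buf` with "/usr/local/pgsql/data/base/100_index/200".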
|
||
|
||
|
||
|
||
From lockhart@fourpalms.org Tue Mar 5 08:02:50 2002
|
||
Return-path: <lockhart@fourpalms.org>
|
||
Received: from golem.fourpalms.org (www.fourpalms.org [64.3.68.148])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g25E2oY04958
|
||
for <pgman@candle.pha.pa.us>; Tue, 5 Mar 2002 09:02:50 -0500 (EST)
|
||
Received: from fourpalms.org (localhost.localdomain [127.0.0.1])
|
||
by golem.fourpalms.org (Postfix) with ESMTP
|
||
id CACDD1BC83; Tue, 5 Mar 2002 06:02:47 -0800 (PST)
|
||
Sender: lockhart@fourpalms.org
|
||
Message-ID: <3C84D007.46FB30B6@fourpalms.org>
|
||
Date: Tue, 05 Mar 2002 06:02:47 -0800
|
||
From: Thomas Lockhart <lockhart@fourpalms.org>
|
||
Reply-To: lockhart@fourpalms.org
|
||
Organization: Yes
|
||
X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.8-34.1mdksmp i686)
|
||
X-Accept-Language: en
|
||
MIME-Version: 1.0
|
||
To: Tom Lane <tgl@sss.pgh.pa.us>
|
||
cc: Bruce Momjian <pgman@candle.pha.pa.us>, jim@buttafuoco.net,
|
||
pgsql-hackers <pgsql-hackers@postgresql.org>
|
||
Subject: Re: Storage Location Patch Proposal for V7.3
|
||
References: <200203050631.g256Vh924330@candle.pha.pa.us> <13961.1015313407@sss.pgh.pa.us>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
Status: OR
|
||
|
||
...
|
||
> Forward compatibility to a future tablespace implementation.
|
||
> If we do this, we'll be stuck with supporting this feature set,
|
||
> not to mention this syntax; neither of which have garnered any
|
||
> support from the assembled hackers.
|
||
|
||
The feature set (in some incarnation) is exactly something we should
|
||
have. "Tablespace" could mean almost anything, since (I recall that) we
|
||
are not slavishly copying the Oracle features having a similar name. The
|
||
syntax (or something similar) seems acceptable to me. I haven't looked
|
||
at the implementation itself.
|
||
|
||
So, I'll guess that the particular objection to this implementation is
|
||
along the lines of wanting to be able to manage tablespaces/locations as
|
||
a single entity? So that one could issue commands like (forgive the
|
||
syntax) "move tablespace xxx to yyy;" and be able to yank the entire
|
||
contents from one place to another in a single line?
|
||
|
||
Jim's patches don't explicitly tie the pieces residing in a single
|
||
location together. Is that the objection? In all other respects (and
|
||
perhaps in all respects period) it seems to be a good starting point at
|
||
least.
|
||
|
||
I know that you have said that you want to look at "tablespaces" for
|
||
7.3. If we get there with a feature set we all find acceptable, then
|
||
great. If we don't, then Jim's subset of features would be great to
|
||
have.
|
||
|
||
Comments?
|
||
|
||
- Thomas
|
||
|
||
From pgsql-hackers-owner+M19763@postgresql.org Wed Mar 6 19:50:47 2002
|
||
Return-path: <pgsql-hackers-owner+M19763@postgresql.org>
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g271okY15943
|
||
for <pgman@candle.pha.pa.us>; Wed, 6 Mar 2002 20:50:46 -0500 (EST)
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with SMTP
|
||
id 220A3475B48; Wed, 6 Mar 2002 20:49:59 -0500 (EST)
|
||
Received: from dual.buttafuoco.net (vsat-148-63-214-126.c004.g4.mrt.starband.net [148.63.214.126])
|
||
by postgresql.org (Postfix) with ESMTP id 4D925475881
|
||
for <pgsql-hackers@postgresql.org>; Wed, 6 Mar 2002 20:44:51 -0500 (EST)
|
||
Received: from buttafuoco.net (dual [127.0.0.1])
|
||
by dual.buttafuoco.net (8.11.2/8.11.2) with ESMTP id g271ihm25853
|
||
for <pgsql-hackers@postgresql.org>; Wed, 6 Mar 2002 20:44:43 -0500
|
||
From: "Jim Buttafuoco" <jim@buttafuoco.net>
|
||
To: "pgsql-hackers" <pgsql-hackers@postgresql.org>
|
||
Reply-To: jim@buttafuoco.net
|
||
Subject: [HACKERS] Storage Location / Tablespaces (try 3)
|
||
Date: Wed, 6 Mar 2002 20:44:43 -0500
|
||
Message-ID: <20020306204443.M82891@buttafuoco.net>
|
||
X-Mailer: Open WebMail 1.62 20020220
|
||
X-OriginatingIP: 192.1.3.22 (jim)
|
||
MIME-Version: 1.0
|
||
Content-Type: text/plain; charset=iso-8859-1
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@postgresql.org
|
||
Status: OR
|
||
|
||
Me again, I have some more details on my storage location patch
|
||
|
||
|
||
|
||
This patch would allow the system admin (DBA) to specify the location of
|
||
databases, tables/indexes and temporary objects (temp tables and temp sort
|
||
space) independent of the database/system default location. This patch would
|
||
replace the current "LOCATION" code.
|
||
|
||
Please let me know if you have any questions/comments. I would like to see
|
||
this feature make 7.3. I believe it will take about 1 month of coding and
|
||
testing after I get started.
|
||
|
||
Thanks
|
||
Jim
|
||
|
||
==============================================================================
|
||
Storage Location Patch (Try 3)
|
||
|
||
|
||
(If people like TABLESPACE instead of LOCATION then s/LOCATION/TABLESPACE/g
|
||
below)
|
||
|
||
|
||
This patch would add the following NEW commands
|
||
----------------------------------------------------
|
||
CREATE LOCATION name PATH 'dbpath';
|
||
DROP LOCATION name;
|
||
|
||
where dbpath is any directory that the postgresql backend can write to.
|
||
(I know this is how Oracle works, don't know about the other major db systems)
|
||
|
||
The following NEW GLOBAL system table would be added.
|
||
-----------------------------------------------------
|
||
PG_LOCATION
|
||
(
|
||
LOC_NAME name,
|
||
LOC_PATH text -- This should be able to take any path name.
|
||
);
|
||
(initdb would add (PGDATA,'/usr/local/pgsql/data'))
|
||
|
||
The following system tables would need to be modified
|
||
-----------------------------------------------------
|
||
PG_DATABASE drop datpath
|
||
add DATA_LOC_NAME name or DATA_LOC_OID OID
|
||
add INDEX_LOC_NAME name or INDEX_LOC_OID OID
|
||
add TEMP_LOC_NAME name or TEMP_LOC_OID OID
|
||
PG_CLASS to add LOC_NAME name or LOC_OID OID
|
||
|
||
DATA_LOC_* and INDEX_LOC_* would default to PGDATA if not specified.
|
||
|
||
(I like *LOC_NAME better but I believe the rest of the system tables use OID)
|
||
|
||
|
||
The following command syntax would be modified
|
||
------------------------------------------------------
|
||
CREATE DATABASE name WITH DATA_LOCATION=XXX INDEX_LOCATION=YYY TEMP_LOCATION=ZZZ;
|
||
CREATE TABLE aaa (...) WITH LOCATION=XXX;
|
||
CREATE TABLE bbb (c1 text primary key location CCC) WITH LOCATION=XXX;
|
||
CREATE TABLE ccc (c2 text unique location CCC) WITH LOCATION=XXX;
|
||
CREATE INDEX XXX on SAMPLE (C2) WITH LOCATION=BBB;
|
||
|
||
|
||
|
||
Now for an example
|
||
------------------------------------------------------
|
||
First:
|
||
postgresql is installed at /usr/local/pgsql
|
||
userid postgres
|
||
the postgres user also is the owner of /pg01 /pg02 /pg03
|
||
|
||
the dba executes the following script
|
||
CREATE LOCATION pg01 PATH '/pg01';
|
||
CREATE LOCATION pg02 PATH '/pg02';
|
||
CREATE LOCATION pg03 PATH '/pg03';
|
||
CREATE LOCATION bigdata PATH '/bigdata';
|
||
CREATE LOCATION bigidx PATH '/bigidx';
|
||
\q
|
||
|
||
PG_LOCATION now has
|
||
pg01 | /pg01
|
||
pg02 | /pg02
|
||
pg03 | /pg03
|
||
bigdata | /bigdata
|
||
bigidx | /bigidx
|
||
|
||
Now the following command is run
|
||
CREATE DATABASE jim1 WITH DATA_LOCATION='pg01' INDEX_LOCATION='pg02'
|
||
TEMP_LOCATION='pg03';
|
||
-- OID of 'jim1' tuple is 1786146
|
||
|
||
on disk the directories look like this
|
||
/pg01/1786146 <<-- Default DATA Location
|
||
/pg02/1786146 <<-- Default INDEX Location
|
||
/pg03/1786146 <<-- Default Temp Location
|
||
|
||
All files in the above directories will be reachable through symbolic links in
|
||
/usr/local/pgsql/data/base/1786146/
|
||
|
||
|
||
|
||
Now the system will have 1 BIG table that will get its own disk for data and
|
||
its own disk for index
|
||
create table big (a text,b text ..., primary key (a,b) location 'bigidx') with location='bigdata';
|
||
|
||
oid of big table is 1786150
|
||
oid of big table primary key index is 1786151
|
||
|
||
on disk directories look like this
|
||
/bigdata/1786146/1786150
|
||
/bigidx/1786146/1786151
|
||
/usr/local/pgsql/data/base/1786146/1786150 symbolic link to
|
||
/bigdata/1786146/1786150
|
||
/usr/local/pgsql/data/base/1786146/1786151 symbolic link to
|
||
/bigidx/1786146/1786151
|
||
|
||
|
||
|
||
The symbolic links will enable the rest of the software to be location
|
||
independent.
|
||
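A rough sketch of the layout in the example above: look up a location by name, then derive both the real file path under the location and the symbolic-link path under PGDATA. The `Location` struct mirrors the two proposed PG_LOCATION columns; the function names are hypothetical and nothing here comes from the actual patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;

/* One row of the proposed PG_LOCATION table (LOC_NAME, LOC_PATH). */
typedef struct Location
{
    const char *name;
    const char *path;
} Location;

/* Find a location by name; returns NULL if no row matches. */
const Location *
find_location(const Location *locs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(locs[i].name, name) == 0)
            return &locs[i];
    return NULL;
}

/* Real file: "<loc_path>/<dboid>/<reloid>". */
int
location_file_path(char *buf, size_t len, const Location *loc,
                   Oid dboid, Oid reloid)
{
    assert(loc != NULL);
    return snprintf(buf, len, "%s/%u/%u", loc->path, dboid, reloid);
}

/* Symbolic link seen by the rest of the backend: "<pgdata>/base/<dboid>/<reloid>". */
int
base_link_path(char *buf, size_t len, const char *pgdata,
               Oid dboid, Oid reloid)
{
    return snprintf(buf, len, "%s/base/%u/%u", pgdata, dboid, reloid);
}
```

For the primary key index in the example (database 1786146, index 1786151, location bigidx), the first function yields "/bigidx/1786146/1786151" and the second "/usr/local/pgsql/data/base/1786146/1786151".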
|
||
|
||
|
||
|
||
From pgsql-hackers-owner+M19814@postgresql.org Thu Mar 7 17:25:06 2002
|
||
Return-path: <pgsql-hackers-owner+M19814@postgresql.org>
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g27NP4Q10967
|
||
for <pgman@candle.pha.pa.us>; Thu, 7 Mar 2002 18:25:05 -0500 (EST)
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with SMTP
|
||
id 74CC94761DE; Thu, 7 Mar 2002 17:50:44 -0500 (EST)
|
||
Received: from sss.pgh.pa.us (unknown [192.204.191.242])
|
||
by postgresql.org (Postfix) with ESMTP id 712F0476101
|
||
for <pgsql-hackers@postgresql.org>; Thu, 7 Mar 2002 17:47:04 -0500 (EST)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g27MkaS15710;
|
||
Thu, 7 Mar 2002 17:46:41 -0500 (EST)
|
||
To: jim@buttafuoco.net
|
||
cc: "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at>,
|
||
"pgsql-hackers" <pgsql-hackers@postgresql.org>
|
||
Subject: Re: [HACKERS] Storage Location / Tablespaces (try 3)
|
||
In-Reply-To: <20020307160519.M90856@buttafuoco.net>
|
||
References: <46C15C39FEB2C44BA555E356FBCD6FA4961D67@m0114.s-mxs.net> <20020307160519.M90856@buttafuoco.net>
|
||
Comments: In-reply-to "Jim Buttafuoco" <jim@buttafuoco.net>
|
||
message dated "Thu, 07 Mar 2002 16:05:19 -0500"
|
||
Date: Thu, 07 Mar 2002 17:46:36 -0500
|
||
Message-ID: <15707.1015541196@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@postgresql.org
|
||
Status: OR
|
||
|
||
"Jim Buttafuoco" <jim@buttafuoco.net> writes:
|
||
> My first try passed the tablespace OID around but someone pointed out that the
|
||
> WAL code doesn't know what the tablespace OID is or what its location is.
|
||
|
||
The low-level file access code (including WAL references) names tables
|
||
by two OIDs, which currently are database OID and relfilenode (the
|
||
latter is NOT to be considered equivalent to table OID, even though it
|
||
presently always is equal).
|
||
|
||
I believe that the correct implementation approach is to revise things
|
||
so that the low-level name of a table is tablespace OID + relfilenode;
|
||
this physical table name would in concept be completely distinct from
|
||
the logical table identification (database OID + table OID). The file
|
||
reference path would become something like
|
||
"$PGDATA/base/tablespaceoid/relfilenode", where tablespaceoid might
|
||
reference a symlink to a directory instead of a plain directory.
|
||
Tablespace management then consists of setting up those symlinks
|
||
correctly, and there is essentially zero impact on the low-level access
|
||
code.
|
||
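To make the distinction concrete, here is a small sketch that keeps the logical identification (database OID + table OID) and the physical name (tablespace OID + relfilenode) in separate types, with the file path derived only from the physical one. The type and function names are hypothetical; this is not how any PostgreSQL release declares them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;

/* Logical identification: which table, as the catalogs see it. */
typedef struct LogicalRelId
{
    Oid database;
    Oid table;
} LogicalRelId;

/* Physical name: which file, as the low-level access code and WAL see it. */
typedef struct PhysicalRelId
{
    Oid tablespace;
    Oid relfilenode;
} PhysicalRelId;

/*
 * Build "<pgdata>/base/<tablespaceoid>/<relfilenode>".  The tablespace
 * component may be a symlink on disk; this code neither knows nor cares.
 */
int
physical_rel_path(char *buf, size_t len, const char *pgdata, PhysicalRelId id)
{
    assert(buf != NULL && pgdata != NULL);
    return snprintf(buf, len, "%s/base/%u/%u",
                    pgdata, id.tablespace, id.relfilenode);
}
```

Because the path function takes only a `PhysicalRelId`, giving a table a new relfilenode (for table versioning) changes the file without touching its `LogicalRelId`.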
|
||
The hard part of this is that we are probably being sloppy in some
|
||
places about the difference between physical and logical table
|
||
identifications. Those places will need to be found and fixed.
|
||
This needs to happen anyway, of course, since the point of introducing
|
||
relfilenode was to allow table versioning, which we still want.
|
||
|
||
Vadim suggested long ago that bufmgr, smgr, and below should have
|
||
nothing to do with referencing files by relcache entries; they should
|
||
only deal in physical file identifiers. That requires some tedious but
|
||
(in principle) straightforward API changes.
|
||
|
||
BTW, if tablespaces can be shared by databases then DROP DATABASE
|
||
becomes rather tricky: how do you zap the correct files out of a shared
|
||
tablespace, keeping in mind that you are not logged into the doomed
|
||
database and can't look at its catalogs? The best idea I've seen for
|
||
this so far is:
|
||
|
||
1. Access path for tables is really
|
||
$PGDATA/base/databaseoid/tablespaceoid/relfilenode.
|
||
(BTW, we could save some work if we chdir'd into
|
||
$PGDATA/base/databaseoid at backend start and then used only relative
|
||
tablespaceoid/relfilenode paths. Right now we tend to use absolute
|
||
paths because the bootstrap code doesn't do that chdir; which seems
|
||
like a stupid solution...)
|
||
|
||
2. A shared tablespace directory contains a subdirectory for each database
|
||
that has files in the tablespace. Thus, the actual filesystem location
|
||
of a table is something like
|
||
<tablespace>/databaseoid/relfilenode
|
||
The symlink from a database's $PGDATA/base/databaseoid/ directory to
|
||
the tablespace points at <tablespace>/databaseoid. The first attempt to
|
||
create a table in a tablespace from a particular database will create
|
||
the hard subdirectory and set up the symlink; or perhaps that should be
|
||
done by an explicit tablespace management operation to "connect" the
|
||
database to the tablespace.
|
||
|
||
3. To drop a database, we examine the symlinks in its
|
||
$PGDATA/base/databaseoid/ and rm -rf each referenced tablespace
|
||
subdirectory before rm -rf'ing $PGDATA/base/databaseoid.
|
||
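Steps 2 and 3 can be sketched with plain POSIX calls. This is a sketch under stated assumptions, not backend code: `connect_tablespace` and `list_tablespace_targets` are hypothetical names, error handling is minimal, and the actual removal ("rm -rf") of each returned directory is left to the caller.

```c
#define _XOPEN_SOURCE 700   /* expose lstat, readlink, symlink in strict modes */
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef unsigned int Oid;

/*
 * Step 2: create the hard subdirectory <tablespace_dir>/<dboid> and make
 * <db_dir>/<tsoid> a symlink to it.  Returns 0 on success, -1 on error.
 */
int
connect_tablespace(const char *db_dir, const char *tablespace_dir,
                   Oid dboid, Oid tsoid)
{
    char target[PATH_MAX];
    char link_path[PATH_MAX];

    assert(db_dir != NULL && tablespace_dir != NULL);
    snprintf(target, sizeof target, "%s/%u", tablespace_dir, dboid);
    snprintf(link_path, sizeof link_path, "%s/%u", db_dir, tsoid);
    if (mkdir(target, 0700) != 0)
        return -1;
    return symlink(target, link_path);
}

/*
 * Step 3, first half: collect the target of every symlink in db_dir.
 * These are the per-database tablespace subdirectories to remove before
 * db_dir itself.  Returns the number of targets stored, or -1 on error.
 */
int
list_tablespace_targets(const char *db_dir, char targets[][PATH_MAX],
                        int max_targets)
{
    DIR *dir = opendir(db_dir);
    struct dirent *de;
    int count = 0;

    if (dir == NULL)
        return -1;
    while (count < max_targets && (de = readdir(dir)) != NULL)
    {
        char path[PATH_MAX];
        struct stat st;
        ssize_t n;

        snprintf(path, sizeof path, "%s/%s", db_dir, de->d_name);
        if (lstat(path, &st) != 0 || !S_ISLNK(st.st_mode))
            continue;               /* plain entries go away with db_dir */
        n = readlink(path, targets[count], PATH_MAX - 1);
        if (n < 0)
            continue;
        targets[count][n] = '\0';   /* readlink() does not NUL-terminate */
        count++;
    }
    closedir(dir);
    return count;
}
```

Note that the drop path reads only the filesystem, which matches the point that the backend dropping a database cannot look at that database's catalogs.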
|
||
regards, tom lane
|
||
|
||
|
||
From pgsql-general-owner+M22554=candle.pha.pa.us=pgman@postgresql.org Mon Mar 25 01:56:17 2002
|
||
Return-path: <pgsql-general-owner+M22554=candle.pha.pa.us=pgman@postgresql.org>
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g2P7uGa20556
|
||
for <pgman@candle.pha.pa.us>; Mon, 25 Mar 2002 02:56:16 -0500 (EST)
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with SMTP id D28B2475B61
|
||
for <pgman@candle.pha.pa.us>; Mon, 25 Mar 2002 02:56:17 -0500 (EST)
|
||
Received: from sss.pgh.pa.us (unknown [192.204.191.242])
|
||
by postgresql.org (Postfix) with ESMTP id EB3244758E9
|
||
for <pgsql-general@postgresql.org>; Mon, 25 Mar 2002 02:55:54 -0500 (EST)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g2P7toS17527;
|
||
Mon, 25 Mar 2002 02:55:50 -0500 (EST)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: Richard Emberson <emberson@phc.net>, pgsql-general@postgresql.org
|
||
Subject: Re: [GENERAL] Large Object Location in 7.3
|
||
In-Reply-To: <200203241932.g2OJWGV00796@candle.pha.pa.us>
|
||
References: <200203241932.g2OJWGV00796@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Sun, 24 Mar 2002 14:32:16 -0500"
|
||
Date: Mon, 25 Mar 2002 02:55:50 -0500
|
||
Message-ID: <17524.1017042950@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Precedence: bulk
|
||
Sender: pgsql-general-owner@postgresql.org
|
||
Status: OR
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> Richard Emberson wrote:
|
||
>> I expect (actually hope) to have thousands and thousands of blob/clobs
|
||
>> in the db I am designing.
|
||
>> I would like such largeobjects to be stored in their own file system.
|
||
|
||
> Sure, find the oid of pg_largeobject and symlink that to another file
|
||
> system. You need to do that for the toast table and any indexes for the table
|
||
> too.
|
||
|
||
If Richard's envisioning more than 1GB of large objects, I don't think
|
||
he's going to be very satisfied with manual symlinking.
|
||
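The message does not spell out why 1GB matters. One likely reason is that PostgreSQL stores a large relation as a series of segment files, so a hand-made symlink for the first file does not cover the ones created later. A rough sketch, assuming the default 1GB segment size and the "<relfilenode>", "<relfilenode>.1", "<relfilenode>.2" naming; the function names are hypothetical and the count is approximate at exact segment boundaries.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;

/* Assumed segment size: 1GB, the default. */
#define SEGMENT_BYTES 1073741824ULL

/* Approximate number of segment files for a relation of rel_bytes bytes. */
unsigned int
segment_count(unsigned long long rel_bytes)
{
    if (rel_bytes == 0)
        return 1;               /* an empty relation still has its first file */
    return (unsigned int) ((rel_bytes + SEGMENT_BYTES - 1) / SEGMENT_BYTES);
}

/* File name of segment segno: no suffix for the first, ".N" afterwards. */
int
segment_file_name(char *buf, size_t len, Oid relfilenode, unsigned int segno)
{
    assert(buf != NULL);
    if (segno == 0)
        return snprintf(buf, len, "%u", relfilenode);
    return snprintf(buf, len, "%u.%u", relfilenode, segno);
}
```

Under these assumptions a 2.5GB pg_largeobject would occupy three files, each of which would need its own symlink.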
|
||
This does bring up an interesting point: the tablespace schemes we've
|
||
discussed so far don't allow system catalogs to be moved out of the
|
||
default tablespace for a database. That doesn't bother me for most
|
||
of the system catalogs ... but pg_largeobject seems like it might be
|
||
an exception.
|
||
|
||
regards, tom lane
|
||
|
||
---------------------------(end of broadcast)---------------------------
|
||
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
||
|