mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-01-06 15:24:56 +08:00
From Inoue@tpf.co.jp Tue Jan 18 19:08:30 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA10148
    for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 20:08:27 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id KAA02790; Wed, 19 Jan 2000 10:08:02 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "pgsql-hackers" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Index recreation in vacuum
Date: Wed, 19 Jan 2000 10:13:40 +0900
Message-ID: <000201bf621a$6b9baf20$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
In-Reply-To: <200001181821.NAA02988@candle.pha.pa.us>
Status: ROr

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> [Charset iso-8859-1 unsupported, filtering to ASCII...]
> > Hi all,
> >
> > I'm trying to implement REINDEX command.
> >
> > REINDEX operation itself is available everywhere and
> > I've thought about applying it to VACUUM.
>
> That is a good idea. Vacuuming of indexes can be very slow.
>
> > .
> > My plan is as follows.
> >
> > Add a new option to force index recreation in vacuum
> > and if index recreation is specified.
>
> Couldn't we auto-recreate indexes based on the number of tuples moved by
> vacuum,

Yes, we could probably do it. But I'm not sure about the availability of
the new vacuum.

The new vacuum would give us big advantages:
1) Much faster than the current one if vacuum removes/moves many tuples.
2) It does shrink index files.

But in case of abort/crash:
1) we couldn't choose an index scan for the table
2) unique constraints of the table would be lost

I don't know how people estimate this disadvantage.

>
> > Now I'm inclined to use relhasindex of pg_class to
> > validate/invalidate indexes of a table at once.
>
> There are a few calls to CatalogIndexInsert() that know the system table they
> are using and know it has indexes, so it does not check that field. You
> could add cases for that.
>

I think there aren't so many places to check.
I will examine it if my idea is OK.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From owner-pgsql-hackers@hub.org Tue Jan 18 19:57:21 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA11764
    for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 20:57:19 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id UAA50653;
    Tue, 18 Jan 2000 20:52:38 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 18 Jan 2000 20:52:30 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id UAA50513
    for pgsql-hackers-outgoing; Tue, 18 Jan 2000 20:51:32 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id UAA50462
    for <pgsql-hackers@postgreSQL.org>; Tue, 18 Jan 2000 20:51:06 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id UAA11421;
    Tue, 18 Jan 2000 20:50:50 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001190150.UAA11421@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <000201bf621a$6b9baf20$2801007e@tpf.co.jp> from Hiroshi Inoue at
    "Jan 19, 2000 10:13:40 am"
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Tue, 18 Jan 2000 20:50:50 -0500 (EST)
CC: pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: ROr

> > > Add a new option to force index recreation in vacuum
> > > and if index recreation is specified.
> >
> > Couldn't we auto-recreate indexes based on the number of tuples moved by
> > vacuum,
>
> Yes, we could probably do it. But I'm not sure about the availability of
> the new vacuum.
>
> The new vacuum would give us big advantages:
> 1) Much faster than the current one if vacuum removes/moves many tuples.
> 2) It does shrink index files.
>
> But in case of abort/crash:
> 1) we couldn't choose an index scan for the table
> 2) unique constraints of the table would be lost
>
> I don't know how people estimate this disadvantage.

That's why I was recommending rename(). The actual window of
vulnerability goes from perhaps hours to fractions of a second.

In fact, if I understand this right, you could make the vulnerability
zero by just performing the rename as one operation.

In fact, for REINDEX cases where you don't have a lock on the entire
table as you do in vacuum, you could reindex the table with a simple
read-lock on the base table and index, and move the new index into place
with the users seeing no change. Only people traversing the index
during the change would have a problem. You just need exclusive
access on the index for the duration of the rename() so no one is
traversing the index during the rename().

Destroying the index and recreating it opens a large time span in which
there is no index, and you have to jury-rig something so people don't try
to use the index. With rename() you just put the new index in place with
one operation. Just don't let people traverse the index during the
change. The pointers to the heap tuples are the same in both indexes.
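
The rename() idea can be sketched in a few lines. This is only an illustration on plain files, not PostgreSQL code: the file layout and the `rebuild_index`, `swap_in_new_index` and `read_index` helpers are made up for the example, and `os.replace` is used because it performs an atomic rename on POSIX file systems when both paths are on the same file system.

```python
import os
import tempfile


def rebuild_index(heap_rows):
    """Hypothetical stand-in for an index build: sorted (key, heap position) pairs."""
    return sorted((key, pos) for pos, key in enumerate(heap_rows))


def swap_in_new_index(index_path, heap_rows):
    """Write the new index beside the old one, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(index_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="newidx_")
    with os.fdopen(fd, "w") as f:
        for key, pos in rebuild_index(heap_rows):
            f.write(f"{key} {pos}\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the new file is on disk before the swap
    # The old index stays usable until this single call; a crash before it
    # leaves the old file in place, a crash after it leaves the new one.
    os.replace(tmp_path, index_path)


def read_index(index_path):
    """Read the index file back as a list of (key, position) string pairs."""
    with open(index_path) as f:
        return [tuple(line.split()) for line in f]
```

The long-running part (building the new file) happens while the old index is still in place; only the final rename needs readers kept out.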

In fact, with WAL, we will allow multiple physical files for the same
table by appending the table oid to the file name. In this case, the
old index could be deleted by rename, and people would continue to use
the old index until they closed their open file pointers. Not sure how
this works in practice because new tuples would not be inserted into the
old copy of the index.
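
The behaviour described here, where a session that already has the old file open keeps using it after the rename, can be observed with ordinary files. A minimal sketch, assuming POSIX rename semantics (it does not hold on Windows); the file names and the `demo_stale_handle` function are invented for the example. It also shows the concern raised in the last sentence: entries written after the swap never reach readers of the old file.

```python
import os
import tempfile


def demo_stale_handle():
    """Return (what an old reader sees, what a new reader sees) after a rename."""
    directory = tempfile.mkdtemp()
    index_path = os.path.join(directory, "index")
    new_path = os.path.join(directory, "index.new")

    with open(index_path, "w") as f:
        f.write("old index contents\n")
    with open(new_path, "w") as f:
        f.write("new index contents\n")

    old_reader = open(index_path)      # a session that opened the index earlier
    os.replace(new_path, index_path)   # the old file loses its name here

    # Later inserts go to the file now reachable by name ...
    with open(index_path, "a") as f:
        f.write("entry added after the swap\n")

    seen_by_old = old_reader.read()    # ... but the old handle still reads the old file
    old_reader.close()
    with open(index_path) as f:
        seen_by_new = f.read()
    return seen_by_old, seen_by_new
```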

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************

From pgman Tue Jan 18 20:04:11 2000
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id VAA11990;
    Tue, 18 Jan 2000 21:04:11 -0500 (EST)
From: Bruce Momjian <pgman>
Message-Id: <200001190204.VAA11990@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <200001190150.UAA11421@candle.pha.pa.us> from Bruce Momjian at "Jan
    18, 2000 08:50:50 pm"
To: Bruce Momjian <pgman@candle.pha.pa.us>
Date: Tue, 18 Jan 2000 21:04:11 -0500 (EST)
CC: Hiroshi Inoue <Inoue@tpf.co.jp>,
    pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Status: RO

> > I don't know how people estimate this disadvantage.
>
> That's why I was recommending rename(). The actual window of
> vulnerability goes from perhaps hours to fractions of a second.
>
> In fact, if I understand this right, you could make the vulnerability
> zero by just performing the rename as one operation.
>
> In fact, for REINDEX cases where you don't have a lock on the entire
> table as you do in vacuum, you could reindex the table with a simple
> read-lock on the base table and index, and move the new index into place
> with the users seeing no change. Only people traversing the index
> during the change would have a problem. You just need exclusive
> access on the index for the duration of the rename() so no one is
> traversing the index during the rename().
>
> Destroying the index and recreating it opens a large time span in which
> there is no index, and you have to jury-rig something so people don't try
> to use the index. With rename() you just put the new index in place with
> one operation. Just don't let people traverse the index during the
> change. The pointers to the heap tuples are the same in both indexes.
>
> In fact, with WAL, we will allow multiple physical files for the same
> table by appending the table oid to the file name. In this case, the
> old index could be deleted by rename, and people would continue to use
> the old index until they closed their open file pointers. Not sure how
> this works in practice because new tuples would not be inserted into the
> old copy of the index.

Maybe I am all wrong here. Maybe most of the advantages of rename() are
meaningless with reindex used during vacuum, which is the most
important use of reindex.

Let's look at index use during vacuum. Right now, how does vacuum
handle indexes when it moves a tuple? Does it do each index update as
it moves a tuple? Is that why it is so slow?

If we don't do that and vacuum fails, what state is the table left in?
If we don't update the index for every tuple, the index is invalid after
a vacuum failure. rename() is not going to help us here. It keeps the
old index around, but the index is invalid anyway, right?
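
To make the failure case concrete, here is a toy model, not PostgreSQL's actual data structures: the heap is a list of slots, the index is a dict from key to slot number, and `compact_heap` is a hypothetical helper that moves live tuples toward the front without touching the index. If vacuum stopped right after such a step, every index entry for a moved tuple would point at the wrong slot, whether or not the old index file was kept around.

```python
def compact_heap(heap):
    """Move live tuples (non-None) to the front; return the compacted heap."""
    live = [t for t in heap if t is not None]
    return live + [None] * (len(heap) - len(live))


def build_index(heap):
    """Map each key to the slot that holds it."""
    return {t: slot for slot, t in enumerate(heap) if t is not None}


def stale_entries(heap, index):
    """Keys whose index entry no longer points at the matching heap tuple."""
    return sorted(k for k, slot in index.items() if heap[slot] != k)


heap = ["a", None, "b", None, "c"]      # None marks an expired tuple
index = build_index(heap)               # {'a': 0, 'b': 2, 'c': 4}
moved = compact_heap(heap)              # ['a', 'b', 'c', None, None]

print(stale_entries(moved, index))               # prints ['b', 'c']
print(stale_entries(moved, build_index(moved)))  # prints [] after a full rebuild
```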

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

From Inoue@tpf.co.jp Tue Jan 18 20:18:48 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA12437
    for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 21:18:46 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id LAA02845; Wed, 19 Jan 2000 11:18:18 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "pgsql-hackers" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Index recreation in vacuum
Date: Wed, 19 Jan 2000 11:23:55 +0900
Message-ID: <000801bf6224$3bfdd9a0$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
In-Reply-To: <200001190204.VAA11990@candle.pha.pa.us>
Status: ROr

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > > I don't know how people estimate this disadvantage.
> >
> > That's why I was recommending rename(). The actual window of
> > vulnerability goes from perhaps hours to fractions of a second.
> >
> > In fact, if I understand this right, you could make the vulnerability
> > zero by just performing the rename as one operation.
> >
> > In fact, for REINDEX cases where you don't have a lock on the entire
> > table as you do in vacuum, you could reindex the table with a simple
> > read-lock on the base table and index, and move the new index into place
> > with the users seeing no change. Only people traversing the index
> > during the change would have a problem. You just need exclusive
> > access on the index for the duration of the rename() so no one is
> > traversing the index during the rename().
> >
> > Destroying the index and recreating it opens a large time span in which
> > there is no index, and you have to jury-rig something so people don't try
> > to use the index. With rename() you just put the new index in place with
> > one operation. Just don't let people traverse the index during the
> > change. The pointers to the heap tuples are the same in both indexes.
> >
> > In fact, with WAL, we will allow multiple physical files for the same
> > table by appending the table oid to the file name. In this case, the
> > old index could be deleted by rename, and people would continue to use
> > the old index until they closed their open file pointers. Not sure how
> > this works in practice because new tuples would not be inserted into the
> > old copy of the index.
>
> Maybe I am all wrong here. Maybe most of the advantages of rename() are
> meaningless with reindex used during vacuum, which is the most
> important use of reindex.
>
> Let's look at index use during vacuum. Right now, how does vacuum
> handle indexes when it moves a tuple? Does it do each index update as
> it moves a tuple? Is that why it is so slow?
>

Yes, I believe so. It's necessary to keep consistency between the heap
table and indexes even in case of abort/crash.
As far as I can see, it has been a big burden for vacuum.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From owner-pgsql-hackers@hub.org Tue Jan 18 20:53:49 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA13285
    for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 21:53:47 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id VAA65183;
    Tue, 18 Jan 2000 21:47:47 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 18 Jan 2000 21:47:33 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id VAA65091
    for pgsql-hackers-outgoing; Tue, 18 Jan 2000 21:46:33 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id VAA65034
    for <pgsql-hackers@postgreSQL.org>; Tue, 18 Jan 2000 21:46:12 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id VAA13040;
    Tue, 18 Jan 2000 21:45:27 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001190245.VAA13040@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <000801bf6224$3bfdd9a0$2801007e@tpf.co.jp> from Hiroshi Inoue at
    "Jan 19, 2000 11:23:55 am"
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Tue, 18 Jan 2000 21:45:27 -0500 (EST)
CC: pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO

> > > In fact, for REINDEX cases where you don't have a lock on the entire
> > > table as you do in vacuum, you could reindex the table with a simple
> > > read-lock on the base table and index, and move the new index into place
> > > with the users seeing no change. Only people traversing the index
> > > during the change would have a problem. You just need exclusive
> > > access on the index for the duration of the rename() so no one is
> > > traversing the index during the rename().
> > >
> > > Destroying the index and recreating it opens a large time span in which
> > > there is no index, and you have to jury-rig something so people don't try
> > > to use the index. With rename() you just put the new index in place with
> > > one operation. Just don't let people traverse the index during the
> > > change. The pointers to the heap tuples are the same in both indexes.
> > >
> > > In fact, with WAL, we will allow multiple physical files for the same
> > > table by appending the table oid to the file name. In this case, the
> > > old index could be deleted by rename, and people would continue to use
> > > the old index until they closed their open file pointers. Not sure how
> > > this works in practice because new tuples would not be inserted into the
> > > old copy of the index.
> >
> > Maybe I am all wrong here. Maybe most of the advantages of rename() are
> > meaningless with reindex used during vacuum, which is the most
> > important use of reindex.
> >
> > Let's look at index use during vacuum. Right now, how does vacuum
> > handle indexes when it moves a tuple? Does it do each index update as
> > it moves a tuple? Is that why it is so slow?
> >
>
> Yes, I believe so. It's necessary to keep consistency between the heap
> table and indexes even in case of abort/crash.
> As far as I can see, it has been a big burden for vacuum.

OK, how about making a copy of the heap table before starting vacuum,
moving all the tuples in that copy, creating the new index, and then moving
the new heap and indexes over the old version. We already have an exclusive
lock on the table. That would be 100% reliable, with the disadvantage
of using 2x the disk space. Seems like a big win.
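
The copy-based approach can be sketched on plain files. This is a rough illustration under simplifying assumptions, not how PostgreSQL stores relations: one text line per tuple, a leading `-` marks an expired tuple, the index is a sorted list of keys with line numbers, and all names (`copy_vacuum`, the `.new` suffix) are invented for the example. The originals are untouched until the final renames, which is also why the scheme needs room for a second copy.

```python
import os


def copy_vacuum(heap_path, index_path):
    """Write a compacted heap and a fresh index to new files, then swap both in."""
    new_heap, new_index = heap_path + ".new", index_path + ".new"

    # Copy only live tuples; the old heap is read but never modified.
    with open(heap_path) as src, open(new_heap, "w") as dst:
        live = [line for line in src if not line.startswith("-")]
        dst.writelines(live)

    # Build the index from the new heap: "key line_number", sorted by key.
    entries = sorted((line.strip(), n) for n, line in enumerate(live))
    with open(new_index, "w") as idx:
        idx.writelines(f"{key} {n}\n" for key, n in entries)

    # Peak disk use is old + new files; only now do the old ones go away.
    os.replace(new_heap, heap_path)
    os.replace(new_index, index_path)
```

Note that in this sketch the two renames are separate operations, so a crash between them would leave a new heap next to an old index and would still need to be detected on restart.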

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************

From owner-pgsql-hackers@hub.org Tue Jan 18 21:15:24 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA14115
    for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 22:15:23 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id WAA72950;
    Tue, 18 Jan 2000 22:10:40 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 18 Jan 2000 22:10:32 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id WAA72644
    for pgsql-hackers-outgoing; Tue, 18 Jan 2000 22:09:36 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id WAA72504
    for <pgsql-hackers@postgreSQL.org>; Tue, 18 Jan 2000 22:08:40 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id WAA13965;
    Tue, 18 Jan 2000 22:08:25 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001190308.WAA13965@candle.pha.pa.us>
Subject: Re: [HACKERS] Index recreation in vacuum
In-Reply-To: <000f01bf622a$bf423940$2801007e@tpf.co.jp> from Hiroshi Inoue at
    "Jan 19, 2000 12:10:32 pm"
To: Hiroshi Inoue <Inoue@tpf.co.jp>
Date: Tue, 18 Jan 2000 22:08:25 -0500 (EST)
CC: pgsql-hackers <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=UNKNOWN-8BIT
Content-Transfer-Encoding: 8bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO

> I heard from someone that old vacuum had been like so.
> Probably 2x disk space for big tables was a big disadvantage.

That's interesting.

>
> In addition, rename(), unlink(), and mv aren't preferable for transaction
> control as far as I can see. We couldn't avoid inconsistency using
> those OS functions.

I disagree. Vacuum can't be rolled back anyway in the sense that you could
bring back expired tuples, though I have no idea why you would want to.

You have an exclusive lock on the table. Putting new heap/indexes in
place that match and have no expired tuples seems like it cannot fail
in any situation.

Of course, the buffers of the old table have to be marked as invalid,
but with an exclusive lock, that is not a problem. I am sure we do that
anyway in vacuum.

> We have to wait for the change of relation file naming if a copying
> vacuum is needed.
> Under the spec we would not need rename(), mv, etc.

Sorry, I don't agree, yet...

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************

From Inoue@tpf.co.jp Tue Jan 18 21:05:23 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA13858
    for <pgman@candle.pha.pa.us>; Tue, 18 Jan 2000 22:05:21 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id MAA02870; Wed, 19 Jan 2000 12:04:55 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "pgsql-hackers" <pgsql-hackers@postgreSQL.org>
Subject: RE: [HACKERS] Index recreation in vacuum
Date: Wed, 19 Jan 2000 12:10:32 +0900
Message-ID: <000f01bf622a$bf423940$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Importance: Normal
In-Reply-To: <200001190245.VAA13040@candle.pha.pa.us>
Status: ROr

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> > >
> > > Maybe I am all wrong here. Maybe most of the advantages of rename() are
> > > meaningless with reindex used during vacuum, which is the most
> > > important use of reindex.
> > >
> > > Let's look at index use during vacuum. Right now, how does vacuum
> > > handle indexes when it moves a tuple? Does it do each index update as
> > > it moves a tuple? Is that why it is so slow?
> > >
> >
> > Yes, I believe so. It's necessary to keep consistency between the heap
> > table and indexes even in case of abort/crash.
> > As far as I can see, it has been a big burden for vacuum.
>
> OK, how about making a copy of the heap table before starting vacuum,
> moving all the tuples in that copy, creating the new index, and then moving
> the new heap and indexes over the old version. We already have an exclusive
> lock on the table. That would be 100% reliable, with the disadvantage
> of using 2x the disk space. Seems like a big win.
>

I heard from someone that old vacuum had been like so.
Probably 2x disk space for big tables was a big disadvantage.

In addition, rename(), unlink(), and mv aren't preferable for transaction
control as far as I can see. We couldn't avoid inconsistency using
those OS functions.
We have to wait for the change of relation file naming if a copying
vacuum is needed.
Under the spec we would not need rename(), mv, etc.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From dms@wplus.net Wed Jan 19 15:30:40 2000
Received: from relay.wplus.net (relay.wplus.net [195.131.52.179])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA25919
    for <pgman@candle.pha.pa.us>; Wed, 19 Jan 2000 16:30:38 -0500 (EST)
X-Real-To: pgman@candle.pha.pa.us
Received: from wplus.net (ppdms.dialup.wplus.net [195.131.52.71])
    by relay.wplus.net (8.9.1/8.9.1/wplus.2) with ESMTP id AAA64218;
    Thu, 20 Jan 2000 00:26:37 +0300 (MSK)
Message-ID: <38862C9D.C2151E4E@wplus.net>
Date: Thu, 20 Jan 2000 00:29:01 +0300
From: Dmitry Samersoff <dms@wplus.net>
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: ru,en
MIME-Version: 1.0
To: Hiroshi Inoue <Inoue@tpf.co.jp>
CC: Bruce Momjian <pgman@candle.pha.pa.us>,
    pgsql-hackers <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Index recreation in vacuum
References: <000f01bf622a$bf423940$2801007e@tpf.co.jp>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Status: ROr

Hiroshi Inoue wrote:
> > > Yes, I believe so. It's necessary to keep consistency between the heap
> > > table and indexes even in case of abort/crash.
> > > As far as I can see, it has been a big burden for vacuum.
> >
> > OK, how about making a copy of the heap table before starting vacuum,
> > moving all the tuples in that copy, creating the new index, and then moving
> > the new heap and indexes over the old version. We already have an exclusive
> > lock on the table. That would be 100% reliable, with the disadvantage
> > of using 2x the disk space. Seems like a big win.
> >
>
> I heard from someone that old vacuum had been like so.
> Probably 2x disk space for big tables was a big disadvantage.

Yes, it is critical.

How about a sequence like this:

* Drop indices (keeping the index descriptions somewhere)
* vacuum the table
* recreate indices

If something crashes, the user would be notified
to re-run vacuum or recreate the indices by hand
when the system restarts.

I use a script like the one described above for vacuuming
- it really increases vacuum performance for large tables.
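
A script of the kind described might look like the following. This is a sketch, not the author's script: it uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere, whereas a PostgreSQL version would read the index definitions from the system catalogs instead of `sqlite_master`. The function name `vacuum_without_indexes`, the table and the index name are made up for the example.

```python
import os
import sqlite3
import tempfile


def vacuum_without_indexes(conn, table):
    """Drop a table's indexes, vacuum, then recreate them from saved definitions."""
    # 1. Keep the index descriptions somewhere before dropping anything.
    saved = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = ? AND sql IS NOT NULL",
        (table,),
    ).fetchall()
    for name, _ in saved:
        conn.execute(f'DROP INDEX "{name}"')

    # 2. Vacuum with no indexes to maintain.
    conn.execute("VACUUM")

    # 3. Recreate the indexes. If the process dies before this point, the
    #    saved definitions are what would have to be replayed by hand.
    for _, create_sql in saved:
        conn.execute(create_sql)
    return [name for name, _ in saved]


# Autocommit mode: VACUUM cannot run inside an open transaction.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path, isolation_level=None)
conn.execute("CREATE TABLE test (x INTEGER)")
conn.execute("CREATE INDEX test_x_idx ON test (x)")
conn.executemany("INSERT INTO test VALUES (?)", [(i,) for i in range(1000)])
conn.execute("DELETE FROM test WHERE x % 2 = 0")
print(vacuum_without_indexes(conn, "test"))
```

In a real script the saved definitions would need to be written to durable storage before step 1, since the in-memory list is lost in exactly the crash case being discussed.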

--
Dmitry Samersoff, DM\S
dms@wplus.net   http://devnull.wplus.net
* there will come soft rains

From dms@wplus.net Wed Jan 19 15:42:49 2000
Received: from relay.wplus.net (relay.wplus.net [195.131.52.179])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA26645
    for <pgman@candle.pha.pa.us>; Wed, 19 Jan 2000 16:42:47 -0500 (EST)
X-Real-To: pgman@candle.pha.pa.us
Received: from wplus.net (ppdms.dialup.wplus.net [195.131.52.71])
    by relay.wplus.net (8.9.1/8.9.1/wplus.2) with ESMTP id AAA65264;
    Thu, 20 Jan 2000 00:39:02 +0300 (MSK)
Message-ID: <38862F86.20328BD3@wplus.net>
Date: Thu, 20 Jan 2000 00:41:26 +0300
From: Dmitry Samersoff <dms@wplus.net>
X-Mailer: Mozilla 4.61 [en] (WinNT; I)
X-Accept-Language: ru,en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: Hiroshi Inoue <Inoue@tpf.co.jp>,
    pgsql-hackers <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Index recreation in vacuum
References: <200001192132.QAA26048@candle.pha.pa.us>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Status: ROr

Bruce Momjian wrote:
>
> We need two things:
>

> auto-create index on startup

IMHO, it has to be controlled by the user, because creating a large index
can take a number of hours. Sometimes it's better to live without
indices at all, and then build them by hand after the workday ends.

--
Dmitry Samersoff, DM\S
dms@wplus.net   http://devnull.wplus.net
* there will come soft rains

From owner-pgsql-hackers@hub.org Thu Jan 20 23:51:34 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA13891
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 00:51:31 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id AAA91784;
    Fri, 21 Jan 2000 00:47:07 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 00:45:38 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id AAA91495
    for pgsql-hackers-outgoing; Fri, 21 Jan 2000 00:44:40 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id AAA91378
    for <pgsql-hackers@postgreSQL.org>; Fri, 21 Jan 2000 00:44:04 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id AAA13592;
    Fri, 21 Jan 2000 00:43:49 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001210543.AAA13592@candle.pha.pa.us>
Subject: [HACKERS] vacuum timings
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 21 Jan 2000 00:43:49 -0500 (EST)
CC: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO

I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
400MB and index is 160MB.

With index on the single int4 column, I got:
    78 seconds for a vacuum
    121 seconds for vacuum after deleting a single row
    662 seconds for vacuum after deleting the entire table

With no index, I got:
    43 seconds for a vacuum
    43 seconds for vacuum after deleting a single row
    43 seconds for vacuum after deleting the entire table

I find this quite interesting.

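To make the gap easier to see, the timings above can be turned into the
extra time attributable to the index in each case. A minimal sketch; the
dictionary layout and the helper name index_overhead are illustrative
choices, and only the six numbers come from the measurements above.

```python
# (seconds with index, seconds without index), taken from the timings above
timings = {
    "no deletes": (78, 43),
    "one row deleted": (121, 43),
    "all rows deleted": (662, 43),
}


def index_overhead(with_index, without_index):
    """Extra vacuum time, in seconds, observed when the index is present."""
    return with_index - without_index


for case, (with_index, without_index) in timings.items():
    extra = index_overhead(with_index, without_index)
    print(f"{case}: +{extra}s ({with_index / without_index:.1f}x)")
```
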
--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************

From owner-pgsql-hackers@hub.org Fri Jan 21 00:34:56 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA15559
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 01:34:55 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id BAA06108;
    Fri, 21 Jan 2000 01:32:23 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 01:30:38 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id BAA03704
    for pgsql-hackers-outgoing; Fri, 21 Jan 2000 01:27:53 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from sunpine.krs.ru (SunPine.krs.ru [195.161.16.37])
    by hub.org (8.9.3/8.9.3) with ESMTP id BAA01710
    for <pgsql-hackers@postgreSQL.org>; Fri, 21 Jan 2000 01:26:44 -0500 (EST)
    (envelope-from vadim@krs.ru)
Received: from krs.ru (dune.krs.ru [195.161.16.38])
    by sunpine.krs.ru (8.8.8/8.8.8) with ESMTP id NAA01685;
    Fri, 21 Jan 2000 13:26:33 +0700 (KRS)
Message-ID: <3887FC19.80305217@krs.ru>
Date: Fri, 21 Jan 2000 13:26:33 +0700
From: Vadim Mikheev <vadim@krs.ru>
Organization: OJSC Rostelecom (Krasnoyarsk)
X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 3.0-RELEASE i386)
X-Accept-Language: ru, en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: Tom Lane <tgl@sss.pgh.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] vacuum timings
References: <200001210543.AAA13592@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO

Bruce Momjian wrote:
>
> I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> 400MB and index is 160MB.
>
> With index on the single int4 column, I got:
>     78 seconds for a vacuum
>     121 seconds for vacuum after deleting a single row
>     662 seconds for vacuum after deleting the entire table
>
> With no index, I got:
>     43 seconds for a vacuum
>     43 seconds for vacuum after deleting a single row
>     43 seconds for vacuum after deleting the entire table

With or without -F?

Vadim

************

From Inoue@tpf.co.jp Fri Jan 21 00:40:35 2000
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA15684
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 01:40:33 -0500 (EST)
Received: from cadzone ([126.0.1.40] (may be forged))
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id PAA04316; Fri, 21 Jan 2000 15:40:35 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgreSQL.org>,
    "Tom Lane" <tgl@sss.pgh.pa.us>
Subject: RE: [HACKERS] vacuum timings
Date: Fri, 21 Jan 2000 15:46:15 +0900
Message-ID: <000201bf63db$36cdae20$2801007e@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <200001210543.AAA13592@candle.pha.pa.us>
Status: RO

> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian
>
> I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> 400MB and index is 160MB.
>
> With index on the single int4 column, I got:
>     78 seconds for a vacuum
vc_vaconeind() is called once

>     121 seconds for vacuum after deleting a single row
vc_vaconeind() is called twice

Hmmm, vc_vaconeind() takes a pretty long time even if it does little.

>     662 seconds for vacuum after deleting the entire table
>

How about the case where half of the rows are deleted?
It would take longer.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

From owner-pgsql-hackers@hub.org Fri Jan 21 12:00:49 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA13329
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 13:00:47 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id MAA96106;
    Fri, 21 Jan 2000 12:55:34 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 12:53:53 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id MAA95775
    for pgsql-hackers-outgoing; Fri, 21 Jan 2000 12:52:54 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (root@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id MAA95720
    for <pgsql-hackers@postgreSQL.org>; Fri, 21 Jan 2000 12:52:39 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id MAA12106;
    Fri, 21 Jan 2000 12:51:53 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001211751.MAA12106@candle.pha.pa.us>
Subject: [HACKERS] Re: vacuum timings
In-Reply-To: <3641.948433911@sss.pgh.pa.us> from Tom Lane at "Jan 21, 2000 00:51:51
    am"
To: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 21 Jan 2000 12:51:53 -0500 (EST)
CC: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
> > 400MB and index is 160MB.
>
> > With index on the single int4 column, I got:
> >     78 seconds for a vacuum
> >     121 seconds for vacuum after deleting a single row
> >     662 seconds for vacuum after deleting the entire table
>
> > With no index, I got:
> >     43 seconds for a vacuum
> >     43 seconds for vacuum after deleting a single row
> >     43 seconds for vacuum after deleting the entire table
>
> > I find this quite interesting.
>
> How long does it take to create the index on your setup --- ie,
> if vacuum did a drop/create index, would it be competitive?

OK, new timings (in seconds) with -F enabled:

    index    no index
    519      same        load
    247      "           first vacuum
    40       "           other vacuums

    1222     X           index creation
    90       X           first vacuum
    80       X           other vacuums

    <1       90          delete one row
    121      38          vacuum after delete 1 row

    346      344         delete all rows
    440      44          first vacuum
    20       <1          other vacuums (index is still same size)

Conclusions:

    o  indexes never get smaller
    o  drop/recreate index is slower than vacuum of indexes

What other conclusions can be made?

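The second conclusion can be checked against the table with a little
arithmetic for the delete-one-row case. This is a rough sketch under two
assumptions that are not in the measurements: the cost of a drop/recreate
cycle is approximated as the no-index vacuum time plus the index creation
time on the full table, and the cost of DROP INDEX itself is ignored. The
delete-all case is left out because index creation on an empty table was
not timed.

```python
# Timings in seconds, copied from the table above (-F enabled).
INDEX_CREATION = 1222      # CREATE INDEX on the full 10,000,000-row table
VACUUM_WITH_INDEX = 121    # vacuum after deleting one row, index in place
VACUUM_NO_INDEX = 38       # vacuum after deleting one row, no index


def drop_recreate_cost(vacuum_no_index, index_creation):
    """Estimated seconds to vacuum without the index, then rebuild it."""
    return vacuum_no_index + index_creation


rebuild = drop_recreate_cost(VACUUM_NO_INDEX, INDEX_CREATION)
ratio = rebuild / VACUUM_WITH_INDEX
print(f"drop/recreate: {rebuild}s, in place: {VACUUM_WITH_INDEX}s, ratio {ratio:.1f}x")
```
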
--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************

From scrappy@hub.org Fri Jan 21 12:45:38 2000
Received: from thelab.hub.org (nat200.60.mpoweredpc.net [142.177.200.60])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA14380
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 13:45:29 -0500 (EST)
Received: from localhost (scrappy@localhost)
    by thelab.hub.org (8.9.3/8.9.1) with ESMTP id OAA68289;
    Fri, 21 Jan 2000 14:45:35 -0400 (AST)
    (envelope-from scrappy@hub.org)
X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
Date: Fri, 21 Jan 2000 14:45:34 -0400 (AST)
From: The Hermit Hacker <scrappy@hub.org>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Tom Lane <tgl@sss.pgh.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-Reply-To: <200001211751.MAA12106@candle.pha.pa.us>
Message-ID: <Pine.BSF.4.21.0001211443480.23487-100000@thelab.hub.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO

On Fri, 21 Jan 2000, Bruce Momjian wrote:

> OK, new timings (in seconds) with -F enabled:
>
>     index    no index
>     519      same        load
>     247      "           first vacuum
>     40       "           other vacuums
>
>     1222     X           index creation
>     90       X           first vacuum
>     80       X           other vacuums
>
>     <1       90          delete one row
>     121      38          vacuum after delete 1 row
>
>     346      344         delete all rows
>     440      44          first vacuum
>     20       <1          other vacuums (index is still same size)
>
> Conclusions:
>
>     o  indexes never get smaller

this one, I thought, was a known? if I remember right, Vadim changed it
so that space was reused, but index never shrunk in size ... no?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


From tgl@sss.pgh.pa.us Fri Jan 21 13:06:35 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA14618
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 14:06:33 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id OAA16501;
    Fri, 21 Jan 2000 14:06:31 -0500 (EST)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: vacuum timings
In-reply-to: <200001211751.MAA12106@candle.pha.pa.us>
References: <200001211751.MAA12106@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Fri, 21 Jan 2000 12:51:53 -0500"
Date: Fri, 21 Jan 2000 14:06:31 -0500
Message-ID: <16498.948481591@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Conclusions:
> o indexes never get smaller

Which we knew...

> o drop/recreate index is slower than vacuum of indexes

Quite a few people have reported finding the opposite in practice.
You should probably try vacuuming after deleting or updating some
fraction of the rows, rather than just the all or none cases.

                        regards, tom lane

From dms@wplus.net Fri Jan 21 13:51:27 2000
Received: from relay.wplus.net (relay.wplus.net [195.131.52.179])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA15623
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 14:51:24 -0500 (EST)
X-Real-To: pgman@candle.pha.pa.us
Received: from wplus.net (ppdms.dialup.wplus.net [195.131.52.71])
    by relay.wplus.net (8.9.1/8.9.1/wplus.2) with ESMTP id WAA89451;
    Fri, 21 Jan 2000 22:46:19 +0300 (MSK)
Message-ID: <3888B822.28F79A1F@wplus.net>
Date: Fri, 21 Jan 2000 22:48:50 +0300
From: Dmitry Samersoff <dms@wplus.net>
X-Mailer: Mozilla 4.7 [en] (WinNT; I)
X-Accept-Language: ru,en
MIME-Version: 1.0
To: Tom Lane <tgl@sss.pgh.pa.us>
CC: Bruce Momjian <pgman@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Re: vacuum timings
References: <200001211751.MAA12106@candle.pha.pa.us> <16498.948481591@sss.pgh.pa.us>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Status: ROr

Tom Lane wrote:
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Conclusions:
> > o indexes never get smaller
>
> Which we knew...
>
> > o drop/recreate index is slower than vacuum of indexes
>
> Quite a few people have reported finding the opposite in practice.

I'm one of them. On a 1.5 GB table with three indices, vacuuming the
indices is about twice as slow.
Probably because vacuuming indices breaks the system cache policy.
(FreeBSD 3.3)


--
Dmitry Samersoff, DM\S
dms@wplus.net    http://devnull.wplus.net
* there will come soft rains

From owner-pgsql-hackers@hub.org Fri Jan 21 14:04:08 2000
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA16140
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 15:04:06 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id OAA34808;
    Fri, 21 Jan 2000 14:59:30 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 21 Jan 2000 14:57:48 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id OAA34320
    for pgsql-hackers-outgoing; Fri, 21 Jan 2000 14:56:50 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from candle.pha.pa.us (pgman@s5-03.ppp.op.net [209.152.195.67])
    by hub.org (8.9.3/8.9.3) with ESMTP id OAA34255
    for <pgsql-hackers@postgresql.org>; Fri, 21 Jan 2000 14:56:18 -0500 (EST)
    (envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
    by candle.pha.pa.us (8.9.0/8.9.0) id OAA15772;
    Fri, 21 Jan 2000 14:54:22 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200001211954.OAA15772@candle.pha.pa.us>
Subject: Re: [HACKERS] Re: vacuum timings
In-Reply-To: <3888B822.28F79A1F@wplus.net> from Dmitry Samersoff at "Jan 21,
    2000 10:48:50 pm"
To: Dmitry Samersoff <dms@wplus.net>
Date: Fri, 21 Jan 2000 14:54:21 -0500 (EST)
CC: Tom Lane <tgl@sss.pgh.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>
X-Mailer: ELM [version 2.4ME+ PL66 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: RO

[Charset koi8-r unsupported, filtering to ASCII...]
> Tom Lane wrote:
> >
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Conclusions:
> > > o indexes never get smaller
> >
> > Which we knew...
> >
> > > o drop/recreate index is slower than vacuum of indexes
> >
> > Quite a few people have reported finding the opposite in practice.
>
> I'm one of them. On a 1.5 GB table with three indices, vacuuming the
> indices is about twice as slow.
> Probably because vacuuming indices breaks the system cache policy.
> (FreeBSD 3.3)

OK, we are researching what things can be done to improve this. We are
toying with:

    lock table for less duration, or read lock
    creating another copy of heap/indexes, and rename() over old files
    improving heap vacuum speed
    improving index vacuum speed
    moving analyze out of vacuum


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************

From scrappy@hub.org Fri Jan 21 14:12:16 2000
Received: from thelab.hub.org (nat200.60.mpoweredpc.net [142.177.200.60])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA16521
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 15:12:13 -0500 (EST)
Received: from localhost (scrappy@localhost)
    by thelab.hub.org (8.9.3/8.9.1) with ESMTP id QAA69039;
    Fri, 21 Jan 2000 16:12:25 -0400 (AST)
    (envelope-from scrappy@hub.org)
X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
Date: Fri, 21 Jan 2000 16:12:25 -0400 (AST)
From: The Hermit Hacker <scrappy@hub.org>
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: Dmitry Samersoff <dms@wplus.net>, Tom Lane <tgl@sss.pgh.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-Reply-To: <200001211954.OAA15772@candle.pha.pa.us>
Message-ID: <Pine.BSF.4.21.0001211607080.23487-100000@thelab.hub.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Status: RO

On Fri, 21 Jan 2000, Bruce Momjian wrote:

> [Charset koi8-r unsupported, filtering to ASCII...]
> > Tom Lane wrote:
> > >
> > > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > > Conclusions:
> > > > o indexes never get smaller
> > >
> > > Which we knew...
> > >
> > > > o drop/recreate index is slower than vacuum of indexes
> > >
> > > Quite a few people have reported finding the opposite in practice.
> >
> > I'm one of them. On a 1.5 GB table with three indices, vacuuming the
> > indices is about twice as slow.
> > Probably because vacuuming indices breaks the system cache policy.
> > (FreeBSD 3.3)
>
> OK, we are researching what things can be done to improve this. We are
> toying with:
>
>     lock table for less duration, or read lock

if there is some way that we can work around the bug that I believe Tom
found with removing the lock altogether (ie. making use of MVCC), I think
that would be the best option ... if not possible, at least get things
down to a table lock vs the whole database?

a good example is the udmsearch that we are using on the site ... it uses
multiple tables to store the dictionary, each representing words of X size
... if I'm searching on a 4 letter word, and the whole database is locked
while it is working on the dictionary with 8 letter words, I'm sitting
there idle ... at least if we only locked the 8 letter table, everyone not
doing 8 letter searches can go on their merry way ...

Slightly longer vacuums, IMHO, are acceptable if, to the end users, it's
as transparent as possible ... locking per table would be slightly slower,
I think, because once a table is finished, the next table would need to
have an exclusive lock put on it before starting, so you'd have to
possibly wait for that...?

>     creating another copy of heap/indexes, and rename() over old files

sounds to me like introducing a large potential for error here ...

>     moving analyze out of vacuum

I think that should be done anyway ... if we ever get to the point that
we're able to re-use rows in tables, then that would eliminate the
immediate requirement for vacuum, but still retain a requirement for a
periodic analyze ... no?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


From tgl@sss.pgh.pa.us Fri Jan 21 16:02:07 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA20290
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 17:02:06 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA09697;
    Fri, 21 Jan 2000 17:02:06 -0500 (EST)
To: The Hermit Hacker <scrappy@hub.org>
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Re: vacuum timings
In-reply-to: <Pine.BSF.4.21.0001211607080.23487-100000@thelab.hub.org>
References: <Pine.BSF.4.21.0001211607080.23487-100000@thelab.hub.org>
Comments: In-reply-to The Hermit Hacker <scrappy@hub.org>
    message dated "Fri, 21 Jan 2000 16:12:25 -0400"
Date: Fri, 21 Jan 2000 17:02:06 -0500
Message-ID: <9694.948492126@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: RO

The Hermit Hacker <scrappy@hub.org> writes:
>> lock table for less duration, or read lock

> if there is some way that we can work around the bug that I believe Tom
> found with removing the lock altogether (ie. making use of MVCC), I think
> that would be the best option ... if not possible, at least get things
> down to a table lock vs the whole database?

Huh? VACUUM only requires an exclusive lock on the table it is
currently vacuuming; there's no database-wide lock.

Even a single-table exclusive lock is bad, of course, if it's a large
table that's critical to a 24x7 application. Bruce was talking about
the possibility of having VACUUM get just a write lock on the table;
other backends could still read it, but not write it, during the vacuum
process. That'd be a considerable step forward for 24x7 applications,
I think.

It looks like that could be done if we rewrote the table as a new file
(instead of compacting-in-place), but there's a problem when it comes
time to rename the new files into place. At that point you'd need to
get an exclusive lock to ensure all the readers are out of the table too
--- and upgrading from a plain lock to an exclusive lock is a well-known
recipe for deadlocks. Not sure if this can be solved.

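The lock-upgrade hazard can be illustrated with a toy wait-for graph. This
is a deliberately simplified model, not PostgreSQL's lock manager: the
function names and the session names "vacuum" and "backend" are made up
for the illustration, and the only rule modelled is that a session asking
for a stronger lock must wait for every other session that still holds
the table lock.

```python
def waits_for(holders, upgraders):
    """Map each upgrading session to the sessions it has to wait for.

    Simplified rule: an upgrade can proceed only once no *other* session
    still holds a lock on the table.
    """
    return {session: holders - {session} for session in upgraders}


def has_deadlock(graph):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True          # reached a node already on the current path
        visiting.add(node)
        cyclic = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return cyclic

    return any(visit(node) for node in graph)


holders = {"vacuum", "backend"}
# Both sessions hold the lock and both ask for a stronger one: each waits
# for the other, so neither can ever proceed.
print(has_deadlock(waits_for(holders, {"vacuum", "backend"})))   # True
# Only vacuum upgrades; the other session just finishes and releases.
print(has_deadlock(waits_for(holders, {"vacuum"})))              # False
```
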
                        regards, tom lane

From tgl@sss.pgh.pa.us Fri Jan 21 22:50:34 2000
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA01657
    for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 23:50:28 -0500 (EST)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA19681;
    Fri, 21 Jan 2000 23:50:13 -0500 (EST)
To: Bruce Momjian <pgman@candle.pha.pa.us>
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: vacuum timings
In-reply-to: <200001211751.MAA12106@candle.pha.pa.us>
References: <200001211751.MAA12106@candle.pha.pa.us>
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
    message dated "Fri, 21 Jan 2000 12:51:53 -0500"
Date: Fri, 21 Jan 2000 23:50:13 -0500
Message-ID: <19678.948516613@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Status: ROr

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Conclusions:
> o drop/recreate index is slower than vacuum of indexes

BTW, I did some profiling of CREATE INDEX this evening (quite
unintentionally actually; I was interested in COPY IN, but the pg_dump
script I used as driver happened to create some indexes too). I was
startled to discover that 60% of the runtime of CREATE INDEX is spent in
_bt_invokestrat (which is called from tuplesort.c's comparetup_index,
and exists only to figure out which specific comparison routine to call).
Of this, a whopping 4% was spent in the useful subroutine, int4gt. All
the rest went into lookup and validation checks that by rights should be
done once per index creation, not once per comparison.

In short: a fairly straightforward bit of optimization will eliminate
circa 50% of the CPU time consumed by CREATE INDEX. All we need is to
figure out where to cache the lookup results. The optimization would
improve insertions and lookups in indexes, as well, if we can cache
the lookup results in those scenarios.

This was for a table small enough that tuplesort.c could do the sort
entirely in memory, so I'm sure the gains would be smaller for a large
table that requires a disk-based sort. Still, it seems worth looking
into...

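The idea of doing the lookup once per index build rather than once per
comparison can be shown with a toy sort. Everything here is hypothetical
scaffolding for the illustration: ComparatorCatalog and the two sort
helpers are made-up names and do not correspond to the backend's
_bt_invokestrat or comparetup_index code; the catalog merely counts how
often the lookup work runs.

```python
from functools import cmp_to_key


class ComparatorCatalog:
    """Stand-in for the lookup and validation work; counts how often it runs."""

    def __init__(self):
        self.lookups = 0

    def lookup(self, type_name):
        self.lookups += 1
        if type_name != "int4":
            raise KeyError(type_name)
        # Three-way comparison for integers: negative, zero or positive.
        return lambda a, b: (a > b) - (a < b)


def sort_with_lookup_per_comparison(values, catalog, type_name):
    """Resolve the comparison routine again on every single comparison."""
    def compare(a, b):
        return catalog.lookup(type_name)(a, b)
    return sorted(values, key=cmp_to_key(compare))


def sort_with_cached_lookup(values, catalog, type_name):
    """Resolve the comparison routine once, then reuse it for the whole sort."""
    compare = catalog.lookup(type_name)
    return sorted(values, key=cmp_to_key(compare))


slow, fast = ComparatorCatalog(), ComparatorCatalog()
sort_with_lookup_per_comparison([5, 3, 8, 1], slow, "int4")
sort_with_cached_lookup([5, 3, 8, 1], fast, "int4")
print(f"lookups per comparison: {slow.lookups}, cached: {fast.lookups}")
```

The cached variant performs exactly one lookup no matter how many rows are
sorted, while the per-comparison variant grows with the number of
comparisons the sort makes.
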
                        regards, tom lane

From owner-pgsql-hackers@hub.org Sat Jan 22 02:31:03 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA06743
    for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 03:31:02 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.7 $) with ESMTP id DAA07529 for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 03:25:13 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id DAA31900;
    Sat, 22 Jan 2000 03:19:53 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sat, 22 Jan 2000 03:17:56 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id DAA31715
    for pgsql-hackers-outgoing; Sat, 22 Jan 2000 03:16:58 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
    by hub.org (8.9.3/8.9.3) with ESMTP id DAA31647
    for <pgsql-hackers@postgresql.org>; Sat, 22 Jan 2000 03:16:26 -0500 (EST)
    (envelope-from Inoue@tpf.co.jp)
Received: from mcadnote1 (ppm114.noc.fukui.nsk.ne.jp [210.161.188.33])
    by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
    id RAA04754; Sat, 22 Jan 2000 17:14:43 +0900
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Subject: RE: [HACKERS] Re: vacuum timings
Date: Sat, 22 Jan 2000 17:15:37 +0900
Message-ID: <NDBBIJLOILGIKBGDINDFIEEACCAA.Inoue@tpf.co.jp>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="iso-2022-jp"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
In-Reply-To: <16498.948481591@sss.pgh.pa.us>
Importance: Normal
Sender: owner-pgsql-hackers@postgresql.org
Status: RO

> -----Original Message-----
> From: owner-pgsql-hackers@postgresql.org
> [mailto:owner-pgsql-hackers@postgresql.org]On Behalf Of Tom Lane
>
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Conclusions:
> > o indexes never get smaller
>
> Which we knew...
>
> > o drop/recreate index is slower than vacuum of indexes
>
> Quite a few people have reported finding the opposite in practice.
> You should probably try vacuuming after deleting or updating some
> fraction of the rows, rather than just the all or none cases.
>

Vacuum after deleting all rows isn't the worst case.
There's no moving in that case, and vacuum doesn't need to call
index_insert() for moved heap tuples.

Vacuum after deleting half of the rows may be one of the worst cases.
In this case, index_delete() is called as many times as in the 'delete all'
case, and the expensive index_insert() is called for moved_in tuples.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

************

From tgl@sss.pgh.pa.us Sat Jan 22 10:31:02 2000
|
||
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA20882
|
||
for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 11:31:00 -0500 (EST)
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.7 $) with ESMTP id LAA26612 for <pgman@candle.pha.pa.us>; Sat, 22 Jan 2000 11:12:44 -0500 (EST)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA20569;
|
||
Sat, 22 Jan 2000 11:11:26 -0500 (EST)
|
||
To: "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
cc: "Bruce Momjian" <pgman@candle.pha.pa.us>,
|
||
"PostgreSQL-development" <pgsql-hackers@postgreSQL.org>
|
||
Subject: Re: [HACKERS] Re: vacuum timings
|
||
In-reply-to: <NDBBIJLOILGIKBGDINDFIEEACCAA.Inoue@tpf.co.jp>
|
||
References: <NDBBIJLOILGIKBGDINDFIEEACCAA.Inoue@tpf.co.jp>
|
||
Comments: In-reply-to "Hiroshi Inoue" <Inoue@tpf.co.jp>
|
||
message dated "Sat, 22 Jan 2000 17:15:37 +0900"
|
||
Date: Sat, 22 Jan 2000 11:11:25 -0500
|
||
Message-ID: <20566.948557485@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: RO
|
||
|
||
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
|
||
> Vacuum after deleting half of the rows may be one of the worst cases.
|
||
|
||
Or equivalently, vacuum after updating all the rows.
|
||
|
||
regards, tom lane
|
||
|
||
From tgl@sss.pgh.pa.us Thu Jan 20 23:51:49 2000
|
||
Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA13919
|
||
for <pgman@candle.pha.pa.us>; Fri, 21 Jan 2000 00:51:47 -0500 (EST)
|
||
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
||
by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA03644;
|
||
Fri, 21 Jan 2000 00:51:51 -0500 (EST)
|
||
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
||
cc: PostgreSQL-development <pgsql-hackers@postgreSQL.org>
|
||
Subject: Re: vacuum timings
|
||
In-reply-to: <200001210543.AAA13592@candle.pha.pa.us>
|
||
References: <200001210543.AAA13592@candle.pha.pa.us>
|
||
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
||
message dated "Fri, 21 Jan 2000 00:43:49 -0500"
|
||
Date: Fri, 21 Jan 2000 00:51:51 -0500
|
||
Message-ID: <3641.948433911@sss.pgh.pa.us>
|
||
From: Tom Lane <tgl@sss.pgh.pa.us>
|
||
Status: ROr
|
||
|
||
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
||
> I loaded 10,000,000 rows into CREATE TABLE test (x INTEGER); Table is
|
||
> 400MB and index is 160MB.
|
||
|
||
> With index on the single int4 column, I got:
|
||
> 78 seconds for a vacuum
|
||
> 121 seconds for vacuum after deleting a single row
|
||
> 662 seconds for vacuum after deleting the entire table
|
||
|
||
> With no index, I got:
|
||
> 43 seconds for a vacuum
|
||
> 43 seconds for vacuum after deleting a single row
|
||
> 43 seconds for vacuum after deleting the entire table
|
||
|
||
> I find this quite interesting.
|
||
|
||
How long does it take to create the index on your setup --- ie,
|
||
if vacuum did a drop/create index, would it be competitive?
|
||
|
||
regards, tom lane
|
||
|
||
From pgsql-hackers-owner+M5909@hub.org Thu Aug 17 20:15:33 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00644
|
||
for <pgman@candle.pha.pa.us>; Thu, 17 Aug 2000 20:15:32 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e7I0APm69660;
|
||
Thu, 17 Aug 2000 20:10:25 -0400 (EDT)
|
||
Received: from fw.wintelcom.net (bright@ns1.wintelcom.net [209.1.153.20])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e7I01Jm68072
|
||
for <pgsql-hackers@postgresql.org>; Thu, 17 Aug 2000 20:01:19 -0400 (EDT)
|
||
Received: (from bright@localhost)
|
||
by fw.wintelcom.net (8.10.0/8.10.0) id e7I01IA20820
|
||
for pgsql-hackers@postgresql.org; Thu, 17 Aug 2000 17:01:18 -0700 (PDT)
|
||
Date: Thu, 17 Aug 2000 17:01:18 -0700
|
||
From: Alfred Perlstein <bright@wintelcom.net>
|
||
To: pgsql-hackers@postgresql.org
|
||
Subject: [HACKERS] VACUUM optimization ideas.
|
||
Message-ID: <20000817170118.K4854@fw.wintelcom.net>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Disposition: inline
|
||
User-Agent: Mutt/1.2.4i
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
Here are two ideas I had for optimizing vacuum. I apologize in advance
|
||
if the ideas presented here are naive and don't take into account
|
||
the actual code that makes up postgresql.
|
||
|
||
================
|
||
|
||
#1
|
||
|
||
Reducing the time vacuum must hold an exclusive lock on a table:
|
||
|
||
The idea is that since rows are marked deleted it's ok for the
|
||
vacuum to fill them with data from the tail of the table as
|
||
long as no transaction is in progress that has started before
|
||
the row was deleted.
|
||
|
||
This may allow the vacuum process to copyback all the data without
|
||
a lock; when all the copying is done it then acquires an exclusive lock
|
||
and does this:
|
||
|
||
Acquire an exclusive lock.
|
||
Walk all the deleted data marking it as current.
|
||
Truncate the table.
|
||
Release the lock.
|
||
|
||
Since the data is still marked invalid (right?) even if valid data
|
||
is copied into the space it should be ignored as long as there's no
|
||
transaction occurring that started before the data was invalidated.
|
||
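The copy-back step of idea #1 can be sketched for a single process. This is a minimal illustration under stated assumptions: the heap is modelled as a Python list, `None` stands for a slot whose row is marked deleted, and the function name `compact` is hypothetical. It deliberately ignores transaction visibility and concurrent writers, so it says nothing about whether the copy is safe without a lock.

```python
def compact(rows):
    """Fill deleted slots (None) with live rows taken from the tail.

    Returns a new list with the trailing free space truncated. Row order
    is not preserved: tail rows are moved into the earliest free slots.
    """
    rows = list(rows)
    head, tail = 0, len(rows) - 1
    while True:
        # Advance head to the first free slot.
        while head < tail and rows[head] is not None:
            head += 1
        # Retreat tail to the last live row.
        while tail > head and rows[tail] is None:
            tail -= 1
        if head >= tail:
            break
        # Copy the tail row into the free slot and free its old location.
        rows[head] = rows[tail]
        rows[tail] = None
    # "Truncate the table": drop the free slots left at the end.
    while rows and rows[-1] is None:
        rows.pop()
    return rows


print(compact([1, None, 3, None, 5]))
```

Everything after the copy loop corresponds to the short locked phase in the proposal; the copy loop itself is the part proposed to run without a lock.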
|
||
================
|
||
|
||
#2
|
||
|
||
Reducing the amount of scanning a vacuum must do:
|
||
|
||
It would make sense that if a value of the earliest deleted chunk
|
||
was kept in a table then vacuum would not have to scan the entire
|
||
table in order to work, it would only need to start at the 'earliest'
|
||
invalidated row.
|
||
|
||
The utility of this (at least for us) is that we have several tables
|
||
that will grow to hundreds of megabytes, however changes will only
|
||
happen at the tail end (recently added rows). If we could reduce the
|
||
amount of time spent in a vacuum state it would help us a lot.
|
||
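Idea #2 amounts to keeping a low-water mark per table. A minimal sketch, assuming a list-backed toy table; the class `ToyTable`, its `earliest_deleted` field, and `vacuum_scan_range` are hypothetical names used only for illustration and do not correspond to anything in the PostgreSQL source.

```python
class ToyTable:
    """List-backed table that remembers the earliest deleted slot."""

    def __init__(self):
        self.rows = []
        self.earliest_deleted = None  # position of the first dead slot

    def insert(self, value):
        self.rows.append(value)
        return len(self.rows) - 1

    def delete(self, pos):
        self.rows[pos] = None  # mark the slot as dead
        if self.earliest_deleted is None or pos < self.earliest_deleted:
            self.earliest_deleted = pos

    def vacuum_scan_range(self):
        """Slots a vacuum would need to visit under this scheme."""
        if self.earliest_deleted is None:
            return range(0)
        return range(self.earliest_deleted, len(self.rows))


table = ToyTable()
for i in range(1000):
    table.insert(i)
table.delete(995)
table.delete(990)
scan = table.vacuum_scan_range()
print(len(scan), "of", len(table.rows), "slots scanned")
```

The saving only holds while deletions stay near the tail; a single delete near the head moves the mark back and the scan covers almost the whole table again.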
|
||
================
|
||
|
||
I'm wondering if these ideas make sense and may help at all.
|
||
|
||
thanks,
|
||
--
|
||
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
|
||
|
||
From pgsql-hackers-owner+M5912@hub.org Fri Aug 18 01:36:14 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA07787
|
||
for <pgman@candle.pha.pa.us>; Fri, 18 Aug 2000 01:36:12 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e7I5Q2m38759;
|
||
Fri, 18 Aug 2000 01:26:04 -0400 (EDT)
|
||
Received: from courier02.adinet.com.uy (courier02.adinet.com.uy [206.99.44.245])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e7I5Bam35785
|
||
for <pgsql-hackers@postgresql.org>; Fri, 18 Aug 2000 01:11:37 -0400 (EDT)
|
||
Received: from adinet.com.uy (haroldo@r207-50-240-116.adinet.com.uy [207.50.240.116])
|
||
by courier02.adinet.com.uy (8.9.3/8.9.3) with ESMTP id CAA17259;
|
||
Fri, 18 Aug 2000 02:10:49 -0300 (GMT)
|
||
Message-ID: <399CC739.B9B13D18@adinet.com.uy>
|
||
Date: Fri, 18 Aug 2000 02:18:49 -0300
|
||
From: hstenger@adinet.com.uy
|
||
Reply-To: hstenger@ieee.org
|
||
Organization: PRISMA, Servicio y Desarrollo
|
||
X-Mailer: Mozilla 4.72 [en] (X11; I; Linux 2.2.14 i586)
|
||
X-Accept-Language: en
|
||
MIME-Version: 1.0
|
||
To: Alfred Perlstein <bright@wintelcom.net>, pgsql-hackers@postgresql.org
|
||
Subject: Re: [HACKERS] VACUUM optimization ideas.
|
||
References: <20000817170118.K4854@fw.wintelcom.net>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: ROr
|
||
|
||
Alfred Perlstein wrote:
|
||
> #1
|
||
>
|
||
> Reducing the time vacuum must hold an exclusive lock on a table:
|
||
>
|
||
> The idea is that since rows are marked deleted it's ok for the
|
||
> vacuum to fill them with data from the tail of the table as
|
||
> long as no transaction is in progress that has started before
|
||
> the row was deleted.
|
||
>
|
||
> This may allow the vacuum process to copyback all the data without
|
||
> a lock; when all the copying is done it then acquires an exclusive lock
|
||
> and does this:
|
||
>
|
||
> Acquire an exclusive lock.
|
||
> Walk all the deleted data marking it as current.
|
||
> Truncate the table.
|
||
> Release the lock.
|
||
>
|
||
> Since the data is still marked invalid (right?) even if valid data
|
||
> is copied into the space it should be ignored as long as there's no
|
||
> transaction occurring that started before the data was invalidated.
|
||
|
||
Yes, but nothing prevents newer transactions from modifying the _origin_ side of
|
||
the copied data _after_ it was copied, but before the Lock-Walk-Truncate-Unlock
|
||
cycle takes place, and so it seems unsafe. Maybe locking each record before
|
||
copying it up ...
|
||
|
||
Regards,
|
||
Haroldo.
|
||
|
||
--
|
||
----------------------+------------------------
|
||
Haroldo Stenger | hstenger@ieee.org
|
||
Montevideo, Uruguay. | hstenger@adinet.com.uy
|
||
----------------------+------------------------
|
||
Visit UYLUG Web Site: http://www.linux.org.uy
|
||
-----------------------------------------------
|
||
|
||
From pgsql-hackers-owner+M5917@hub.org Fri Aug 18 09:41:33 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA05170
|
||
for <pgman@candle.pha.pa.us>; Fri, 18 Aug 2000 09:41:33 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e7IDVjm75143;
|
||
Fri, 18 Aug 2000 09:31:46 -0400 (EDT)
|
||
Received: from andie.ip23.net (andie.ip23.net [212.83.32.23])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e7IDPIm73296
|
||
for <pgsql-hackers@postgresql.org>; Fri, 18 Aug 2000 09:25:18 -0400 (EDT)
|
||
Received: from imap1.ip23.net (imap1.ip23.net [212.83.32.35])
|
||
by andie.ip23.net (8.9.3/8.9.3) with ESMTP id PAA58387;
|
||
Fri, 18 Aug 2000 15:25:12 +0200 (CEST)
|
||
Received: from ip23.net (spc.ip23.net [212.83.32.122])
|
||
by imap1.ip23.net (8.9.3/8.9.3) with ESMTP id PAA59177;
|
||
Fri, 18 Aug 2000 15:41:28 +0200 (CEST)
|
||
Message-ID: <399D3938.582FDB49@ip23.net>
|
||
Date: Fri, 18 Aug 2000 15:25:12 +0200
|
||
From: Sevo Stille <sevo@ip23.net>
|
||
Organization: IP23
|
||
X-Mailer: Mozilla 4.61 [en] (X11; I; Linux 2.2.10 i686)
|
||
X-Accept-Language: en, de
|
||
MIME-Version: 1.0
|
||
To: Alfred Perlstein <bright@wintelcom.net>
|
||
CC: pgsql-hackers@postgresql.org
|
||
Subject: Re: [HACKERS] VACUUM optimization ideas.
|
||
References: <20000817170118.K4854@fw.wintelcom.net>
|
||
Content-Type: text/plain; charset=us-ascii
|
||
Content-Transfer-Encoding: 7bit
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: RO
|
||
|
||
Alfred Perlstein wrote:
|
||
|
||
> The idea is that since rows are marked deleted it's ok for the
|
||
> vacuum to fill them with data from the tail of the table as
|
||
> long as no transaction is in progress that has started before
|
||
> the row was deleted.
|
||
|
||
Well, isn't one of the advantages of vacuuming in the reordering it
|
||
does? With a "fill deleted chunks" logic, we'd have far less order in
|
||
the databases.
|
||
|
||
> This may allow the vacuum process to copyback all the data without
|
||
> a lock,
|
||
|
||
Nope. Another process might update the values in between move and mark,
|
||
if the record is not locked. We'd either have to write-lock the entire
|
||
table for that period, write lock every item as it is moved, or lock,
|
||
move and mark on a per-record basis. The latter would be slow, but it
|
||
could be done in a permanent low priority background process, utilizing
|
||
empty CPU cycles. Besides, it could probably be done not only by simply
|
||
filling from the tail, but also moving up the records in a sorted
|
||
fashion.
|
||
|
||
> #2
|
||
>
|
||
> Reducing the amount of scanning a vacuum must do:
|
||
>
|
||
> It would make sense that if a value of the earliest deleted chunk
|
||
> was kept in a table then vacuum would not have to scan the entire
|
||
> table in order to work, it would only need to start at the 'earliest'
|
||
> invalidated row.
|
||
|
||
Trivial to do. But of course #1 may imply that the physical ordering is
|
||
even less likely to be related to the logical ordering in a way where
|
||
this helps.
|
||
|
||
> The utility of this (at least for us) is that we have several tables
|
||
> that will grow to hundreds of megabytes, however changes will only
|
||
> happen at the tail end (recently added rows).
|
||
|
||
The tail is a relative position - except for the case where you add
|
||
temporary records to a constant default set, everything in the tail will
|
||
move, at least relatively, to the head after some time.
|
||
|
||
> If we could reduce the
|
||
> amount of time spent in a vacuum state it would help us a lot.
|
||
|
||
Rather: If we can reduce the time spent in a locked state while
|
||
vacuuming, it would help a lot. Being in a vacuum is not the issue -
|
||
even permanent vacuuming need not be an issue, if the locks it uses are
|
||
suitably short-lived.
|
||
|
||
Sevo
|
||
|
||
--
|
||
sevo@ip23.net
|
||
|
||
From pgsql-hackers-owner+M5911@hub.org Thu Aug 17 21:11:20 2000
|
||
Received: from hub.org (root@hub.org [216.126.84.1])
|
||
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA01882
|
||
for <pgman@candle.pha.pa.us>; Thu, 17 Aug 2000 21:11:20 -0400 (EDT)
|
||
Received: from hub.org (majordom@localhost [127.0.0.1])
|
||
by hub.org (8.10.1/8.10.1) with SMTP id e7I119m80626;
|
||
Thu, 17 Aug 2000 21:01:09 -0400 (EDT)
|
||
Received: from acheron.rime.com.au (root@albatr.lnk.telstra.net [139.130.54.222])
|
||
by hub.org (8.10.1/8.10.1) with ESMTP id e7I0wMm79870
|
||
for <pgsql-hackers@postgresql.org>; Thu, 17 Aug 2000 20:58:22 -0400 (EDT)
|
||
Received: from oberon (Oberon.rime.com.au [203.8.195.100])
|
||
by acheron.rime.com.au (8.9.3/8.9.3) with SMTP id KAA03215;
|
||
Fri, 18 Aug 2000 10:58:25 +1000
|
||
Message-Id: <3.0.5.32.20000818105835.0280ade0@mail.rhyme.com.au>
|
||
X-Sender: pjw@mail.rhyme.com.au
|
||
X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)
|
||
Date: Fri, 18 Aug 2000 10:58:35 +1000
|
||
To: Chris Bitmead <chrisb@nimrod.itg.telstra.com.au>,
|
||
Ben Adida <ben@openforce.net>
|
||
From: Philip Warner <pjw@rhyme.com.au>
|
||
Subject: Re: [HACKERS] Inserting a select statement result into another
|
||
table
|
||
Cc: Andrew Selle <aselle@upl.cs.wisc.edu>, pgsql-hackers@postgresql.org
|
||
In-Reply-To: <399C7689.2DDDAD1D@nimrod.itg.telecom.com.au>
|
||
References: <20000817130517.A10909@upl.cs.wisc.edu>
|
||
<399BF555.43FB70C8@openforce.net>
|
||
Mime-Version: 1.0
|
||
Content-Type: text/plain; charset="us-ascii"
|
||
X-Mailing-List: pgsql-hackers@postgresql.org
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@hub.org
|
||
Status: O
|
||
|
||
At 09:34 18/08/00 +1000, Chris Bitmead wrote:
|
||
>
|
||
>He does ask a legitimate question though. If you are going to have a
|
||
>LIMIT feature (which of course is not pure SQL), there seems no reason
|
||
>you shouldn't be able to insert the result into a table.
|
||
|
||
This feature is supported by two commercial DBs: Dec/RDB and SQL/Server. I
|
||
have no idea if Oracle supports it, but it is such a *useful* feature that
|
||
I would be very surprised if it didn't.
|
||
|
||
|
||
>Ben Adida wrote:
|
||
>>
|
||
>> What is the purpose you're trying to accomplish with this order by? No
|
||
matter what, all the
|
||
>> rows where done='f' will be inserted, and you will not be left with any
|
||
indication of that
|
||
>> order once the rows are in the todolist table.
|
||
|
||
I don't know what his *purpose* was, but the query should only insert the
|
||
first two rows from the select (because of the limit).
|
||
|
||
>> Andrew Selle wrote:
|
||
>>
|
||
>> > Alright. My situation is this. I have a list of things that need to
|
||
be done
|
||
>> > in a table called tasks. I have a list of users who will complete
|
||
these tasks.
|
||
>> > I want these users to be able to come in and "claim" the top 2 most
|
||
recent tasks
|
||
>> > that have been added. These tasks then get stored in a table called
|
||
todolist
|
||
>> > which stores who claimed the task, the taskid, and when the task was
|
||
claimed.
|
||
>> > For each time someone wants to claim some number of tasks, I want to
|
||
do something
|
||
>> > like
|
||
>> >
|
||
>> > INSERT INTO todolist
|
||
>> > SELECT taskid,'1',now()
|
||
>> > FROM tasks
|
||
>> > WHERE done='f'
|
||
>> > ORDER BY submit DESC
|
||
>> > LIMIT 2;
|
||
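The intended behaviour of the quoted statement (only the two most recent open tasks are claimed) can be checked in a small stand-alone script. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in rather than PostgreSQL, the column layout of `tasks` and `todolist` is guessed from the message, and `datetime('now')` replaces `now()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, submit TEXT, done TEXT);
CREATE TABLE todolist (taskid INTEGER, userid TEXT, claimed TEXT);
INSERT INTO tasks VALUES (1, '2000-08-01', 'f');
INSERT INTO tasks VALUES (2, '2000-08-02', 'f');
INSERT INTO tasks VALUES (3, '2000-08-03', 't');
INSERT INTO tasks VALUES (4, '2000-08-04', 'f');
""")

# Same shape as the quoted query: newest open tasks first, keep two.
conn.execute("""
INSERT INTO todolist
SELECT taskid, '1', datetime('now')
FROM tasks
WHERE done = 'f'
ORDER BY submit DESC
LIMIT 2
""")

claimed = [row[0] for row in
           conn.execute("SELECT taskid FROM todolist ORDER BY taskid")]
print(claimed)
```

Task 3 is skipped because it is already done, and task 1 is left unclaimed because the limit cuts the list after the two newest open tasks.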
|
||
----------------------------------------------------------------
|
||
Philip Warner | __---_____
|
||
Albatross Consulting Pty. Ltd. |----/ - \
|
||
(A.B.N. 75 008 659 498) | /(@) ______---_
|
||
Tel: (+61) 0500 83 82 81 | _________ \
|
||
Fax: (+61) 0500 83 82 82 | ___________ |
|
||
Http://www.rhyme.com.au | / \|
|
||
| --________--
|
||
PGP key available upon request, | /
|
||
and from pgp5.ai.mit.edu:11371 |/
|
||
|
||
From pgsql-hackers-owner+M29308@postgresql.org Mon Sep 23 09:47:54 2002
|
||
Return-path: <pgsql-hackers-owner+M29308@postgresql.org>
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g8NDlqd00289
|
||
for <pgman@candle.pha.pa.us>; Mon, 23 Sep 2002 09:47:53 -0400 (EDT)
|
||
Received: from localhost (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with ESMTP
|
||
id 7CA64476497; Mon, 23 Sep 2002 09:43:28 -0400 (EDT)
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with SMTP
|
||
id EDA70475BC3; Mon, 23 Sep 2002 09:43:20 -0400 (EDT)
|
||
Received: from localhost (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with ESMTP id 85264476479
|
||
for <pgsql-hackers@postgresql.org>; Mon, 23 Sep 2002 09:43:15 -0400 (EDT)
|
||
Received: from www.pspl.co.in (www.pspl.co.in [202.54.11.65])
|
||
by postgresql.org (Postfix) with ESMTP id C7899476477
|
||
for <pgsql-hackers@postgresql.org>; Mon, 23 Sep 2002 09:43:12 -0400 (EDT)
|
||
Received: (from root@localhost)
|
||
by www.pspl.co.in (8.11.6/8.11.6) id g8NDiQ030526
|
||
for <pgsql-hackers@postgresql.org>; Mon, 23 Sep 2002 19:14:26 +0530
|
||
Received: from daithan (daithan.intranet.pspl.co.in [192.168.7.161])
|
||
by www.pspl.co.in (8.11.6/8.11.0) with ESMTP id g8NDiQ330521;
|
||
Mon, 23 Sep 2002 19:14:26 +0530
|
||
From: "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>
|
||
To: pgsql-hackers@postgresql.org, pgsql-general@postgresql.org
|
||
Date: Mon, 23 Sep 2002 19:13:44 +0530
|
||
MIME-Version: 1.0
|
||
Subject: [HACKERS] Postgresql Automatic vacuum
|
||
Reply-To: shridhar_daithankar@persistent.co.in
|
||
Message-ID: <3D8F67E8.7500.4E0E180@localhost>
|
||
X-Mailer: Pegasus Mail for Windows (v4.02)
|
||
Content-Type: text/plain; charset=US-ASCII
|
||
Content-Transfer-Encoding: 7BIT
|
||
Content-Description: Mail message body
|
||
X-Virus-Scanned: by AMaViS new-20020517
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@postgresql.org
|
||
X-Virus-Scanned: by AMaViS new-20020517
|
||
Status: OR
|
||
|
||
Hello All,
|
||
|
||
I have written a small daemon that can automatically vacuum a PostgreSQL
|
||
database, depending upon activity per table.
|
||
|
||
It sits on top of the postgres statistics collector. The postgres installation
|
||
should have per row statistics collection enabled.
|
||
|
||
Features are,
|
||
|
||
* Vacuuming based on activity on the table
|
||
* Per table vacuum. So only heavily updated tables are vacuumed.
|
||
* multiple databases supported
|
||
* Performs 'vacuum analyze' only, so it will not block the database
|
||
|
||
|
||
The project location is
|
||
http://gborg.postgresql.org/project/pgavd/projdisplay.php
|
||
|
||
Let me know of any bugs, improvements, or comments.
|
||
|
||
I am sure real world postgres installations have some sort of scripts doing
|
||
a similar thing. This is an attempt to provide a generic interface to periodic
|
||
vacuum.
|
||
|
||
|
||
Bye
|
||
Shridhar
|
||
|
||
--
|
||
The Abrams' Principle: The shortest distance between two points is off the
|
||
wall.
|
||
|
||
|
||
---------------------------(end of broadcast)---------------------------
|
||
TIP 3: if posting/reading through Usenet, please send an appropriate
|
||
subscribe-nomail command to majordomo@postgresql.org so that your
|
||
message can get through to the mailing list cleanly
|
||
|
||
From pgsql-hackers-owner+M29344@postgresql.org Tue Sep 24 02:42:36 2002
|
||
Return-path: <pgsql-hackers-owner+M29344@postgresql.org>
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g8O6gYg19416
|
||
for <pgman@candle.pha.pa.us>; Tue, 24 Sep 2002 02:42:35 -0400 (EDT)
|
||
Received: from localhost (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with ESMTP
|
||
id 128704762AF; Tue, 24 Sep 2002 02:42:36 -0400 (EDT)
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with SMTP
|
||
id DE80C4760F5; Tue, 24 Sep 2002 02:42:32 -0400 (EDT)
|
||
Received: from localhost (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with ESMTP id 40A8A475DBC
|
||
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 02:42:25 -0400 (EDT)
|
||
Received: from relay.icomedias.com (relay.icomedias.com [62.99.232.66])
|
||
by postgresql.org (Postfix) with ESMTP id 7ECC8475DAD
|
||
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 02:42:23 -0400 (EDT)
|
||
Received: from loki ([10.192.17.128])
|
||
by relay.icomedias.com (8.12.5/8.12.5) with ESMTP id g8O6g8BX014226;
|
||
Tue, 24 Sep 2002 08:42:09 +0200
|
||
Content-Type: text/plain;
|
||
charset="iso-8859-1"
|
||
From: Mario Weilguni <mweilguni@sime.com>
|
||
To: shridhar_daithankar@persistent.co.in, matthew@zeut.net
|
||
Subject: Re: [HACKERS] Postgresql Automatic vacuum
|
||
Date: Tue, 24 Sep 2002 08:42:06 +0200
|
||
User-Agent: KMail/1.4.3
|
||
cc: pgsql-hackers@postgresql.org
|
||
References: <3D8F67E8.7500.4E0E180@localhost> <3D9050B2.9782.86E55C0@localhost>
|
||
In-Reply-To: <3D9050B2.9782.86E55C0@localhost>
|
||
MIME-Version: 1.0
|
||
Message-ID: <200209240842.06459.mweilguni@sime.com>
|
||
avpresult: 0, ok, ok
|
||
X-Scanned-By: MIMEDefang 2.16 (www . roaringpenguin . com / mimedefang)
|
||
X-Virus-Scanned: by AMaViS new-20020517
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@postgresql.org
|
||
X-Virus-Scanned: by AMaViS new-20020517
|
||
Content-Transfer-Encoding: 8bit
|
||
X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id g8O6gYg19416
|
||
Status: OR
|
||
|
||
Am Dienstag, 24. September 2002 08:16 schrieb Shridhar Daithankar:
|
||
>
|
||
> > I will play with it more and give you some more feedback.
|
||
>
|
||
> Awaiting that.
|
||
>
|
||
|
||
IMO there are still several problems with that approach, namely:
|
||
* every database will get "polluted" with the autovacuum table, which is undesired
|
||
* the biggest problem is the ~/.pgavrc file. I think it should work like other postgres utils do, e.g. supporting -U, -d, ....
|
||
* it's not possible to use without actively administering the config file. It should be able to work without
|
||
administrator assistance.
|
||
|
||
Since this is a daemon, why not store the data in memory? Even with several thousands of tables the memory footprint would
|
||
still be small. And it should be possible to use for all databases without modifying a config file.
|
||
|
||
Two weeks ago I began writing a similar daemon, but had no time yet to finish it. I've tried to avoid using fixed numbers (namely "vacuum table
|
||
after 1000 updates") and tried to make my own heuristic based on the statistics data and the size of the table. The reason is, for a large table 1000 entries might be
|
||
a small percentage and vacuum is not necessary, while for small tables 10 updates might be sufficient.
|
||
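One way to express a size-relative trigger like the one described above is a fixed floor plus a fraction of the table size. This is a minimal sketch and an assumption on my part: the message does not give the actual heuristic, and the function name `needs_vacuum` and the default values for `base_threshold` and `scale_factor` are hypothetical.

```python
def needs_vacuum(changed_rows, table_rows, base_threshold=5, scale_factor=0.1):
    """Decide whether a table has changed enough to be worth vacuuming.

    The trigger point grows with the table: a fixed floor
    (base_threshold) plus a fraction (scale_factor) of the row count.
    """
    trigger = base_threshold + scale_factor * table_rows
    return changed_rows > trigger


# 1000 changed rows are a small fraction of a million-row table.
print(needs_vacuum(1000, 1_000_000))
# 10 changed rows are half of a 20-row table.
print(needs_vacuum(10, 20))
```

With these illustrative defaults the large table is left alone after 1000 changes while the small one is vacuumed after 10, which is the contrast the fixed "1000 updates" rule cannot express.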
|
||
Best regards,
|
||
Mario Weilguni
|
||
|
||
|
||
---------------------------(end of broadcast)---------------------------
|
||
TIP 2: you can get off all lists at once with the unregister command
|
||
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
||
|
||
From pgsql-hackers-owner+M29345@postgresql.org Tue Sep 24 03:02:50 2002
|
||
Return-path: <pgsql-hackers-owner+M29345@postgresql.org>
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g8O72lg21051
|
||
for <pgman@candle.pha.pa.us>; Tue, 24 Sep 2002 03:02:48 -0400 (EDT)
|
||
Received: from localhost (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with ESMTP
|
||
id 9B3EA4762F6; Tue, 24 Sep 2002 03:02:48 -0400 (EDT)
|
||
Received: from postgresql.org (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with SMTP
|
||
id 902EA476020; Tue, 24 Sep 2002 03:02:45 -0400 (EDT)
|
||
Received: from localhost (postgresql.org [64.49.215.8])
|
||
by postgresql.org (Postfix) with ESMTP id 98689475DAD
|
||
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 03:02:18 -0400 (EDT)
|
||
Received: from www.pspl.co.in (www.pspl.co.in [202.54.11.65])
|
||
by postgresql.org (Postfix) with ESMTP id 47B8647592C
|
||
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 03:02:16 -0400 (EDT)
|
||
Received: (from root@localhost)
|
||
by www.pspl.co.in (8.11.6/8.11.6) id g8O73QQ16318
|
||
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 12:33:26 +0530
|
||
Received: from daithan (daithan.intranet.pspl.co.in [192.168.7.161])
|
||
by www.pspl.co.in (8.11.6/8.11.0) with ESMTP id g8O73Q316313
|
||
for <pgsql-hackers@postgresql.org>; Tue, 24 Sep 2002 12:33:26 +0530
|
||
From: "Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>
|
||
To: pgsql-hackers@postgresql.org
|
||
Date: Tue, 24 Sep 2002 12:32:43 +0530
|
||
MIME-Version: 1.0
|
||
Subject: Re: [HACKERS] Postgresql Automatic vacuum
|
||
Reply-To: shridhar_daithankar@persistent.co.in
|
||
Message-ID: <3D905B6B.1635.898382A@localhost>
|
||
References: <3D9050B2.9782.86E55C0@localhost>
|
||
In-Reply-To: <200209240842.06459.mweilguni@sime.com>
|
||
X-Mailer: Pegasus Mail for Windows (v4.02)
|
||
Content-Type: text/plain; charset=US-ASCII
|
||
Content-Transfer-Encoding: 7BIT
|
||
Content-Description: Mail message body
|
||
X-Virus-Scanned: by AMaViS new-20020517
|
||
Precedence: bulk
|
||
Sender: pgsql-hackers-owner@postgresql.org
|
||
X-Virus-Scanned: by AMaViS new-20020517
|
||
Status: OR
|
||
|
||
On 24 Sep 2002 at 8:42, Mario Weilguni wrote:
|
||
|
||
> Am Dienstag, 24. September 2002 08:16 schrieb Shridhar Daithankar:
|
||
> IMO there are still several problems with that approach, namely:
|
||
> * every database will get "polluted" with the autovacuum table, which is undesired
|
||
|
||
I agree. But that was the best alternative I could see. An explanation
|
||
follows. Besides, I didn't want to touch PG metadata.
|
||
|
||
> * the biggest problem is the ~/.pgavrc file. I think it should work like other postgres utils do, e.g. supporting -U, -d, ....
|
||
|
||
Shouldn't be a problem. The config stuff is working and I can add that. I would
|
||
rather term it a minor issue. On personal preference, I would just fire it
|
||
without any arguments. It's not a thing that you change daily. Configure it in
|
||
the config file and be done.
|
||
|
||
> * it's not possible to use without actively administering the config file. It should be able to work without
|
||
> administrator assistance.
|
||
|
||
Well. I would call that tuning. Each admin can tune it. Yes it's an effort but
|
||
certainly not an active administration.
|
||
|
||
> Since this is a daemon, why not store the data in memory? Even with several thousands of tables the memory footprint would
|
||
> still be small. And it should be possible to use for all databases without modifying a config file.
|
||
|
||
Well, when PostgreSQL has the ability to deal with an arbitrary number of rows, it
|
||
seemed redundant to me to duplicate all that functionality. Why write lists
|
||
and arrays again and again? Let postgresql do it.
|
||
|
||
|
||
> Two weeks ago I began writing a similar daemon, but had no time yet to finish it. I've tried to avoid using fixed numbers (namely "vacuum table
|
||
> after 1000 updates") and tried to make my own heuristic based on the statistics data and the size of the table. The reason is, for a large table 1000 entries might be
|
||
> a small percentage and vacuum is not necessary, while for small tables 10 updates might be sufficient.
|
||
|
||
Well, that fixed number is not really fixed but admin tunable, that too per
|
||
database. These are just defaults. Tune it to suit your needs.
|
||
|
||
The objective of the whole exercise is to get rid of periodic vacuum, as this app
|
||
shifts the threshold to activity rather than time.
|
||
|
||
Besides, a table should be vacuumed when it starts affecting performance. On an
|
||
installation, if changing 1K rows in a 1M-row table affects performance, there
|
||
will be a similar performance hit for a 1K-row update on a 100K-row table.
|
||
Because the overhead involved would be almost the same. (Not disk space. pgavd does not
|
||
target vacuum full but tuple size should matter).
|
||
|
||
At least me thinks so..
|
||
|
||
I plan to implement per-table thresholds in addition to per-database thresholds.
|
||
But right now, it seems like overhead to me. Besides there is an item in TODO,
|
||
to shift unit of work from rows to blocks affected. I guess that takes care of
|
||
some of your points..
|
||
Bye
|
||
Shridhar
|
||
|
||
--
|
||
Jones' Second Law: The man who smiles when things go wrong has thought of
|
||
someone to blame it on.
|
||
|
||
|
||
---------------------------(end of broadcast)---------------------------
|
||
TIP 5: Have you checked our extensive FAQ?
|
||
|
||
http://www.postgresql.org/users-lounge/docs/faq.html
|
||
|