2015-05-16 03:31:50 +08:00
CREATE EXTENSION tsm_system_time;
Redesign tablesample method API, and do extensive code review.

The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM. (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.) Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type. This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.

Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.

Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.

Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).

Improve EXPLAIN's output of tablesample plans, too.

Back-patch to 9.5 so that we don't have to support the original API
in production.
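
As a rough illustration of the handler-function design described in the commit message, an extension's install script might declare its sampling method along the following lines. The signature system_time(internal) matches the DETAIL message that appears later in this file, and tsm_handler is the pseudo-type PostgreSQL uses to mark tablesample handler functions. The C symbol name tsm_system_time_handler is an assumption made for this sketch and is not taken from this file.

```sql
-- Sketch of an extension script registering a tablesample method.
-- Only the SQL-level handler function is needed; no catalog inserts.
-- The C symbol name 'tsm_system_time_handler' is assumed for illustration.
CREATE FUNCTION system_time(internal)
RETURNS tsm_handler
AS 'MODULE_PATHNAME', 'tsm_system_time_handler'
LANGUAGE C STRICT;
```

Because the method is represented by an ordinary function with a special return type, DROP EXTENSION and pg_upgrade can handle it like any other extension-owned function, which is the point the commit message makes about avoiding raw catalog inserts.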
2015-07-26 02:39:00 +08:00
CREATE TABLE test_tablesample (id int, name text);
INSERT INTO test_tablesample SELECT i, repeat(i::text, 1000)
  FROM generate_series(0, 30) s(i);
ANALYZE test_tablesample;
-- It's a bit tricky to test SYSTEM_TIME in a platform-independent way.
-- We can test the zero-time corner case ...
SELECT count(*) FROM test_tablesample TABLESAMPLE system_time (0);
 count
-------
     0
(1 row)

-- ... and we assume that this will finish before running out of time:
SELECT count(*) FROM test_tablesample TABLESAMPLE system_time (100000);
 count
-------
    31
(1 row)

-- bad parameters should get through planning, but not execution:
EXPLAIN (COSTS OFF)
  SELECT id FROM test_tablesample TABLESAMPLE system_time (-1);
                    QUERY PLAN
--------------------------------------------------
 Sample Scan on test_tablesample
   Sampling: system_time ('-1'::double precision)
(2 rows)

SELECT id FROM test_tablesample TABLESAMPLE system_time (-1);
ERROR: sample collection time must not be negative
-- fail, this method is not repeatable:
SELECT * FROM test_tablesample TABLESAMPLE system_time (10) REPEATABLE (0);
ERROR: tablesample method system_time does not support REPEATABLE
LINE 1: SELECT * FROM test_tablesample TABLESAMPLE system_time (10) ...
                                                   ^
-- since it's not repeatable, we expect a Materialize node in these plans:
EXPLAIN (COSTS OFF)
  SELECT * FROM
  (VALUES (0),(100000)) v(time),
  LATERAL (SELECT COUNT(*) FROM test_tablesample
           TABLESAMPLE system_time (100000)) ss;
                               QUERY PLAN
------------------------------------------------------------------------
 Nested Loop
   ->  Aggregate
         ->  Materialize
               ->  Sample Scan on test_tablesample
                     Sampling: system_time ('100000'::double precision)
   ->  Values Scan on "*VALUES*"
(6 rows)

SELECT * FROM
  (VALUES (0),(100000)) v(time),
  LATERAL (SELECT COUNT(*) FROM test_tablesample
           TABLESAMPLE system_time (100000)) ss;
  time  | count
--------+-------
      0 |    31
 100000 |    31
(2 rows)

EXPLAIN (COSTS OFF)
  SELECT * FROM
  (VALUES (0),(100000)) v(time),
  LATERAL (SELECT COUNT(*) FROM test_tablesample
           TABLESAMPLE system_time (time)) ss;
                           QUERY PLAN
----------------------------------------------------------------
 Nested Loop
   ->  Values Scan on "*VALUES*"
   ->  Aggregate
         ->  Materialize
               ->  Sample Scan on test_tablesample
                     Sampling: system_time ("*VALUES*".column1)
(6 rows)

SELECT * FROM
  (VALUES (0),(100000)) v(time),
  LATERAL (SELECT COUNT(*) FROM test_tablesample
           TABLESAMPLE system_time (time)) ss;
  time  | count
--------+-------
      0 |     0
 100000 |    31
(2 rows)

CREATE VIEW vv AS
  SELECT * FROM test_tablesample TABLESAMPLE system_time (20);
EXPLAIN (COSTS OFF) SELECT * FROM vv;
                    QUERY PLAN
--------------------------------------------------
 Sample Scan on test_tablesample
   Sampling: system_time ('20'::double precision)
(2 rows)

DROP EXTENSION tsm_system_time; -- fail, view depends on extension
ERROR: cannot drop extension tsm_system_time because other objects depend on it
DETAIL: view vv depends on function system_time(internal)
HINT: Use DROP ... CASCADE to drop the dependent objects too.