/* contrib/tablefunc/tablefunc--1.0.sql */

-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION tablefunc" to load this file. \quit

CREATE FUNCTION normal_rand(int4, float8, float8)
RETURNS setof float8
AS 'MODULE_PATHNAME','normal_rand'
LANGUAGE C VOLATILE STRICT;

As mentioned above, here is my contrib/tablefunc patch. It includes
three functions which exercise the tablefunc API.

show_all_settings()
  - returns the same information as SHOW ALL, but as a query result

normal_rand(int numvals, float8 mean, float8 stddev, int seed)
  - returns a set of normally distributed float8 values
  - This routine implements Algorithm P (Polar method for normal
    deviates) from Knuth's _The_Art_of_Computer_Programming_, Volume 2,
    3rd ed., pages 122-126. Knuth cites his source as "The polar
    method", G. E. P. Box, M. E. Muller, and G. Marsaglia,
    _Annals_Math._Stat._ 29 (1958), 610-611.

crosstabN(text sql)
  - returns a set of row_name plus N category value columns
  - crosstab2(), crosstab3(), and crosstab4() are defined for you,
    but you can create additional crosstab functions per directions
    in the README.

Joe Conway
2002-07-31 00:31:11 +08:00
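
The polar method named in the commit message can be sketched in a few lines. What follows is a minimal Python illustration, not the C code behind normal_rand(): the function name `polar_normal`, its `seed` parameter, and the use of Python's `random` module are assumptions made for the example.

```python
import math
import random


def polar_normal(numvals, mean, stddev, seed=None):
    """Return `numvals` normally distributed floats using the polar method.

    Hypothetical helper for illustration only; it is not part of tablefunc.
    """
    rng = random.Random(seed)
    out = []
    while len(out) < numvals:
        # Draw a point uniformly from the square (-1, 1) x (-1, 1).
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        s = v1 * v1 + v2 * v2
        # Reject points outside the unit circle (and the origin itself).
        if s >= 1.0 or s == 0.0:
            continue
        factor = math.sqrt(-2.0 * math.log(s) / s)
        # Each accepted point yields two independent standard normal deviates.
        for z in (v1 * factor, v2 * factor):
            if len(out) < numvals:
                out.append(mean + stddev * z)
    return out


if __name__ == "__main__":
    sample = polar_normal(5, mean=5.0, stddev=3.0, seed=1)
    print(sample)
```

Scaling each standard deviate by `stddev` and shifting by `mean` happens last, so the rejection loop itself depends on neither parameter.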

-- the generic crosstab function:
CREATE FUNCTION crosstab(text)
RETURNS setof record
AS 'MODULE_PATHNAME','crosstab'
LANGUAGE C STABLE STRICT;

-- examples of building custom type-specific crosstab functions:
CREATE TYPE tablefunc_crosstab_2 AS
(
	row_name TEXT,
	category_1 TEXT,
	category_2 TEXT
);

CREATE TYPE tablefunc_crosstab_3 AS
(
	row_name TEXT,
	category_1 TEXT,
	category_2 TEXT,
	category_3 TEXT
);

CREATE TYPE tablefunc_crosstab_4 AS
(
	row_name TEXT,
	category_1 TEXT,
	category_2 TEXT,
	category_3 TEXT,
	category_4 TEXT
);

CREATE FUNCTION crosstab2(text)
RETURNS setof tablefunc_crosstab_2
AS 'MODULE_PATHNAME','crosstab'
LANGUAGE C STABLE STRICT;

CREATE FUNCTION crosstab3(text)
RETURNS setof tablefunc_crosstab_3
AS 'MODULE_PATHNAME','crosstab'
LANGUAGE C STABLE STRICT;

CREATE FUNCTION crosstab4(text)
RETURNS setof tablefunc_crosstab_4
AS 'MODULE_PATHNAME','crosstab'
LANGUAGE C STABLE STRICT;

-- obsolete:
CREATE FUNCTION crosstab(text,int)
RETURNS setof record
AS 'MODULE_PATHNAME','crosstab'
LANGUAGE C STABLE STRICT;

Attached is an update to contrib/tablefunc. It introduces a new
function, connectby(), which can serve as a reference implementation for
the changes made in the last few days -- namely the ability of a
function to return an entire tuplestore, and the ability of a function
to make use of the query-provided "expected" tuple description.

Description:

connectby(text relname, text keyid_fld, text parent_keyid_fld,
          text start_with, int max_depth [, text branch_delim])
  - returns keyid, parent_keyid, level, and an optional branch string
  - requires anonymous composite type syntax in the FROM clause. See
    the instructions in the documentation below.

Joe Conway
2002-09-02 13:44:05 +08:00
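
The traversal that connectby() is described as performing (start at one key, descend through parent links, report level and a branch string) can be illustrated outside the database. The Python sketch below is an assumption-laden model, not the C implementation: the function name, the in-memory `(keyid, parent_keyid)` pairs, and the sample tree are hypothetical, a `max_depth` of 0 is treated as unlimited, and the data is assumed to contain no cycles.

```python
def connectby_sketch(rows, start_with, max_depth=0, branch_delim="~"):
    """Walk a parent/child table depth-first from `start_with`.

    rows: iterable of (keyid, parent_keyid) pairs (hypothetical input shape).
    Returns a list of (keyid, parent_keyid, level, branch) tuples.
    Assumes the data contains no cycles.
    """
    # Index children by parent so each lookup is a dictionary access.
    children = {}
    for keyid, parent in rows:
        children.setdefault(parent, []).append(keyid)

    out = []

    def walk(keyid, parent, level, branch):
        out.append((keyid, parent, level, branch))
        # max_depth == 0 means "no limit" in this sketch.
        if max_depth and level >= max_depth:
            return
        for child in children.get(keyid, []):
            walk(child, keyid, level + 1, branch + branch_delim + child)

    # The starting row is reported at level 0 with no parent.
    walk(start_with, None, 0, start_with)
    return out


# Hypothetical sample data: (keyid, parent_keyid)
TREE = [
    ("row1", None),
    ("row2", "row1"),
    ("row3", "row1"),
    ("row4", "row2"),
    ("row5", "row2"),
    ("row6", "row4"),
]

if __name__ == "__main__":
    for row in connectby_sketch(TREE, "row2"):
        print(row)
```

Siblings come back in input order here; the variants that take an ORDER BY field would correspond to sorting each `children` list before descending.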

CREATE FUNCTION crosstab(text,text)
RETURNS setof record
AS 'MODULE_PATHNAME','crosstab_hash'
LANGUAGE C STABLE STRICT;

Attached is an update to contrib/tablefunc. It implements a new hashed
version of crosstab. This fixes a major deficiency in real-world use of
the original version. It is easiest to understand with an illustration:

Data:
-------------------------------------------------------------------
select * from cth;
 id | rowid |        rowdt        |   attribute    |      val
----+-------+---------------------+----------------+---------------
  1 | test1 | 2003-03-01 00:00:00 | temperature    | 42
  2 | test1 | 2003-03-01 00:00:00 | test_result    | PASS
  3 | test1 | 2003-03-01 00:00:00 | volts          | 2.6987
  4 | test2 | 2003-03-02 00:00:00 | temperature    | 53
  5 | test2 | 2003-03-02 00:00:00 | test_result    | FAIL
  6 | test2 | 2003-03-02 00:00:00 | test_startdate | 01 March 2003
  7 | test2 | 2003-03-02 00:00:00 | volts          | 3.1234
(7 rows)

Original crosstab:
-------------------------------------------------------------------
SELECT * FROM crosstab(
  'SELECT rowid, attribute, val FROM cth ORDER BY 1,2',4)
AS c(rowid text, temperature text, test_result text, test_startdate
text, volts text);
 rowid | temperature | test_result | test_startdate | volts
-------+-------------+-------------+----------------+--------
 test1 | 42          | PASS        | 2.6987         |
 test2 | 53          | FAIL        | 01 March 2003  | 3.1234
(2 rows)

Hashed crosstab:
-------------------------------------------------------------------
SELECT * FROM crosstab(
  'SELECT rowid, attribute, val FROM cth ORDER BY 1',
  'SELECT DISTINCT attribute FROM cth ORDER BY 1')
AS c(rowid text, temperature int4, test_result text, test_startdate
timestamp, volts float8);
 rowid | temperature | test_result |   test_startdate    | volts
-------+-------------+-------------+---------------------+--------
 test1 |          42 | PASS        |                     | 2.6987
 test2 |          53 | FAIL        | 2003-03-01 00:00:00 | 3.1234
(2 rows)

Notice that the original crosstab slides data over to the left in the
result tuple when it encounters missing data. To work around this you
have to make your source SQL do all sorts of contortions (cartesian
join of distinct rowid with distinct attribute; left join that back to
the real source data). The new version avoids this by building a hash
table using a second distinct attribute query.

The new version also allows for "extra" columns (see the README) and
allows the result columns to be coerced into differing datatypes if they
are suitable (as shown above).

In testing a "real-world" data set (69 distinct rowids, 27 distinct
categories/attributes, multiple missing data points) I saw about a
5-fold improvement in execution time (from about 2200 ms old, to 440 ms
new).

I left the original version intact because: 1) backward compatibility,
2) it is probably slightly faster if you know that you have no missing
attributes.

README and regression test adjustments included. If there are no
objections, please apply.

Joe Conway
2003-03-20 14:46:30 +08:00

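
The difference between the two crosstab variants described in the hashed-crosstab commit message comes down to how a value finds its output column. The Python sketch below models both behaviours on the `cth` sample data from that message. It is an illustration under assumptions, not the C implementation: the function names are hypothetical, rows are plain tuples already sorted by rowid, and values whose category is not in the category list are simply skipped.

```python
from itertools import groupby

# (rowid, attribute, val) rows from the cth example, ordered by rowid, attribute
ROWS = [
    ("test1", "temperature", "42"),
    ("test1", "test_result", "PASS"),
    ("test1", "volts", "2.6987"),
    ("test2", "temperature", "53"),
    ("test2", "test_result", "FAIL"),
    ("test2", "test_startdate", "01 March 2003"),
    ("test2", "volts", "3.1234"),
]


def positional_crosstab(rows, ncols):
    """Fill value columns left to right, ignoring the category name."""
    result = []
    for rowid, group in groupby(rows, key=lambda r: r[0]):
        vals = [val for _, _, val in group][:ncols]
        vals += [None] * (ncols - len(vals))  # pad short rows on the right
        result.append((rowid, *vals))
    return result


def hashed_crosstab(rows, categories):
    """Place each value in the column its category maps to."""
    slot = {cat: i for i, cat in enumerate(categories)}
    result = []
    for rowid, group in groupby(rows, key=lambda r: r[0]):
        vals = [None] * len(categories)
        for _, cat, val in group:
            if cat in slot:  # unknown categories are skipped in this sketch
                vals[slot[cat]] = val
        result.append((rowid, *vals))
    return result


# Stand-in for 'SELECT DISTINCT attribute FROM cth ORDER BY 1'
CATEGORIES = sorted({cat for _, cat, _ in ROWS})

if __name__ == "__main__":
    print(positional_crosstab(ROWS, 4))
    print(hashed_crosstab(ROWS, CATEGORIES))
```

For `test1`, which has no `test_startdate`, the positional version puts `2.6987` in the third value column, while the hashed version leaves that column empty and keeps `2.6987` under `volts`.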
CREATE FUNCTION connectby(text,text,text,text,int,text)
RETURNS setof record
AS 'MODULE_PATHNAME','connectby_text'
LANGUAGE C STABLE STRICT;

CREATE FUNCTION connectby(text,text,text,text,int)
RETURNS setof record
AS 'MODULE_PATHNAME','connectby_text'
LANGUAGE C STABLE STRICT;

-- These 2 take the name of a field to ORDER BY as 4th arg (for sorting siblings)

CREATE FUNCTION connectby(text,text,text,text,text,int,text)
RETURNS setof record
AS 'MODULE_PATHNAME','connectby_text_serial'
LANGUAGE C STABLE STRICT;

CREATE FUNCTION connectby(text,text,text,text,text,int)
RETURNS setof record
AS 'MODULE_PATHNAME','connectby_text_serial'
LANGUAGE C STABLE STRICT;