TODO list for PostgreSQL ======================== Last updated: Mon Sep 2 12:20:47 EDT 2002 Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us) The most recent version of this document can be viewed at the PostgreSQL web site, http://www.PostgreSQL.org. A dash (-) marks changes that will appear in the upcoming 7.3 release. Bracketed items "[]" have more detailed. Urgent ====== * Add replication of distributed databases [replication] o automatic failover o load balancing o master/slave replication o multi-master replication o partition data across servers o sample implementation in contrib/rserv o queries across databases or servers (two-phase commit) o allow replication over unreliable or non-persistent links o http://gborg.postgresql.org/project/pgreplication/projdisplay.php * Point-in-time data recovery using backup and write-ahead log Reporting ========= * Allow elog() to return error codes, module name, file name, line number, not just messages (Peter E) * Add error codes (Peter E) * Make error messages more consistent [error] * -Change DEBUG startup tag to LOG (Bruce) * Show location of syntax error in query [yacc] * -Add getpid() function to backend * -Allow logging of query durations Permissions =========== * -Improve control over user privileges, including table creation * -Allow user/group names to be specified directly in pg_hba.conf (Bruce) * -Add PGPASSWORDFILE environment variable or ~/.pgpass to store user/host/password combinations * -Allow permissions for functions (Peter E) * -Allow object creation to be disabled for specific users Administration ============== * Incremental backups * -Make it easier to create a database owned by someone who can't createdb, perhaps CREATE DATABASE dbname WITH OWNER = "user" (Gavin) * -Make equals sign optional in CREATE DATABASE WITH param = 'val' * Remove unreferenced table files and temp tables during database vacuum or postmaster startup (Bruce) * Remove behavior of postmaster -o after making postmaster/postgres flags unique * -Prevent SIGHUP and 'pg_ctl reload' from changing command line specified parameters to postgresql.conf defaults (Peter E) * Allow easy display of usernames in a group * Allow configuration files to be specified in a different directory * -Reserve last few process slots for super-user if max_connections reached * -Add GUC parameter to print queries that generate errors Data Types ========== * -Add domain capability (Rod Taylor) * Add IPv6 capability to INET/CIDR types * Remove Money type, add money formatting for decimal type * -SELECT cash_out(2) crashes because of opaque * -Declare typein/out functions in pg_proc with a special "C string" data type * -Functions returning sets do not totally work * Change factorial to return a numeric * Change NUMERIC data type to use base 10,000 internally * Change NUMERIC to enforce the maximum precision, and increase it * Add function to return compressed length of TOAST data values (Tom) * -Add GUC parameter for DATESTYLE * Allow INET subnet tests using non-constants * -Allow bytea to handle LIKE with non-TEXT patterns * CONVERSION o -Store binary-compatible type information in the system o Allow better handling of numeric constants, type conversion [typeconv] * ARRAYS o Allow nulls in arrays o Allow arrays to be ORDER'ed o -Ensure we have array-eq operators for every built-in array type o Support construction of array result values in expressions * BINARY DATA o Improve vacuum of large objects, like /contrib/vacuumlo o Add security checking for large objects o Make file in/out interface for TOAST columns, similar to large object interface (force out-of-line storage and no compression) o Auto-delete large objects when referencing row is deleted Multi-Language Support ====================== * Add NCHAR (as distinguished from ordinary varchar), * Allow LOCALE on a per-column basis, default to ASCII * Support multiple simultaneous character sets, per SQL92 * Improve Unicode combined character handling * Optimize locale to have minimal performance impact when not used (Peter E) * Add octet_length_server() and octet_length_client() (Thomas, Tatsuo) * Make octet_length_client the same as octet_length() (?) * Prevent mismatch of frontend/backend encodings from converting bytea data from being interpreted as encoded strings Views / Rules ============= * Automatically create rules on views so they are updateable, per SQL92 [view] * Add the functionality for WITH CHECK OPTION clause of CREATE VIEW * Allow NOTIFY in rules involving conditionals * Have views on temporary tables exist in the temporary namespace * Move psql backslash information into views * Allow RULE recompilation * -Remove brackets as multi-statement rule grouping, must use parens (Bruce) * Prevent aggregates from being used in rule WHERE clauses Indexes ======= * Allow CREATE INDEX zman_index ON test (date_trunc( 'day', zman ) datetime_ops) fails index can't store constant parameters * Order duplicate index entries by tid for faster heap lookups * Allow inherited tables to inherit index, UNIQUE constraint, and primary key, foreign key [inheritance] * UNIQUE INDEX on base column not honored on inserts from inherited table INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail [inheritance] * -Allow UPDATE/DELETE on inherited table * Add UNIQUE capability to non-btree indexes * Add btree index support for reltime, tinterval, regproc * Add rtree index support for line, lseg, path, point * Certain indexes will not shrink, e.g. indexes on ever-increasing columns and indexes with many duplicate keys * Use indexes for min() and max() or convert to SELECT col FROM tab ORDER BY col DESC LIMIT 1 if appropriate index exists and WHERE clause acceptible * Allow LIKE indexing optimization for non-ASCII locales * Use index to restrict rows returned by multi-key index when used with non-consecutive keys or OR clauses, so fewer heap accesses * Be smarter about insertion of already-ordered data into btree index * -Add deleted bit to index tuples to reduce heap access * Prevent index uniqueness checks when UPDATE does not modifying column * Use bitmaps to fetch heap pages in sequential order [performance] * Use bitmaps to combine existing indexes [performance] * Improve handling of index scans for NULL * Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8, float4, numeric/decimal too [optimizer] * Add FILLFACTOR to btree index creation * Improve concurrency in GIST * Improve concurrency of hash indexes (Neil Conway) * -Test hash index performance and discourage usage Commands ======== * -Add SIMILAR TO to allow character classes, 'pg_[a-c]%' * Add BETWEEN ASYMMETRIC/SYMMETRIC (Christopher) * -Remove LIMIT #,# and force use LIMIT and OFFSET clauses in 7.3 (Bruce) * Allow LIMIT/OFFSET to use expressions * -Disallow TRUNCATE on tables that are involved in referential constraints * -Add OR REPLACE clauses to non-FUNCTION object creation * CREATE TABLE AS can not determine column lengths from expressions [atttypmod] * Allow UPDATE to handle complex aggregates [update] * -Prevent create/drop scripts from allowing extra args (Bruce) * Allow command blocks to ignore certain types of errors * Allow backslash handling in quoted strings to be disabled for portability * Return proper effected tuple count from complex commands [return] * Allow DELETE to handle table aliases for self-joins [delete] * Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT * Allow REINDEX to rebuild all indexes, remove /contrib/reindex * ALTER o ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT o -Add ALTER TABLE DROP COLUMN feature o -Add ALTER TABLE DROP non-CHECK CONSTRAINT o -ALTER TABLE ADD PRIMARY KEY (Christopher) o -ALTER TABLE ADD UNIQUE (Christopher) o -ALTER TABLE ALTER COLUMN SET/DROP NOT NULL (Christopher) o ALTER TABLE ADD COLUMN column DEFAULT should fill existing rows with DEFAULT value o ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence because of the item above o -Have ALTER TABLE OWNER change all dependant objects like indexes o Add ALTER TABLE tab SET WITHOUT OIDS * CLUSTER o -Cluster all tables at once using pg_index.indisclustered set during previous CLUSTER o -Prevent loss of indexes, permissions, inheritance o Automatically maintain clustering on a table o Allow CLUSTER to cluster all tables, remove clusterdb * COPY o -Allow specification of column names o Allow dump/load of CSV format o -Change syntax to WITH DELIMITER, (keep old syntax around?) o Allow COPY to report error lines and continue; optionally allow error codes to be specified; requires savepoints or can not be run in a multi-statement transaction o -Generate failure on short COPY lines rather than pad NULLs * CURSOR o Allow BINARY option to SELECT, just like DECLARE o MOVE 0 should not move to end of cursor o Allow UPDATE/DELETE WHERE CURRENT OF cursor using per-cursor tid stored in the backend o Prevent DROP of table being referenced by our own open cursor o Allow cursors outside transactions [cursor] * INSERT o Allow INSERT/UPDATE of system-generated oid value for a row o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..) o -Allow INSERT INTO my_table VALUES (a, b, c, DEFAULT, x, y, z, ...) o -Disallow missing columns in INSERT ... (col) VALUES, per ANSI o Allow INSERT/UPDATE ... RETURNING new.col or old.col; handle RULE cases (Philip) * SHOW/SET o -Add command to display locks o -Add SET or BEGIN timeout parameter to cancel query o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT using printf args o -Remove SET KSQO option now that OR processing is improved (Bruce) o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER o Add SHOW command to see locale o -Allow SHOW to output as a query result, like EXPLAIN o -Abort all SET changes made in an aborted transaction o Add SET SCHEMA * SERVER-SIDE LANGUAGES o Allow PL/PgSQL's RAISE function to take expressions o -Fix PL/PgSQL to handle quoted mixed-case identifiers o Change PL/PgSQL to use palloc() instead of malloc() o Add untrusted version of plpython o Allow Java server-side programming, http://pljava.sourceforge.net [java] o Fix problems with complex temporary table creation/destruction without using PL/PgSQL EXECUTE, needs cache prevention/invalidation o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW o Improve PL/PgSQL exception handling o Allow parameters to be specified by name and type during definition o Allow function parameters to be passed by name, get_employee_salary(emp_id => 12345, tax_year => 2001) o Add PL/PgSQL packages o Allow array declarations and other data types in PL/PgSQl DECLARE o Add PL/PgSQL PROCEDURES that can return multiple values o Add table function support to pltcl, plperl, plpython Clients ======= * -Have pg_dump use LEFT OUTER JOIN in multi-table SELECTs or multiple SELECTS to avoid bad system catalog entries * -Have pg_dump -C dump database location and encoding information * -Allow psql \d to show foreign keys * -Allow psql \d to show temporary table structure (Tom) * Allow psql to show transaction status if backend protocol changes made * Add XML interface: psql, pg_dump, COPY, separate server (?) * -Have pg_dump use ADD PRIMARY KEY after COPY, for performance (Neil Conway) * JDBC o Comprehensive test suite. This may be available already. o -Updateable resultSet o JDBC-standard BLOB support o Error Codes (pending backend implementation) o Support both 'make' and 'ant' o Fix LargeObject API to handle OIDs as unsigned ints o -Implement cancel() method on Statement o Use cursors implicitly to avoid large results (see setCursorName()) o -Add support for CallableStatements o Add LISTEN/NOTIFY support to the JDBC driver (Barry) o -Compile under jdk 1.4 * ECPG o Implement set descriptor, using descriptor o Make casts work in variable initializations o Implement SQLDA o Allow multi-threaded use of SQLCA o Solve cardinality > 1 for input descriptors / variables o Understand structure definitions outside a declare section o sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified o Improve error handling o Allow :var[:index] or :var[] as cvariable for an array var o Add a semantic check level, e.g. check if a table really exists o Fix nested C comments o Add SQLSTATE o fix handling of DB attributes that are arrays Referential Integrity ===================== * Add MATCH PARTIAL referential integrity [foreign] * Add deferred trigger queue file (Jan) * -Allow oid to act as a foreign key * Implement dirty reads and use them in RI triggers * Make triggers refer to columns by number, not name * Enforce referential integrity for system tables * -Allow user to control trigger firing order (Tom) * -Add ALTER TRIGGER ... RENAME * Change foreign key constraint for array -> element to mean element in array * -Fix foreign key constraints to not error on intermediate db states (Stephan) * Allow DEFERRABLE UNIQUE constraints * Allow triggers to be disabled [trigger] Dependency Checking =================== * -Add pg_depend table for dependency recording; use sysrelid, oid, depend_sysrelid, depend_oid, name * -Auto-destroy sequence on DROP of table with SERIAL; perhaps a separate SERIAL type * -Prevent column dropping if column is used by foreign key * -Propagate column or table renaming to foreign key constraints * -Automatically drop constraints/functions when object is dropped * -Make foreign key constraints clearer in dump file * -Make other constraints clearer in dump file * -Make foreign keys easier to identify * Flush cached query plans when their underlying catalog data changes * Use dependency information to dump data in proper order Transactions ============ * -Allow autocommit so always in a transaction block * Overhaul bufmgr/lockmgr/transaction manager * Allow savepoints / nested transactions [transactions] Exotic Features =============== * Add sql3 recursive unions * Add the concept of dataspaces/tablespaces [tablespaces] * -Add SQL92 schemas (Tom) * Allow queries across multiple databases [crossdb] * Add pre-parsing phase that converts non-ANSI features to supported features * Allow plug-in modules to emulate features from other databases * SQL*Net listener that makes PostgreSQL appear as an Oracle database to clients PERFORMANCE =========== Fsync ===== * Delay fsync() when other backends are about to commit too [fsync] o Determine optimal commit_delay value * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options o Allow multiple blocks to be written to WAL with one write() Cache ===== * -Cache most recent query plan(s) (Neil) [prepare] * Shared catalog cache, reduce lseek()'s by caching table size in shared area * Add free-behind capability for large sequential scans (Bruce) * Allow binding query args over FE/BE protocol * Consider use of open/fcntl(O_DIRECT) to minimize OS caching * Make blind writes go through the file descriptor cache Vacuum ====== * Improve speed with indexes (perhaps recreate index instead) [vacuum] * Reduce lock time by moving tuples with read lock, then write lock and truncate table [vacuum] * Provide automatic running of vacuum in the background (Tom) * Allow free space map to be auto-sized or warn when it is too small Locking ======= * Make locking of shared data structures more fine-grained * Add code to detect an SMP machine and handle spinlocks accordingly from distributted.net, http://www1.distributed.net/source, in client/common/cpucheck.cpp * Research use of sched_yield() for spinlock acquisition failure Startup Time ============ * Experiment with multi-threaded backend [thread] * Add connection pooling [pool] * Allow persistent backends [persistent] * Create a transaction processor to aid in persistent connections and connection pooling * Do listen() in postmaster and accept() in pre-forked backend * Have pre-forked backend pre-connect to last requested database or pass file descriptor to backend pre-forked for matching database * -Cache system catalog information in per-database files (Tom) Write-Ahead Log =============== * Have after-change WAL write()'s write only modified data to kernel * Reduce number of after-change WAL writes; they exist only to gaurd against partial page writes [wal] * Turn off after-change writes if fsync is disabled (?) * Add WAL index reliability improvement to non-btree indexes * -Reorder postgresql.conf WAL items in order of importance (Bruce) * -Remove wal_files postgresql.conf option because WAL files are now recycled * Find proper defaults for postgresql.conf WAL entries * Add checkpoint_min_warning postgresql.conf option to warn about checkpoints that are too frequent * Allow xlog directory location to be specified during initdb, perhaps using symlinks Optimizer / Executor ==================== * Improve Subplan list handling * Allow Subplans to use efficient joins(hash, merge) with upper variable * -Improve dynamic memory allocation by introducing tuple-context memory allocation (Tom) * Add hash for evaluating GROUP BY aggregates * -Nested FULL OUTER JOINs don't work (Tom) * Allow merge and hash joins on expressions not just simple variables (Tom) * -Add new pg_proc cachable settings to specify whether function can be evaluated only once or once per query * -Change FIXED_CHAR_SEL to 0.20 from 0.04 to give better selectivity (Bruce) * Make IN/NOT IN have similar performance to EXISTS/NOT EXISTS [exists] * Missing optimizer selectivities for date, r-tree, etc. [optimizer] * Allow ORDER BY ... LIMIT to select top values without sort or index using a sequential scan for highest/lowest values (Oleg) * Inline simple SQL functions to avoid overhead (Tom) * Precompile SQL functions to avoid overhead (Neil Conway) * Add utility to compute accurate random_page_cost value * Improve ability to display optimizer analysis using OPTIMIZER_DEBUG * Use CHECK constraints to improve optimizer decisions Miscellaneous ============= * Do async I/O for faster random read-ahead of data * Get faster regex() code from Henry Spencer when it is available * Use mmap() rather than SYSV shared memory or to write WAL files (?) [mmap] * Improve caching of attribute offsets when NULLs exist in the row * -Add Intimate Shared Memory(ISM) for Solaris * Add documentation to lock shared memory into RAM for each OS, if possible * -Use faster flex flags for performance improvement (Peter E) * -Add BSD-licensed qsort() for Solaris Source Code =========== * Add use of 'const' for variables in source tree * -Fix problems with libpq non-blocking/async code * -Make sure all block numbers are unsigned to increase maximum table size * -Merge LockMethodCtl and LockMethodTable into one shared structure (Bruce) * -HOLDER/HOLDERTAB rename to PROCLOCK/PROCLOCKTAG (Bruce) * -Remove LockMethodTable.prio field, not used (Bruce) * Rename some /contrib modules from pg* to pg_* * Move some things from /contrib into main tree * Remove warnings created by -Wcast-align * Move platform-specific ps status display info from ps_status.c to ports * -Make one version of simple_prompt() in code (Bruce, Tom) * -Compile in syslog functionaility by default (Tatsuo) * Modify regression tests to prevent failures do to minor numeric rounding * Add OpenBSD's getpeereid() call for local socket authentication (Bruce) * Improve access-permissions check on data directory in Cygwin (Tom) * -Report failure to find readline or zlib at end of configure run * Add --port flag to regression tests * -Increase identifier length (NAMEDATALEN) if small performance hit, * -Increase maximum number of function parameters if little wasted space * Add documentation for perl, including mention of DBI/DBD perl location * Add optional CRC checksum to heap and index pages * Change representation of whole-tuple parameters to functions * Clarify use of 'application' and 'command' tags in SGML docs * Better document ability to build only certain interfaces (Marc) * Remove or relicense modules that are not under the BSD license, if possible * Remove memory/file descriptor freeing befor elog(ERROR) (Bruce) * Create native Win32 port [win32] * Fix glibc's mktime() to handle pre-1970's dates --------------------------------------------------------------------------- Developers who have claimed items are: -------------------------------------- * Barry is Barry Lind * Billy is Billy G. Allie * Bruce is Bruce Momjian * Christopher is Christopher Kings-Lynne * D'Arcy is D'Arcy J.M. Cain * Dave is Dave Cramer * Edmund is Edmund Mergl * Hiroshi is Hiroshi Inoue * Karel is Karel Zak * Jan is Jan Wieck * Liam is Liam Stewart * Marc is Marc Fournier * Mark is Mark Hollomon * Marko is Marko Kreen * Michael is Michael Meskes * Neil is Neil Conway * Oleg is Oleg Bartunov * Peter M is Peter T Mount * Peter E is Peter Eisentraut * Philip is Philip Warner * Rod is Rod Taylor * Ross is Ross J. Reedstrom * Ryan is Ryan Bradetich * Stephan is Stephan Szabo * Tatsuo is Tatsuo Ishii * Thomas is Thomas Lockhart * Tom is Tom Lane * TomH is Tom I Helbekkmo * Vadim is Vadim B. Mikheev