postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2025-02-05 19:09:58 +08:00

Author	SHA1	Message	Date
Teodor Sigaev	97f3014647	This supports the triconsistent function for pg_trgm GIN opclass to make it faster to implement indexed queries where some keys are common and some are rare. Patch by Jeff Janes	2015-07-20 18:18:48 +03:00
Andrew Dunstan	00eff86cb8	Enable transforms modules to build and test on Cygwin. This still doesn't work correctly with Python 3, but I am committing this so we can get Cygwin buildfarm members building with Python 2.	2015-07-18 10:09:04 -04:00
Noah Misch	7193436744	AIX: Link TRANSFORM modules with their dependencies. The result closely resembles linking of these modules for the "win32" port. Augment the $(exports_file) header so the file is also usable as an import file. Unfortunately, relocating an AIX installation will now require adding $(pkglibdir) to LD_LIBRARY_PATH. Back-patch to 9.5, where the modules were introduced.	2015-07-15 21:00:26 -04:00
Noah Misch	736c1f238b	MinGW: Link ltree_plpython with plpython. The MSVC build system already did this, and building against Python 3 requires it. Back-patch to 9.5, where the module was introduced.	2015-07-15 21:00:26 -04:00
Fujii Masao	705d397cd9	Prevent pgstattuple() from reporting BRIN as unknown index. Also this patch removes obsolete comment. Back-patch to 9.5 where BRIN index was added.	2015-07-14 22:36:51 +09:00
Noah Misch	0689cfc34b	Link pg_stat_statements with libm. The AIX 7.1 libm is static, and AIX postgres executables do not export symbols acquired from libraries. Back-patch to 9.5, where commit `cfe12763c3` added a sqrt() call.	2015-07-08 20:44:22 -04:00
Tom Lane	10fb48d66d	Add an optional missing_ok argument to SQL function current_setting(). This allows convenient checking for existence of a GUC from SQL, which is particularly useful when dealing with custom variables. David Christensen, reviewed by Jeevan Chalke	2015-07-02 16:41:07 -04:00
Heikki Linnakangas	f92d6a540a	Use appendStringInfoString/Char et al where appropriate. Patch by David Rowley. Backpatch to 9.5, as some of the calls were new in 9.5, and keeping the code in sync with master makes future backpatching easier.	2015-07-02 12:36:03 +03:00
Fujii Masao	fb174687f7	Make use of xlog_internal.h's macros in WAL-related utilities. Commit `179cdd09` added macros to check if a filename is a WAL segment or other such file. However there were still some instances of the strlen + strspn combination to check for that in WAL-related utilities like pg_archivecleanup. Those checks can be replaced with the macros. This patch makes use of the macros in those utilities, which makes the code a bit easier to read. Back-patch to 9.5. Michael Paquier	2015-07-02 10:35:38 +09:00
Andres Freund	d47a1136e4	Fix test_decoding's handling of nonexistent columns in old tuple versions. test_decoding used fastgetattr() to extract column values. That's wrong when decoding updates and deletes if a table's replica identity is set to FULL and new columns have been added since the old version of the tuple was created. Due to the lack of a crosscheck with the datum's natts values an invalid value will be output, leading to errors or worse. Bug: #13470 Reported-By: Krzysztof Kotlarski Discussion: 20150626100333.3874.90852@wrigleys.postgresql.org Backpatch to 9.4, where the feature, including the bug, was added.	2015-06-27 19:00:45 +02:00
Peter Eisentraut	75f9d17638	Make Python tests more portable Newer Python versions randomize the hash seed for dictionaries, resulting in a random output order, which messes up the regression test diffs. Instead, use Python assert to compare the dictionaries with their expected value.	2015-05-31 07:10:45 -04:00
Stephen Frost	cde9cf170c	Finish removing pg_audit	2015-05-28 12:48:25 -04:00
Stephen Frost	e5f1a4f1e3	Remove pg_audit This removes pg_audit, per discussion: 20150528082038.GU26667@tamriel.snowman.net	2015-05-28 12:41:26 -04:00
Tom Lane	2aa0476dc3	Manual cleanup of pgindent results. Fix some places where pgindent did silly stuff, often because project style wasn't followed to begin with. (I've not touched the atomics headers, though.)	2015-05-24 15:04:10 -04:00
Tom Lane	91e79260f6	Remove no-longer-required function declarations. Remove a bunch of "extern Datum foo(PG_FUNCTION_ARGS);" declarations that are no longer needed now that PG_FUNCTION_INFO_V1(foo) provides that. Some of these were evidently missed in commit `e7128e8dbb`, but others were cargo-culted in in code added since then. Possibly that can be blamed in part on the fact that we'd not fixed relevant documentation examples, which I've now done.	2015-05-24 12:20:23 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Heikki Linnakangas	4fc72cc7bb	Collection of typo fixes. Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.	2015-05-20 16:56:22 +03:00
Andres Freund	0740cbd759	Refactor ON CONFLICT index inference parse tree representation. Defer lookup of opfamily and input type of a user specified opclass until the optimizer selects among available unique indexes; and store the opclass in the parse analyzed tree instead. The primary reason for doing this is that for rule deparsing it's easier to use the opclass than the previous representation. While at it also rename a variable in the inference code to better fit its purpose. This is separate from the actual fixes for deparsing to make review easier.	2015-05-19 21:21:27 +02:00
Peter Eisentraut	0779f2ba2d	Fix parse tree of DROP TRANSFORM and COMMENT ON TRANSFORM The plain C string language name needs to be wrapped in makeString() so that the parse tree is copyable. This is detectable by -DCOPY_PARSE_PLAN_TREES. Add a test case for the COMMENT case. Also make the quoting in the error messages more consistent. discovered by Tom Lane	2015-05-18 22:55:14 -04:00
Noah Misch	85270ac7a2	pgcrypto: Report errant decryption as "Wrong key or corrupt data". This has been the predominant outcome. When the output of decrypting with a wrong key coincidentally resembled an OpenPGP packet header, pgcrypto could instead report "Corrupt data", "Not text data" or "Unsupported compression algorithm". The distinct "Corrupt data" message added no value. The latter two error messages misled when the decrypted payload also exhibited fundamental integrity problems. Worse, error message variance in other systems has enabled cryptologic attacks; see RFC 4880 section "14. Security Considerations". Whether these pgcrypto behaviors are likewise exploitable is unknown. In passing, document that pgcrypto does not resist side-channel attacks. Back-patch to 9.0 (all supported versions). Security: CVE-2015-3167	2015-05-18 10:02:31 -04:00
Tom Lane	b14cf229f4	Use += not = to set makefile variables after including base makefiles. The previous coding in hstore_plpython and ltree_plpython wiped out any values set by the base makefiles. This at least had the effect of running the tests in "regression" not "contrib_regression" as expected. These being pretty new modules, there might be other bad effects we'd not noticed yet.	2015-05-17 20:04:42 -04:00
Stephen Frost	a936743b33	pg_audit Makefile, REINDEX changes Clean up the Makefile, per Michael Paquier. Classify REINDEX as we do in core, use '1.0' for the version, per Fujii.	2015-05-17 09:56:57 -04:00
Magnus Hagander	3b075e9d7b	Fix typos in comments Dmitriy Olshevskiy	2015-05-17 14:58:04 +02:00
Peter Eisentraut	fab6ca23ea	hstore_plpython: Fix regression tests under Python 3	2015-05-16 23:35:29 -04:00
Peter Eisentraut	e6dc503445	Fix whitespace	2015-05-16 20:43:32 -04:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg nodes to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step.
That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Alvaro Herrera	26df7066cc	Move strategy numbers to include/access/stratnum.h For upcoming BRIN opclasses, it's convenient to have strategy numbers defined in a single place. Since there's nothing appropriate, create it. The StrategyNumber typedef now lives there, as well as existing strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from gist.h). skey.h is forced to include stratnum.h because of the StrategyNumber typedef, but gist.h is not; extensions that currently rely on gist.h for rtree strategy numbers might need to add a new #include line. A few .c files can stop including skey.h and/or gist.h, which is a nice side benefit. Per discussion: https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org Authored by Emre Hasegeli and Álvaro. (It's not clear to me why bootscanner.l has any #include lines at all.)	2015-05-15 17:03:16 -03:00
Simon Riggs	df259759fb	Add to contrib/Makefile	2015-05-15 15:33:37 -04:00
Simon Riggs	56e121a508	contrib/tsm_system_time	2015-05-15 15:31:50 -04:00
Simon Riggs	4d40494b11	contrib/tsm_system_rows	2015-05-15 15:31:14 -04:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible Add a TABLESAMPLE clause to SELECT statements that allows the user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows the SQL Standard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
Stephen Frost	aff27e3379	Remove useless pg_audit.conf No need to have pg_audit.conf any longer since the regression tests are just loading the module at the start of each session (to simulate being in shared_preload_libraries, which isn't something we can actually make happen on the buildfarm itself, it seems). Pointed out by Tom	2015-05-15 10:41:53 -04:00
Simon Riggs	83e176ec18	Separate block sampling functions Refactoring ahead of tablesample patch Requested and reviewed by Michael Paquier Petr Jelinek	2015-05-15 04:02:54 +02:00
Stephen Frost	b22b770683	Make repeated 'make installcheck' runs work In pg_audit, set client_min_messages up to warning, then reset the role attributes, to completely reset the session while not making the regression tests depend on being run by any particular user.	2015-05-14 15:41:39 -04:00
Stephen Frost	ed6ea8e815	Improve pg_audit regression tests Instead of creating a new superuser role, extract out what the current user is and use that user instead. Further, clean up and drop all objects created by the regression test. Pointed out by Tom.	2015-05-14 15:16:27 -04:00
Tom Lane	35a1e1d159	Fix portability issue in pg_audit. "%ld" is not a portable way to print int64's. This may explain the buildfarm crashes we're seeing --- it seems to make dromedary happy, at least.	2015-05-14 13:19:26 -04:00
Tom Lane	6c9e93d3ff	Suppress uninitialized-variable warning.	2015-05-14 12:16:06 -04:00
Stephen Frost	8a2e1edd2b	Further fixes for the buildfarm for pg_audit Also, use a function to load the extension ahead of all other calls, simulating load from shared_preload_libraries, to make sure the hooks are in place before logging starts.	2015-05-14 11:55:36 -04:00
Stephen Frost	c703b1e689	Further fixes for the buildfarm for pg_audit The database built by the buildfarm is specific to the extension, use \connect - instead.	2015-05-14 11:44:16 -04:00
Stephen Frost	dfb7624a13	Fix buildfarm with regard to pg_audit Remove the check that pg_audit be installed by shared_preload_libraries as that's not going to work when running the regression tests in the buildfarm. That check was primarily a nice-to-have and isn't required anyway.	2015-05-14 10:57:12 -04:00
Stephen Frost	ac52bb0442	Add pg_audit, an auditing extension This extension provides detailed logging classes, ability to control logging at a per-object level, and includes fully-qualified object names for logged statements (DML and DDL) in independent fields of the log output. Authors: Ian Barwick, Abhijit Menon-Sen, David Steele Reviews by: Robert Haas, Tatsuo Ishii, Sawada Masahiko, Fujii Masao, Simon Riggs Discussion with: Josh Berkus, Jaime Casanova, Peter Eisentraut, David Fetter, Yeb Havinga, Alvaro Herrera, Petr Jelinek, Tom Lane, MauMau, Bruce Momjian, Jim Nasby, Michael Paquier, Fabrízio de Royes Mello, Neil Tiffin	2015-05-14 10:36:16 -04:00
Tom Lane	0bb8528b5c	Fix postgres_fdw to return the right ctid value in EvalPlanQual cases. If a postgres_fdw foreign table is a non-locked source relation in an UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, and the query selects its ctid column, the wrong value would be returned if an EvalPlanQual recheck occurred. This happened because the foreign table's result row was copied via the ROW_MARK_COPY code path, and EvalPlanQualFetchRowMarks just unconditionally set the reconstructed tuple's t_self to "invalid". To fix that, we can have EvalPlanQualFetchRowMarks copy the composite datum's t_ctid field, and be sure to initialize that along with t_self when postgres_fdw constructs a tuple to return. If we just did that much then EvalPlanQualFetchRowMarks would start returning "(0,0)" as ctid for all other ROW_MARK_COPY cases, which perhaps does not matter much, but then again maybe it might. The cause of that is that heap_form_tuple, which is the ultimate source of all composite datums, simply leaves t_ctid as zeroes in newly constructed tuples. That seems like a bad idea on general principles: a field that's really not been initialized shouldn't appear to have a valid value. So let's eat the trivial additional overhead of doing "ItemPointerSetInvalid(&(td->t_ctid))" in heap_form_tuple. This closes out our handling of Etsuro Fujita's report that tableoid and ctid weren't correctly set in postgres_fdw EvalPlanQual cases. Along the way we did a great deal of work to improve FDWs' ability to control row locking behavior; which was not wasted effort by any means, but it didn't end up being a fix for this problem because that feature would be too expensive for postgres_fdw to use all the time. Although the fix for the tableoid misbehavior was back-patched, I'm hesitant to do so here; it seems far less likely that people would care about remote ctid than tableoid, and even such a minor behavioral change as this in heap_form_tuple is perhaps best not back-patched. 
So commit to HEAD only, at least for the moment. Etsuro Fujita, with some adjustments by me	2015-05-13 14:05:29 -04:00
Andres Freund	5850b20f58	Add pgstattuple_approx() to the pgstattuple extension. The new function allows estimating bloat and other table-level statistics in a faster, but approximate, way. It does so by using information from the free space map for pages marked as all visible in the visibility map. The rest of the table is actually read and free space/bloat is measured accurately. In many cases that allows getting bloat information much quicker, causing less IO. Author: Abhijit Menon-Sen Reviewed-By: Andres Freund, Amit Kapila and Tomas Vondra Discussion: 20140402214144.GA28681@kea.toroid.org	2015-05-13 07:35:06 +02:00
Peter Eisentraut	d02f16470f	Replace some appendStringInfo* calls with more appropriate variants Author: David Rowley <dgrowleyml@gmail.com>	2015-05-11 20:38:55 -04:00
Tom Lane	1a8a4e5cde	Code review for foreign/custom join pushdown patch. Commit `e7cb7ee145` included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. 
I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.	2015-05-10 14:36:36 -04:00
Andrew Dunstan	0c90f6769d	Add new OID alias type regrole The new type has the scope of the whole database cluster so it doesn't behave the same as the existing OID alias types which have database scope, concerning object dependency. To avoid confusion, constants of the new type are prohibited from appearing where dependencies are made involving it. Also, add a note to the docs about possible MVCC violation and optimization issues, which are common to all the reg* types. Kyotaro Horiguchi	2015-05-09 13:06:49 -04:00
Andres Freund	581f4f9657	Remove dependency on ordering in logical decoding upsert test. Buildfarm member magpie sorted the output differently than intended by Peter. "Resolve" the problem by simply not aggregating, it's not that many lines.	2015-05-08 06:06:03 +02:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using an inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Andres Freund	2c8f4836db	Represent columns requiring insert and update privileges independently. Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund	2015-05-08 00:20:46 +02:00
Alvaro Herrera	db5f98ab4f	Improve BRIN infra, minmax opclass and regression test The minmax opclass was using the wrong support functions when cross-datatypes queries were run. Instead of trying to fix the pg_amproc definitions (which apparently is not possible), use the already correct pg_amop entries instead. This requires jumping through more hoops (read: extra syscache lookups) to obtain the underlying functions to execute, but it is necessary for correctness. Author: Emre Hasegeli, tweaked by Álvaro Review: Andreas Karlsson Also change BrinOpcInfo to record each stored type's typecache entry instead of just the OID. Turns out that the full type cache is necessary in brin_deform_tuple: the original code used the indexed type's byval and typlen properties to extract the stored tuple, which is correct in Minmax; but in other implementations that want to store something different, that's wrong. The realization that this is a bug comes from Emre also, but I did not use his patch. I also adopted Emre's regression test code (with smallish changes), which is more complete.	2015-05-07 13:02:22 -03:00
Tom Lane	b22527f29d	Fix incorrect declaration of citext's regexp_matches() functions. These functions should return SETOF TEXT[], like the core functions they are wrappers for; but they were incorrectly declared as returning just TEXT[]. This mistake had two results: first, if there was no match you got a scalar null result, whereas what you should get is an empty set (zero rows). Second, the 'g' flag was effectively ignored, since you would get only one result array even if there were multiple matches, as reported by Jeff Certain. While ignoring 'g' is a clear bug, the behavior for no matches might well have been thought to be the intended behavior by people who hadn't compared it carefully to the core regexp_matches() functions. So we should tread carefully about introducing this change in the back branches. Still, it clearly is a bug and so providing some fix is desirable. After discussion, the conclusion was to introduce the change in a 1.1 version of the citext extension (as we would need to do anyway); 1.0 still contains the incorrect behavior. 1.1 is the default and only available version in HEAD, but it is optional in the back branches, where 1.0 remains the default version. People wishing to adopt the fix in back branches will need to explicitly do ALTER EXTENSION citext UPDATE TO '1.1'. (I also provided a downgrade script in the back branches, so people could go back to 1.0 if necessary.) This should be called out as an incompatible change in the 9.5 release notes, although we'll also document it in the next set of back-branch release notes. The notes should mention that any views or rules that use citext's regexp_matches() functions will need to be dropped before upgrading to 1.1, and then recreated again afterwards. Back-patch to 9.1. The bug goes all the way back to citext's introduction in 8.4, but pre-9.1 there is no extension mechanism with which to manage the change. 
Given the lack of previous complaints it seems unnecessary to change this behavior in 9.0, anyway.	2015-05-05 15:51:22 -04:00
Peter Eisentraut	c0574cd5aa	hstore_plpython: Support tests on Python 2.3 Python 2.3 does not have the sorted() function, so do it the long way.	2015-05-04 22:30:21 -04:00
Andrew Dunstan	f802c6ddba	Enable transforms modules to build and run with Mingw builds. These modules were all missing essential Windows scaffolding, including resources files and descriptions, and links to the relevant library import files. This latter item means that the modules can't be built with pgxs on Windows, as we don't install the import files. If we ever decide to install them this restriction could probably be removed. Also, as with plperl we need to make sure that perl's CORE directory is last on the include list, as on Windows it appears to contain some headers with names that clash with names of some headers we include.	2015-05-03 09:10:47 -04:00
Peter Eisentraut	e30a864963	hstore_plperl: Move port-specific parts to later in the makefile PORTNAME isn't set until the global makefiles have been included.	2015-05-02 08:03:47 -04:00
Peter Eisentraut	0fd764647a	Make hstore_plperl's build even more like plperl's Combine the two places that set CPPFLAGS into one. Also, some settings should be restricted to Windows only. More precisely, -Wno-comment is a GCC-only option, but Windows in a makefile implies GCC at the moment. Also, since -Wno-comment is more properly a preprocessor option, move it to CPPFLAGS to simplify things a bit.	2015-05-01 22:16:58 -04:00
Andrew Dunstan	77477e745b	Make hstore_plperl's build more like plperl's This involves moving perl's CORE library to the end of the include list, and adding other compilation settings that plperl uses. This won't completely fix the breakage currently being seen by gcc builds on Windows, but it will let the build get further, and should be wholly benign, if not beneficial, on *nix.	2015-05-01 15:36:44 -04:00
Robert Haas	924bcf4f16	Create an infrastructure for parallel computation in PostgreSQL. This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client. Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.	2015-04-30 15:02:14 -04:00
Peter Eisentraut	dbf2ec1a1c	Fix parallel make risk with new check temp-install setup The "check" target no longer needs to depend on "all", because it now runs "install" directly, which in turn depends on "all". Doing both will cause problems with parallel make, because two builds will run next to each other. Also remove the redirection of the temp-install output into a log file. This was appropriate when this was done from within pg_regress, but now it's just a regular make run, and especially with the above changes this will now take the place of running the "all" target before the test suites. problem report by Jeff Janes, patch in part by Michael Paquier	2015-04-29 20:34:22 -04:00
Andres Freund	5aa2350426	Introduce replication progress tracking infrastructure. When implementing a replication solution on top of logical decoding, two related problems exist: * How to safely keep track of replication progress * How to change replication behavior, based on the origin of a row; e.g. to avoid loops in bi-directional replication setups The solution to these problems, as implemented here, consists of three parts: 1) 'replication origins', which identify nodes in a replication setup. 2) 'replication progress tracking', which remembers, for each replication origin, how far replay has progressed in an efficient and crash safe manner. 3) The ability to filter out changes performed at the behest of a replication origin during logical decoding; this allows complex replication topologies. E.g. by filtering all replayed changes out. Most of this could also be implemented in "userspace", e.g. by inserting additional rows containing origin information, but that ends up being much less efficient and more complicated. We don't want to require various replication solutions to reimplement logic for this independently. The infrastructure is intended to be generic enough to be reusable. This infrastructure also replaces the 'nodeid' infrastructure of commit timestamps. It is intended to provide all the former capabilities, except that there's only 2^16 different origins; but now they integrate with logical decoding. Additionally more functionality is accessible via SQL. Since the commit timestamp infrastructure has also been introduced in 9.5 (commit `73c986add`) changing the API is not a problem. For now the number of origins for which the replication progress can be tracked simultaneously is determined by the max_replication_slots GUC. That GUC is not a perfect match to configure this, but there doesn't seem to be sufficient reason to introduce a separate new one. Bumps both catversion and wal page magic.
Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer Discussion: 20150216002155.GI15326@awork2.anarazel.de, 20140923182422.GA15776@alap3.anarazel.de, 20131114172632.GE7522@alap2.anarazel.de	2015-04-29 19:30:53 +02:00
Peter Eisentraut	f95425478a	Fix hstore_plperl regression tests on some platforms On some platforms, plperl and plperlu cannot be loaded at the same time. So split the test into two separate test files.	2015-04-26 16:13:58 -04:00
Peter Eisentraut	cac7658205	Add transforms feature This provides a mechanism for specifying conversions between SQL data types and procedural languages. As examples, there are transforms for hstore and ltree for PL/Perl and PL/Python. reviews by Pavel Stěhule and Andres Freund	2015-04-26 10:33:14 -04:00
Tom Lane	f320cbb615	Fix typo in linux startup script. Missed a "$" in what was meant to be a variable substitution. Careless mistake in commit `f23425fa95`.	2015-04-26 09:43:15 -04:00
Peter Eisentraut	dcae5facca	Improve speed of make check-world Before, make check-world would create a new temporary installation for each test suite, which is slow and wasteful. Instead, we now create one test installation that is used by all test suites that are part of a make run. The management of the temporary installation is removed from pg_regress and handled in the makefiles. This allows for better control, and unifies the code with that of test suites not run through pg_regress. review and msvc support by Michael Paquier <michael.paquier@gmail.com> more review by Fabien Coelho <coelho@cri.ensmp.fr>	2015-04-23 08:59:52 -04:00
Stephen Frost	4ccc5bd28e	Pull in tableoid for inheritance with rowMarks As noted by Etsuro Fujita [1] and Dean Rasheed [2], `cb1ca4d800` changed ExecBuildAuxRowMark() to always look for the tableoid in the target list, but didn't also change preprocess_targetlist() to always include the tableoid. This resulted in errors with soon-to-be-added RLS with inheritance tests, and errors when using inheritance with foreign tables. Authors: Etsuro Fujita and Dean Rasheed (independently) Minor word-smithing on the comments by me. [1] 552CF0B6.8010006@lab.ntt.co.jp [2] CAEZATCVmFUfUOwwhnBTcgi6AquyjQ0-1fyKd0T3xBWJvn+xsFA@mail.gmail.com	2015-04-22 11:29:35 -04:00
Andres Freund	cef939c347	Rename pg_replication_slot's new active_in to active_pid. In `d811c037ce` active_in was added but discussion since showed that active_pid is preferred as a name. Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-22 09:43:40 +02:00
Peter Eisentraut	b0a738f428	Move pg_xlogdump from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-21 19:03:49 -04:00
Andres Freund	d811c037ce	Add 'active_in' column to pg_replication_slots. Right now it is visible whether a replication slot is active in any session, but not in which. Adding the active_in column, containing the pid of the backend having acquired the slot, makes it much easier to associate pg_replication_slots entries with the corresponding pg_stat_replication/pg_stat_activity row. This should have been done from the start, but I (Andres) dropped the ball there somehow. Author: Craig Ringer, revised by me Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-21 11:51:06 +02:00
Peter Eisentraut	528c2e44ab	Move pg_test_timing from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-20 21:30:12 -04:00
Peter Eisentraut	00882d9e5c	Move pg_test_fsync from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-19 22:20:49 -04:00
Peter Eisentraut	9fa8b0ee90	Move pg_upgrade from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-14 19:26:38 -04:00
Peter Eisentraut	30982be4e5	Integrate pg_upgrade_support module into backend Previously, these functions were created in a schema "binary_upgrade", which was deleted after pg_upgrade was finished. Because we don't want to keep that schema around permanently, move them to pg_catalog but rename them with a binary_upgrade_... prefix. The provided functions are only small wrappers around global variables that were added specifically for pg_upgrade use, so keeping the module separate does not create any modularity. The functions still check that they are only called in binary upgrade mode, so it is not possible to call these during normal operation. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-14 19:26:37 -04:00
Heikki Linnakangas	4f700bcd20	Reorganize our CRC source files again. Now that we use CRC-32C in WAL and the control file, the "traditional" and "legacy" CRC-32 variants are not used in any frontend programs anymore. Move the code for those back from src/common to src/backend/utils/hash. Also move the slicing-by-8 implementation (back) to src/port. This is in preparation for next patch that will add another implementation that uses Intel SSE 4.2 instructions to calculate CRC-32C, where available.	2015-04-14 17:03:42 +03:00
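
The CRC-32C checksum named in the commit above can be illustrated with a minimal bit-at-a-time sketch. This is not the slicing-by-8 or SSE 4.2 code the commit refers to; it only shows what those faster variants compute. The function name `crc32c_bitwise` and the macro name are hypothetical and chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the Castagnoli polynomial used by CRC-32C. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical helper: compute CRC-32C over a buffer one bit at a time.
 * Table-driven and hardware-assisted variants produce the same value,
 * just faster.
 */
static uint32_t
crc32c_bitwise(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;	/* standard initial value */

	assert(data != NULL || len == 0);

	while (len-- > 0)
	{
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
		{
			if (crc & 1u)
				crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED;
			else
				crc >>= 1;
		}
	}
	return crc ^ 0xFFFFFFFFu;	/* standard final inversion */
}
```

Calling `crc32c_bitwise("123456789", 9)` yields `0xE3069283`, the commonly published check value for CRC-32C, which makes the sketch easy to compare against an optimized implementation.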
Peter Eisentraut	81134af3ec	Move pgbench from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-13 13:07:16 -04:00
Peter Eisentraut	83aca89f7c	Move pg_archivecleanup from contrib/ to src/bin/ Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-11 23:29:18 -04:00
Alvaro Herrera	27846f02c1	Optimize locking a tuple already locked by another subxact Locking and updating the same tuple repeatedly led to some strange multixacts being created which had several subtransactions of the same parent transaction holding locks of the same strength. However, once a subxact of the current transaction holds a lock of a given strength, it's not necessary to acquire the same lock again. This made some coding patterns much slower than required. The fix is twofold. First we change HeapTupleSatisfiesUpdate to return HeapTupleBeingUpdated for the case where the current transaction is already a single-xid locker for the given tuple; it used to return HeapTupleMayBeUpdated for that case. The new logic is simpler, and the change to pgrowlocks is a testament to that: previously we needed to check for the single-xid locker separately in a very ugly way. That test is simpler now. As fallout from the HTSU change, some of its callers need to be amended so that tuple-locked-by-own-transaction is taken into account in the BeingUpdated case rather than the MayBeUpdated case. For many of them there is no difference; but heap_delete() and heap_update now check explicitly and do not grab tuple lock in that case. The HTSU change also means that routine MultiXactHasRunningRemoteMembers introduced in commit `11ac4c73cb` is no longer necessary and can be removed; the case that used to require it is now handled naturally as a result of the changes to heap_delete and heap_update. The second part of the fix to the performance issue is to adjust heap_lock_tuple to avoid the slowness: 1. Previously we checked for the case that our own transaction already held a strong enough lock and returned MayBeUpdated, but only in the multixact case. Now we do it for the plain Xid case as well, which saves having to LockTuple. 2. If the current transaction is the only locker of the tuple (but with a lock not as strong as what we need; otherwise it would have been caught in the check mentioned above), we can skip sleeping on the multixact, and instead go straight to create an updated multixact with the additional lock strength. 3. Most importantly, make sure that both the single-xid-locker case and the multixact-locker case optimization are applied always. We do this by checking both in a single place, rather than them appearing in two separate portions of the routine -- something that is made possible by the HeapTupleSatisfiesUpdate API change. Previously we would only check for the single-xid case when HTSU returned MayBeUpdated, and only checked for the multixact case when HTSU returned BeingUpdated. This was at odds with what HTSU actually returned in one case: if our own transaction was locker in a multixact, it returned MayBeUpdated, so the optimization never applied. This is what led to the large multixacts in the first place. Per bug report #8470 by Oskari Saarenmaa.	2015-04-10 13:47:15 -03:00
Robert Haas	e41beea0dd	Improve pgbench error reporting. This would have been worth doing on general principle anyway, but the recent addition of an expression syntax to pgbench makes it an even better idea than it would have been otherwise. Fabien Coelho	2015-04-02 16:26:49 -04:00
Andres Freund	62e2a8dc2c	Define integer limits independently from the system definitions. In `83ff1618` we defined integer limits iff they're not provided by the system. That turns out not to be the greatest idea because there's different ways some datatypes can be represented. E.g. on OSX PG's 64bit datatype will be a 'long int', but OSX unconditionally uses 'long long'. That disparity then can lead to warnings, e.g. around printf formats. One way to fix that would be to back int64 using stdint.h's int64_t. While a good idea it's not that easy to implement. We would e.g. need to include stdint.h in our external headers, which we don't today. Also computing the correct int64 printf formats in that case is nontrivial. Instead simply prefix the integer limits with PG_ and define them unconditionally. I've adjusted all the references to them in code, but not the ones in comments; the latter seems unnecessary to me. Discussion: 20150331141423.GK4878@alap3.anarazel.de	2015-04-02 17:43:35 +02:00
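
A minimal sketch of the approach described in the commit above: the limits get a `PG_` prefix and are defined unconditionally, so they never collide with or depend on the system's own definitions. The literal spellings below are illustrative rather than copied from c.h, the helper `int32_add_overflows` is hypothetical, and the sketch assumes `int` is 32 bits wide.

```c
#include <assert.h>

/* Defined unconditionally; no dependence on <limits.h> or <stdint.h>. */
#define PG_INT16_MIN	(-0x7FFF - 1)
#define PG_INT16_MAX	(0x7FFF)
#define PG_INT32_MIN	(-0x7FFFFFFF - 1)
#define PG_INT32_MAX	(0x7FFFFFFF)

/*
 * Hypothetical helper: report whether a + b would overflow a 32-bit
 * signed integer, using only the PG_-prefixed limits.
 */
static int
int32_add_overflows(int a, int b)
{
	assert(sizeof(int) == 4);	/* assumption of this sketch */

	if (b > 0)
		return a > PG_INT32_MAX - b;
	return a < PG_INT32_MIN - b;
}
```

For instance, `int32_add_overflows(PG_INT32_MAX, 1)` is true while `int32_add_overflows(PG_INT32_MAX, 0)` is false; the comparison is arranged so that the check itself never overflows.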
Bruce Momjian	a0efc71453	pg_upgrade: call 'postgres' binary to get data directory location This matches the binary 'pg_ctl' calls. Previously we called the 'postmaster'. Report by Christoph Berg	2015-04-01 18:25:45 -04:00
Bruce Momjian	0cf16b44cb	btree_gin: properly call DirectFunctionCall1() Previously we called DirectFunctionCall3() with dummy arguments. Fixed version of previous patch. Report by Jon Nelson	2015-03-31 10:26:45 -04:00
Heikki Linnakangas	1d0db8de04	Remove spurious semicolons. Petr Jelinek	2015-03-31 15:12:27 +03:00
Andrew Dunstan	fa1e5afa8a	Run pg_upgrade and pg_resetxlog with restricted token on Windows As with initdb these programs need to run with a restricted token, and if they don't pg_upgrade will fail when run as a user with Administrator privileges. Backpatch to all live branches. On the development branch the code is reorganized so that the restricted token code is now in a single location. On the stable branches a less invasive change is made by simply copying the relevant code to pg_upgrade.c and pg_resetxlog.c. Patches and bug report from Muhammad Asif Naeem, reviewed by Michael Paquier, slightly edited by me.	2015-03-30 17:07:52 -04:00
Tom Lane	542320c2bd	Be more careful about printing constants in ruleutils.c. The previous coding in get_const_expr() tried to avoid quoting integer, float, and numeric literals if at all possible. While that looks nice, it means that dumped expressions might re-parse to something that's semantically equivalent but not the exact same parsetree; for example a FLOAT8 constant would re-parse as a NUMERIC constant with a cast to FLOAT8. Though the result would be the same after constant-folding, this is problematic in certain contexts. In particular, Jeff Davis pointed out that this could cause unexpected failures in ALTER INHERIT operations because of child tables having not-exactly-equivalent CHECK expressions. Therefore, favor correctness over legibility and dump such constants in quotes except in the limited cases where they'll be interpreted as the same type even without any casting. This results in assorted small changes in the regression test outputs, and will affect display of user-defined views and rules similarly. The odds of that causing problems in the field seem non-negligible; given the lack of previous complaints, it seems best not to change this in the back branches.	2015-03-30 14:59:49 -04:00
Tom Lane	e9dd03c03a	Minor code cleanups in pgbench expression support. Get rid of unnecessary expr_yylex declaration (we haven't supported flex 2.5.4 in a long time, and even if we still did, the declaration in pgbench.h makes this one unnecessary and inappropriate). Fix copyright dates, improve some layout choices, etc.	2015-03-29 13:06:59 -04:00
Tom Lane	2c33e0fbce	Better fix for misuse of Float8GetDatumFast(). We can use that macro as long as we put the value into a local variable. Commit `735cd6128` was not wrong on its own terms, but I think this way looks nicer, and it should save a few cycles on 32-bit machines.	2015-03-28 13:56:37 -04:00
Andrew Dunstan	cfe12763c3	Use standard library sqrt function in pg_stat_statements The stddev calculation included a faster but unportable sqrt function. This is not worth the extra effort, and won't work everywhere. If the standard library function is good enough for the SQL function it should be good enough here too.	2015-03-28 09:22:51 -04:00
Heikki Linnakangas	e09b48316c	Add index-only scan support to btree_gist. inet, cidr, and timetz indexes still cannot support index-only scans, because they don't store the original unmodified value in the index, but a derived approximate value.	2015-03-27 23:35:16 +02:00
Andrew Dunstan	735cd6128a	Fix portability issues with stddev in pg_stat_statements Stddev is calculated on the fly, and the code in commit `717f709532` was using Float8GetDatumFast() inappropriately to convert the result to a Datum. Mea culpa. It now uses Float8GetDatum().	2015-03-27 17:29:59 -04:00
Andrew Dunstan	717f709532	Add stats for min, max, mean, stddev times to pg_stat_statements. The new fields are min_time, max_time, mean_time and stddev_time. Based on an original patch from Mitsumasa KONDO, modified by me. Reviewed by Petr Jelínek.	2015-03-27 15:43:22 -04:00
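
One common way to keep min, max, mean, and a variance accumulator up to date without storing individual samples is Welford's online method. The sketch below assumes that technique for illustration; the commit message only says the statistics are computed on the fly. The struct and function names are hypothetical, and the variance is the population variance (divided by the number of calls).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator for per-statement timing statistics. */
typedef struct TimingStats
{
	long		calls;			/* number of samples seen so far */
	double		min_time;
	double		max_time;
	double		mean_time;
	double		sum_var_time;	/* running sum of squared deviations */
} TimingStats;

/* Fold one new timing sample into the running statistics. */
static void
timing_stats_add(TimingStats *s, double t)
{
	double		old_mean;

	assert(s != NULL);

	if (s->calls == 0)
	{
		s->calls = 1;
		s->min_time = s->max_time = s->mean_time = t;
		s->sum_var_time = 0.0;
		return;
	}

	old_mean = s->mean_time;
	s->calls++;
	s->mean_time += (t - old_mean) / s->calls;
	s->sum_var_time += (t - old_mean) * (t - s->mean_time);

	if (t < s->min_time)
		s->min_time = t;
	if (t > s->max_time)
		s->max_time = t;
}

/* Population variance of the samples seen so far (0 if fewer than two). */
static double
timing_stats_variance(const TimingStats *s)
{
	assert(s != NULL);
	return (s->calls > 1) ? s->sum_var_time / s->calls : 0.0;
}
```

The standard deviation is the square root of `timing_stats_variance()`, for example via `sqrt()` from `<math.h>`. Feeding the samples 2, 4, 4, 4, 5, 5, 7, 9 into a zero-initialized `TimingStats` gives a mean of about 5 and a variance of about 4.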
Heikki Linnakangas	8816af65e4	Minor refactoring of btree_gist code. The gbt_var_key_copy function was doing two different things depending on the boolean argument. Seems cleaner to have two separate functions. Remove unused argument from gbt_num_compress.	2015-03-26 23:10:10 +02:00
Tom Lane	785941cdc3	Tweak __attribute__-wrapping macros for better pgindent results. This improves on commit `bbfd7edae5` by making two simple changes: * pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn(). Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces pgindent's tendency to misformat declarations involving them. * attributes are now always attached to function declarations, not definitions. Previously some places were taking creative shortcuts, which were not merely candidates for bad misformatting by pgindent but often were outright wrong anyway. (It does little good to put a noreturn annotation where callers can't see it.) In any case, if we would like to believe that these macros can be used with non-gcc compilers, we should avoid gratuitous variance in usage patterns. I also went through and manually improved the formatting of a lot of declarations, and got rid of excessively repetitive (and now obsolete anyway) comments informing the reader what pg_attribute_printf is for.	2015-03-26 14:03:25 -04:00
Andres Freund	83ff1618bc	Centralize definition of integer limits. Several submitted and even committed patches have run into the problem that C89, our baseline, does not provide minimum/maximum values for various integer datatypes. C99's stdint.h does, but we can't rely on it. Several parts of the code defined limits locally, so instead centralize the definitions to c.h. This patch also changes the more obvious usages of literal limit values; there's more places that could be changed, but it's less clear whether it's beneficial to change those. Author: Andrew Gierth Discussion: 87619tc5wc.fsf@news-spur.riddles.org.uk	2015-03-25 22:39:42 +01:00
Bruce Momjian	11226e3817	Revert commit `843cd0bfe6` Report by Tom Lane	2015-03-24 22:35:05 -04:00
Bruce Momjian	843cd0bfe6	btree_gin: properly call DirectFunctionCall1() Previously we called DirectFunctionCall3() with dummy arguments. Patch by Jon Nelson	2015-03-24 20:53:29 -04:00
Tom Lane	cb1ca4d800	Allow foreign tables to participate in inheritance. Foreign tables can now be inheritance children, or parents. Much of the system was already ready for this, but we had to fix a few things of course, mostly in the area of planner and executor handling of row locks. As side effects of this, allow foreign tables to have NOT VALID CHECK constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS. Continuing to disallow these things would've required bizarre and inconsistent special cases in inheritance behavior. Since foreign tables don't enforce CHECK constraints anyway, a NOT VALID one is a complete no-op, but that doesn't mean we shouldn't allow it. And it's possible that some FDWs might have use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops for most. An additional change in support of this is that when a ModifyTable node has multiple target tables, they will all now be explicitly identified in EXPLAIN output, for example: Update on pt1 (cost=0.00..321.05 rows=3541 width=46) Update on pt1 Foreign Update on ft1 Foreign Update on ft2 Update on child3 -> Seq Scan on pt1 (cost=0.00..0.00 rows=1 width=46) -> Foreign Scan on ft1 (cost=100.00..148.03 rows=1170 width=46) -> Foreign Scan on ft2 (cost=100.00..148.03 rows=1170 width=46) -> Seq Scan on child3 (cost=0.00..25.00 rows=1200 width=46) This was done mainly to provide an unambiguous place to attach "Remote SQL" fields, but it is useful for inherited updates even when no foreign tables are involved. Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro Horiguchi, some additional hacking by me	2015-03-22 13:53:21 -04:00
Tom Lane	8d1f239003	Replace insertion sort in contrib/intarray with qsort(). It's all very well to claim that a simplistic sort is fast in easy cases, but O(N^2) in the worst case is not good ... especially if the worst case is as easy to hit as "descending order input". Replace that bit with our standard qsort. Per bug #12866 from Maksym Boguk. Back-patch to all active branches.	2015-03-15 23:22:03 -04:00
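
The cost difference behind the commit above can be sketched as follows. The first function is a plain insertion sort that counts element shifts; on strictly descending input of N elements it performs N(N-1)/2 of them. The second sorts the same kind of array with the C library's `qsort()`, which stands in here for PostgreSQL's own qsort implementation. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Insertion sort that returns how many element shifts it performed. */
static size_t
insertion_sort_count_shifts(int32_t *v, size_t n)
{
	size_t		shifts = 0;

	assert(v != NULL || n == 0);

	for (size_t i = 1; i < n; i++)
	{
		int32_t		key = v[i];
		size_t		j = i;

		/* Move larger elements one slot right to make room for key. */
		while (j > 0 && v[j - 1] > key)
		{
			v[j] = v[j - 1];
			j--;
			shifts++;
		}
		v[j] = key;
	}
	return shifts;
}

/* Comparator for qsort(): ascending order, written to avoid overflow. */
static int
compare_int32(const void *a, const void *b)
{
	int32_t		x = *(const int32_t *) a;
	int32_t		y = *(const int32_t *) b;

	return (x > y) - (x < y);
}

/* Sort an int32 array in ascending order using the library qsort(). */
static void
sort_int32_array(int32_t *v, size_t n)
{
	assert(v != NULL || n == 0);
	if (n > 1)
		qsort(v, n, sizeof(int32_t), compare_int32);
}
```

Sorting the 100 values 100, 99, ..., 1 with `insertion_sort_count_shifts()` takes 4950 shifts, and the count grows quadratically with the array length, which is the behavior the bug report hit.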
Tom Lane	7b8b8a4331	Improve representation of PlanRowMark. This patch fixes two inadequacies of the PlanRowMark representation. First, that the original LockingClauseStrength isn't stored (and cannot be inferred for foreign tables, which always get ROW_MARK_COPY). Since some PlanRowMarks are created out of whole cloth and don't actually have an ancestral RowMarkClause, this requires adding a dummy LCS_NONE value to enum LockingClauseStrength, which is fairly annoying but the alternatives seem worse. This fix allows getting rid of the use of get_parse_rowmark() in FDWs (as per the discussion around commits `462bd95705` and `8ec8760fc8`), and it simplifies some things elsewhere. Second, that the representation assumed that all child tables in an inheritance hierarchy would use the same RowMarkType. That's true today but will soon not be true. We add an "allMarkTypes" field that identifies the union of mark types used in all a parent table's children, and use that where appropriate (currently, only in preprocess_targetlist()). In passing fix a couple of minor infelicities left over from the SKIP LOCKED patch, notably that _outPlanRowMark still thought waitPolicy is a bool. Catversion bump is required because the numeric values of enum LockingClauseStrength can appear in on-disk rules. Extracted from a much larger patch to support foreign table inheritance; it seemed worth breaking this out, since it's a separable concern. Shigeru Hanada and Etsuro Fujita, somewhat modified by me	2015-03-15 18:41:47 -04:00
Robert Haas	e96b7c6b9f	sepgsql: Improve error message when unsupported object type is labeled. KaiGai Kohei, reviewed by Álvaro Herrera and myself	2015-03-11 12:12:10 -04:00
Andres Freund	bbfd7edae5	Add macros wrapping all usage of gcc's __attribute__. Until now __attribute__() was defined to be empty for all compilers but gcc. That's problematic because it prevents using it in other compilers; which is necessary e.g. for atomics portability. It's also just generally dubious to do so in a header as widely included as c.h. Instead add pg_attribute_format_arg, pg_attribute_printf, pg_attribute_noreturn macros which are implemented in the compilers that understand them. Also add pg_attribute_noreturn and pg_attribute_packed, but don't provide fallbacks, since they can affect functionality. This means that external code that, possibly unwittingly, relied on __attribute__ defined to be empty on !gcc compilers may now run into warnings or errors on those compilers. But there shouldn't be many occurrences of that and it's hard to work around... Discussion: 54B58BA3.8040302@ohmu.fi Author: Oskari Saarenmaa, with some minor changes by me.	2015-03-11 14:30:01 +01:00
Fujii Masao	57aa5b2bb1	Add GUC to enable compression of full page images stored in WAL. When the newly added GUC parameter, wal_compression, is on, the PostgreSQL server compresses a full page image written to WAL when full_page_writes is on or during a base backup. A compressed page image will be decompressed during WAL replay. Turning this parameter on can reduce the WAL volume without increasing the risk of unrecoverable data corruption, but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay. This commit changes the WAL format (so bumping WAL version number) so that the one-byte flag indicating whether a full page image is compressed or not is included in its header information. This means that the commit increases the WAL volume by one byte per full page image even if WAL compression is not used at all. We could save that one byte by borrowing one bit from an existing field like hole_offset in the header and using it as the flag, for example. But that would reduce the code's readability and the feature's extensibility. Per discussion, it's not worth paying those prices to save only one byte, so we decided to add the one-byte flag to the header. This commit doesn't introduce any new compression algorithm like lz4. Currently a full page image is compressed using the existing PGLZ algorithm. Per discussion, we decided to use it at least in the first version of the feature because there were no performance reports showing that its compression ratio is unacceptably lower than that of other algorithms. Of course, in the future, it's worth considering support for other compression algorithms for better compression. Rahila Syed and Michael Paquier, reviewed in various versions by myself, Andres Freund, Robert Haas, Abhijit Menon-Sen and many others.	2015-03-11 15:52:24 +09:00
Alvaro Herrera	e491bd2ee3	Move BRIN page type to page's last two bytes ... which is the usual convention among AMs, so that pg_filedump and similar utilities can tell apart pages of different AMs. It was also the intent of the original code, but I failed to realize that alignment considerations would move the whole thing to the previous-to-last word in the page. The new definition of the associated macro makes surrounding code a bit leaner, too. Per note from Heikki at http://www.postgresql.org/message-id/546A16EF.9070005@vmware.com	2015-03-10 12:27:15 -03:00

2899 Commits