postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-12-21 08:29:39 +08:00

Author	SHA1	Message	Date
Tom Lane	442231d7f7	Fix postmaster to attempt restart after a hot-standby crash. The postmaster was coded to treat any unexpected exit of the startup process (i.e., the WAL replay process) as a catastrophic crash, and not try to restart it. This was OK so long as the startup process could not have any sibling postmaster children. However, if a hot-standby backend crashes, we SIGQUIT the startup process along with everything else, and the resulting exit is hardly "unexpected". Treating it as such meant we failed to restart a standby server after any child crash at all, not only a crash of the WAL replay process as intended. Adjust that. Back-patch to 9.0 where hot standby was introduced.	2012-02-06 15:30:21 -05:00
Michael Meskes	0ee23b53be	Allow the connection keyword array to carry all seven items in ecpglib.	2012-02-06 20:58:57 +01:00
Tom Lane	5fc78efcec	Avoid throwing ERROR during WAL replay of DROP TABLESPACE. Although we will not even issue an XLOG_TBLSPC_DROP WAL record unless removal of the tablespace's directories succeeds, that does not guarantee that the same operation will succeed during WAL replay. Foreseeable reasons for it to fail include temp files created in the tablespace by Hot Standby backends, wrong directory permissions on a standby server, etc etc. The original coding threw ERROR if replay failed to remove the directories, but that is a serious overreaction. Throwing an error aborts recovery, and worse means that manual intervention will be needed to get the database to start again, since otherwise the same error will recur on subsequent attempts to replay the same WAL record. And the consequence of failing to remove the directories is only that some probably-small amount of disk space is wasted, so it hardly seems justified to throw an error. Accordingly, arrange to report such failures as LOG messages and keep going when a failure occurs during replay. Back-patch to 9.0 where Hot Standby was introduced. In principle such problems can occur in earlier releases, but Hot Standby increases the odds of trouble significantly. Given the lack of field reports of such issues, I'm satisfied with patching back as far as the patch applies easily.	2012-02-06 14:44:41 -05:00
Robert Haas	3b157cf21d	pg_dump: Remove global Archive pointer. Instead, everything that needs the Archive object now gets it as a parameter. This is necessary infrastructure for parallel pg_dump, but is also amply justified by the ugliness of the current code (though a lot more than this is needed to fix that problem).	2012-02-06 14:07:55 -05:00
Robert Haas	622f862868	pg_dump: Reduce dependencies on global variables. Change various places in the code that are referencing the global Archive object g_fout to instead reference the Archive object fout which is already being passed as a parameter. For parallel pg_dump to work, we're going to need multiple Archive(Handle) objects, so the real solution here is to pass down the Archive object to everywhere that it needs to go, but we might as well pick the low-hanging fruit first.	2012-02-06 13:06:34 -05:00
Tom Lane	c6d76d7c82	Add locking around WAL-replay modification of shared-memory variables. Originally, most of this code assumed that no Postgres backends could be running concurrently with it, and so no locking could be needed. That assumption fails in Hot Standby. While it's still true that Hot Standby backends should never change values like nextXid, they can examine them, and consistency is important in some cases such as when computing a snapshot. Therefore, prudence requires that WAL replay code obtain the relevant locks when modifying such variables, even though it can examine them without taking a lock. We were following that coding rule in some places but not all. This commit applies the coding rule uniformly to all updates of ShmemVariableCache and MultiXactState fields; a search of the replay routines did not find any other cases that seemed to be at risk. In addition, this commit fixes a longstanding thinko in replay of NEXTOID and checkpoint records: we tried to advance nextOid only if it was behind the value in the WAL record, but the comparison would draw the wrong conclusion if OID wraparound had occurred since the previous value. Better to just unconditionally assign the new value, since OID assignment shouldn't be happening during replay anyway. The additional locking seems to be more in the nature of future-proofing than fixing any live bug, so I am not going to back-patch it. The NEXTOID fix will be back-patched separately.	2012-02-06 12:34:10 -05:00
Robert Haas	96abd81744	Remove dead declaration.	2012-02-06 12:09:20 -05:00
Alvaro Herrera	0c88086df3	fe-misc.c depends on pg_config_paths.h. Declare this in Makefile to avoid failures in parallel compiles. Author: Lionel Elie Mamane	2012-02-06 11:50:01 -03:00
Tom Lane	17118825b8	Fix transient clobbering of shared buffers during WAL replay. RestoreBkpBlocks was in the habit of zeroing and refilling the target buffer; which was perfectly safe when the code was written, but is unsafe during Hot Standby operation. The reason is that we have coding rules that allow backends to continue accessing a tuple in a heap relation while holding only a pin on its buffer. Such a backend could see transiently zeroed data, if WAL replay had occasion to change other data on the page. This has been shown to be the cause of bug #6425 from Duncan Rance (who deserves kudos for developing a sufficiently-reproducible test case) as well as Bridget Frey's re-report of bug #6200. It most likely explains the original report as well, though we don't yet have confirmation of that. To fix, change the code so that only bytes that are supposed to change will change, even transiently. This actually saves cycles in RestoreBkpBlocks, since it's not writing the same bytes twice. Also fix seq_redo, which has the same disease, though it has to work a bit harder to meet the requirement. So far as I can tell, no other WAL replay routines have this type of bug. In particular, the index-related replay routines, which would certainly be broken if they had to meet the same standard, are not at risk because we do not have coding rules that allow access to an index page when not holding a buffer lock on it. Back-patch to 9.0 where Hot Standby was added.	2012-02-05 15:49:17 -05:00
Tom Lane	ee68a44106	Improve comment.	2012-02-04 22:37:34 -05:00
Tom Lane	2af72cefea	Add missing Assert and fix inaccurate elog message in standby_redo(). All other WAL redo routines either call RestoreBkpBlocks() or Assert that they haven't been passed any backup blocks. Make this one do likewise. Also, fix incorrect routine name in its failure message.	2012-02-04 22:32:35 -05:00
Tom Lane	9bff0780cf	Allow SQL-language functions to reference parameters by name. Matthew Draper, reviewed by Hitoshi Harada	2012-02-04 19:23:49 -05:00
Tom Lane	342b83fdca	Revert "Add some regression test cases for denormalized float8 input." This reverts commit `500cf66d55`. As was more or less expected, a small minority of platforms won't accept denormalized input even with the recent changes. It doesn't seem especially helpful to test this if we're going to have to provide an alternate expected-file to allow failure.	2012-02-04 15:52:09 -05:00
Bruce Momjian	072ba77bff	Remove tabs in SGML file.	2012-02-04 07:11:44 -05:00
Michael Meskes	fc211f8277	Applied Peter's patch to use PQconnectdbParams in ecpglib instead of the old PQconnectdb.	2012-02-04 01:19:10 +01:00
Andrew Dunstan	39909d1d39	Add array_to_json and row_to_json functions. Also move the escape_json function from explain.c to json.c where it seems to belong. Andrew Dunstan, reviewed by Abhijit Menon-Sen.	2012-02-03 12:11:16 -05:00
Peter Eisentraut	69e9768e7b	ecpg: Improve test building. Further improve on commit `c75e143646`. Instead of building both .o files and binaries in the same make rule, just rely on the normal .c -> .o rule. This will ensure that dependency tracking is used when enabled. To do this, disable the implicit direct .c -> binary rule globally, which will also prevent the original problem (*.dSYM junk) from reappearing elsewhere.	2012-02-02 20:33:29 +02:00
Robert Haas	0ed7445d73	Allow spgist's text_ops to handle pattern-matching operators. This was presumably intended to work this way all along, but a few key bits of indxpath.c didn't get the memo. Robert Haas and Tom Lane	2012-02-02 13:10:56 -05:00
Robert Haas	b4e0741727	Avoid re-checking for visibility map extension too frequently. When testing bits (but not when setting or clearing them), we now won't check whether the map has been extended. This significantly improves performance in the case where the visibility map doesn't exist yet, by avoiding an extra system call per tuple. To make sure backends notice eventually, send an smgr inval on VM extension. Dean Rasheed, with minor modifications by me.	2012-02-01 20:35:42 -05:00
Peter Eisentraut	8a02339e9b	initdb: Add options --auth-local and --auth-host. Reviewed by Robert Haas and Pavel Stehule	2012-02-01 21:18:55 +02:00
Peter Eisentraut	69f4f1c357	psql: Case-preserving completion of SQL key words. Instead of always completing SQL key words in upper case, look at the word being completed and match the case. Reviewed by Fujii Masao	2012-02-01 20:18:32 +02:00
Tom Lane	500cf66d55	Add some regression test cases for denormalized float8 input. This was submitted with the previous patch, but I'm committing it separately to ease backing it out if these results prove too unportable. Marti Raudsepp, after a proposal by Jeroen Vermeulen	2012-02-01 13:13:54 -05:00
Tom Lane	c318aeed84	Try to be more consistent about accepting denormalized float8 numbers. On some platforms, strtod() reports ERANGE for a denormalized value (ie, one that can be represented as distinct from zero, but is too small to have full precision). On others, it doesn't. It seems better to try to accept these values consistently, so add a test to see if the result value indicates a true out-of-range condition. This should be okay per Single Unix Spec. On machines where the underlying math isn't IEEE standard, the behavior for such small numbers may not be very consistent, but then it wouldn't be anyway. Marti Raudsepp, after a proposal by Jeroen Vermeulen	2012-02-01 13:11:16 -05:00
Alvaro Herrera	b2e431a4db	Implement dry-run mode for pg_archivecleanup. In dry-run mode, just the name of the file to be removed is printed to stdout; this is so the user can easily plug it into another program through a pipe. If debug mode is also specified, a more verbose message is printed to stderr. Author: Gabriele Bartolini Reviewer: Josh Kupershmidt	2012-02-01 14:18:12 -03:00
Magnus Hagander	21238deea5	Properly free the sslcompression field in PGconn. Marko Kreen	2012-02-01 16:51:35 +01:00
Tom Lane	bef47331b6	Code review for plpgsql fn_signature patch. Don't quote the output of format_procedure(); it's already quoted quite enough. Remove the fn_name field, which was now just dead weight. Fix remaining expected-output files.	2012-02-01 02:14:37 -05:00
Peter Eisentraut	4b77bfc37a	psql: Reduce the amount of const lies a bit	2012-01-31 21:23:17 +02:00
Peter Eisentraut	88a6ac9f93	pg_dump: Add GCC noreturn attribute to appropriate functions. This is a small help to the compiler and static analyzers.	2012-01-31 20:49:10 +02:00
Robert Haas	5ae88c65da	Adjust expected regression test outputs for PL/python. This got broken by commit `4c6cedd1b0`, which caused PL/pgsql error messages to print the function signature, not just the name. Per buildfarm.	2012-01-31 13:16:38 -05:00
Robert Haas	c327108140	Catversion bump for JSON patch. Sigh.	2012-01-31 11:51:51 -05:00
Robert Haas	5384a73f98	Built-in JSON data type. Like the XML data type, we simply store JSON data as text, after checking that it is valid. More complex operations such as canonicalization and comparison may come later, but this is enough for now. There are a few open issues here, such as whether we should attempt to detect UTF-8 surrogate pairs represented as \uXXXX\uYYYY, but this gets the basic framework in place.	2012-01-31 11:48:23 -05:00
Heikki Linnakangas	4c6cedd1b0	Print function signature, not just name, in PL/pgSQL error messages. This makes it unambiguous which function the message is coming from, if you have overloaded functions. Pavel Stehule, reviewed by Abhijit Menon-Sen.	2012-01-31 10:36:20 +02:00
Heikki Linnakangas	82d4b262d9	Fix bug in the new wait-until-lwlock-is-free mechanism. If there was a wait-until-free process in the head of the wait queue, followed by an exclusive locker, the exclusive locker was not woken up as it should be.	2012-01-31 00:09:30 +02:00
Peter Eisentraut	82e83f46a2	Add sequence USAGE privileges to information schema. The sequence USAGE privilege is sufficiently similar to the SQL standard that it seems reasonable to show in the information schema. Also add some compatibility notes about it on the GRANT reference page.	2012-01-30 21:45:42 +02:00
Peter Eisentraut	ee7fa66b19	PL/Python: Add result metadata functions. Add result object functions .colnames, .coltypes, .coltypmods to obtain information about the result column names and types, which was previously not possible in the PL/Python SPI interface. Reviewed by Abhijit Menon-Sen	2012-01-30 21:38:52 +02:00
Peter Eisentraut	c6ea8ccea6	Use abort() instead of exit() to abort library functions. In some hopeless situations, certain library functions in libpq and libpgport quit the program. Use abort() for that instead of exit(), so we don't interfere with the normal exit codes the program might use, we clearly signal the abnormal termination, and the caller has a chance of catching the termination. This was originally pointed out by Debian's Lintian program.	2012-01-30 21:34:00 +02:00
Robert Haas	423ee49b49	Remove prototype for nonexistent function.	2012-01-30 11:59:40 -05:00
Heikki Linnakangas	9b38d46d9f	Make group commit more effective. When a backend needs to flush the WAL, and someone else is already flushing the WAL, wait until it releases the WALInsertLock and check if we still need to do the flush or if the other backend already did the work for us, before acquiring WALInsertLock. This helps group commit, because when the WAL flush finishes, all the backends that were waiting for it can be woken up in one go, and they can all concurrently observe that they're done, rather than waking them up one by one in a cascading fashion. This is based on a new LWLock function, LWLockWaitUntilFree(), which has peculiar semantics. If the lock is immediately free, it grabs the lock and returns true. If it's not free, it waits until it is released, but then returns false without grabbing the lock. This is used in XLogFlush(), so that when the lock is acquired, the backend flushes the WAL, but if it's not, the backend first checks the current flush location before retrying. Original patch and benchmarking by Peter Geoghegan and Simon Riggs, although this patch as committed ended up being very different from that.	2012-01-30 16:53:48 +02:00
Simon Riggs	ba1868ba31	Minor bug fix and cleanup from self-review of sync rep queues patch.	2012-01-30 14:36:17 +00:00
Simon Riggs	73f617f13f	Various minor comments changes from bgwriter to checkpointer.	2012-01-30 14:34:25 +00:00
Heikki Linnakangas	a578257040	Accept a non-existent value in "ALTER USER/DATABASE SET ..." command. When default_text_search_config, default_tablespace, or temp_tablespaces setting is set per-user or per-database, with an "ALTER USER/DATABASE SET ..." statement, don't throw an error if the text search configuration or tablespace does not exist. In case of text search configuration, even if it doesn't exist in the current database, it might exist in another database, where the setting is intended to have its effect. This behavior is now the same as search_path's. Tablespaces are cluster-wide, so the same argument doesn't hold for tablespaces, but there's a problem with pg_dumpall: it dumps "ALTER USER SET ..." statements before the "CREATE TABLESPACE" statements. Arguably that's pg_dumpall's fault - it should dump the statements in such an order that the tablespace is created first and then the "ALTER USER SET default_tablespace ..." statements after that - but it seems better to be consistent with search_path and default_text_search_config anyway. Besides, you could still create a dump that throws an error, by creating the tablespace, running "ALTER USER SET default_tablespace", then dropping the tablespace and running pg_dumpall on that. Backpatch to all supported versions.	2012-01-30 11:13:36 +02:00
Tom Lane	ad10853b30	Assorted comment fixes, mostly just typos, but some obsolete statements. YAMAMOTO Takashi	2012-01-29 19:23:56 -05:00
Tom Lane	dd243b3e40	Fix typo in comment. Peter Geoghegan	2012-01-29 18:56:35 -05:00
Tom Lane	21a39de580	Tweak index costing for problems with partial indexes. btcostestimate() makes an estimate of the number of index tuples that will be visited based on knowledge of which index clauses can actually bound the scan within nbtree. However, it forgot to account for partial indexes in this calculation, with the result that the cost of the index scan could be significantly overestimated for a partial index. Fix that by merging the predicate with the abbreviated indexclause list, in the same way as we do with the full list to estimate how many heap tuples will be visited. Also, slightly increase the "fudge factor" that's meant to give preference to smaller indexes over larger ones. While this is applied to all indexes, it's most important for partial indexes since it can be the only factor that makes a partial index look cheaper than a similar full index. Experimentation shows that the existing value is so small as to easily get swamped by noise such as page-boundary-roundoff behavior. I'm tempted to kick it up more than this, but will refrain for now. Per report from Ruben Blanco. These are long-standing issues, but given the lack of prior complaints I'm not going to risk changing planner behavior in back branches by back-patching.	2012-01-29 18:37:14 -05:00
Tom Lane	b28ffd0fcc	Fix pushing of index-expression qualifications through UNION ALL. In commit `57664ed25e`, I made the planner wrap non-simple-variable outputs of appendrel children (IOW, child SELECTs of UNION ALL subqueries) inside PlaceHolderVars, in order to solve some issues with EquivalenceClass processing. However, this means that any upper-level WHERE clauses mentioning such outputs will now contain PlaceHolderVars after they're pushed down into the appendrel child, and that prevents indxpath.c from recognizing that they could be matched to index expressions. To fix, add explicit stripping of PlaceHolderVars from index operands, same as we have long done for RelabelType nodes. Add a regression test covering both this and the plain-UNION case (which is a totally different code path, but should also be able to do it). Per bug #6416 from Matteo Beccati. Back-patch to 9.1, same as the previous change.	2012-01-29 16:31:23 -05:00
Tom Lane	ed6e0545f5	Add caution about multiple unique indexes breaking plpgsql upsert example. Per Phil Sorber, though I didn't use his wording exactly.	2012-01-28 21:06:41 -05:00
Tom Lane	17d3233e1b	Update statement about sorting of character-string data. The sort order is no longer fixed at database creation time, but can be controlled via COLLATE. Noted by Thomas Kellerer.	2012-01-28 20:54:56 -05:00
Tom Lane	4ec6581c0c	Fix handling of init_plans list in inheritance_planner(). Formerly we passed an empty list to each per-child-table invocation of grouping_planner, and then merged the results into the global list. However, that fails if there's a CTE attached to the statement, because create_ctescan_plan uses the list to find the plan referenced by a CTE reference; so it was unable to find any CTEs attached to the outer UPDATE or DELETE. But there's no real reason not to use the same list throughout the process, and doing so is simpler and faster anyway. Per report from Josh Berkus of "could not find plan for CTE" failures. Back-patch to 9.1 where we added support for WITH attached to UPDATE or DELETE. Add some regression test cases, too.	2012-01-28 20:24:42 -05:00
Tom Lane	759d9d6769	Add simple tests of EvalPlanQual using the isolationtester infrastructure. Much more could be done here, but at least now we have some automated test coverage of that mechanism. In particular this tests the writable-CTE case reported by Phil Sorber. In passing, remove isolationtester's arbitrary restriction on the number of steps in a permutation list. I used this so that a single spec file could be used to run several related test scenarios, but there are other possible reasons to want a step series that's not exactly a permutation. Improve documentation and fix a couple other nits as well.	2012-01-28 17:55:08 -05:00
Tom Lane	7c1719bc68	Fix handling of data-modifying CTE subplans in EvalPlanQual. We can't just skip initializing such subplans, because the referencing CTE node will expect to find the subplan available when it initializes. That in turn means that ExecInitModifyTable must allow the case (which actually it needed to do anyway, since there's no guarantee that ModifyTable is exactly at the top of the CTE plan tree). So move the complaint about not being allowed in EvalPlanQual mode to execution instead of initialization. Testing turned up yet another problem, which is that we'd try to re-initialize the result relation's index list, leading to leaks and dangling pointers. Per report from Phil Sorber. Back-patch to 9.1 where data-modifying CTEs were introduced.	2012-01-28 17:43:57 -05:00


33275 Commits