postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-12-27 08:39:28 +08:00

Author	SHA1	Message	Date
Robert Haas	d02c0ddb15	Fix missing parentheses resulting in wrong order of dereference. This could result in referencing uninitialized memory. Michael Paquier, in response to a complaint from Andres Freund	2014-01-15 11:00:50 -05:00
Robert Haas	b5763cedde	test_shm_mq: Fix checks for negative queue size. Noted while addressing compiler warnings pointed out on pgsql-hackers.	2014-01-15 10:51:06 -05:00
Robert Haas	d89746c7c5	Fix compiler warning. Kevin Gritter reports that his compiler complains about inq and outq being possibly-uninitialized at the point where they are passed to shm_mq_attach(). They are initialized by the call to setup_dynamic_shared_memory, but apparently his compiler is inlining that function and then having doubts about whether the for loop will always execute at least once. Fix by initializing them to NULL.	2014-01-15 10:24:28 -05:00
Robert Haas	be361ef2a0	Fix compiler warning: Size isn't 64 bits on 32 bit platforms. Report by Peter Eisentraut.	2014-01-15 08:31:04 -05:00
Tom Lane	5df99f6481	Improve FILES section of psql reference page. Primarily, explain where to find the system-wide psqlrc file, per recent gripe from John Sutton. Do some general wordsmithing and improve the markup, too. Also adjust psqlrc.sample so its comments about file location are somewhat trustworthy. (Not sure why we bother with this file when it's empty, but whatever.) Back-patch to 9.2 where the startup file naming scheme was last changed.	2014-01-14 19:27:57 -05:00
Tom Lane	061b079f89	Fix multiple bugs in index page locking during hot-standby WAL replay. In ordinary operation, VACUUM must be careful to take a cleanup lock on each leaf page of a btree index; this ensures that no indexscans could still be "in flight" to heap tuples due to be deleted. (Because of possible index-tuple motion due to concurrent page splits, it's not enough to lock only the pages we're deleting index tuples from.) In Hot Standby, the WAL replay process must likewise lock every leaf page. There were several bugs in the code for that: * The replay scan might come across unused, all-zero pages in the index. While btree_xlog_vacuum itself did the right thing (ie, nothing) with such pages, xlogutils.c supposed that such pages must be corrupt and would throw an error. This accounts for various reports of replication failures with "PANIC: WAL contains references to invalid pages". To fix, add a ReadBufferMode value that instructs XLogReadBufferExtended not to complain when we're doing this. * btree_xlog_vacuum performed the extra locking if standbyState == STANDBY_SNAPSHOT_READY, but that's not the correct test: we won't open up for hot standby queries until the database has reached consistency, and we don't want to do the extra locking till then either, for fear of reading corrupted pages (which bufmgr.c would complain about). Fix by exporting a new function from xlog.c that will report whether we're actually in hot standby replay mode. * To ensure full coverage of the index in the replay scan, btvacuumscan would emit a dummy WAL record for the last page of the index, if no vacuuming work had been done on that page. However, if the last page of the index is all-zero, that would result in corruption of said page, since the functions called on it weren't prepared to handle that case. There's no need to lock any such pages, so change the logic to target the last normal leaf page instead. The first two of these bugs were diagnosed by Andres Freund, the other one by me. Fixes based on ideas from Heikki Linnakangas and myself. This has been wrong since Hot Standby was introduced, so back-patch to 9.0.	2014-01-14 17:35:21 -05:00
Robert Haas	16cad3e867	Documentation for test_shm_mq. Commit `4db3744f1f` added this contrib module but neglected to document it. Oops.	2014-01-14 15:58:15 -05:00
Robert Haas	b682709451	Mention that VACUUM FREEZE also effectively zeroes the table freeze age. Maciek Sakrejda, reviewed by Amit Kapila	2014-01-14 14:48:57 -05:00
Robert Haas	246a9a8d0c	Fix typo in comment. Etsuro Fujita	2014-01-14 14:34:57 -05:00
Robert Haas	4db3744f1f	Test code for shared memory message queue facility. This code is intended as a demonstration of how the dynamic shared memory and dynamic background worker facilities can be used to establish a group of coooperating processes which can coordinate their activities using the shared memory message queue facility. By itself, the code does nothing particularly interesting: it simply allows messages to be passed through a loop of workers and back to the original process. But it's a useful unit test, in addition to its demonstration value.	2014-01-14 12:24:12 -05:00
Robert Haas	ec9037df26	Single-reader, single-writer, lightweight shared message queue. This code provides infrastructure for user backends to communicate relatively easily with background workers. The message queue is structured as a ring buffer and allows messages of arbitary length to be sent and received. Patch by me. Review by KaiGai Kohei and Andres Freund.	2014-01-14 12:23:22 -05:00
Robert Haas	6ddd5137b2	Simple table of contents for a shared memory segment. This interface is intended to make it simple to divide a dynamic shared memory segment into different regions with distinct purposes. It therefore serves much the same purpose that ShmemIndex accomplishes for the main shared memory segment, but it is intended to be more lightweight. Patch by me. Review by Andres Freund.	2014-01-14 12:18:58 -05:00
Robert Haas	05ff5062da	Code improvements for ALTER SYSTEM .. SET. Move FreeConfigVariables() later to make sure ErrorConfFile is valid when we use it, and get rid of an unnecessary string copy operation. Amit Kapila, kibitzed by me.	2014-01-13 14:54:00 -05:00
Robert Haas	2bb1f14b89	Make bitmap heap scans show exact/lossy block info in EXPLAIN ANALYZE. Etsuro Fujita	2014-01-13 14:42:16 -05:00
Tom Lane	c3ccc9ee58	Fix possible buffer overrun in contrib/pg_trgm. Allow for the possibility that folding a string to lower case makes it longer (due to replacing a character with a longer multibyte character). This doesn't change the number of trigrams that will be extracted, but it does affect the required size of an intermediate buffer in generate_trgm(). Per bug #8821 from Ufuk Kayserilioglu. Also install some checks that the input string length is not so large as to cause overflow in the calculations of palloc request sizes. Back-patch to all supported versions.	2014-01-13 13:07:10 -05:00
Heikki Linnakangas	866a1f0923	Fix calculation of ISMN check digit. This has always been broken, so back-patch to all supported versions. Fabien COELHO	2014-01-13 15:43:29 +02:00
Heikki Linnakangas	04038148b6	Add OVERLAPS to index in the docs. Per report from Adam Mackler and Jonathan Katz	2014-01-13 15:16:26 +02:00
Michael Meskes	976a7d1156	Always use the same way to addres a descriptor in ecpg's regression tests.	2014-01-13 10:41:53 +01:00
Bruce Momjian	bb953ad164	Fix pg_dumpall on pre-8.1 servers rolname did not exist in pg_shadow. Backpatch to 9.3 Report by Andrew Gierth via IRC	2014-01-12 22:25:36 -05:00
Tom Lane	158b7fa6a3	Disallow LATERAL references to the target table of an UPDATE/DELETE. On second thought, commit `0c051c9008` was over-hasty: rather than allowing this case, we ought to reject it for now. That leaves the field clear for a future feature that allows the target table to be re-specified in the FROM (or USING) clause, which will enable left-joining the target table to something else. We can then also allow LATERAL references to such an explicitly re-specified target table. But allowing them right now will create ambiguities or worse for such a feature, and it isn't something we documented 9.3 as supporting. While at it, add a convenience subroutine to avoid having several copies of the ereport for disalllowed-LATERAL-reference cases.	2014-01-11 19:03:12 -05:00
Tom Lane	910bac5953	Fix possible crashes due to using elog/ereport too early in startup. Per reports from Andres Freund and Luke Campbell, a server failure during set_pglocale_pgservice results in a segfault rather than a useful error message, because the infrastructure needed to use ereport hasn't been initialized; specifically, MemoryContextInit hasn't been called. One known cause of this is starting the server in a directory it doesn't have permission to read. We could try to prevent set_pglocale_pgservice from using anything that depends on palloc or elog, but that would be messy, and the odds of future breakage seem high. Moreover there are other things being called in main.c that look likely to use palloc or elog too --- perhaps those things shouldn't be there, but they are there today. The best solution seems to be to move the call of MemoryContextInit to very early in the backend's real main() function. I've verified that an elog or ereport occurring immediately after that is now capable of sending something useful to stderr. I also added code to elog.c to print something intelligible rather than just crashing if MemoryContextInit hasn't created the ErrorContext. This could happen if MemoryContextInit itself fails (due to malloc failure), and provides some future-proofing against someone trying to sneak in new code even earlier in server startup. Back-patch to all supported branches. Since we've only heard reports of this type of failure recently, it may be that some recent change has made it more likely to see a crash of this kind; but it sure looks like it's broken all the way back.	2014-01-11 16:36:07 -05:00
Bruce Momjian	d84c584ece	Revert `fd2ace8028` Seems we want to document '=' plpgsql assignment instead.	2014-01-11 14:00:47 -05:00
Tom Lane	6286526207	Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD. The standard typanalyze functions skip over values whose detoasted size exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during ANALYZE. However, we (I think I, actually :-() failed to consider the possibility that every non-null value in a column is too wide. While compute_minimal_stats() seems to behave reasonably anyway in such a case, compute_scalar_stats() just fell through and generated no pg_statistic entry at all. That's unnecessarily pessimistic: we can still produce valid stanullfrac and stawidth values in such cases, since we do include too-wide values in the average-width calculation. Furthermore, since the general assumption in this code is that too-wide values are probably all distinct from each other, it seems reasonable to set stadistinct to -1 ("all distinct"). Per complaint from Kadri Raudsepp. This has been like this since roughly neolithic times, so back-patch to all supported branches.	2014-01-11 13:42:42 -05:00
Bruce Momjian	fd2ace8028	docs: remove undocumented assign syntax in plpgsql examples Pavel Stehule	2014-01-11 13:41:08 -05:00
Tom Lane	28233ffaa4	Add another regression test cross-checking operator and function comments. Add a query that lists all the functions that are operator implementation functions and have a SQL comment that doesn't just say "implementation of XYZ operator". (Note that the preceding test checks that such functions' comments exactly match the corresponding operators' comments.) While it's not forbidden to add more functions to this list, that should only be done when we're encouraging users to use either the function or operator syntax for the functionality, which is a fairly rare situation.	2014-01-11 00:16:08 -05:00
Andrew Dunstan	11829ff8b2	Remove DESCR entries for json operator functions. Per -hackers discussion.	2014-01-10 22:25:04 -05:00
Bruce Momjian	850ade3e32	Adjust pg_upgrade for move of username lookup functions to /common	2014-01-10 20:56:47 -05:00
Bruce Momjian	111022eac6	Move username lookup functions from /port to /common Per suggestion from Peter E and Alvaro	2014-01-10 18:03:28 -05:00
Alvaro Herrera	423e1211a8	Accept pg_upgraded tuples during multixact freezing The new MultiXact freezing routines introduced by commit `8e9a16ab8f` neglected to consider tuples that came from a pg_upgrade'd database; a vacuum run that tried to freeze such tuples would die with an error such as ERROR: MultiXactId 11415437 does no longer exist -- apparent wraparound To fix, ensure that GetMultiXactIdMembers is allowed to return empty multis when the infomask bits are right, as is done in other callsites. Per trouble report from F-Secure. In passing, fix a copy&paste bug reported by Andrey Karpov from VIVA64 from their PVS-Studio static checked, that instead of setting relminmxid to Invalid, we were setting relfrozenxid twice. Not an important mistake because that code branch is about relations for which we don't use the frozenxid/minmxid values at all in the first place, but seems to warrants a fix nonetheless.	2014-01-10 18:03:18 -03:00
Tom Lane	faab7a957d	Remove unnecessary local variables to work around an icc optimization bug. Buildfarm member dunlin has been crashing since commit `8b49a60`, but other machines seem fine with that code. It turns out that removing the local variables in ordered_set_startup() that are copies of fields in "qstate" dodges the problem. This might cost a few cycles on register-rich machines, but it's probably a wash on others, and in any case this code isn't performance-critical. Thanks to Jeremy Drake for off-list investigation.	2014-01-09 12:59:55 -05:00
Michael Meskes	192b4aacad	Changed regression test to ecpg test suite for alignment problem just with last commit.	2014-01-09 16:20:19 +01:00
Michael Meskes	d685e24249	Fix descriptor output in ECPG. While working on most platforms the old way sometimes created alignment problems. This should fix it. Also the regresion tests were updated to test for the reported case. Report and fix by MauMau <maumau307@gmail.com>	2014-01-09 16:20:19 +01:00
Heikki Linnakangas	c945af80cf	Refactor checking whether we've reached the recovery target. Makes the replay loop slightly more readable, by separating the concerns of whether to stop and whether to delay, and how to extract the timestamp from a record. This has the user-visible change that the timestamp of the last applied record is now updated after actually applying it. Before, it was updated just before applying it. That meant that pg_last_xact_replay_timestamp() could return the timestamp of a commit record that is in process of being replayed, but not yet applied. Normally the difference is small, but if min_recovery_apply_delay is set, there could be a significant delay between reading a record and applying it. Another behavioral change is that if you recover to a restore point, we stop after the restore point record, not before it. It makes no difference as far as running queries on the server is concerned, as applying a restore point record changes nothing, but if examine the timeline history you will see that the new timeline branched off just after the restore point record, not before it. One practical consequence is that if you do PITR to the new timeline, and set recovery target to the same named restore point again, it will find and stop recovery at the same restore point. Conceptually, I think it makes more sense to consider the restore point as part of the new timeline's history than not. In principle, setting the last-replayed timestamp before actually applying the record was a bug all along, but it doesn't seem worth the risk to backpatch, since min_recovery_apply_delay was only added in 9.4.	2014-01-09 14:00:39 +02:00
Peter Eisentraut	10a3b165a3	pgcrypto: Make header files stand alone pgp.h used to require including mbuf.h and px.h first. Include those in pgp.h, so that it can be used without prerequisites. Remove mbuf.h inclusions in .c files where mbuf.h features are not used directly. (px.h was always used.)	2014-01-09 06:44:24 -05:00
Tom Lane	220b34331f	We don't need to include pg_sema.h in s_lock.h anymore. Minor improvement to commit `daa7527afc`: s_lock.h no longer has any need to mention PGSemaphoreData, so we can rip out the #include that supplies that. In a non-HAVE_SPINLOCKS build, this doesn't really buy much since we still need the #include in spin.h --- but everywhere else, this reduces #include footprint by some trifle, and helps keep the different locking facilities separate.	2014-01-08 20:58:22 -05:00
Tom Lane	080b7db72e	Fix "cannot accept a set" error when only some arms of a CASE return a set. In commit `c1352052ef`, I implemented an optimization that assumed that a function's argument expressions would either always return a set (ie multiple rows), or always not. This is wrong however: we allow CASE expressions in which some arms return a set of some type and others just return a scalar of that type. There may be other examples as well. To fix, replace the run-time test of whether an argument returned a set with a static precheck (expression_returns_set). This adds a little bit of query startup overhead, but it seems barely measurable. Per bug #8228 from David Johnston. This has been broken since 8.0, so patch all supported branches.	2014-01-08 20:18:58 -05:00
Robert Haas	daa7527afc	Reduce the number of semaphores used under --disable-spinlocks. Instead of allocating a semaphore from the operating system for every spinlock, allocate a fixed number of semaphores (by default, 1024) from the operating system and multiplex all the spinlocks that get created onto them. This could self-deadlock if a process attempted to acquire more than one spinlock at a time, but since processes aren't supposed to execute anything other than short stretches of straight-line code while holding a spinlock, that shouldn't happen. One motivation for this change is that, with the introduction of dynamic shared memory, it may be desirable to create spinlocks that last for less than the lifetime of the server. Without this change, attempting to use such facilities under --disable-spinlocks would quickly exhaust any supply of available semaphores. Quite apart from that, it's desirable to contain the quantity of semaphores needed to run the server simply on convenience grounds, since using too many may make it harder to get PostgreSQL running on a new platform, which is mostly the point of --disable-spinlocks in the first place. Patch by me; review by Tom Lane.	2014-01-08 18:58:00 -05:00
Heikki Linnakangas	3739e5ab93	Fix pause_at_recovery_target + recovery_target_inclusive combination. If pause_at_recovery_target is set, recovery pauses before applying the target record, even if recovery_target_inclusive is set. If you then continue with pg_xlog_replay_resume(), it will apply the target record before ending recovery. In other words, if you log in while it's paused and verify that the database looks OK, ending recovery changes its state again, possibly destroying data that you were tring to salvage with PITR. Backpatch to 9.1, this has been broken since pause_at_recovery_target was added.	2014-01-08 23:28:52 +02:00
Heikki Linnakangas	815d71deed	If multiple recovery_targets are specified, use the latest one. The docs say that only one of recovery_target_xid, recovery_target_time, or recovery_target_name can be specified. But the code actually did something different, so that a name overrode time, and xid overrode both time and name. Now the target specified last takes effect, whether it's an xid, time or name. With this patch, we still accept multiple recovery_target settings, even though docs say that only one can be specified. It's a general property of the recovery.conf file parser that you if you specify the same option twice, the last one takes effect, like with postgresql.conf.	2014-01-08 22:26:39 +02:00
Tom Lane	847e46abc9	Avoid extra AggCheckCallContext() checks in ordered-set aggregates. In the transition functions, we don't really need to recheck this after the first call. I had been feeling paranoid about possibly getting a non-null argument value in some other context; but it's probably game over anyway if we have a non-null "internal" value that's not what we are expecting. In the final functions, the general convention in pre-existing final functions seems to be that an Assert() is good enough, so do it like that here too. This seems to save a few tenths of a percent of overall query runtime, which isn't much, but still it's just overhead if there's not a plausible case where the checks would fire.	2014-01-08 14:33:52 -05:00
Tom Lane	e6336b8b57	Save a few cycles in advance_transition_function(). Keep a pre-initialized FunctionCallInfoData in AggStatePerAggData, and re-use that at each row instead of doing InitFunctionCallInfoData each time. This saves only half a dozen assignments and maybe some stack manipulation, and yet that seems to be good for a percent or two of the overall query run time for simple aggregates such as count(*). The cost is that the FunctionCallInfoData (which is about a kilobyte, on 64-bit machines) stays allocated for the duration of the query instead of being short-lived stack data. But we're already paying an equivalent space cost for each regular FuncExpr or OpExpr node, so I don't feel bad about paying it for aggregate functions. The code seems a little cleaner this way too, since the number of things passed to advance_transition_function decreases.	2014-01-08 13:58:37 -05:00
Heikki Linnakangas	d59ff6c110	Fix bug in determining when recovery has reached consistency. When starting WAL replay from an online checkpoint, the last replayed WAL record variable was initialized using the checkpoint record's location, even though the records between the REDO location and the checkpoint record had not been replayed yet. That was noted as "slightly confusing" but harmless in the comment, but in some cases, it fooled CheckRecoveryConsistency to incorrectly conclude that we had already reached a consistent state immediately at the beginning of WAL replay. That caused the system to accept read-only connections in hot standby mode too early, and also PANICs with message "WAL contains references to invalid pages". Fix by initializing the variables to the REDO location instead. In 9.2 and above, change CheckRecoveryConsistency() to use lastReplayedEndRecPtr variable when checking if backup end location has been reached. It was inconsistently using EndRecPtr for that check, but lastReplayedEndRecPtr when checking min recovery point. It made no difference before this patch, because in all the places where CheckRecoveryConsistency was called the two variables were the same, but it was always an accident waiting to happen, and would have been wrong after this patch anyway. Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0, where hot standby was introduced.	2014-01-08 15:03:09 +02:00
Peter Eisentraut	ca607b155e	pg_upgrade: Fix fatal error handling Restore exiting when pg_log(PG_FATAL) is called directly instead of calling pg_fatal(). Fault introduced in `264aa14a2f`.	2014-01-08 07:01:16 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Tom Lane	0c051c9008	Fix LATERAL references to target table of UPDATE/DELETE. I failed to think much about UPDATE/DELETE when implementing LATERAL :-(. The implemented behavior ended up being that subqueries in the FROM or USING clause (respectively) could access the update/delete target table as though it were a lateral reference; which seems fine if they said LATERAL, but certainly ought to draw an error if they didn't. Fix it so you get a suitable error when you omit LATERAL. Per report from Emre Hasegeli.	2014-01-07 15:25:27 -05:00
Heikki Linnakangas	f68220df92	Silence compiler warning on MSVC. MSVC doesn't know that elog(ERROR) doesn't return, and gives a warning about missing return. Silence that. Amit Kapila	2014-01-07 21:49:15 +02:00
Magnus Hagander	9544cc0d65	Move permissions check from do_pg_start_backup to pg_start_backup And the same for do_pg_stop_backup. The code in do_pg_* is not allowed to access the catalogs. For manual base backups, the permissions check can be handled in the calling function, and for streaming base backups only users with the required permissions can get past the authentication step in the first place. Reported by Antonin Houska, diagnosed by Andres Freund	2014-01-07 17:50:56 +01:00
Magnus Hagander	b168c5ef27	Avoid including tablespaces inside PGDATA twice in base backups If a tablespace was crated inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander	2014-01-07 17:11:32 +01:00
Peter Eisentraut	edc43458d7	Add more use of psprintf()	2014-01-06 21:30:26 -05:00
Heikki Linnakangas	10a82cda67	Remove bogus -K option from pg_dump. I added it to the getopt call by accident in commit `691e595dd9`. Amit Kapila	2014-01-06 12:30:19 +02:00

1 2 3 4 5 ...

36036 Commits