postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2025-02-11 19:20:40 +08:00

Author	SHA1	Message	Date
Tom Lane	ed7eff89fd	Guard against using plperl's Makefile without specifying --with-perl. The $(PERL) macro will be set by configure if it finds perl at all, but $(perl_privlibexp) isn't configured unless you said --with-perl. This results in confusing error messages if someone cd's into src/pl/plperl and tries to build there despite the configure omission, as reported by Tomas Vondra in bug #6198. Add simple checks to provide a more useful report, while not disabling other use of the makefile such as "make clean". Back-patch to 9.0, which is as far as the patch applies easily.	2011-09-04 20:07:42 -04:00
Tom Lane	0962182f01	Fix typo in pg_srand48 (srand48 in older branches). ">" should be ">>". This typo results in failure to use all of the bits of the provided seed. This might rise to the level of a security bug if we were relying on srand48 for any security-critical purposes, but we are not --- in fact, it's not used at all unless the platform lacks srandom(), which is improbable. Even on such a platform the exposure seems minimal. Reported privately by Andres Freund.	2011-09-03 16:17:44 -04:00
Michael Meskes	2cda30e757	Fix brace indentation of commit `f8c7442201` to fit PostgreSQL style.	2011-09-02 09:48:19 +02:00
Michael Meskes	f8c7442201	In ecpglib restore LC_NUMERIC in case of an error.	2011-09-01 15:31:16 +02:00
Heikki Linnakangas	a02e409904	Move the line to undefine setlocale() macro on Win32 outside USE_REPL_SNPRINTF ifdef block. It has nothing to do with whether the replacement snprintf function is used. It caused no live bug, because the replacement snprintf function is always used on Win32, but it was nevertheless misplaced.	2011-09-01 09:18:27 +03:00
Tom Lane	3505862a8d	Further repair of eqjoinsel ndistinct-clamping logic. Examination of examples provided by Mark Kirkwood and others has convinced me that actually commit `7f3eba30c9` was quite a few bricks shy of a load. The useful part of that patch was clamping ndistinct for the inner side of a semi or anti join, and the reason why that's needed is that it's the only way that restriction clauses eliminating rows from the inner relation can affect the estimated size of the join result. I had not clearly understood why the clamping was appropriate, and so mis-extrapolated to conclude that we should clamp ndistinct for the outer side too, as well as for both sides of regular joins. These latter actions were all wrong, and are reverted with this patch. In addition, the clamping logic is now made to affect the behavior of both paths in eqjoinsel_semi, with or without MCV lists to compare. When we have MCVs, we suppose that the most common values are the ones that are most likely to survive the decimation resulting from a lower restriction clause, so we think of the clamping as eliminating non-MCV values, or potentially even the least-common MCVs for the inner relation. Back-patch to 8.4, same as previous fixes in this area.	2011-09-01 00:20:05 -04:00
Bruce Momjian	e724b969d8	Fix pg_upgrade to preserve toast relfrozenxids for old 8.3 servers. This fixes a pg_upgrade bug that could lead to query errors when clog files are improperly removed. Backpatch to 8.4, 9.0, 9.1.	2011-08-31 21:50:00 -04:00
Tom Lane	53434c6f0d	Improve eqjoinsel's ndistinct clamping to work for multiple levels of join. This patch fixes an oversight in my commit `7f3eba30c9` of 2008-10-23. That patch accounted for baserel restriction clauses that reduced the number of rows coming out of a table (and hence the number of possibly-distinct values of a join variable), but not for join restriction clauses that might have been applied at a lower level of join. To account for the latter, look up the sizes of the min_lefthand and min_righthand inputs of the current join, and clamp with those in the same way as for the base relations. Noted while investigating a complaint from Ben Chobot, although this in itself doesn't seem to explain his report. Back-patch to 8.4; previous versions used different estimation methods for which this heuristic isn't relevant.	2011-08-31 16:04:58 -04:00
Tom Lane	047f205f4e	Fix a missed case in code for "moving average" estimate of reltuples. It is possible for VACUUM to scan no pages at all, if the visibility map shows that all pages are all-visible. In this situation VACUUM has no new information to report about the relation's tuple density, so it wasn't changing pg_class.reltuples ... but it updated pg_class.relpages anyway. That's wrong in general, since there is no evidence to justify changing the density ratio reltuples/relpages, but it's particularly bad if the previous state was relpages=reltuples=0, which means "unknown tuple density". We just replaced "unknown" with "zero". ANALYZE would eventually recover from this, but it could take a lot of repetitions of ANALYZE to do so if the relation size is much larger than the maximum number of pages ANALYZE will scan, because of the moving-average behavior introduced by commit `b4b6923e03`. The only known situation where we could have relpages=reltuples=0 and yet the visibility map asserts everything's visible is immediately following a pg_upgrade. It might be advisable for pg_upgrade to try to preserve the relpages/reltuples statistics; but in any case this code is wrong on its own terms, so fix it. Per report from Sergey Koposov. Back-patch to 8.4, where the visibility map was introduced, same as the previous change.	2011-08-30 14:49:57 -04:00
Tom Lane	2de0fdeb68	Actually, all of parallel restore's limitations should be tested earlier. On closer inspection, whining in restore_toc_entries_parallel is really much too late for any user-facing error case. The right place to do it is at the start of RestoreArchive(), before we've done anything interesting (such as trying to DROP all the targets ...) Back-patch to 8.4, where parallel restore was introduced.	2011-08-28 22:28:10 -04:00
Tom Lane	fbf776a2eb	Be more user-friendly about unsupported cases for parallel pg_restore. If we are unable to do a parallel restore because the input file is stdin or is otherwise unseekable, we should complain and fail immediately, not after having done some of the restore. Complaining once per thread isn't so cool either, and the messages should be worded to make it clear this is an unsupported case not some weird race-condition bug. Per complaint from Lonni Friedman. Back-patch to 8.4, where parallel restore was introduced.	2011-08-28 21:49:21 -04:00
Tom Lane	42de04f6ae	Don't assume that "E" response to NEGOTIATE_SSL_CODE means pre-7.0 server. These days, such a response is far more likely to signify a server-side problem, such as fork failure. Reporting "server does not support SSL" (in sslmode=require) could be quite misleading. But the results could be even worse in sslmode=prefer: if the problem was transient and the next connection attempt succeeds, we'll have silently fallen back to protocol version 2.0, possibly disabling features the user needs. Hence, it seems best to just eliminate the assumption that backing off to non-SSL/2.0 protocol is the way to recover from an "E" response, and instead treat the server error the same as we would in non-SSL cases. I tested this change against a pre-7.0 server, and found that there was a second logic bug in the "prefer" path: the test to decide whether to make a fallback connection attempt assumed that we must have opened conn->ssl, which in fact does not happen given an "E" response. After fixing that, the code does indeed connect successfully to pre-7.0, as long as you didn't set sslmode=require. (If you did, you get "Unsupported frontend protocol", which isn't completely off base given the server certainly doesn't support SSL.) Since there seems no reason to believe that pre-7.0 servers exist anymore in the wild, back-patch to all supported branches.	2011-08-27 16:37:08 -04:00
Tom Lane	431b638045	Ensure we discard unread/unsent data when abandoning a connection attempt. There are assorted situations wherein PQconnectPoll() will abandon a connection attempt and try again with different parameters (eg, SSL versus not SSL). However, the code forgot to discard any pending data in libpq's I/O buffers when doing this. In at least one case (server returns E message during SSL negotiation), there is unread input data which bollixes the next connection attempt. I have not checked to see whether this is possible in the other cases where we close the socket and retry, but it seems like a matter of good defensive programming to add explicit buffer-flushing code to all of them. This is one of several issues exposed by Daniel Farina's report of misbehavior after a server-side fork failure. This has been wrong since forever, so back-patch to all supported branches.	2011-08-27 14:16:25 -04:00
Tom Lane	20139f4f1c	Fix potential memory clobber in tsvector_concat(). tsvector_concat() allocated its result workspace using the "conservative" estimate of the sum of the two input tsvectors' sizes. Unfortunately that wasn't so conservative as all that, because it supposed that the number of pad bytes required could not grow. Which it can, as per test case from Jesper Krogh, if there's a mix of lexemes with positions and lexemes without them in the input data. The fix is to assume that we might add a not-previously-present pad byte for each and every lexeme in the two inputs; which really is conservative, but it doesn't seem worthwhile to try to be more precise. This is an aboriginal bug in tsvector_concat, so back-patch to all versions containing it.	2011-08-26 16:51:46 -04:00
Bruce Momjian	df957a79cc	In pg_upgrade, limit schema name filter to include toast tables. Bug introduced recently when trying to filter out temp tables. Backpatch to 9.0 and 9.1.	2011-08-26 00:12:39 -04:00
Tom Lane	9354f5b76a	Fix psql lexer to avoid use of backtracking. Per previous experimentation, backtracking slows down lexing performance significantly (by about a third). It's usually pretty easy to avoid, just need to have rules that accept an incomplete construct and do whatever the lexer would have done otherwise. The backtracking was introduced by the patch that added quoted variable substitution. Back-patch to 9.0 where that was added.	2011-08-25 14:33:37 -04:00
Robert Haas	b7cd5c836a	Properly quote SQL/MED generic options in pg_dump output. Shigeru Hanada	2011-08-25 12:38:06 -04:00
Tom Lane	8a32c94658	Fix pgstatindex() to give consistent results for empty indexes. For an empty index, the pgstatindex() function would compute 0.0/0.0 for its avg_leaf_density and leaf_fragmentation outputs. On machines that follow the IEEE float arithmetic standard with any care, that results in a NaN. However, per report from Rushabh Lathia, Microsoft couldn't manage to get this right, so you'd get a bizarre error on Windows. Fix by forcing the results to be NaN explicitly, rather than relying on the division operator to give that or the snprintf function to print it correctly. I have some doubts that this is really the most useful definition, but it seems better to remain backward-compatible with those platforms for which the behavior wasn't completely broken. Back-patch to 8.2, since the code is like that in all current releases.	2011-08-24 23:50:20 -04:00
Heikki Linnakangas	7ec0258091	Add recovery.conf to the index in the user manual. Fujii Masao	2011-08-23 11:57:43 +03:00
Tom Lane	52120ee834	Fix trigger WHEN conditions when both BEFORE and AFTER triggers exist. Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger fired for the same update. Fix by not trying to overload the use of estate->es_trig_tuple_slot. Per report from Yoran Heling. Back-patch to 9.0, when trigger WHEN conditions were introduced.	2011-08-21 18:16:08 -04:00
Tom Lane	706493a1f7	Fix performance problem when building a lossy tidbitmap. As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can make building a large tidbitmap into an O(N^2) operation. To fix, make sure we remove more than the minimum amount of information per call, and add a fallback path to behave sanely if we're unable to fit the bitmap within the requested amount of memory. This has been wrong since the tidbitmap code was written, so back-patch to all supported branches.	2011-08-20 14:51:32 -04:00
Peter Eisentraut	d4c24254fa	Change PyInit_plpy to external linkage. Module initialization functions in Python 3 must have external linkage, because PyMODINIT_FUNC does dllexport on Windows-like platforms. Without this change, the build with Python 3 fails on Windows.	2011-08-18 13:47:35 +03:00
Tom Lane	1853e120f7	Forget about targeting catalog cache invalidations by tuple TID. The TID isn't stable enough: we might queue an sinval event before a VACUUM FULL, and then process it afterwards, when the target tuple no longer has the same TID. So we must invalidate entries on the basis of hash value only. The old coding can be shown to result in various bizarre, hard-to-reproduce errors in the presence of concurrent VACUUM FULLs on system catalogs, and could easily result in permanent catalog corruption, up to and including complete loss of tables. This commit is just a minimal fix that removes the unsafe comparison. We should remove transmission of the tuple TID from sinval messages altogether, and then arrange to suppress the extra message in the common case of a heap_update that doesn't change the key hashvalue. But that's going to be much more invasive, and will only produce a probably-marginal performance gain, so it doesn't seem like material for a back-patch. Back-patch to 9.0. Before that, VACUUM FULL refused to do any tuple moving if it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and CLUSTER would give up altogether), so there was no risk of moving a tuple that might be the subject of an unsent sinval message.	2011-08-16 15:26:35 -04:00
Tom Lane	38ef2e2fba	Fix incorrect order of operations during sinval reset processing. We have to be sure that we have revalidated each nailed-in-cache relcache entry before we try to use it to load data for some other relcache entry. The introduction of "mapped relations" in 9.0 broke this, because although we updated the state kept in relmapper.c early enough, we failed to propagate that information into relcache entries soon enough; in particular, we could try to fetch pg_class rows out of pg_class before we'd updated its relcache entry's rd_node.relNode value from the map. This bug accounts for Dave Gould's report of failures after "vacuum full pg_class", and I believe that there is risk for other system catalogs as well. The core part of the fix is to copy relmapper data into the relcache entries during "phase 1" in RelationCacheInvalidate(), before they'll be used in "phase 2". To try to future-proof the code against other similar bugs, I also rearranged the order in which nailed relations are visited during phase 2: now it's pg_class first, then pg_class_oid_index, then other nailed relations. This should ensure that RelationClearRelation can apply RelationReloadIndexInfo to all nailed indexes without risking use of not-yet-revalidated relcache entries. Back-patch to 9.0 where the relation mapper was introduced.	2011-08-16 14:38:35 -04:00
Tom Lane	44b6d53b46	Preserve toast value OIDs in toast-swap-by-content for CLUSTER/VACUUM FULL. This works around the problem that a catalog cache entry might contain a toast pointer that we try to dereference just as a VACUUM FULL completes on that catalog. We will see the sinval message on the cache entry when we acquire lock on the toast table, but by that point we've already told tuptoaster.c "here's the pointer to fetch", so it's difficult from a code structural standpoint to update the pointer before we use it. Much less painful to ensure that toast pointers are not invalidated in the first place. We have to add a bit of code to deal with the case that a value that previously wasn't toasted becomes so; but that should be a seldom-exercised corner case, so the inefficiency shouldn't be significant. Back-patch to 9.0. In prior versions, we didn't allow CLUSTER on system catalogs, and VACUUM FULL didn't result in reassignment of toast OIDs, so there was no problem.	2011-08-16 13:48:16 -04:00
Tom Lane	93519b0c62	Fix race condition in relcache init file invalidation. The previous code tried to synchronize by unlinking the init file twice, but that doesn't actually work: it leaves a window wherein a third process could read the already-stale init file but miss the SI messages that would tell it the data is stale. The result would be bizarre failures in catalog accesses, typically "could not read block 0 in file ..." later during startup. Instead, hold RelCacheInitLock across both the unlink and the sending of the SI messages. This is more straightforward, and might even be a bit faster since only one unlink call is needed. This has been wrong since it was put in (in 2002!), so back-patch to all supported releases.	2011-08-16 13:12:10 -04:00
Bruce Momjian	f239ec5727	In pg_upgrade, avoid dumping orphaned temporary tables. This makes the pg_upgrade schema matching pattern match pg_dump/pg_dumpall. Fix for 9.0, 9.1, and 9.2.	2011-08-15 22:39:38 -04:00
Tom Lane	5707f35559	Fix unsafe order of operations in foreign-table DDL commands. When updating or deleting a system catalog tuple, it's necessary to acquire RowExclusiveLock on the catalog before looking up the tuple; otherwise a concurrent VACUUM FULL on the catalog might move the tuple to a different TID before we can apply the update. Coding patterns that find the tuple via a table scan aren't at risk here, but when obtaining the tuple from a catalog cache, correct ordering is important; and several routines in foreigncmds.c got it wrong. Noted while running the regression tests in parallel with VACUUM FULL of assorted system catalogs. For consistency I moved all the heap_open calls to the starts of their functions, including a couple for which there was no actual bug. Back-patch to 8.4 where foreigncmds.c was added.	2011-08-14 15:40:36 -04:00
Tom Lane	739cbdd9c3	Fix incorrect timeout handling during initial authentication transaction. The statement start timestamp was not set before initiating the transaction that is used to look up client authentication information in pg_authid. In consequence, enable_sig_alarm computed a wrong value (far in the past) for statement_fin_time. That didn't have any immediate effect, because the timeout alarm was set without reference to statement_fin_time; but if we subsequently blocked on a lock for a short time, CheckStatementTimeout would consult the bogus value when we cancelled the lock timeout wait, and then conclude we'd timed out, leading to immediate failure of the connection attempt. Thus an innocent "vacuum full pg_authid" would cause failures of concurrent connection attempts. Noted while testing other, more serious consequences of vacuum full on system catalogs. We should set the statement timestamp before StartTransactionCommand(), so that the transaction start timestamp is also valid. I'm not sure if there are any non-cosmetic effects of it not being valid, but the xact timestamp is at least sent to the statistics machinery. Back-patch to 9.0. Before that, the client authentication timeout was done outside any transaction and did not depend on this state to be valid.	2011-08-13 17:52:52 -04:00
Tom Lane	8a14bdb10f	Fix nested PlaceHolderVar expressions that appear only in targetlists. A PlaceHolderVar's expression might contain another, lower-level PlaceHolderVar. If the outer PlaceHolderVar is used, the inner one certainly will be also, and so we have to make sure that both of them get into the placeholder_list with correct ph_may_need values during the initial pre-scan of the query (before deconstruct_jointree starts). We did this correctly for PlaceHolderVars appearing in the query quals, but overlooked the issue for those appearing in the top-level targetlist; with the result that nested placeholders referenced only in the targetlist did not work correctly, as illustrated in bug #6154. While at it, add some error checking to find_placeholder_info to ensure that we don't try to create new placeholders after it's too late to do so; they have to all be created before deconstruct_jointree starts. Back-patch to 8.4 where the PlaceHolderVar mechanism was introduced.	2011-08-09 00:49:04 -04:00
Tom Lane	f60078232d	Fix thinko in documentation of local_preload_libraries. Somebody added a cross-reference to shared_preload_libraries, but wrote the wrong variable name when they did it (and didn't bother to make it a link either). Spotted by Christoph Anton Mitterer.	2011-08-05 21:18:23 -04:00
Bruce Momjian	082f906334	Fix markup for recent wal_level clarification. Backpatch to 9.1 and 9.0.	2011-08-04 15:02:03 -04:00
Bruce Momjian	072e6076d1	In documentation, clarify which commands have reduced WAL volume for wal_level = minimum. Backpatch to 9.1 and 9.0.	2011-08-04 12:06:54 -04:00
Tom Lane	d3061f036d	Move CheckRecoveryConflictDeadlock() call to a safer place. This kluge was inserted in a spot apparently chosen at random: the lock manager's state is not yet fully set up for the wait, and in particular LockWaitCancel hasn't been armed by setting lockAwaited, so the ProcLock will not get cleaned up if the ereport is thrown. This seems to not cause any observable problem in trivial test cases, because LockReleaseAll will silently clean up the debris; but I was able to cause failures with tests involving subtransactions. Fixes breakage induced by commit `c85c941470`. Back-patch to all affected branches.	2011-08-02 15:16:44 -04:00
Tom Lane	0f904c95a4	Fix incorrect initialization of ProcGlobal->startupBufferPinWaitBufId. It was initialized in the wrong place and to the wrong value. With bad luck this could result in incorrect query-cancellation failures in hot standby sessions, should a HS backend be holding pin on buffer number 1 while trying to acquire a lock.	2011-08-02 13:24:06 -04:00
Heikki Linnakangas	f00fbad6bd	Avoid integer overflow when LIMIT + OFFSET >= 2^63. This fixes bug #6139 reported by Hitoshi Harada.	2011-08-02 11:30:38 +03:00
Tom Lane	78e957dd46	Fix pg_restore's direct-to-database mode for standard_conforming_strings. pg_backup_db.c contained a mini SQL lexer with which it tried to identify boundaries between SQL commands, but that code was not designed to cope with standard_conforming_strings, and would get the wrong answer if a backslash immediately precedes a closing single quote in such a string, as per report from Julian Mehnle. The bug only affects direct-to-database restores from archive files made with standard_conforming_strings = on. Rather than complicating the code some more to try to fix that, let's just rip it all out. The only reason it was needed was to cope with COPY data embedded into ordinary archive entries, which was a layout that was used only for about the first three weeks of the archive format's existence, and never in any production release of pg_dump. Instead, just rely on the archive file layout to tell us whether we're printing COPY data or not. This bug represents a data corruption hazard in all releases in which standard_conforming_strings can be turned on, ie 8.2 and later, so back-patch to all supported branches.	2011-07-28 14:07:09 -04:00
Robert Haas	bc9d2e7c4a	Fix typo. Noted by Josh Kupershmidt.	2011-07-27 11:21:05 -04:00
Peter Eisentraut	9df8ce8482	Add missing newlines at end of error messages	2011-07-26 23:28:44 +03:00
Robert Haas	6f8f9c2bdd	Clarify which relkinds accept column comments. Per discussion with Josh Kupershmidt.	2011-07-26 09:38:33 -04:00
Tom Lane	65c033cbe9	Fix previous patch so it also works if not USE_SSL (mea culpa). On balance, the need to cover this case changes my mind in favor of pushing all error-message generation duties into the two fe-secure.c routines. So do it that way.	2011-07-24 23:29:15 -04:00
Tom Lane	77e4fd5c4a	Improve libpq's error reporting for SSL failures. In many cases, pqsecure_read/pqsecure_write set up useful error messages, which were then overwritten with useless ones by their callers. Fix this by defining the responsibility to set an error message to be entirely that of the lower-level function when using SSL. Back-patch to 8.3; the code is too different in 8.2 to be worth the trouble.	2011-07-24 16:29:18 -04:00
Tom Lane	f0dadcc60b	Use OpenSSL's SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER flag. This disables an entirely unnecessary "sanity check" that causes failures in nonblocking mode, because OpenSSL complains if we move or compact the write buffer. The only actual requirement is that we not modify pending data once we've attempted to send it, which we don't. Per testing and research by Martin Pihlak, though this fix is a lot simpler than his patch. I put the same change into the backend, although it's less clear whether it's necessary there. We do use nonblock mode in some situations in streaming replication, so seems best to keep the same behavior in the backend as in libpq. Back-patch to all supported releases.	2011-07-24 15:18:02 -04:00
Tom Lane	fe0e1a633a	Fix PQsetvalue() to avoid possible crash when adding a new tuple. PQsetvalue unnecessarily duplicated the logic in pqAddTuple, and didn't duplicate it exactly either --- pqAddTuple does not care what is in the tuple-pointer array positions beyond the last valid entry, whereas the code in PQsetvalue assumed such positions would contain NULL. This led to possible crashes if PQsetvalue was applied to a PGresult that had previously been enlarged with pqAddTuple, for instance one built from a server query. Fix by relying on pqAddTuple instead of duplicating logic, and not assuming anything about the contents of res->tuples[res->ntups]. Back-patch to 8.4, where PQsetvalue was introduced. Andrew Chernow	2011-07-21 12:25:01 -04:00
Bruce Momjian	431b7b84fe	In pg_upgrade, fix the -l/log option to work on Windows. Also, double-quote the log file name in all places, to allow (on all platforms) log file names with spaces. Back patch to 9.0 and 9.1.	2011-07-20 18:31:08 -04:00
Michael Meskes	3089a3a101	Adapted expected result for latest change to ecpglib.	2011-07-18 19:03:51 +02:00
Michael Meskes	77a7a57f7f	Made ecpglib write double with a precision of 15 digits. Patch originally by Akira Kurosawa <kurosawa-akira@mxc.nes.nec.co.jp>.	2011-07-18 16:29:59 +02:00
Magnus Hagander	d662e3970d	Fix SSPI login when multiple roundtrips are required. This fixes SSPI login failures showing "The function requested is not supported", often showing up when connecting to localhost. The reason was not properly updating the SSPI handle when multiple roundtrips were required to complete the authentication sequence. Report and analysis by Ahmed Shinwari, patch by Magnus Hagander	2011-07-16 20:01:47 +02:00
Heikki Linnakangas	75f386df50	Fix two ancient bugs in GiST code to re-find a parent after page split: First, when following a right-link, we incorrectly marked the current page as the parent of the right sibling. In reality, the parent of the right page is the same as the parent of the current page (or some page to the right of it, gistFindCorrectParent() will sort that out). Secondly, when we follow a right-link, we must prepend, not append, the right page to our list of pages to visit. That's because we assume that once we hit a leaf page in the list, all the rest are leaf pages too, and give up. To hit these bugs, you need concurrent actions and several unlucky accidents. Another backend must split the root page, while you're in process of splitting a lower-level page. Furthermore, while you scan the internal nodes to re-find the parent, another backend needs to again split some more internal pages. Even then, the bugs don't necessarily manifest as user-visible errors or index corruption. While we're at it, make the error reporting a bit better if gistFindPath() fails to re-find the parent. It used to be an assertion, but an elog() seems more appropriate. Backpatch to all supported branches.	2011-07-15 11:05:37 +03:00
Tom Lane	0dd46a7766	In planner, don't assume that empty parent tables aren't really empty. There's a heuristic in estimate_rel_size() to clamp the minimum size estimate for a table to 10 pages, unless we can see that vacuum or analyze has been run (and set relpages to something nonzero, so this will always happen for a table that's actually empty). However, it would be better not to do this for inheritance parent tables, which very commonly are really empty and can be expected to stay that way. Per discussion of a recent pgsql-performance report from Anish Kejariwal. Also prevent it from happening for indexes (although this is more in the nature of documentation, since CREATE INDEX normally initializes relpages to something nonzero anyway). Back-patch to 9.0, because the ability to collect statistics across a whole inheritance tree has improved the planner's estimates to the point where this relatively small error makes a significant difference. In the referenced report, merge or hash joins were incorrectly estimated as cheaper than a nestloop with inner indexscan on the inherited table. That was less likely before 9.0 because the lack of inherited stats would have resulted in a default (and rather pessimistic) estimate of the cost of a merge or hash join.	2011-07-14 17:31:25 -04:00
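
Commit `f00fbad6bd` in the list above concerns integer overflow when LIMIT + OFFSET reaches 2^63. The general guard against that kind of wraparound can be sketched as follows. This is only an illustration under stated assumptions, not the code from that commit: the function name `add_counts_saturating` is hypothetical, and both inputs are assumed to be non-negative 64-bit counts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the PostgreSQL source: add two
 * non-negative 64-bit counts, returning INT64_MAX instead of overflowing.
 */
int64_t
add_counts_saturating(int64_t limit, int64_t offset)
{
    /* Assumption of this sketch: LIMIT and OFFSET counts are never negative. */
    assert(limit >= 0 && offset >= 0);

    /* Compare before adding, so the addition itself can never overflow. */
    if (limit > INT64_MAX - offset)
        return INT64_MAX;       /* the true sum would not fit in int64_t */

    return limit + offset;
}
```

Checking `limit > INT64_MAX - offset` before the addition matters because signed overflow is undefined behavior in C, so testing the sum after the fact is not reliable.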

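Commit `0962182f01` in the list above describes a typo where ">" should have been ">>", so that not all bits of the seed were used. The effect of such a typo can be sketched as below. This is a hedged illustration, not the PostgreSQL source: the function names are hypothetical, and the split of a 32-bit seed into 16-bit state words follows the POSIX description of srand48 (low word fixed at 0x330E, upper 32 bits taken from the seed).

```c
#include <stdint.h>

/* Hypothetical illustration; these names are not from the PostgreSQL source. */
void
seed48_words(uint32_t seed, uint16_t state[3])
{
    state[0] = 0x330E;                      /* fixed low word, per POSIX srand48 */
    state[1] = (uint16_t) seed;             /* low 16 bits of the seed */
    state[2] = (uint16_t) (seed >> 16);     /* high 16 bits of the seed */
}

/*
 * The typo'd form: "seed > 16" is a comparison, so the result is only
 * ever 0 or 1 and the high 16 bits of the seed are discarded.
 */
uint16_t
high_word_with_typo(uint32_t seed)
{
    return (uint16_t) (seed > 16);
}
```

With the comparison in place of the shift, any two seeds greater than 16 that share their low 16 bits produce the same state, which matches the commit message's remark that not all bits of the provided seed are used.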
30711 Commits