to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets. When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX. This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared. Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.
Alvaro Herrera and Tom Lane.
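A minimal sketch of the user-visible behavior this enables (the accounts
table and id column are hypothetical):

    -- Session 1: take a shared lock on the row.
    BEGIN;
    SELECT * FROM accounts WHERE id = 1 FOR SHARE;

    -- Session 2: a second FOR SHARE on the same row succeeds at once;
    -- the two lockers are recorded as a MultiXactId in the row's XMAX.
    BEGIN;
    SELECT * FROM accounts WHERE id = 1 FOR SHARE;

    -- Session 3: FOR UPDATE conflicts with the shared locks and must
    -- wait until both sessions commit or roll back.
    BEGIN;
    SELECT * FROM accounts WHERE id = 1 FOR UPDATE;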
< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
---
> * Allow ORDER BY ... LIMIT # to select high/low value without sort or
868c868
< Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
---
> Right now, if no index exists, ORDER BY ... LIMIT # requires we sort
870a871
> MIN/MAX already does this, but not for LIMIT > 1.
> * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
> index using a sequential scan for highest/lowest values
>
> Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
> all values to return the high/low value. Instead, the idea is to do a
> sequential scan to find the high/low value, thus avoiding the sort.
>
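As a sketch of the rewrite this item contemplates (table t and column c
are hypothetical):

    -- Already computed with a single pass over the table, no sort:
    SELECT MIN(c) FROM t;

    -- Equivalent query that today sorts every row when no index on c
    -- exists, though one sequential scan tracking the smallest value
    -- seen so far would give the same answer:
    SELECT c FROM t ORDER BY c LIMIT 1;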
or bitmap), use pred_test to be a little smarter about cases where a
filter clause is logically unnecessary. This may be overkill for the
plain indexscan case, but it's definitely useful for OR'd bitmap scans.
> One possible implementation is to start sequential scans from the lowest
> numbered buffer in the shared cache, and when reaching the end wrap
> around to the beginning, rather than always starting sequential scans
> at the start of the table.
node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
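For example (hypothetical table t with separate indexes on a and b), a
query such as the following is now planned as two bitmap index scans
whose bitmaps are ORed together before the heap is visited, rather than
as a single multi-scan IndexScan node:

    SELECT * FROM t WHERE a = 42 OR b = 7;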
but just to open and close it during MultiExecBitmapIndexScan. This
avoids acquiring duplicate resources (eg, multiple locks on the same
relation) in a tree with many bitmap scans. Also, don't bother to
lock the parent heap at all here, since we must be underneath a
BitmapHeapScan node that will be holding a suitable lock.
< This allows vacuum to reclaim free space without requiring
< a sequential scan
---
> This allows vacuum to target specific pages for possible free space
> without requiring a sequential scan.
of timetz values misbehaved in --enable-integer-datetime cases, and
EXTRACT(EPOCH) subtracted the zone instead of adding it in all cases.
Backpatch to all supported releases (except that the --enable-integer-datetime
code does not exist in 7.2).
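A quick sanity check of the intended semantics (illustrative, not taken
from the commit): the epoch of a timetz value is the number of seconds
past midnight UTC, so the zone offset must be applied in the correct
direction:

    -- 12:00 at UTC+2 is 10:00 UTC, i.e. 36000 seconds past midnight:
    SELECT EXTRACT(EPOCH FROM TIME WITH TIME ZONE '12:00:00+02');
    -- expected: 36000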
As I pointed out a few days ago, this code has failed to do anything useful
for some time ... and if we did want to revive the capability to select
functions by nearness of inheritance ancestry, this is the wrong place
and way to do it anyway. The knowledge would need to go into
func_select_candidate() instead. Perhaps someday someone will be motivated
to do that, but I am not today.
< * Consider parallel processing a single query
<
< This would involve using multiple threads or processes to do optimization,
< sorting, or execution of a single query. The major advantage of such a
< feature would be to allow multiple CPUs to work together to process a
< single query.
<
< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
< index using a sequential scan for highest/lowest values
<
< If only one value is needed, there is no need to sort the entire
< table. Instead a sequential scan could get the matching value.
<
< Solaris) might benefit from threading.
---
> Solaris) might benefit from threading. Also explore the idea of
> a single session using multiple threads to execute a query faster.
ExprContexts will be freed anyway when FreeExecutorState() is reached,
and letting that routine do the work is more efficient because it will
automatically free the ExprContexts in reverse creation order. The
existing coding was effectively freeing them in exactly the worst
possible order, resulting in O(N^2) behavior inside list_delete_ptr,
which becomes highly visible in cases with a few thousand plan nodes.
ExecFreeExprContext is now effectively a no-op and could be removed,
but I left it in place in case we ever want to put it back to use.