Update to describe new set of globally-known contexts planned for support
of extended query features in new FE/BE protocol.  TransactionCommandContext
is gone (PortalContext replaces it for some purposes), and QueryContext
has taken on a new meaning (MessageContext plays its old role).
Tom Lane 2003-04-30 19:04:12 +00:00
parent aa282d4446
commit 0c57d69dd7

@@ -1,4 +1,4 @@
-$Header: /cvsroot/pgsql/src/backend/utils/mmgr/README,v 1.3 2001/02/15 21:38:26 tgl Exp $
+$Header: /cvsroot/pgsql/src/backend/utils/mmgr/README,v 1.4 2003/04/30 19:04:12 tgl Exp $

 Notes about memory allocation redesign
 --------------------------------------
@@ -110,109 +110,121 @@ children of a given context, but don't reset or delete that context
 itself".


-Top-level contexts
-------------------
+Globally known contexts
+-----------------------

-There will be several top-level contexts --- these contexts have no parent
-and will be referenced by global variables. At any instant the system may
+There will be several widely-known contexts that will typically be
+referenced through global variables. At any instant the system may
 contain many additional contexts, but all other contexts should be direct
-or indirect children of one of the top-level contexts to ensure they are
-not leaked in event of an error. I presently envision these top-level
-contexts:
+or indirect children of one of these contexts to ensure they are not
+leaked in event of an error.

-TopMemoryContext --- allocating here is essentially the same as "malloc",
-because this context will never be reset or deleted. This is for stuff
-that should live forever, or for stuff that you know you will delete
-at the appropriate time. An example is fd.c's tables of open files,
-as well as the context management nodes for memory contexts themselves.
-Avoid allocating stuff here unless really necessary, and especially
-avoid running with CurrentMemoryContext pointing here.
+TopMemoryContext --- this is the actual top level of the context tree;
+every other context is a direct or indirect child of this one. Allocating
+here is essentially the same as "malloc", because this context will never
+be reset or deleted. This is for stuff that should live forever, or for
+stuff that the controlling module will take care of deleting at the
+appropriate time. An example is fd.c's tables of open files, as well as
+the context management nodes for memory contexts themselves. Avoid
+allocating stuff here unless really necessary, and especially avoid
+running with CurrentMemoryContext pointing here.

 PostmasterContext --- this is the postmaster's normal working context.
 After a backend is spawned, it can delete PostmasterContext to free its
 copy of memory the postmaster was using that it doesn't need. (Anything
 that has to be passed from postmaster to backends will be passed in
-TopMemoryContext. The postmaster will probably have only TopMemoryContext,
-PostmasterContext, and possibly ErrorContext --- the remaining top-level
-contexts will be set up in each backend during startup.)
+TopMemoryContext. The postmaster will have only TopMemoryContext,
+PostmasterContext, and ErrorContext --- the remaining top-level contexts
+will be set up in each backend during startup.)

 CacheMemoryContext --- permanent storage for relcache, catcache, and
 related modules. This will never be reset or deleted, either, so it's
 not truly necessary to distinguish it from TopMemoryContext. But it
 seems worthwhile to maintain the distinction for debugging purposes.
-(Note: CacheMemoryContext may well have child-contexts with shorter
-lifespans. For example, a child context seems like the best place to
-keep the subsidiary storage associated with a relcache entry; that way
-we can free rule parsetrees and so forth easily, without having to depend
-on constructing a reliable version of freeObject().)
+(Note: CacheMemoryContext will have child-contexts with shorter lifespans.
+For example, a child context is the best place to keep the subsidiary
+storage associated with a relcache entry; that way we can free rule
+parsetrees and so forth easily, without having to depend on constructing
+a reliable version of freeObject().)

-QueryContext --- this is where the storage holding a received query string
-is kept, as well as storage that should live as long as the query string,
-notably the parsetree constructed from it. This context will be reset at
-the top of each cycle of the outer loop of PostgresMain, thereby freeing
-the old query and parsetree. We must keep this separate from
-TopTransactionContext because a query string might need to live either a
-longer or shorter time than a transaction, depending on whether it
-contains begin/end commands or not. (This'll also fix the nasty bug that
-"vacuum; anything else" crashes if submitted as a single query string,
-because vacuum's xact commit frees the memory holding the parsetree...)
+MessageContext --- this context holds the current command message from the
+frontend, as well as any derived storage that need only live as long as
+the current message (for example, in simple-Query mode the parse and plan
+trees can live here). This context will be reset, and any children
+deleted, at the top of each cycle of the outer loop of PostgresMain. This
+is kept separate from per-transaction and per-portal contexts because a
+query string might need to live either a longer or shorter time than any
+single transaction or portal.

 TopTransactionContext --- this holds everything that lives until end of
 transaction (longer than one statement within a transaction!). An example
 of what has to be here is the list of pending NOTIFY messages to be sent
 at xact commit. This context will be reset, and all its children deleted,
-at conclusion of each transaction cycle. Note: presently I envision that
-this context will NOT be cleared immediately upon error; its contents
-will survive anyway until the transaction block is exited by
-COMMIT/ROLLBACK. This seems appropriate since we want to move in the
-direction of allowing a transaction to continue processing after an error.
+at conclusion of each transaction cycle. Note: this context is NOT
+cleared immediately upon error; its contents will survive until the
+transaction block is exited by COMMIT/ROLLBACK.
+(If we ever implement nested transactions, TopTransactionContext may need
+to be split into a true "top" pointer and a "current transaction" pointer.)

-TransactionCommandContext --- this is really a child of
-TopTransactionContext, not a top-level context, but we'll probably store a
-link to it in a global variable anyway for convenience. All the memory
-allocated during planning and execution lives here or in a child context.
-This context is deleted at statement completion, whether normal completion
-or error abort.
+QueryContext --- this is not actually a separate context, but a global
+variable pointing to the context that holds the current command's parse
+and plan trees. (In simple-Query mode this points to MessageContext;
+when executing a prepared statement it will point at the prepared
+statement's private context.) Generally it is not appropriate for any
+code to use QueryContext as an allocation target --- from the point of
+view of any code that would be referencing the QueryContext variable,
+it's a read-only context.

-ErrorContext --- this permanent context will be switched into
-for error recovery processing, and then reset on completion of recovery.
-We'll arrange to have, say, 8K of memory available in it at all times.
-In this way, we can ensure that some memory is available for error
-recovery even if the backend has run out of memory otherwise. This should
-allow out-of-memory to be treated as a normal ERROR condition, not a FATAL
-error.
+PortalContext --- this is not actually a separate context either, but a
+global variable pointing to the per-portal context of the currently active
+execution portal. This can be used if it's necessary to allocate storage
+that will live just as long as the execution of the current portal requires.

-If we ever implement nested transactions, there may need to be some
-additional levels of transaction-local contexts between
-TopTransactionContext and TransactionCommandContext, but that's beyond
-the scope of this proposal.
+ErrorContext --- this permanent context will be switched into for error
+recovery processing, and then reset on completion of recovery. We'll
+arrange to have, say, 8K of memory available in it at all times. In this
+way, we can ensure that some memory is available for error recovery even
+if the backend has run out of memory otherwise. This allows out-of-memory
+to be treated as a normal ERROR condition, not a FATAL error.


+Contexts for prepared statements and portals
+--------------------------------------------
+
+A prepared-statement object has an associated private context, in which
+the parse and plan trees for its query are stored. Because these trees
+are read-only to the executor, the prepared statement can be re-used many
+times without further copying of these trees. QueryContext points at this
+private context while executing any portal built from the prepared
+statement.
+
+An execution-portal object has a private context that is referenced by
+PortalContext when the portal is active. In the case of a portal created
+by DECLARE CURSOR, this private context contains the query parse and plan
+trees (there being no other object that can hold them). Portals created
+from prepared statements simply reference the prepared statements' trees,
+and won't actually need any storage allocated in their private contexts.
+
+
 Transient contexts during execution
 -----------------------------------

-The planner will probably have a transient context in which it stores
-pathnodes; this will allow it to release the bulk of its temporary space
-usage (which can be a lot, for large joins) at completion of planning.
-The completed plan tree will be in TransactionCommandContext.
+When creating a prepared statement, the parse and plan trees will be built
+in a temporary context that's a child of MessageContext (so that it will
+go away automatically upon error). On success, the finished plan is
+copied to the prepared statement's private context, and the temp context
+is released; this allows planner temporary space to be recovered before
+execution begins. (In simple-Query mode we'll not bother with the extra
+copy step, so the planner temp space stays around till end of query.)

 The top-level executor routines, as well as most of the "plan node"
-execution code, will normally run in a context with command lifetime.
-(This will be TransactionCommandContext for normal queries, but when
-executing a cursor, it will be a context associated with the cursor.)
-Most of the memory allocated in these routines is intended to live until
-end of query, so this is appropriate for those purposes. We already have
-a mechanism --- "tuple table slots" --- for avoiding leakage of tuples,
-which is the major kind of short-lived data handled by these routines.
-This still leaves a certain amount of explicit pfree'ing needed by plan
-node code, but that code largely exists already and is probably not worth
-trying to remove. I looked at the possibility of running in a shorter-
-lived context (such as a context that gets reset per-tuple), but this
-seems fairly impractical. The biggest problem with it is that code in
-the index access routines, as well as some other complex algorithms like
-tuplesort.c, assumes that palloc'd storage will live across tuples.
-For example, rtree uses a palloc'd state stack to keep track of an index
-scan.
+execution code, will normally run in a context that is created by
+ExecutorStart and destroyed by ExecutorEnd; this context also holds the
+"plan state" tree built during ExecutorStart. Most of the memory
+allocated in these routines is intended to live until end of query,
+so this is appropriate for those purposes. The executor's top context
+is a child of PortalContext, that is, the per-portal context of the
+portal that represents the query's execution.

 The main improvement needed in the executor is that expression evaluation
 --- both for qual testing and for computation of targetlist entries ---
@@ -277,7 +289,7 @@ be released on error. Currently it does that through a "portal",
 which is essentially a child context of TopMemoryContext. While that
 way still works, it's ugly since xact abort needs special processing
 to delete the portal. Better would be to use a context that's a child
-of QueryContext and hence is certain to go away as part of normal
+of PortalContext and hence is certain to go away as part of normal
 processing. (Eventually we might have an even better solution from
 nested transactions, but this'll do fine for now.)

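The point of hanging scratch storage off a per-portal context is that deleting the parent takes every child with it, so no special abort-time cleanup is needed. What follows is a minimal sketch of that parent/child rule in a toy context tree. It is not the PostgreSQL implementation: the ToyContext type and every toy_* function are hypothetical names invented for illustration, and the contexts here carry no allocations at all, only the tree links.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: every name here is hypothetical, not PostgreSQL's API. */
typedef struct ToyContext {
    struct ToyContext *parent;
    struct ToyContext *first_child;
    struct ToyContext *next_sibling;
    const char *name;
} ToyContext;

static int toy_live_contexts = 0;   /* contexts created and not yet deleted */

/* Create a context and link it in as the newest child of "parent". */
ToyContext *toy_context_create(ToyContext *parent, const char *name)
{
    ToyContext *cxt = malloc(sizeof(ToyContext));
    if (cxt == NULL)
        abort();
    cxt->parent = parent;
    cxt->first_child = NULL;
    cxt->next_sibling = NULL;
    cxt->name = name;
    if (parent != NULL) {
        cxt->next_sibling = parent->first_child;
        parent->first_child = cxt;
    }
    toy_live_contexts++;
    return cxt;
}

/* Recursively delete all descendants of "cxt", but not "cxt" itself. */
void toy_context_delete_children(ToyContext *cxt)
{
    while (cxt->first_child != NULL) {
        ToyContext *child = cxt->first_child;
        cxt->first_child = child->next_sibling;
        toy_context_delete_children(child);
        free(child);
        toy_live_contexts--;
    }
}

/* Delete "cxt" and everything below it, unlinking it from its parent. */
void toy_context_delete(ToyContext *cxt)
{
    if (cxt->parent != NULL) {
        ToyContext **link = &cxt->parent->first_child;
        while (*link != cxt)
            link = &(*link)->next_sibling;
        *link = cxt->next_sibling;
    }
    toy_context_delete_children(cxt);
    free(cxt);
    toy_live_contexts--;
}

int toy_live_context_count(void)
{
    return toy_live_contexts;
}

/*
 * Build top -> portal -> {two scratch contexts}, then drop the portal.
 * Returns the number of contexts still alive at that point.
 */
int toy_portal_demo(void)
{
    ToyContext *top = toy_context_create(NULL, "top");
    ToyContext *portal = toy_context_create(top, "portal");
    int remaining;

    toy_context_create(portal, "scratch A");
    toy_context_create(portal, "scratch B");

    toy_context_delete(portal);     /* both scratch contexts go too */
    remaining = toy_live_context_count();

    toy_context_delete(top);
    return remaining;
}
```

In this model toy_portal_demo() returns 1: after the portal context is deleted only the top context survives, and neither scratch context had to be tracked separately.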
@@ -371,12 +383,14 @@ the relcache's per-relation contexts).
 Also, it will be possible to specify a minimum context size. If this
 value is greater than zero then a block of that size will be grabbed
 immediately upon context creation, and cleared but not released during
-context resets. This feature is needed for ErrorContext (see above).
-It is also useful for per-tuple contexts, which will be reset frequently
-and typically will not allocate very much space per tuple cycle. We can
-save a lot of unnecessary malloc traffic if these contexts hang onto one
-allocation block rather than releasing and reacquiring the block on
-each tuple cycle.
+context resets. This feature is needed for ErrorContext (see above),
+but will most likely not be used for other contexts.
+
+We expect that per-tuple contexts will be reset frequently and typically
+will not allocate very much space per tuple cycle. To make this usage
+pattern cheap, the first block allocated in a context is not given
+back to malloc() during reset, but just cleared. This avoids malloc
+thrashing.


 Other notes
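
The reset behaviour described in the last hunk (the first block allocated in a context is kept across resets) can be illustrated with a small block allocator. This is a sketch under stated assumptions, not the actual allocator: ToyArena, ToyBlock, ToyAlign, and the toy_* functions are hypothetical names, "cleared" is modelled simply as marking the block empty, and the malloc_calls counter exists only so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: every name here is hypothetical, not PostgreSQL's API. */

/* Union used purely to give block payloads a conservative alignment. */
typedef union ToyAlign { long long ll; long double ld; void *p; } ToyAlign;

typedef struct ToyBlock {
    struct ToyBlock *next;  /* next (newer) block, or NULL */
    size_t size;            /* payload bytes available */
    size_t used;            /* payload bytes handed out so far */
    ToyAlign data[];        /* payload */
} ToyBlock;

typedef struct ToyArena {
    ToyBlock *first;        /* first block ever allocated; survives resets */
    ToyBlock *current;      /* block currently being carved up */
    size_t block_size;      /* default payload size for new blocks */
    int malloc_calls;       /* instrumentation for the demo */
} ToyArena;

void toy_arena_init(ToyArena *a, size_t block_size)
{
    a->first = NULL;
    a->current = NULL;
    a->block_size = block_size;
    a->malloc_calls = 0;
}

void *toy_arena_alloc(ToyArena *a, size_t size)
{
    ToyBlock *b = a->current;
    void *p;

    /* Round the request up so successive chunks stay aligned. */
    size = (size + sizeof(ToyAlign) - 1) / sizeof(ToyAlign) * sizeof(ToyAlign);

    if (b == NULL || b->size - b->used < size) {
        size_t payload = size > a->block_size ? size : a->block_size;
        ToyBlock *nb = malloc(offsetof(ToyBlock, data) + payload);
        if (nb == NULL)
            abort();
        nb->next = NULL;
        nb->size = payload;
        nb->used = 0;
        a->malloc_calls++;
        if (b != NULL)
            b->next = nb;
        else
            a->first = nb;
        a->current = b = nb;
    }
    p = (char *) b->data + b->used;
    b->used += size;
    return p;
}

/* Free every block except the first, which is only marked empty. */
void toy_arena_reset(ToyArena *a)
{
    ToyBlock *b;

    if (a->first == NULL)
        return;
    b = a->first->next;
    while (b != NULL) {
        ToyBlock *next = b->next;
        free(b);
        b = next;
    }
    a->first->next = NULL;
    a->first->used = 0;
    a->current = a->first;
}

void toy_arena_destroy(ToyArena *a)
{
    toy_arena_reset(a);
    free(a->first);
    a->first = NULL;
    a->current = NULL;
}

/* Simulate a per-tuple cycle: small allocation, then reset, repeated. */
int toy_per_tuple_demo(int ntuples)
{
    ToyArena a;
    int i, calls;

    toy_arena_init(&a, 1024);
    for (i = 0; i < ntuples; i++) {
        char *scratch = toy_arena_alloc(&a, 64);
        memset(scratch, 0, 64);
        toy_arena_reset(&a);
    }
    calls = a.malloc_calls;
    toy_arena_destroy(&a);
    return calls;
}
```

With these definitions toy_per_tuple_demo(1000) returns 1: the first cycle pays for one malloc() and every later cycle reuses the retained block. If reset released the first block as well, the same loop would call malloc() once per tuple.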