diff --git a/src/tools/backend/README b/src/tools/backend/README
deleted file mode 100644
index 2b8692d393..0000000000
--- a/src/tools/backend/README
+++ /dev/null
@@ -1,4 +0,0 @@
-src/tools/backend/README
-
-Just point your browser at the index.html file, and click on the
-flowchart to see the description and source code.
diff --git a/src/tools/backend/backend_dirs.html b/src/tools/backend/backend_dirs.html
deleted file mode 100644
index 16bd894582..0000000000
--- a/src/tools/backend/backend_dirs.html
+++ /dev/null
@@ -1,349 +0,0 @@
Click on any of the section headings to see the source code for that section.
Because PostgreSQL requires access to system tables for almost every operation, getting those system tables in place is a problem. You can't just create the tables and insert data into them in the normal way, because table creation and insertion require the tables to already exist. This code jams the data directly into tables using a special syntax used only by the bootstrap procedure.
This checks the process name (argv[0]) and various flags, and passes control to the postmaster or postgres backend code.
This creates shared memory, and then goes into a loop waiting for connection requests. When a connection request arrives, a postgres backend is started, and the connection is passed to it.
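As a rough illustration of this pattern (a generic sketch only, not the actual postmaster code; the port number, function names, and error handling are invented for the example), a fork-per-connection server loop looks like this:

    /* Conceptual sketch of an accept-and-fork loop: listen on a socket and
     * fork one child ("backend") per connection.  Not the real postmaster. */
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void backend_main(int client_fd)
    {
        /* A real backend would authenticate, then read and execute queries. */
        const char *msg = "hello from child backend\n";
        write(client_fd, msg, strlen(msg));
        close(client_fd);
    }

    int main(void)
    {
        int                 listen_fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in  addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(5555);            /* illustrative port */

        if (listen_fd < 0 ||
            bind(listen_fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
            listen(listen_fd, 10) < 0)
        {
            perror("socket/bind/listen");
            return 1;
        }

        for (;;)                                /* postmaster-style main loop */
        {
            int client_fd = accept(listen_fd, NULL, NULL);

            if (client_fd < 0)
                continue;
            if (fork() == 0)                    /* child becomes the "backend" */
            {
                close(listen_fd);
                backend_main(client_fd);
                _exit(0);
            }
            close(client_fd);                   /* parent keeps listening */
            while (waitpid(-1, NULL, WNOHANG) > 0)
                ;                               /* reap exited children */
        }
    }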
This handles communication to the client processes.
This contains the postgres backend main handler, as well as the code that makes calls to the parser, optimizer, executor, and /commands functions.
This converts SQL queries coming from libpq into command-specific structures to be used by the optimizer/executor or /commands routines. The SQL is lexically analyzed into keywords, identifiers, and constants, and passed to the parser. The parser creates command-specific structures to hold the elements of the query. The command-specific structures are then broken apart, checked, and passed to /commands processing routines, or converted into Lists of Nodes to be handled by the optimizer and executor.
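For illustration only, a "command-specific structure" for a SELECT looks roughly like the sketch below; the field list is trimmed and the placeholder typedefs merely stand in for the real definitions in the backend's nodes/ headers, which have changed over the years.

    /* Placeholder types standing in for the real backend definitions. */
    typedef int NodeTag;
    typedef struct Node { NodeTag type; } Node;
    typedef struct List List;              /* generic list of nodes */

    /* Trimmed-down, illustrative shape of the parser's SELECT structure. */
    typedef struct SelectStmt
    {
        NodeTag     type;                  /* identifies this node's type */
        List       *targetList;            /* columns/expressions selected */
        List       *fromClause;            /* tables being read */
        Node       *whereClause;           /* WHERE qualification, or NULL */
        List       *groupClause;           /* GROUP BY items */
        List       *sortClause;            /* ORDER BY items */
    } SelectStmt;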
This uses the parser output to generate an optimal plan for the executor.
This takes the parser query output and generates all possible methods of executing the request. It examines table join order, WHERE clause restrictions, and optimizer table statistics to evaluate each possible execution method, and assigns a cost to each.
optimizer/path evaluates all possible ways to join the requested tables. As the number of tables grows, the number of join orderings to test grows rapidly. The Genetic Query Optimizer considers each table separately, then determines the optimal order in which to perform the joins. For a few tables, this method takes longer, but for a large number of tables, it is faster. There is an option to control when this feature is used.
This takes the optimizer/path output, chooses the path with the least cost, and creates a plan for the executor.
This does special plan processing.
This contains support routines used by other parts of the optimizer.
This handles select, insert, update, and delete statements. The operations required to handle these statement types include heap scans, index scans, sorting, joining tables, grouping, aggregates, and uniqueness.
These process SQL commands that do not require complex handling. They include vacuum, copy, alter, create table, create type, and many others. The code is called with the structures generated by the parser. Most of the routines do some processing, then call lower-level functions in the catalog directory to do the actual work.
This contains functions that manipulate the system tables or catalogs. Table, index, procedure, operator, type, and aggregate creation and manipulation routines are here. These are low-level routines, and are usually called by upper routines that pre-format user requests into a predefined format.
These allow uniform resource access by the backend.
    storage/buffer - shared buffer pool manager
    storage/file - file manager
    storage/freespace - free space map
    storage/ipc - semaphores and shared memory
    storage/large_object - large objects
    storage/lmgr - lock manager
    storage/page - page manager
    storage/smgr - storage/disk manager
These control the way data is accessed in heap, indexes, and transactions.
    access/common - common access routines
    access/gist - easy-to-define access method system
    access/hash - hash
    access/heap - heap is used to store data rows
    access/index - used by all index types
    access/nbtree - Lehman and Yao's btree management algorithm
    access/transam - transaction manager (BEGIN/ABORT/COMMIT)
PostgreSQL stores information about SQL queries in structures called nodes. Nodes are generic containers that have a type field and then a type-specific data section. Nodes are usually placed in Lists. A List is a container with an elem element and a next field that points to the next List. These List structures are chained together in a forward linked list. In this way, a chain of Lists can contain an unlimited number of Node elements, and each Node can contain any data type. These are used extensively in the parser, optimizer, and executor to store requests and data.
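A minimal sketch of this Node/List idea follows; it mirrors the description above rather than the exact backend headers (which differ, and in recent PostgreSQL releases the List is no longer a simple linked list).

    #include <stdio.h>
    #include <stdlib.h>

    typedef enum NodeTag { T_Invalid = 0, T_MyLeaf } NodeTag;

    /* Every node starts with a tag identifying its type ... */
    typedef struct Node { NodeTag type; } Node;

    /* ... followed by type-specific fields in the concrete struct. */
    typedef struct MyLeaf { NodeTag type; int value; } MyLeaf;

    /* A List cell holds one element and points to the next cell. */
    typedef struct List
    {
        Node        *elem;
        struct List *next;
    } List;

    /* Prepend a node to a list, in the spirit of the backend's lcons(). */
    static List *cons(Node *elem, List *tail)
    {
        List *cell = malloc(sizeof(List));
        cell->elem = elem;
        cell->next = tail;
        return cell;
    }

    static Node *make_leaf(int value)
    {
        MyLeaf *leaf = malloc(sizeof(MyLeaf));
        leaf->type = T_MyLeaf;
        leaf->value = value;
        return (Node *) leaf;
    }

    int main(void)
    {
        List *list = cons(make_leaf(1), cons(make_leaf(2), NULL));

        /* Walk the chain; check each node's tag before using it. */
        for (List *cell = list; cell != NULL; cell = cell->next)
            if (cell->elem->type == T_MyLeaf)
                printf("leaf %d\n", ((MyLeaf *) cell->elem)->value);
        return 0;
    }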
This contains all the PostgreSQL builtin data types.
PostgreSQL supports arbitrary data types, so no data types are hard-coded into the core backend routines. When the backend needs to find out about a type, it does a lookup of a system table. Because these system tables are referred to often, a cache is maintained that speeds lookups. There is a system relation cache, a function/operator cache, and a relation information cache. This last cache maintains information about all recently-accessed tables, not just system ones.
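A sketch of how backend code consults these caches through the syscache interface is below; the exact function names, header locations, and call arity have varied across PostgreSQL releases, so treat it as illustrative rather than exact.

    #include "postgres.h"
    #include "access/htup_details.h"
    #include "catalog/pg_type.h"
    #include "utils/syscache.h"

    /* Look up the name of a data type by its OID via the system caches. */
    static char *
    get_type_name_copy(Oid typeoid)
    {
        HeapTuple    tup;
        Form_pg_type typform;
        char        *result;

        tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typeoid));
        if (!HeapTupleIsValid(tup))
            elog(ERROR, "cache lookup failed for type %u", typeoid);

        typform = (Form_pg_type) GETSTRUCT(tup);
        result = pstrdup(NameStr(typform->typname));

        ReleaseSysCache(tup);   /* always release what the cache returned */
        return result;
    }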
Reports backend errors to the front end.
This handles the calling of dynamically-loaded functions, and the calling of functions defined in the system tables.
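A small sketch of the second case, calling a function identified only by its pg_proc OID through the function manager; the helper name is invented, and fmgr details differ somewhat between releases.

    #include "postgres.h"
    #include "fmgr.h"

    /* Call a two-argument function identified by its pg_proc OID. */
    static Datum
    call_by_oid(Oid funcoid, Datum arg1, Datum arg2)
    {
        FmgrInfo    flinfo;

        fmgr_info(funcoid, &flinfo);    /* look up and cache call information */
        return FunctionCall2(&flinfo, arg1, arg2);
    }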
These hash routines are used by the cache and memory-manager routines to do quick lookups of dynamic data storage structures maintained by the backend.
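A sketch of building such a lookup table with the backend's dynamic hash routines; the table name, entry layout, and sizes are invented for the example, and the flag set shown is the newer spelling (older code passed a hash function explicitly).

    #include "postgres.h"
    #include "utils/hsearch.h"

    typedef struct MyEntry
    {
        Oid         key;            /* the hash key must come first */
        int         usage_count;    /* example payload */
    } MyEntry;

    static HTAB *my_table = NULL;

    static void
    my_table_init(void)
    {
        HASHCTL     ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(MyEntry);
        my_table = hash_create("my lookup table", 128, &ctl,
                               HASH_ELEM | HASH_BLOBS);
    }

    static MyEntry *
    my_table_get(Oid key)
    {
        bool        found;
        MyEntry    *entry;

        entry = (MyEntry *) hash_search(my_table, &key, HASH_ENTER, &found);
        if (!found)
            entry->usage_count = 0;     /* initialize a newly created entry */
        return entry;
    }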
When PostgreSQL allocates memory, it does so in an explicit context. Contexts can be statement-specific, transaction-specific, or persistent/global. By doing this, the backend can easily free memory once a statement or transaction completes.
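A sketch of the pattern, assuming code running inside the backend; the choice of TopMemoryContext here is purely illustrative, since real code switches to whatever context matches the intended lifetime of the data.

    #include "postgres.h"
    #include "utils/memutils.h"

    static void
    remember_name(const char *name)
    {
        MemoryContext oldcxt;
        char         *copy;

        /* palloc() allocates in the current (often per-query) context;
         * switch to a longer-lived context to keep data around longer. */
        oldcxt = MemoryContextSwitchTo(TopMemoryContext);
        copy = pstrdup(name);       /* freed only when that context is */
        MemoryContextSwitchTo(oldcxt);

        /* ... stash 'copy' somewhere; per-query allocations need no explicit
         * free, since they vanish when their context is reset ... */
        (void) copy;
    }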
When statement output must be sorted as part of a backend operation, this code sorts the tuples, either in memory or using disk files.
These routines check tuple internal columns to determine whether the current row is still valid, or is part of a non-committed transaction or superseded by a new row.
There are include directories for each subsystem.
This houses several generic routines.
This is used for regular expression handling in the backend, e.g. the '~' operator.
A query comes to the backend via data packets arriving through TCP/IP or Unix domain sockets. It is loaded into a string and passed to the parser, where the lexical scanner, scan.l, breaks the query up into tokens (words). The parser uses gram.y and the tokens to identify the query type and load the proper query-specific structure, like CreateStmt or SelectStmt.
The statement is then identified as complex (SELECT / INSERT / UPDATE / DELETE) or simple, e.g. CREATE USER, ANALYZE, etc. Simple utility commands are processed by statement-specific functions in backend/commands. Complex statements require more handling.
The parser takes a complex query and creates a Query structure that contains all the elements used by complex queries. Query.qual holds the WHERE clause qualification, which is filled in by transformWhereClause(). Each table referenced in the query is represented by a RangeTableEntry, and they are linked together to form the range table of the query, which is generated by transformFromClause(). Query.rtable holds the query's range table.
Certain queries, like SELECT, return columns of data. Other queries, like INSERT and UPDATE, specify the columns modified by the query. These column references are converted to TargetEntry entries, which are linked together to make up the target list of the query. The target list is stored in Query.targetList, which is generated by transformTargetList().
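Putting the last few paragraphs together, a heavily trimmed, illustrative view of the Query structure might look like this (placeholder types again stand in for the real nodes/ definitions; the actual struct has many more fields and its layout has changed over the years, e.g. the WHERE qualification later moved into a join tree):

    /* Placeholder types; the real ones live in the nodes/ headers. */
    typedef int NodeTag;
    typedef struct Node { NodeTag type; } Node;
    typedef struct List List;
    typedef enum CmdType { CMD_SELECT, CMD_INSERT, CMD_UPDATE, CMD_DELETE } CmdType;

    /* Trimmed, illustrative shape of the parser's Query output. */
    typedef struct Query
    {
        NodeTag     type;
        CmdType     commandType;    /* SELECT / INSERT / UPDATE / DELETE */
        List       *rtable;         /* range table: one entry per referenced table */
        Node       *qual;           /* WHERE clause qualification */
        List       *targetList;     /* TargetEntry list: columns returned or set */
        List       *groupClause;    /* GROUP BY items */
        List       *sortClause;     /* ORDER BY items */
    } Query;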
Other query elements, like aggregates (SUM()), GROUP BY, and ORDER BY, are also stored in their own Query fields.
The next step is for the Query to be modified by any VIEWS or RULES that may apply to the query. This is performed by the rewrite system.
The optimizer takes the Query structure and generates an optimal Plan, which contains the operations to be performed to execute the query. The path module determines the best table join order and join type of each table in the RangeTable, using Query.qual (the WHERE clause) to consider optimal index usage.
The Plan is then passed to the executor for execution, and the result is returned to the client. The Plan is actually a set of nodes, arranged in a tree structure with a top-level node and various sub-nodes as children.
There are many other modules that support this basic functionality. They can be accessed by clicking on the flowchart.
Another area of interest is the shared memory area, which contains data accessible to all backends. It holds recently used data/index blocks, locks, backend process information, and lookup tables for these structures.
Each data structure is created by calling ShmemInitStruct(), and the lookup tables are created by ShmemInitHash().
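A sketch of that initialization pattern follows; the structure names, sizes, and hash flags are illustrative, and the real calls happen during shared-memory setup by the postmaster and backends.

    #include "postgres.h"
    #include "storage/shmem.h"
    #include "utils/hsearch.h"

    typedef struct MySharedState
    {
        int         num_requests;   /* example shared counter */
    } MySharedState;

    typedef struct MyLookupEntry
    {
        Oid         key;            /* the hash key must come first */
        int         slot;
    } MyLookupEntry;

    static MySharedState *my_shared = NULL;
    static HTAB *my_lookup = NULL;

    static void
    my_shmem_init(void)
    {
        bool        found;
        HASHCTL     ctl;

        /* Attach to (or, in the first process, create) the raw structure. */
        my_shared = (MySharedState *)
            ShmemInitStruct("My Shared State", sizeof(MySharedState), &found);
        if (!found)
            my_shared->num_requests = 0;    /* first time through: initialize */

        /* The lookup table for it lives in shared memory as well. */
        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(MyLookupEntry);
        my_lookup = ShmemInitHash("My Lookup Table", 64, 1024, &ctl,
                                  HASH_ELEM | HASH_BLOBS);
    }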