mirror of
https://git.postgresql.org/git/postgresql.git
synced 2025-02-17 19:30:00 +08:00
Update examples in planstats.sgml for 8.3, and improve some aspects of
that discussion. Add a link from perform.sgml.
This commit is contained in:
parent
45c9be3cdd
commit
f5678e8e07
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.67 2007/11/28 15:42:31 petere Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.68 2007/12/28 21:03:31 tgl Exp $ -->
|
||||
|
||||
<chapter id="performance-tips">
|
||||
<title>Performance Tips</title>
|
||||
@ -8,7 +8,7 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
Query performance can be affected by many things. Some of these can
|
||||
Query performance can be affected by many things. Some of these can
|
||||
be manipulated by the user, while others are fundamental to the underlying
|
||||
design of the system. This chapter provides some hints about understanding
|
||||
and tuning <productname>PostgreSQL</productname> performance.
|
||||
@ -138,7 +138,7 @@ EXPLAIN SELECT * FROM tenk1;
|
||||
Rows output is a little tricky because it is <emphasis>not</emphasis> the
|
||||
number of rows processed or scanned by the plan node. It is usually less,
|
||||
reflecting the estimated selectivity of any <literal>WHERE</>-clause
|
||||
conditions that are being
|
||||
conditions that are being
|
||||
applied at the node. Ideally the top-level rows estimate will
|
||||
approximate the number of rows actually returned, updated, or deleted
|
||||
by the query.
|
||||
@ -469,8 +469,8 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100 AND t
|
||||
One component of the statistics is the total number of entries in
|
||||
each table and index, as well as the number of disk blocks occupied
|
||||
by each table and index. This information is kept in the table
|
||||
<link linkend="catalog-pg-class"><structname>pg_class</structname></link>, in
|
||||
the columns <structfield>reltuples</structfield> and
|
||||
<link linkend="catalog-pg-class"><structname>pg_class</structname></link>,
|
||||
in the columns <structfield>reltuples</structfield> and
|
||||
<structfield>relpages</structfield>. We can look at it with
|
||||
queries similar to this one:
|
||||
|
||||
@ -493,7 +493,7 @@ SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 't
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For efficiency reasons, <structfield>reltuples</structfield>
|
||||
For efficiency reasons, <structfield>reltuples</structfield>
|
||||
and <structfield>relpages</structfield> are not updated on-the-fly,
|
||||
and so they usually contain somewhat out-of-date values.
|
||||
They are updated by <command>VACUUM</>, <command>ANALYZE</>, and a
|
||||
@ -517,7 +517,8 @@ SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 't
|
||||
<firstterm>selectivity</> of <literal>WHERE</> clauses, that is,
|
||||
the fraction of rows that match each condition in the
|
||||
<literal>WHERE</> clause. The information used for this task is
|
||||
stored in the <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
|
||||
stored in the
|
||||
<link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
|
||||
system catalog. Entries in <structname>pg_statistic</structname>
|
||||
are updated by the <command>ANALYZE</> and <command>VACUUM
|
||||
ANALYZE</> commands, and are always approximate even when freshly
|
||||
@ -530,7 +531,8 @@ SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 't
|
||||
|
||||
<para>
|
||||
Rather than look at <structname>pg_statistic</structname> directly,
|
||||
it's better to look at its view <structname>pg_stats</structname>
|
||||
it's better to look at its view
|
||||
<link linkend="view-pg-stats"><structname>pg_stats</structname></link>
|
||||
when examining the statistics manually. <structname>pg_stats</structname>
|
||||
is designed to be more easily readable. Furthermore,
|
||||
<structname>pg_stats</structname> is readable by all, whereas
|
||||
@ -553,13 +555,8 @@ SELECT attname, n_distinct, most_common_vals FROM pg_stats WHERE tablename = 'ro
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<structname>pg_stats</structname> is described in detail in
|
||||
<xref linkend="view-pg-stats">.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The amount of information stored in <structname>pg_statistic</structname>,
|
||||
in particular the maximum number of entries in the
|
||||
The amount of information stored in <structname>pg_statistic</structname>
|
||||
by <command>ANALYZE</>, in particular the maximum number of entries in the
|
||||
<structfield>most_common_vals</> and <structfield>histogram_bounds</>
|
||||
arrays for each column, can be set on a
|
||||
column-by-column basis using the <command>ALTER TABLE SET STATISTICS</>
|
||||
@ -570,7 +567,12 @@ SELECT attname, n_distinct, most_common_vals FROM pg_stats WHERE tablename = 'ro
|
||||
columns with irregular data distributions, at the price of consuming
|
||||
more space in <structname>pg_statistic</structname> and slightly more
|
||||
time to compute the estimates. Conversely, a lower limit might be
|
||||
appropriate for columns with simple data distributions.
|
||||
sufficient for columns with simple data distributions.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Further details about the planner's use of statistics can be found in
|
||||
<xref linkend="planner-stats-details">.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
@ -913,7 +915,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
|
||||
are designed not to write WAL at all if <varname>archive_mode</varname>
|
||||
is off. (They can guarantee crash safety more cheaply by doing an
|
||||
<function>fsync</> at the end than by writing WAL.)
|
||||
This applies to the following commands:
|
||||
This applies to the following commands:
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
|
@ -1,27 +1,22 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/planstats.sgml,v 1.8 2007/01/31 20:56:18 momjian Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/planstats.sgml,v 1.9 2007/12/28 21:03:31 tgl Exp $ -->
|
||||
|
||||
<chapter id="planner-stats-details">
|
||||
<title>How the Planner Uses Statistics</title>
|
||||
|
||||
<para>
|
||||
This chapter builds on the material covered in <xref linkend="using-explain">
|
||||
and <xref linkend="planner-stats">, and shows how the planner uses the
|
||||
system statistics to estimate the number of rows each stage in a query might
|
||||
return. This is a significant part of the planning / optimizing process,
|
||||
This chapter builds on the material covered in <xref
|
||||
linkend="using-explain"> and <xref linkend="planner-stats"> to show some
|
||||
additional details about how the planner uses the
|
||||
system statistics to estimate the number of rows each part of a query might
|
||||
return. This is a significant part of the planning process,
|
||||
providing much of the raw material for cost calculation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The intent of this chapter is not to document the code —
|
||||
better done in the code itself, but to present an overview of how it works.
|
||||
This will perhaps ease the learning curve for someone who subsequently
|
||||
wishes to read the code. As a consequence, the approach chosen is to analyze
|
||||
a series of incrementally more complex examples.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The outputs and algorithms shown below are taken from version 8.0.
|
||||
The behavior of earlier (or later) versions might vary.
|
||||
The intent of this chapter is not to document the code in detail,
|
||||
but to present an overview of how it works.
|
||||
This will perhaps ease the learning curve for someone who subsequently
|
||||
wishes to read the code.
|
||||
</para>
|
||||
|
||||
<sect1 id="row-estimation-examples">
|
||||
@ -33,160 +28,163 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
Using examples drawn from the regression test database, let's start with a
|
||||
very simple query:
|
||||
The examples shown below use tables in the <productname>PostgreSQL</>
|
||||
regression test database.
|
||||
The outputs shown are taken from version 8.3.
|
||||
The behavior of earlier (or later) versions might vary.
|
||||
Note also that since <command>ANALYZE</> uses random sampling
|
||||
while producing statistics, the results will change slightly after
|
||||
any new <command>ANALYZE</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Let's start with a very simple query:
|
||||
|
||||
<programlisting>
|
||||
EXPLAIN SELECT * FROM tenk1;
|
||||
|
||||
QUERY PLAN
|
||||
-------------------------------------------------------------
|
||||
Seq Scan on tenk1 (cost=0.00..445.00 rows=10000 width=244)
|
||||
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
|
||||
</programlisting>
|
||||
|
||||
How the planner determines the cardinality of <classname>tenk1</classname>
|
||||
is covered in <xref linkend="using-explain">, but is repeated here for
|
||||
completeness. The number of rows is looked up from
|
||||
<classname>pg_class</classname>:
|
||||
|
||||
How the planner determines the cardinality of <structname>tenk1</structname>
|
||||
is covered in <xref linkend="planner-stats">, but is repeated here for
|
||||
completeness. The number of pages and rows is looked up in
|
||||
<structname>pg_class</structname>:
|
||||
|
||||
<programlisting>
|
||||
SELECT reltuples, relpages FROM pg_class WHERE relname = 'tenk1';
|
||||
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
|
||||
|
||||
relpages | reltuples
|
||||
----------+-----------
|
||||
345 | 10000
|
||||
</programlisting>
|
||||
The planner will check the <structfield>relpages</structfield>
|
||||
estimate (this is a cheap operation) and if incorrect might scale
|
||||
<structfield>reltuples</structfield> to obtain a row estimate. In this
|
||||
case it does not, thus:
|
||||
|
||||
<programlisting>
|
||||
rows = 10000
|
||||
358 | 10000
|
||||
</programlisting>
|
||||
|
||||
These numbers are current as of the last <command>VACUUM</> or
|
||||
<command>ANALYZE</> on the table. The planner then fetches the
|
||||
actual current number of pages in the table (this is a cheap operation,
|
||||
not requiring a table scan). If that is different from
|
||||
<structfield>relpages</structfield> then
|
||||
<structfield>reltuples</structfield> is scaled accordingly to
|
||||
arrive at a current number-of-rows estimate. In this case the values
|
||||
are correct so the rows estimate is the same as
|
||||
<structfield>reltuples</structfield>.
|
||||
</para>
|
||||
|
||||
|
||||
<para>
|
||||
let's move on to an example with a range condition in its
|
||||
Let's move on to an example with a range condition in its
|
||||
<literal>WHERE</literal> clause:
|
||||
|
||||
<programlisting>
|
||||
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000;
|
||||
|
||||
QUERY PLAN
|
||||
------------------------------------------------------------
|
||||
Seq Scan on tenk1 (cost=0.00..470.00 rows=1031 width=244)
|
||||
Filter: (unique1 < 1000)
|
||||
QUERY PLAN
|
||||
--------------------------------------------------------------------------------
|
||||
Bitmap Heap Scan on tenk1 (cost=24.06..394.64 rows=1007 width=244)
|
||||
Recheck Cond: (unique1 < 1000)
|
||||
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0)
|
||||
Index Cond: (unique1 < 1000)
|
||||
</programlisting>
|
||||
|
||||
The planner examines the <literal>WHERE</literal> clause condition:
|
||||
|
||||
<programlisting>
|
||||
unique1 < 1000
|
||||
</programlisting>
|
||||
|
||||
and looks up the restriction function for the operator
|
||||
<literal><</literal> in <classname>pg_operator</classname>.
|
||||
This is held in the column <structfield>oprrest</structfield>,
|
||||
and the result in this case is <function>scalarltsel</function>.
|
||||
The <function>scalarltsel</function> function retrieves the histogram for
|
||||
<structfield>unique1</structfield> from <classname>pg_statistics</classname>
|
||||
- we can follow this by using the simpler <classname>pg_stats</classname>
|
||||
The planner examines the <literal>WHERE</literal> clause condition
|
||||
and looks up the selectivity function for the operator
|
||||
<literal><</literal> in <structname>pg_operator</structname>.
|
||||
This is held in the column <structfield>oprrest</structfield>,
|
||||
and the entry in this case is <function>scalarltsel</function>.
|
||||
The <function>scalarltsel</function> function retrieves the histogram for
|
||||
<structfield>unique1</structfield> from
|
||||
<structname>pg_statistics</structname>. For manual queries it is more
|
||||
convenient to look in the simpler <structname>pg_stats</structname>
|
||||
view:
|
||||
|
||||
<programlisting>
|
||||
SELECT histogram_bounds FROM pg_stats
|
||||
SELECT histogram_bounds FROM pg_stats
|
||||
WHERE tablename='tenk1' AND attname='unique1';
|
||||
|
||||
histogram_bounds
|
||||
------------------------------------------------------
|
||||
{1,970,1943,2958,3971,5069,6028,7007,7919,8982,9995}
|
||||
{0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
|
||||
</programlisting>
|
||||
|
||||
Next the fraction of the histogram occupied by <quote>< 1000</quote>
|
||||
is worked out. This is the selectivity. The histogram divides the range
|
||||
into equal frequency buckets, so all we have to do is locate the bucket
|
||||
that our value is in and count <emphasis>part</emphasis> of it and
|
||||
<emphasis>all</emphasis> of the ones before. The value 1000 is clearly in
|
||||
the second (970 - 1943) bucket, so by assuming a linear distribution of
|
||||
values inside each bucket we can calculate the selectivity as:
|
||||
Next the fraction of the histogram occupied by <quote>< 1000</quote>
|
||||
is worked out. This is the selectivity. The histogram divides the range
|
||||
into equal frequency buckets, so all we have to do is locate the bucket
|
||||
that our value is in and count <emphasis>part</emphasis> of it and
|
||||
<emphasis>all</emphasis> of the ones before. The value 1000 is clearly in
|
||||
the second bucket (993-1997). Assuming a linear distribution of
|
||||
values inside each bucket, we can calculate the selectivity as:
|
||||
|
||||
<programlisting>
|
||||
selectivity = (1 + (1000 - bckt[2].min)/(bckt[2].max - bckt[2].min))/num_bckts
|
||||
= (1 + (1000 - 970)/(1943 - 970))/10
|
||||
= 0.1031
|
||||
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
|
||||
= (1 + (1000 - 993)/(1997 - 993))/10
|
||||
= 0.100697
|
||||
</programlisting>
|
||||
|
||||
that is, one whole bucket plus a linear fraction of the second, divided by
|
||||
the number of buckets. The estimated number of rows can now be calculated as
|
||||
the product of the selectivity and the cardinality of
|
||||
<classname>tenk1</classname>:
|
||||
the product of the selectivity and the cardinality of
|
||||
<structname>tenk1</structname>:
|
||||
|
||||
<programlisting>
|
||||
rows = rel_cardinality * selectivity
|
||||
= 10000 * 0.1031
|
||||
= 1031
|
||||
= 10000 * 0.100697
|
||||
= 1007 (rounding off)
|
||||
</programlisting>
|
||||
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Next let's consider an example with equality condition in its
|
||||
Next let's consider an example with an equality condition in its
|
||||
<literal>WHERE</literal> clause:
|
||||
|
||||
<programlisting>
|
||||
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'ATAAAA';
|
||||
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';
|
||||
|
||||
QUERY PLAN
|
||||
----------------------------------------------------------
|
||||
Seq Scan on tenk1 (cost=0.00..470.00 rows=31 width=244)
|
||||
Filter: (stringu1 = 'ATAAAA'::name)
|
||||
Seq Scan on tenk1 (cost=0.00..483.00 rows=30 width=244)
|
||||
Filter: (stringu1 = 'CRAAAA'::name)
|
||||
</programlisting>
|
||||
|
||||
Again the planner examines the <literal>WHERE</literal> clause condition:
|
||||
Again the planner examines the <literal>WHERE</literal> clause condition
|
||||
and looks up the selectivity function for <literal>=</literal>, which is
|
||||
<function>eqsel</function>. For equality estimation the histogram is
|
||||
not useful; instead the list of <firstterm>most
|
||||
common values</> (<acronym>MCV</acronym>s) is used to determine the
|
||||
selectivity. Let's have a look at the MCVs, with some additional columns
|
||||
that will be useful later:
|
||||
|
||||
<programlisting>
|
||||
stringu1 = 'ATAAAA'
|
||||
</programlisting>
|
||||
|
||||
and looks up the restriction function for <literal>=</literal>, which is
|
||||
<function>eqsel</function>. This case is a bit different, as the most
|
||||
common values — <acronym>MCV</acronym>s, are used to determine the
|
||||
selectivity. Let's have a look at these, with some extra columns that will
|
||||
be useful later:
|
||||
|
||||
<programlisting>
|
||||
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
|
||||
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
|
||||
WHERE tablename='tenk1' AND attname='stringu1';
|
||||
|
||||
null_frac | 0
|
||||
n_distinct | 672
|
||||
most_common_vals | {FDAAAA,NHAAAA,ATAAAA,BGAAAA,EBAAAA,MOAAAA,NDAAAA,OWAAAA,BHAAAA,BJAAAA}
|
||||
most_common_freqs | {0.00333333,0.00333333,0.003,0.003,0.003,0.003,0.003,0.003,0.00266667,0.00266667}
|
||||
n_distinct | 676
|
||||
most_common_vals | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,JOAAAA,MCAAAA,NAAAAA,WGAAAA}
|
||||
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,0.003,0.003,0.003,0.003}
|
||||
|
||||
</programlisting>
|
||||
|
||||
The selectivity is merely the most common frequency (<acronym>MCF</acronym>)
|
||||
corresponding to the third <acronym>MCV</acronym> — 'ATAAAA':
|
||||
Since <literal>CRAAAA</> appears in the list of MCVs, the selectivity is
|
||||
merely the corresponding entry in the list of most common frequencies
|
||||
(<acronym>MCF</acronym>s):
|
||||
|
||||
<programlisting>
|
||||
selectivity = mcf[3]
|
||||
= 0.003
|
||||
</programlisting>
|
||||
|
||||
The estimated number of rows is just the product of this with the
|
||||
cardinality of <classname>tenk1</classname> as before:
|
||||
As before, the estimated number of rows is just the product of this with the
|
||||
cardinality of <structname>tenk1</structname>:
|
||||
|
||||
<programlisting>
|
||||
rows = 10000 * 0.003
|
||||
= 30
|
||||
</programlisting>
|
||||
|
||||
The number displayed by <command>EXPLAIN</command> is one more than this,
|
||||
due to some post estimation checks.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Now consider the same query, but with a constant that is not in the
|
||||
Now consider the same query, but with a constant that is not in the
|
||||
<acronym>MCV</acronym> list:
|
||||
|
||||
<programlisting>
|
||||
@ -194,116 +192,197 @@ EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';
|
||||
|
||||
QUERY PLAN
|
||||
----------------------------------------------------------
|
||||
Seq Scan on tenk1 (cost=0.00..470.00 rows=15 width=244)
|
||||
Seq Scan on tenk1 (cost=0.00..483.00 rows=15 width=244)
|
||||
Filter: (stringu1 = 'xxx'::name)
|
||||
</programlisting>
|
||||
|
||||
This is quite a different problem, how to estimate the selectivity when the
|
||||
value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
|
||||
The approach is to use the fact that the value is not in the list,
|
||||
This is quite a different problem: how to estimate the selectivity when the
|
||||
value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
|
||||
The approach is to use the fact that the value is not in the list,
|
||||
combined with the knowledge of the frequencies for all of the
|
||||
<acronym>MCV</acronym>s:
|
||||
|
||||
<programlisting>
|
||||
selectivity = (1 - sum(mvf))/(num_distinct - num_mcv)
|
||||
= (1 - (0.00333333 + 0.00333333 + 0.003 + 0.003 + 0.003
|
||||
+ 0.003 + 0.003 + 0.003 + 0.00266667 + 0.00266667))/(672 - 10)
|
||||
= 0.001465
|
||||
= (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 +
|
||||
0.003 + 0.003 + 0.003 + 0.003))/(676 - 10)
|
||||
= 0.0014559
|
||||
</programlisting>
|
||||
|
||||
That is, add up all the frequencies for the <acronym>MCV</acronym>s and
|
||||
subtract them from one — because it is <emphasis>not</emphasis> one
|
||||
of these, and divide by the <emphasis>remaining</emphasis> distinct values.
|
||||
Notice that there are no null values so we don't have to worry about those.
|
||||
The estimated number of rows is calculated as usual:
|
||||
That is, add up all the frequencies for the <acronym>MCV</acronym>s and
|
||||
subtract them from one, then
|
||||
divide by the number of <emphasis>other</emphasis> distinct values.
|
||||
This amounts to assuming that the fraction of the column that is not any
|
||||
of the MCVs is evenly distributed among all the other distinct values.
|
||||
Notice that there are no null values so we don't have to worry about those
|
||||
(otherwise we'd subtract the null fraction from the numerator as well).
|
||||
The estimated number of rows is then calculated as usual:
|
||||
|
||||
<programlisting>
|
||||
rows = 10000 * 0.001465
|
||||
= 15
|
||||
rows = 10000 * 0.0014559
|
||||
= 15 (rounding off)
|
||||
</programlisting>
|
||||
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Let's increase the complexity to consider a case with more than one
|
||||
The previous example with <literal>unique1 < 1000</> was an
|
||||
oversimplification of what <function>scalarltsel</function> really does;
|
||||
now that we have seen an example of the use of MCVs, we can fill in some
|
||||
more detail. The example was correct as far as it went, because since
|
||||
<structfield>unique1</> is a unique column it has no MCVs (obviously, no
|
||||
value is any more common than any other value). For a non-unique
|
||||
column, there will normally be both a histogram and an MCV list, and
|
||||
<emphasis>the histogram does not include the portion of the column
|
||||
population represented by the MCVs</>. We do things this way because
|
||||
it allows more precise estimation. In this situation
|
||||
<function>scalarltsel</function> directly applies the condition (e.g.,
|
||||
<quote>< 1000</>) to each value of the MCV list, and adds up the
|
||||
frequencies of the MCVs for which the condition is true. This gives
|
||||
an exact estimate of the selectivity within the portion of the table
|
||||
that is MCVs. The histogram is then used in the same way as above
|
||||
to estimate the selectivity in the portion of the table that is not
|
||||
MCVs, and then the two numbers are combined to estimate the overall
|
||||
selectivity. For example, consider
|
||||
|
||||
<programlisting>
|
||||
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 < 'IAAAAA';
|
||||
|
||||
QUERY PLAN
|
||||
------------------------------------------------------------
|
||||
Seq Scan on tenk1 (cost=0.00..483.00 rows=3077 width=244)
|
||||
Filter: (stringu1 < 'IAAAAA'::name)
|
||||
</programlisting>
|
||||
|
||||
We already saw the MCV information for <structfield>stringu1</>,
|
||||
and here is its histogram:
|
||||
|
||||
<programlisting>
|
||||
SELECT histogram_bounds FROM pg_stats
|
||||
WHERE tablename='tenk1' AND attname='stringu1';
|
||||
|
||||
histogram_bounds
|
||||
--------------------------------------------------------------------------------
|
||||
{AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,XLAAAA,ZZAAAA}
|
||||
</programlisting>
|
||||
|
||||
Checking the MCV list, we find that the condition <literal>stringu1 <
|
||||
'IAAAAA'</> is satisfied by the first six entries and not the last four,
|
||||
so the selectivity within the MCV part of the population is
|
||||
|
||||
<programlisting>
|
||||
selectivity = sum(relevant mvfs)
|
||||
= 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003
|
||||
= 0.01833333
|
||||
</programlisting>
|
||||
|
||||
Summing all the MCFs also tells us that the total fraction of the
|
||||
population represented by MCVs is 0.03033333, and therefore the
|
||||
fraction represented by the histogram is 0.96966667 (again, there
|
||||
are no nulls, else we'd have to exclude them here). We can see
|
||||
that the value <literal>IAAAAA</> falls nearly at the end of the
|
||||
third histogram bucket. Using some rather cheesy assumptions
|
||||
about the frequency of different characters, the planner arrives
|
||||
at the estimate 0.298387 for the portion of the histogram population
|
||||
that is less than <literal>IAAAAA</>. We then combine the estimates
|
||||
for the MCV and non-MCV populations:
|
||||
|
||||
<programlisting>
|
||||
selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
|
||||
= 0.01833333 + 0.298387 * 0.96966667
|
||||
= 0.307669
|
||||
|
||||
rows = 10000 * 0.307669
|
||||
= 3077 (rounding off)
|
||||
</programlisting>
|
||||
|
||||
In this particular example, the correction from the MCV list is fairly
|
||||
small, because the column distribution is actually quite flat (the
|
||||
statistics showing these particular values as being more common than
|
||||
others are mostly due to sampling error). In a more typical case where
|
||||
some values are significantly more common than others, this complicated
|
||||
process gives a useful improvement in accuracy because the selectivity
|
||||
for the most common values is found exactly.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Now let's consider a case with more than one
|
||||
condition in the <literal>WHERE</literal> clause:
|
||||
|
||||
<programlisting>
|
||||
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000 AND stringu1 = 'xxx';
|
||||
|
||||
QUERY PLAN
|
||||
-----------------------------------------------------------
|
||||
Seq Scan on tenk1 (cost=0.00..495.00 rows=2 width=244)
|
||||
Filter: ((unique1 < 1000) AND (stringu1 = 'xxx'::name))
|
||||
QUERY PLAN
|
||||
--------------------------------------------------------------------------------
|
||||
Bitmap Heap Scan on tenk1 (cost=23.80..396.91 rows=1 width=244)
|
||||
Recheck Cond: (unique1 < 1000)
|
||||
Filter: (stringu1 = 'xxx'::name)
|
||||
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0)
|
||||
Index Cond: (unique1 < 1000)
|
||||
</programlisting>
|
||||
|
||||
An assumption of independence is made and the selectivities of the
|
||||
individual restrictions are multiplied together:
|
||||
The planner assumes that the two conditions are independent, so that
|
||||
the individual selectivities of the clauses can be multiplied together:
|
||||
|
||||
<programlisting>
|
||||
selectivity = selectivity(unique1 < 1000) * selectivity(stringu1 = 'xxx')
|
||||
= 0.1031 * 0.001465
|
||||
= 0.00015104
|
||||
= 0.100697 * 0.0014559
|
||||
= 0.0001466
|
||||
|
||||
rows = 10000 * 0.0001466
|
||||
= 1 (rounding off)
|
||||
</programlisting>
|
||||
|
||||
The row estimates are calculated as before:
|
||||
|
||||
<programlisting>
|
||||
rows = 10000 * 0.00015104
|
||||
= 2
|
||||
</programlisting>
|
||||
Notice that the number of rows estimated to be returned from the bitmap
|
||||
index scan reflects only the condition used with the index; this is
|
||||
important since it affects the cost estimate for the subsequent heap
|
||||
fetches.
|
||||
</para>
|
||||
|
||||
|
||||
<para>
|
||||
Finally we will examine a query that includes a <literal>JOIN</literal>
|
||||
together with a <literal>WHERE</literal> clause:
|
||||
Finally we will examine a query that involves a join:
|
||||
|
||||
<programlisting>
|
||||
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
|
||||
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
|
||||
WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2;
|
||||
|
||||
QUERY PLAN
|
||||
-----------------------------------------------------------------------------------------
|
||||
Nested Loop (cost=0.00..346.90 rows=51 width=488)
|
||||
-> Index Scan using tenk1_unique1 on tenk1 t1 (cost=0.00..192.57 rows=51 width=244)
|
||||
Index Cond: (unique1 < 50)
|
||||
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.00..3.01 rows=1 width=244)
|
||||
Index Cond: ("outer".unique2 = t2.unique2)
|
||||
--------------------------------------------------------------------------------------
|
||||
Nested Loop (cost=4.64..456.23 rows=50 width=488)
|
||||
-> Bitmap Heap Scan on tenk1 t1 (cost=4.64..142.17 rows=50 width=244)
|
||||
Recheck Cond: (unique1 < 50)
|
||||
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.63 rows=50 width=0)
|
||||
Index Cond: (unique1 < 50)
|
||||
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.00..6.27 rows=1 width=244)
|
||||
Index Cond: (t2.unique2 = t1.unique2)
|
||||
</programlisting>
|
||||
|
||||
The restriction on <classname>tenk1</classname>
|
||||
<quote>unique1 < 50</quote> is evaluated before the nested-loop join.
|
||||
This is handled analogously to the previous range example. The restriction
|
||||
operator for <literal><</literal> is <function>scalarlteqsel</function>
|
||||
as before, but this time the value 50 is in the first bucket of the
|
||||
The restriction on <structname>tenk1</structname>,
|
||||
<literal>unique1 < 50</literal>,
|
||||
is evaluated before the nested-loop join.
|
||||
This is handled analogously to the previous range example. This time the
|
||||
value 50 falls into the first bucket of the
|
||||
<structfield>unique1</structfield> histogram:
|
||||
|
||||
<programlisting>
|
||||
selectivity = (0 + (50 - bckt[1].min)/(bckt[1].max - bckt[1].min))/num_bckts
|
||||
= (0 + (50 - 1)/(970 - 1))/10
|
||||
= 0.005057
|
||||
selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets
|
||||
= (0 + (50 - 0)/(993 - 0))/10
|
||||
= 0.005035
|
||||
|
||||
rows = 10000 * 0.005057
|
||||
= 51
|
||||
rows = 10000 * 0.005035
|
||||
= 50 (rounding off)
|
||||
</programlisting>
|
||||
|
||||
The restriction for the join is:
|
||||
The restriction for the join is <literal>t2.unique2 = t1.unique2</>.
|
||||
The operator is just
|
||||
our familiar <literal>=</literal>, however the selectivity function is
|
||||
obtained from the <structfield>oprjoin</structfield> column of
|
||||
<structname>pg_operator</structname>, and is <function>eqjoinsel</function>.
|
||||
<function>eqjoinsel</function> looks up the statistical information for both
|
||||
<structname>tenk2</structname> and <structname>tenk1</structname>:
|
||||
|
||||
<programlisting>
|
||||
t2.unique2 = t1.unique2
|
||||
</programlisting>
|
||||
|
||||
This is due to the join method being nested-loop, with
|
||||
<classname>tenk1</classname> being in the outer loop. The operator is just
|
||||
our familiar <literal>=</literal>, however the restriction function is
|
||||
obtained from the <structfield>oprjoin</structfield> column of
|
||||
<classname>pg_operator</classname> - and is <function>eqjoinsel</function>.
|
||||
Additionally we use the statistical information for both
|
||||
<classname>tenk2</classname> and <classname>tenk1</classname>:
|
||||
|
||||
<programlisting>
|
||||
SELECT tablename, null_frac,n_distinct, most_common_vals FROM pg_stats
|
||||
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';
|
||||
SELECT tablename, null_frac,n_distinct, most_common_vals FROM pg_stats
|
||||
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';
|
||||
|
||||
tablename | null_frac | n_distinct | most_common_vals
|
||||
-----------+-----------+------------+------------------
|
||||
@ -311,41 +390,62 @@ tablename | null_frac | n_distinct | most_common_vals
|
||||
tenk2 | 0 | -1 |
|
||||
</programlisting>
|
||||
|
||||
In this case there is no <acronym>MCV</acronym> information for
|
||||
<structfield>unique2</structfield> because all the values appear to be
|
||||
unique, so we can use an algorithm that relies only on the number of
|
||||
In this case there is no <acronym>MCV</acronym> information for
|
||||
<structfield>unique2</structfield> because all the values appear to be
|
||||
unique, so we use an algorithm that relies only on the number of
|
||||
distinct values for both relations together with their null fractions:
|
||||
|
||||
<programlisting>
|
||||
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
|
||||
= (1 - 0) * (1 - 0) * min(1/10000, 1/1000)
|
||||
= (1 - 0) * (1 - 0) / max(10000, 10000)
|
||||
= 0.0001
|
||||
</programlisting>
|
||||
|
||||
This is, subtract the null fraction from one for each of the relations,
|
||||
and divide by the maximum of the two distinct values. The number of rows
|
||||
that the join is likely to emit is calculated as the cardinality of
|
||||
Cartesian product of the two nodes in the nested-loop, multiplied by the
|
||||
This is, subtract the null fraction from one for each of the relations,
|
||||
and divide by the maximum of the numbers of distinct values.
|
||||
The number of rows
|
||||
that the join is likely to emit is calculated as the cardinality of the
|
||||
Cartesian product of the two inputs, multiplied by the
|
||||
selectivity:
|
||||
|
||||
<programlisting>
|
||||
rows = (outer_cardinality * inner_cardinality) * selectivity
|
||||
= (51 * 10000) * 0.0001
|
||||
= 51
|
||||
= (50 * 10000) * 0.0001
|
||||
= 50
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For those interested in further details, estimation of the number of rows in
|
||||
a relation is covered in
|
||||
<filename>src/backend/optimizer/util/plancat.c</filename>. The calculation
|
||||
logic for clause selectivities is in
|
||||
<filename>src/backend/optimizer/path/clausesel.c</filename>. The actual
|
||||
implementations of the operator and join restriction functions can be found
|
||||
Had there been MCV lists for the two columns,
|
||||
<function>eqjoinsel</function> would have used direct comparison of the MCV
|
||||
lists to determine the join selectivity within the part of the column
|
||||
populations represented by the MCVs. The estimate for the remainder of the
|
||||
populations follows the same approach shown here.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Notice that we showed <literal>inner_cardinality</> as 10000, that is,
|
||||
the unmodified size of <structname>tenk2</>. It might appear from
|
||||
inspection of the <command>EXPLAIN</> output that the estimate of
|
||||
join rows comes from 50 * 1, that is, the number of outer rows times
|
||||
the estimated number of rows obtained by each inner indexscan on
|
||||
<structname>tenk2</>. But this is not the case: the join relation size
|
||||
is estimated before any particular join plan has been considered. If
|
||||
everything is working well then the two ways of estimating the join
|
||||
size will produce about the same answer, but due to roundoff error and
|
||||
other factors they sometimes diverge significantly.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For those interested in further details, estimation of the size of
|
||||
a table (before any <literal>WHERE</> clauses) is done in
|
||||
<filename>src/backend/optimizer/util/plancat.c</filename>. The generic
|
||||
logic for clause selectivities is in
|
||||
<filename>src/backend/optimizer/path/clausesel.c</filename>. The
|
||||
operator-specific selectivity functions are mostly found
|
||||
in <filename>src/backend/utils/adt/selfuncs.c</filename>.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
|
||||
</chapter>
|
||||
|
Loading…
Reference in New Issue
Block a user