ITS#9270 Additional information on indexing

Ondřej Kuzník 2021-03-02 16:22:39 +00:00 committed by Quanah Gibson-Mount
parent 05b1b4688c
commit bab02ea40b


@ -64,9 +64,20 @@ If the filter term has not been indexed, then the search must read every single
entry in the target scope and test to see if each entry matches the filter.
Obviously indexing can save a lot of work when it's used correctly.
In back-mdb, an index can only track a limited number of entries per key (by
default 2^16 = 65536). If more entries' values hash to the same key, some or
all of them have to be represented by a range of candidates. This makes the
index less useful over time, because deletions usually cannot be tracked
accurately.
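The effect of the per-key limit can be illustrated with a toy model. This is only a sketch under stated assumptions, not back-mdb's actual data structure: the class name {{IdList}}, its methods, and the rule of collapsing the whole list into a single range are hypothetical simplifications.

```python
MAX_IDS = 2 ** 16  # default per-key limit quoted in the text above


class IdList:
    """Toy per-key candidate list (hypothetical, not back-mdb code)."""

    def __init__(self, limit=MAX_IDS):
        self.limit = limit
        self.ids = set()    # exact entry IDs while under the limit
        self.range = None   # (lowest, highest) once collapsed

    def add(self, entry_id):
        if self.range is not None:
            lo, hi = self.range
            self.range = (min(lo, entry_id), max(hi, entry_id))
            return
        self.ids.add(entry_id)
        if len(self.ids) > self.limit:
            # Too many entries for this key: keep only the bounds.
            self.range = (min(self.ids), max(self.ids))
            self.ids = set()

    def delete(self, entry_id):
        if self.range is None:
            self.ids.discard(entry_id)
        # In range form we no longer know which IDs inside the range
        # carry the value, so the range cannot shrink.

    def candidates(self):
        if self.range is None:
            return sorted(self.ids)
        lo, hi = self.range
        return list(range(lo, hi + 1))
```

With a limit of 3, adding IDs 1, 5, 9 and 20 collapses the list into the range 1 to 20. Every ID in that range then becomes a candidate that the search has to read and test, and deleting entry 5 does not reduce the candidate set.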
H3: What to index
As a general rule, to make any use of indexes, you must set up an equality
index on objectClass:
> index objectClass eq
Then you should create indices to match the actual filter terms used in
search queries.
> index cn,sn,givenname,mail eq
@ -86,7 +97,8 @@ all of those entries are going to be read anyway, because they are valid
members of the result set. In a subtree where 100% of the
entries are going to contain the same attributes, the presence index does
absolutely NOTHING to benefit the search, because 100% of the entries match
that presence filter. As an example, setting a presence index on objectClass
provides no benefit since it is present on every entry.
So the resource cost of generating the index is a
complete waste of CPU time, disk, and memory. Don't do it unless you know
@ -101,6 +113,32 @@ not be done, it's just wasted overhead.
See the {{Logging}} section below for what to watch out for if a frequently
searched attribute is unindexed.
H3: Equality indexing
As with presence indexes, equality indexes are most useful if the
values searched for are uncommon. Most OpenLDAP indexes work by hashing
the normalised value and using the hash as the key. Hashing behaviour
depends on the matching rule syntax. Some matching rules also implement
indexers that help speed up inequality (less than, ...) queries.
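The idea of hashing a normalised value into an index key can be sketched as follows. This is a minimal illustration, not OpenLDAP's implementation: the normalisation (trim, collapse spaces, lowercase), the use of SHA-1, the 4-byte key width, and all function names are assumptions made for the example.

```python
import hashlib

index = {}  # hash key -> set of candidate entry IDs


def normalise(value):
    # Assumed stand-in for a case-insensitive matching rule.
    return " ".join(value.split()).lower()


def index_key(attr, value):
    # Hash the attribute name together with the normalised value and
    # keep a short prefix of the digest as the key (assumed width).
    data = f"{attr}={normalise(value)}".encode("utf-8")
    return hashlib.sha1(data).digest()[:4]


def add_entry(entry_id, attr, value):
    index.setdefault(index_key(attr, value), set()).add(entry_id)


def lookup(attr, value):
    # Returns candidates only: different values may share a hash, so
    # each candidate still has to be tested against the filter.
    return index.get(index_key(attr, value), set())
```

Values that normalise to the same string, such as "Babs Jensen" and "  babs   JENSEN ", land under the same key, so one lookup finds both entries. The less common a value is, the smaller the candidate set that lookup returns.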
Check the documentation and other parts of this guide to see whether some
indexes are mandatory. For example, to enable replication you are expected to
index certain operational attributes; the same applies if you rely on filters
in ACL processing.
Approximate indexes are usually identical to equality indexes unless
the matching rule explicitly implements approximate matching. As of
OpenLDAP 2.5, only the directoryStringApproxMatch and IA5StringApproxMatch
matchers and indexers are implemented. They currently use soundex or
metaphone, with metaphone being the default.
H3: Substring indexing
Substring indexes work by splitting the value into short chunks and then
indexing those in a similar way to how an equality index does. The storage
space needed for all of this data is comparable to the amount of data
being indexed, which makes these indexes extremely heavy-handed in most
scenarios.
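To give a feel for why substring indexes grow so quickly, the chunking can be sketched like this. It is a rough illustration only: the function name, the chunk lengths, and the exact chunking rule are assumptions and are not taken from the slapd source.

```python
def substring_chunks(value, min_len=2, max_len=4, chunk_len=4):
    """Return the set of (kind, chunk) pairs a value would be indexed under.

    All lengths are illustrative assumptions, not slapd defaults.
    """
    chunks = set()
    # Prefixes and suffixes serve "abc*" and "*abc" style filters.
    for n in range(min_len, min(max_len, len(value)) + 1):
        chunks.add(("initial", value[:n]))
        chunks.add(("final", value[-n:]))
    # Fixed-length windows serve "*abc*" style filters.
    for i in range(len(value) - chunk_len + 1):
        chunks.add(("any", value[i:i + chunk_len]))
    return chunks
```

Under these assumptions the six-character value "jensen" produces nine chunks, each of which would be hashed and stored as its own key, where an equality index would store a single key for the same value.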
H2: Logging