If you're searching on a filter that has been indexed, then the search reads
the index and pulls exactly the entries that are referenced by the index.
If the filter term has not been indexed, then the search must read every single
entry in the target scope and test to see if each entry matches the filter.
Obviously indexing can save a lot of work when it's used correctly.
H3: What to index
You should create indices to match the actual filter terms used in
search queries.
> index cn,sn,givenname,mail eq
Each attribute index can be tuned further by selecting the set of index types to generate. For example, substring and approximate search for organizations (o) may make little sense (and isn't likely done very often). And searching for {{userPassword}} likely makes no sense whatsoever.
General rule: don't go overboard with indexes. Unused indexes must be maintained and hence can only slow things down.
See {{slapd.conf}}(8) and {{slapindex}}(8) for more information.
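One way to find the filter terms that are actually used is to count the attributes that appear in logged search filters. The following is a minimal sketch, not part of OpenLDAP: the function name count_filter_attrs is hypothetical, and it assumes the slapd log contains search lines with a filter="..." field, as written at the stats log level. It does not handle extensible-match filters.

```shell
# Count how often each attribute appears in logged search filters.
# Reads slapd log lines on stdin; lines without filter="..." are ignored.
# Steps: isolate the filter, pull out each "(attr<op>" prefix such as
# "(uid=" or "(cn~=", strip the parenthesis and operator, fold the
# attribute name to lower case, then count and sort by frequency.
count_filter_attrs() {
    grep -o 'filter="[^"]*"' |
        grep -oE '\([A-Za-z][A-Za-z0-9;-]*(=|~=|>=|<=)' |
        sed -E 's/^\(//; s/(~=|>=|<=|=)$//' |
        tr 'A-Z' 'a-z' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}
```

Feeding the log file to the function on standard input prints one "count attribute" line per attribute, most frequent first. Attributes near the top of the list are candidates for indexing; attributes that never appear probably do not need an index.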
H3: Presence indexing
If your client application uses presence filters and if the
target attribute exists on the majority of entries in your target scope, then
all of those entries are going to be read anyway, because they are valid
members of the result set. In a subtree where 100% of the
entries are going to contain the same attributes, the presence index does
absolutely NOTHING to benefit the search, because 100% of the entries match
that presence filter.
So the resource cost of generating the index is a
complete waste of CPU time, disk, and memory. Don't do it unless you know
that it will be used, and that the attribute in question occurs very
infrequently in the target data.
Almost no applications use presence filters in their search queries. Presence
indexing is pointless when the target attribute exists on the majority of
entries in the database. In most LDAP deployments, presence indexing should
not be done; it's just wasted overhead.
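To check how frequently an attribute actually occurs before deciding on a presence index, one option is to measure it against an LDIF export. This is a rough sketch under stated assumptions: the function name attr_coverage is hypothetical, the input is plain LDIF on standard input (for example the output of slapcat), and every entry begins with a "dn:" line.

```shell
# Report how many LDIF entries (read from stdin) carry the given attribute.
# Usage: attr_coverage ATTRIBUTE < export.ldif
attr_coverage() {
    awk -v attr="$1" '
        BEGIN { want = tolower(attr) }
        # Each entry starts with its dn line.
        /^dn:/ { total++; seen = 0; next }
        {
            # Attribute name is everything before the first ":" or ";".
            name = tolower($0)
            sub(/[:;].*/, "", name)
            # Count an entry once even if the attribute is multi-valued.
            if (name == want && !seen) { hits++; seen = 1 }
        }
        END {
            if (total) printf "%d of %d entries (%.0f%%)\n", hits, total, 100 * hits / total
        }
    '
}
```

If the reported share is close to 100%, a presence index on that attribute buys nothing, as described above. Only a small share, combined with clients that really send presence filters, would justify one.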
See the {{Logging}} section below on what to watch out for if you have a frequently searched
(Xref) What are the DB_CONFIG configuration directives?
Just change the set_lg_dir setting to point to your .log directory, or comment out that line.
Quick guide:
- Create a DB_CONFIG file in your LDAP home directory (/var/lib/ldap/DB_CONFIG) with the correct "set_cachesize" value.
- Stop your LDAP server and run db_recover -h /var/lib/ldap
- Start your LDAP server and check the new cache size with:
db_stat -h /var/lib/ldap -m | head -n 2
- This procedure is only needed if you use OpenLDAP 2.2 with the BDB or HDB backends. In OpenLDAP 2.3, DB recovery is performed automatically whenever the DB_CONFIG file is changed or when an unclean shutdown is detected.
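The "set_cachesize" value in the first step takes three numbers: whole gigabytes, additional bytes, and the number of cache segments. The sketch below splits a total byte count into that form. The helper name cachesize_line is hypothetical, and using a single cache segment (the trailing 1) is an assumption, not a recommendation from the guide above.

```shell
# Print a DB_CONFIG set_cachesize line for a cache of the given size in bytes.
# Berkeley DB expects: set_cachesize <gbytes> <bytes> <ncache>
cachesize_line() {
    total=$1
    gib=1073741824   # bytes in one gigabyte (2^30)
    printf 'set_cachesize %d %d 1\n' $((total / gib)) $((total % gib))
}
```

For a 256 MB cache, calling cachesize_line 268435456 prints "set_cachesize 0 268435456 1". That line would go into DB_CONFIG in place of any existing set_cachesize line before running the remaining steps.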
A few questions: if you change the cachesize and idlcachesize entries, do
you have to do anything special aside from restarting slapd, such as running
slapindex or db_recover?
Also, is there any way to tell how much memory these caches are taking up
to make sure they are not set too large? What happens if you set your
cachesize too large and you don't have enough available memory to store
these? Will that cause an issue with OpenLDAP, or will it just not cache
those entries that would make it exceed its available memory? Will it
just use some sort of FIFO on those caches?
It will consume the memory resources of your system, and likely cause issues.
Finally, what do most people try to achieve with these values? Would the
goal be to make these as big as the directory? So, if I have 400,000 DNs
in my directory, would it be safe to set these at 400000 or would
something like 20,000 be good enough to get a nice performance increase?
I try to cache the most actively used entries. Unless you expect all 400,000 entries of your DB to be accessed regularly, there is no need to cache that many entries. My entry cache is set to 20,000 (out of a little over 400,000 entries).
The IDL cache has to do with how many unique result sets of searches you want to store in memory. Setting up this cache will allow your most frequently placed searches to get results much faster, but I doubt you want to try to cache the results of every search that hits your system. ;)