mirror of
https://git.openldap.org/openldap/openldap.git
synced 2024-12-21 03:10:25 +08:00
361 lines
18 KiB
Plaintext
# $OpenLDAP$
# Copyright 2003, The OpenLDAP Foundation, All Rights Reserved.
# COPYING RESTRICTIONS APPLY, see COPYRIGHT.
|
|
|
|
H1: LDAP Sync Replication
|
|
|
|
The LDAP Sync replication engine, syncrepl for short, is a consumer-side
replication engine that enables the consumer LDAP server to maintain
a shadow copy of a DIT fragment. A syncrepl engine resides on the
consumer side as one of the {{slapd}} (8) threads. It creates and
maintains a consumer replica by connecting to the replication provider
to perform the initial DIT content load, followed either by
periodic content polling or by timely updates upon content changes.
|
|
|
|
Syncrepl uses the LDAP Content Synchronization (or LDAP Sync for short)
protocol as the replica synchronization protocol.
|
|
|
|
Syncrepl provides stateful replication which supports both
pull-based and push-based synchronization and does not mandate
the use of a history store.
|
|
|
|
Syncrepl keeps track of the status of the replication content by
maintaining and exchanging synchronization cookies. Because the
syncrepl consumer and provider maintain their content status,
the consumer can poll the provider content to perform incremental
synchronization by asking for the entries required to bring the consumer
replica up to date with the provider content. Syncrepl also enables
convenient management of replicas by maintaining replica status.
The consumer replica can be constructed from a consumer-side or a
provider-side backup at any synchronization status. Syncrepl can
automatically resynchronize the consumer replica with the
current provider content.
|
|
|
|
Syncrepl supports both pull-based and push-based synchronization.
In its basic refreshOnly mode of synchronization,
the provider uses pull-based synchronization where the consumer servers
need not be tracked and no history information is maintained.
The information required for the provider to process periodic polling
requests is contained in the synchronization cookie of the request itself.
To optimize pull-based synchronization, syncrepl utilizes the present
phase of the LDAP Sync protocol as well as its delete phase, instead of
falling back on frequent full reloads. To further optimize pull-based
synchronization, the provider can maintain a per-scope session log
as the history store. In its refreshAndPersist mode of synchronization,
the provider uses push-based synchronization. The provider keeps
track of the consumer servers that have requested the persistent search
and sends them the necessary updates as the provider replication content
is modified.
|
|
|
|
With syncrepl, a consumer server can create a replica without changing
the provider's configuration and without restarting the provider server,
if the consumer server has appropriate access privileges for the
DIT fragment to be replicated. The consumer server can also stop the
replication without any provider-side change or restart.
|
|
|
|
Syncrepl supports both partial and sparse replication.
The shadow DIT fragment is defined by general
search criteria consisting of base, scope, filter, and attribute list.
The replica content is also subject to the access privileges
of the bind identity of the syncrepl replication connection.
|
|
|
|
|
|
H2: The LDAP Content Synchronization Protocol
|
|
|
|
The LDAP Sync protocol allows a client to maintain a synchronized copy
of a DIT fragment. The LDAP Sync operation is defined as a set of
controls and other protocol elements which extend the LDAP search
operation. This section introduces the LDAP Content Sync protocol
only briefly. For more information, refer to the Internet Draft
{{The LDAP Content Synchronization Operation
<draft-zeilenga-ldup-sync-05.txt>}}.
|
|
|
|
The LDAP Sync protocol supports both polling and listening for
changes by defining two respective synchronization operations:
{{refreshOnly}} and {{refreshAndPersist}}.
Polling is implemented by the {{refreshOnly}} operation.
The client copy is synchronized to the server copy at the time of polling.
The server finishes the search operation by returning {{SearchResultDone}},
as in a normal search.
Listening is implemented by the {{refreshAndPersist}} operation.
Instead of finishing the search after returning all entries currently
matching the search criteria, the synchronization search remains
persistent in the server. Subsequent updates to the synchronization content
in the server cause additional entry updates to be sent to the client.
|
|
|
|
The {{refreshOnly}} operation and the refresh stage of the
{{refreshAndPersist}} operation can be performed as
a present phase or a delete phase.
|
|
|
|
In the present phase, the server sends the client the entries updated
within the search scope since the last synchronization. The server sends
all requested attributes, whether changed or not, of the updated entries.
For each unchanged entry which remains in the scope,
the server sends a present message consisting only of the name of the
entry and the synchronization control representing state present.
The present message does not contain any attributes of the entry.
After the client receives all update and present entries,
it can reliably determine the new client copy by adding the entries
added to the server, by replacing the entries modified at the server,
and by deleting entries in the client copy which have neither
been updated nor specified as being present at the server.
|
|
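The present-phase reconciliation described above can be sketched as follows. This is a minimal illustration, not code from {{slapd}} (8): the function name, the dictionary layout keyed by entryUUID, and the sample data are all assumptions made for the example.

```python
from typing import Dict, Iterable, List

# Illustrative entry shape: attribute name -> list of values.
Entry = Dict[str, List[str]]


def apply_present_phase(client: Dict[str, Entry],
                        updated: Dict[str, Entry],
                        present: Iterable[str]) -> Dict[str, Entry]:
    """Return the new client copy, keyed by entryUUID (hypothetical helper)."""
    keep = set(present)
    new_copy: Dict[str, Entry] = {}
    for uuid, entry in client.items():
        # An unchanged entry survives only if the server named it as present.
        if uuid in keep:
            new_copy[uuid] = entry
    # Added or modified entries arrive with all requested attributes,
    # so they are stored as-is, replacing any older version.
    new_copy.update(updated)
    return new_copy


client_copy = {"u1": {"cn": ["a"]}, "u2": {"cn": ["b"]}, "u3": {"cn": ["c"]}}
updates = {"u2": {"cn": ["b2"]}, "u4": {"cn": ["d"]}}
result = apply_present_phase(client_copy, updates, present=["u1"])
```

Here u3 is dropped because it was neither updated nor reported as present, which is exactly how the client infers deletions without receiving an explicit delete message.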
|
|
The transmission of the updated entries in the delete phase is
the same as in the present phase. The server sends all the requested
attributes of the entries updated within the search scope since the
last synchronization to the client. In the delete phase, however,
the server sends a delete message for each entry deleted from the
search scope, instead of sending present messages.
The delete message consists only of the name of the entry
and the synchronization control representing state delete.
The new client copy can be determined by adding, modifying, and
removing entries according to the synchronization control
attached to the {{SearchResultEntry}} message.
|
|
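For comparison, a delete-phase reconciliation might look like the sketch below. As before, the function name and the data layout keyed by entryUUID are assumptions for illustration only.

```python
from typing import Dict, Iterable, List

# Illustrative entry shape: attribute name -> list of values.
Entry = Dict[str, List[str]]


def apply_delete_phase(client: Dict[str, Entry],
                       updated: Dict[str, Entry],
                       deleted: Iterable[str]) -> Dict[str, Entry]:
    """Return the new client copy after a delete phase (hypothetical helper)."""
    new_copy = dict(client)
    for uuid in deleted:
        # Each delete message names one entry to remove; entries that are
        # not mentioned at all are kept untouched.
        new_copy.pop(uuid, None)
    # Added or modified entries replace any older version.
    new_copy.update(updated)
    return new_copy


after_delete = apply_delete_phase(
    {"u1": {"cn": ["a"]}, "u2": {"cn": ["b"]}, "u3": {"cn": ["c"]}},
    updated={"u2": {"cn": ["b2"]}},
    deleted=["u3"],
)
```

Unlike the present phase, unchanged entries need no message at all here, so the traffic is proportional to the number of changes rather than to the size of the replica.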
|
|
If the LDAP Sync server maintains a history store
and can determine which entries are scoped out of the client
copy since the last synchronization time, the server can use
the delete phase. If the server does not maintain any history store,
cannot determine the scoped-out entries from the history store,
or the history store does not cover the outdated synchronization
state of the client, the server should use the present phase.
The use of the present phase is much more efficient than a full
content reload in terms of the synchronization traffic.
To reduce the synchronization traffic further,
the LDAP Sync protocol also provides several optimizations
such as the transmission of the normalized {{EX:entryUUID}}s and the
transmission of multiple {{EX:entryUUID}}s in a single
{{syncIdSet}} message.
|
|
|
|
At the end of the {{refreshOnly}} synchronization,
the server sends a synchronization cookie to the client as a state
indicator of the client copy after the synchronization is completed.
The client will present the received cookie when it requests
the next incremental synchronization from the server.
|
|
|
|
When {{refreshAndPersist}} synchronization is used,
the server sends a synchronization cookie at the end of the
refresh stage by sending a Sync Info message with refreshDone set to TRUE.
It also sends a synchronization cookie by attaching it to
{{SearchResultEntry}} messages generated in the persist stage of the
synchronization search. During the persist stage, the server
can also send a Sync Info message containing the synchronization
cookie at any time the server wants to update the client-side state
indicator. The server also updates the synchronization indicator
of the client at the end of the persist stage.
|
|
|
|
In the LDAP Sync protocol, entries are uniquely identified by
the {{EX:entryUUID}} attribute value. It can function as a reliable
identifier of the entry. The DN of the entry, on the other hand,
can change over time and hence cannot be considered a reliable
identifier. The {{EX:entryUUID}} is attached to each {{SearchResultEntry}}
or {{SearchResultReference}} as a part of the synchronization control.
|
|
|
|
|
|
H2: Syncrepl Details
|
|
|
|
The syncrepl engine utilizes both the {{refreshOnly}} and the
{{refreshAndPersist}} operations of the LDAP Sync protocol.
If a syncrepl specification is included in a database definition,
{{slapd}} (8) launches a syncrepl engine as a {{slapd}} (8) thread
and schedules its execution. If the {{refreshOnly}} operation is
specified, the syncrepl engine is rescheduled to run again after the
configured interval once a synchronization operation is completed.
If the {{refreshAndPersist}} operation is specified, the engine will
remain active and process the persistent synchronization messages
from the provider.
|
|
|
|
The syncrepl engine utilizes both the present phase and the
delete phase of the refresh synchronization. It is possible to
configure a per-scope session log in the provider server
which stores the {{EX:entryUUID}}s and the names of a finite
number of entries deleted from a replication content.
Multiple replicas of a single provider content share the same
per-scope session log. The syncrepl engine uses the delete phase
if the session log is present and the state of the consumer
server is recent enough that no session log entries have been truncated
since the last synchronization of the client.
The syncrepl engine uses the present phase if no session log
is configured for the replication content or if the
consumer replica is too outdated to be covered by the session log.
The current design of the session log store is memory based, so
the information contained in the session log is not persistent
over multiple provider invocations. Accessing the session log store
through LDAP operations is not currently supported, nor is
imposing access control on the session log.
|
|
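The choice between the two phases can be illustrated with a small bounded log. This sketch is an assumption-laden model rather than the provider's actual data structure: it uses plain integer sequence numbers in place of CSNs, and the class and method names are invented for the example.

```python
from collections import deque
from typing import Deque, List, Tuple


class SessionLog:
    """Toy in-memory log of deletions, bounded like `sessionlog <sid> <limit>`."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        # (sequence number, entryUUID, entry name), oldest first.
        self.entries: Deque[Tuple[int, str, str]] = deque()
        self.truncated_seq = 0  # highest sequence number dropped so far
        self.next_seq = 1

    def record_delete(self, entry_uuid: str, dn: str) -> None:
        self.entries.append((self.next_seq, entry_uuid, dn))
        self.next_seq += 1
        while len(self.entries) > self.limit:
            seq, _, _ = self.entries.popleft()
            self.truncated_seq = seq

    def choose_phase(self, consumer_seq: int) -> str:
        # The delete phase is safe only if nothing newer than the
        # consumer's state has been truncated from the log.
        return "delete" if consumer_seq >= self.truncated_seq else "present"

    def deletes_since(self, consumer_seq: int) -> List[Tuple[str, str]]:
        return [(u, dn) for seq, u, dn in self.entries if seq > consumer_seq]


log = SessionLog(limit=2)
for n in (1, 2, 3):
    log.record_delete(f"uuid-{n}", f"cn=entry{n},dc=example,dc=com")
```

With a limit of two, the first deletion is truncated once the third arrives, so a consumer that last synchronized before that first deletion has to fall back to the present phase.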
|
|
As a further optimization, even when the synchronization search
is not associated with any session log, no entries will be transmitted
to the consumer server when there has been no update in the replication
context.
|
|
|
|
While {{slapd}} (8) can function as the LDAP Sync provider only
when it is configured with either the {{back-bdb}} or the {{back-hdb}}
backend, the syncrepl engine, which is a consumer-side replication engine,
can work with any backend.
|
|
|
|
The LDAP Sync provider maintains {{EX:contextCSN}} for each
database as the current synchronization state indicator of the
provider content. It is the largest {{EX:entryCSN}} in the provider
context such that no transaction for an entry having a
smaller {{EX:entryCSN}} value remains outstanding.
{{EX:contextCSN}} cannot simply be set to the largest issued
{{EX:entryCSN}} because {{EX:entryCSN}} is obtained before
a transaction starts and transactions are not committed in the
issue order.
|
|
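The rule for {{EX:contextCSN}} can be made concrete with a short sketch. It is a simplification under stated assumptions: CSNs are modeled as plain integers, and the function name and the split into "committed" and "outstanding" lists are illustrative, not the provider's internal interface.

```python
from typing import Iterable, Optional


def context_csn(committed: Iterable[int],
                outstanding: Iterable[int]) -> Optional[int]:
    """Largest committed CSN that is smaller than every outstanding CSN."""
    pending = list(outstanding)
    floor = min(pending) if pending else None
    eligible = [c for c in committed if floor is None or c < floor]
    return max(eligible) if eligible else None


# CSNs 1 to 4 were issued in order, but the transaction holding CSN 3
# has not committed yet even though the one holding CSN 4 has.
stalled = context_csn(committed=[1, 2, 4], outstanding=[3])
settled = context_csn(committed=[1, 2, 3, 4], outstanding=[])
```

Advertising 4 in the first case would let a consumer skip the change carrying CSN 3 when it finally commits, which is why the value stays at 2 until that transaction finishes.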
|
|
The provider stores the {{EX:contextCSN}} of a context in the
{{EX:syncreplCookie}} attribute of the immediate child entry of
the context suffix whose DN is {{cn=ldapsync,<suffix>}} and
object class is {{EX:syncProviderSubentry}}.
|
|
|
|
The consumer stores its replica state, which is the provider's
{{EX:contextCSN}} received as a synchronization cookie,
in the {{EX:syncreplCookie}} attribute of the immediate child
of the context suffix whose DN is {{cn=syncrepl<rid>,<suffix>}}
and object class is {{EX:syncConsumerSubentry}}.
The replica state maintained by a consumer server is used as the
synchronization state indicator when it performs subsequent incremental
synchronization with the provider server. It is also used as a
provider-side synchronization state indicator when the consumer functions as
a secondary provider server in a cascading replication configuration.
{{<rid>}} is the replica ID uniquely identifying the replica locally in the
syncrepl consumer server. {{<rid>}} is an integer which has no more than
three decimal digits.
|
|
|
|
Because a general search filter can be used in the syncrepl specification,
not all entries in the context will be returned as the synchronization content.
The syncrepl engine creates glue entries to fill in the holes
in the replica context if any part of the replica content is
subordinate to the holes. The glue entries will not be returned
in search results unless the {{ManageDsaIT}} control is provided.
|
|
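To show where holes arise, the sketch below lists the ancestor DNs that would need glue entries for a given set of replicated entries. It is illustrative only: the function name is invented, DNs are split naively on commas (so escaped commas are not handled), and treating a missing suffix entry as a hole is an assumption of the example.

```python
from typing import Iterable, List


def glue_dns(suffix: str, replicated: Iterable[str]) -> List[str]:
    """Return missing ancestor DNs, nearest ancestor first (hypothetical)."""
    dns = list(replicated)
    have = {dn.lower() for dn in dns}
    suffix_lc = suffix.lower()
    glue: List[str] = []
    for dn in dns:
        parent = dn.split(",", 1)[1] if "," in dn else ""
        # Walk upward until reaching an entry that already exists
        # or leaving the replica context.
        while (parent and parent.lower().endswith(suffix_lc)
               and parent.lower() not in have):
            glue.append(parent)
            have.add(parent.lower())
            parent = parent.split(",", 1)[1] if "," in parent else ""
    return glue


holes = glue_dns("dc=example,dc=com",
                 ["cn=Ann,ou=People,dc=example,dc=com"])
```

With a filter such as {{EX:(objectClass=organizationalPerson)}}, neither the {{EX:ou=People}} container nor the suffix entry matches, so both show up as holes above the replicated person entry.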
|
|
It is possible to retrieve {{EX:syncProviderSubentry}} and
{{EX:syncConsumerSubentry}} by performing an LDAP search with
the respective entries as the base object and with the base scope.
|
|
|
|
|
|
H2: Configuring Syncrepl
|
|
|
|
Because syncrepl is a consumer-side replication engine, the syncrepl
specification is defined in {{slapd.conf}} (5) of the consumer server,
not in the provider server's configuration file.
The initial loading of the replica content can be performed either
by starting the syncrepl engine with no synchronization cookie
or by adding an {{TERM:LDIF}} file dumped as a backup at the provider
and demoting it to a consumer replica.
{{slapadd}} (8) supports replica promotion and demotion.
|
|
|
|
When loading from a backup, the backup need not be up to date with
the provider content. The syncrepl
engine will automatically synchronize the initial consumer replica to
the current provider content. As a result, it is not required
to stop the provider server in order to avoid the replica inconsistency
caused by the updates to the provider content during the
content backup and loading process.
|
|
|
|
When replicating a large-scale directory, especially in a
bandwidth-constrained environment, it is advised to load the consumer replica
from a backup instead of performing a full initial load using syncrepl.
|
|
|
|
H3: Set up the provider slapd
|
|
|
|
There is no special {{slapd.conf}} (5) directive for the provider
syncrepl server except for the session log directive. Because the
LDAP Sync search is subject to access control, proper access control
privileges should be set up for the replicated content.
|
|
|
|
When creating a provider database from an {{TERM:LDIF}} file using
{{slapadd}} (8), {{EX:contextCSN}} and the {{EX:syncProviderSubentry}}
entry must be created. {{slapadd -p -w}} will create
a new {{EX:contextCSN}} from the {{EX:entryCSN}}s of the added entries.
It is also possible to create the {{EX:syncProviderSubentry}} with
an appropriate {{EX:contextCSN}} value by directly including it
in the {{TERM:LDIF}} file. {{slapadd -p}} will preserve the provider's
{{EX:contextCSN}} or will change it to the consumer's {{EX:contextCSN}}
if it is promoting a replica to the provider's content.
The {{EX:syncProviderSubentry}} can be included in the LDIF output
when {{slapcat}} (8) is given the {{-m}} flag;
the {{EX:syncConsumerSubentry}} can be retrieved with the {{-k}}
flag of {{slapcat}} (8).
|
|
|
|
The session log is configured by the

> sessionlog <sid> <limit>

directive, where {{<sid>}} is the ID of the per-scope session log
in the provider server and {{<limit>}} is the maximum number of
session log entries the session log store can record. {{<sid>}}
is an integer of no more than three decimal digits. If the consumer
server sends a synchronization cookie containing {{sid=<sid>}}
where {{<sid>}} matches the session log ID specified in the directive,
the LDAP Sync search utilizes the session log store.
|
|
|
|
H3: Set up the consumer slapd
|
|
|
|
The syncrepl replication is specified in the database section
of {{slapd.conf}} (5) for the replica context.
The syncrepl engine is backend independent and the directive
can be defined with any database type.
|
|
|
|
> syncrepl rid=123
>         provider=ldap://provider.example.com:389
>         type=refreshOnly
>         interval=01:00:00:00
>         searchbase="dc=example,dc=com"
>         filter="(objectClass=organizationalPerson)"
>         scope=sub
>         attrs="cn,sn,ou,telephoneNumber,title,l"
>         schemachecking=off
>         updatedn="cn=replica,dc=example,dc=com"
>         bindmethod=simple
>         binddn="cn=syncuser,dc=example,dc=com"
>         credentials=secret
|
|
|
|
In this example, the consumer will connect to the provider slapd
at port 389 of {{FILE:ldap://provider.example.com}} to perform a
polling ({{refreshOnly}}) mode of synchronization once a day. It will
bind as {{EX:cn=syncuser,dc=example,dc=com}} using simple authentication
with password "secret". Note that the access control privilege of
{{EX:cn=syncuser,dc=example,dc=com}} should be set appropriately
in the provider to retrieve the desired replication content.
The consumer will write to its database with the privilege of the
{{EX:cn=replica,dc=example,dc=com}} entry as specified in the
{{EX:updatedn=}} directive. The {{EX:updatedn}} entry should have
write permission to the replica content.
|
|
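The {{EX:interval=01:00:00:00}} value in the example reads as days, hours, minutes, and seconds, which is how it comes to mean once a day. A minimal sketch of that conversion follows; the helper name is invented for illustration and is not part of {{slapd}} (8).

```python
def interval_seconds(value: str) -> int:
    """Convert a dd:hh:mm:ss interval string to seconds (hypothetical helper)."""
    # Unpacking raises ValueError unless there are exactly four fields.
    days, hours, minutes, seconds = (int(part) for part in value.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


one_day = interval_seconds("01:00:00:00")
half_hour = interval_seconds("00:00:30:00")
```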
|
|
The synchronization search in the above example will search for the
entries whose objectClass is organizationalPerson in the entire subtree
rooted at {{EX:dc=example,dc=com}}. The requested attributes are
{{EX:cn}}, {{EX:sn}}, {{EX:ou}}, {{EX:telephoneNumber}},
{{EX:title}}, and {{EX:l}}. Schema checking is turned off, so
the consumer {{slapd}} (8) will not enforce entry schema checking
when it processes updates from the provider {{slapd}} (8).
|
|
|
|
For more detailed information on the syncrepl directive,
see the {{SECT:syncrepl}} section of {{SECT:The slapd Configuration File}}
chapter of this admin guide.
|
|
|
|
H3: Start the provider and the consumer slapd
|
|
|
|
The provider {{slapd}} (8) is not required to be restarted.
{{EX:contextCSN}} is automatically generated as needed:
it might be originally contained in the {{TERM:LDIF}} file,
generated by {{slapadd}} (8), generated upon changes in the context,
or generated when the first LDAP Sync search arrives at the provider.
|
|
|
|
When starting a consumer {{slapd}} (8), it is possible to provide a
synchronization cookie with the {{-c cookie}} command line option
in order to start the synchronization from a specific state.
The cookie is a comma-separated list of name=value pairs. Currently
supported syncrepl cookie fields are {{csn=<csn>}}, {{sid=<sid>}}, and
{{rid=<rid>}}. {{<csn>}} represents the current synchronization state
of the consumer replica. {{<sid>}} is the identity of the per-scope
session log with which this consumer will be associated. {{<rid>}} identifies
a consumer replica locally within the consumer server. It is used to relate
the cookie to the syncrepl definition in {{slapd.conf}} (5) which has
the matching replica identifier.
Both {{<sid>}} and {{<rid>}} have no more than three decimal digits.
The command line cookie overrides the synchronization cookie
stored in the consumer replica database.
|
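The cookie format described above is simple enough to check before passing it to {{-c}}. The following sketch is a hypothetical validator, not the parser used by {{slapd}} (8); it assumes the {{<csn>}} value contains no comma, and the CSN string in the example is a placeholder rather than a real CSN.

```python
from typing import Dict

KNOWN_FIELDS = ("csn", "sid", "rid")


def parse_cookie(cookie: str) -> Dict[str, str]:
    """Split a syncrepl cookie into its name=value fields (illustrative)."""
    fields: Dict[str, str] = {}
    for pair in cookie.split(","):
        name, sep, value = pair.strip().partition("=")
        if not sep or name not in KNOWN_FIELDS:
            raise ValueError(f"unsupported cookie field: {pair!r}")
        fields[name] = value
    for key in ("sid", "rid"):
        value = fields.get(key)
        # <sid> and <rid> are integers of no more than three decimal digits.
        if value is not None and not (
                value.isascii() and value.isdigit() and len(value) <= 3):
            raise ValueError(f"{key} must have at most three decimal digits")
    return fields


parsed = parse_cookie("csn=EXAMPLE-CSN,sid=001,rid=123")
```

Rejecting a malformed {{<rid>}} early is useful because the {{<rid>}} is what ties the cookie to a particular syncrepl definition in {{slapd.conf}} (5).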