Replication Issues

Active Directory supports only two kinds of replication: synchronous replication over RPC, and asynchronous replication through the Inter-Site Messaging (ISM) service. When the directory uses the ISM service, it sends and receives messages asynchronously.

To summarize, the following types of transports are recognized:


Note

One important requirement of the ISM transport module is that it work between disconnected networks. This is a feature requirement rather than an implementation requirement. The kinds of environments in which it is deployed do not have a directly connected, trusted path. Instead, the message must pass through gateways, with part of the route travelling over an unsecured link. Any transport that works in this environment must tolerate long latencies, must be routable through gateways, and must support store-and-forward delivery in case one or more gateways are unavailable.

To troubleshoot mail-based replication

  1. Check the event log for relevant messages. Possible errors can be problems with the KCC in constructing the topology, problems from the SMTP service (SMTPSVC) in delivering the mail, problems from the ISM service in reading the messages, or problems from the NTDS in decoding and applying the mail.
  2. Verify that the KCC has set up SMTP-based connections between the servers in the sites you want.

    This indicates that the site links are what you expect.

  3. Verify that the replication links are established by using the correct transport. Do this by looking for connections that have the SMTP transport associated with them. You can also run repadmin /showreps and look for the "via SMTP" designation.

    Note that the KCC does not create a connection by using SMTP until the following criteria are met:

    Note that communication between sites is by definition between bridgehead servers. The KCC chooses the bridgehead for each site unless bridgeheads are set explicitly. If you are using explicit bridgeheads, verify that they hold the domain you are trying to replicate.

  4. Before you move a server into a site connected only by mail-based replication, you want to verify that the Domain Controller Certificate is present. A certificate authority must be installed in your enterprise on one of the domain controllers. It must be an "Enterprise Certificate Authority." After some time (up to 8 hours), all the domain controllers in the domain are "auto enrolled" with a Domain Controller Certificate. You can verify whether a particular computer holds the domain controller Certificate by using the Certificates snap-in (look under Personal), or by using the command repadmin /showcert.
  5. For mail-based replication, you need to decide if mail routing is necessary. If the two servers have direct IP connectivity and can send mail directly to each other, no further configuration is required. However, if the two domain controllers must go through mail gateways to deliver mail to each other, you must configure the domain controller to use the mail gateway. Typically this is done by setting a "Smart Host" in the Default SMTP Virtual Server, under IIS in the computer configuration Snap-in.
  6. Determine whether mail-based replication is succeeding by checking the output of repadmin /showreps. This shows the current error code and the last success time. If the current error code is "request is pending" and the last success was more than an hour ago, you can suspect that your mail is being delayed or not delivered. Note that it is normal for mail not to be delivered immediately, because the communication is store and forward, not a direct connection.

    The first thing you want to check is whether the SMTPSVC is picking up and delivering the mail. You can check the Inetpub\mailroot\Queue directory on the destination, any gateway computers, and the source. The Queue is the queue of outgoing mail. If the Queue directory contains a large number of files, this usually means the SMTPSVC is backed up or unable to process all the mail. A workaround is to run the following:

    This causes all pending mail to be sent.

  7. Another thing to check is whether there are delivery problems. Please verify that each leg of the mail route can contact its next hop directly by IP. In addition to IP connectivity, it is recommended that you verify that each server can resolve the name of the server it is trying to reach. The mail address that a domain controller uses to contact another domain controller is its "guid-based name," which looks like <guid>._msdcs.<forest-root-dns-name>. It is recommended that you verify that you can ping this name and receive a successful response.
  8. When mail cannot be delivered, it is returned as a "Delivery Status Notification" (DSN) by the SMTPSVC. These DSNs are written to the event log when logging is set high enough. To see DSNs, you must set the Active Directory diagnostic level of Intersite Messaging Service to level 1 and restart the SMTP service. For more information about Active Directory diagnostic levels and how to set them, see "Active Directory Diagnostic Levels" earlier in this chapter.
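
The name check in step 7 can be sketched in a few lines of Python. This is a minimal illustration, not part of any Active Directory tool: the function names `guid_based_name` and `can_resolve` are hypothetical, and the sketch assumes you already know the DSA objectGuid of the domain controller and the DNS name of the forest root domain.

```python
import socket
import uuid


def guid_based_name(dsa_guid: str, forest_root_dns_name: str) -> str:
    """Build <guid>._msdcs.<forest-root-dns-name> from a DSA objectGuid."""
    # uuid.UUID validates the GUID and normalizes it to lowercase with hyphens.
    guid = str(uuid.UUID(dsa_guid))
    # Drop any leading or trailing dot and lowercase the DNS suffix.
    return f"{guid}._msdcs.{forest_root_dns_name.strip('.').lower()}"


def can_resolve(name: str) -> bool:
    """Return True if the local resolver can map the name to an address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False
```

Calling `can_resolve(guid_based_name(guid, forest_root))` on each server along the mail route gives a quick indication of whether name resolution, rather than mail delivery, is the problem. It does not replace the ping test, which also confirms IP connectivity.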

Replication Event Viewer Entries

Entries that pertain to replication and that might appear in the Directory Service log in Event Viewer include errors such as the following:

Knowledge Consistency Check Replication Errors

Error: ID 1311, from Event Source "NTDS KCC", in the Directory Service log

You might see the following entry (with ID 1311 from event source "NTDS KCC") in the Directory Service log.

The Directory Service consistency checker has determined that either (a) there is not enough physical connectivity published via the Active Directory Sites and Services Manager to create a spanning tree connecting all the sites containing the partition DC=mycorp,DC=com, or (b) replication cannot be performed with one or more critical servers in order for changes to propagate across all sites (most often due to the servers being unreachable).

For (a), please use the Active Directory Sites and Services Manager to do one of the following:

1. Publish sufficient site connectivity information such that the system can infer a route by which this Partition can reach this site.  This option is preferred.

2. Add an ntdsConnection object to a Domain Controller that contains the partition DC=mycorp,DC=com in this site from a Domain Controller that contains the same partition in another site.

For (b), please see previous events logged by the NTDS KCC source that identify the servers that could not be contacted.


This behavior can occur if the KCC has determined that a site has been orphaned from the replication topology.

One computer in a specific site owns the role of creating inbound replication connection objects between bridgehead servers from other sites. This domain controller is known as the Inter-Site Topology Generator. While analyzing the site link and site link bridge structure to determine the most cost-effective route to synchronize a naming context between two points, it might determine that a site does not have membership in any site link and therefore has no means to create a replication object to a bridgehead server in that site. The first site in Active Directory (named "Default-First-Site-Name"), is created automatically for the administrator. This site is a member of the default site link ("DEFAULTIPSITELINK"), which is also created automatically for the administrator, and is used for RPC communication over TCP/IP. If the administrator creates two additional sites ("site1" and "site2" for example), the administrator must define a Site Link that each site is going to be a member of before they can be written to Active Directory.

However, the administrator can open the properties of a site link and modify which sites reside in the site link. If the administrator were to remove a site from all site links, the KCC displays the error message listed earlier to indicate that a correction needs to be made to the configuration.


Note

When the KCC is displaying this message, it is in a mode where it does not remove any connections. Normally, the KCC cleans up old connections from previous configurations or redundant connections. Thus, you might find extra unexpected connections during this time. The solution is to correct the topology problem so that the spanning tree can be formed.

This error might also occur when replication has failed from a particular bridgehead server in a site and no other bridgehead servers are available. For more information about bridgehead servers, see "Active Directory Replication" in this book.

Examples of Replication Event Viewer Messages

Examples of errors that involve long-running inbound replication and Certificate Services conflicts and other replication Event Viewer messages:


Note

These messages are not displayed by default; the diagnostic level must be increased first. They appear in Event Viewer only rarely. However, they are useful for troubleshooting and optimizing performance.

Event 1580

Severity=Informational

A long running inbound replication has finished.  The elapsed time was %1 minutes.

The operation was %2, and the options were %3. The status of the operation was:

The operation specific arguments are:

Parameter 1:

Parameter 2:

Parameter 3:

Parameter 4:

%n

This data is for information and may be useful in tuning the replication performance of the system. Since only one inbound replication may occur at a time, long running replications delay other replications from coming in in a timely manner.  This system has been delayed from receiving other directory updates because this replication went on as long as it did. A long running replication may indicate a large number of updates, or a number of complex updates (DN-valued attributes) occurring at the source server. Performing these updates during non-critical times may prevent replication delays during important times.


A long running replication is normal in the case of adding a new replica to a system, either because of installation, Global Catalog promotion, or connection creation by the KCC.  A long running replication may also occur for a system that has been down, or a partition that has been out of touch for an extended period.


The record data is the status code.

.

logging_level: 1


Event 1579

Severity=Warning

Due to contention with the Certificate Services for resources, replication was stalled for %1 minutes, %2 seconds. It took an unusually long time to prepare an asynchronous replication message for transmission. This condition should be transient. If this issue persists, please contact Microsoft Product Support Services for assistance.

.

logging_level: 0


Event 1574

Severity=Warning

Due to contention with the Security Descriptor Propagator for resources, inbound replication was stalled for %1 minutes, %2 seconds.  This condition should be transient. If this issue persists, please contact Microsoft Product Support Services for assistance.

.

logging_level: 0


Event 1575

Severity=Informational

One or more new attributes has been added to the partial attribute set for partition %1. A full synchronization will be performed from source %2 on the next periodic synchronization.

.

logging_level: 1


Event 1560

Severity=Informational

A new replica for partition %1 has been added to this server.  This server will now perform a full synchronization from source %2 with options %3.

.

logging_level: 1


Event 1561

Severity=Informational

The user has requested a full synchronization of partition %1 from source %2 with options %3.

.

logging_level: 1


Event 1562

Severity=Informational

The full synchronization of partition %1 from source %2 with options %3 is being continued.

.

logging_level: 1


RPC Server Is Unavailable

The "RPC server is unavailable" errors typically indicate that the computer is down.


Note

If the name can not be resolved, Active Directory generates an error that explicitly says "The name can not be resolved."

For more information on RPC server unavailable error messages, see "Name Resolution" earlier in this chapter.

Unknown User Name/Bad Password

Regarding the "Unknown user name/bad password" issue, run the repadmin command on the domain controller.

For example, run the following: repadmin /showmeta CN=<domain controller name>,OU=Domain Controllers,DC=<domain name>,DC=<domain name>.

If the version numbers of the unicodePwd attribute of either object differ sufficiently between the two domain controllers (that is, by two or more), it might mean that the passwords are not synchronized, that replication has not occurred, and therefore that the domain controllers can no longer authenticate to each other. If that is the case, reset the computer account passwords.
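
The comparison rule above is simple enough to express as a short check. The following Python sketch assumes you have already read the unicodePwd version number for the same computer object from the repadmin /showmeta output on each of the two domain controllers; the function name `passwords_look_unsynchronized` is hypothetical.

```python
def passwords_look_unsynchronized(version_on_first_dc: int,
                                  version_on_second_dc: int,
                                  threshold: int = 2) -> bool:
    """Apply the rule from the text: a version gap of two or more is suspicious."""
    # The direction of the gap does not matter, only its size.
    return abs(version_on_first_dc - version_on_second_dc) >= threshold
```

A gap of one can occur briefly while a recent password change is still replicating, which is why the text treats only a gap of two or more as a sign of trouble.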


Note

The Dcdiag tool contains a computer object metadata comparison test that automates the steps described in the preceding section. For more information about the Dcdiag tool, see "Domain Controller Issues" earlier in this chapter.

Replication Failing with Access Denied

Other security errors might also fall under this diagnosis, such as "Wrong Target Name" or "Cannot locate domain controller." It is important to understand that a circular dependency exists between replication and distributed security using Kerberos v5. Replication is authenticated using Kerberos v5, and Kerberos security principals are replicated using replication.

Usually, if two domain controllers cannot replicate with each other because of a security problem, it is because one or both of these domain controllers haven't replicated the information needed to identify each other.

When Active Directory is installed on a computer, that computer becomes a domain controller (referred to here as the target). It joins the enterprise by communicating with an existing domain controller (referred to here as the source). Until the target restarts and configures itself, the source is the only domain controller in the enterprise with knowledge of the target.


Note

The source becomes a single point of failure for a brief period until it replicates its knowledge off to other domain controllers. This is one reason that the removal of Active Directory from a domain controller requires successful replication outbound before attempting to run the Active Directory Installation Wizard. Under unusual circumstances, if a computer completely fails after being a source, knowledge of a new domain controller can be lost to the rest of the enterprise.

Thus, if a third domain controller attempts to replicate with the new system, replication can fail with a security error because knowledge of the new system has not replicated throughout the enterprise. Knowledge of the new system replicates in the Configuration directory partition, but knowledge of the Kerberos v5 principal for the new system replicates in the Domain directory partition. Thus there can be a temporary gap between the time when a third computer learns of a new computer and the time when it can authenticate with that computer.

Therefore, when replication receives the error "Access Denied," it should be verified whether knowledge of newly promoted domain controllers has replicated throughout the enterprise. In the event that the enterprise is having replication problems, it might be preventing the computer account object of a newly promoted domain controller from being replicated.

Automatic Topology Generator Was Unable to Complete the Topology

You also might see a message in the Directory Service log in Event Viewer that says "Automatic topology generator was unable to complete the topology for <distinguished name of the site>." This message indicates that there is an exception in the KCC.

To log more information, increase the value of the Internal Processing registry entry to 3 and wait 15 minutes:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics\Internal Processing

Alternatively, instead of waiting, you can run repadmin /kcc. When you have collected the information, reset the value of the registry entry to 0.


Note

This method must also be used whenever replication returns the status of "Internal error." The Internal Processing level should be set on both of the computers that are replicating data.
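
As a rough sketch of the registry change described above, the following Python helper assembles a command line that sets the Internal Processing entry. It assumes that the Reg.exe command-line tool is available on the domain controller and accepts the `reg add` syntax shown; the function name `internal_processing_command` is hypothetical, and the helper only builds the argument list without running anything.

```python
from typing import List

# Key that holds the Active Directory diagnostic logging entries (from the text above).
DIAGNOSTICS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics"


def internal_processing_command(level: int) -> List[str]:
    """Return a reg add argument list that sets Internal Processing to level."""
    # Diagnostic levels run from 0 (default) to 5 (most verbose).
    if not 0 <= level <= 5:
        raise ValueError("diagnostic level must be between 0 and 5")
    return [
        "reg", "add", DIAGNOSTICS_KEY,
        "/v", "Internal Processing",
        "/t", "REG_DWORD",
        "/d", str(level),
        "/f",  # overwrite the existing value without prompting
    ]
```

Passing the list for level 3 to a process launcher on each replication partner before reproducing the problem, and the list for level 0 afterward, mirrors the manual procedure.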

Monitoring the Replication Links

You can use the Replication Administration (Repadmin) command-line tool to monitor the current links for a specific domain controller, including the domain controllers that are replicating to and from the current domain controller. By viewing these links in Repadmin, you can see the replication topology as it exists for the current server. By seeing the replication topology, you can check replication consistency between replication partners, monitor replication status, and display replication metadata.

More importantly, you can also manipulate replication topology by forcing specific replication events and triggering KCC recalculation. However, you must force replication only when you know that a domain controller is offline, or when network connections are not working. During normal operation, the KCC automatically manages the replication topology for each directory partition on domain controllers.

For example, to track which domain controller received a particular replicated change, you can enter the following:

repadmin /showmeta "CN=JSmith,OU=PR,OU=Marketing,DC=Reskit,DC=com" <name of domain controller>


where <name of domain controller> is the host name of the domain controller on which you are tracking replicated changes for "JSmith" in the "PR" OU, which is in the "Marketing" OU of the "Reskit.com" domain. The output of this command shows the local update sequence number (USN), the originating DSA, the originating USN, the originating date and time, the version number, and the replicated attribute.

Domain Mode Changes

To determine whether directory updates are being replicated to all domain controllers, you need to use the Repadmin and Ldp tools. One primary example of an operation that can affect replication integrity is a domain mode change.

The domain mode change is propagated through normal replication. Objects of class domainDNS (that is, the domain object) have an attribute called ntMixedMode. A nonzero value indicates mixed mode; zero indicates native mode. You can determine whether the domain mode change has propagated by checking this attribute on all the domain controllers in the domain.
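
Once the ntMixedMode value has been read from each domain controller (for example, with Ldp), checking propagation is a matter of comparing the values. The following Python sketch assumes the values have been collected into a dictionary keyed by domain controller name; the function names and the server names in the usage note are hypothetical.

```python
from typing import Dict, List


def domain_mode(nt_mixed_mode: int) -> str:
    """Interpret ntMixedMode: zero is native mode, nonzero is mixed mode."""
    return "native" if nt_mixed_mode == 0 else "mixed"


def controllers_still_mixed(values_by_dc: Dict[str, int]) -> List[str]:
    """List the domain controllers that have not yet received the change."""
    return sorted(dc for dc, value in values_by_dc.items() if value != 0)


def mode_change_propagated(values_by_dc: Dict[str, int]) -> bool:
    """True when every domain controller reports native mode."""
    return not controllers_still_mixed(values_by_dc)
```

For instance, given values of 0 from DC1 and 1 from DC2, `controllers_still_mixed` returns a list containing only DC2, which is the server whose inbound replication should be examined next.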

You can use the Repadmin tool to view the replication topology as seen from the perspective of each domain controller and the ISM matrix information. In addition, the Repadmin tool can be used to manually create the replication topology (although in normal practice, this is not necessary), to force replication events between domain controllers and to view both the replication metadata and up-to-date vectors.


Note

During the normal course of operations, there is no requirement for manual creation of the replication topology. Incorrect use of this tool might adversely impact the replication topology. The major use of this tool is to monitor replication so that problems such as offline servers or unavailable LAN/WAN connections can be identified.

Repadmin

Repadmin.exe is a command-line tool that lets you view and change replication status on domain controllers when you need to diagnose and troubleshoot replication between Windows 2000–based domain controllers. You can use Repadmin to view the current replication topology, manually create the replication topology, and force replication events.


Note

During normal operation, the KCC performs automatic replication topology generation, and manual management of the replication topology is not required.

For information about using Repadmin, see the /Support Tools Help on the Windows 2000 operating system CD.

Viewing the Connections for a Server

The Repadmin tool can be used to show the current links for a specific domain controller, including the domain controllers that are replicating to and from this domain controller. By viewing these links in Repadmin, you can see the replication topology as it currently exists for that server. The links might be unreachable, which prevents any new links from that server from being added.

When you use Repadmin to view the links for a domain controller, you are viewing the replication partners that the KCC is currently using for that server. Even if you can see a connection object in the Sites container, you might not see that connection represented in Repadmin.

To view current replication partners for a server

Forcing Replication Between Replication Partners

There are four methods that can be used to initiate replication between direct replication partners. Three methods use administrative tools. The fourth method involves writing a Visual Basic script. For instructions about how to use the script, see the Microsoft Knowledge Base link on the Web Resources page at http://windows.microsoft.com/windows2000/reskit/webresources. Follow the links for article Q232072.

For each of the following methods that are described, the "source" server describes the domain controller that replicates changes to a replication partner. The "destination" domain controller receives the changes.

Initiating Replication by Using Active Directory Sites and Services

In the Active Directory Sites and Services MMC console, right-click a connection object, and then click Replicate Now. (For information about how to initiate replication by using Active Directory Sites and Services, see Windows 2000 Server Help.)

Initiating Replication by Using Repadmin

Repadmin is a command-line tool that is included in the Support directory on the Windows 2000 operating system CD. You can use Repadmin to first determine the directory replication partners of the destination server and then issue a command to synchronize the source server with the destination by using the object GUID of the source server.

To use Repadmin to force replication between two servers

  1. At a command prompt, type the following:

    repadmin /showreps <destination_server_name>


    Note

    Refer to the Repadmin example that follows as you walk through these steps.

  2. Under the Inbound Neighbors section of the output, find the directory partition that needs synchronization and locate the source server with which the destination is to be synchronized. Note the objectGuid value of the source server.
  3. Initiate replication by entering the following command:

    repadmin /sync <directory_partition_DN> <destination_server_name> <source_server_objectGuid>

For example, to initiate replication on DC1 of the domain directory partition support.reskit.com so that changes are replicated from DC2, use the following command:

repadmin /sync dc=support,dc=reskit,dc=com DC1 d2e3badd-e07a-11d2-b573-0000f87a546b


If the command is successful, Repadmin.exe displays the following message:

ReplicaSync() from source: d2e3badd-e07a-11d2-b573-0000f87a546b, to dest: DC1 is successful.


Optionally, you can use the following switches at the command prompt:

 - /force: Overrides the normal replication schedule.

 - /async: Starts the replication event. (Repadmin.exe does not wait for the replication event to finish.)
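
When the same synchronization has to be requested repeatedly, it can help to assemble the command line programmatically. The following Python sketch builds the argument list for the repadmin /sync form shown above, using only the /force and /async switches that this section describes. The function name `build_sync_command` is hypothetical, and the sketch only constructs the arguments; it does not run Repadmin.

```python
import uuid
from typing import List


def build_sync_command(partition_dn: str,
                       destination_server: str,
                       source_object_guid: str,
                       force: bool = False,
                       run_async: bool = False) -> List[str]:
    """Return the argument list for repadmin /sync as described in the text."""
    # Reject a malformed GUID early; uuid.UUID raises ValueError on bad input.
    guid = str(uuid.UUID(source_object_guid))
    command = ["repadmin", "/sync", partition_dn, destination_server, guid]
    if force:
        command.append("/force")   # override the normal replication schedule
    if run_async:
        command.append("/async")   # start replication without waiting for it to finish
    return command
```

Keeping the arguments as a list, rather than as a single string, avoids quoting problems when a distinguished name contains spaces.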


The following is an example of running repadmin:

C:\WINNT\idw>repadmin /showreps

Washington\NTGROUP1

DSA Options : (none)

objectGuid  : 3a34efb9-f828-11d2-a68d-00c04fb9d14e

invocationID: 39216b7e-f828-11d2-8128-00105a68cf71


==== INBOUND NEIGHBORS ======================================


DC=dsysreskit,DC=reskit,DC=microsoft,DC=com

    Bldg\NTGROUP2 via RPC

        objectGuid: cc6d76a3-a71a-11d2-bbd0-00105a24d6db

        Last attempt @ 1999-05-10 22:47.33 failed, result 1722:

            The RPC server is unavailable.

        Last success @ 1999-05-10 22:02.32.

        6 consecutive failure(s).


CN=Schema,CN=Configuration,DC=reskit,DC=microsoft,DC=com

    Washington\WORKSTATION1 via RPC

        objectGuid: ed8a3ba0-d439-11d2-99e7-08002ba3ed3b

        Last attempt @ 1999-05-10 22:47.32 was successful.

    Washington\RESKIT-DC-07 via RPC

        objectGuid: 6a7ff635-baeb-11d2-8fda-0008c709d19e

        Last attempt @ 1999-05-10 22:47.33 was successful.

    Bldg\NTGROUP2 via RPC

        objectGuid: cc6d76a3-a71a-11d2-bbd0-00105a24d6db

        Last attempt @ 1999-05-10 22:47.33 failed, result 1722:

            The RPC server is unavailable.

        Last success @ 1999-05-10 22:02.32.

        6 consecutive failure(s).


CN=Configuration,DC=reskit,DC=microsoft,DC=com

    Washington\WORKSTATION1 via RPC

        objectGuid: ed8a3ba0-d439-11d2-99e7-08002ba3ed3b

        Last attempt @ 1999-05-10 22:47.32 was successful.

    Bldg\NTGROUP2 via RPC

        objectGuid: cc6d76a3-a71a-11d2-bbd0-00105a24d6db

        Last attempt @ 1999-05-10 22:47.33 failed, result 1722:

            The RPC server is unavailable.

        Last success @ 1999-05-10 22:02.32.

        6 consecutive failure(s).

    Washington\RESKIT-DC-07 via RPC

        objectGuid: 6a7ff635-baeb-11d2-8fda-0008c709d19e

        Last attempt @ 1999-05-10 22:48.26 was successful.


==== OUTBOUND NEIGHBORS FOR CHANGE NOTIFICATIONS ============


DC=dsysreskit,DC=reskit,DC=microsoft,DC=com

    Washington\DSYSRESKIT0 via RPC

        objectGuid: abbf2810-f51b-11d2-84a0-00105a68cf71

    Washington\RESKIT-DC-07 via RPC

        objectGuid: 6a7ff635-baeb-11d2-8fda-0008c709d19e


CN=Schema,CN=Configuration,DC=reskit,DC=microsoft,DC=com

    Washington\DSYSRESKIT0 via RPC

        objectGuid: abbf2810-f51b-11d2-84a0-00105a68cf71

    Washington\RESKIT-DC-07 via RPC

        objectGuid: 6a7ff635-baeb-11d2-8fda-0008c709d19e

    Washington\WORKSTATION1 via RPC

        objectGuid: ed8a3ba0-d439-11d2-99e7-08002ba3ed3b


CN=Configuration,DC=reskit,DC=microsoft,DC=com

    Washington\DSYSRESKIT0 via RPC

        objectGuid: abbf2810-f51b-11d2-84a0-00105a68cf71

    Washington\RESKIT-DC-07 via RPC

        objectGuid: 6a7ff635-baeb-11d2-8fda-0008c709d19e

    Washington\WORKSTATION1 via RPC

        objectGuid: ed8a3ba0-d439-11d2-99e7-08002ba3ed3b
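
Output like the sample above can be scanned for failing inbound partners with a short script. The following Python sketch assumes the output has the same layout as the sample (a banner line for each section, an unindented directory partition name, then one indented block per neighbor); other versions of Repadmin might format their output differently. The function name `failing_inbound_neighbors` is hypothetical.

```python
import re
from typing import Dict, List, Optional

# Line shapes taken from the sample output above.
NEIGHBOR = re.compile(r"^(?P<site>[^\\]+)\\(?P<server>\S+) via (?P<transport>\S+)$")
GUID = re.compile(r"^objectGuid\s*:\s*(?P<guid>[0-9a-fA-F-]{36})$")
FAILED = re.compile(r"^Last attempt @ (?P<when>.+?) failed, result (?P<code>\d+):$")


def failing_inbound_neighbors(output: str) -> List[Dict[str, str]]:
    """Return one record per inbound neighbor whose last attempt failed."""
    failures: List[Dict[str, str]] = []
    in_inbound = False
    partition = ""
    neighbor: Optional[Dict[str, str]] = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # the sample output is double spaced
        if line.startswith("===="):
            # Section banner: only the inbound section is of interest here.
            in_inbound = "INBOUND NEIGHBORS" in line
            partition, neighbor = "", None
            continue
        if not in_inbound:
            continue
        if line.upper().startswith(("DC=", "CN=")):
            partition, neighbor = line, None  # a new directory partition begins
            continue
        match = NEIGHBOR.match(line)
        if match:
            neighbor = dict(match.groupdict(), partition=partition, guid="")
            continue
        if neighbor is None:
            continue
        match = GUID.match(line)
        if match:
            neighbor["guid"] = match.group("guid")
            continue
        match = FAILED.match(line)
        if match:
            failures.append(dict(neighbor, when=match.group("when"),
                                 code=match.group("code")))
    return failures
```

Each returned record carries the partition, site, server, transport, objectGuid, time of the failed attempt, and result code, which is the information needed to decide which source server to investigate or to pass to repadmin /sync.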


Viewing Replication Status and Performance

Active Directory Replication Monitor (Replmon.exe) is a graphical tool that you can use to view low-level status and performance of replication between Active Directory domain controllers. Replication Monitor can be used to monitor replication features, as follows:

Replmon Requirements

Active Directory Replication Monitor must be installed on a computer that is running Microsoft® Windows® 2000 Professional or Windows 2000 Server. The computer can be a domain controller, member server, member workstation, or stand-alone computer. In addition, Replmon can be used to monitor domain controllers from different forests simultaneously.

Using Ldp.exe to Find the DSA Object GUID

The server GUID is a reference point that is used in Active Directory and DNS to locate a domain controller primarily for the purposes of replication. This GUID is automatically generated for each domain controller, is unique when created, and is not duplicated.


Note

There are various GUIDs that are used for different purposes. The GUIDs that are used by the Repadmin tool are called "DSA objectGuid" because they are the GUIDs of NTDS Settings objects. The DSA objectGuid of a server is displayed in the first four lines of repadmin /showreps output.

To identify the DSA object GUID for a particular domain controller (useful when troubleshooting replication issues)

  1. Using Ldp.exe, search the configuration directory partition with the following criteria:

    Base DN: CN=Sites,CN=Configuration,DC=RootDomainName,DC=Com

    Filter : (cn=NTDS Settings)

    Scope: Subtree

    Attributes: objectGUID


    Replace RootDomainName with the name of the first domain that was installed in the enterprise (the "forest root domain").

  2. In the search results, locate the entry that represents the server for which you are determining the GUID. The objectGUID attribute must also be present and look like the following:

    ***Searching...

    ldap_search_s(ld, "cn=sites,cn=configuration,dc=reskit,dc=com", 2, "(cn=NTDS Settings)", attrList, 0, &msg)

    Result <0>: (null)

    Matched DNs:

    Getting 1 entries:

    >> Dn: CN=NTDS Settings,CN='server-name',CN=Servers,CN='site-name',CN=Sites,CN=Configuration,DC=reskit,DC=com

    1> objectGUID: e99e82d5-deed-11d2-b15c-00c04f5cb503;


The DSA objectGUID is identified by the value that is associated with the objectGUID attribute.
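
When the search returns many NTDS Settings entries, picking out the right server by eye is error prone. The following Python sketch extracts the server and site names from a distinguished name of the form shown in the search result. It assumes that the server and site names contain no escaped commas, and the function name `server_and_site` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Shape of an NTDS Settings distinguished name, as in the Ldp output above.
NTDS_SETTINGS_DN = re.compile(
    r"^CN=NTDS Settings,CN=(?P<server>[^,]+),CN=Servers,"
    r"CN=(?P<site>[^,]+),CN=Sites,CN=Configuration,",
    re.IGNORECASE,
)


def server_and_site(ntds_settings_dn: str) -> Optional[Tuple[str, str]]:
    """Return (server, site) for an NTDS Settings DN, or None if it does not match."""
    match = NTDS_SETTINGS_DN.match(ntds_settings_dn.strip())
    if match is None:
        return None
    return match.group("server"), match.group("site")
```

Pairing each returned (server, site) tuple with the objectGUID value from the same entry gives a lookup table from server name to DSA objectGuid, which is the value that repadmin /sync expects for the source server.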

For more information about Active Directory command-line tools, see Windows 2000 Resource Kit Tools Help, which is included on the Windows 2000 Resource Kit companion CD.

© 1985-2000 Microsoft Corporation. All rights reserved.