Active Directory Diagnostics, Troubleshooting, and Recovery
The first step toward identifying and diagnosing Active Directory problems is to verify network connectivity. This section discusses diagnostic tools and gives examples of possible network connectivity problems, along with suggested solutions. Examine the following areas to determine whether the network is functioning properly.
Event Viewer is one of the most useful tools you can use to identify not only networking problems, but also name resolution, directory service, and other types of problems. It categorizes error codes so that you can easily identify a problem, and then analyze the cause of it. Always check the event log to make sure that the directory service is not reporting any events that are indicators of future problems.
To identify network connectivity problems, check the System Log folder and analyze the types of errors and warnings listed. For each error or warning, go to the Event Properties page to view the description and the data returned. In the Data box, click Words and translate the hexadecimal code to decimal. If you see a number in the Event column for the error code, use the net helpmsg command to obtain a brief description of the error code.
For example, if the first four digits of the error code are 8007, this indicates a Microsoft® Win32® API or network error. You can then use the net helpmsg command to decode these types of errors. To decode the error, first convert the last four digits of the hexadecimal error code to decimal. Then at the command prompt, type the following:
net helpmsg <message number in decimal>
where the <message number in decimal> is replaced with the return value you want to decode. The net helpmsg command returns a description of the error. For example, if Component Object Model (COM) returns the error 8007054B, convert the low order word, 054B, to decimal (1355). Then type the following:
net helpmsg 1355
In summary, look at the description and the data that are returned on the Event Properties page. In the Data box, click Words, translate the hexadecimal code to decimal, and then run net helpmsg <message number in decimal>, as in the following example:
net helpmsg 1355 returns "The specified domain either does not exist or could not be contacted."
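The hexadecimal-to-decimal conversion described above can be sketched in Python. This is a minimal illustration, not part of any Windows tool; the function name helpmsg_number is hypothetical, and it assumes the error code is supplied as an eight-digit hexadecimal string whose first four digits are 8007.

```python
from typing import Optional

WIN32_PREFIX = 0x8007  # high-order word that marks a Win32 API or network error


def helpmsg_number(error_code_hex: str) -> Optional[int]:
    """Return the decimal number to pass to `net helpmsg`, or None.

    None means the first four hex digits are not 8007, so the low-order
    word should not be treated as a Win32 error number.
    """
    value = int(error_code_hex, 16)
    if (value >> 16) != WIN32_PREFIX:
        return None
    return value & 0xFFFF  # low-order word, for example 0x054B


print(helpmsg_number("8007054B"))  # 1355, so run: net helpmsg 1355
```

The low-order word is isolated with a bit mask rather than string slicing so that the input can be written with or without a leading 0x.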
If you see errors such as "access denied" or "bad password," you probably have a security-related problem, not a network connectivity problem. The error "no logon servers" usually indicates that the computer cannot find a domain controller for that domain. A "No logon servers" error has a source description of Net Logon, whereas "Access Denied" might have a source description of SAM.
For more information about the Net Helpmsg command and error code explanations, see the Microsoft Platform SDK link on the Web Resources page at http://windows.microsoft.com/windows2000/reskit/webresources.
Check that your hardware, such as the network hub and cables, is functioning properly. For example, if the Local Area Connection icon in the Network and Dial-up Connections properties in Control Panel is marked with a red "X," this usually means that your network cable is disconnected. For more information about checking hardware functionality, see the Server Operations Guide.
At a minimum, check that your network adapters and drivers are functioning properly. There are many ways to check the functionality of devices, such as network adapters and drivers, through Control Panel. You can select the Add/Remove Hardware icon, and click Add/Troubleshooting a device. Or, you can select the Hardware tab from the System icon.
Another way of using Control Panel is to click Hardware Wizard on the Hardware tab of System Properties in Control Panel. Select a device from the Devices box, and then check to see whether the device is working properly. If you click Finish, the Troubleshooter starts when you quit the Add/Remove Hardware wizard. Examine the properties of each device that is displayed by double-clicking the device icon. The status of each device displays on the General tab. Click Troubleshooter for help if the device is not working properly.
Another aspect of verifying network connectivity involves a check of the local area connection. Ensure that you are connected to the network and that the Internet Protocol (IP) addresses are correct. Do this by using the IPConfig command-line tool. The IPConfig tool is used to view and modify IP configuration details used by the computer. With DNS dynamic updates, you can also use IPConfig to register the computer's entries in the DNS service.
To view IP configuration details
At the command prompt, type ipconfig /all and then press ENTER.
To test TCP/IP connectivity by using the ping command
1. Ping the loopback address: ping 127.0.0.1
If the ping command fails, verify that the computer was restarted after TCP/IP was installed and configured.
2. Ping the IP address of the local computer.
If the ping command fails, restart the computer and check the routing table using the route print command.
3. Ping the IP address of the default gateway.
If the ping command fails, verify that the default gateway IP address is correct and that the gateway (router) is operational.
4. Ping the IP address of a remote host.
If the ping command fails, verify that the remote host IP address is correct, that the remote host is operational, and that all gateways (routers) between this computer and the remote host are operational.
5. Ping the IP address of the DNS server.
If the ping command fails, verify that the DNS server IP address is correct, that the DNS server is operational, and that all gateways (routers) between this computer and the DNS server are operational.
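The ping checks above work outward from the local computer, so the first target that fails points to the advice that applies. The following Python sketch automates that idea. The order of targets (loopback, local address, default gateway, remote host, DNS server) is inferred from the failure notes, the function names are hypothetical, and every address except 127.0.0.1 is a placeholder to replace with your own values.

```python
import subprocess
from typing import Callable, Iterable, Optional, Tuple


def ping_once(address: str) -> bool:
    """Send one echo request with the Windows ping tool (-n sets the count)."""
    result = subprocess.run(
        ["ping", "-n", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_failure(
    steps: Iterable[Tuple[str, str]],
    ping: Callable[[str], bool] = ping_once,
) -> Optional[str]:
    """Ping each (label, address) in order; return the first failing label."""
    for label, address in steps:
        if not ping(address):
            return label
    return None


# Placeholder addresses; only the loopback address is fixed.
STEPS = [
    ("loopback", "127.0.0.1"),
    ("local address", "172.25.128.19"),
    ("default gateway", "172.25.128.1"),
    ("remote host", "172.16.132.197"),
    ("DNS server", "172.26.128.19"),
]
```

Calling first_failure(STEPS) returns None when every target answers. Treat the exit code of ping as a rough signal only, and read the ping output itself when a result looks doubtful.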
Note
Use the ping command to test TCP/IP connectivity and to determine whether there are network problems between different computers. A failure to connect to the server causes Ping to return a "Request timed out" message.
The following example shows an unsuccessful TCP/IP configuration for the local area connection. The Media State line reports that the cable is disconnected. Also, notice that IP addresses are not displayed. The absence of IP addresses indicates that the local area network is not properly connected.
i:> ipconfig /all
Windows 2000 IP Configuration
Host Name . . . . . . . . . . . . : SERVER1
Primary DNS Suffix . . . . . . . : reskit.com
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : reskit.com
server1.reskit.com
Ethernet adapter Local Area Connection:
Media State . . . . . . . . . . . : Cable Disconnected
Description . . . . . . . . . . . : 3Com EtherLink XL 10/100 PCI TX NIC (3C905B-TX)
Physical Address. . . . . . . . . : 00-10-5A-99-F7-15
The following example shows a well-connected local area network. Notice that the IP addresses are displayed.
i:> ipconfig /all
Windows 2000 IP Configuration
Host Name . . . . . . . . . . . . : Server1
Primary DNS Suffix . . . . . . . : reskit.com
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : reskit.com
Server1.reskit.com
Ethernet adapter Local Area Connection:
Connection-specific DNS Suffix . : Server1.reskit.com
Description . . . . . . . . . . . : 3Com EtherLink XL 10/100 PCI TX NIC (3C905B-TX)
Physical Address. . . . . . . . . : 00-10-5A-99-F7-15
DHCP Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 172.25.128.19
Subnet Mask . . . . . . . . . . . : 255.255.252.0
Default Gateway . . . . . . . . . : 172.25.128.1
DNS Servers . . . . . . . . . . . : 172.26.128.19
Primary WINS Server . . . . . . . : 172.25.254.203
You might want to use the IP Configuration data of the local area connection that you obtained by using the IPConfig tool for further analysis. To make it easier to use, you can send the results to a text file. At the command line, type ipconfig /all > <local drive>:\<document title.txt> and then press ENTER. By default, the file is saved in the current directory. To view and modify the output, double-click the file that you created. For more information about TCP/IP troubleshooting, see the TCP/IP Core Networking Guide.
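Once the ipconfig /all output is in a text file, a short script can check it for the two symptoms shown in the examples above: a disconnected cable and a missing IP address. The following Python sketch is one possible approach, not a supported tool. The function names are hypothetical, and the sketch assumes a single adapter; with several adapters, later values overwrite earlier ones.

```python
from typing import Dict


def parse_ipconfig(text: str) -> Dict[str, str]:
    """Collect 'Label . . . : value' pairs from ipconfig /all output."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        label, colon, value = line.partition(":")
        label = label.strip().rstrip(" .")  # drop the dotted leader
        value = value.strip()
        if colon and label and value:
            settings[label] = value
    return settings


def connection_problem(settings: Dict[str, str]) -> str:
    """Return a short diagnosis, or an empty string if nothing stands out."""
    if "Disconnected" in settings.get("Media State", ""):
        return "cable disconnected"
    if "IP Address" not in settings:
        return "no IP address"
    return ""
```

To use it, read the saved file into a string and pass the string to parse_ipconfig, and then pass the result to connection_problem. The text encoding of the saved file depends on the console, so the sketch leaves file reading to the caller.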
To determine whether your client is functional, you can use the Netdiag tool. The Netdiag tool helps to isolate networking and connectivity problems by performing a series of tests. These tests, and the key network status information they expose, give you a more direct means of identifying and isolating network problems. Moreover, because this tool does not require that parameters or switches be specified, you can focus on analyzing the output, rather than training users on tool usage.
Specifically, the Netdiag tool tests the following:
Run netdiag.exe at the command prompt and scan through the output, looking for words like "FATAL."
For more information about the Netdiag tool, see Windows 2000 Support Tools.
The following Netdiag output shows failures during DNS registration and secure channel verification. (The failures are marked Failed or [FATAL].)
Computer Name: Server1
DNS Host Name: Server1.reskit.reskit.com
System info : NT Server 5.0 (Build 2091)
Processor : x86 Family 6 Model 5 Stepping 2, GenuineIntel
List of installed hotfixes :
Q147222
Netcard queries test . . . . . . . : Passed
Per interface results:
Adapter : Local Area Connection
Netcard queries test . . . : Passed
Host Name. . . . . . . . . : Server1.dns.reskit.com
IP Address . . . . . . . . : 172.16.85.33
Subnet Mask. . . . . . . . : 255.255.252.0
Default Gateway. . . . . . : 172.16.84.1
Primary WINS Server. . . . : 172.16.254.201
Secondary WINS Server. . . : 172.16.254.202
Dns Servers. . . . . . . . : 172.55.254.212
172.16.254.211
AutoConfiguration results. . . . . . : Passed
Default gateway test . . . : Passed
NetBT name test. . . . . . : Passed
WINS service test. . . . . : Passed
Global results:
Domain membership test . . . . . . : Passed
NetBT transports test. . . . . . . : Passed
List of NetBt transports currently configured.
NetBT_Tcpip_{69F6A885-C07C-49E4-ABFF-D15FB4B678E8}
1 NetBt transport currently configured.
Autonet address test . . . . . . . : Passed
IP loopback ping test. . . . . . . : Passed
Default gateway test . . . . . . . : Passed
NetBT name test. . . . . . . . . . : Passed
Winsock test . . . . . . . . . . . : Passed
DNS test . . . . . . . . . . . . . : Failed
[FATAL]: The DNS registration for Server1.reskit.reskit.com is incorrect on all DNS servers.
Redir and Browser test . . . . . . : Passed
List of NetBt transports currently bound to the Redir
NetBT_Tcpip_{69F6A885-C07C-49E4-ABFF-D15FB4B678E8}
The redir is bound to 1 NetBt transport.
List of NetBt transports currently bound to the browser
NetBT_Tcpip_{69F6A885-C07C-49E4-ABFF-D15FB4B678E8}
The browser is bound to 1 NetBt transport.
DC discovery test. . . . . . . . . : Passed
DC list test . . . . . . . . . . . : Failed
Trust relationship test. . . . . . : Failed
[FATAL] Secure channel to domain 'Reskit' is broken. [ERROR_NO_TRUST_SAM_ACCOUNT]
Kerberos test. . . . . . . . . . . : Skipped
LDAP test. . . . . . . . . . . . . : Passed
Bindings test. . . . . . . . . . . : Passed
WAN configuration test . . . . . . : Skipped
No active remote access connections.
Modem diagnostics test . . . . . . : Passed
The command completed successfully
For more information about diagnosing and troubleshooting DNS registration problems, see "Name Resolution" later in this chapter. For more information about diagnosing and troubleshooting secure channel problems, see "Authentication" later in this chapter.
The following example shows successful client-server connectivity by using the Netdiag tool.
Computer Name: Server1
DNS Host Name: Server1.Reskit.com
Processor : x86 Family 6 Model 5 Stepping 1, GenuineIntel
List of installed hotfixes :
Q147222
Netcard queries test . . . . . . . : Passed
Per interface results:
Adapter : Local Area Connection
Netcard queries test . . . : Passed
Host Name. . . . . . . . . : Server1.Reskit.Reskit.com
IP Address . . . . . . . . : 172.16.128.19
Subnet Mask. . . . . . . . : 255.255.252.0
Default Gateway. . . . . . : 172.16.128.1
Primary WINS Server. . . . : 172.16.254.203
Dns Servers. . . . . . . . : 172.16.128.19
AutoConfiguration results. . . . . . : Passed
Default gateway test . . . : Passed
NetBT name test. . . . . . : Passed
No remote names have been found.
WINS service test. . . . . : Passed
Global results:
Domain membership test . . . . . . : Passed
NetBT transports test. . . . . . . : Passed
List of NetBt transports currently configured.
NetBT_Tcpip_{F5A7E415-9D0B-444B-8028-11238D589BD0}
1 NetBt transport currently configured.
Autonet address test . . . . . . . : Passed
IP loopback ping test. . . . . . . : Passed
Default gateway test . . . . . . . : Passed
NetBT name test. . . . . . . . . . : Passed
Winsock test . . . . . . . . . . . : Passed
DNS test . . . . . . . . . . . . . : Passed
PASS - All the DNS entries for DC are registered on DNS server 172.16.128.19.
Redir and Browser test . . . . . . : Passed
List of NetBt transports currently bound to the Redir
NetBT_Tcpip_{F5A7E415-9D0B-444B-8028-11238D589BD0}
The redir is bound to 1 NetBt transport.
List of NetBt transports currently bound to the browser
NetBT_Tcpip_{F5A7E415-9D0B-444B-8028-11238D589BD0}
The browser is bound to 1 NetBt transport.
DC discovery test. . . . . . . . . : Passed
DC list test . . . . . . . . . . . : Passed
Trust relationship test. . . . . . : Skipped
Kerberos test. . . . . . . . . . . : Passed
LDAP test. . . . . . . . . . . . . : Passed
Bindings test. . . . . . . . . . . : Passed
WAN configuration test . . . . . . : Skipped
No active remote access connections.
Modem diagnostics test . . . . . . : Passed
The command completed successfully
You might want to use the network client and server connection data that you obtained by using the Netdiag tool for further analysis. To make it easier to use, you can send it to a text file. From the command line, type NetDiag.exe > <local drive>:\<document title.txt>, and then press ENTER. By default, the file is saved to the current directory. To view and modify the output, double-click the file.
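Scanning a long Netdiag report by eye is error prone, so it can help to filter the saved file for problem lines. The following Python sketch keeps only the lines that contain the markers seen in the failure example above, "[FATAL]" and ": Failed". The function name is hypothetical, and other Netdiag versions might use markers that this sketch does not cover.

```python
from typing import List


def netdiag_failures(text: str) -> List[str]:
    """Return the lines of saved Netdiag output that report a problem."""
    markers = ("[FATAL]", ": Failed")
    return [
        line.strip()
        for line in text.splitlines()
        if any(marker in line for marker in markers)
    ]
```

An empty list means that none of the known markers were found, which is not the same as a guarantee that every test passed.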
Verify that the domain controller is functional. To verify network connectivity for domain controllers, use the Ping tool to check your domain controller, as well as other domain controllers in the network. If they are connected, their names resolve to the correct IP addresses.
For example, carry out the following commands:
ping <your domain controller>
ping <additional domain controller>
ping <additional domain controller>
If at least one of the previous commands succeeds, also verify that the name resolves to the correct IP address for the computer. If it does, go to the next section.
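The name-to-address check can also be scripted. The following Python sketch compares the address that a name resolves to with the address you expect. The function name resolves_to is hypothetical, and the sketch uses the standard library resolver, which follows the name resolution order of the local computer rather than querying a specific DNS server.

```python
import socket
from typing import Callable


def resolves_to(
    name: str,
    expected_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """True if `name` resolves to `expected_ip`; False on mismatch or failure."""
    try:
        return resolve(name) == expected_ip
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
```

For example, resolves_to("server1.reskit.com", "172.16.128.19") checks the domain controller name and address used in the sample output in this section. The resolve parameter exists only so that the lookup can be replaced in tests.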
There are many reasons why the secure channel between a client and a domain might break. One example is if you don't have the appropriate access permissions, as shown in the following example:
[FATAL] Secure channel to domain 'RESKIT' is broken. [ERROR_ACCESS_DENIED]
> nltest /sc_query:reskit
Flags: 0
Trusted DC Name
Trusted DC Connection Status Status = 5 0x5 ERROR_ACCESS_DENIED
The command completed successfully
To validate trust connections, you normally test the secure channel:
Note
The results of nltest /sc_query are unreliable because the command returns the status of the channel when it was last used, not the current status. The recommended sequence for verifying the trust is to run nltest /sc_query first. If that returns success, run nltest /sc_reset:<domain>\<dcname returned by /sc_query>.
To determine the cause of trust relationship problems
nltest /dbflag:0x2000ffff.
The
08/30 10:15:19 [MAILSLOT] Returning paused to 'Reskit1' since: SysVol not ready
Common trust failures are the following:
Note
Installing computers that use the same computer name is often the reason for computer account problems, and therefore for broken secure channels. The common way to get around this problem is to join the computer to the domain again.
Another example of client-domain controller trust relationship problems:
D:>nltest /sc_query:reskit
Flags: 0
Trusted DC Name
Trusted DC Connection Status Status = 1787 0x6fb ERROR_NO_TRUST_SAM_ACCOUNT
The command completed successfully
The preceding example implies that the client assumes it has joined the domain. However, the client is not able to find a computer account registered for itself in the domain controller.
For more information about trust relationships, see "Active Directory Logical Structure" in this book.
The Nltest command-line tool enables you to check trust relationships, as well as the connectivity and traffic flow between a network client and a domain controller. Nltest checks the secure channel to make sure that both Windows 2000–based and Windows NT 4.0–based clients can connect to domain controllers. The tool also discovers domains and sites. Further, you can list the domain controllers and Global Catalog servers that are available. It supports user operations to identify which domain controllers are capable of logging on a specific user, as well as browsing specific user information.
To ensure that cached information is not used when a Windows 2000–based client discovers a domain controller, use the /force flag of the Nltest tool. At the command prompt, type nltest /dsgetdc:<your domain name> /force and then press ENTER.
Note
Nltest /dsgetdc: is used to exercise the locator. Thus /dsgetdc:<domain name> tries to find the domain controller for the domain. Using the force flag forces domain controller location rather than using the cache. You can also specify options such as /gc or /pdc to locate a Global Catalog or a primary domain controller emulator. For finding the Global Catalog, you must specify a "tree name," which is the DNS domain name of the root domain.
If you receive the error ERROR_NO_LOGON_SERVERS while using the Nltest tool to query the secure channel, the computer usually cannot find a domain controller for that domain. Run nltest /dsgetdc:<DomainName> to verify whether you can locate a domain controller. If you are unable to find a domain controller, examine DNS registrations and network connectivity.
For more information about verifying DNS registrations, see "Name Resolution" later in this chapter.
The following example shows an unsuccessful attempt to find a domain controller for the domain:
>nltest /SC_QUERY:reskit
Flags: 0
Trusted DC Name
Trusted DC Connection Status Status = 1311 0x51f ERROR_NO_LOGON_SERVERS
The command completed successfully
The following example shows an unsuccessful attempt to locate the domain controller for the domain by using the /dsgetdc switch:
:\>nltest /dsgetdc:reskit /force
DsGetDcName failed: Status = 1355 0x54b ERROR_NO_SUCH_DOMAIN
The following example shows a successful attempt to find a domain controller for the domain:
H:\>nltest /dsgetdc:reskit /force
DC: \\server1
Address: \\172.16.132.197
Dom Guid: ca21b03b-6dd3-11d1-8a7d-b8dfb156871f
Dom Name: reskit
Forest Name: reskit.com
Dc Site Name: Default-First-Site-Name
Our Site Name: Default-First-Site-Name
Flags: GC DS LDAP KDC TIMESERV WRITABLE DNS_FOREST CLOSE_SITE
The command completed successfully
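The failed Nltest examples in this section report the result on a line of the form "Status = <decimal> <hexadecimal> <name>". The following Python sketch extracts that status so that a script can branch on it. The function name is hypothetical, and the pattern is based only on the sample output shown here, so other Nltest versions might format the line differently.

```python
import re
from typing import Optional, Tuple

STATUS_LINE = re.compile(r"Status = (\d+) (0x[0-9A-Fa-f]+) (\w+)")


def nltest_status(output: str) -> Optional[Tuple[int, str]]:
    """Return (decimal code, symbolic name) from the first status line.

    Returns None when the output contains no status line, as in the
    successful /dsgetdc example.
    """
    match = STATUS_LINE.search(output)
    if match is None:
        return None
    code = int(match.group(1))
    if code != int(match.group(2), 16):
        raise ValueError("decimal and hexadecimal status values disagree")
    return code, match.group(3)
```

The decimal value that the function returns can be passed to net helpmsg for a description. The check that the decimal and hexadecimal values agree is a guard against transcription errors when output is copied by hand.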
To determine if the DHCP server is the problem, you can release your IP address, restart DHCP, and then renew your IP address by carrying out the following commands:
ipconfig /release
net stop dhcp
net start dhcp
ipconfig /renew
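If you repeat this sequence often, you can wrap the four commands in a script that stops at the first command that fails. The following Python sketch shows one way to do that. The function names are hypothetical, the sketch assumes that each command returns a nonzero exit code on failure, and it must be run with sufficient rights to stop and start services.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# The same four commands, in the same order, as listed above.
DHCP_REFRESH_COMMANDS: List[List[str]] = [
    ["ipconfig", "/release"],
    ["net", "stop", "dhcp"],
    ["net", "start", "dhcp"],
    ["ipconfig", "/renew"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command)).returncode


def refresh_dhcp_lease(
    run: Callable[[Sequence[str]], int] = run_command,
) -> Optional[str]:
    """Run the commands in order; return the first one that fails, if any."""
    for command in DHCP_REFRESH_COMMANDS:
        if run(command) != 0:
            return " ".join(command)
    return None
```

A return value of None means that all four commands reported success. Any other value names the command to investigate.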
If you still cannot connect the client to the domain controller (even though you have a good IP address), a Network Monitor sniffer trace of the connection attempt might be useful.
For more information about DHCP, see "Dynamic Host Configuration Protocol" in the TCP/IP Core Networking Guide. For more information about DHCP Server, see Windows 2000 Server Help.
Network Monitor sniffer traces can help you trace all of the traffic to and from a computer, as well as to and from the DHCP server that issues IP addresses. A "light" version is delivered with Windows 2000 Server. However, to use Network Monitor's full capabilities, you need the full version included with Microsoft® Systems Management Server.
To install Network Monitor
As long as you have installed the full version available from Systems Management Server, you can capture and view every packet on the network. Network Monitor isolates the network layer where a problem occurred, or where an operation failed, and helps you determine the cause of the problem.
Note
Run Network Monitor on the computer that is having the problems, or on another computer that feeds into the same microhub. For more information about Network Monitor, see the Server Operations Guide.
Because the Network Monitor sniffer trace captures the entire exchange that occurs on the wire, you can scan this quickly and determine the source of a particular problem. For example, if you have a reproducible problem, a sniffer trace (or capture) helps determine the actual operation that failed. It displays the speed of operations, the source to network traffic, if packets are being dropped or if processes are experiencing
A typical example of monitoring network traffic by using Network Monitor is one where you install Network Monitor on your main working computer. Assuming that all of your computers are connected to the same hub, you can use your main computer to sniff each of the other computers on the network. For example, to monitor another computer, obtain its address and add it, as an Ethernet address, with the name of the monitored computer. Next, you can filter the sniffer trace so that you capture activity for the monitored computer.
Note
The Ethernet address (and not the IP address) is used for filtering when you want to see all traffic, be it IP or IPX. This is useful because delays can involve multiple transports.
When you are finished viewing the capture of the monitored computer, you can select another filter.
Windows 2000 includes the ability for clients to register DNS records automatically with DNS servers configured to accept these updates. The following example shows the captured network frame and indicates that the frames are client requests to dynamically update the DNS server.
DNS: 0x1B:Dyn Upd UPD records to MYSERVER.mycorp.com. of type Host Addr
DNS: Query Identifier = 27 (0x1B)
DNS: DNS Flags = Query, OpCode - Dyn Upd, RCode - No error
DNS: 0............... = Request
-----> DNS: .0101........... = Dynamic Update
DNS: .....0.......... = Server not authority for domain
DNS: ......0......... = Message complete
DNS: .......0........ = Iterative query desired
DNS: ........0....... = No recursive queries
DNS: .........000.... = Reserved
DNS: ............0000 = No error
DNS: Zone Count = 1 (0x1)
DNS: Prerequisite Section Entry Count = 0 (0x0)
DNS: Update Section Entry Count = 3 (0x3)
DNS: Additional Records Count = 0 (0x0)
DNS: Update Zone: mycorp.com. of type SOA on class INET addr.
DNS: Update Zone Name: mycorp.com.
DNS: Update Zone Type = Start of zone of authority
DNS: Update Zone Class = Internet address class
DNS: Update: MYSERVER.mycorp.com. of type Host Addr on class Req.
for any(2 records present)
DNS: Resource Record: MYSERVER.mycorp.com. of type Host Addr
on class Req. for any(2 records present)
DNS: Resource Name: MYSERVER.mycorp.com.
DNS: Resource Type = Host Address
DNS: Resource Class = Request for any class
DNS: Time To Live = 0 (0x0)
DNS: Resource Data Length = 0 (0x0)
This frame also includes the record to be written:
DNS: Resource Record: MYSERVER.mycorp.com. of type Host Addr
on class INET addr.
DNS: Resource Name: MYSERVER.mycorp.com.
DNS: Resource Type = Host Address
DNS: Resource Class = Internet address class
DNS: Time To Live = 1200 (0x4B0)
DNS: Resource Data Length = 4 (0x4)
DNS: IP address = 100.2.0.3 ---> example IP address
The version of Microsoft Network Monitor included with Windows 2000 Server parses these frames correctly and displays DNS dynamic update frames.
Note
If you are using a third-party version or an earlier version of Network Monitor, you can identify DNS dynamic update frames by the four bits in the "DNS Flags" section of the frame.
The following example displays the four bits in the DNS Flags section:
DNS: 0x17:Std Qry for mycorp.com. of type SOA on class INET addr.
DNS: Query Identifier = 23 (0x17)
DNS: DNS Flags = Query, OpCode - Std Qry, RD Bits Set, RCode - No
error
DNS: 0............... = Query
-----> DNS: .0101........... = Reserved (a value of 5 (0101) here =
Dynamic DNS Update Record)
DNS: .....0.......... = Server not authority for domain
DNS: ......0......... = Message complete
DNS: .......1........ = Recursive query desired
DNS: ........0....... = No recursive queries
DNS: .........000.... = Reserved
DNS: ............0000 = No error
This frame also includes the record to be written:
DNS: Authority Section: MYSERVER.mycorp.com. of type Host Addr on
class INET addr.
DNS: Resource Name: MYSERVER.mycorp.com.
DNS: Resource Type = Host Address
DNS: Resource Class = Internet address class
DNS: Time To Live = 3600 (0xE10)
DNS: Resource Data Length = 4 (0x4)
DNS: IP address = 100.2.0.3 ---> example IP address
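The four bits marked in the frames above are the OpCode field of the 16-bit DNS flags word, and a value of 5 (binary 0101) identifies a dynamic update. When a parser shows only the raw flags value, the check can be done by hand or with a few lines of Python. The following sketch assumes that the flags are available as an integer; the function names are hypothetical.

```python
DNS_OPCODE_UPDATE = 5  # binary 0101, as marked in the frames above


def dns_opcode(flags: int) -> int:
    """Extract the 4-bit OpCode that follows the request/response bit."""
    return (flags >> 11) & 0xF


def is_dynamic_update(flags: int) -> bool:
    """True if the flags word describes a DNS dynamic update message."""
    return dns_opcode(flags) == DNS_OPCODE_UPDATE


print(is_dynamic_update(0b0010100000000000))  # True
```

The shift by 11 skips the reply-code, reserved, recursion, truncation, and authority bits that sit to the right of the OpCode, and the mask keeps only the four OpCode bits.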
When you use Network Monitor, be aware of the following:
3268 = LDAP (under the TCP_Handoffset section)
For more information about the Network Monitor tool, see the Server Operations Guide. For more information about the DNS service, see "Windows 2000 DNS" in the TCP/IP Core Networking Guide.
To determine whether there is a problem with the redirector, type net config rdr at the command prompt, and then press ENTER.
If the workstation is not active on at least one transport, you see something similar to the example that follows. The net config rdr command shows how the redirector or workstation is currently configured on your computer.
Computer name \\Reskit
User name User1
Workstation active on NetbiosSmb <000000000000>
Software version Windows 2000
Workstation domain NTWKSTA
Logon domain RESKIT
COM Open Timeout (sec) 0
COM Send Count (byte) 16
COM Send Timeout (msec) 250
The command completed successfully.
The workstation must be active on at least one transport, such as NetBT_Tcpip, as shown in the following example.
Computer name \\Reskit
User name User1
Workstation active on NetbiosSmb <000000000000>
NetBT_Tcpip_{24B6F8FC-0CE6-11D1-8F1A-A0BC38451EB2} (00C04FD8D37F)
Software version Windows 2000
Workstation domain NTWKSTA
Logon domain RESKIT
COM Open Timeout (sec) 0
COM Send Count (byte) 16
COM Send Timeout (msec) 250
The command completed successfully.
If it is not, you have networking problems in the redirector, in the transport, or in Plug and Play functionality. One main cause of not having at least one transport bound to the redirector or workstation is a duplicate name conflict.
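The difference between the two net config rdr examples is whether a NetBT_Tcpip transport is listed next to NetbiosSmb. The following Python sketch lists the transports found in saved command output and applies that reading. The function names are hypothetical, and the rule that a usable transport has a name that starts with "NetBT_" is an assumption drawn from the two examples, not a documented rule.

```python
import re
from typing import List

# A transport entry is a name followed by a 12-digit hexadecimal address
# in angle brackets or parentheses, as in the examples above.
TRANSPORT = re.compile(r"^(\S+)\s+[<(][0-9A-Fa-f]{12}[>)]$")
PREFIX = "Workstation active on"


def active_transports(output: str) -> List[str]:
    """Return the transport names listed in net config rdr output."""
    names = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            line = line[len(PREFIX):].strip()
        match = TRANSPORT.match(line)
        if match:
            names.append(match.group(1))
    return names


def has_netbt_transport(output: str) -> bool:
    """True if at least one NetBT_ transport is listed (assumed criterion)."""
    return any(name.startswith("NetBT_") for name in active_transports(output))
```

If your network uses a transport other than NetBT over TCP/IP, adjust the prefix test in has_netbt_transport accordingly.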
Note
You might experience a delay when you attempt to connect to network resources from a system with multiple redirectors installed. This delay happens only the first time that you attempt the connection.
For more information about the redirector, see the Microsoft Platform SDK link on the Web Resources page at http://windows.microsoft.com/windows2000/reskit/webresources.