Chapter 2: Installation Problems

The installation process for MSCS is very simple compared to other network server applications, and it usually completes within a few minutes. For a software package that does so much, the speed with which MSCS installs might surprise you. In reality, MSCS is more complex behind the scenes, and installation depends greatly on the compatibility and proper configuration of the system hardware and networks. If the hardware configuration is not acceptable, you should expect installation problems. After installation, be sure to verify that the entire cluster operates properly before you install additional software.

MSCS Installation Problems with the First Node

Is hardware compatible?

It is important to use certified systems for MSCS installations. Use systems and components from the MSCS Hardware Compatibility List (HCL). For many, the main reason for installing a cluster is to achieve high availability of their valuable resources. Why compromise availability by using unsupported hardware? Microsoft supports only MSCS installations that use certified complete systems from the MSCS Hardware Compatibility List. If the system fails and you need support, unsupported hardware may leave you without that support and compromise the high availability you set out to achieve.

Is the shared SCSI bus connected and configured properly?

MSCS relies heavily on the shared SCSI bus. You must have at least one device on the shared bus for the cluster to store the quorum logfile and act as the cluster's quorum disk. Access to this disk is vital to the cluster. In the event of a system failure or loss of network communication between nodes, cluster nodes will arbitrate for access to the quorum disk to determine which system will take control and make decisions. The quorum logfile holds information regarding configuration changes made within the cluster when another node may be offline or unreachable. The installation process requires at least one device on the shared bus for this purpose. A hardware RAID logical partition or separate physical disk drive will be sufficient to store the quorum logfile and function as the quorum disk.

To check proper operation of the shared SCSI bus, consult the section Troubleshooting Shared SCSI Bus in this document.

Install Windows NT Server, Enterprise Edition, and Service Pack 3

MSCS version 1.0 requires Microsoft Windows NT Server, Enterprise Edition, version 4.0 with Service Pack 3 or later. If you add network adapters or other hardware devices and drivers later, it's important to reapply the service pack to ensure that all drivers, DLLs, and system components are of the same version. Hotfixes may require reapplication if they are overwritten. Check with Microsoft Product Support Services or the Microsoft Knowledge Base regarding applied hotfixes, and to determine whether the hotfix needs to be reapplied.

Does the system disk have adequate free space to install the product?

MSCS requires only a few megabytes to store files on each system. The Setup program prompts for the path to store these files. The path should be to local storage on each server, not to a drive on the shared SCSI bus. Make sure that free space exists on the system disk, both for installation requirements and for normal system operation.
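The free-space check described above can be sketched in a few lines of Python. This is only an illustration and is not part of MSCS or its Setup program. The function name and the threshold passed to it are assumptions, because the text says only that "a few megabytes" are needed.

```python
import shutil


def has_free_space(path, required_mb):
    """Return True if the volume holding `path` has at least `required_mb` MB free.

    `required_mb` is a hypothetical threshold chosen by the caller; the
    document does not state an exact figure for MSCS.
    """
    usage = shutil.disk_usage(path)          # total, used, and free bytes
    free_mb = usage.free / (1024 * 1024)     # convert bytes to megabytes
    return free_mb >= required_mb
```

For example, calling has_free_space("C:\\", 100) on each node before running Setup would confirm that the local system drive, rather than a drive on the shared SCSI bus, has room for the files and for normal operation.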

Does the server have a properly sized system paging file?

If you've experienced reduced system performance or near system lockup during the installation process, check the Performance tab in the System utility in Control Panel. Make sure the system has acceptable paging file space (the minimum space required is the amount of physical RAM plus 11 MB), and that the system drive has enough free space to hold a memory dump file, should a system crash occur. Also, make sure pagefiles are on local disks only, not on shared drives. Performance Monitor may be a valuable resource for troubleshooting virtual memory problems.
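The paging file rule above is simple arithmetic, shown here as a short Python sketch. The function names are illustrative only; the one fact taken from the text is the minimum of physical RAM plus 11 MB.

```python
def min_paging_file_mb(physical_ram_mb):
    """Minimum paging file size in MB: physical RAM plus 11 MB."""
    return physical_ram_mb + 11


def paging_file_ok(paging_file_mb, physical_ram_mb):
    """Return True if the configured paging file meets the minimum."""
    return paging_file_mb >= min_paging_file_mb(physical_ram_mb)
```

A server with 256 MB of RAM would therefore need a paging file of at least 267 MB, so paging_file_ok(256, 256) is False and paging_file_ok(267, 256) is True.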

Do both servers belong to the same domain?

Both servers in the cluster must have membership in the same domain. Also, the service account that the cluster service uses must be the same on both servers. Cluster nodes may be domain controllers or domain member servers. However, if a node functions as a domain member server, a domain controller must be accessible for cluster service account authentication. This is a requirement for any service that starts using a domain account.

Is the primary domain controller (PDC) accessible?

During the installation process, Setup must be able to communicate with the PDC. Otherwise, the setup process will fail. Additionally, after setup, the cluster service may not start if domain controllers are unavailable to authenticate the cluster service account. For best results, make sure each system has connectivity with the PDC, and install each node as a backup domain controller in the same domain.

Are you installing while logged on as an administrator?

To install MSCS, you must have administrative rights on each server. For best results, log on to the server with an administrative account before you start Setup.

Do the drives on the shared SCSI bus appear to be functioning properly?

Devices on the shared SCSI bus must be turned on, configured, and functioning properly. Consult the Microsoft Cluster Server Administrator's Guide for information on testing the drives before setup.

Are any errors listed in the event log?

Before you install new software of any kind, it is good practice to check the system and application event logs for errors. These logs can indicate the state of the system before you make configuration changes. Events may also be posted to these logs if installation errors or hardware malfunctions occur during the installation process. Attempt to correct any problems you find. Appendix A of this document contains information regarding some events that may be related to MSCS and possible resolutions.

Is the network configured and functioning properly?

MSCS relies heavily on configured networks for communications between cluster nodes, and for client access. If the networks are misconfigured or malfunctioning, the cluster software cannot function properly. The installation process attempts to validate attached networks and needs to use them during the process. Make sure that the network adapters and TCP/IP protocol are configured properly with correct IP addresses. If necessary, consult with your network administrator for proper addressing.

For best results, use statically assigned addresses and do not rely on DHCP to supply addresses for these servers. Also, make sure you're using the correct network adapter driver. Some adapter drivers may appear to work because they are similar enough to the actual driver needed but are not an exact match. For example, an OEM or integrated network adapter may use the same chipset as a standard version of the adapter. Use of the same chipset may cause the standard version of the driver to load instead of an OEM supplied driver. Some of these adapters work more reliably with the driver supplied by the OEM, and may not attain acceptable performance if using the standard driver. In some cases, this combination may prevent the adapter from functioning at all, even though no errors appear in the system event log for the adapter.

Cannot Install MSCS on the Second Node

The previous section, "MSCS Installation Problems with the First Node," contains questions you need to ask if installation on the second node fails. Consult that section first, before you continue with the additional troubleshooting questions in this section.

During installation, are you specifying the same cluster name to join?

When you install the second node, select the Join an Existing Cluster option. The first node you installed must be running at the time with the cluster service running.

Is the RPC service running on both systems?

MSCS uses remote procedure calls (RPC) and requires that the RPC service be running on both systems. Check to make sure that the RPC service is running on both systems and that the system event logs on each server do not have any RPC-related errors.
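One quick remote check is to see whether the other node accepts TCP connections on the RPC endpoint mapper port. The Python sketch below assumes TCP port 135, the standard endpoint mapper port on Windows; the document itself does not name a port, and the function name is hypothetical. A successful connection shows only that something is listening, not that the RPC service is healthy, so the event log check above still applies.

```python
import socket

# Assumption: the Windows RPC endpoint mapper listens on TCP port 135.
RPC_ENDPOINT_MAPPER_PORT = 135


def port_is_reachable(host, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from each node against the other node's address, this gives a first indication of whether RPC traffic can get through at all.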

Can the nodes communicate with one another over configured networks?

Evaluate network connectivity between systems. If you used the procedures in the preinstallation section of this document, you've already covered the basics. During installation of the second node, the installation program communicates through the server's primary network and through any other networks that were configured during installation of the first node. Therefore, you should test connectivity again with the IP addresses on these adapters. Additionally, the cluster name and associated IP address you configured earlier will be used. Make sure the cluster service is running on the first node and that the cluster name and cluster IP address resources are online and available. Also, make sure that the correct network was specified for the cluster IP address when the first node was installed. The cluster service may be registering the cluster name on the wrong network. The cluster name resource should be registered on the network that clients will use to connect to the cluster.
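To confirm that the cluster name resolves to the cluster IP address you configured, a small Python sketch such as the following may help. The function name, the example cluster name, and the example address are hypothetical. The sketch uses whatever name resolution the local system provides, which may differ from the mechanism your clients use.

```python
import socket


def name_resolves_to(name, expected_ip, resolver=socket.gethostbyname):
    """Return True if `name` resolves to `expected_ip`.

    `resolver` defaults to the system resolver; it is a parameter only so
    that the check can be exercised without a live network.
    """
    try:
        return resolver(name) == expected_ip
    except OSError:
        # Name resolution failures (socket.gaierror) are OSError subclasses.
        return False
```

For instance, name_resolves_to("MYCLUSTER", "10.0.0.5") run from the second node would return False if the name is unregistered or is registered against an address on the wrong network.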

Are both nodes connected to the same network or subnet?

Both nodes need to use unique addresses on the same network or subnet. The cluster nodes need to be able to communicate directly, without routers or bridges between them. If the nodes are not directly connected to the same public network, it will not be possible to fail over IP addresses.
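The addressing rule above (unique addresses, same subnet) can be checked with a short Python sketch that uses the standard ipaddress module. The function name and the sample addresses are illustrative assumptions. The check covers addressing only; it cannot detect a router or bridge placed between the nodes.

```python
import ipaddress


def same_subnet(addr_a, addr_b, netmask):
    """Return True if the two addresses are different and share one subnet."""
    # strict=False lets us pass a host address and derive its network.
    net_a = ipaddress.ip_network(f"{addr_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{netmask}", strict=False)
    unique = ipaddress.ip_address(addr_a) != ipaddress.ip_address(addr_b)
    return unique and net_a == net_b
```

With a subnet mask of 255.255.255.0, the addresses 10.0.0.1 and 10.0.0.2 pass this check, while 10.0.0.1 and 10.0.1.2 do not.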

Cannot Reinstall MSCS After Node Evicted

If you evict a node from the cluster, it can no longer participate in cluster operations. If you restart the evicted node and have not removed MSCS from it, the node will still attempt to join, and cluster membership will be denied. You must remove MSCS with the Add/Remove Programs utility in Control Panel. This action requires that you restart the system. If you ignore the option to restart, and attempt to reinstall the software anyway, you may receive the following error message:

Figure 3. Microsoft Cluster Server Error Message

If you receive this message, restart the affected system and reinstall the MSCS software to join the existing cluster.