MSCS/Cluster Does Not Form with Error Messages 170 and 5086
ID: Q249194
The information in this article applies to:
- Microsoft Windows 2000 Advanced Server
SYMPTOMS
If the Cluster service is started simultaneously on both nodes, the service may
fail to start on one of the nodes and display the following error message:
C:\>net start clussvc
The Cluster Service service is starting...............
The Cluster Service service could not be started.
A system error has occurred.
More help is available by typing NET HELPMSG 5086.
To determine what error message 5086 means, type net helpmsg 5086 at a command prompt. The result is:
The quorum disk could not be located by the cluster service.
The Cluster Diagnostic log indicates that arbitration did not succeed, with the following error message:
170
To determine what error message 170 means, type net helpmsg 170 at a command prompt. The result is:
The requested resource is in use.
CAUSE
When you start the service on both nodes at the same time, both try to arbitrate for the quorum device. Only one of the nodes survives the disk arbitration.
On the node that did not survive the arbitration, the Cluster service mistakenly exits with a status of Stopped instead of Failed.
A Stopped status does not activate Service Control Manager's failure logic (the actions defined on the Recovery tab of the service's properties in the Services tool), so the service is never restarted.
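The difference between the two exit statuses can be pictured with a toy model. This is only an illustration of the behavior described above, not Service Control Manager's actual implementation; the names ExitStatus and recovery_action_runs are hypothetical.

```python
from enum import Enum


class ExitStatus(Enum):
    """The two exit statuses discussed in this article."""
    STOPPED = "Stopped"
    FAILED = "Failed"


def recovery_action_runs(status, configured_action):
    """Toy model: a Recovery-tab action fires only for a Failed exit.

    `configured_action` stands in for whatever is set on the Recovery tab
    (for example "restart"); None means no action is configured.
    """
    return configured_action is not None and status is ExitStatus.FAILED
```

In this model, recovery_action_runs(ExitStatus.STOPPED, "restart") is False, which mirrors why the node that loses arbitration stays down even when a restart action is configured.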
RESOLUTION
To work around this behavior, attempt to start the service again manually by typing net start clussvc at a command prompt.
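The manual retry can also be scripted. The following is a minimal sketch, not a supported tool: it assumes that net start returns a nonzero exit code when the service does not start, and the function name, retry count, and delay are illustrative choices rather than values taken from this article.

```python
import subprocess
import time


def start_with_retry(run=None, attempts=3, delay_seconds=30, sleep=time.sleep):
    """Run `net start clussvc` up to `attempts` times; return True on success.

    `run` is a callable returning an exit code. It is injectable so the
    retry logic can be exercised without a real cluster node.
    """
    if run is None:
        # Default runner: invoke the real command and report its exit code.
        def run():
            return subprocess.run(["net", "start", "clussvc"]).returncode

    for attempt in range(1, attempts + 1):
        if run() == 0:  # assumption: exit code 0 means the service started
            return True
        if attempt < attempts:
            # Wait so the other node can finish quorum arbitration first.
            sleep(delay_seconds)
    return False
```

Calling start_with_retry() with no arguments on the affected node runs the real command; passing a stub for run allows the loop to be tested elsewhere.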
To prevent this problem from occurring when the computers are restarted at the same time (for example, after a power failure), stagger the timing of the Cluster service start. A delay of 30 seconds should be sufficient and should not have a significant impact on server uptime. To do so:
- Right-click My Computer, and then click Properties.
- On the Advanced tab, click Startup and Recovery.
- Click to select the Display list of operating systems for nn seconds check box. Set the time value so that it differs from the value on the other node by 30 seconds.
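When planning the values to enter on each node, the arithmetic can be sketched as follows. The node names, the 30-second base value, and the function name are assumptions made for illustration; only the 30-second difference between nodes comes from this article.

```python
def staggered_timeouts(node_names, base_seconds=30, step_seconds=30):
    """Return a boot-menu timeout per node, each `step_seconds` apart.

    The first node keeps `base_seconds`; every following node waits
    `step_seconds` longer than the one before it.
    """
    return {
        name: base_seconds + index * step_seconds
        for index, name in enumerate(node_names)
    }
```

For a two-node cluster with the hypothetical names NODE-A and NODE-B, this gives 30 seconds for the first node and 60 seconds for the second.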
STATUS
Microsoft has confirmed this to be a problem in Windows 2000 Advanced Server.
MORE INFORMATION
The following text is an excerpt from the Cluster log of one of the nodes reproducing this problem. The other node may not log these errors.
000005c8.000004c0::1999/12/12-03:53:20.843 Physical Disk <Disk G:>: [DiskArb] Failed to read (sector 12), error 170.
000005c8.000004c0::1999/12/12-03:53:20.843 Physical Disk <Disk G:>: [DiskArb]Arbitrate returned status 170.
0000039c.000003c8::1999/12/12-03:53:20.843 [MM] MmSetQuorumOwner(0,0), old owner 1.
0000039c.000003c8::1999/12/12-03:53:20.843 [FM] FmGetQuorumResource failed, error 170.
0000039c.000003c8::1999/12/12-03:53:20.843 [INIT] ClusterForm: Could not get quorum resource. No fixup attempted. Status = 5086
0000039c.000003c8::1999/12/12-03:53:20.843 [INIT] Cleaning up failed form attempt.
0000039c.000003c8::1999/12/12-03:53:20.843 [INIT] Failed to form cluster, status 5086.
0000039c.000003c8::1999/12/12-03:53:20.843 [CS] ClusterInitialize failed 5086
0000039c.000003c8::1999/12/12-03:53:20.843 [INIT] The cluster service is shutting down.
0000039c.000003c8::1999/12/12-03:53:20.843 [EVT] EvShutdown
0000039c.000003c8::1999/12/12-03:53:20.843 [FM] Shutdown: Failover Manager requested to shutdown groups.
0000039c.000003c8::1999/12/12-03:53:20.843 [FM] FmpCleanupGroups: Entry
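To check a Cluster log for this signature, the error and status codes can be pulled out with a short script. This is a sketch under stated assumptions: the entry format is inferred only from the excerpt above, the function names are hypothetical, and the code table holds just the two messages quoted in this article.

```python
import re

# Messages for the two codes, as reported by net helpmsg in this article.
KNOWN_CODES = {
    170: "The requested resource is in use.",
    5086: "The quorum disk could not be located by the cluster service.",
}

# Entry prefix as seen in the excerpt: process.thread::timestamp message
ENTRY_RE = re.compile(
    r"^[0-9a-f]{8}\.[0-9a-f]{8}::"
    r"(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3}) (.*)$"
)
# A number that directly follows "error", "status", or "failed".
CODE_RE = re.compile(r"(?:error|status|failed)\s*=?\s*(\d+)", re.IGNORECASE)


def iter_entries(log_text):
    """Yield one string per log entry, rejoining wrapped continuation lines."""
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if ENTRY_RE.match(line):
            if current is not None:
                yield current
            current = line
        elif current is not None and line:
            current += " " + line
    if current is not None:
        yield current


def find_error_codes(log_text):
    """Return (timestamp, code, meaning) for each code found in the log."""
    results = []
    for entry in iter_entries(log_text):
        timestamp, message = ENTRY_RE.match(entry).groups()
        for code in CODE_RE.findall(message):
            number = int(code)
            results.append((timestamp, number, KNOWN_CODES.get(number, "unknown")))
    return results
```

A log in which error 170 from [DiskArb] is followed by status 5086 from [INIT] matches the pattern described in this article.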
Additional query words:
Keywords : ntstart w2000mscs
Version : WINDOWS:2000
Platform : WINDOWS
Issue type : kbprb