Interpreting the Cluster Log |
Forming a cluster involves the following stages:
The Cluster service might have used stale object information at startup; now it needs to destroy and then recreate all the group and resource objects in order to refresh their information.
The cluster can be successfully formed before all the resources have been brought online.
The form operation starts at the following entry:
378.380::1999/06/09-18:00:46.615 [INIT] Attempting to form cluster
CLUSTER1
The form operation ends at the following entry:
378.380::1999/06/09-18:00:51.787 [INIT] Successfully formed a cluster.
The following sections discuss some of the significant entries in a form operation.
The following entry describes starting the Resource Monitor:
15c.0d4::1999/06/09-18:00:46.694 [RM] Main: Initializing.
The entries in this section describe the following:
The following entries describe bringing online the group that is going to contain the Quorum resource. Note the following:
378.380::1999/06/09-18:00:47.069 [FM] Creating group
254ef0e7-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:47.084 [FM] Initializing group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:47.100 [FM] Name for Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 is 'Disk Group 1'.
378.380::1999/06/09-18:00:47.100 [FM] Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14.
In the following entries, resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14 (the Physical Disk resource, which is the Quorum resource) is created and initialized using the values from the registry. (Note the entry that gives the display name of the resource.) The cluster key configuration for this resource specifies the following:
378.380::1999/06/09-18:00:47.100 [FM] Creating resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:47.100 [FM] Initializing resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:47.100 [FM] Name for Resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14 is 'Disk D:'.
378.380::1999/06/09-18:00:47.100 [FM] FmpAddPossibleEntry: adding node 1
as possible host for resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14.
Using the cluster configuration data in the registry, the Failover Manager creates the resources on which the Physical Disk resource (D:) depends.
378.380::1999/06/09-18:00:47.100 [FM] All dependencies for resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14 created.
In the following entry, the node arbitrates for the ownership of the Quorum resource.
378.380::1999/06/09-18:00:47.115 [FM] arbitrate for quorum resource id
254ef0e8-1937-11d3-b3fe-00a0c986aa14.
Note that in adding possible owners for the Quorum resource, there are no other nodes presently in the cluster. This fact explains the following two entries.
378.380::1999/06/09-18:00:47.115 [FM] FmpAddPossibleNode: adding node 1
to resource type's possible node list
378.380::1999/06/09-18:00:47.115 [FM] FmpAddPossibleNode: Warning, node
2 not found
The following entry reports adding the Quorum resource to the Resource Monitor's list of resources to watch and help manage. This entry occurs for each resource as it is added to the list.
378.380::1999/06/09-18:00:47.115 [FM] FmpRmCreateResource: creating
resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14 in shared resource
monitor
In the following entry, the PnP window is the Plug and Play window, which gets messages that indicate changes in the disk configuration.
15c.46c::1999/06/09-18:00:47.740 Physical Disk: PnP window created
successfully.
In the following entry, the Cluster service makes node 1 the Quorum resource owner.
378.380::1999/06/09-18:00:47.897 [MM] MmSetQuorumOwner(1,1), old owner
0.
Of the entries that report disk arbitration, the significant ones are the following. In the second entry, the node's ownership of the Quorum resource has been confirmed.
15c.458::1999/06/09-18:00:47.912 Physical Disk <Disk D:>:
[DISKARB]Assume ownership of a quorum device.
15c.458::1999/06/09-18:00:47.912 Physical Disk <Disk D:>:
[DISKARB]Arbitrate returned status 0.
As the Quorum resource is brought online, the following [GUM] entries report a successful GUM update. GUM relies on one node to be the master when issuing updates to the other nodes. This master is called the "locker."
Note that the sequence number is 1657. You can use this sequence number to track the update.
378.380::1999/06/09-18:00:47.928 [GUM] GumSendUpdate: Locker waiting
type 0 context 8
378.380::1999/06/09-18:00:47.928 [GUM] Thread 0x380 UpdateLock wait on
Type 0
378.380::1999/06/09-18:00:47.928 [GUM] DoLockingUpdate successful, lock
granted to 1
378.380::1999/06/09-18:00:47.928 [GUM] GumSendUpdate: Locker dispatching
seq 1657 type 0 context 8
378.380::1999/06/09-18:00:47.928 [GUM] GumpDoUnlockingUpdate releasing
lock ownership
378.380::1999/06/09-18:00:47.928 [GUM] GumSendUpdate: completed update
seq 1657 type 0 context 8
The description "type 0 context 8" occurs frequently in cluster logs. "Type 0 context 8" means that GUM is performing an update for the Failover Manager and that the update concerns a change in the state of a resource (in this case, the Quorum resource). Here's why:
Table 20.2 Update Types
Type Code | Type |
---|---|
0 | GumUpdateFailoverManager |
1 | GumUpdateRegistry |
2 | GumUpdateMembership |
In the following entries, the state of the Quorum resource is propagated to the cluster nodes; then the Quorum resource is brought online:
378.380::1999/06/09-18:00:47.928 [FM] FmpPropagateResourceState:
resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14 pending event.
378.380::1999/06/09-18:00:47.928 [FM] FmpRmOnlineResource: Resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14 pending
378.380::1999/06/09-18:00:47.944 [FM] FmpRmOnlineResource: Returning.
Resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14, state 129, status
997.
Next, the Cluster service makes sure the drive-letter assignment data is consistent across the cluster.
In the following entry, the Cluster service lists the partition table data associated with the drive letter D:
15c.470::1999/06/09-18:00:47.944 Physical Disk <Disk D:>: Mountie[0]: 1,
let=?, start=104000, len=FF9FC000.
In the following entries, the Cluster service synchronizes drive-letter information on all the sources (HKEY_LOCAL_MACHINE\SYSTEM\Disk, \DiskInfo, \MountMgr, and \MountVolumeInfo). The information in the HKEY_LOCAL_MACHINE\SYSTEM\Disk subkey is out-of-date, and the fourth and fifth entries report the updating of the subkey.
15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:
Registry-System\DISK.GetInfo returned 0 [0:0].
15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:
ClusReg-DiskInfo selected.
15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:
DriveLetters mask is now 00000008.
15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:
Update needed for 08.
15c.470::1999/06/09-18:00:48.053 Physical Disk <Disk D:>: FtInfo_Set:
Update successful.
The following entry means that all the sources of drive letter information have been updated and are synchronized. In the description "Update needed for nn," where "nn" is a two-digit number; "00" means that nothing needs updating. Note that in the fourth entry of the preceding example, the description reads "Update needed for 08."
15c.470::1999/06/09-18:00:48.053 Physical Disk <Disk D:>: MountieUpdate:
Update needed for 00.
In the following entry, the Resource Monitor notifies the Cluster service that the Disk D: resource is online.
15c.470::1999/06/09-18:00:48.240 [RM] RmpSetResourceStatus, Posting
state 2 notification for resource <Disk D:>
The following entry indicates that the operation of bringing the Quorum resource online has succeeded.
378.3a4::1999/06/09-18:00:48.240 [DM] DmpQuoObjNotifyCb: Quorum resource
is online
The quorum log, also known as the change log, records changes to the cluster configuration.
At this point in the cluster formation operation, after the Quorum resource is online, the Cluster service does the following:
In the following entries, the Cluster service opens the quorum log:
378.3a4::1999/06/09-18:00:48.240 [DM] DmpQuoObjNotifyCb: Own quorum
resource, try open the quorum log
378.3a4::1999/06/09-18:00:48.303 [DM] DmpQuoObjNotifyCb: the name of the
quorum file is D:\MSCS\quolog.log
378.3a4::1999/06/09-18:00:48.303 [lm] LogCreate : Entry
FileName=D:\MSCS\quolog.log MaxFileSize=0x00010000
378.3a4::1999/06/09-18:00:48.397 [lm] LogpCreate : Entry
The service mounts the log and prepares it so that the log can be read and written:
378.3a4::1999/06/09-18:00:48.443 [lm] LogpMountLog : Entry
pLog=0x000a0650
378.3a4::1999/06/09-18:00:48.443 [lm]:LogpMountLog::Quorumlog File
size=0x00008000
The following entries are typical of those in which the Cluster service checks for or reads log records at a particular offset in the file.
378.3a4::1999/06/09-18:00:48.443 [lm]:LogpMountLog::checking LSN
0x00000408
.
.
.
378.3a4::1999/06/09-18:00:48.443 [lm]:LogpMountLog::checking LSN
0x00000a48
In the cluster form operation, applying quorum log changes to the cluster database involves the following:
A tombstone file is created when a cluster node's quorum-device information is outdated; the tombstone file alerts the node's Cluster service of the fact that the quorum directory is not located where the node's quorum-device information currently says it is.
For example, if node A was not running when node B changed the location of the quorum device, node A has outdated quorum-device information and node B has the current quorum-device information. If node B now goes down and node A comes up, node A's outdated quorum-device information points it to a tombstone file. The tombstone file doesn't point to the new Quorum resource. The result is that node A can only join the cluster; it cannot form a cluster.
In the following entries, the Database Manager checks whether a valid quorum tombstone exists. The return value of zero in the second entry means the DmpChkQuoTombstone function encountered no errors in trying to validate the tombstone information and the form operation can continue.
378.380::1999/06/09-18:00:48.459 [DM] DmpChkQuoTombStone - Entry
378.380::1999/06/09-18:00:48.459 [DM] DmpChkQuoTombStone:
Exit,dwError=0x00000000
If DmpChkQuoTombstone had found a valid quorum tombstone, the Database Manager would have logged an error in the event log, and the Cluster service would have exited immediately. In that case, you wouldn't see the Exit statement from DmpChkQuoTombstone in the cluster log.
In unrolling the quorum log, the Database Manager does the following:
Each configuration change is updated to any cluster nodes. Even though in the cluster form operation there is no other node, each change that occurs in the form operation prompts an update by the GUM.
The following entries are part of the Database Manager's finding the last cluster database checkpoint (sequence number 1656). The first entry notes the sequence number for the cluster key in the local registry.
378.380::1999/06/09-18:00:48.459 [DM] DmpApplyChanges: The current
registry sequence number 1656
378.380::1999/06/09-18:00:48.459 [lm] LogGetLastChkPoint:: Entry
378.328::1999/06/09-18:00:48.537 [FM] HandleResourceTransition: Resource
Name = 254ef0e8-1937-11d3-b3fe-00a0c986aa14 old state=129 new
state=2
After calling LogGetLastChkPoint, the Log Manager writes the sequence number. Be sure to track such activity by its thread. For example, the following entries are interspersed with entries from the 378.328 thread, which originated with the Failover Manager.
378.380::1999/06/09-18:00:48.553 [LM] LogGetLastChkPoint: ChkPt File
D:\MSCS\chk678.tmp ChkPtSeq=1656 ChkPtLsn=0x00000678 Checksum=93996
378.328::1999/06/09-18:00:48.568 FmpPropagateResourceState: signalling
the ghQuoOnlineEvent
378.380::1999/06/09-18:00:48.568 [LM] LogGetLastChkPoint exit, returning
0x00000000
378.328::1999/06/09-18:00:48.568 [GUM] GumSendUpdate: Locker waiting
type 0 context 8
378.380::1999/06/09-18:00:48.568 [DM] DmpLogFindStartLsn:
LogGetLastChkPt rets, Seq#=1656 ChkPtLsn=0x00000678
When a checkpoint is written to the quorum disk, the writing of the checkpoint is described in the quorum log by a begin-checkpoint record and an end-checkpoint record. In the following entry, the Database Manager finds the last begin-checkpoint and end-checkpoint records in the quorum log, which it then uses to find and identify the most recent checkpoint file. It then uploads the checkpoint file to the cluster registry key:
378.380::1999/06/09-18:00:48.568 [DM] DmpLogFindStartLsn: Uploading
chkpt from quorum log
In the following entries, the Database Manager loads the cluster database and applies the changes in the quorum log to the loaded cluster database:
378.380::1999/06/09-18:00:49.365 [DM]: Loading cluster database from
C:\WINNT\cluster\CLUSDB
15c.46c::1999/06/09-18:00:49.459 Physical Disk:
AddVolume : \\?\Volume{89fc8c21-18ef-11d3-b1b2-806d6172696f}\
'C', 7 (8419616)
378.380::1999/06/09-18:00:49.678 [DM] DmpLogFindStartLsn: chkpt uploaded
from quorum log
378.380::1999/06/09-18:00:49.693 [DM] DmpLogFindStartLsn:
LSN=0x00000678, dwError=0x00000000
378.380::1999/06/09-18:00:49.693 [DM] DmpApplyChanges: The LSN of the
record to apply changes from 0x00000678
Now that the quorum log entries have been read, validated, and propagated to the cluster's database, the Cluster service is ready to recreate its groups and resources.
In order to start, the Cluster service uses the cluster database from the local registry, which might be out-of-date (for instance, if the cluster database was updated while the node was down).
To guarantee that the Cluster service always uses the most up-to-date copy of the cluster database, the Cluster service updates the cluster configuration by first destroying the local in-memory copies of all the current objects (groups, resources, and so on) and then recreating cluster objects from the updated registry information in the cluster key. The local registry's copy of the cluster database is updated by copying the latest version from the quorum disk or, if the node is joining a cluster, from the sponsoring node.
The following entries report the destruction of the following:
378.380::1999/06/09-18:00:49.693 [FM] FmFormNewClusterPhase1, Entry.
Quorum quorum will be deleted
378.380::1999/06/09-18:00:49.693 [FM] DestroyGroup: destroying
254ef0e7-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:49.693 [FM] DestroyResource: destroying
254ef0e8-1937-11d3-b3fe-00a0c986aa14
15c.46c::1999/06/09-18:00:49.693 Physical Disk: AddVolume:
GetPartitionInfo(\??\Volume{89fc8c1b-18ef-11d3-b1b2-806d6172696f}),
error 1
378.380::1999/06/09-18:00:49.693 [FM] FmpDestroyResource Exit.
378.380::1999/06/09-18:00:49.693 [FM] FmpDestroyGroup: Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 destroyed.
378.380::1999/06/09-18:00:49.693 [OM] Deleting object Disk Group 1
(254ef0e7-1937-11d3-b3fe-00a0c986aa14)
The forming cluster builds up the memory database by recreating first the groups and then the resources on the basis of the new configuration.
Configuring the networks (which is covered in the next section) happens after the Cluster service has destroyed the groups and resources and before it recreates them.
Note that this set of operations actually follows the cluster's configuring of the network in time.
Recreating the groups and resources begins with the following:
378.380::1999/06/09-18:00:51.162 [FM] FmFormNewClusterPhase2, Entry.
In the subsequent entries, up to "378.380::1999/06/09-18:00:51.209 [FM] All groups created", the forming node does the following:
378.380::1999/06/09-18:00:51.193 [FM] Processing groups list.
378.380::1999/06/09-18:00:51.193 [FM] Creating group
254ef0e4-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] Initializing group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.193 [FM] Name for Group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 is 'Cluster Group'.
378.380::1999/06/09-18:00:51.193 [FM] Group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] Creating resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] Initializing resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.193 [FM] Name for Resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 is 'Cluster IP Address'.
.
.
.
378.380::1999/06/09-18:00:51.193 [FM] Group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] Creating resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] Initializing resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.193 [FM] Name for Resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 is 'Cluster Name'.
378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding
node 1 as possible host for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding
node 2 as possible host for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 created.
.
.
.
378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding
node 1 as possible host for resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding
node 2 as possible host for resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] Resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 depends on
254ef0e5-1937-11d3-b3fe-00a0c986aa14. Creating...
378.380::1999/06/09-18:00:51.193 [FM] CreateResource, Opened existing
resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 created.
Among the groups and resources created are the Quorum resource and the group that contains the Quorum resource:
378.380::1999/06/09-18:00:51.209 [FM] Name for Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 is 'Disk Group 1'.
378.380::1999/06/09-18:00:51.209 [FM] Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.209 [FM] CreateResource, Opened existing
resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.209 [FM] ReCreating quorum resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14
After the service has created all the groups and resources, this entry appears in the log:
378.380::1999/06/09-18:00:51.209 [FM] All groups created.
With the group and resource objects recreated and initialized and their possible owners added, the Cluster service is ready to configure the cluster's network information.
In this phase of cluster formation, the Cluster service creates network and network interface objects and makes sure the information about them is accurate and consistent across the cluster. To do so, the network configuration engine does the following:
The following entry begins network-configuration:
378.380::1999/06/09-18:00:49.756 [NM] Synchronizing node information.
The Cluster service creates node objects that represent the members of the cluster, according to the cluster database.
Note that this cluster is being formed with only one node.
378.380::1999/06/09-18:00:49.756 [NM] Creating node objects.
378.380::1999/06/09-18:00:49.756 [NM] Creating object for node 2 (NODE2)
The Cluster service collects information from the system about network adapters and network connection objects.
Interfaces are IP addresses. An adapter can have multiple IP addresses (interfaces) associated with it.
378.380::1999/06/09-18:00:49.772 [NM] Synchronizing network information.
378.380::1999/06/09-18:00:49.787 [NM] Synchronizing interface
information.
In the following entry, the Cluster service compares the current configuration with the one recorded in the cluster database. It notes which network adapters have been removed or added or whether there have been no changes in the network configuration.
378.380::1999/06/09-18:00:49.818 [NM] Running network configuration
engine.
In the following entries, the Node Manager reports any changes it detected. It updates the cluster database with this information. The second entry means the previously existing adapters (one, in this case) were matched to their previous networks.
378.380::1999/06/09-18:00:50.600 [NM] Processing network configuration
changes.
378.380::1999/06/09-18:00:50.600 [NM] Matched 1 networks, created 0 new
networks.
378.380::1999/06/09-18:00:50.600 [NM] Resynchronizing network
information.
378.380::1999/06/09-18:00:50.600 [NM] Resynchronizing interface
information.
The Cluster service creates the following network and interface objects on the basis of the current network configuration:
378.380::1999/06/09-18:00:50.600 [NM] Creating network objects.
378.380::1999/06/09-18:00:50.600 [NM] Creating object for network
5da3c9ea-7c25-4129-ab70-2c058de6d1fd (CPQ Netintellegent 10.10.0.2).
378.380::1999/06/09-18:00:50.600 [NM] Creating object for network
d561dbcb-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1).
378.380::1999/06/09-18:00:50.600 [NM] Creating interface objects.
378.380::1999/06/09-18:00:50.600 [NM] Creating object for interface
2e12db05-2947-425f-835a-fb9e4f13f0b3 ( 100b 172.16.64.1 - NODE2).
In the following entries, the Node Manager gathers information used to detect network interface failures.
378.380::1999/06/09-18:00:50.600 [NM] Assigned index 0 to interface
2e12db05-2947-425f-835a-fb9e4f13f0b3.
378.380::1999/06/09-18:00:50.600 [NM] Creating object for interface
a6fe4897-139a-4b84-815f-8892f674afb8 (CPQ Netintellegent 10.10.0.2 -
NODE2).
378.380::1999/06/09-18:00:50.600 [NM] Assigned index 0 to interface
a6fe4897-139a-4b84-815f-8892f674afb8.
378.380::1999/06/09-18:00:50.600 [NM] Creating object for interface
d561dbcc-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1 - NODE1).
378.380::1999/06/09-18:00:50.600 [NM] Assigned index 1 to interface
d561dbcc-1936-11d3-b3fe-00a0c986aa14.
In the entries in this section, the Node Manager does the following:
The cluster transport is Clusnet.sys, a transport driver that uses the User Datagram Protocol (UDP) and that the Cluster service uses to send messages to other cluster members.
A network is a link between network interfaces attached to the same subnet. If both network interfaces are online, the network state is up.
Note the event entry description, "Registering interface 2e12db05-2947-425f-835a-fb9e4f13f0b3 with cluster transport, addr 172.16.64.2, endpoint 3343," in which
Thus, "Registering interface 2e12db05-2947-425f-835a-fb9e4f13f0b3 with cluster transport, addr 172.16.64.2, endpoint 3343" means the following:
378.380::1999/06/09-18:00:50.600 [NM] Registering network
d561dbcb-1936-11d3-b3fe-00a0c986aa14 with cluster transport.
378.380::1999/06/09-18:00:50.600 [NM] Bringing network
d561dbcb-1936-11d3-b3fe-00a0c986aa14 online.
378.380::1999/06/09-18:00:50.600 [NM] Registering interfaces for network
d561dbcb-1936-11d3-b3fe-00a0c986aa14 with cluster transport.
378.380::1999/06/09-18:00:50.600 [NM] Registering interface
2e12db05-2947-425f-835a-fb9e4f13f0b3 with cluster transport, addr
172.16.64.2, endpoint 3343.
378.380::1999/06/09-18:00:50.600 [NM] Registering interface
d561dbcc-1936-11d3-b3fe-00a0c986aa14 with cluster transport, addr
172.16.64.1, endpoint 3343.
378.380::1999/06/09-18:00:50.615 [NM] Initializing membership...
The following entries signal that the Node Manager is starting a connectivity report to detect any network interface failures.
378.328::1999/06/09-18:00:50.631 [NM] Scheduling connectivity report for
network d561dbcb-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1)
378.328::1999/06/09-18:00:50.631 [NM] Updating local connectivity info
for network d561dbcb-1936-11d3-b3fe-00a0c986aa14.
378.328::1999/06/09-18:00:50.631 [NM] Node for interface 0 is down
378.328::1999/06/09-18:00:50.631 [NM] Interface 1 is either up or
unreachable
Note that in the following entry, the Node Manager discounts the connectivity report from interface 0 because the interface's node is down.
378.328::1999/06/09-18:00:50.631 [NM] The report from interface 0 is not
valid
378.328::1999/06/09-18:00:50.662 [NM] Interface 1 is Up
378.328::1999/06/09-18:00:50.662 [NM] Beginning phase 2 of state
computation for network d561dbcb-1936-11d3-b3fe-00a0c986aa14
In the following entry, the Node Manager summarizes the connectivity report. The unavailable interface is interface 0. (See the preceding entries.)
378.328::1999/06/09-18:00:50.662 [NM] Unavailable=1, Failed = 0,
Unreachable=0, Reachable=1, Up=1,
For the meaning of state 3, see "State Codes" later in this chapter.
378.328::1999/06/09-18:00:50.662 [NM] Network
d561dbcb-1936-11d3-b3fe-00a0c986aa14 is now in state 3
378.328::1999/06/09-18:00:50.662 [NM] Interface
d561dbcc-1936-11d3-b3fe-00a0c986aa14 is up (node: NODE1, network:
100b 172.16.64.1).
378.328::1999/06/09-18:00:50.662 [NM] Network
d561dbcb-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1) is up.
378.380::1999/06/09-18:00:50.631 [CLSMSG] Initializing.
378.380::1999/06/09-18:00:50.881 [NM] Membership initialization
complete.
If the cluster nodes are running different versions of Windows NT or Windows 2000 Advanced Server and therefore different versions of the Cluster service, there can be an issue of whether the versions are compatible with all the cluster nodes. The Cluster service computes the Cluster service version associated with each node. It then computes the cluster version for the cluster as a whole from the versions of all the active members of the cluster. As other members join the cluster, the existing members compute a new version of the cluster.
In the following two pairs of entries, Cluster service computes a cluster version that is based on the versions of all the active members of the cluster, even when there is only one active member. The version number falls into two
First the Cluster service retrieves the NodeHighestVersion and NodeLowestVersion properties for each active member.
378.380::1999/06/09-18:00:50.881 [NM] NmpValidateNodeVersion: Node=1,
HighestVersion=0x00030803, LowestVersion=0x000200e0
In the second pair of entries, the Cluster service calculates the highest version of the Cluster service that can reliably run on all the nodes in the cluster and the lowest version with which every node can communicate. The calculation is biased toward the middle:
378.380::1999/06/09-18:00:50.881 [NM] NmpCalcClusterVersion: status = 0
ClusHighestVer=0x00030803, ClusLowestVer=0x000200e0
378.380::1999/06/09-18:00:50.881 [NM] [NmpResetClusterVersion]
ClusterHighestVer=0x00030803 ClusterLowestVer=0x000200e0
378.380::1999/06/09-18:00:50.881 [NM] Disabling mixed NT4/NT5 operation.
Cluster version 1 corresponds to the Windows NT 4.0 Service Pack 3 release, and version 2 to the Windows NT 4.0 Service Pack 4 and Windows NT 4.0 Service Pack 5 releases.
In the following entry, the Node Manager performs fixups, which enables the cluster to run in mixed mode. (In this case, node 1 is running Windows 2000, and node 2 is running Windows NT 4.0.) Fixups are also run the first time a node that has been upgraded runs the Cluster service. After running fixups, the Cluster service performs a global update of the membership.
378.380::1999/06/09-18:00:50.881 [NM] NmPerformFixups Entry,
dwFixupType=1
The following entries are typical of those that report beginning and committing local updates. Note the timestamps. Local updates of the cluster database can, but do not necessarily have to, occur at this point in cluster formation.
378.380::1999/06/09-18:00:50.897 [GUM] GumSendUpdate: Locker waiting
type 2 context 17
378.380::1999/06/09-18:00:50.897 [GUM] Thread 0x380 UpdateLock wait on
Type 2
378.380::1999/06/09-18:00:50.897 [GUM] DoLockingUpdate successful, lock
granted to 1
378.380::1999/06/09-18:00:50.897 [GUM] GumSendUpdate: Locker dispatching
seq 1657 type 2 context 17
378.380::1999/06/09-18:00:50.897 [dm] DmBeginLocalUpdate Entry
378.380::1999/06/09-18:00:50.897 [dm] DmBeginLocalUpdate Exit,
pLocalXsaction=0x000a0dd8 dwError=0x00000000
In the last stage of forming a cluster, the Cluster service creates resources and tries to bring them online. There are two parts to bringing resources online: in the cluster database, adding possible owners for resource types in the cluster database; and then creating the resources and bringing them online.
In the following entries, the Failover Manager adds possible owners for resource types.
In the log, there are many more events of this type, but they are all essentially identical to the two reproduced here:
378.380::1999/06/09-18:00:50.897 [FM] processing resource types.
378.380::1999/06/09-18:00:50.897 [FM] FmpAddPossibleNode: adding node 1
to resource type's possible node list
378.380::1999/06/09-18:00:50.897 [FM] FmpAddPossibleNode: adding node 2
to resource type's possible node list
After adding possible owners for resource types, the Cluster service updates local copies of the cluster registry key:
378.380::1999/06/09-18:00:50.928 [NM] NmpUpdatePerformJoinFixups2:
called postfixup notifycb function with status 0
378.380::1999/06/09-18:00:50.928 [DM] DmCommitLocalUpdate Entry
378.380::1999/06/09-18:00:50.928 [DM] DmCommitLocalUpdate Exit,
dwError=0x00000000
378.380::1999/06/09-18:00:50.928 [GUM] GumpDoUnlockingUpdate releasing
lock ownership
378.380::1999/06/09-18:00:50.928 [GUM] GumSendUpdate: completed update
seq 1657 type 2 context 17
The following entries are representative of the process of creating and initializing resources. The pattern repeats, of course, for each resource. In this section, the Cluster service creates all groups from a group list in the cluster database, as indicated by the following line.
378.380::1999/06/09-18:00:51.193 [FM] Processing groups list.
In the following example, the Failover Manager creates and initializes the Cluster Group:
378.380::1999/06/09-18:00:51.193 [FM] Creating group
254ef0e4-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] Initializing group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.193 [FM] Name for Group
254ef0e4-1937-11d3-
In the following line, the Failover Manager creates and brings online the first of the resources that the Cluster Group contains. The Failover Manager repeats this pattern for all the resources the Cluster Group contains.
The pattern for creating groups and resources is that the Failover Manager does the following:
378.380::1999/06/09-18:00:51.193 [FM] Group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] Creating resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] Initializing resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.193 [FM] Name for Resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 is 'Cluster IP Address'.
378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding
node 1 as possible host for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding
node 2 as possible host for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 created.
The following entry starts the creation of the next resource, which is the Cluster Name resource:
378.380::1999/06/09-18:00:51.193 [FM] Group
254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14.
In creating and bringing online the Cluster Name resource (Resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14), the Failover Manager discovers that the Cluster Name resource depends on the Cluster IP Address resource and so opens the Cluster IP Address resource. Note that, in order for the Cluster service to perform such actions as bringing a resource online and changing its properties, it has to first open the resource to obtain a handle to it.
378.380::1999/06/09-18:00:51.193 [FM] Resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 depends on
254ef0e5-1937-11d3-b3fe-00a0c986aa14. Creating...
378.380::1999/06/09-18:00:51.193 [FM] CreateResource, Opened existing
resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 created.
The remainder of the events involved in bringing resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14 (the Cluster Name resource) online are interspersed with entries reporting a great many other events, most of which, including global updates, involve creating and attempting to bring online other resources and groups.
In the following entries, the Cluster service starts creating the Quorum resource and the group to which it belongs. Notice that the Cluster service does not bring the Quorum resource online before the Cluster Name resource and that the pattern of events is the same for all resources and groups:
378.380::1999/06/09-18:00:51.193 [FM] Creating group
254ef0e7-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.209 [FM] Initializing group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.209 [FM] Name for Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 is 'Disk Group 1'.
378.380::1999/06/09-18:00:51.209 [FM] Group
254ef0e7-1937-11d3-b3fe-00a0c986aa14 contains Resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14.
378.380::1999/06/09-18:00:51.209 [FM] CreateResource, Opened existing
resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14
378.380::1999/06/09-18:00:51.209 [FM] ReCreating quorum resource
254ef0e8-1937-11d3-b3fe-00a0c986aa14
In a mixed-mode cluster, different versions of resource type DLLs can require fixups just as the Cluster service does. After all groups have been created and if the cluster is running in mixed-mode, fixups can be run on the resource type DLLs. This is described in the following entries.
378.380::1999/06/09-18:00:51.209 [FM] FmpFixupResourceTypes Entry.
378.380::1999/06/09-18:00:51.209 [FM] FmpFixupResourceTypes Exit
Then the Cluster service obtains and reports the state of all the groups in the cluster.
378.380::1999/06/09-18:00:51.209 [FMX] GetGroupListState, Group <Disk
Group 1> state = 0
378.380::1999/06/09-18:00:51.209 [FMX] GetGroupListState, Group <Cluster
Group> state = 1
378.380::1999/06/09-18:00:51.209 [FMX] GetGroupListState, Group <test
group> state = 1
In the following entries, the Failover Manager initializes the Cluster IP Address resource by using the registry and then performs the subsequent global update and writes change records to the quorum log:
378.380::1999/06/09-18:00:51.209 [FM] Initializing resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.225 [FM] Name for Resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14 is 'Cluster IP Address'.
378.380::1999/06/09-18:00:51.225 [FM] FmpRmCreateResource: creating
resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14 in shared resource
monitor
378.3c0::1999/06/09-18:00:51.240 [GUM] GumSendUpdate: Locker waiting
type 1 context 0
378.3c0::1999/06/09-18:00:51.240 [GUM] Thread 0x3c0 UpdateLock wait on
Type 1
378.3c0::1999/06/09-18:00:51.240 [GUM] DoLockingUpdate successful, lock
granted to 1
The following GUM update concerns creating a new registry key or entry:
378.3c0::1999/06/09-18:00:51.240 [GUM] GumSendUpdate: Locker dispatching
seq 1659 type 1 context 0
378.3c0::1999/06/09-18:00:51.240 [dm] DmWriteToQuorumLog Entry Seq#=1659
Type=0 Size=152
378.3c0::1999/06/09-18:00:51.240 [dm] DmWriteToQuorumLog Entry Seq#=1659
Type=0 Size=152
378.3c0::1999/06/09-18:00:51.240 [GUM] GumpDoUnlockingUpdate releasing
lock ownership
378.3c0::1999/06/09-18:00:51.240 [GUM] GumSendUpdate: completed update
seq 1659 type 1 context 0
The following entry reports that the network interface is no longer valid; the interface is no longer available.
15c.458::1999/06/09-18:00:51.256 IP Address <Cluster IP Address>: NBT
interface \Device\NetBt_If1 (instance 0x712F4D0A) is no longer valid.
In the following entry, the update concerns deleting an entry from a registry key, probably the one for the interface that is no longer valid:
378.3c0::1999/06/09-18:00:51.256 [GUM] GumSendUpdate: Locker waiting
type 1 context 4099
378.3c0::1999/06/09-18:00:51.256 [GUM] Thread 0x3c0 UpdateLock wait on
Type 1
378.3c0::1999/06/09-18:00:51.256 [GUM] DoLockingUpdate successful, lock
granted to 1
378.3c0::1999/06/09-18:00:51.256 [GUM] GumSendUpdate: Locker dispatching
seq 1660 type 1 context 4099
The Cluster service attempts again to create and bring the Cluster Name resource (254ef0e6-1937-11d3-b3fe-00a0c986aa14) online, initializing the resource from the cluster registry key and creating it in the Resource Monitor:
378.380::1999/06/09-18:00:51.256 [FM] Initializing resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 from the registry.
378.380::1999/06/09-18:00:51.256 [FM] Name for Resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14 is 'Cluster Name'.
378.380::1999/06/09-18:00:51.271 [FM] FmpRmCreateResource: creating
resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14 in shared resource
monitor
In the following entries, the Database Manager has created a registry key to replace the one that was deleted in preceding the entries for 1999/06/09-18:00:51.256. This key might be a registry entry or subkey, but its creation is reported to the cluster log as a "key." The GUM performs a global update with this change, and the Database Manager writes the change to the quorum log. Note, again, that the type (0) in the write operation to the quorum log matches the context number for the global update. If the Cluster service on this node were to crash right after this operation and the other node were started, the other node would have access to the changes recorded in the quorum log and would apply them in its cluster formation process. As a result, the current configuration of the cluster would be preserved.
378.3c0::1999/06/09-18:00:51.271 [GUM] GumSendUpdate: Locker waiting
type 1 context 0
378.3c0::1999/06/09-18:00:51.271 [GUM] Thread 0x3c0 UpdateLock wait on
Type 1
378.3c0::1999/06/09-18:00:51.271 [GUM] DoLockingUpdate successful, lock
granted to 1
378.3c0::1999/06/09-18:00:51.271 [GUM] GumSendUpdate: Locker dispatching
seq 1662 type 1 context 0
378.3c0::1999/06/09-18:00:51.271 [dm] DmWriteToQuorumLog Entry Seq#=1662
Type=0 Size=152
378.3c0::1999/06/09-18:00:51.271 [dm] DmWriteToQuorumLog Entry Seq#=1662
Type=0 Size=152
378.3c0::1999/06/09-18:00:51.271 [GUM] GumpDoUnlockingUpdate releasing
lock ownership
378.3c0::1999/06/09-18:00:51.271 [GUM] GumSendUpdate: completed update
seq 1662 type 1 context 0
15c.3cc::1999/06/09-18:00:51.287 Network Name <Cluster Name>: Successful
open of resid 507752
378.380::1999/06/09-18:00:51.287 [FM] FmpRmCreateResource: created
resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14, resid 741136
After bringing other resources online, the Cluster service tries to bring the Cluster group, which contains the Cluster Name and Cluster IP Address resources, online:
378.380::1999/06/09-18:00:51.318 [FM] FmpOnlineGroupFromList: Previous
group state for 254ef0e4-1937-11d3-b3fe-00a0c986aa14 is 1
378.380::1999/06/09-18:00:51.318 [FM] FmpOnlineGroup: trying to bring
group 254ef0e4-1937-11d3-b3fe-00a0c986aa14 online
378.380::1999/06/09-18:00:51.318 [FM] OnlineGroup for
254ef0e4-1937-11d3-b3fe-00a0c986aa14 owner 651932
Next, the Cluster service brings the Cluster IP Address online. Note that the Failover Manager acquires a shared lock on the Quorum resource to do so. (For more information about shared locks, see "Shared Locks and gdwQuoBlockingResources" earlier in this chapter.)
378.380::1999/06/09-18:00:51.318 [FM] FmpRmOnlineResource: bringing
resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14 (resid 735200) online.
378.380::1999/06/09-18:00:51.318 [CP] CppResourceNotify for resource
Cluster IP Address
378.380::1999/06/09-18:00:51.334 [FM] FmpRmOnlineResource: called
InterlockedIncrement on gdwQuoBlockingResources for resource
254ef0e5-1937-11d3-b3fe-00a0c986aa14
15c.458::1999/06/09-18:00:51.334 IP Address <Cluster IP Address>:
Bringing resource online...
The GUM initiates a global update for the Cluster IP Address resource's change in state from Pending to Online while the Online thread is still running:
378.380::1999/06/09-18:00:51.334 [GUM] GumSendUpdate: Locker waiting
type 0 context 8
15c.488::1999/06/09-18:00:51.334 IP Address <Cluster IP Address>: Online
thread running.
The following entry is part of a group of entries that describe activity while both the Cluster IP Address and Cluster Name resources are still in the Pending state. In this entry, the IP Address resource is notified that the network interface to which it is bound has changed or been deleted.
15c.488::1999/06/09-18:00:51.334 IP Address <Cluster IP Address>:
Created cluster notify port.
With the Cluster Group online, the Cluster service checks to make sure that there is still a network interface to which the IP Address can be bound on the same subnet. If there is not, the IP Address is orphaned.
15c.488::1999/06/09-18:00:51.350 IP Address <Cluster IP Address>:
Checking if resource is orphaned: network masks 00FCFFFF=00FFFFFF
and addresses FE4010AC^00000A0A.
After working on bringing another resource online, the Cluster service continues bringing the Cluster IP Address resource online. In these two entries, the Cluster service obtains a handle for the network interface and registers it so that the IP Address resource that is bound to it is notified if a change occurs to the network interface.
15c.488::1999/06/09-18:00:51.365 IP Address <Cluster IP Address>:
Online: Opened object handle for netinterface
d561dbcc-1936-11d3-b3fe-00a0c986aa14.
15c.488::1999/06/09-18:00:51.365 IP Address <Cluster IP Address>:
Online: Registered notification for netinterface
d561dbcc-1936-11d3-b3fe-00a0c986aa14.
In the following entries, the Database Manager takes a checkpoint of the cluster database (in this case, the cluster key of the registry for the local system), and the Log Manager writes it to the quorum disk:
378.380::1999/06/09-18:00:51.412 [dm]DmUpdateFormNewCluster - taking a
checkpoint
378.380::1999/06/09-18:00:51.412 [LM] LogCheckPoint entry
378.380::1999/06/09-18:00:51.412 [LM] DmpGetSnapshotCb::checkpoint file
name =D:\MSCS\chk67D.tmp Seq#=1661
378.380::1999/06/09-18:00:51.756 [dm] DmpGetSnapShotCb - DmpGetDatabase
returned 0x00000000
378.380::1999/06/09-18:00:51.756 [LM] LogCheckPoint:
ChkPtFile=D:\MSCS\chk67D.tmp Chkpt Trid=1661 CheckSum=86865
378.380::1999/06/09-18:00:51.756 [lm] LogpAppendPage : Writing 1024
bytes to disk at offset 0x00000800
378.380::1999/06/09-18:00:51.756 [lm] LogFlush : pLog=0x000a0650 writing
the 1024 bytes for active page at offset 0x00000c00
378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint: EndChkpt written.
EndChkPtLsn =0x00000c08 ChkPt Seq=1661 ChkPt
FileName=D:\MSCS\chk67D.tmp
378.380::1999/06/09-18:00:51.771 [lm] LogpCheckpoint : Writing 1024
bytes to disk at offset 0x00000000
378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint Exit
The following entry reports that a node is down. Because a node is down, updates are made to the database and written to the quorum log — therefore, the Database Manager performs the update. (If all nodes were up, the GUM would perform the update, replicating any change across the cluster.)
378.380::1999/06/09-18:00:51.771 [dm] DmUpdateFormNewCluster - some node
down
And now the cluster has successfully formed:
378.380::1999/06/09-18:00:51.787 [INIT] Successfully formed a cluster.
378.380::1999/06/09-18:00:51.803 [INIT] Cluster started.
With the cluster formed, the Cluster service continues bringing resources online, including the Cluster IP Address resource.
After making a number of changes to the cluster database and subsequently writing change records to the quorum log, the Database Manager changes one of the Parameters subkeys for the Cluster IP Address resource and writes the change to the quorum log. Then the GUM initiates a global update with the change to the registry (type 1 context 4098):
378.498::1999/06/09-18:00:53.162 DM: Setting value NbtDeviceInstance for
key Resources\254ef0e5-1937-11d3-b3fe-00a0c986aa14\Parameters\1
378.498::1999/06/09-18:00:53.178 [dm] DmWriteToQuorumLog Entry Seq#=1677
Type=4098 Size=180
378.498::1999/06/09-18:00:53.178 [lm] LogWrite : Entry TrId=1677 RmId=5
RmType = 4098 Size=180
378.498::1999/06/09-18:00:53.178 [lm] LogWrite : Exit
returning=0x000011f0
378.498::1999/06/09-18:00:53.178 [GUM] GumpDoUnlockingUpdate releasing
lock ownership
378.498::1999/06/09-18:00:53.178 [GUM] GumSendUpdate: completed update
seq 1677 type 1 context 4098
In the following entries, the Cluster service creates an NBT interface for the Cluster IP Address resource, and the Resource Monitor announces that the resource is online (state 2):
15c.488::1999/06/09-18:00:53.178 IP Address <Cluster IP Address>:
Created NBT interface \Device\NetBt_If1 (instance 0x7A85A0AA).
15c.488::1999/06/09-18:00:59.396 [RM] RmpSetResourceStatus, Posting
state 2 notification for resource <Cluster IP Address>
The following entry reports that the Cluster IP Address resource is online.
15c.488::1999/06/09-18:00:59.396 IP Address <Cluster IP Address>: IP
Address 172.16.64.254 on adapter (R) PRO/100B PCI Adapter (TX)
online
The following entry reports the Cluster IP Address resource as online. Note that the old state was 129 (ClusterResourceOnlinePending) and the new state is 2 (ClusterResourceOnline).
378.328::1999/06/09-18:00:59.396 [FM] HandleResourceTransition: Resource
Name = 254ef0e5-1937-11d3-b3fe-00a0c986aa14 old state=129 new
state=2
Bringing resource bf37b6a7-391d-11d2-a397-0000f803f3da (the Cluster Network Name resource) online takes longer. The entries that describe this process are the same as those that describe the process of bringing the Cluster IP Address online, except that there are some new entries.
In the first four entries, the Network Name resource is trying to come online. But it must wait for 254ef0e5-1937-11d3-b3fe-00a0c986aa14, the Cluster IP Address resource on which the Cluster Network Name resource depends, to come online first:
378.3a8::1999/06/09-18:00:59.396 [FM] OnlineWaitingTree,
254ef0e6-1937-11d3-b3fe-00a0c986aa14 (129) depends on
254ef0e5-1937-11d3-b3fe-00a0c986aa14 (2). Start now
378.3a8::1999/06/09-18:00:59.396 [FM] OnlineWaitingTree, Start resource
254ef0e6-1937-11d3-b3fe-00a0c986aa14
378.3a8::1999/06/09-18:00:59.490 [FM] OnlineResource:
254ef0e6-1937-11d3-b3fe-00a0c986aa14 depends on
254ef0e5-1937-11d3-b3fe-00a0c986aa14. Bring online first.
378.3a8::1999/06/09-18:00:59.537 [FM] FmpRmOnlineResource: bringing
resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14 (resid 741136) online.
Bringing the Cluster Network Name resource online involves a number of activities that have been covered earlier in this chapter, including the following:
The unique activities are reported in the following entries, at the end of which the Cluster Network Name resource has successfully come online:
15c.4d4::1999/06/09-18:01:01.177 Network Name <Cluster Name>: Registered
server name CLUSTER1 on transport \Device\NetBt_If1.
15c.4d4::1999/06/09-18:01:01.177 Network Name <Cluster Name>: Registered
workstation name CLUSTER1 on transport \Device\NetBt_If1.
.
.
.
15c.4d4::1999/06/09-18:01:01.615 Network Name <Cluster Name>: The zone
for 254.64.16.172.in-addr.arpa. does not accept dynamic DNS
registration updates.
15c.4d4::1999/06/09-18:01:01.630 Network Name <Cluster Name>: Registered
DNS name cluster1.itreskit.com on IP Address 172.16.64.254.
15c.4d4::1999/06/09-18:01:01.630 Network Name <Cluster Name>: Network
Name CLUSTER1 is now online
15c.4d4::1999/06/09-18:01:01.630 [RM] RmpSetResourceStatus, Posting
state 2 notification for resource <Cluster Name>
The Cluster service does the following:
378.380::1999/06/09-18:00:51.381 [EVT] EvOnline
378.380::1999/06/09-18:00:51.381 [EVT] Set propagation state to 0001
378.380::1999/06/09-18:00:51.381 [EVT] EvOnline : calling
ElfRegisterClusterSvc
378.380::1999/06/09-18:00:51.412 [EVT] EvOnline:
pPackedEventInfo->ulSize=3284 pPackedEventInfo
->ulNulEventsForLogFile=2
378.380::1999/06/09-18:00:51.412 [dm]DmUpdateFormNewCluster - taking
a checkpoint
378.380::1999/06/09-18:00:51.412 [LM] LogCheckPoint entry
378.380::1999/06/09-18:00:51.412 [LM] DmpGetSnapshotCb::checkpoint
file name =D:\MSCS\chk67D.tmp Seq#=1661
378.380::1999/06/09-18:00:51.756 [dm] DmpGetSnapShotCb -
DmpGetDatabase returned 0x00000000
378.380::1999/06/09-18:00:51.756 [LM] LogCheckPoint:
ChkPtFile=D:\MSCS\chk67D.tmp Chkpt Trid=1661 CheckSum=86865
378.380::1999/06/09-18:00:51.756 [lm] LogpAppendPage : Writing 1024
bytes to disk at offset 0x00000800
378.380::1999/06/09-18:00:51.756 [lm] LogFlush : pLog=0x000a0650
writing the 1024 bytes for active page at offset 0x00000c00
378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint: EndChkpt
written. EndChkPtLsn =0x00000c08 ChkPt Seq=1661 ChkPt
FileName=D:\MSCS\chk67D.tmp
378.380::1999/06/09-18:00:51.771 [lm] LogpCheckpoint : Writing 1024
bytes to disk at offset 0x00000000
378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint Exit
The following entry reports that the Cluster service has successfully pinged its own IP address to make sure the IP address is unique:
15c.488::1999/06/09-18:00:59.396 IP Address <Cluster IP Address>: IP
Address 172.16.64.254 on adapter (R) PRO/100B PCI Adapter (TX)
online