Interpreting the Cluster Log

Previous Topic Next Topic

Forming a Cluster

Forming a cluster involves the following stages:

The form operation starts at the following entry:

378.380::1999/06/09-18:00:46.615 [INIT] Attempting to form cluster

  CLUSTER1


The form operation ends at the following entry:

378.380::1999/06/09-18:00:51.787 [INIT] Successfully formed a cluster.


The following sections discuss some of the significant entries in a form operation.

Starting Resrcmon.exe

The following entry describes starting the Resource Monitor:

15c.0d4::1999/06/09-18:00:46.694 [RM] Main: Initializing.


Bringing the Quorum Resource Online

The entries in this section describe the following:

The following entries describe bringing online the group that is going to contain the Quorum resource. Note the following:

378.380::1999/06/09-18:00:47.069 [FM] Creating group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:47.084 [FM] Initializing group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:47.100 [FM] Name for Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 is 'Disk Group 1'.

378.380::1999/06/09-18:00:47.100 [FM] Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 contains Resource 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14.


In the following entries, resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14 (the Physical Disk resource, which is the Quorum resource) is created and initialized using the values from the registry. (Note the entry that gives the display name of the resource.) The cluster key configuration for this resource specifies the following:

378.380::1999/06/09-18:00:47.100 [FM] Creating resource 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:47.100 [FM] Initializing resource 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:47.100 [FM] Name for Resource 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14 is 'Disk D:'.

378.380::1999/06/09-18:00:47.100 [FM] FmpAddPossibleEntry: adding node 1

  as possible host for resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14.


Using the cluster configuration data in the registry, the Failover Manager creates the resources on which the Physical Disk resource (D:) depends.

378.380::1999/06/09-18:00:47.100 [FM] All dependencies for resource

  254ef0e8-1937-11d3-b3fe-00a0c986aa14 created.


In the following entry, the node arbitrates for the ownership of the Quorum resource.

378.380::1999/06/09-18:00:47.115 [FM] arbitrate for quorum resource id

  254ef0e8-1937-11d3-b3fe-00a0c986aa14.


Note that in adding possible owners for the Quorum resource, there are no other nodes presently in the cluster. This fact explains the following two entries.

378.380::1999/06/09-18:00:47.115 [FM] FmpAddPossibleNode: adding node 1

  to resource type's possible node list

378.380::1999/06/09-18:00:47.115 [FM] FmpAddPossibleNode: Warning, node

  2 not found


The following entry reports adding the Quorum resource to the Resource Monitor's list of resources to watch and help manage. This entry occurs for each resource as it is added to the list.

378.380::1999/06/09-18:00:47.115 [FM] FmpRmCreateResource: creating

  resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14 in shared resource

  monitor


In the following entry, the PnP window is the Plug and Play window, which gets messages that indicate changes in the disk configuration.

15c.46c::1999/06/09-18:00:47.740 Physical Disk: PnP window created

  successfully.


In the following entry, the Cluster service makes node 1 the Quorum resource owner.

378.380::1999/06/09-18:00:47.897 [MM] MmSetQuorumOwner(1,1), old owner

  0.


Of the entries that report disk arbitration, the significant ones are the following. In the second entry, the node's ownership of the Quorum resource has been confirmed.

15c.458::1999/06/09-18:00:47.912 Physical Disk <Disk D:>:

  [DISKARB]Assume ownership of a quorum device.

15c.458::1999/06/09-18:00:47.912 Physical Disk <Disk D:>:

  [DISKARB]Arbitrate returned status 0.


As the Quorum resource is brought online, the following [GUM] entries report a successful GUM update. GUM relies on one node to be the master when issuing updates to the other nodes. This master is called the "locker."

Note that the sequence number is 1657. You can use this sequence number to track the update.

378.380::1999/06/09-18:00:47.928 [GUM] GumSendUpdate:  Locker waiting

  type 0 context 8

378.380::1999/06/09-18:00:47.928 [GUM] Thread 0x380 UpdateLock wait on

  Type 0

378.380::1999/06/09-18:00:47.928 [GUM] DoLockingUpdate successful, lock

  granted to 1

378.380::1999/06/09-18:00:47.928 [GUM] GumSendUpdate: Locker dispatching

  seq 1657    type 0 context 8

378.380::1999/06/09-18:00:47.928 [GUM] GumpDoUnlockingUpdate releasing

  lock ownership

378.380::1999/06/09-18:00:47.928 [GUM] GumSendUpdate: completed update

  seq 1657    type 0 context 8


The description "type 0 context 8" occurs frequently in cluster logs. "Type 0 context 8" means that GUM is performing an update for the Failover Manager and that the update concerns a change in the state of a resource (in this case, the Quorum resource). Here's why:

Table 20.2 Update Types

Type Code Type
0 GumUpdateFailoverManager
1 GumUpdateRegistry
2 GumUpdateMembership

In the following entries, the state of the Quorum resource is propagated to the cluster nodes; then the Quorum resource is brought online:

378.380::1999/06/09-18:00:47.928 [FM] FmpPropagateResourceState:

  resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14 pending event.

378.380::1999/06/09-18:00:47.928 [FM] FmpRmOnlineResource: Resource

  254ef0e8-1937-11d3-b3fe-00a0c986aa14 pending

378.380::1999/06/09-18:00:47.944 [FM] FmpRmOnlineResource: Returning.

  Resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14, state 129, status

  997.


Next, the Cluster service makes sure the drive-letter assignment data is consistent across the cluster.

In the following entry, the Cluster service lists the partition table data associated with the drive letter D:

15c.470::1999/06/09-18:00:47.944 Physical Disk <Disk D:>: Mountie[0]: 1,

  let=?, start=104000, len=FF9FC000.


In the following entries, the Cluster service synchronizes drive-letter information on all the sources (HKEY_LOCAL_MACHINE\SYSTEM\Disk, \DiskInfo, \MountMgr, and \MountVolumeInfo). The information in the HKEY_LOCAL_MACHINE\SYSTEM\Disk subkey is out-of-date, and the fourth and fifth entries report the updating of the subkey.

15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:

  Registry-System\DISK.GetInfo returned 0 [0:0].

15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:

  ClusReg-DiskInfo selected.

15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:

  DriveLetters mask is now 00000008.

15c.470::1999/06/09-18:00:48.037 Physical Disk <Disk D:>: MountieVerify:

  Update needed for 08.

15c.470::1999/06/09-18:00:48.053 Physical Disk <Disk D:>: FtInfo_Set:

  Update successful.


The following entry means that all the sources of drive letter information have been updated and are synchronized. In the description "Update needed for nn," where "nn" is a two-digit number; "00" means that nothing needs updating. Note that in the fourth entry of the preceding example, the description reads "Update needed for 08."

15c.470::1999/06/09-18:00:48.053 Physical Disk <Disk D:>: MountieUpdate:

  Update needed for 00.


In the following entry, the Resource Monitor notifies the Cluster service that the Disk D: resource is online.

15c.470::1999/06/09-18:00:48.240 [RM] RmpSetResourceStatus, Posting

  state 2 notification for resource <Disk D:>


The following entry indicates that the operation of bringing the Quorum resource online has succeeded.

378.3a4::1999/06/09-18:00:48.240 [DM] DmpQuoObjNotifyCb: Quorum resource

  is online


Applying Quorum Log Changes to the Cluster Database

The quorum log, also known as the change log, records changes to the cluster configuration.

At this point in the cluster formation operation, after the Quorum resource is online, the Cluster service does the following:

In the following entries, the Cluster service opens the quorum log:

378.3a4::1999/06/09-18:00:48.240 [DM] DmpQuoObjNotifyCb: Own quorum

  resource, try open the quorum log

378.3a4::1999/06/09-18:00:48.303 [DM] DmpQuoObjNotifyCb: the name of the

  quorum file is D:\MSCS\quolog.log

378.3a4::1999/06/09-18:00:48.303 [lm] LogCreate : Entry

  FileName=D:\MSCS\quolog.log MaxFileSize=0x00010000

378.3a4::1999/06/09-18:00:48.397 [lm] LogpCreate : Entry


The service mounts the log and prepares it so that the log can be read and written:

378.3a4::1999/06/09-18:00:48.443 [lm] LogpMountLog : Entry

  pLog=0x000a0650

378.3a4::1999/06/09-18:00:48.443 [lm]:LogpMountLog::Quorumlog File

  size=0x00008000


The following entries are typical of those in which the Cluster service checks for or reads log records at a particular offset in the file.

378.3a4::1999/06/09-18:00:48.443 [lm]:LogpMountLog::checking LSN

  0x00000408

.

.

.

378.3a4::1999/06/09-18:00:48.443 [lm]:LogpMountLog::checking LSN

  0x00000a48


In the cluster form operation, applying quorum log changes to the cluster database involves the following:

  1. Checking for a tombstone file.
  2. Unrolling the quorum log and propagating state changes to the local registry copy of the cluster database.

Checking for a Tombstone File

A tombstone file is created when a cluster node's quorum-device information is outdated; the tombstone file alerts the node's Cluster service of the fact that the quorum directory is not located where the node's quorum-device information currently says it is.

For example, if node A was not running when node B changed the location of the quorum device, node A has outdated quorum-device information and node B has the current quorum-device information. If node B now goes down and node A comes up, node A's outdated quorum-device information points it to a tombstone file. The tombstone file doesn't point to the new Quorum resource. The result is that node A can only join the cluster; it cannot form a cluster.

In the following entries, the Database Manager checks whether a valid quorum tombstone exists. The return value of zero in the second entry means the DmpChkQuoTombstone function encountered no errors in trying to validate the tombstone information and the form operation can continue.

378.380::1999/06/09-18:00:48.459 [DM] DmpChkQuoTombStone - Entry

378.380::1999/06/09-18:00:48.459 [DM] DmpChkQuoTombStone:

  Exit,dwError=0x00000000


If DmpChkQuoTombstone had found a valid quorum tombstone, the Database Manager would have logged an error in the event log, and the Cluster service would have exited immediately. In that case, you wouldn't see the Exit statement from DmpChkQuoTombstone in the cluster log.

Unrolling the Quorum Log and Propagating State Changes

In unrolling the quorum log, the Database Manager does the following:

Each configuration change is updated to any cluster nodes. Even though in the cluster form operation there is no other node, each change that occurs in the form operation prompts an update by the GUM.

The following entries are part of the Database Manager's finding the last cluster database checkpoint (sequence number 1656). The first entry notes the sequence number for the cluster key in the local registry.

378.380::1999/06/09-18:00:48.459 [DM] DmpApplyChanges: The current

  registry sequence number 1656

378.380::1999/06/09-18:00:48.459 [lm] LogGetLastChkPoint:: Entry

378.328::1999/06/09-18:00:48.537 [FM] HandleResourceTransition: Resource

  Name = 254ef0e8-1937-11d3-b3fe-00a0c986aa14 old state=129 new

  state=2


After calling LogGetLastChkPoint, the Log Manager writes the sequence number. Be sure to track such activity by its thread. For example, the following entries are interspersed with entries from the 378.328 thread, which originated with the Failover Manager.

378.380::1999/06/09-18:00:48.553 [LM] LogGetLastChkPoint: ChkPt File

  D:\MSCS\chk678.tmp ChkPtSeq=1656 ChkPtLsn=0x00000678 Checksum=93996

378.328::1999/06/09-18:00:48.568 FmpPropagateResourceState: signalling

  the ghQuoOnlineEvent

378.380::1999/06/09-18:00:48.568 [LM] LogGetLastChkPoint exit, returning

  0x00000000

378.328::1999/06/09-18:00:48.568 [GUM] GumSendUpdate:  Locker waiting

  type 0 context 8

378.380::1999/06/09-18:00:48.568 [DM] DmpLogFindStartLsn:

  LogGetLastChkPt rets, Seq#=1656 ChkPtLsn=0x00000678


When a checkpoint is written to the quorum disk, the writing of the checkpoint is described in the quorum log by a begin-checkpoint record and an end-checkpoint record. In the following entry, the Database Manager finds the last begin-checkpoint and end-checkpoint records in the quorum log, which it then uses to find and identify the most recent checkpoint file. It then uploads the checkpoint file to the cluster registry key:

378.380::1999/06/09-18:00:48.568 [DM] DmpLogFindStartLsn: Uploading

  chkpt from quorum log


In the following entries, the Database Manager loads the cluster database and applies the changes in the quorum log to the loaded cluster database:

378.380::1999/06/09-18:00:49.365 [DM]: Loading cluster database from

  C:\WINNT\cluster\CLUSDB

15c.46c::1999/06/09-18:00:49.459 Physical Disk: 

  AddVolume : \\?\Volume{89fc8c21-18ef-11d3-b1b2-806d6172696f}\ 

  'C', 7 (8419616)

378.380::1999/06/09-18:00:49.678 [DM] DmpLogFindStartLsn: chkpt uploaded

  from quorum log

378.380::1999/06/09-18:00:49.693 [DM] DmpLogFindStartLsn:

  LSN=0x00000678, dwError=0x00000000

378.380::1999/06/09-18:00:49.693 [DM] DmpApplyChanges: The LSN of the

  record to apply changes from 0x00000678


Now that the quorum log entries have been read, validated, and propagated to the cluster's database, the Cluster service is ready to recreate its groups and resources.

Recreating Groups and Resources

In order to start, the Cluster service uses the cluster database from the local registry, which might be out-of-date (for instance, if the cluster database was updated while the node was down).

To guarantee that the Cluster service always uses the most up-to-date copy of the cluster database, the Cluster service updates the cluster configuration by first destroying the local in-memory copies of all the current objects (groups, resources, and so on) and then recreating cluster objects from the updated registry information in the cluster key. The local registry's copy of the cluster database is updated by copying the latest version from the quorum disk or, if the node is joining a cluster, from the sponsoring node.

Destroying Groups and Resources

The following entries report the destruction of the following:

378.380::1999/06/09-18:00:49.693 [FM] FmFormNewClusterPhase1, Entry.

  Quorum quorum will be deleted

378.380::1999/06/09-18:00:49.693 [FM] DestroyGroup: destroying 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:49.693 [FM] DestroyResource: destroying 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14

15c.46c::1999/06/09-18:00:49.693 Physical Disk: AddVolume:

  GetPartitionInfo(\??\Volume{89fc8c1b-18ef-11d3-b1b2-806d6172696f}),

  error 1

378.380::1999/06/09-18:00:49.693 [FM] FmpDestroyResource Exit.

378.380::1999/06/09-18:00:49.693 [FM] FmpDestroyGroup: Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 destroyed.

378.380::1999/06/09-18:00:49.693 [OM] Deleting object Disk Group 1

  (254ef0e7-1937-11d3-b3fe-00a0c986aa14)


Creating Groups and Resources

The forming cluster builds up the memory database by recreating first the groups and then the resources on the basis of the new configuration.

Configuring the networks (which is covered in the next section) happens after the Cluster service has destroyed the groups and resources and before it recreates them.

Note that this set of operations actually follows the cluster's configuring of the network in time.

Recreating the groups and resources begins with the following:

378.380::1999/06/09-18:00:51.162 [FM] FmFormNewClusterPhase2, Entry.


In the subsequent entries, up to "378.380::1999/06/09-18:00:51.209 [FM] All groups created", the forming node does the following:

378.380::1999/06/09-18:00:51.193 [FM] Processing groups list.


378.380::1999/06/09-18:00:51.193 [FM] Creating group

254ef0e4-1937-11d3-b3fe-00a0c986aa14


378.380::1999/06/09-18:00:51.193 [FM] Initializing group

254ef0e4-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.193 [FM] Name for Group

254ef0e4-1937-11d3-b3fe-00a0c986aa14 is 'Cluster Group'.


378.380::1999/06/09-18:00:51.193 [FM] Group

254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.193 [FM] Creating resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.193 [FM] Initializing resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.193 [FM] Name for Resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14 is 'Cluster IP Address'.

.

.

.

378.380::1999/06/09-18:00:51.193 [FM] Group

254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.193 [FM] Creating resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.193 [FM] Initializing resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.193 [FM] Name for Resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14 is 'Cluster Name'.


378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding

node 1 as possible host for resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding

node 2 as possible host for resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14 created.

.

.

.

378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding

node 1 as possible host for resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding

node 2 as possible host for resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14.


378.380::1999/06/09-18:00:51.193 [FM] Resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14 depends on

254ef0e5-1937-11d3-b3fe-00a0c986aa14. Creating...

378.380::1999/06/09-18:00:51.193 [FM] CreateResource, Opened existing

resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource

254ef0e6-1937-11d3-b3fe-00a0c986aa14 created.


Among the groups and resources created are the Quorum resource and the group that contains the Quorum resource:

378.380::1999/06/09-18:00:51.209 [FM] Name for Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 is 'Disk Group 1'.

378.380::1999/06/09-18:00:51.209 [FM] Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 contains Resource 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.209 [FM] CreateResource, Opened existing

  resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.209 [FM] ReCreating quorum resource

  254ef0e8-1937-11d3-b3fe-00a0c986aa14


After the service has created all the groups and resources, this entry appears in the log:

378.380::1999/06/09-18:00:51.209 [FM] All groups created.


With the group and resource objects recreated and initialized and their possible owners added, the Cluster service is ready to configure the cluster's network information.

Configuring the Networks

In this phase of cluster formation, the Cluster service creates network and network interface objects and makes sure the information about them is accurate and consistent across the cluster. To do so, the network configuration engine does the following:

The following entry begins network-configuration:

378.380::1999/06/09-18:00:49.756 [NM] Synchronizing node information.


Creating Objects

The Cluster service creates node objects that represent the members of the cluster, according to the cluster database.

Note that this cluster is being formed with only one node.

378.380::1999/06/09-18:00:49.756 [NM] Creating node objects.

378.380::1999/06/09-18:00:49.756 [NM] Creating object for node 2 (NODE2)


Collecting System Information

The Cluster service collects information from the system about network adapters and network connection objects.

Interfaces are IP addresses. An adapter can have multiple IP addresses (interfaces) associated with it.

378.380::1999/06/09-18:00:49.772 [NM] Synchronizing network information.

378.380::1999/06/09-18:00:49.787 [NM] Synchronizing interface

  information.


In the following entry, the Cluster service compares the current configuration with the one recorded in the cluster database. It notes which network adapters have been removed or added or whether there have been no changes in the network configuration.

378.380::1999/06/09-18:00:49.818 [NM] Running network configuration

  engine.


In the following entries, the Node Manager reports any changes it detected. It updates the cluster database with this information. The second entry means the previously existing adapters (one, in this case) were matched to their previous networks.

378.380::1999/06/09-18:00:50.600 [NM] Processing network configuration

  changes.

378.380::1999/06/09-18:00:50.600 [NM] Matched 1 networks, created 0 new

  networks.

378.380::1999/06/09-18:00:50.600 [NM] Resynchronizing network

  information.

378.380::1999/06/09-18:00:50.600 [NM] Resynchronizing interface

  information.


Recreating Network and Interface Objects

The Cluster service creates the following network and interface objects on the basis of the current network configuration:

378.380::1999/06/09-18:00:50.600 [NM] Creating network objects.

378.380::1999/06/09-18:00:50.600 [NM] Creating object for network

  5da3c9ea-7c25-4129-ab70-2c058de6d1fd (CPQ Netintellegent 10.10.0.2).

378.380::1999/06/09-18:00:50.600 [NM] Creating object for network

  d561dbcb-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1).

378.380::1999/06/09-18:00:50.600 [NM] Creating interface objects.

378.380::1999/06/09-18:00:50.600 [NM] Creating object for interface

  2e12db05-2947-425f-835a-fb9e4f13f0b3 ( 100b 172.16.64.1 -     NODE2).


In the following entries, the Node Manager gathers information used to detect network interface failures.

378.380::1999/06/09-18:00:50.600 [NM] Assigned index 0 to interface

  2e12db05-2947-425f-835a-fb9e4f13f0b3.

378.380::1999/06/09-18:00:50.600 [NM] Creating object for interface

  a6fe4897-139a-4b84-815f-8892f674afb8 (CPQ Netintellegent 10.10.0.2 -

  NODE2).

378.380::1999/06/09-18:00:50.600 [NM] Assigned index 0 to interface

  a6fe4897-139a-4b84-815f-8892f674afb8.

378.380::1999/06/09-18:00:50.600 [NM] Creating object for interface

  d561dbcc-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1 -     NODE1).

378.380::1999/06/09-18:00:50.600 [NM] Assigned index 1 to interface

  d561dbcc-1936-11d3-b3fe-00a0c986aa14.


Registering Networks and Interfaces and Bringing Them Online

In the entries in this section, the Node Manager does the following:

Note the event entry description, "Registering interface 2e12db05-2947-425f-835a-fb9e4f13f0b3 with cluster transport, addr 172.16.64.2, endpoint 3343," in which

Thus, "Registering interface 2e12db05-2947-425f-835a-fb9e4f13f0b3 with cluster transport, addr 172.16.64.2, endpoint 3343" means the following:

378.380::1999/06/09-18:00:50.600 [NM] Registering network 

  d561dbcb-1936-11d3-b3fe-00a0c986aa14 with cluster transport.

378.380::1999/06/09-18:00:50.600 [NM] Bringing network 

  d561dbcb-1936-11d3-b3fe-00a0c986aa14 online.

378.380::1999/06/09-18:00:50.600 [NM] Registering interfaces for network

  d561dbcb-1936-11d3-b3fe-00a0c986aa14 with cluster transport.

378.380::1999/06/09-18:00:50.600 [NM] Registering interface 

  2e12db05-2947-425f-835a-fb9e4f13f0b3 with cluster transport, addr

  172.16.64.2, endpoint 3343.

378.380::1999/06/09-18:00:50.600 [NM] Registering interface 

  d561dbcc-1936-11d3-b3fe-00a0c986aa14 with cluster transport, addr

  172.16.64.1, endpoint 3343.

378.380::1999/06/09-18:00:50.615 [NM] Initializing membership...


Performing the Connectivity Report

The following entries signal that the Node Manager is starting a connectivity report to detect any network interface failures.

378.328::1999/06/09-18:00:50.631 [NM] Scheduling connectivity report for

  network d561dbcb-1936-11d3-b3fe-00a0c986aa14 ( 100b     172.16.64.1)

378.328::1999/06/09-18:00:50.631 [NM] Updating local connectivity info

  for network d561dbcb-1936-11d3-b3fe-00a0c986aa14.

378.328::1999/06/09-18:00:50.631 [NM] Node for interface 0 is down

378.328::1999/06/09-18:00:50.631 [NM] Interface 1 is either up or

  unreachable


Note that in the following entry, the Node Manager discounts the connectivity report from interface 0 because the interface's node is down.

378.328::1999/06/09-18:00:50.631 [NM] The report from interface 0 is not

  valid

378.328::1999/06/09-18:00:50.662 [NM] Interface 1 is Up

378.328::1999/06/09-18:00:50.662 [NM] Beginning phase 2 of state

  computation for network d561dbcb-1936-11d3-b3fe-00a0c986aa14


In the following entry, the Node Manager summarizes the connectivity report. The unavailable interface is interface 0. (See the preceding entries.)

378.328::1999/06/09-18:00:50.662 [NM] Unavailable=1, Failed = 0,

  Unreachable=0, Reachable=1, Up=1,


For the meaning of state 3, see "State Codes" later in this chapter.

378.328::1999/06/09-18:00:50.662 [NM] Network 

  d561dbcb-1936-11d3-b3fe-00a0c986aa14 is now in state 3

378.328::1999/06/09-18:00:50.662 [NM] Interface 

  d561dbcc-1936-11d3-b3fe-00a0c986aa14 is up (node: NODE1, network:

  100b 172.16.64.1).

378.328::1999/06/09-18:00:50.662 [NM] Network 

  d561dbcb-1936-11d3-b3fe-00a0c986aa14 ( 100b 172.16.64.1) is up.

378.380::1999/06/09-18:00:50.631 [CLSMSG] Initializing.

378.380::1999/06/09-18:00:50.881 [NM] Membership initialization

  complete.


NodeHighestVersion and NodeLowestVersion

If the cluster nodes are running different versions of Windows NT or Windows 2000 Advanced Server and therefore different versions of the Cluster service, there can be an issue of whether the versions are compatible with all the cluster nodes. The Cluster service computes the Cluster service version associated with each node. It then computes the cluster version for the cluster as a whole from the versions of all the active members of the cluster. As other members join the cluster, the existing members compute a new version of the cluster.

In the following two pairs of entries, Cluster service computes a cluster version that is based on the versions of all the active members of the cluster, even when there is only one active member. The version number falls into two 16-bit numbers:

First the Cluster service retrieves the NodeHighestVersion and NodeLowestVersion properties for each active member.

378.380::1999/06/09-18:00:50.881 [NM] NmpValidateNodeVersion: Node=1,

  HighestVersion=0x00030803, LowestVersion=0x000200e0


In the second pair of entries, the Cluster service calculates the highest version of the Cluster service that can reliably run on all the nodes in the cluster and the lowest version with which every node can communicate. The calculation is biased toward the middle:

378.380::1999/06/09-18:00:50.881 [NM] NmpCalcClusterVersion: status = 0

  ClusHighestVer=0x00030803, ClusLowestVer=0x000200e0

378.380::1999/06/09-18:00:50.881 [NM] [NmpResetClusterVersion]

  ClusterHighestVer=0x00030803 ClusterLowestVer=0x000200e0

378.380::1999/06/09-18:00:50.881 [NM] Disabling mixed NT4/NT5 operation.


Cluster version 1 corresponds to the Windows NT 4.0 Service Pack 3 release, and version 2 to the Windows NT 4.0 Service Pack 4 and Windows NT 4.0 Service Pack 5 releases.

Performing Fixups

In the following entry, the Node Manager performs fixups, which enables the cluster to run in mixed mode. (In this case, node 1 is running Windows 2000, and node 2 is running Windows NT 4.0.) Fixups are also run the first time a node that has been upgraded runs the Cluster service. After running fixups, the Cluster service performs a global update of the membership.

378.380::1999/06/09-18:00:50.881 [NM] NmPerformFixups Entry,

  dwFixupType=1


Updating the Local Cluster Database and Setting Database Entries

The following entries are typical of those that report beginning and committing local updates. Note the timestamps. Local updates of the cluster database can, but do not necessarily have to, occur at this point in cluster formation.

378.380::1999/06/09-18:00:50.897 [GUM] GumSendUpdate:  Locker waiting

  type 2 context 17

378.380::1999/06/09-18:00:50.897 [GUM] Thread 0x380 UpdateLock wait on

  Type 2

378.380::1999/06/09-18:00:50.897 [GUM] DoLockingUpdate successful, lock

  granted to 1

378.380::1999/06/09-18:00:50.897 [GUM] GumSendUpdate: Locker dispatching

  seq 1657 type 2 context 17

378.380::1999/06/09-18:00:50.897 [dm] DmBeginLocalUpdate Entry

378.380::1999/06/09-18:00:50.897 [dm] DmBeginLocalUpdate Exit,

  pLocalXsaction=0x000a0dd8 dwError=0x00000000


Bringing Resources Online

In the last stage of forming a cluster, the Cluster service creates resources and tries to bring them online. There are two parts to bringing resources online: in the cluster database, adding possible owners for resource types in the cluster database; and then creating the resources and bringing them online.

Adding Possible Owners for Resource Types

In the following entries, the Failover Manager adds possible owners for resource types.

In the log, there are many more events of this type, but they are all essentially identical to the two reproduced here:

378.380::1999/06/09-18:00:50.897 [FM] processing resource types.

378.380::1999/06/09-18:00:50.897 [FM] FmpAddPossibleNode: adding node 1

  to resource type's possible node list

378.380::1999/06/09-18:00:50.897 [FM] FmpAddPossibleNode: adding node 2

  to resource type's possible node list


After adding possible owners for resource types, the Cluster service updates local copies of the cluster registry key:

378.380::1999/06/09-18:00:50.928 [NM] NmpUpdatePerformJoinFixups2:

  called postfixup notifycb function with  status 0

378.380::1999/06/09-18:00:50.928 [DM] DmCommitLocalUpdate Entry

378.380::1999/06/09-18:00:50.928 [DM] DmCommitLocalUpdate Exit,

  dwError=0x00000000

378.380::1999/06/09-18:00:50.928 [GUM] GumpDoUnlockingUpdate releasing

  lock ownership

378.380::1999/06/09-18:00:50.928 [GUM] GumSendUpdate: completed update

  seq 1657    type 2 context 17


Creating Resources and Bringing Them Online

The following entries are representative of the process of creating and initializing resources. The pattern repeats, of course, for each resource. In this section, the Cluster service creates all groups from a group list in the cluster database, as indicated by the following line.

378.380::1999/06/09-18:00:51.193 [FM] Processing groups list.


In the following example, the Failover Manager creates and initializes the Cluster Group:

378.380::1999/06/09-18:00:51.193 [FM] Creating group 

  254ef0e4-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.193 [FM] Initializing group 

  254ef0e4-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.193 [FM] Name for Group 

  254ef0e4-1937-11d3-


In the following line, the Failover Manager creates and brings online the first of the resources that the Cluster Group contains. The Failover Manager repeats this pattern for all the resources the Cluster Group contains.

The pattern for creating groups and resources is that the Failover Manager does the following:

378.380::1999/06/09-18:00:51.193 [FM] Group

254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14.


378.380::1999/06/09-18:00:51.193 [FM] Creating resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.193 [FM] Initializing resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14 from the registry.


378.380::1999/06/09-18:00:51.193 [FM] Name for Resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14 is 'Cluster IP Address'.


378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding

node 1 as possible host for resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.193 [FM] FmpAddPossibleEntry: adding

node 2 as possible host for resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14.


378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource

254ef0e5-1937-11d3-b3fe-00a0c986aa14 created.


The following entry starts the creation of the next resource, which is the Cluster Name resource:

378.380::1999/06/09-18:00:51.193 [FM] Group 

  254ef0e4-1937-11d3-b3fe-00a0c986aa14 contains Resource 

  254ef0e6-1937-11d3-b3fe-00a0c986aa14.


In creating and bringing online the Cluster Name resource (Resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14), the Failover Manager discovers that the Cluster Name resource depends on the Cluster IP Address resource and so opens the Cluster IP Address resource. Note that, in order for the Cluster service to perform such actions as bringing a resource online and changing its properties, it has to first open the resource to obtain a handle to it.

378.380::1999/06/09-18:00:51.193 [FM] Resource 

  254ef0e6-1937-11d3-b3fe-00a0c986aa14 depends on 

  254ef0e5-1937-11d3-b3fe-00a0c986aa14. Creating...

378.380::1999/06/09-18:00:51.193 [FM] CreateResource, Opened existing

  resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.193 [FM] All dependencies for resource

  254ef0e6-1937-11d3-b3fe-00a0c986aa14 created.


The remainder of the events involved in bringing resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14 (the Cluster Name resource) online are interspersed with entries reporting a great many other events, most of which, including global updates, involve creating and attempting to bring online other resources and groups.

In the following entries, the Cluster service starts creating the Quorum resource and the group to which it belongs. Notice that the Cluster service does not bring the Quorum resource online before the Cluster Name resource and that the pattern of events is the same for all resources and groups:

378.380::1999/06/09-18:00:51.193 [FM] Creating group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.209 [FM] Initializing group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.209 [FM] Name for Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 is 'Disk Group 1'.

378.380::1999/06/09-18:00:51.209 [FM] Group 

  254ef0e7-1937-11d3-b3fe-00a0c986aa14 contains Resource 

  254ef0e8-1937-11d3-b3fe-00a0c986aa14.

378.380::1999/06/09-18:00:51.209 [FM] CreateResource, Opened existing

  resource 254ef0e8-1937-11d3-b3fe-00a0c986aa14

378.380::1999/06/09-18:00:51.209 [FM] ReCreating quorum resource

  254ef0e8-1937-11d3-b3fe-00a0c986aa14


In a mixed-mode cluster, different versions of resource type DLLs can require fixups just as the Cluster service does. After all groups have been created and if the cluster is running in mixed-mode, fixups can be run on the resource type DLLs. This is described in the following entries.

378.380::1999/06/09-18:00:51.209 [FM] FmpFixupResourceTypes Entry.

378.380::1999/06/09-18:00:51.209 [FM] FmpFixupResourceTypes Exit


Then the Cluster service obtains and reports the state of all the groups in the cluster.

378.380::1999/06/09-18:00:51.209 [FMX] GetGroupListState, Group <Disk

  Group 1> state = 0

378.380::1999/06/09-18:00:51.209 [FMX] GetGroupListState, Group <Cluster

  Group> state = 1

378.380::1999/06/09-18:00:51.209 [FMX] GetGroupListState, Group <test

  group> state = 1


In the following entries, the Failover Manager initializes the Cluster IP Address resource by using the registry and then performs the subsequent global update and writes change records to the quorum log:

378.380::1999/06/09-18:00:51.209 [FM] Initializing resource 

  254ef0e5-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.225 [FM] Name for Resource 

  254ef0e5-1937-11d3-b3fe-00a0c986aa14 is 'Cluster IP Address'.

378.380::1999/06/09-18:00:51.225 [FM] FmpRmCreateResource: creating

  resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14 in shared resource

  monitor

378.3c0::1999/06/09-18:00:51.240 [GUM] GumSendUpdate:  Locker waiting

  type 1 context 0

378.3c0::1999/06/09-18:00:51.240 [GUM] Thread 0x3c0 UpdateLock wait on

  Type 1

378.3c0::1999/06/09-18:00:51.240 [GUM] DoLockingUpdate successful, lock

  granted to 1


The following GUM update concerns creating a new registry key or entry:

378.3c0::1999/06/09-18:00:51.240 [GUM] GumSendUpdate: Locker dispatching

  seq 1659    type 1 context 0

378.3c0::1999/06/09-18:00:51.240 [dm] DmWriteToQuorumLog Entry Seq#=1659

  Type=0 Size=152

378.3c0::1999/06/09-18:00:51.240 [dm] DmWriteToQuorumLog Entry Seq#=1659

  Type=0 Size=152

378.3c0::1999/06/09-18:00:51.240 [GUM] GumpDoUnlockingUpdate releasing

  lock ownership

378.3c0::1999/06/09-18:00:51.240 [GUM] GumSendUpdate: completed update

  seq 1659    type 1 context 0


The following entry reports that the network interface is no longer valid; the interface is no longer available.

15c.458::1999/06/09-18:00:51.256 IP Address <Cluster IP Address>: NBT

  interface \Device\NetBt_If1 (instance 0x712F4D0A) is no longer valid.


In the following entry, the update concerns deleting an entry from a registry key, probably the one for the interface that is no longer valid:

378.3c0::1999/06/09-18:00:51.256 [GUM] GumSendUpdate:  Locker waiting

  type 1 context 4099

378.3c0::1999/06/09-18:00:51.256 [GUM] Thread 0x3c0 UpdateLock wait on

  Type 1

378.3c0::1999/06/09-18:00:51.256 [GUM] DoLockingUpdate successful, lock

  granted to 1

378.3c0::1999/06/09-18:00:51.256 [GUM] GumSendUpdate: Locker dispatching

  seq 1660    type 1 context 4099


The Cluster service attempts again to create and bring the Cluster Name resource (254ef0e6-1937-11d3-b3fe-00a0c986aa14) online, initializing the resource from the cluster registry key and creating it in the Resource Monitor:

378.380::1999/06/09-18:00:51.256 [FM] Initializing resource 

  254ef0e6-1937-11d3-b3fe-00a0c986aa14 from the registry.

378.380::1999/06/09-18:00:51.256 [FM] Name for Resource 

  254ef0e6-1937-11d3-b3fe-00a0c986aa14 is 'Cluster Name'.

378.380::1999/06/09-18:00:51.271 [FM] FmpRmCreateResource: creating

  resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14 in shared resource

  monitor


In the following entries, the Database Manager has created a registry key to replace the one that was deleted in preceding the entries for 1999/06/09-18:00:51.256. This key might be a registry entry or subkey, but its creation is reported to the cluster log as a "key." The GUM performs a global update with this change, and the Database Manager writes the change to the quorum log. Note, again, that the type (0) in the write operation to the quorum log matches the context number for the global update. If the Cluster service on this node were to crash right after this operation and the other node were started, the other node would have access to the changes recorded in the quorum log and would apply them in its cluster formation process. As a result, the current configuration of the cluster would be preserved.

378.3c0::1999/06/09-18:00:51.271 [GUM] GumSendUpdate:  Locker waiting

  type 1 context 0

378.3c0::1999/06/09-18:00:51.271 [GUM] Thread 0x3c0 UpdateLock wait on

  Type 1

378.3c0::1999/06/09-18:00:51.271 [GUM] DoLockingUpdate successful, lock

  granted to 1

378.3c0::1999/06/09-18:00:51.271 [GUM] GumSendUpdate: Locker dispatching

  seq 1662    type 1 context 0

378.3c0::1999/06/09-18:00:51.271 [dm] DmWriteToQuorumLog Entry Seq#=1662

  Type=0 Size=152

378.3c0::1999/06/09-18:00:51.271 [dm] DmWriteToQuorumLog Entry Seq#=1662

  Type=0 Size=152

378.3c0::1999/06/09-18:00:51.271 [GUM] GumpDoUnlockingUpdate releasing

  lock ownership

378.3c0::1999/06/09-18:00:51.271 [GUM] GumSendUpdate: completed update

  seq 1662    type 1 context 0

15c.3cc::1999/06/09-18:00:51.287 Network Name <Cluster Name>: Successful

  open of resid 507752

378.380::1999/06/09-18:00:51.287 [FM] FmpRmCreateResource: created

  resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14, resid 741136


After bringing other resources online, the Cluster service tries to bring the Cluster group, which contains the Cluster Name and Cluster IP Address resources, online:

378.380::1999/06/09-18:00:51.318 [FM] FmpOnlineGroupFromList: Previous

  group state for 254ef0e4-1937-11d3-b3fe-00a0c986aa14 is 1

378.380::1999/06/09-18:00:51.318 [FM] FmpOnlineGroup: trying to bring

  group 254ef0e4-1937-11d3-b3fe-00a0c986aa14 online

378.380::1999/06/09-18:00:51.318 [FM] OnlineGroup for 

  254ef0e4-1937-11d3-b3fe-00a0c986aa14 owner 651932


Next, the Cluster service brings the Cluster IP Address online. Note that the Failover Manager acquires a shared lock on the Quorum resource to do so. (For more information about shared locks, see "Shared Locks and gdwQuoBlockingResources" earlier in this chapter.)

378.380::1999/06/09-18:00:51.318 [FM] FmpRmOnlineResource: bringing

  resource 254ef0e5-1937-11d3-b3fe-00a0c986aa14 (resid 735200) online.

378.380::1999/06/09-18:00:51.318 [CP] CppResourceNotify for resource

  Cluster IP Address

378.380::1999/06/09-18:00:51.334 [FM] FmpRmOnlineResource: called

  InterlockedIncrement on gdwQuoBlockingResources for resource

  254ef0e5-1937-11d3-b3fe-00a0c986aa14

15c.458::1999/06/09-18:00:51.334 IP Address <Cluster IP Address>:

  Bringing resource online...


The GUM initiates a global update for the Cluster IP Address resource's change in state from Pending to Online while the Online thread is still running:

378.380::1999/06/09-18:00:51.334 [GUM] GumSendUpdate:  Locker waiting

  type 0 context 8

15c.488::1999/06/09-18:00:51.334 IP Address <Cluster IP Address>: Online 

  thread running.


The following entry is part of a group of entries that describe activity while both the Cluster IP Address and Cluster Name resources are still in the Pending state. In this entry, the IP Address resource is notified that the network interface to which it is bound has changed or been deleted.

15c.488::1999/06/09-18:00:51.334 IP Address <Cluster IP Address>:

  Created cluster notify port.


With the Cluster Group online, the Cluster service checks to make sure that there is still a network interface to which the IP Address can be bound on the same subnet. If there is not, the IP Address is orphaned.

15c.488::1999/06/09-18:00:51.350 IP Address <Cluster IP Address>:

  Checking if resource is orphaned: network masks 00FCFFFF=00FFFFFF

  and addresses FE4010AC^00000A0A.


After working on bringing another resource online, the Cluster service continues bringing the Cluster IP Address resource online. In these two entries, the Cluster service obtains a handle for the network interface and registers it so that the IP Address resource that is bound to it is notified if a change occurs to the network interface.

15c.488::1999/06/09-18:00:51.365 IP Address <Cluster IP Address>:

  Online: Opened object handle for netinterface 

  d561dbcc-1936-11d3-b3fe-00a0c986aa14.

15c.488::1999/06/09-18:00:51.365 IP Address <Cluster IP Address>:

  Online: Registered notification for netinterface 

  d561dbcc-1936-11d3-b3fe-00a0c986aa14.


In the following entries, the Database Manager takes a checkpoint of the cluster database (in this case, the cluster key of the registry for the local system), and the Log Manager writes it to the quorum disk:

378.380::1999/06/09-18:00:51.412 [dm]DmUpdateFormNewCluster - taking a

  checkpoint

378.380::1999/06/09-18:00:51.412 [LM] LogCheckPoint entry

378.380::1999/06/09-18:00:51.412 [LM] DmpGetSnapshotCb::checkpoint file

  name =D:\MSCS\chk67D.tmp Seq#=1661

378.380::1999/06/09-18:00:51.756 [dm] DmpGetSnapShotCb - DmpGetDatabase

  returned 0x00000000

378.380::1999/06/09-18:00:51.756 [LM] LogCheckPoint:

  ChkPtFile=D:\MSCS\chk67D.tmp Chkpt Trid=1661 CheckSum=86865

378.380::1999/06/09-18:00:51.756 [lm] LogpAppendPage : Writing 1024

  bytes to disk at offset 0x00000800

378.380::1999/06/09-18:00:51.756 [lm] LogFlush : pLog=0x000a0650 writing

  the 1024 bytes for active page at offset 0x00000c00

378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint: EndChkpt written.

  EndChkPtLsn =0x00000c08 ChkPt Seq=1661 ChkPt

  FileName=D:\MSCS\chk67D.tmp

378.380::1999/06/09-18:00:51.771 [lm] LogpCheckpoint : Writing 1024

  bytes to disk at offset 0x00000000

378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint Exit


The following entry reports that a node is down. Because a node is down, updates are made to the database and written to the quorum log — therefore, the Database Manager performs the update. (If all nodes were up, the GUM would perform the update, replicating any change across the cluster.)

378.380::1999/06/09-18:00:51.771 [dm] DmUpdateFormNewCluster - some node

  down


And now the cluster has successfully formed:

378.380::1999/06/09-18:00:51.787 [INIT] Successfully formed a cluster.

378.380::1999/06/09-18:00:51.803 [INIT] Cluster started.


With the cluster formed, the Cluster service continues bringing resources online, including the Cluster IP Address resource.

After making a number of changes to the cluster database and subsequently writing change records to the quorum log, the Database Manager changes one of the Parameters subkeys for the Cluster IP Address resource and writes the change to the quorum log. Then the GUM initiates a global update with the change to the registry (type 1 context 4098):

378.498::1999/06/09-18:00:53.162 DM: Setting value NbtDeviceInstance for

  key Resources\254ef0e5-1937-11d3-b3fe-00a0c986aa14\Parameters\1

378.498::1999/06/09-18:00:53.178 [dm] DmWriteToQuorumLog Entry Seq#=1677

  Type=4098 Size=180

378.498::1999/06/09-18:00:53.178 [lm] LogWrite : Entry TrId=1677 RmId=5

  RmType = 4098 Size=180

378.498::1999/06/09-18:00:53.178 [lm] LogWrite : Exit

  returning=0x000011f0

378.498::1999/06/09-18:00:53.178 [GUM] GumpDoUnlockingUpdate releasing

  lock ownership

378.498::1999/06/09-18:00:53.178 [GUM] GumSendUpdate: completed update

  seq 1677    type 1 context 4098


In the following entries, the Cluster service creates an NBT interface for the Cluster IP Address resource, and the Resource Monitor announces that the resource is online (state 2):

15c.488::1999/06/09-18:00:53.178 IP Address <Cluster IP Address>:

  Created NBT interface \Device\NetBt_If1 (instance 0x7A85A0AA).

15c.488::1999/06/09-18:00:59.396 [RM] RmpSetResourceStatus, Posting

  state 2 notification for resource <Cluster IP Address>


The following entry reports that the Cluster IP Address resource is online.

15c.488::1999/06/09-18:00:59.396 IP Address <Cluster IP Address>: IP

  Address 172.16.64.254 on adapter (R) PRO/100B PCI Adapter (TX)

  online


The following entry reports the Cluster IP Address resource as online. Note that the old state was 129 (ClusterResourceOnlinePending) and the new state is 2 (ClusterResourceOnline).

378.328::1999/06/09-18:00:59.396 [FM] HandleResourceTransition: Resource

  Name = 254ef0e5-1937-11d3-b3fe-00a0c986aa14 old state=129 new

  state=2


Bringing resource bf37b6a7-391d-11d2-a397-0000f803f3da (the Cluster Network Name resource) online takes longer. The entries that describe this process are the same as those that describe the process of bringing the Cluster IP Address online, except that there are some new entries.

In the first four entries, the Network Name resource is trying to come online. But it must wait for 254ef0e5-1937-11d3-b3fe-00a0c986aa14, the Cluster IP Address resource on which the Cluster Network Name resource depends, to come online first:

378.3a8::1999/06/09-18:00:59.396 [FM] OnlineWaitingTree, 

  254ef0e6-1937-11d3-b3fe-00a0c986aa14 (129) depends on 

  254ef0e5-1937-11d3-b3fe-00a0c986aa14 (2). Start now

378.3a8::1999/06/09-18:00:59.396 [FM] OnlineWaitingTree, Start resource

  254ef0e6-1937-11d3-b3fe-00a0c986aa14

378.3a8::1999/06/09-18:00:59.490 [FM] OnlineResource: 

  254ef0e6-1937-11d3-b3fe-00a0c986aa14 depends on 

  254ef0e5-1937-11d3-b3fe-00a0c986aa14. Bring online first.

378.3a8::1999/06/09-18:00:59.537 [FM] FmpRmOnlineResource: bringing

  resource 254ef0e6-1937-11d3-b3fe-00a0c986aa14 (resid 741136) online.


Bringing the Cluster Network Name resource online involves a number of activities that have been covered earlier in this chapter, including the following:

The unique activities are reported in the following entries, at the end of which the Cluster Network Name resource has successfully come online:

15c.4d4::1999/06/09-18:01:01.177 Network Name <Cluster Name>: Registered

  server name CLUSTER1 on transport \Device\NetBt_If1.

15c.4d4::1999/06/09-18:01:01.177 Network Name <Cluster Name>: Registered

  workstation name CLUSTER1 on transport \Device\NetBt_If1.

.

.

.

15c.4d4::1999/06/09-18:01:01.615 Network Name <Cluster Name>: The zone 

  for 254.64.16.172.in-addr.arpa. does not accept dynamic DNS 

  registration updates.

15c.4d4::1999/06/09-18:01:01.630 Network Name <Cluster Name>: Registered 

  DNS name cluster1.itreskit.com on IP Address 172.16.64.254.

15c.4d4::1999/06/09-18:01:01.630 Network Name <Cluster Name>: Network 

  Name CLUSTER1 is now online

15c.4d4::1999/06/09-18:01:01.630 [RM] RmpSetResourceStatus, Posting 

  state 2 notification for resource <Cluster Name>


Cluster Successfully Formed

The Cluster service does the following:

378.380::1999/06/09-18:00:51.381 [EVT] EvOnline

378.380::1999/06/09-18:00:51.381 [EVT] Set propagation state to 0001

378.380::1999/06/09-18:00:51.381 [EVT] EvOnline : calling

ElfRegisterClusterSvc

378.380::1999/06/09-18:00:51.412 [EVT] EvOnline:

pPackedEventInfo->ulSize=3284 pPackedEventInfo

->ulNulEventsForLogFile=2


378.380::1999/06/09-18:00:51.412 [dm]DmUpdateFormNewCluster - taking

a checkpoint

378.380::1999/06/09-18:00:51.412 [LM] LogCheckPoint entry

378.380::1999/06/09-18:00:51.412 [LM] DmpGetSnapshotCb::checkpoint

file name =D:\MSCS\chk67D.tmp Seq#=1661

378.380::1999/06/09-18:00:51.756 [dm] DmpGetSnapShotCb -

DmpGetDatabase returned 0x00000000

378.380::1999/06/09-18:00:51.756 [LM] LogCheckPoint:

ChkPtFile=D:\MSCS\chk67D.tmp Chkpt Trid=1661 CheckSum=86865

378.380::1999/06/09-18:00:51.756 [lm] LogpAppendPage : Writing 1024

bytes to disk at offset 0x00000800

378.380::1999/06/09-18:00:51.756 [lm] LogFlush : pLog=0x000a0650

writing the 1024 bytes for active page at offset 0x00000c00

378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint: EndChkpt

written. EndChkPtLsn =0x00000c08 ChkPt Seq=1661 ChkPt

FileName=D:\MSCS\chk67D.tmp

378.380::1999/06/09-18:00:51.771 [lm] LogpCheckpoint : Writing 1024

bytes to disk at offset 0x00000000

378.380::1999/06/09-18:00:51.771 [LM] LogCheckPoint Exit


The following entry reports that the Cluster service has successfully pinged its own IP address to make sure the IP address is unique:

15c.488::1999/06/09-18:00:59.396 IP Address <Cluster IP Address>: IP

  Address 172.16.64.254 on adapter (R) PRO/100B PCI Adapter (TX)

  online


© 1985-2000 Microsoft Corporation. All rights reserved.