Network Load Balancing
The following are the principal Network Load Balancing components, which are installed on each Network Load Balancing cluster host (the name "Wlbs" remains from the previous version of the software, Windows NT Load Balancing Service):
Wlbs.sys, the Network Load Balancing networking device driver.
Wlbs.exe, the Network Load Balancing control program. Except for changing registry parameters, you can use Wlbs.exe from the command line to start, stop, and administer Network Load Balancing, as well as to enable and disable ports and to query cluster status.
For information about the command-line syntax and the arguments that Wlbs.exe accepts, see Windows 2000 Network Load Balancing Help.
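For illustration, the following commands correspond to the operations named above; the exact argument syntax (for example, how a port rule or a remote cluster is specified) is given in the Help documentation:

wlbs query      Displays the current cluster state and the list of member hosts.
wlbs stop       Stops Network Load Balancing operations on the host.
wlbs start      Restarts Network Load Balancing operations on the host.
wlbs disable    Disables traffic handling for the ports covered by a port rule.
wlbs enable     Re-enables traffic handling for the ports covered by a port rule.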
Rather than routing incoming client requests through a central host for redistribution, every Network Load Balancing cluster host receives each client request. A statistical mapping algorithm determines which host processes each incoming client request. The distribution is affected by host priorities, whether the cluster is in multicast or unicast mode, port rules, and the affinity setting.
This design has the advantage that no central dispatching host can become a performance bottleneck or a single point of failure for distributing requests. The trade-off is that delivering all the client traffic to every host means that the network adapter(s) in each host must handle all the incoming client traffic (which is usually a small percentage of the overall traffic, because responses from servers to clients typically far exceed client requests in volume).
When a Network Load Balancing host processes a client request that changes state information visible to all application instances, the change must be synchronized across all the hosts in the cluster. To accomplish this synchronization, the application can maintain shared state information in a back-end database and write each update to the back-end database server. If the target application is managed as a server-cluster resource, the back-end servers can themselves be members of a server cluster. The application can also provide other methods of its own design, such as cookies, for managing shared state information.
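As a sketch of the back-end approach, the following Python fragment keeps per-session state in a shared store instead of host-local memory, so that any host can continue a session. The SharedStore class and all names here are hypothetical stand-ins for a real back-end database:

import json

class SharedStore:
    """Stand-in for a back-end database shared by all cluster hosts."""
    def __init__(self):
        self._rows = {}  # in a real deployment, this lives off-host
    def put(self, key, value):
        self._rows[key] = json.dumps(value)
    def get(self, key):
        raw = self._rows.get(key)
        return json.loads(raw) if raw is not None else None

STORE = SharedStore()

def handle_request(session_id, items_to_add):
    # Load session state from the shared store rather than local memory,
    # because the next request in this session may land on another host.
    cart = STORE.get(session_id) or []
    cart.extend(items_to_add)
    STORE.put(session_id, cart)  # synchronize the change for all hosts
    return cart

print(handle_request("session-42", ["book"]))  # ['book']
print(handle_request("session-42", ["pen"]))   # ['book', 'pen'], even from another host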
Network Load Balancing hosts maintain membership in the cluster through heartbeat messages. By default, when a host fails to send heartbeat messages for about five seconds, it is deemed to have failed, and the remaining hosts in the cluster perform convergence in order to do the following:
Establish which hosts are still active members of the cluster.
Determine which of the surviving hosts has the highest priority.
Redistribute the failed host's client load among the surviving hosts.
Note that the lowest value for the Priority ID host parameter indicates the highest priority among hosts.
During convergence, the surviving hosts look for consistent heartbeats; if the host that stopped sending heartbeats resumes sending them consistently, it rejoins the cluster in the course of convergence. Convergence also ensures that all active hosts arrive at a consistent view of which hosts are cluster members.
Convergence generally takes less than 10 seconds, so interruption in client service by the cluster is minimal.
By editing the registry, you can change both the number of missed messages required to start convergence and the period between heartbeats. However, making the period between heartbeats too short increases network overhead on the system.
During convergence, hosts that are still up continue handling client requests.
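The timing rule described above can be summarized in a few lines of Python. This is a simplified sketch, not the actual driver logic; the one-second heartbeat period and five-message limit are assumptions chosen to match the roughly five-second window mentioned earlier, and they correspond to the two values you can change in the registry:

import time

HEARTBEAT_PERIOD = 1.0  # seconds between heartbeat messages (assumed value)
MISSED_LIMIT = 5        # missed messages before convergence starts (assumed value)

def host_has_failed(last_heartbeat, now=None):
    # A host is declared failed once it has missed MISSED_LIMIT consecutive
    # heartbeat periods; the surviving hosts then begin convergence.
    now = time.time() if now is None else now
    return (now - last_heartbeat) >= MISSED_LIMIT * HEARTBEAT_PERIOD

# A host last heard from 5.5 seconds ago exceeds the roughly five-second
# window and triggers convergence.
print(host_has_failed(last_heartbeat=100.0, now=105.5))  # True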
The assignment of a given client request to a host occurs on all the hosts; no single host centrally distributes the requests among the others. Instead, the hosts jointly use a statistical algorithm that maps incoming client requests to the active hosts in the cluster.
Apart from the influence of cluster and host parameter settings, it is possible for two successive client requests to be assigned to the same host during normal operation. However, as more client requests come into the cluster, distribution of client requests by the algorithm statistically approaches the load division specified by the Load Weight parameter of the relevant port rule.
The distribution of client requests that the statistical mapping function produces is influenced by the following:
Host priorities.
Whether the cluster is in multicast or unicast mode.
Port rules, including the Load Weight specified for each rule.
The affinity setting.
The statistical mapping function does not change the existing distribution of requests unless the membership of the cluster changes or you adjust the load percentage.
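The real mapping function is internal to Network Load Balancing; the following Python sketch only illustrates the general idea of statistical mapping, substituting an ordinary hash for the actual algorithm. The host names and weighting scheme are illustrative assumptions:

import hashlib

def pick_host(client_ip, client_port, hosts, weights):
    # Every host runs this same computation on the same inputs, so all
    # hosts agree on which one owns a request without central coordination.
    key = f"{client_ip}:{client_port}".encode()
    bucket = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    # Spread buckets across the hosts in proportion to their load weights,
    # approximating the Load Weight of the relevant port rule.
    total = sum(weights[h] for h in hosts)
    point = bucket % total
    for h in hosts:
        point -= weights[h]
        if point < 0:
            return h

hosts = ["host1", "host2", "host3"]
weights = {"host1": 50, "host2": 25, "host3": 25}
print(pick_host("192.168.1.10", 3456, hosts, weights))

Because the result depends only on the request and the membership list, the distribution changes only when membership or the weights change, which is why convergence (or an adjusted load percentage) is what redistributes clients.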
Affinity defines a relationship between client requests, from either a single client address or a Class C network of clients, and one of the cluster hosts. Affinity ensures that requests from the specified clients are always handled by the same host. The relationship lasts until convergence occurs (that is, until the membership of the cluster changes) or until you change the affinity setting. There is no time-out; the relationship is based only on the client's IP address.
There are three types of affinity, which you choose with the Affinity setting. The Affinity setting determines which parts of the client's source IP address and port number influence the choice of the host that handles a particular client's request; the sketch at the end of this section illustrates the difference. The Affinity settings are as follows:
Setting Affinity to None distributes client requests more evenly; when maintaining session state is not an issue, you can use this setting to speed up response time to requests. For example, because multiple requests from a particular client can go to more than one cluster host, clients that access Web pages can get different parts of a page or different pages from different hosts.
With Affinity set to None, the Network Load Balancing statistical mapping algorithm uses both the port number and entire IP address of the client to influence the distribution of client requests.
Setting Affinity to None is also suitable in certain circumstances when the Network Load Balancing cluster sits behind a reverse proxy server. All the client requests then have the same source IP address, so the port number is what produces an even distribution of requests among the cluster hosts.
When Affinity is set to Single, the entire source IP address (but not the port number) is used to determine the distribution of client requests.
You typically set Affinity to Single for intranet sites that need to maintain session state. Single Affinity always returns each client's traffic to the same server, thus assisting the application in maintaining client sessions and their associated session state.
Note that client sessions that span multiple TCP connections (such as ASP sessions) are maintained as long as the Network Load Balancing cluster membership does not change. If the membership changes by adding a new host, the distribution of client requests is recomputed, and you cannot depend on new TCP connections from existing client sessions ending up at the same server. If a host leaves the cluster, its clients are partitioned among the remaining cluster hosts when convergence completes, and other clients are unaffected.
When Affinity is set to Class C, only the upper 24 bits of the client's IP address are used by the statistical-mapping algorithm. This option is appropriate for server farms that serve the Internet, where requests can come from clients sitting behind proxy farms. In that case, requests belonging to a single client session can arrive at the Network Load Balancing cluster from several source IP addresses.
Class C Affinity addresses this issue by directing all the client requests from a particular Class C network to a single Network Load Balancing host.
There is no guarantee, however, that all of the servers in a proxy farm are on the same Class C network. If the client's proxy servers are on different Class C networks, then the affinity relationship between a client and the server ends when the client sends successive requests from different Class C network addresses.
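To make the three settings concrete, the following Python sketch shows which parts of a client request each Affinity setting feeds into the statistical mapping function. The names are illustrative assumptions; the actual algorithm is internal to Network Load Balancing:

import ipaddress

def mapping_key(affinity, client_ip, client_port):
    """Build the mapping input for each Affinity setting (illustrative)."""
    if affinity == "none":
        # Entire IP address plus port: successive connections from one
        # client can land on different hosts.
        return (client_ip, client_port)
    if affinity == "single":
        # Entire IP address only: all of one client's traffic maps to
        # the same host.
        return (client_ip,)
    if affinity == "class_c":
        # Upper 24 bits only: every client in one Class C network maps
        # to the same host, even behind a proxy farm.
        network = ipaddress.ip_network(f"{client_ip}/24", strict=False)
        return (str(network.network_address),)
    raise ValueError(affinity)

print(mapping_key("none", "10.1.2.3", 4001))     # varies per connection
print(mapping_key("single", "10.1.2.3", 4001))   # stable per client
print(mapping_key("class_c", "10.1.2.99", 4001)) # stable per /24 network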