Capacity Planning

Reliability

Some sites can afford to fail or go offline; others cannot. Many financial institutions, for example, require 99.999 percent or better reliability. Site availability takes two forms: The site might need to remain available if one of its servers crashes, and it might need to remain online while information is being updated or backed up.
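A target such as 99.999 percent is easier to plan around once it is converted into a yearly downtime budget. The following is a minimal sketch of that arithmetic, not part of any Microsoft tool; the function name allowed_downtime_minutes and the 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year (525,600 minutes)


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    # The unavailable fraction of the year is whatever the target leaves over.
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few common targets, ending with the "five nines" figure.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per year")
```

Under these assumptions, a 99.999 percent target leaves a little over five minutes per year for all outages combined, including updates and backups that take the site offline.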

Even if your requirements are less rigorous than those of a major financial institution, you will probably want to use a redundant array of independent disks (RAID). You can also consider creating a “Web farm” with Network Load Balancing, and you can add subsystem redundancy by clustering component servers.

Server Clustering

The term “failure” commonly brings to mind a system crash, but many outages are in fact deliberate: the administrator brings a server down for routine maintenance, hardware installation, or an upgrade. Clustering makes it possible to take a server down for maintenance or service without causing the site itself to fail, and it also provides reliability in the event of an unscheduled failure.
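The idea of servicing one server at a time while the site stays up can be sketched with a toy model. This is an illustration only and does not represent how Windows 2000 clustering or Network Load Balancing is implemented; the Cluster class, its methods, and the rule that the site is available while at least one node is online are all assumptions.

```python
class Cluster:
    """Toy model (assumption): the site is up while at least one node is online."""

    def __init__(self, node_names):
        self.online = {name: True for name in node_names}

    def site_available(self) -> bool:
        return any(self.online.values())

    def take_offline(self, name: str) -> None:
        self.online[name] = False

    def bring_online(self, name: str) -> None:
        self.online[name] = True


def rolling_maintenance(cluster: Cluster) -> bool:
    """Service each node in turn; return True if the site never went down."""
    stayed_up = True
    for name in list(cluster.online):
        cluster.take_offline(name)   # planned downtime for this node only
        stayed_up = stayed_up and cluster.site_available()
        cluster.bring_online(name)   # node rejoins before the next one leaves
    return stayed_up


if __name__ == "__main__":
    print(rolling_maintenance(Cluster(["node-a", "node-b"])))  # two-node cluster
    print(rolling_maintenance(Cluster(["solo"])))              # single server
```

In this model a two-node cluster rides through maintenance on every node, whereas a single server takes the site down as soon as it is serviced. The same reasoning covers an unscheduled failure of one node.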

Microsoft supports two clustering solutions: Windows 2000 Server clustering and Network Load Balancing. The next two topics describe them.