Microsoft Site Server 3.0 Commerce Edition Capacity and Performance Analysis

Update Incorporating SQL Server 7.0 and Xeon Architecture

August 1999

Microsoft Corporation

Introduction

This document is an update to the Microsoft Site Server 3.0 Commerce Edition Performance and Capacity Analysis white paper that is included with the Microsoft® Site Server 3.0 Commerce Edition Resource Kit. The purpose of this update is to address changes in both software and hardware configurations that could impact performance calculations. These changes reflect evolutionary improvements in technology that naturally occur over time in the computer industry, such as faster processors, new versions of software and the latest releases of service packs. A comparison between the two system configurations is shown below.

	Old Configuration	New Configuration
System Software:	Microsoft® Windows NT® Server 4.0, Service Pack 3	Windows NT Server 4.0, Service Pack 4
Database Software:	Microsoft® SQL Server™ version 6.5, Service Pack 4	SQL Server version 7.0, Service Pack 1
CPU (SSCE):	4 x 200-MHz Pentium Pro w/512 K L2 cache	4 x 400-MHz Pentium II (Xeon) w/512 K L2 cache
CPU (SQL Server):	4 x 200-MHz Pentium Pro w/512 K L2 cache	4 x 400-MHz Pentium II (Xeon) w/512 K L2 cache
Disk (SSCE):	2 x 4.3-GB SCSI-3 (10,000 RPM)	2 x 4.3-GB SCSI LVD (10,000 RPM)
Disk (SQL Server):	1 x 4.3-GB SCSI-3 (10,000 RPM)	2 x 4.3-GB SCSI LVD (10,000 RPM); 20 x 9.1-GB SCSI LVD (10,000 RPM)
Memory (SSCE):	384-MB ECC buffered EDO RAM	2-GB ECC buffered EDO RAM
Memory (SQL Server):	256-MB ECC buffered EDO RAM	2-GB ECC buffered EDO RAM
Network:	100-BaseT Switched Ethernet	100-BaseT Switched Ethernet

This document compares the capacity of these two configurations by analyzing CPU and disk costs, using a revised and simplified approach to Transaction Cost Analysis (TCA) methodology. You may be tempted to skip to the Summary of Capacity and Performance section later in this document and use the numbers found there as a guideline for designing a system configuration based on your unique capacity requirements. However, more ambitious site builders will appreciate the value in studying the TCA methodology to run their own tests and do their own performance and capacity planning analysis.

Note For information about using TCA with SSCE, refer to the Microsoft Site Server 3.0 Commerce Edition Performance and Capacity Analysis white paper included with the Site Server 3.0 Commerce Edition Resource Kit.

Capacity planning for a service such as SSCE has many dependencies. These include hardware, system software, database software, ASP scripts, and usage profiles. The importance of understanding how different system configurations, different SSCE sites, and different usage characteristics will produce different results cannot be overemphasized.

The Active Server Pages (ASP) scripts used to build an SSCE site are unique to that site. ASP performance can vary widely, depending on the complexity and efficiency of the code. These variations will ultimately impact resource cost and capacity. A well-written ASP can have a greater impact on capacity and performance than an evolutionary change in hardware.

User behavior will also vary from site to site, and this variation needs to be reflected in the Shopper Profile discussed later in this document. For example, one site may show that 20 percent of store visitors actually purchase a product while another site shows only 2 percent. Because the purchase operation is the most demanding operation that can be performed on an SSCE site (from the standpoint of computer resource usage), this change in the shopper profile will have a significant impact on shopper capacity.

In summary, the results of the analysis provided in this document will prove useful for those wanting to expedite the capacity planning process. However, to create a capacity plan with the highest level of confidence, use this document as a road map for a hands-on performance and capacity analysis of your own uniquely designed SSCE site.

Summary of Capacity and Performance

Performance and capacity for an SSCE site are largely a function of how efficiently ASP pages can be processed. Because ASP processing is highly CPU intensive, SSCE capacity is reached when ASP processing consumes all available CPU resources.
Note, however, that multi-processor SSCE servers do not make efficient use of CPU resources, due primarily to thread management on 2-processor and 4-processor computers.

Note By default, Microsoft Internet Information Server (IIS) allocates a maximum of 10 threads per processor. If thread contention is an issue, increasing the number of threads in the IIS thread pool can improve ASP performance. However, more threads in the IIS thread pool result in more context switching, which causes additional CPU overhead. So the optimum relationship between thread pool size and context switching must be calibrated carefully. Ideally, thread pool size should not push CPU utilization beyond 70 percent.

Having said this, the most straightforward approach to increasing SSCE capacity is to increase CPU resources, either by using faster processors or by adding more SSCE servers. See Appendix B in the Microsoft Site Server 3.0 Commerce Edition Performance and Capacity Analysis for a detailed comparison of 1-processor, 2-processor, and 4-processor SSCE server test results.

However, for purposes of comparison, test results for 4-processor SSCE system configurations are used.
Based on the data derived using the new hardware and software and presented in this document, the following assertions can be made about the performance of the Pentium Pro and Xeon configurations when hosting the Volcano Coffee sample site found in SSCE 3.0:
Note These results also require the use of the proposed Shopper Profile found in the "Capacity and Performance Detail" section of this document. Different shopper profiles can produce considerably different results.
- Switching from Pentium Pro to Xeon configurations has increased shopper capacity by approximately 140 percent (from 400 to 950 shoppers).
- The increase in shopper capacity for the Xeon configuration is by and large a result of the increase in available CPU cycles (800 MHz for the Pentium Pro and 1600 MHz for the Xeon).
- The Xeon configuration shows some increased efficiency in terms of CPU cost per shopper. Cost declines from 0.91 Mcycles/sec on the Pentium Pro to 0.74 Mcycles/sec on the Xeon, a change of 19 percent.
Note The Mcycle is the unit of processor work used in this document. One Mcycle is equal to one million CPU cycles. As a unit of measure, the Mcycle is useful for comparing performance between processors because it is hardware independent.
- The Xeon configuration shows significant improvements in terms of disk cost. One shopper generates 0.016 disk seeks per second on a Pentium Pro configuration and 0.008 disk seeks per second on a Xeon configuration. However, because disk is not the bottleneck, the improvement in disk performance in the Xeon configuration is not a factor in increased shopper capacity.

Capacity Comparisons

The following chart shows how the increased CPU power of the Xeon configuration impacts shopper capacity. Because CPU is the bottleneck with SSCE, shopper capacity can be easily increased by adding more CPU power, in the form of faster, more powerful processors. When CPU is at maximum available usage, the Pentium Pro configuration supports 400 shoppers and the Xeon configuration supports 950 shoppers.

Note For further discussion of the CPU bottleneck in SSCE, refer to the Microsoft Site Server 3.0 Commerce Edition Performance and Capacity Analysis white paper.

Note Maximum available CPU usage for a 4-processor SSCE server has been determined to be 40 percent, which is equivalent to 320 MHz for the Pentium Pro (800 MHz x 40%) and 640 MHz for the Xeon (1600 MHz x 40%).
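The arithmetic in the note above can be sketched in a few lines of Python. This is only an illustration: the function name and parameter names are hypothetical, and the 40 percent ceiling is the value stated in this document for a 4-processor SSCE server.

```python
def available_mcycles_per_sec(processors: int,
                              mhz_per_processor: float,
                              max_utilization: float = 0.40) -> float:
    """Return the CPU budget (Mcycles/sec) usable for ASP processing.

    max_utilization defaults to the 40 percent ceiling quoted in this
    document for 4-processor SSCE servers.
    """
    return processors * mhz_per_processor * max_utilization


# Pentium Pro: 4 x 200 MHz; Xeon: 4 x 400 MHz.
print(f"Pentium Pro: {available_mcycles_per_sec(4, 200):.0f} Mcycles/sec")  # 320
print(f"Xeon:        {available_mcycles_per_sec(4, 400):.0f} Mcycles/sec")  # 640
```

For a server with a different processor count, the utilization ceiling would need to be re-measured rather than assumed to be 40 percent.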

Chart 1: Comparing Shopper Capacity for Pentium Pro and Xeon Configurations

Chart 2 compares resource usage for CPU and disk on a Xeon configuration. It shows that when CPU usage is maximized, disk is operating at approximately 2.75 percent. The implication here is that CPU resources will be maximized far sooner than disk resources.

Note Disk capacity is defined in terms of the maximum number of disk seeks that can be performed on the SQL Server data partition. For both Xeon and Pentium Pro configurations, 100 percent disk utilization is equal to 280 seeks per second. When CPU utilization is running at maximum, the SQL Server data disk is performing, on average, 7.7 disk seeks per second, which is well below capacity.
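The disk utilization figure quoted above follows directly from the two numbers in the note. A minimal Python sketch, with an illustrative function name and constant that are not part of any SSCE tooling:

```python
# 100 percent disk utilization for both configurations, per the note above.
MAX_DISK_SEEKS_PER_SEC = 280.0


def disk_utilization_percent(seeks_per_sec: float,
                             max_seeks_per_sec: float = MAX_DISK_SEEKS_PER_SEC) -> float:
    """Return disk utilization as a percentage of the seek ceiling."""
    return 100.0 * seeks_per_sec / max_seeks_per_sec


# 7.7 seeks/sec is the average observed when CPU utilization is at maximum.
print(f"{disk_utilization_percent(7.7):.2f}%")  # prints "2.75%"
```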

Chart 2: Comparing Shopper Capacity for Xeon CPU and Xeon Disk

Capacity and Performance Detail

Shopper Profile

This Shopper Profile is also found in the Microsoft Site Server 3.0 Commerce Edition Performance and Capacity Analysis white paper, which is part of the Site Server 3.0 Commerce Edition Resource Kit. Shopper operations shown here are based on the shopper operations included in the Volcano Coffee (VC) sample site, which is part of SSCE 3.0.

This profile identifies the behavior of an average shopper, with 19 shopper operations performed during a 20-minute session. The Profile Performance Rate (PPR) for each operation is calculated by dividing its transactions/session value by the session length in seconds, which converts transactions/session into transactions/second. For example, this profile shows that 19.0 operations will be performed by the average shopper during a 20-minute session (or 1200 seconds). Thus the PPR for all of the operations combined is 19.0 ÷ 1200, or approximately 0.0158 transactions per second.

Table 1: Shopper Profile Used in this Report

VC Shopper Operation	Transactions/ session	Profile Performance Rate
Additem	1.5	0.001250 trans/sec
Basket	2.0	0.001667 trans/sec
Checkout	0.5	0.000417 trans/sec
Clearitems	0.5	0.000417 trans/sec
Default	1.0	0.000833 trans/sec
Delitem	0.5	0.000417 trans/sec
Listing	0.5	0.000417 trans/sec
Lookup	1.0	0.000833 trans/sec
Main	2.5	0.002084 trans/sec
Product	6.0	0.005000 trans/sec
Search	2.0	0.001667 trans/sec
Welcome	1.0	0.000833 trans/sec
Total for All Operations	19.0	0.015827 trans/sec
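The conversion from transactions/session to a Profile Performance Rate can be sketched as follows. The transactions/session values are copied from Table 1; the function and variable names are illustrative only. Substituting the values from your own shopper profile yields the PPR column for your site.

```python
SESSION_SECONDS = 20 * 60  # 20-minute session, as stated for this profile

# Transactions per session, copied from Table 1.
TRANSACTIONS_PER_SESSION = {
    "Additem": 1.5, "Basket": 2.0, "Checkout": 0.5, "Clearitems": 0.5,
    "Default": 1.0, "Delitem": 0.5, "Listing": 0.5, "Lookup": 1.0,
    "Main": 2.5, "Product": 6.0, "Search": 2.0, "Welcome": 1.0,
}


def profile_performance_rate(transactions_per_session: float,
                             session_seconds: float = SESSION_SECONDS) -> float:
    """Convert transactions/session into transactions/second."""
    return transactions_per_session / session_seconds


ppr = {op: profile_performance_rate(n)
       for op, n in TRANSACTIONS_PER_SESSION.items()}

print(f"Additem: {ppr['Additem']:.6f} trans/sec")  # 1.5 / 1200 = 0.001250
print(f"Operations per session: {sum(TRANSACTIONS_PER_SESSION.values())}")  # 19.0
```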

ASP Performance Comparisons

In the following table, the optimum performance rate for each shopper operation is compared for Pentium Pro and Xeon configurations. These performance rates are also compared in Chart 3 (taller bars indicate greater performance).

Note Optimum performance rate is approximately equal to maximum ASP throughput, although it is typically lower. For purposes of Transaction Cost Analysis, each shopper operation is tested to determine optimum performance rate (in terms of ASP requests per second) for a specific system configuration. As load is increased, throughput increases and resource cost remains fairly constant. However, when a certain threshold is reached, throughput continues to increase somewhat, but resource cost increases geometrically. The optimum performance rate is determined to be the point at which ASP throughput is greatest prior to the geometric increase in cost.

Table 2: Comparing ASP Performance (ASP Requests/sec) for Pentium Pro and Xeon Configurations

VC Shopper Operation	ASP requests/sec (Pentium Pro)	ASP requests/sec (Xeon)	% Improvement (from Pentium Pro to Xeon)
Additem	3.897	5.777	48.24%
Basket	7.886	8.252	4.64%
Checkout	3.754	8.521	126.97%
Clearitems	7.169	9.373	30.75%
Default	28.895	44.850	55.22%
Delitem	4.570	7.168	56.85%
Listing	3.272	7.074	116.20%
Lookup	10.988	28.848	162.54%
New	9.722	29.591	204.37%
Main	4.414	14.670	232.35%
Product	3.305	8.737	164.36%
Search	7.555	14.051	85.98%
Welcome	32.523	104.262	220.58%
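The improvement column in Table 2 is the relative change in throughput from the Pentium Pro to the Xeon configuration. A short Python sketch of that calculation, using the Additem and Basket rows as inputs (the function name is illustrative):

```python
def percent_improvement(old_rate: float, new_rate: float) -> float:
    """Return the relative gain of new_rate over old_rate, in percent."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return 100.0 * (new_rate - old_rate) / old_rate


# ASP requests/sec from Table 2: (Pentium Pro, Xeon).
print(f"Additem: {percent_improvement(3.897, 5.777):.2f}%")  # 48.24%
print(f"Basket:  {percent_improvement(7.886, 8.252):.2f}%")  # 4.64%
```

Because the rates in the table are themselves rounded to three decimal places, recomputing other rows this way may differ from the printed percentage in the last digit.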

Chart 3: Comparing ASP Performance (ASP Requests/sec) for Pentium Pro and Xeon Configurations

Processor and Disk Costs

Processor and disk costs are calculated from CPU utilization and ASP requests per second. Cost is a calculation that represents the total number of CPU cycles and/or disk seeks required to perform a single transaction. These costs are compared for Pentium Pro and Xeon configurations.

Note Lower cost per transaction translates to more transactions for a given number of CPU cycles, which (with all other things being equal) translates to greater capacity. However, because the Xeon configuration provides twice as many CPU cycles for processing transactions, capacity for the Pentium Pro and Xeon configurations will ultimately be determined not only by cost per transaction but by maximum cycles available for processing transactions.
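This document does not restate the cost formula itself. The sketch below assumes a common Transaction Cost Analysis formulation: resources consumed per second at the optimum performance rate, divided by transactions completed per second. The function names are hypothetical, and the measurement values in the example are invented for illustration; they are not taken from the test results in this document.

```python
def cpu_cost_mcycles(cpu_utilization: float,
                     total_mhz: float,
                     requests_per_sec: float) -> float:
    """Return Mcycles consumed per transaction (assumed formulation).

    cpu_utilization is a fraction (0.25 means 25 percent); total_mhz is the
    combined clock rate of all processors in the server.
    """
    if requests_per_sec <= 0:
        raise ValueError("requests_per_sec must be positive")
    return cpu_utilization * total_mhz / requests_per_sec


def disk_cost_seeks(seeks_per_sec: float, requests_per_sec: float) -> float:
    """Return disk seeks required per transaction (assumed formulation)."""
    if requests_per_sec <= 0:
        raise ValueError("requests_per_sec must be positive")
    return seeks_per_sec / requests_per_sec


# Hypothetical measurement: 25 percent CPU on a 4 x 400-MHz server while
# sustaining 4.0 ASP requests/sec and 8.0 disk seeks/sec.
print(cpu_cost_mcycles(0.25, 1600, 4.0))  # 100.0 Mcycles per transaction
print(disk_cost_seeks(8.0, 4.0))          # 2.0 seeks per transaction
```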

Table 3: Comparing CPU and Disk Cost for Pentium Pro and Xeon Configurations

VC Shopper Operation	CPU Cost_Ppro¹	CPU Cost_Xeon	Disk Cost_Ppro²	Disk Cost_Xeon
Additem	53.350	73.541	1.971	0.821
Basket	32.091	52.696	0.407	1.146
Checkout	176.195	125.193	23.783	6.149
Clearitems	29.552	36.658	1.070	1.475
Default	5.622	5.665	0.008	0.000
Delitem	47.464	50.609	0.713	1.378
Listing	64.518	30.093	0.000	0.000
Lookup	17.182	10.295	0.015	0.000
New	20.935	12.269	0.796	0.526
Main	66.300	43.168	0.037	0.054
Product	81.048	55.050	0.047	0.126
Search	32.119	43.886	0.012	0.000
Welcome	6.270	4.544	0.003	0.000

1. CPU cost is measured in terms of Mcycles required to perform a single transaction, where one Mcycle is equal to one million CPU cycles.

2. Disk cost is measured in terms of disk seeks required to perform a single transaction.

The following chart uses the data from Table 3 to compare cost for each shopper operation for the Pentium Pro and Xeon configurations. Shorter bars indicate more efficient use of CPU resources (lower cost of operation).

Chart 4: CPU Cost by ASP for Xeon and Pentium Pro Configurations

Processor and Disk Calculations

CPU and disk costs found in Table 3 are multiplied by the Profile Performance Rate (PPR) found in the Shopper Profile (Table 1) to create weighted CPU and disk costs for each operation as shown in the table below (Table 4). For example, the CPU cost (Pentium Pro configuration) for the Additem operation is shown in Table 3 to be 53.350 Mcycles. The Shopper Profile indicates that the average shopper will generate 0.001250 Additem operations per second. This creates a weighted cost of 53.350 × 0.001250, or 0.0667 Mcycles per second.

The sum of the weighted costs provides a CPU and a disk cost per shopper per second, which can be used in simple formulas to predict capacity (the maximum number of concurrent shoppers).
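The weighting step can be sketched in Python as follows. The Additem and Basket inputs are the Pentium Pro CPU costs from Table 3 and the PPR values from Table 1; the function names are illustrative. Only two operations are shown, so the printed sum is a partial figure, not the full cost per shopper.

```python
def weighted_cost(cost_per_transaction: float, ppr: float) -> float:
    """Return cost per second for one operation: cost x Profile Performance Rate."""
    return cost_per_transaction * ppr


def cost_per_shopper(costs: dict, pprs: dict) -> float:
    """Sum the weighted costs over every operation present in both tables."""
    return sum(weighted_cost(costs[op], pprs[op]) for op in costs if op in pprs)


# Pentium Pro CPU cost (Mcycles) from Table 3 and PPR (trans/sec) from Table 1.
cpu_cost_ppro = {"Additem": 53.350, "Basket": 32.091}
ppr = {"Additem": 0.001250, "Basket": 0.001667}

print(f"{weighted_cost(53.350, 0.001250):.4f}")        # 0.0667 Mcycles/sec
print(f"{cost_per_shopper(cpu_cost_ppro, ppr):.4f}")   # 0.1202 (two operations only)
```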

Table 4: Comparing Cost per Shopper (K) for Pentium Pro and Xeon Configurations

VC Shopper Operation	Weighted CPU Cost_Ppro³	Weighted CPU Cost_Xeon	Weighted Disk Cost_Ppro⁴	Weighted Disk Cost_Xeon
Additem	0.0667	0.0919	0.002464	0.001026
Basket	0.0535	0.0878	0.000678	0.001910
Checkout	0.0735	0.0522	0.009918	0.002564
Clearitems	0.0123	0.0153	0.000446	0.000615
Default	0.0047	0.0047	0.000007	0.000000
Delitem	0.0198	0.0211	0.000297	0.000575
Listing	0.0269	0.0125	0.000000	0.000000
Lookup	0.0143	0.0086	0.000012	0.000000
New	0.0436	0.0256	0.001659	0.001096
Main	0.3315	0.2158	0.000185	0.000270
Product	0.1351	0.0918	0.000078	0.000210
Search	0.0268	0.0366	0.000010	0.000000
Welcome	0.0992	0.0719	0.000047	0.000000
Cost per Shopper per Second (K)	0.9079	0.7358	0.015802	0.008267

3. Weighted CPU cost is measured in terms of Mcycles/sec.

4. Weighted disk cost is measured in terms of disk seeks/sec.

Processor and Disk Equations

In the previous table (Table 4), the cost per shopper per second (K) is calculated for Xeon and Pentium Pro configurations. These values can be plugged into equations to calculate shopper capacity using the following formula:

Capacity (C) = Number of Shoppers (N) × Cost per Shopper per Second (K)

Below are the capacity equations for processor and disk created from the calculations for cost per shopper per second (K) that are shown on the bottom line of Table 4. Each equation is bound by a maximum value, which is equivalent to the maximum number of Mcycles and disk seeks available for each system.

For Xeon:

C_CPU = Min [ (N × 0.7358), 640 ]

C_DSK = Min [ (N × 0.008267), 280 ]

For Pentium Pro:

C_CPU = Min [ (N × 0.9079), 320 ]

C_DSK = Min [ (N × 0.015802), 280 ]
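The bounded form of these equations can be expressed as a small Python function. This is a sketch only: the function name is hypothetical, and the cost per shopper is passed in as a parameter so that the value measured for your own site can be substituted. The example uses the Xeon CPU cost per shopper per second from the bottom line of Table 4 (0.7358 Mcycles/sec) and the 640 Mcycles/sec ceiling.

```python
def resource_usage(shoppers: float,
                   cost_per_shopper: float,
                   max_available: float) -> float:
    """Return projected resource usage: Min[(N x K), maximum available].

    Units follow the inputs: Mcycles/sec for CPU, seeks/sec for disk.
    """
    if shoppers < 0:
        raise ValueError("shoppers must not be negative")
    return min(shoppers * cost_per_shopper, max_available)


XEON_CPU_K = 0.7358    # Mcycles/sec per shopper, bottom line of Table 4
XEON_CPU_MAX = 640.0   # Mcycles/sec available at 40 percent of 1600 MHz

# Below the ceiling, usage grows linearly with shopper load.
print(f"{resource_usage(500, XEON_CPU_K, XEON_CPU_MAX):.1f}")   # 367.9
# Once N x K exceeds the ceiling, usage is capped at the maximum.
print(f"{resource_usage(2000, XEON_CPU_K, XEON_CPU_MAX):.1f}")  # 640.0
```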

The following chart (Chart 5) shows the CPU usage for Pentium Pro and Xeon configurations based on the previously constructed CPU equations. The Pentium Pro reaches CPU capacity (320 Mcycles/sec) when shopper load is approximately 400. The Xeon's CPU capacity (640 Mcycles/sec) is reached when the shopper load is approximately 950. Increased shopper capacity for the Xeon configuration results from both a lower CPU cost per shopper and a higher maximum number of Mcycles available.

Chart 5: Projected CPU Costs Based on Shopper Load for Pentium Pro and Xeon

In the following chart (Chart 6), disk cost is compared for Xeon and Pentium Pro configurations. The upper limit for each configuration is based on the values for shopper capacity calculated for CPU (400 shoppers for the Pentium Pro and 950 shoppers for the Xeon). In both configurations, the disk is operating well below the maximum disk capacity of 280 seeks/sec.

Chart 6: Projected Disk Costs Based on Shopper Load for Pentium Pro and Xeon
