Bernadette Bly
Microsoft Developer Network
August 1999
Summary: Discusses the methodology and tools used in testing Phase 4 of the Duwamish Books sample application. (9 printed pages) Test results are provided and compared with the performance goals set for the sample.
Because business goals often change over time, an important design consideration for any business-critical system is scalability. What exactly do I mean by scalability? The dictionary defines scalability as “how well a solution to some problem will work when the size of the problem increases.” What it means for business systems is this: When some part of the system is overtaxed and all I have to do is add physical components to enhance performance, the system is scalable. If, on the other hand, I have to rewrite program code to handle an increase in users, the system is not scalable.
In this article, I will describe the tools and methodology used in testing the Duwamish Books, Phase 4 sample application. I will also provide the test results, compare them with the performance goals set for the Duwamish Books sample, and estimate the system hardware needed to meet those goals. (For a discussion of how our performance goals were determined, see the article “Setting Performance Goals for the Duwamish Books, Phase 4 Sample.”)
This article is broken down into five sections: testing configurations, test scripts, types of tests, performance counters, and test results.
To run the tests, two types of systems were configured:
Site Configuration 1 is based on one machine running all components: The Data Access Layer (DAL), Business Logic Layer (BLL), Workflow Layer (WFL), Web components, and the database.
Site Configuration 2 is based on two machines. Machine 1 runs the Workflow Layer (WFL), and Web components. Machine 2 runs the Business Logic Layer (BLL), Data Access Layer (DAL), and the database.
Because Site Configuration 1 must run all of the components of our application, the strain on the hardware should be greater than on Site Configuration 2, which distributes the work over two machines.
Table 1. Hardware and Software Testing Configurations
| Component/Software | Site Configuration 1 | Site Configuration 2, Machine 1 | Site Configuration 2, Machine 2 |
| --- | --- | --- | --- |
| Workflow Layer | X | X | |
| Business Logic Layer | X | | X |
| Data Access Layer | X | | X |
| Web components | X | X | |
| SQL Server 7.0 | X | | X |
| Internet Information Server (IIS) | X | X | |
| Computer model | Dell OptiPlex GX1p | Dell OptiPlex GX1p | Dell OptiPlex GX1p |
| Processor type | PIII-500 MHz | PIII-500 MHz | PIII-500 MHz |
| SCSI card | Adaptec 2940 UW PCI SCSI | Adaptec 2940 UW PCI SCSI | Adaptec 2940 UW PCI SCSI |
| System memory | 320 MB SDRAM, 100 MHz | 320 MB SDRAM, 100 MHz | 320 MB SDRAM, 100 MHz |
| Network card | 3COM Fast Etherlink XL PCI 10/100 | 3COM Fast Etherlink XL PCI 10/100 | 3COM Fast Etherlink XL PCI 10/100 |
Many other configurations are possible for our Duwamish Books sample. The configurations we chose may not be the best for you. You will need to decide what configuration will meet your performance goal.
A number of test scripts were created specifically to test Duwamish Books. These scripts consisted of a list of requests that would be sent to the server. For example:
GET /D4_1/det.asp?77 is a line of script that requests a specific detail page from the server. This detail page includes content such as the book title, author, price, and so on. See “The Duwamish Books, Phase 4 Workflow API Reference” for information about the parameters passed to det.asp.
To generate these scripts, two tools were used—the free, publicly available Microsoft® Web Application Stress tool (Web Stress) and a custom script-generator program (SGP), written in Microsoft Visual Basic®, that controls Web Stress through its Component Object Model (COM) interface. You can download Web Stress, and view more information about it, at http://homer.rte.microsoft.com/. The SGP program is included with the source code in the Phase 4 download.
Also included in the source code in the folder \\Duwamish\Phase 4\Testing is the Microsoft Access database file (WAS.mdb) containing all of the Web Stress scripts that I used in the testing. You can build your own test scripts, or you can use my test scripts. To use my scripts you must replace the original WAS.mdb (created when Web Stress was downloaded) with the WAS.mdb from the Testing folder.
For functionality whose scripts could neither be entered manually nor recorded efficiently, the script-generator program was created. This sample program was written in Visual Basic version 6.0 and uses the Web Stress COM interface to create the more complex test scripts.
It would take too much time to compose 100 or more lines of script manually, so I generated it programmatically. In the SGP program the random number functions in Visual Basic were used to generate a sequence of random item numbers. The item number was then concatenated to the end of the URL to produce a random request. The following code from the script-generator sample program demonstrates this process:
'CREATING GET URL REQUESTS
Randomize
For TargetCursor = 1 To TargetCount
    'CHOOSE RANDOM ITEM IN ID RANGE
    RandomItem = IDMin + _
        CLng(CDbl(IDMax - IDMin) * Rnd)
    'ADD URL TO SCRIPT
    Set m_oScriptItem = m_oScript.ScriptItems.Add
    m_oScriptItem.sVerb = "GET"
    m_oScriptItem.sURL = m_sTargetRoot & RandomItem
Next
Two types of tests were used to test the scalability of the Duwamish Books, Phase 4 sample application: a stress test, which taxes the processor and exercises our Cache object (for more information on the Cache object, see “Creating a Page Cache Object”), and a user simulation, which tests how the Duwamish Books sample application functions in a real-world scenario.
Simultaneous requests from multiple users measured the stress on the processor and showed how well the system performed during peak activity. If the hardware configuration was not powerful enough to meet the demand placed on it, a more powerful configuration would be needed. The efficiency of the Cache object was tested by running simulations both with and without caching of requested items; caching was disabled by commenting out the references to the Cache object in det.asp for Approach 1 and in xmldet.asp for Approach 3. I needed to find out two things: how well the processor would function when all requests were handled by SQL Server, and how efficiently the Cache object performed under stress conditions.
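Conceptually, the Cache object trades memory for database round trips: a repeat request for a popular detail page is answered from an in-memory store instead of requerying SQL Server. A minimal sketch of the idea follows (in Python for illustration only; the actual component is a Visual Basic COM object, and `fetch_detail_from_db` here is a hypothetical stand-in for the SQL Server query behind det.asp):

```python
# Minimal page-cache sketch: a dictionary that serves repeat requests
# without a database round trip. Illustrative only, not the real component.

page_cache = {}

def fetch_detail_from_db(item_id):
    # Hypothetical stand-in for the expensive SQL Server lookup.
    return f"<html>detail page for item {item_id}</html>"

def get_detail_page(item_id):
    # Serve from the cache when possible; otherwise hit the "database"
    # and cache the result for subsequent requests.
    if item_id not in page_cache:
        page_cache[item_id] = fetch_detail_from_db(item_id)
    return page_cache[item_id]

print(get_detail_page(77))  # first request populates the cache
print(get_detail_page(77))  # repeat request is served from memory
```

Disabling the cache, as we did for part of the stress test, is equivalent to calling `fetch_detail_from_db` on every request, which is why the cache-disabled runs put the full load on the database.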
This test is based on a customer-use scenario of 50 customers for every one sale. I wanted to know how our system would perform when subjected to the type of use it would receive once it went online.
In order to create the user simulation script, I worked from the performance model created in our performance-goal calculations.
I had to calculate what type and how many requests 50 customers would generate. I started with an overview of how customers would use the site.
There are three different ways in which a customer can move through our site, reach a detail page, and ultimately purchase a book.
Figure 1. Each customer begins a sale by starting with one of three search types.
Assuming that 60 percent of our visitors will use the category search, 30 percent will use the keyword search, 10 percent will use the English Query search, and 2 percent will purchase a book, I put together the following script scenario.
| Number of visitors by search type | Search type | GET/POST requests required to complete one search | Detail pages visited | Total GET/POST requests required |
| --- | --- | --- | --- | --- |
| 30 | Category | 4 | 1 | 150 |
| 15 | Keyword | 1 | 1 | 30 |
| 5 | English Query | 1 | 1 | 10 |
| GET/POST requests to complete a sale | | | | 1 |
| Total unique GET/POST requests included in the user simulation script | | | | 191 |
Note   Our sale actually requires 10 GET/POST requests. To simplify the script, we combined the transactions necessary to complete the sale into one request. That ASP file, saletest.asp, has been included in the source code.
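The script-scenario arithmetic above is easy to verify: each search type's visitors generate one request per search step plus one request per detail page visited, and the sale contributes one combined request. A quick check (Python, for illustration):

```python
# Verify the user-simulation request count from the scenario table:
# visitors * (requests per search + detail pages visited), plus one
# combined GET/POST request for the sale (saletest.asp).
scenario = [
    (30, 4, 1),  # Category: 30 visitors, 4 requests/search, 1 detail page
    (15, 1, 1),  # Keyword
    (5,  1, 1),  # English Query
]
total = sum(visitors * (search_requests + detail_pages)
            for visitors, search_requests, detail_pages in scenario)
total += 1  # the sale
print(total)  # 191
```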
Performance counters allow you to view information about the resources being used by the operating system. It is important to choose the right counters to view. Because there are many counters available to choose from, the counters you choose will be dictated by the information you need to track.
I wanted to see not only the number of requests our configurations could handle, but also the stress these requests were having on the server. I chose to track the percent of processor time used as well as how quickly the Web service handled requests from the browser. It was also important for me to track any Active Server Page (ASP) requests that were delayed or rejected.
While testing, I tracked the percent of processor time used on both the server and the client. It is important to check the client as well as the Web server: according to the Web Stress documentation, if the client’s processor utilization is more than 80 percent, the test is invalid. This is explained in more detail on the Web Application Stress Tool site (http://homer.rte.microsoft.com/).
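That client-side sanity check can be expressed as a simple rule over the sampled processor readings. A sketch (illustrative Python over hypothetical sample values; not part of the Web Stress tool itself):

```python
# Per the Web Application Stress documentation, a run is invalid if the
# test client's processor utilization exceeds 80 percent, because a
# saturated client cannot generate load fast enough to stress the server.
CLIENT_CPU_LIMIT = 80.0

def run_is_valid(cpu_samples):
    """Return True only if no sampled client CPU reading exceeds the limit."""
    return all(sample <= CLIENT_CPU_LIMIT for sample in cpu_samples)

print(run_is_valid([42.5, 61.0, 73.8]))  # True: client not saturated
print(run_is_valid([42.5, 91.2, 73.8]))  # False: results are suspect
```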
I used the following performance counters.
| Counter | Why it was tracked |
| --- | --- |
| Processor\% Total Processor Time | All of the work done on the system depends on the processor. If the processor is overworked, the efficiency of the entire system is affected. |
| Active Server Pages\Requests/Sec | Tracks the number of page requests being sent to the system. |
| Active Server Pages\Request Wait Time | Tracks how long a request sits in the queue waiting to be processed. |
| Active Server Pages\Requests Queued | Tracks how many requests are sitting in the queue at any given time. |
| Active Server Pages\Requests Rejected | If the system is overloaded, requests are stored in the queue. Once the queue is full, further requests are rejected. |
The Phase 4 system has been designed to be scalable. If the performance goals cannot be met by the existing hardware, performance testing should help us predict what is needed to meet our goals.
We were pleased with the results of our stress test. They show that our Cache object can greatly improve customer response time when all, or a percentage, of the popular pages (best-selling books or items) are pre-cached. Our user simulation tests revealed that both configurations could handle more requests than our goal of 24.2 requests per second. Only one test, however—Approach 3 on Configuration 2—met our goal of less than 1 second for both time to first byte and time to last byte. We intend to do further testing with additional hardware and will publish our findings as soon as they are available.
The results of our testing are as follows:
The point of the stress test was to see how many requests per second our system could handle before performance peaked and the server began to reject requests. We ran tests utilizing the cache component as an in-process COM component. Although it can be utilized as a Microsoft Transaction Server (MTS) component, we found an increase in requests per second of more than 40 percent when it was run in process. There is a risk, however, with using the cache component in process. If the data contained within the cache becomes corrupted, it can bring down the server.
Our cache component was tested by running tests both with pre-cached records and with the Cache object disabled. As you can see from the results, pre-caching popular selections can greatly improve the response time of the system.
All tests were run using 100 items and 300 threads spread over three clients (100 threads/3 sockets per client). Site Configuration 2 (utilizing two machines) was used for this test: Machine 1 runs the Workflow Layer (WFL) and Web components; Machine 2 runs the Business Logic Layer (BLL), Data Access Layer (DAL), and the database.
| Approach 1 (HTML) | RPS | TTFB (ms) | TTLB (ms) |
| --- | --- | --- | --- |
| While caching records | 342.20 | 648.50 | 649.80 |
| All records cached | 393.48 | 499.9 | 501.2 |
| Cache disabled | 18.47 | 14433.5 | 14434.8 |

| Approach 3 (XML) | RPS | TTFB (ms) | TTLB (ms) |
| --- | --- | --- | --- |
| While caching records | 400.52 | 731.80 | 732.70 |
| All records cached | 463.12 | 647.7 | 648.5 |
| Cache disabled | 20.17 | 13662.4 | 13662.8 |
RPS: Requests per second.
TTFB: Time to first byte in milliseconds.
TTLB: Time to last byte in milliseconds.
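The payoff of the Cache object is easy to quantify from the stress-test figures above by comparing fully cached throughput against the cache-disabled runs (Python, for illustration):

```python
# Throughput ratio of "all records cached" vs. "cache disabled",
# using the RPS figures from the stress-test tables above.
html_cached, html_uncached = 393.48, 18.47   # Approach 1 (HTML)
xml_cached,  xml_uncached  = 463.12, 20.17   # Approach 3 (XML)

print(round(html_cached / html_uncached, 1))  # 21.3 (times more RPS)
print(round(xml_cached / xml_uncached, 1))    # 23.0
```

In other words, with every record pre-cached, both approaches sustained more than 20 times the request rate of the same configuration with the cache disabled.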
For our user simulation test, we increased the number of threads (simulating multiple users) in the Web Stress tool until we got the requests-per-second, time-to-first-byte, and time-to-last-byte numbers required to meet our goal. We did not use the throttling (simulating a 28.8 or 56 Kbps modem) or request delay available in Web Stress, because we were calculating our total user number based on system performance. This allowed us to test with a minimal number of client test machines.
TEST RESULTS FOR APPROACH 1: CONFIGURATION 1
OVERALL
BY GROUP
| Page | Total hits | Average TTFB (ms) | Average TTLB (ms) | Actual RPS | Needed RPS |
| --- | --- | --- | --- | --- | --- |
| cat | 17676 | 2403.52 | 2984.76 | 19.66 | 15.24 |
| det | 7974 | 2186.91 | 2714.90 | 8.86 | 6.35 |
| sresults/keyword | 2617 | 109.21 | 3694.9 | 2.91 | 1.91 |
| sresults/query | 848 | 119.93 | 4837.62 | 0.94 | 0.64 |
| sale | 100 | 1594.03 | 1942.74 | 0.11 | 0.13 |
TEST RESULTS FOR APPROACH 3: CONFIGURATION 1
OVERALL
BY GROUP
| Page | Total hits | Average TTFB (ms) | Average TTLB (ms) | Actual RPS | Needed RPS |
| --- | --- | --- | --- | --- | --- |
| cat | 24098 | 1272.81 | 1290.82 | 26.78 | 15.24 |
| det | 11210 | 2818.40 | 2819.83 | 12.46 | 6.35 |
| sresults/keyword | 3465 | 4043.09 | 4044.50 | 3.85 | 1.91 |
| sresults/query | 1101 | 6007.53 | 6009.26 | 1.22 | 0.64 |
| sale | 181 | 2170.36 | 2173.31 | 0.20 | 0.13 |
TEST RESULTS FOR APPROACH 1: CONFIGURATION 2
OVERALL
BY GROUP
| Page | Total hits | Average TTFB (ms) | Average TTLB (ms) | Actual RPS | Needed RPS |
| --- | --- | --- | --- | --- | --- |
| cat | 37084 | 756.82 | 1294.03 | 41.20 | 15.24 |
| det | 15730 | 844.42 | 1378.91 | 17.48 | 6.35 |
| sresults/keyword | 4833 | 110.41 | 1770.16 | 5.37 | 1.91 |
| sresults/query | 1511 | 123.92 | 2191.28 | 1.69 | 0.64 |
| sale | 277 | 524.4 | 1180.58 | 0.31 | 0.13 |
TEST RESULTS FOR APPROACH 3: CONFIGURATION 2
OVERALL
BY GROUP
| Page | Total hits | Average TTFB (ms) | Average TTLB (ms) | Actual RPS | Needed RPS |
| --- | --- | --- | --- | --- | --- |
| cat | 50234 | 683.37 | 699.10 | 55.82 | 15.24 |
| det | 23011 | 1097.88 | 1100.33 | 25.57 | 6.35 |
| sresults/keyword | 6917 | 1614.30 | 1615.88 | 7.69 | 1.91 |
| sresults/query | 2358 | 2400.53 | 2402.46 | 2.62 | 0.64 |
| sale | 395 | 864.15 | 867.31 | 0.44 | 0.13 |
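The “Needed RPS” column is the same in every table above, and its per-page values sum to the overall performance goal. A quick check (Python, for illustration):

```python
# The per-page "Needed RPS" targets from the user-simulation tables.
needed_rps = {
    "cat": 15.24,
    "det": 6.35,
    "sresults/keyword": 1.91,
    "sresults/query": 0.64,
    "sale": 0.13,
}
print(round(sum(needed_rps.values()), 2))  # 24.27, i.e. the ~24.2 RPS goal
```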
Performance testing should be an integral part of any development project. Testing not only the end product, but also each step in development, should be a standard practice. A good performance test will provide a standard against which future tests can be measured.
Once you know the relationship between your system configuration and current performance, you can certify that the application is scalable and can meet increasing performance goals.
Comments? We welcome your feedback at duwamish@microsoft.com.