Normal Distribution

You specify the amount of time to spend in each phase of the interaction cycle. Each parameter is an independent dimension of the workload. The values actually applied in each dimension vary from one cycle to the next, and are normally distributed according to two simple parameters that you supply. The normal distribution (better known as the bell-shaped curve) has some good characteristics for applying known workloads.

When a workload has a normal distribution, about two-thirds of the samples are within plus or minus one standard deviation of the mean. About 95% of the samples are within plus or minus two standard deviations of the mean. And about 99% of the samples are within plus or minus three standard deviations of the mean. A standard deviation of zero causes a workload equal to the mean to be applied constantly. So (you guessed it), you supply the mean and the standard deviation of each phase and, voilà, a precisely defined workload.

Response Probe uses a folded normal distribution for some of its parameters. Actually there is no such thing in statistics, so we invented it. That is because some of the dimensions have upper and lower boundaries. None of the dimensions can realistically be negative, for example. (That's what we need: negative compute time, so we'd be done before we start. Now that's fast!) And Response Probe will not access a file beyond its beginning or end. So if the computation of a workload parameter ends up beyond a boundary of the dimension, the value is arithmetically folded back towards the mean. If it then happens to go past the boundary on the other side, it is again folded back towards the mean, and this process continues until a value within the boundaries of the dimension is returned.

For example, suppose a dimension has boundaries of 1 and 100. If the computation of a parameter yields 102, it is folded back into the boundary and becomes 98. If a computation yields 205, it becomes 5 when folded back within the boundaries.
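The folding rule can be sketched in a few lines of Python. This is only an illustration, not Response Probe's actual code: the function name fold is hypothetical, and we assume that "folding" means plain reflection about whichever boundary was crossed. Under that assumption, 102 folds to 98 with boundaries of 1 and 100, and 205 folds to 5 when the lower boundary is taken as 0. With a lower boundary of 1, plain reflection would give 7 instead, so Response Probe's exact arithmetic may differ slightly from this sketch.

```python
def fold(value, low, high):
    """Reflect value back into [low, high] until it lies inside the range.

    Assumption: folding is simple reflection about the crossed boundary.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if low == high:
        return low  # a zero-width dimension has only one legal value
    while value < low or value > high:
        if value > high:
            value = 2 * high - value  # fold back from the upper boundary
        else:
            value = 2 * low - value   # fold back from the lower boundary
    return value


print(fold(102, 1, 100))  # 98
print(fold(205, 0, 100))  # 5
print(fold(50, 1, 100))   # 50 (already inside, so unchanged)
```

The loop repeats because a value far enough outside the range can overshoot the opposite boundary after the first fold, just as the text describes.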

Response Probe uses the following formula to calculate normally distributed values. If the result is beyond the boundaries, it is folded back into the specified range as described previously.

Normal = Mean + (-7 + Sum(14 Random Numbers [0..1])) * Standard Deviation
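Here is a minimal Python sketch of this formula. The name probe_normal and the rng parameter are hypothetical and exist only for illustration. One thing worth knowing: the sum of 14 uniform random numbers has a mean of 7 and a variance of 14/12, so the bracketed term has a standard deviation of about 1.08 rather than exactly 1. Values generated this way therefore spread roughly 8% wider than the standard deviation you supply.

```python
import random
import statistics


def probe_normal(mean, std_dev, rng=random):
    """Approximate a normal sample using the formula from the text.

    rng is any object with a random() method returning a float in [0, 1).
    """
    total = sum(rng.random() for _ in range(14))  # mean 7, variance 14/12
    return mean + (total - 7.0) * std_dev


# Usage: draw samples with a seeded generator so the run is repeatable.
rng = random.Random(42)
samples = [probe_normal(500, 166, rng) for _ in range(20000)]
print("sample mean :", round(statistics.mean(samples), 1))
print("sample stdev:", round(statistics.stdev(samples), 1))
```

The values returned here are not yet folded; any result outside the boundaries of the dimension would still need to be folded back into range.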

To get some idea of how the parameters of a normal distribution work, take a look at Figure C.2. It shows how access to a file with 1000 records is distributed when the mean is placed at record 500 and various standard deviations are supplied. When the standard deviation is one-third of the mean (500/3, or about 166), we get the familiar bell-shaped curve. By the time the standard deviation is equal to the mean, at 500 records, we get a nearly uniform random distribution of access across the file. If this were a pure normal distribution, it would look instead like the central two-thirds of the bell-shaped curve. Because the accesses that would fall beyond the ends of the file are folded back into the file around the endpoints, the nearly uniform distribution results instead.

Figure C.2 Normally distributed curves produced by various standard deviations

If we had selected a standard deviation of 0, all accesses would have been to record 500 of the file.
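To see these shapes without the figure, the experiment can be approximated with a short simulation. This is a sketch under stated assumptions, not a reproduction of Response Probe: the helper names are hypothetical, records are assumed to be numbered 1 through 1000, folding is assumed to be plain reflection about the crossed boundary, and each result is rounded to the nearest record. The script counts accesses in ten buckets of 100 records each for standard deviations of 0, 166, and 500.

```python
import random


def probe_normal(mean, std_dev, rng):
    # Formula from the text: Mean + (-7 + sum of 14 uniform [0..1] numbers) * SD
    return mean + (sum(rng.random() for _ in range(14)) - 7.0) * std_dev


def fold(value, low, high):
    # Assumed folding rule: reflect about whichever boundary was crossed
    while value < low or value > high:
        value = 2 * high - value if value > high else 2 * low - value
    return value


def access_histogram(std_dev, samples=20000, seed=1):
    """Count accesses per 100-record bucket for a 1000-record file."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(samples):
        record = round(fold(probe_normal(500, std_dev, rng), 1, 1000))
        counts[(record - 1) // 100] += 1  # records 1..100 go to bucket 0, etc.
    return counts


for sd in (0, 166, 500):
    print(f"SD {sd:>3}: {access_histogram(sd)}")
```

With a standard deviation of 0, every access lands in the bucket that holds record 500. At 166 the counts rise toward the middle buckets, and at 500 the folding spreads them almost evenly across the file.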

Now let's take a look at the operation of the various phases of Response Probe.