Automatic Statistics Creation and Refresh

When you create an index, SQL Server 7.0 automatically stores statistical information regarding the distribution of values in the indexed columns. It also supports statistics on nonindexed columns. The query optimizer uses these statistics to estimate the size of intermediate query results as well as the cost of using the index for a query.

If the query optimizer determines that the statistics needed to optimize a query are missing, it automatically creates them and saves them in the database. Moreover, it automatically updates the statistics as the data in a table changes, and it eventually discards the statistics if they are not reused.

Statistics are created and refreshed very efficiently by sampling. The sampling is random across data pages and taken from a table or nonclustered index for the smallest index containing the columns needed by the statistics. The volume of data in the table and the amount of changing data determine the frequency with which the statistical information is updated. For example, the statistics for a table containing 10,000 rows may need updating when 1,000 rows have changed because 1,000 is a significant percentage of the table. However, for a table containing 10 million rows, 1,000 changes are less significant.