Designing the Data Warehouse

The development phase of the data warehousing process often begins with the creation of a dimensional model that describes the important metrics and dimensions of the selected subject area based on user requirements. Unlike online transaction processing (OLTP) systems that organize data in a highly normalized manner, the data in the data warehouse is organized in a highly denormalized manner to improve query performance when stored in a relational database management system.

Relational databases often use star or snowflake schemas to provide the fastest possible response times to complex queries. Star schemas contain a denormalized central fact table for the subject area and multiple dimension tables for descriptive information about the subject’s dimensions. The fact table can contain many millions of rows. Commonly accessed information is often preaggregated and summarized to further improve performance.

Although the star schema is primarily considered a tool for the database administrator to increase performance and simplify data warehouse design, it also represents data warehouse information in a way that makes better sense to end users.