Microsoft Repository:
The Glue that Binds
the Data Warehouse

One of the greatest implementation challenges is integrating all of the tools required to design, transform, store, and manage a data warehouse. The ability to share and reuse metadata reduces the cost and complexity of building, using, and managing data warehouses. Many data warehousing products include a proprietary metadata repository that cannot be used by any other components in the data warehouse. Each tool must be able to access, create, or enhance the metadata created by any other tool easily, while also extending the metadata model to meet the specific needs of the tool.

Consider the example of a data warehouse built with shared metadata. Metadata from operational systems is stored in the repository by design and data transformation tools. This physical and logical model is used by transformation products to extract, validate, and cleanse the data prior to loading it into the database. The database management system may be relational, multidimensional, or a combination of both. The data-access and data-analysis tools provide access to the information in the data warehouse. The information directory integrates the technical and business metadata, making it easy to find and launch existing queries, reports, and applications for the data warehouse.

The Microsoft Data Warehousing Framework is centered upon shared metadata in Microsoft Repository, which is a component of Microsoft SQL Server 7.0. Microsoft Repository is a database that stores descriptive information about software components and their relationships. It consists of an open information model (OIM) and a set of published COM interfaces.

OIMs are object models for specific types of information and are flexible enough to support new information types as well as extensible enough to fit the needs of specific users or vendors. Microsoft has developed OIMs in collaboration with the software industry for database schema, data transformations, and online analytical processing (OLAP). Future models may include replication, task scheduling, semantic models, and an information directory that combines business and technical metadata.

The Meta Data Coalition, an industry consortium of 53 vendors dedicated to fostering a standard means for vendors to exchange metadata, has announced support for Microsoft Repository, and Microsoft Repository OIMs have received broad third-party support.