Modeling a Distributed Inventory

Duwamish Books, Phase 2

Steve Kirk
Microsoft Developer Network

June 1998

Summary: Presents the revised data model for Phase 2 of the Duwamish Books sample. (4 printed pages)

Introduction

The Phase 1 Duwamish Books database captured the business processes of a single retail store in a simple data model, as documented in "Designing a Simple Retail-Oriented Database." In Phase 2 I'll extend the model to handle expansion of the business to multiple stores, each with multiple checkout points. As Duwamish Books grows to more, and larger, stores it must accommodate the increased load of multiple (and busier) users at each location and maintain data consistency across the organization. Our solution is not only to improve the transactional efficiency of the database design but also to use the database more efficiently by distributing the application. The data access tier, introduced in Phase 2 between the client applications and the database, is the first step toward a distributed system. A distributed application multiplies the effect of database improvements by reducing the number of simultaneous users that the database actually sees. Using the database more efficiently allows more transactions to be executed against live data (rather than against a replica or a queue), with a resulting improvement in data consistency across the organization.

In this article I'll present a revised data model and then show how this model addresses increased load and data consistency across the organization. I'll also show how the model addresses some more specific use cases.

Phase 2 of the Duwamish Books sample ships with a Microsoft® Access .mdb database, although this technology supports fewer concurrent users than Microsoft SQL Server™ does. SQL Server will be required beginning in Phase 3, where the application architecture most effectively uses its power. I'll note performance differences between SQL Server and the Access .mdb database throughout this article, but in this phase the use of SQL Server is optional, and conversion of the Access database to SQL Server is left to the user.

Distributed Inventory Data Model

In the following section, I'll propose changes to the data model so that inventory can be tracked by store. See "Designing a Simple Retail-Oriented Database" for documentation of the database model as used in Phase 1.

All entities that represent the flow of inventory through the organization are tracked at the store level in Phase 2. This is accomplished by adding a required StoreID attribute to the Order, Sale, and Inventory Tracking entities. The StoreID field in these tables is a foreign key into the new Stores table.
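As a sketch of this shape (using SQLite here purely for illustration; the sample itself ships an Access .mdb), the required StoreID foreign key might look like the following. Only the table and field names come from the data model; everything else is an assumption:

```python
import sqlite3

# Illustrative sketch only: SQLite stands in for the Access .mdb to show
# the relational shape. Each inventory-flow table (Orders is shown here)
# now carries a required StoreID referencing the new Stores table.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the reference
conn.executescript("""
CREATE TABLE Stores (
    PKId INTEGER PRIMARY KEY AUTOINCREMENT,
    Code TEXT UNIQUE,
    Name TEXT
);
CREATE TABLE Orders (
    PKId      INTEGER PRIMARY KEY AUTOINCREMENT,
    StoreID   INTEGER NOT NULL REFERENCES Stores(PKId),
    OrderType INTEGER NOT NULL
);
""")
```

With the reference enforced, an Order row cannot name a store that does not exist, which is exactly the consistency the foreign key is meant to buy.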

Stores Table

Field name   Data type    Field attributes and relationships   Description
PKId         AutoNumber   Primary key
Code         Text(50)     Unique                               Identifies store
Name         Text(50)                                          Store name
Address1     Text(50)                                          Address line 1
Address2     Text(50)                                          Address line 2
City         Text(50)                                          City
State        Text(2)                                           State
ZipCode      Text(10)                                          Zip code
Phone        Text(50)                                          Phone number
Fax          Text(50)                                          Fax number

In Phase 1 stock and reorder level quantities were attributes of the Item entity, which described an item of store merchandise. In Phase 2 inventory attributes are removed from the item entities so that the Items table becomes a pure catalog. Inventory tracking is moved to the new Inventory table. Each Inventory entity associates an Item with a Store and holds stock and reorder level attributes for the association.

Because each store may choose to carry only a subset of all items, the range of items that a particular store carries is modeled by the existence of a corresponding inventory record.
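Under this convention, "which items does store S carry?" reduces to an existence test against Inventory. A sketch of that query, again using SQLite with hypothetical catalog data:

```python
import sqlite3

# Sketch (SQLite, hypothetical data): a store "carries" an item exactly
# when a matching Inventory row exists for the Item/Store pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items     (PKId INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Inventory (PKId INTEGER PRIMARY KEY,
                        ItemID INTEGER, StoreID INTEGER,
                        InStock INTEGER, ReorderLevel INTEGER);
INSERT INTO Items     VALUES (1, 'Moby Dick'), (2, 'Walden');
INSERT INTO Inventory VALUES (1, 1, 10, 3, 2);  -- store 10 carries item 1 only
""")

def items_carried(conn, store_id):
    """Return titles of the items the given store carries."""
    rows = conn.execute("""
        SELECT i.Title FROM Items i
        WHERE EXISTS (SELECT 1 FROM Inventory v
                      WHERE v.ItemID = i.PKId AND v.StoreID = ?)""",
        (store_id,)).fetchall()
    return [r[0] for r in rows]
```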

Inventory Table

Field name     Data type    Field attributes and relationships                Description
PKId           AutoNumber   Primary key
ItemID         Long         Foreign key to Items table (enforced reference)   Identifies item
StoreID        Long         Foreign key to Stores table (enforced reference)  Identifies store
InStock        Integer                                                        Quantity of item in stock
ReorderLevel   Integer                                                        Stock level to trigger reorder

Stores may now transfer merchandise to one another. An OtherStoreID attribute has been added to the Order entity, and the IsSale attribute has been replaced with an OrderType attribute that designates TRANSFER_PURCHASE and TRANSFER_SALE order types in addition to PURCHASE and SALE.
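One natural reading of this model is that a transfer produces a pair of Order rows, one per store, written together. A sketch of that pairing (SQLite again; the order-type constants and helper function are my assumptions, only the OrderType and OtherStoreID fields come from the data model):

```python
import sqlite3

# Sketch: record both sides of an inter-store transfer atomically.
# The numeric order-type codes are illustrative assumptions.
PURCHASE, SALE, TRANSFER_PURCHASE, TRANSFER_SALE = range(4)

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Orders (
    PKId         INTEGER PRIMARY KEY AUTOINCREMENT,
    StoreID      INTEGER NOT NULL,
    OtherStoreID INTEGER,
    OrderType    INTEGER NOT NULL)""")

def record_transfer(conn, from_store, to_store):
    """Write the TRANSFER_SALE and TRANSFER_PURCHASE rows together."""
    with conn:  # one transaction: both rows commit, or neither does
        conn.execute(
            "INSERT INTO Orders (StoreID, OtherStoreID, OrderType) VALUES (?, ?, ?)",
            (from_store, to_store, TRANSFER_SALE))
        conn.execute(
            "INSERT INTO Orders (StoreID, OtherStoreID, OrderType) VALUES (?, ?, ?)",
            (to_store, from_store, TRANSFER_PURCHASE))
```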

In the multiple store model, transactions are executed in the context of both an employee and a store. The application maintains state about both the current store and the current employee. The current employee is determined by the logged-on user, while the current store is saved in the registry and may be changed only by system administrators. The new Admin attribute of the Employee entity designates employees who may change the current store or perform other administrative functions.

Design Issues

In this section I'll test the revised data model against the design criteria to demonstrate the workability of the design at this phase and to identify issues to be handled in the next phases.

Multiple Stores × Multiple Users per Store

Performance in two critical areas limits the ability of the system to accommodate increased load. All components of the system require responsive queries against the Items and Inventory tables and most system components require quick updates to the Inventory table. The Point of Sale (POS) application is the most intensive consumer of database transactions and may make multiple queries against the Items table and multiple updates of the Inventory table per sale.

Query Performance

The database model changes improve query performance against the Items table by removing the inventory fields from it, which eliminates most update transactions against the table.

The Items table now changes rarely (relative to query frequency), so the cost of index updates is low when compared to the benefits that indexes provide to queries. In addition to the primary key (PKId), the Items table is indexed on all search-worthy fields to improve query performance.
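The indexing idea can be sketched as follows (SQLite for illustration; the catalog field names here, such as Title and Author, are assumptions, as the article does not list the Items fields):

```python
import sqlite3

# Sketch (SQLite): index every search-worthy Items field so catalog
# queries can seek rather than scan. Field names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (
    PKId   INTEGER PRIMARY KEY,
    Title  TEXT,
    Author TEXT,
    ISBN   TEXT
);
CREATE INDEX IX_Items_Title  ON Items (Title);
CREATE INDEX IX_Items_Author ON Items (Author);
CREATE INDEX IX_Items_ISBN   ON Items (ISBN);
""")
# The query planner can now satisfy a Title search with an index seek
# instead of a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Items WHERE Title = ?", ("x",)
).fetchall()
```

Because the Items table rarely changes, the index-maintenance cost these statements imply is paid infrequently while the seek benefit is collected on every query.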

Return time for queries against the Items table will be reduced dramatically in Phase 3 when the business logic tier executes queries against the Items table through stored procedures. Most queries will then be against the (infrequently rebuilt) cache with good performance.

The Inventory table has indexes on the (never changing) ItemID and StoreID fields, which will assist Inventory queries without slowing updates against the InStock field.

Inventory UPDATE Transactions

The most likely point for contention to occur is in the Inventory table as multiple users attempt to UPDATE the InStock field of an Inventory record. The design reduces potential for this type of contention because there is a separate record for each Item/Store association and the number of users attempting to update a record cannot exceed the number of users at the store.

Transactions that affect inventory do have a limiting factor in Phase 2 that will be reduced in Phase 3. In each of these transactions a record is INSERTED into the InventoryTransaction table and then the Inventory record InStock field is UPDATED to reflect the change. In Phase 2 the application makes two round trips to the database to accomplish the INSERT and UPDATE. In Phase 3 the business logic tier will call a stored procedure to accomplish the same work. This will result in one round trip invoking a compiled procedure rather than executing dynamic SQL. The transaction will not only execute faster, but it will also lock tables for a shorter period, providing greater concurrency.
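The INSERT-then-UPDATE pairing described above can be sketched like this (SQLite; the simplified schema and helper are my assumptions — in the actual sample the two statements are separate round trips from the application, and Phase 3 folds them into a stored procedure):

```python
import sqlite3

# Sketch of the Phase 2 pattern: record the inventory transaction,
# then adjust the stock level, as one atomic unit of work.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (ItemID INTEGER, StoreID INTEGER, InStock INTEGER);
CREATE TABLE InventoryTransaction (ItemID INTEGER, StoreID INTEGER, Delta INTEGER);
INSERT INTO Inventory VALUES (1, 1, 10);
""")

def adjust_stock(conn, item_id, store_id, delta):
    """INSERT the audit row and UPDATE the stock level together."""
    with conn:  # both statements commit together, or neither does
        conn.execute(
            "INSERT INTO InventoryTransaction VALUES (?, ?, ?)",
            (item_id, store_id, delta))
        conn.execute(
            "UPDATE Inventory SET InStock = InStock + ? "
            "WHERE ItemID = ? AND StoreID = ?",
            (delta, item_id, store_id))
```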

Conclusion

The move to a multiple store model necessitates a separation of catalog and inventory data that also improves performance. This more fully normalized model reduces coupling between inventory updates and catalog queries and improves the performance of both.

The data access tier, separated from the applications in this phase, can multiply the effect of database server/engine upgrades by effectively reducing concurrent database connections. Although you can realize improved performance by using SQL Server in this phase, the most dramatic performance improvements will occur in Phase 3 where the business logic tier invokes stored procedures rather than dynamic SQL to reduce round trips, exploit query caching, and reduce lock contention.