This article is the first of several installments on Access database replication. It starts with a discussion of why database replication is important to Access developers, presents the basic technology, and explores some typical replication scenarios. This is followed by an overview of the new replication features of Access 2000. Since replication is in its third version with Access 2000, it's a mature feature with substantial subtlety and ease of use. The article concludes with several database replication examples worked through the user interface. This section reinforces the core concepts and creates a foundation for more advanced coverage of the topic in subsequent installments.
Remaining installments will explore advanced features and functionality for Access database replication. The next installment will demonstrate coding techniques for controlling database replication through Jet and Replication Objects (JRO). Other installments will focus on such topics as Replication Manager, using database replication over the Web, and interoperability with Microsoft SQL Server. Replication Manager supports many advanced features, such as indirect replication, and replication over the Internet via HTTP (HyperText Transfer Protocol).
The Benefits of Replication
Scalability. Historically, Access and Jet target small workgroup solutions. An important Access design goal is sufficient simplicity for end users to build their own applications. This explicitly eliminates the need for a database administrator as is necessary with most high-end database managers, such as Oracle, or - to a lesser degree - Microsoft SQL Server. One tradeoff for the simplicity is scalability. For example, Access/Jet applications can't take advantage of multiple processors. These applications are also "architecturally impaired" because they must pass the database file around a network, instead of just SQL statements and return sets. The requirement slows performance because adding more users lengthens the queue waiting for a turn with the database file.
Database replication helps to minimize the scalability issue by letting two or more database copies function as one. By placing each copy on a computer of its own, Access developers can architect solutions that scale by using multiple computers instead of multiple processors on a single computer. As the performance of a database on a physical database server deteriorates with each additional user, a developer/administrator can replicate the database and run it on multiple servers. By distributing the user load across multiple database servers, replication can multiply the number of users who can simultaneously work with a database.
Synchronization is the process that enables an organization to use multiple replicas of a database as one. When replicas synchronize, they exchange just the changes between database copies. This permits users to make independent changes to copies and merge them. The term "real-time-enough" characterizes Access replication. This is because the copies won't necessarily be exactly the same. Immediately after a successful synchronization, the replicas can be identical. As users modify their copies between synchronization cycles, however, they grow different.
As your replica set gains more database copies, bringing the copies into simultaneous alignment can be more challenging. This is because you need to fully exchange the changes from all replicas with one another. Because synchronization always occurs between pairs of replicas, you can require multiple rounds of synchronization to fully transfer all the changes in each replica to other replicas in a set. The topology, or geometry, of your replica set can impact the latency of transferring updates throughout the replica set members.
FIGURES 1 and 2 contrast linear and hub topologies for linking Replicas A through D for a database. The double-headed arrows indicate pair-wise synchronization. FIGURE 1 takes three rounds of synchronization for the changes in Replica D to migrate to Replica A (recall the synchronization constraint that exchanges occur between replicas on a pair-wise basis).
FIGURE 1: Linear topology requires three rounds of
synchronization to transfer data from Replica D to Replica A.
In FIGURE 2, the transfer of Replica D changes to Replica A occurs with a one round of replication in the hub topology. This demonstrates the dramatic impact that topology can have on the transfer of data.
FIGURE 2: The hub topology requires only one round of
synchronization to transfer data from Replica D to Replica A.
Partial replicas. Access database replication offers other benefits besides scalability. For example, it readily accommodates disconnected workers, such as traveling salespersons. This enables traveling employees to synchronize their database copies immediately before departure and after return. This "road warrior" scenario is enhanced by partial replicas. A partial replica doesn't contain all the records in a database, but just a subset, such as the customer and order records for a particular salesperson. The reduced database size helps to speed synchronization. As you consider applying database replication to branch offices, this partial replica feature becomes even more valuable because these offices are likely to connect to the headquarters' computer over a slower data communication line than a desktop computer at headquarters.
Data warehousing. Replicas and partial replicas are also invaluable for creating data warehousing solutions. For example, a department that needs to make many queries on a database (e.g. for reporting, statistical, decision-support, technical support, etc.) can use a replica or partial replica, instead of hitting the main database with their specialized demands.
Propagating design changes. There is at least one other significant core Access database replication feature: its ability to transfer design changes throughout a replica set. With this feature, a developer can retain one member of a replica set. As changes to the system become necessary over time, the developer can make those changes on a special replica, and then synchronize design changes for this database copy with all the other replicas. In this way, the developer can efficiently propagate design changes just as other workers synchronize data changes. This feature eliminates the need to distribute new application files because an organization can just propagate design changes across a replica set.
You can only propagate database design, as opposed to database content changes, from a Design Master replica. This replica is a special one. Consider restricting its administration to a developer or another special individual with database design responsibility. After verifying the performance of schema or application object changes, developers can synchronize the Design Master with other replica set members. Schema changes impact a database's design, such as the addition of a new table or relationship, and application object updates change the behavior of, or make, new forms, reports, and data access pages. A replica set can have only one Design Master at a time. However, the replica serving as the Design Master can change as circumstances require, e.g. if a key office member goes on vacation.
When to Use Replication
One common complaint about Access solutions is that they can only tolerate an upper limit of about a score of users. By making multiple replicas - each on a different server - an application architect can multiply the number of users served by an application. The rapidly declining price of desktop computers, such as those that might support Access database applications, escalates the attractiveness of this approach. If the need for reflecting changes made on different servers is low, you might be able to restrict synchronization to hours when the database is out of operation.
When there are more than a few changes per unit of time, it may be preferable to perform synchronization on a regular schedule. This is because the synchronization time increases with the number of changes. Furthermore, replicas diverge as the number of updates on replicas increases.
Access replication is particularly friendly to database changes that add new records, as opposed to ones that update records. When users can make conflicting changes to a database in different replicas, Access replication will need to resolve the conflict. You can do this through a built-in conflict resolver or with one of your design. Although custom conflict-resolvers can require many lines of code, the built-in Conflict Viewer can be confusing for at least some end users. In any event, it's not desirable to have end users making decisions about content that could have global database implications. Therefore, Access best serves those situations that restrict updates to a single replica. This eliminates the possibility of conflicts during synchronization. When it's imperative to permit changes at multiple replicas and Access replication is still desirable for other reasons (such as cost), consider special measures. These should restrict or eliminate the need for end users to work with the built-in conflict resolver.
Replication serves distributed disconnected workers well. This can include operations managers as they move their laptops around the plant. Any organization with a field staff, such as sales persons, service technicians, or traveling engineers, can benefit from replication. All these workers are likely to be distributed in space, and disconnected at times. In addition, field workers often collect data with which they must update a headquarters database. Database replication can improve the speed, timeliness, and accuracy of these updates.
Replication can work over the Internet, or a company intranet. This kind of replication is especially well suited for field staffs and branch offices. Applications that cannot justify the cost of a continuous high-speed data connection may still be able to benefit from Access replication.
When working with a replica set, there is no need to have a normal backup process. This is because each replica serves as a backup. One typical Access database application calls for designating a special copy as the backup replica. An administrator can launch synchronization for the backup on a regular schedule. Because a database can still serve clients while it is synchronizing, this type of application eliminates the need to take a database out of operation for backing it up.
Access database replication is unique among database managers in its ability to distribute design updates to tables, forms, reports, and other objects. Most other databases support replication exclusively for data, not for the database design.
When Not to Use Replication
Two types of application requirements indicate that Access database replication isn't appropriate. In both of these situations, your database development effort will be better served by replication with a different kind of database manager, such as Microsoft SQL Server.
In cases where all replicas need to be in synch all the time, Access database replication is not appropriate. Changes made in one replica don't appear in other replicas until one or more rounds of synchronization. Even for topologies that require just one round of synchronization, they will still be out of synch before scheduled or on-demand synchronization.
Whenever your requirements include perfect synchrony at all times, you should consider a two-phase replication commitment model. This approach requires a constant communication connection between all replicas in a replica set. The first phase announces a proposed update to the replicas. The second phase is the actual update after all replicas commit to the update. This model is especially appropriate for funds transfer scenarios and other contexts that demand knowledge about the exact current number available, such as airline reservation applications.
The Access conflict resolution model can generate substantial administration requirements when two or more replicas revise the contents of the same field in a record. This is because Access continues to request each database replica to resolve conflicts whenever there is a record in which one replica's update wins over another. Access has a built-in algorithm for determining which record wins, but you can programmatically override this with a more specific rule. This strategy can require many lines of code, so your project must justify the allocation of sufficient programming resources. A two-phase commit system removes the update problem because a revision to one replica automatically brings all or no replicas into compliance with the change.
What's New with Access 2000 Replication
This version of Access refines and improves many aspects of database replication from prior implementations. In this section, we'll briefly review a select set of these enhancements, including:
The first two versions of Access replication enabled row-level tracking of changes in replicas. If two replicas each changed the same row of a table, it counted as a conflict. This was so even if each replica changed a different field in the row. With column-level tracking, changing different columns in the same row doesn't result in a conflict. The new column-level tracking feature is the default, but can be changed. Developers can avoid a minor performance penalty by falling back to row-level tracking.
Earlier versions of Access resolved conflicts by declaring a winner based on whichever replica changed a record the most. If both replicas changed a field within a record an equal number of times, then Access chose the replica with the lower ReplicaID property.
Access 2000 declares conflict winners based on replica priority. Replica priority values range from 0 to 100. The Design Master has a default priority of 90. The default priority of any replica is 90 percent of its parent. You can override the default settings from the user interface and programmatically at creation-time for a replica. After that point, a replica's priority is read-only. Access replication relies on the historical priority of a change in resolving a conflict. Therefore, if the replica with the lowest priority synchronizes with the highest priority replica, the changes inherited by the lowest priority replica will win over those from replicas with an intermediate priority value.
Another feature introduced with Access 2000 is the replica Visibility property. There are three settings for this property that result in global, local, and anonymous replicas. A global replica behaves as in previous versions of Access. It can serve as a Design Master, and the parent of any other global replica. The internal plumbing of a replica supports the tracking of all changes to this type of replica. Global replicas can synchronize with all other global replicas.
A local replica can synchronize only with its parent, which must be a global replica. This feature makes them well suited for enforcing a hub topology. Make a global replica serve as the hub and attach a collection of local replicas to it. A local replica can synchronize with another replica only through their parent at the hub. Local replicas have a priority value of 0.
Microsoft created anonymous replicas for Internet applications. Unlike local replicas, Access doesn't track changes to anonymous replicas. In fact, it only tracks anonymous replicas to the degree that they belong to a particular global replica. Access uses this tracking information about anonymous replicas to remove them from a replica set after prolonged inactivity. Administrators of a replication system can schedule synchronization cycles with local replicas, but not with anonymous replicas. Anonymous replicas must synchronize from a command initiated at the anonymous replica. Like local replicas, anonymous replicas have priority values of 0.
Many organizations using previous versions of Access replication requested a replica that automatically prevented deletions. Such a replica prevents deletions by the user, but it permits the deletion of records through synchronization. Developers could always (and still can) apply Access user-level security to restrict the ability to delete records. However, a new Prevent deletes replica simplifies this task. The Prevent deletes replica gets its name from a check box in the Create Replica dialog box. If a user selects this box, it supports the attribute for the replica. There is the ONLY route for controlling the attribute. There is no programmatic interface for either setting or tracking this property. Therefore, you must manually track replicas with the attribute setting or develop conventions for identifying these replicas (for example, you can include a reference to the attribute setting in their name or description properties).
By offering programmatic control of replication with JRO, Microsoft supports an ADO approach to managing replication. Unlike its ADODB and ADOX (ActiveX Data Objects Extensions for Data Definition Language and Security) library siblings, the JRO library is exclusively for use with Microsoft database files, i.e. .mdb files. After referencing the library, developers can programmatically control Access replication. Although it's still possible to use DAO to program replication, JRO provides the only route to all the new features, such as priority and visibility. Microsoft ships code samples for programmatically manipulating replication with JRO with its Microsoft Office Developer edition. In addition, a white paper summarizing these code samples is available over the Internet at http://msdn.microsoft.com/library/techart/jrorep.htm.
Replicas from Access 2000 can bi-directionally synchronize with those from SQL Server 7.0. The prior version of Access supported only synchronization of changes from SQL Server databases to Access 97 databases. When a SQL Server database participates in a replication with an Access database, the SQL Server replica must always be the hub replica. Access replicas in this mixed kind of replica set will not be able to synchronize directly with one another.
Using the Replication Menu Commands
Access 2000 offers several techniques for managing database replication tasks. One of the easiest ways to get started is with the Tools | Replication menu. The commands on this menu configure themselves for the most typical replication tasks. Many developers will find it convenient to start with the menu commands before moving on to programmatic control of replication and other advanced replication topics. This approach also gives developers an opportunity to gauge the need for how much programmatic control you need for your business environment.
You start with an ordinary Access database file. Choosing Tools | Replication | Create Replica will convert the current Access database file into a Design Master. This command also returns a second replica for the replica set the first time you use it. Initially, all Replication menu items, except Create Replica, are disabled (see FIGURE 3). When using the command for the first time on a database, Access will give you a chance to generate a backup of the original file before it transforms it for replication. Because there is no automatic way of recovering the initial file after you make it replicable, it's wise to perform the backup.
FIGURE 3: Creating an Access 2000 replica.
Select Tools | Replication | Create Replica to convert a normal Access database file to a Design Master. Next, Access presents the location of New Replica dialog box. Use this dialog box to assign a path and file name for the second replica in the replica set. By default, the second replica's name is "Replica of X," where X represents the Design Master's file name. The Design Master's name will be the same as the original database file, and its path will remain unchanged. After clicking Create, Access transforms the original database file to a Design Master and makes a replica of it. You can use the new Design Master, or its replica, to create more replicas with the Tools | Replication | Create Replica command.
After creating at least two replicas, you can invoke basic replication tasks, such as synchronization and conflict resolution. You can demonstrate synchronization with the following three steps.
First, add a new record to either replica. FIGURE 4 depicts two replicas with a record added to the table in the second replica. Notice that converting a database changes the behavior of the AutoNumber field (used for EmployeeID). To reduce the possibility of conflicts between replicas, AutoNumber fields start assigning random values in a Long Integer format. If you find an application creating an unacceptable number of conflicts based on a former AutoNumber field, convert the Long Integer to a GUID by choosing Replication ID as the Field Size property. Microsoft claims that GUID field values are universally unique.
FIGURE 4A: Replica one with a record added to the table in the
second replica.
FIGURE 4B: Replica two with a record added to the table in the
second replica.
Select Tools | Replication | Synchronize Now to propagate the new record from the EmpExts table in the bottom record to the EmpExts table in the top replica. Then select Tools | Replication | Synchronize Now in the bottom replica to propagate the new record from the replica's EmpExts table to the corresponding table in the top replica. You may get a short prompt about closing any open objects. Click Yes to perform the synchronization.
Third, from the Synchronize Database Replica of MODdbRepl1 dialog box, select the replica with which to synchronize. In general, this dialog box title will reference the name of the current replica. If the target replica doesn't appear in the drop-down list box, click Browse and navigate to the target replica. In this scenario, the only other replica has the name MODdbRepl1.mdb. Its path will precede its name in the drop-down list box. Click OK to launch synchronization with a target replica. You'll get a dialog box prompting you to close the current database before performing synchronization. Click Yes to proceed. After completing synchronization, the record for Anthony Hill will appear in the EmpExts table of the MODdbRepl1.mdb file (available for download; see end of article for details).
Conflicts result when you modify the same record or field from different replicas in different ways. These conflicts arise only for different updates to the same field for a record when using the default column-level tracking. If record-level tracking is in force, then conflicts can result from any two changes to the same record (the changes don't strictly have to conflict with one another). Recall that a change to a Design Master will typically win over another replica. This is because Design Master replicas have a default priority of 90, and children replicas have a default priority of 90 percent of their parent's setting. Because Access replication picks a winner based on the highest priority setting, changes to the Design Master will typically win over changes to its children replicas.
FIGURE 5 shows the two replicas from the preceding example with two conflicting edits to the record for Anthony Hill. Although the change in the replica of the MODdbRepl1.mdb file updates Anthony's first name to Tone Man, Anthony's new name in the Design Master is Tony. Because the Design Master has the higher priority setting, its change wins. The field for Anthony's first name in the replica of MODdbRepl1.mdb file will change to Tony after either replica synchronizes with the other.
FIGURE 5A: The replica from FIGURE 3 with conflicting edits.
FIGURE 5B: The replica from FIGURE 3 with conflicting edits.
When you make conflicting changes in two different replicas, one wins and the other loses based on the replica's priority settings.
FIGURE 6 shows a pair of dialog boxes you may encounter as you resolve the conflicts resulting from the synchronization of the updates depicted in FIGURE 5. Select Tools | Replication | Resolve Conflicts to open the Microsoft Replication Conflict Viewer dialog box (or click Yes to the prompt reminding you to resolve any conflicts). The display in FIGURE 6 shows a single conflict table, but there can be more depending on the scope of changes because of the last time two replicas synchronized. Viewing the EmpExts(1) conflicts opens the dialog box in the bottom of FIGURE 6.
FIGURE 6A: One of two dialog boxes you may encounter resolving
synchronization conflicts.
FIGURE 6B: The second of two dialog boxes you may encounter
resolving synchronization conflicts.
This offers four options. First, you can keep the existing changes that Access selected. This scenario retains Tony as the first name in the replica of MODdbRepl1.mdb file. Second, you can overwrite with the conflicting data. In this option, Tone Man overwrites Tony in the replica of MODdbRepl1.mdb file. At the next synchronization, Tone Man will update Tony in the Design Master unless that field changes from the last synchronization. If you wish you can revise either the winner or the loser, and write the revised entry over the current entry. These are the third and fourth radio buttons in the Existing Data and Conflicting Data group on the bottom dialog box. Click either of these bottom two radio buttons to enable the corresponding text box for entry, and then edit the existing value. After selecting a radio button (or leaving the default option of Keep existing data), you can remove the conflict by clicking Resolve and then clicking Close in the top dialog box.
After a synchronization that has conflicting records, you must resolve the conflicts to remove the conflicts from the conflicts table. Select Tools | Replication | Resolve Conflicts to open these two dialog boxes.
The sample files for this article, MODdbRepl1.mdb and Replica of MODdbRepl1.mdb, help you reproduce and experiment with the replica set developed in this section. These files comprise a working replica set that you can copy to your hard drive and start making changes, synchronizing, and resolving conflicts. It also proves the concepts for the steps described in this section. You'll also find MODdbRepl1.bak. This is the backup file for MODdbRepl1.mdb before its conversion to a Design Master. You can change its extension from .BAK to .MDB and reproduce the steps in this section to verify the steps for creating replicas, synchronizing changes, and resolving conflicts. Remove or rename the replica set files before attempting to reconstruct it from MODdbRepl1.bak.
Conclusion
Access database replication can dramatically increase the value of Access solutions by making them available to wider audiences of users - in terms of both numbers and types of users. Access database replication accommodates more users because it can spread the processing load across multiple computers serving up different replicas. It can process more kinds of users by accommodating those connected via a LAN, those performing special tasks, and those who connect occasionally over the Internet.
Throughout this article, I have referred to Access database replication. In fact, other applications, such as Visual Basic and Excel, can use selected Access database replication features because their functionality derives from the Jet database engine. Nevertheless, Access offers a particularly friendly user interface for tapping Jet database engine functionality, and that remains true for database replication as well. This article illustrates behavior for selected Tools | Replication menu commands. The remaining articles in this series will build on the foundation provided by this article as they explore alternate interfaces for controlling Access database replication.
Notice that the article doesn't include a single line of code! Although it's possible to program replication, your applications don't require advanced code (or any code) to take advantage of this powerful feature.
The files referenced in this article are available for download.
Rick Dobson, Ph.D. and his wife operate a development and training consultancy. He is the author of Programming Microsoft Access 2000 for Microsoft Press. Rick is a regular contributor to this publication and numerous other computer periodicals. In addition, he presented training sessions and seminars on Access and Web development topics in Australia, Canada, the United Kingdom, and throughout the United States. Rick is a Microsoft Certified Professional and a Microsoft Certified Trainer. You can reach Rick at either of the two Web sites that his practice maintains (http://www.cabinc.net and http://www.programmingmsaccess.com/).
Copyright © 1999 Informant Communications Group. All Rights Reserved. • Site Use Agreement • Send feedback to the Webmaster • Important information about privacy