A Publisher is a server that makes data available for replication to other servers. In addition to identifying which data is to be replicated, the Publisher detects which data has changed and maintains information about all publications at that site. Any given data element that is replicated has a single Publisher, even if it may be updated by any number of Subscribers or published again by a Subscriber.
Note: The publish and subscribe metaphor always implies an orderly administrative hierarchy. Although there can be multiple Subscribers to a publication, there is only one Publisher and only one “master” database. Referring to Microsoft® SQL Server™ version 7.0 replication as “multimaster” is misleading because there is no peer-to-peer topology. On the other hand, the hierarchical model should not be interpreted as meaning that only a Publisher modifies data.
Subscribers are servers that store replicated data and receive updates. In earlier versions of SQL Server, updates could typically be performed only at the Publisher. SQL Server 7.0, however, allows Subscribers to make updates to data (but a Subscriber making updates is not the same as a Publisher). A Subscriber can, in turn, become a Publisher to other Subscribers.
The Distributor is the server that contains the distribution database and stores metadata, history data, and (for transactional publications) transactions.
A publication is simply a collection of articles, and an article is a grouping of data to be replicated. An article can be an entire table, only certain columns (using a vertical filter), only certain rows (using a horizontal filter), or even a stored procedure (in some types of replication). A publication often has multiple articles. This grouping of multiple articles makes it simpler to subscribe to a unit (the publication), which has all the relevant and required data. Subscribers subscribe only to a publication, not to individual articles within a publication.
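The relationship between publications, articles, and filters can be pictured with a small in-memory model. This is only an illustrative sketch, not SQL Server's implementation: the `Article` and `Publication` classes, their fields, and the `snapshot` method are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


@dataclass
class Article:
    """Hypothetical model of an article: a table plus optional filters."""
    table: str
    columns: Optional[List[str]] = None                  # vertical filter (None = all columns)
    row_filter: Optional[Callable[[Row], bool]] = None   # horizontal filter (None = all rows)

    def apply(self, rows: List[Row]) -> List[Row]:
        # Horizontal filter first: keep only the rows that qualify.
        selected = [r for r in rows if self.row_filter is None or self.row_filter(r)]
        # Vertical filter second: keep only the published columns.
        if self.columns is None:
            return [dict(r) for r in selected]
        return [{c: r[c] for c in self.columns} for r in selected]


@dataclass
class Publication:
    """Hypothetical model of a publication: the unit a Subscriber subscribes to."""
    name: str
    articles: List[Article]

    def snapshot(self, database: Dict[str, List[Row]]) -> Dict[str, List[Row]]:
        # A Subscriber receives every article in the publication, never a single article.
        return {a.table: a.apply(database[a.table]) for a in self.articles}
```

For example, a publication with one article that publishes only the `id` and `name` columns of rows whose `region` is `"west"` would deliver just those columns of just those rows to every Subscriber.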
Note: In SQL Server 6.x, a subscription could be to an individual article and not just to a publication. This is a less precise data model that makes replication more complex to administer. For compatibility with SQL Server 6.x, subscribing directly to an article is still allowed in SQL Server 7.0. However, the user interface does not support this, and it is provided only for backward compatibility. If you have applications built with SQL Server 6.x and those applications subscribe directly to articles, instead of to publications, they will continue to work in SQL Server 7.0. However, you should begin to migrate your subscriptions to the publication level, where each publication consists of one or more articles.
With a push subscription, the Publisher propagates the changes to a Subscriber without a request from the Subscriber to do so. Typically, push subscriptions are used in applications that must send changes to Subscribers as soon as they occur. Push subscriptions are best for publications that require near real-time movement of data without polling and where the higher processor overhead at the Publisher does not affect performance. Changes can also be pushed to Subscribers on a scheduled basis.
With a pull subscription, the Subscriber asks for periodic updates of all changes at the Publisher. Pull subscriptions are best for publications having a large number of Subscribers (for example, Subscribers using the Internet). Pull subscriptions are also best for autonomous mobile users because they allow the user to determine when the data changes are synchronized. A single publication can support a mixture of push and pull subscriptions.
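The difference between the two subscription types comes down to which side initiates synchronization. The toy model below illustrates that idea only; the `Publisher` and `Subscriber` classes and their methods are hypothetical, and the in-memory change log stands in for whatever mechanism actually carries the changes.

```python
from typing import List


class Subscriber:
    """Hypothetical Subscriber that applies changes it has not yet received."""

    def __init__(self) -> None:
        self.received: List[str] = []

    def sync(self, publisher: "Publisher") -> None:
        # Apply every logged change beyond what this Subscriber already holds.
        self.received.extend(publisher.log[len(self.received):])


class Publisher:
    """Hypothetical Publisher that keeps an ordered log of changes."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.push_subscribers: List[Subscriber] = []

    def change(self, item: str) -> None:
        self.log.append(item)
        # Push: the Publisher side initiates delivery as soon as a change occurs.
        for sub in self.push_subscribers:
            sub.sync(self)


publisher = Publisher()
pushed, pulled = Subscriber(), Subscriber()
publisher.push_subscribers.append(pushed)

publisher.change("row 1 updated")
publisher.change("row 2 inserted")
# At this point `pushed` is current, while `pulled` has nothing yet.

# Pull: the Subscriber decides when to ask for the accumulated changes.
pulled.sync(publisher)
```

In this sketch the push Subscriber is kept current at the cost of extra work at the Publisher on every change, while the pull Subscriber stays stale until it chooses to synchronize, which mirrors the trade-off described above.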
The above replication components are implemented using a modular design. You can install these components on separate computers to balance workloads and minimize SQL Server replication’s effect on server performance.
In addition to the basic components, your replication design may have two or more replication agents:
Snapshot Agent: Prepares schema and initial data files of published tables and stored procedures, stores the snapshot on the Distributor, and records information about the synchronization status in the distribution database. Each publication has its own Snapshot Agent that runs on the Distributor and connects to the Publisher. The Snapshot Agent typically runs under SQL Server Agent and can be administered directly by using SQL Server Enterprise Manager.
Log Reader Agent: Moves transactions marked for replication from the transaction log on the Publisher to the distribution database. Each database published using transactional replication has its own Log Reader Agent that runs on the Distributor and connects to the Publisher.
Distribution Agent: Moves the transactions and snapshot jobs held in the distribution database tables to Subscribers. Transactional and snapshot publications that are set up for immediate synchronization when a new push subscription is created each have their own Distribution Agent that runs on the Distributor and connects to the Subscriber. Transactional and snapshot publications not set up for immediate synchronization share a Distribution Agent, across the Publisher/Subscriber pair, that runs on the Distributor and connects to the Subscriber. Pull subscriptions to either snapshot or transactional publications have Distribution Agents that run on the Subscriber instead of the Distributor. Merge publications do not have a Distribution Agent. The Distribution Agent typically runs under SQL Server Agent and can be administered directly by using SQL Server Enterprise Manager.
Merge Agent: For merge publications, moves and reconciles incremental data changes that occurred after the initial snapshot was created. Each merge publication has its own Merge Agent that connects to both the Publisher and the Subscriber and updates both. In a full merge, the agent first uploads all changes from the Subscriber where the generation is 0 or is greater than the last generation sent to the Publisher. The agent gathers the rows, and those rows without conflicts are applied to the publishing database. The rows with conflicts are handled by the conflict resolver associated with the article in the publication definition. All changes are applied using stored procedures derived from the Publisher tables at the time the snapshot is generated or first applied. Finally, the agent reverses the process by downloading any changes from the Publisher to the Subscriber and applying the changes to the subscribing database. Push subscriptions to merge publications have Merge Agents that run on the Publisher, while pull subscriptions to merge publications have Merge Agents that run on the Subscriber. Snapshot and transactional publications do not have Merge Agents.
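The upload-then-download order of a full merge can be sketched as follows. This is a simplified illustration under stated assumptions, not the Merge Agent's actual logic: each database is reduced to a key/value dictionary, a conflict is assumed to mean that the same key changed on both sides, and the names `Change`, `pending`, `publisher_wins`, and `full_merge` are hypothetical. The default resolver here simply lets the Publisher's change win; the source says only that the resolver is the one associated with the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Change:
    key: str
    value: object
    generation: int


def pending(changes: List[Change], last_generation: int) -> List[Change]:
    # Select changes whose generation is 0 or greater than the last generation sent.
    return [c for c in changes if c.generation == 0 or c.generation > last_generation]


def publisher_wins(pub: Change, sub: Change) -> Change:
    # Assumed default resolver for this sketch: the Publisher's change is kept.
    return pub


def full_merge(publisher: Dict[str, object], subscriber: Dict[str, object],
               pub_changes: List[Change], sub_changes: List[Change],
               last_sent: int, last_received: int,
               resolve: Callable[[Change, Change], Change] = publisher_wins) -> None:
    pub_pending = {c.key: c for c in pending(pub_changes, last_received)}

    # Upload phase: Subscriber changes go to the Publisher first.
    for change in pending(sub_changes, last_sent):
        rival = pub_pending.get(change.key)
        if rival is None:
            publisher[change.key] = change.value       # no conflict: apply directly
        else:
            winner = resolve(rival, change)            # conflict: ask the resolver
            publisher[winner.key] = winner.value
            pub_pending[change.key] = winner           # the download carries the winner back

    # Download phase: the process is reversed, Publisher to Subscriber.
    for change in pub_pending.values():
        subscriber[change.key] = change.value
```

With this model, a Subscriber-only change is applied to the Publisher unchanged, while a key changed on both sides ends up with the resolver's chosen value in both databases after the download phase.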