Distribution Server Recovery

Distribution servers support automatic recovery. If a distribution server is taken offline, all of its distribution processes continue from their last successful operation when it becomes available again. For each distribution task, the distribution server connects to the subscription server, finds the last transaction that was successfully applied, and continues distributing replicated transactions from that point forward.
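The resume behavior described above can be illustrated with a minimal Python sketch. It is a conceptual model only, not SQL Server code: the `Subscriber` class, the `resume_distribution` function, and the use of increasing integers as transaction identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """Hypothetical stand-in for a subscription server."""
    applied: list = field(default_factory=list)

    def last_applied_id(self):
        # The last transaction that was successfully applied, if any.
        return self.applied[-1] if self.applied else None


def resume_distribution(distribution_db, subscriber):
    """Apply every queued transaction after the subscriber's last applied one.

    Assumes transaction identifiers are increasing integers (illustrative only).
    Returns the list of transactions that were distributed in this pass.
    """
    last = subscriber.last_applied_id()
    pending = [t for t in distribution_db if last is None or t > last]
    subscriber.applied.extend(pending)
    return pending


# Usage: the subscriber already holds transactions 1-3, so only 4 and 5 are sent.
sub = Subscriber(applied=[1, 2, 3])
sent = resume_distribution([1, 2, 3, 4, 5], sub)
```

Because the starting point is read from the subscriber rather than remembered by the distributor, running the function a second time distributes nothing, which is the property that makes restarting after an outage safe in this model.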

In a similar way, the replication log reader task automatically recovers the distribution server. Each distribution server stores a pointer into the transaction log of each associated publication server. This pointer gives the log reader task access to the last distributed transaction that was successfully transferred.

During recovery, if this transaction is found to exist within the transaction log of the publication server, the log reader task will automatically recover the distribution server by moving transactions into the distribution database from that point forward.
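As a rough sketch of this check, the following Python model looks for the last distributed transaction in a publication server's log and, if it is found, moves the later transactions into the distribution database. The function name `recover_log_reader`, the list-based log, and the `"recovered"` and `"retry"` return values are assumptions for illustration and do not correspond to actual SQL Server objects. The `"retry"` branch models the case, covered later in this section, in which the pointer is no longer valid.

```python
def recover_log_reader(publication_log, last_distributed_id, distribution_db):
    """Resume moving transactions into the distribution database.

    publication_log     -- ordered list of transaction ids still in the log (assumed)
    last_distributed_id -- the pointer stored by the distribution server (assumed)
    distribution_db     -- list that receives the transactions to distribute
    """
    if last_distributed_id not in publication_log:
        # The pointer does not match anything in the log, so automatic
        # recovery cannot proceed in this model.
        return "retry"

    # Continue from the transaction immediately after the pointer.
    start = publication_log.index(last_distributed_id) + 1
    distribution_db.extend(publication_log[start:])
    return "recovered"


# Usage: the pointer (11) is still in the log, so 12 and 13 are moved.
db = []
status = recover_log_reader([10, 11, 12, 13], 11, db)
```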

Important: Setting up a coordinated server backup scheme, as described in Replication Backup, maximizes the probability that the last distributed transaction will be available within the transaction log of all publication servers. This allows automatic recovery of the distribution server to occur even in the event of a major server failure requiring the recovery of the distribution database from a backup tape.

If the distribution server must be rebuilt from an older backup tape, it may not contain a valid pointer to the last distributed transaction for the publication server. If this happens, the log reader task logs an error and reschedules itself in retry mode, which prevents automatic recovery of the distribution server from succeeding. The following steps describe how to recover the distribution server from this condition without disturbing any of the existing publications.

    To recover a distribution server when the log reader task is in retry mode
  1. Unsubscribe and then resubscribe all subscribers to all publications of the publication server that has caused the log reader failure.

    Be careful during subscription to select the original destination database and initial synchronization option for each subscription server.

  2. In SQL Enterprise Manager, select the distribution server, and then from the Tools menu, choose Task Scheduling.
  3. In the Task Scheduling window, select a cleanup task associated with a subscription server that has been resubscribed, and then choose the Run Task button. This immediately executes the cleanup task, removing all old transactions from the distribution database.

    Repeat for all subscribers that have been resubscribed.

Once all old transactions are removed from the distribution database, the invalid pointer into the publication server's transaction log will no longer exist and the log reader task will be able to execute successfully. All subscription servers will have their replicated tables synchronized with the current state of the publication server, and replication will continue.