Integrating Data from Multiple Sites (Client)

Many companies have regional offices or entities that collect and process data that must be sent to a central location. For example:

  • Inventory data can be consolidated from a number of servers at local warehouses into a central server at corporate headquarters.

  • Information from autonomous business divisions within a company can be sent to a central server.

  • Order processing information from dispersed locations can be consolidated.

In some cases, data is also sent from the central site to remote sites. This data is typically intended to be read-only data at the remote site, such as a set of product inventory tables that are only updated at a central site.

The following diagram illustrates a typical scenario with data flowing in two directions between a central site and remote locations:

Replicating data to regional offices

In this diagram, data first flows to a hub before flowing to the regional offices served by that hub. It is also possible for the data to flow directly between the central site and regional offices if the organization does not have an intermediate layer.

Adventure Works Cycles Example

Adventure Works Cycles is a fictional manufacturing company used to demonstrate database concepts and scenarios. For more information, see AdventureWorks2008R2 Sample Databases.

Adventure Works Cycles has a large number of sales offices around the world. Each sales office collects data from its regional sales staff. This data is transmitted to regional hubs and then to the central site at the end of each business day. Data is also transmitted from the central site out through the hubs to each sales office, so that the sales office has the latest information on prices and promotions.

Common Requirements for This Scenario

Regional office applications typically have the following characteristics, which an appropriate replication solution must address:

  • Data is entered and updated at a central site and at remote sites.

  • Remote users must be able to make updates independently, without requiring a connection to the central site.

  • Because multiple users might update the same data independently, data conflicts can arise and must be handled.

  • Some data should only be updated at the central site, for example data in product pricing tables.

  • Users should be able to synchronize data on demand or at scheduled times.

  • The application must control how long a remote site can remain unsynchronized.

  • Some tables require filtering so that each user receives different data for one or more tables. For example, a regional office receives contact information only for the customers in the office's region.

  • Some data must be treated as a unit when it is transferred between sites. For example, if an order is sent from a remote user to the central site, the order header must be committed prior to the order details.

  • The application might require custom business logic to be executed when data is synchronized.

  • The application might require that data be synchronized over the Internet rather than through a dedicated connection.

  • The business might be organized such that data flows through one or more intermediate layers between the central site and remote sites (as in the diagram earlier in this topic).

The following diagram illustrates the filtering associated with this scenario:

Filtering for regional office applications

The Type of Replication to Use for This Scenario

Microsoft SQL Server uses a publishing industry metaphor to describe the components of the replication system. The components include the Publisher, Subscribers, publications and articles, and subscriptions. In the diagram above, the central site is the Publisher. The data at the central site is the publication, with each table of data being an article (articles can also be other database objects, such as stored procedures). Each hub is a Subscriber to the publication, receiving schema and data as a subscription. The hubs then republish the data, and the regional offices subscribe to this data. For more information on the components of the system, see Replication Publishing Model Overview.

SQL Server offers different types of replication for different application requirements: snapshot replication, transactional replication, and merge replication. This scenario is best implemented with merge replication, which is well suited to handle the requirements outlined in the previous section. For more information on merge replication, see Merge Replication Overview and How Merge Replication Works.

Important

It is possible to implement this scenario in two ways: the central office is a Publisher and the remote offices are Subscribers, or the central office is a Subscriber and the remote offices are Publishers. Merge replication does not support central Subscriber topologies. Even if all changes are occurring at the remote sites, the central office should be configured as the Publisher with the remote sites as Subscribers. A similar scenario can be implemented with transactional replication using a central Subscriber topology. If your application does not require conflict resolution or filters that provide each remote site with a unique set of data, consider using transactional replication. For more information, see Integrating Data from Multiple Sites (Server).

Merge Replication Options Relevant to This Scenario

Merge replication offers several options to address the requirements described earlier in this topic. The following list presents each requirement and the merge replication options that address it.

  • Data is entered and updated at a central site and remote sites.

    Merge replication provides this ability without specifying any separate options.

  • Remote users must be able to make updates independently, without requiring a connection to the central site.

    Merge replication provides this ability without specifying any separate options.

  • Because multiple users might update the same data independently, data conflicts can arise and must be handled.

    Merge replication provides conflict detection and resolution for cases in which data conflicts are expected. It is best to design applications to avoid conflicts, but where this is not possible you can select the default conflict resolution mechanism (first in wins) or use custom conflict resolution. For more information, see Detecting and Resolving Merge Replication Conflicts.

  • Some data should only be updated at the central site, for example data in product pricing tables.

    Merge replication provides download-only articles for those tables that should be updated only at the Publisher. For more information, see Optimizing Merge Replication Performance with Download-Only Articles.

  • Users should be able to synchronize data on demand and at scheduled times.

    Replication offers two subscription types: push subscriptions and pull subscriptions. Pull subscriptions are better suited to on demand synchronization. For more information on subscription types and scheduling synchronization, see Subscribing to Publications and Synchronizing Data.

  • The application must control how long a remote site can remain unsynchronized.

    Merge replication allows you to set a subscription expiration period to ensure that all Subscribers have synchronized within a certain amount of time. For more information, see Subscription Expiration and Deactivation.

  • Some tables require filtering so that each user receives different data for one or more tables. For example, each sales person might receive contact information only for her customers.

    Merge replication allows you to filter columns and rows. Row filters can be static or parameterized. A static filter is applied only when a publication is created; it results in one data set. A parameterized filter is applied every time a Subscriber synchronizes; it results in a different data set for each Subscriber. Regional office applications often use parameterized filters, but could also use static filters. For more information, see Filtering Published Data for Merge Replication.

  • Some data must be treated as a unit when it is transferred between sites. For example, if an order is sent from a remote user to the central site, the order header must be committed prior to the order details.

    Merge replication allows you to specify that a set of related tables must be processed as a unit. This unit is referred to as a logical record. For more information, see Grouping Changes to Related Rows with Logical Records.

  • The application might require custom business logic to be executed when data is synchronized.

    Merge replication allows you to specify code to be executed during synchronization. This code can respond to a wide range of events and has access to the data that is being synchronized. For more information, see Executing Business Logic During Merge Synchronization.

  • The application might require that data be synchronized over the Internet rather than through a dedicated connection.

    When using (SQL Server Compact 3.5 SP2), data is synchronized over an HTTP or HTTPS connection. For other editions of SQL Server you can use Web synchronization, which requires HTTPS. For more information, see Web Synchronization for Merge Replication.

  • The business might be organized such that data flows through one or more intermediate layers between the central site and remote sites.

    Merge replication can accommodate this requirement through republishing, an approach in which a central Publisher publishes data to one or more Subscribers, which then publish the data out to other Subscribers. For more information, see Republishing Data.

Steps for Implementing This Scenario

To implement this scenario, you must first create a publication and subscriptions, and then initialize each subscription. Click the links below for more information about each step:

After the subscription is initialized and data is flowing between the Publisher and Subscribers, you might need to consult the following topics for information on common administration and monitoring tasks: