Tuning and Monitoring Performance
We recommend the following approach to synchronization performance:
Configuring each server and database, and the application code to achieve best performance
Developing performance baselines
Monitoring and tuning to meet or exceed baselines
The first step in performance tuning is to ensure that hardware and software are configured appropriately.
Server and Network Considerations
Ensure that each computer has an adequate IO subsystem, and that database files are properly allocated.
The speed of reading and writing changes to disk is typically more important than network speed, so an adequate IO subsystem is essential. It is recommended to use multiple RAID disk arrays, and to have one dedicated array each on the server for tempdb, user databases, and transaction logs. tempdb, user databases, and transaction logs should be created in separate file groups. If one or more user tables prove to be a processing bottleneck, consider moving them to separate file groups.
Consider adding memory to computers in synchronization topologies.
This is particularly important for those servers that are involved in a large number of concurrent synchronization sessions. Synchronization sessions typically involve long-running transactions; therefore it is important to have a significant amount of memory that is directly addressable by the server database. In the case of SQL Server, consider using the /3GB switch for 32-bit systems or moving to a 64-bit system. For more information, see "Process Address Space" in SQL Server Books Online.
Set the minimum and maximum amount of memory allocated to the server database.
For example, by default, the Motor de base de datos changes its memory requirements dynamically based on available system resources. To avoid low memory availability during synchronization activities, use the min server memory option to set the minimum available memory. To avoid having the operating system page to disc for memory, you can also set a maximum amount of memory with the max server memory option. For more information, see SQL Server Books Online.
Database and Application Design
Follow best practices for database design.
A database involved in synchronization generally benefits from the same performance optimizations as any similar database. For more information about optimizing databases, see SQL Server Books Online. Be aware of the following considerations for indexes:
Indexes on base tables should be tested with synchronization activity in mind, because indexes can affect select, insert, update, and delete performance.
Metadata tables should be indexed appropriately. Sync Framework adds indexes to any tables that it creates. If you create metadata tables for DbSyncProvider, see How to: Provision a Server Database for Collaborative Synchronization (Non-SQL Server); and for DbServerSyncProvider, see How to: Use a Custom Change Tracking System.
Set appropriate values for database lock timeout and query timeout.
Minimize conflicts through publication design and application behavior.
Applications should be designed to avoid conflicts if they can, because conflict detection and resolution introduce additional complexity, processing, and network traffic. The most common ways to avoid conflicts are as follows:
Update a table at only one node, or
Filter data so that only one node updates a particular set of rows. For example, in a client-server scenario, you could horizontally partition data across clients so that each client changes only one "slice" of data, such as contact information for customers in Washington.
Use filtering judiciously.
Filtering data is a great way to reduce conflicts and to copy less data to each node. However, be aware that complex filtering clauses can affect performance. If a filter joins many tables, consider other ways to implement the same logic.
Be cautious with application logic in triggers and synchronization queries.
Executing additional logic in each query and trigger can significantly slow down performance.
Use stored procedures for database commands.
Sync Framework uses several queries to select and apply data and metadata changes during a synchronization session. If you create these queries manually, encapsulate them in stored procedures. This typically provides better performance and maintainability, and can also help to secure applications. If your application requires inline SQL instead of stored procedures, be sure to use the ADO.NET Prepare() method for all commands. This improves performance for inline SQL.
Use bulk application of changes
Sync Framework uses table-valued parameters to apply inserts, updates, and deletes to SQL Server 2008 and databases. This feature allows multiple changes to be applied to the database by using a single stored procedure call. This greatly increases performance and reduces the number of round trips between client and server during change application. If bulk application is not available, Sync Framework falls back to use single item application. If you do not want to apply changes in bulk, pass false to SqlSyncScopeProvisioningSetUseBulkProceduresDefault(Boolean) before you provision your database, and the bulk procedures will not be created.
Synchronize only the data that is required at each node.
It can be tempting to publish all or most of the tables in a database, "just in case". Avoid publishing tables that aren't really required by an application, and consider synchronizing each node less frequently.
Design synchronization groups and scopes appropriately, and use the appropriate synchronization direction for each scope.
A scope is a set of tables that are synchronized as a unit. One or more of the tables in a scope can be filtered, and tables can be included in more than one scope. Ensure that scopes reflect the structure and usage patterns of the tables that they contain. Consider the following four tables in a sales force application:
Orders and OrderDetails
Rows are inserted into these tables only at a client computer, but might be updated at the server. Changes must be committed in the same transaction. These tables should be in the same scope, with a synchronization direction of bidirectional.
Rows are frequently inserted and updated, but only at the server. This table and tables similar to it should be in one scope, with a synchronization direction of download-only, from the point of view of the client.
Rows are infrequently inserted and updated at the server. This table should probably be in a separate scope from Pricing because it is not updated at the same frequency and can probably be synchronized less frequently.
Synchronization direction can be changed for each session, whereas scope persists across sessions. There is no direct connection between scope and synchronization direction, but this example illustrates how to consider both when designing an application.
The query that is specified for this property selects the maximum timestamp from each base table or tracking table. This timestamp is used to determine whether for each table the destination already has all of the changes from the source. If the destination already has the changes, Sync Framework can often avoid running enumeration queries, which can improve performance.
Use batching to compensate for unreliable networks and low memory issues.
By default, Sync Framework delivers changes to each node in a single DataSet object. This object is held in memory as changes are applied to a node. The default behavior works well if there is sufficient memory on the computer where changes are applied, and the connection to the computer is reliable. Some applications, however, can benefit from having changes divided into batches. Batching does introduce additional overhead, but it can actually be a performance benefit for some applications. For more information, see How to: Deliver Changes in Batches (SQL Server).
Stagger synchronization schedules.
If a large number of nodes synchronize with a central node, stagger the schedules to reduce memory pressure and contention at the central node. Schedules can be based on clock time or on relative time. For example, an application could synchronize every hour or it could start a synchronization session one hour after the last session for that node had successfully completed.
Use appropriate metadata cleanup schedules.
Large amounts of metadata can affect the performance of synchronization queries.
After synchronization is configured, we recommend that you develop a performance baseline, which allows you to determine how synchronization behaves with a workload that is typical for your applications and topology. Use synchronization events and System Monitor to determine typical numbers for the following five dimensions of synchronization performance:
Latency: the amount of time it takes for a data change to be propagated between nodes in a topology.
Throughput: the amount of synchronization activity (measured in rows delivered over a period of time) that a system can sustain over time.
Concurrency: the number of nodes that can synchronize with a particular node simultaneously. This is often the number of clients that can synchronize with a particular server.
Duration of synchronization: how long it takes a given synchronization session to complete.
Resource consumption: hardware and network resources used as a result of synchronization processing.
Depending on your application, some of these performance measures might be more important than others. For example, it might be acceptable to have relatively low throughput as long as a high level of concurrency can be maintained. In establishing a baseline, be aware that Sync Framework is not designed as a low latency high-throughput server-to-server system like SQL Server transactional replication. It is designed for client-to-server and client-to-client synchronization that supports offline and collaborative applications.
After you have established baseline numbers, monitor the system looking for performance and scalability bottlenecks, and tune as necessary.
Monitoring and Maintenance
Monitoring can be used during the development of performance baselines, periodically in production environments, and more intensively if a performance issue arises. We recommend the following tools for monitoring synchronization activity and performance:
Use the following events to determine how much time particular phases are consuming:
Per-table events on each provider: SelectingChanges, ChangesSelected, ApplyingChanges, and ChangesApplied
Per-session event on each provider: SyncProgress
Hold the time for each phase in memory, and then write the results to a log file or performance database after the synchronization session ends.
SQL Server Profiler
Use the TSQL_SPs template, and identify any queries that are taking longer than a particular threshold, such as 10 seconds. If you write trace information to a log file or performance database, hold the data in memory and perform writes after the synchronization session ends.
SQL Server Management Studio
Management Studio can be used to check the query plan for each synchronization query, to ensure that the most optimal plan is used.
Use the following performance objects and counters to monitor areas that are important for synchronization:
Counters under SQL Server: Memory Manager. For example, you can verify whether SQL Server is using the memory available to it by comparing Target Server Memory and Total Server Memory.
Counters under PhysicalDisk. For example Avg. Disk Read Queue Length and Avg. Disk Write Queue Length help to identify if synchronization performance is IO bound or the computer is running out of memory. If queue lengths are unreasonable, consider adding memory and upgrading or adding drives.
The default is for the counter to report averages across all drives. Be sure that you add a counter for each drive.
Counters under SQL Server: Transactions. For example, Snapshot Transactions and Version Store Size can be used to determine whether change enumeration queries are causing a lot of tempdb growth. The change enumeration queries use snapshot transactions, and snapshot version information is stored in tempdb. A large version store means that tempdb might need to be resized.
Counters under SQL Server: Locks. For example, Lock Wait Time and Number of Deadlocks can be used to determine if contention is an issue during concurrent activity in the database.
Sync Framework includes tracing that is based on the TraceListener class. Tracing can be used to collect information about synchronization sessions, which can be helpful for monitoring and troubleshooting. However, be aware that tracing does introduce additional overhead, and it should be used primarily during development. For more information, see Tracing the Synchronization Process (This topic focuses on DbServerSyncProvider, but the information is applicable to other providers also).
In addition to monitoring, we recommend that you perform the following maintenance tasks on a regular basis:
Depending on the level of defragmentation, reorganize or rebuild indexes on base tables and metadata tables.
Update index statistics.
Clean up synchronization metadata.