Q&A on OCS & Sync Services for ADO.NET
Not surprisingly, we've been getting a lot of great questions about specific features and scenarios for our new Sync Services for ADO.NET (OCS). Rafik has been fielding most of these on the Sync Services forums. Since the Q&A for SQLce seemed popular, I thought I'd do the same here.
Q: How does Sync Services compare to Merge Replication and RDA?
A: Merge replication is our DBA oriented, feature rich, SQL Server based database "replication" product. It's designed for a DBA to configure a publication on the database, exposing a set of tables (articles) and enabling filtering and rules. Clients then "subscribe" to the publication and receive a local database reflecting the publication. It's a very powerful and mature product that we will continue to invest in. On the server side (publications), Merge Replication is a SQL Server only feature that is available in our SQL Server Workgroup, Standard and Enterprise SKUs. Merge Replication, as a publisher, is not available in the free SQL Server Express SKU. Merge replication supports 2 tier sync and sync over HTTP(S) to enable internet synchronization. However, I wouldn't say that Merge is SOA oriented, in that you don't have much control over what's sent or received across the wire, nor do you have the ability to support additional transports. It's a fantastic end to end replication product that handles a lot of complicated scenarios.
Remote Data Access (RDA) is a developer oriented, simple but RAD sync technology. It works over HTTP, so it enables internet protocol sync, but again, it's not very SOA friendly. While the client doesn't directly open a connection to SQL Server, it does require the client to provide the connection string and the query to be executed on the server. So, it pretty much violates the notion of keeping the server information from the client. While it is limited, you can't deny its popularity for its simplicity. We continually hear of people implementing RDA rather than Merge because their requirements were simple and RDA just worked great. Which brings me to Sync Services for ADO.NET.
Sync Services for ADO.NET is our answer to this dilemma of having to choose between the DBA focused, powerful features of Merge, the simplicity of RDA, and the temptation for developers to write their own. When designing Sync Services we used RDA as our user model. Debra Dove and her team did an excellent job with RDA, and we wanted to take it to the next level. We used the provider and developer programming model of ADO.NET and wanted to leverage all the great transport, security and protocol work others were doing. Essentially, rather than build additional solutions to existing problems, we wanted to scope the solution to what we needed to solve, and leverage others who were experts in their field. The Sync Services for ADO.NET features are built by the same team that delivers Merge Replication. I've been truly amazed at the knowledge these guys have. It's because of their experience that I have a hard time calling Sync Services a 1.0 product. It's really a culmination of all the great work from Merge, RDA, file synchronization, and even the WinFS sync work.
The quick bullet for Sync Services is it's a componentized synchronization framework, built on ADO.NET, but factored to provide a common sync platform for Entities, Files, and other formats. Sync Services for ADO.NET is our first delivery in our Microsoft Synchronization Platform.
The key advantage of Sync Services is its developer focus, w/DBA empowerment. Rather than force your DBA to understand all the details of how you're going to synchronize data within your application, you engage the DBA for the portions that touch the server database. The transports are up to you, and we have a great WCF designer integrated story, but you can use other transports as well. The local database schema/structure is up to the developer to decide, as it's their app's database. Of course you can engage your DBA as well, but you aren't stuck with whatever you happen to get as the result of a sync operation. You can use just the client components and use Java and Oracle on the server, or you can use just the server components and other technologies on the client. If your data is locked up in Oracle, you can still use Sync Services to synchronize that data directly from Oracle to your client. You don't have to put SQL Server in the middle just to enable sync.
The point here is regardless of what cards you've been dealt, we want to help you get your app done to make your users happy. The more Microsoft products you use, the better the experience we can provide, but we're not locking you in, or out of our platform just because something is out of your control.
Q: What's the roadmap for Merge, RDA and Sync Services?
A: Merge replication will continue to be our database replication product and will have Katmai investments in the next release. Merge will be the DBA tool for replicating a database, and those that are already using Merge shouldn't feel like we're abandoning them by any means. If Merge is working for you, don't worry, we're continuing to take on feature work and make improvements. In fact, many of the Sync Services features will work their way into Merge as well. That's about all I can say for now. We'll announce more about Katmai later on.
RDA will eventually be phased out. We truly believe the Sync Services features match the simplicity of RDA, but don't have any of the limits imposed by RDA. Specifically, Sync Services adds incremental changes, the ability to synchronize several tables in one transaction while handling all the interleaving of inserts, updates and deletes, support for other ADO.NET providers, etc. If you're already using RDA, we're not killing it yet, but we won't be doing any new work there either. We do expect to deprecate it within the next release or so as Sync Services are released. If you haven't yet deployed RDA, but are looking into it, definitely look at the Sync Services CTP. If you need to go into production now and can't wait until Sync Services are released, then by all means use RDA. It's not like it has so much functionality that you won't be able to easily switch to Sync Services later on. <g>
Sync Services is where we're doing a lot of our investments. It's just the first delivery in our new Microsoft Synchronization Platform. I'll write a different blog post on our naming. (I love discussing naming, ...not). Sync Services are our developer oriented, SOA enabled data synchronization features. We won't have all the database replication features of Merge, as we really focus on synchronizing data. Sync Services provides an end to end story, but is a componentized model allowing you to get in the middle, or completely replace one end.
Q: Can I use Merge and Sync Services together?
A: As we've all seen with our SQLce and Express discussions, one size doesn't fit all. But we also don't believe you need 10 ways to solve the same problem. Just as with Express being our entry point to our Data Service platform and SQLce being our client/embedded platform, we expect Merge and Sync Services to address the needs of two types of scenarios. It's likely that some scenarios, such as a branch office may actually use both Merge and Sync Services. Merge to replicate data between the branch office and the corporate office, and Sync Services to enable branch workers to go out into the field.
Q: Does Sync Services support N Tier?
A: Absolutely. The beauty of the N is that it could imply 1 to many. With Sync Services we have Server and Client Providers. Actually, we like to think of them as local and remote providers as we currently focus on hub/spoke, but we'll be enabling p2p as well in the future. You can start with 2 tier and move to N tier. You can intermix 2 tier and N tier based on where the clients are connecting from. When you first start working with Sync Services you may ask why you have to specify the tables you're synchronizing on both the client and server providers. That's because we want to make sure you can easily split the client and server code to different tiers. All the "intimate" knowledge of the server can easily be moved to the mid tier, while the client provider configuration is maintained on the client. Even the Sync Designer allows you to easily split the client and server provider code. It's quite sweet when you see it. Screen cast coming soon...
Q: What transports do you support?
A: What do you want? As noted above, the sync team wants to get out of the transport business. We have many bright people in Microsoft thinking about those problems. In Orcas the Visual Studio designers will focus on enabling WCF, but you can use Web Services, SSE, SyncML or if you can figure out how to convert Jelly Beans to .NET objects, you can use those as well. Essentially, you simply plug in a matching service and a proxy, and you're good to go.
Q: Do you support column level tracking?
A: Not yet. For this first release, we only support row level tracking. We are working on column level tracking, but it does require additional functionality on the server, and we wanted to make it easy for developers to start synchronizing data from their existing databases. As with any good database design, you should think about how you partition your tables. For various reasons, it's good to separate images, or other large blobs from your primary table into other tables and maintain a 1:1 relationship. It's not meant to be an excuse, and we will be implementing column level tracking in the future. Using features like ADO.NET V3 entities you can roll up the 1:1 mappings into a single object allowing your developers to work in a normal fashion, while managing performance and isolation in your database.
Q: Do you support custom conflict resolvers?
A: Of course. How could we not? On both the server and client providers we expose a conflict event. In that event you'll get the client and server rows that represent the conflict. You can simply say force overwrite, or implement very custom logic, such as determining if the person who made the change is a manager, or the owner of that particular customer account. Based on that custom logic, you simply make the logical change. All this is in managed code, with your favorite .NET language.
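To make the manager example concrete, here is a minimal sketch of that kind of resolver logic. It is written in Python purely for illustration and is not the Sync Services conflict event or any .NET API; the Conflict type, the changed_by column, and the "otherwise the server row wins" fallback are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Conflict:
    """Hypothetical stand-in for the pair of rows handed to a conflict handler."""
    client_row: dict
    server_row: dict


def resolve_conflict(conflict, managers):
    """Return the row that should win.

    Assumed rule for this sketch: a change made by a manager overwrites the
    server row; any other conflicting client change loses to the server row.
    """
    if conflict.client_row["changed_by"] in managers:
        return conflict.client_row
    return conflict.server_row


managers = {"alice"}
conflict = Conflict(
    client_row={"id": 7, "credit_limit": 5000, "changed_by": "alice"},
    server_row={"id": 7, "credit_limit": 3000, "changed_by": "bob"},
)
winner = resolve_conflict(conflict, managers)
print(winner["credit_limit"])
```

The same shape works for an "owner of the account" rule: swap the membership test for a lookup against whatever ownership data your server already has.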
Q: Do you support partitioning/filtering?
A: Of course. We don't really expect people to synchronize terabytes of their data to all their clients. The partitioning is the normal horizontal and vertical partitioning. You simply provide the query that represents the filter you wish to support. You can do joins, etc. It's just a query. The client sends up as many parameter values as you need. There's no limitation on the number of parameters. In fact, you can even intercept calls on the server and set sync parameter values based on other logic. In WCF you can determine who the client is, and based on that info, send them their customers without exposing the SalesPersonId to the client to substitute another value.
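As a rough illustration of setting a filter parameter on the server instead of trusting the client, the sketch below uses Python and an in-memory SQLite database as stand-ins. The table, the SALESPERSON_BY_USER lookup, and the get_customers_for function are hypothetical names invented for this example, not part of Sync Services or WCF.

```python
import sqlite3

# Hypothetical mapping from an authenticated identity to a SalesPersonId.
SALESPERSON_BY_USER = {"kim": 1, "lee": 2}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, region TEXT, sales_person_id INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "Adams", "West", 1), (2, "Baker", "West", 2), (3, "Clark", "East", 1)],
)


def get_customers_for(conn, authenticated_user, client_params):
    """Run the filter query, overriding sales_person_id on the server side."""
    params = {
        # The client may choose its region filter...
        "region": client_params["region"],
        # ...but the sales person is derived from who the caller is,
        # so any sales_person_id the client sent is ignored.
        "sales_person_id": SALESPERSON_BY_USER[authenticated_user],
    }
    return conn.execute(
        "SELECT id, name FROM customers "
        "WHERE sales_person_id = :sales_person_id AND region = :region "
        "ORDER BY id",
        params,
    ).fetchall()


# kim asks for the West region and tries to pass someone else's id.
rows = get_customers_for(conn, "kim", {"region": "West", "sales_person_id": 2})
print(rows)
```

The filter itself stays "just a query"; the only design choice here is which parameter values the service accepts from the caller and which it fills in itself.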
Q: Does the server track each client individually?
A: No. This is one of the major differences when compared to Merge. One of the powerful features of Merge is that it knows all its clients, so when a client connects, it already has the data ready for it, and easily supports "data repartitioning". In Sync Services, the server has no idea who all the clients are. The benefit here is that Sync Services doesn't have the same scalability constraints. You can synchronize as many clients as your server can handle queries for. The server doesn't necessarily know it's a publisher, but rather it's just answering queries.
Q: Does Sync Services support multiple publications?
A: Yes/no. Sync Services doesn't utilize the pub/sub model per se. You can configure the server provider to offer 20 tables you want to synchronize. The client simply says it cares about 3. Another client cares about a different 3. Another client cares about 4, which overlap the first two clients. In fact, we also support a client dynamically adding tables. A sales person may cover for another sales person for a week and need to bring in an additional product line. Within the app, the developer can change the filtering query, and off they go.
Q: How are schema changes handled?
A: Unlike Merge, which is geared around replicating a database, Sync Services is geared around synchronizing data. I'm not a big believer that, generally speaking, the DBA simply adds a column to the server, the UI automatically updates on the client, and life is good. While it can be done, most of the time I'd bet you want some control over where and how the new element is displayed, add some interaction logic to the client, tab order etc. We really treat schema updates as an app update. It's a holistic update of the app overall. The model we've gone with for Sync Services is the following:
- A new requirement is defined, say AddressLine3. The DBA would add the column to the server. All the normal rules apply. If the column is non-nullable, then a default should be provided.
- The developer involved with the sync layer would most likely create a new version of the Sync Service, say v2. This means that apps that were using v1 can be slowly migrated, or at least be migrated within some level of control. If the user is in the middle of an important deal, the last thing they need is a forced software upgrade. Ever been in the middle of something important and IT forces an update that reboots your computer or app? Software is an enabler, it should help me achieve my goals, not fight me because IT thinks it's important now.
- The app developer updates their service proxy to point to v2 of the sync service, exposing the extra column.
- In the version check code, the app author can either choose to reset the table, or they can execute the alter table script locally adding the additional column. They may even bring down a single data call to retrieve the values for the new column on all the existing rows.
- The developer then decides what they want to do with the new element, updating their UI, logic etc.
So, while we didn't implement something as simple as point click, we think it tends to fit the SOA model where apps may consume services from other apps, and they should have control over how and when they consume new schema.
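The version check step in the list above might look roughly like the following. This is a sketch only: it uses Python with SQLite standing in for the local SQLce database, SQLite's user_version pragma as an assumed place to store the local schema version, and a hypothetical address_line3 column for the AddressLine3 example.

```python
import sqlite3

TARGET_SCHEMA_VERSION = 2  # assumed version that introduces address_line3


def upgrade_local_schema(conn):
    """Bring the local database up to the schema expected by the v2 service."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version < TARGET_SCHEMA_VERSION:
        # Option taken here: alter the table in place rather than resetting it.
        conn.execute("ALTER TABLE customers ADD COLUMN address_line3 TEXT")
        conn.execute(f"PRAGMA user_version = {TARGET_SCHEMA_VERSION}")
        conn.commit()


local = sqlite3.connect(":memory:")
local.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, address_line1 TEXT, address_line2 TEXT)"
)

upgrade_local_schema(local)
upgrade_local_schema(local)  # safe to call again; the version check skips it

columns = [row[1] for row in local.execute("PRAGMA table_info(customers)")]
print(columns)
```

After the alter, the existing rows hold NULL in the new column, which is where the one-off data call mentioned in the list would come in to backfill values from the server.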
Q: How are constraints, keys, and other db objects brought down to the client?
A: This again falls in the category of Sync Services being about synchronizing data, not replicating a database. Sync Services does do some schema and even database creation with SQL Server Compact Edition. If you're starting from scratch, the first time you synchronize, the SQLce database will be created based on the connection string properties: name, encryption, password, etc. It will then create all the tables the client has said it's interested in. Remember, just because the server exposes 20 tables doesn't mean the client must use all of them. The client determines which tables it wants to consume with the SyncTable collection. When the tables are created, primary keys are created, data types are mapped to the client's data types, and nullability is applied. No additional indexes, constraints, defaults, etc. are applied. There are SchemaCreating/ed events fired where you can either initially create the schema to be used, or alter the schema after the tables are created.
Q: Does sync services handle parent/child/grandchild relationships?
A: Yes. Unlike RDA, where you can only sync one table at a time, Sync Services handles the hierarchical nesting of inserts, updates and deletes. In fact, you can even control it separately on the server and on the client. On the server, tables are placed in the SyncAdapter collection. The order of the SyncAdapters defines the order in which updates will be applied. Inserts and updates are done from the top down, while deletes are done from the bottom up. The same is done on the client, in the SyncTables collection. This allows the server to control its order, while allowing the client to control its own order of updates.
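The top down / bottom up rule can be sketched in a few lines. The Python below is only a model of the ordering idea, not how Sync Services implements it; the order_changes function is hypothetical, and applying all inserts and updates before any deletes is an assumption of this sketch.

```python
def order_changes(tables, changes):
    """Order (table, operation) pairs for application.

    tables: table names listed parent first, as they would sit in the collection.
    changes: (table, operation) pairs in arbitrary order.
    """
    ordered = []
    # Inserts and updates go top down so parent rows exist before children.
    for table in tables:
        ordered += [c for c in changes if c[0] == table and c[1] in ("insert", "update")]
    # Deletes go bottom up so child rows are removed before their parents.
    for table in reversed(tables):
        ordered += [c for c in changes if c[0] == table and c[1] == "delete"]
    return ordered


tables = ["Customers", "Orders", "OrderDetails"]
changes = [
    ("OrderDetails", "insert"),
    ("Customers", "delete"),
    ("Orders", "insert"),
    ("OrderDetails", "delete"),
    ("Customers", "insert"),
    ("Orders", "delete"),
]
for table, operation in order_changes(tables, changes):
    print(operation, table)
```

Because the client and the server each hold their own ordered collection, each side can run this kind of ordering against its own list without having to agree with the other.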
Q: Can I update only a few tables at a time?
A: Yes. Say you want to only synchronize your lookup tables, states, codes, etc. once a day. You can create a specific service just for your lookups, and another for your product catalog.
Q: Can I update everything in a single operation, or can I control things more granularly?
A: Within the SyncAgent, you can utilize the SyncGroup to determine the grouping of updates. In the previous example, you may choose to put all the lookup tables in their own individual groups. If the connection drops while you're synchronizing your lookups, it can pick up where it left off the next time it syncs. However, when synchronizing Orders, you probably don't want Orders to ever go up/down without OrderDetails. Simply put the Orders and OrderDetails tables in the same SyncGroup, and you're all set.
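One way to picture the Orders/OrderDetails case is to treat each group as an all or nothing unit. The sketch below models that with Python and SQLite transactions; it is an assumption about the observable behavior, not the SyncGroup implementation, and apply_group and the table definitions are hypothetical.

```python
import sqlite3


def apply_group(conn, statements):
    """Apply every statement of one group atomically; report success."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            for sql, params in statements:
                conn.execute(sql, params)
        return True
    except sqlite3.Error:
        return False


local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")
local.execute(
    "CREATE TABLE order_details (order_id INTEGER NOT NULL, product TEXT NOT NULL)"
)

# The detail row is invalid (NULL product), so the order must not land either.
broken_group = [
    ("INSERT INTO orders VALUES (?, ?)", (1, "Adams")),
    ("INSERT INTO order_details VALUES (?, ?)", (1, None)),
]
good_group = [
    ("INSERT INTO orders VALUES (?, ?)", (1, "Adams")),
    ("INSERT INTO order_details VALUES (?, ?)", (1, "Widget")),
]

first_attempt = apply_group(local, broken_group)
orders_after_failure = local.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
second_attempt = apply_group(local, good_group)
orders_after_success = local.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first_attempt, orders_after_failure, second_attempt, orders_after_success)
```

Lookup tables that are each in their own group would simply be separate calls, so a failure in one leaves the others that already completed untouched.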
Q: Does Sync Services support batching for large data sets?
A: Yes, but not quite yet. We initially scoped this out of the first release, but we believe we'll be able to get it in, so look for it sometime around March 07.
Q: How does Sync Services track changes?
A: Sync Services uses an anchor based model. Each time a sync operation occurs, it gets a reference mark from the server. It could be the server's DateTime, or a TimeStamp (RowVersion). The client saves that value for the next sync operation. Each time the client synchronizes a particular SyncGroup, it first requests the new server anchor. It then executes the queries on the server using the last anchor as the low end of the range and the new anchor as the high end. This gets a consistent set of changes across several queries. In future releases of the Microsoft Synchronization Platform we'll be supporting a knowledge based sync model as well as the anchor based model discussed here. Rafik does a great job explaining it in his blog.
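Here is a small sketch of the anchor idea, using Python and SQLite as stand-ins for the server. The row_version column is a hand-maintained integer playing the role of a RowVersion, and get_new_anchor, get_changes and sync are hypothetical helper names, not Sync Services calls.

```python
import sqlite3

server = sqlite3.connect(":memory:")
server.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, row_version INTEGER)"
)
server.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Adams", 1), (2, "Baker", 2)],
)


def get_new_anchor(conn):
    """Ask the server for its current reference mark."""
    return conn.execute(
        "SELECT COALESCE(MAX(row_version), 0) FROM customers"
    ).fetchone()[0]


def get_changes(conn, last_anchor, new_anchor):
    """Rows changed after the last anchor, up to and including the new one."""
    return conn.execute(
        "SELECT id, name FROM customers "
        "WHERE row_version > ? AND row_version <= ? ORDER BY id",
        (last_anchor, new_anchor),
    ).fetchall()


def sync(conn, last_anchor):
    """One download pass: fetch the new anchor first, then query the range."""
    new_anchor = get_new_anchor(conn)
    return get_changes(conn, last_anchor, new_anchor), new_anchor


first_changes, anchor = sync(server, 0)  # initial sync: everything
server.execute(
    "UPDATE customers SET name = 'Baker-Smith', row_version = 3 WHERE id = 2"
)
second_changes, anchor = sync(server, anchor)  # only the row changed since
print(first_changes, second_changes, anchor)
```

Fixing the upper bound before running the queries is what keeps several table queries consistent with each other: a row changed midway through the pass falls above the new anchor and is picked up on the next sync instead.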
Q: How are deletes purged?
A: On the server, deletes are either kept in a tombstone table, or simply tracked by some sort of active/status flag in the primary table. Since this version of Sync Services isn't tightly coupled to SQL Server, we actually don't do anything. In general, we'd expect the DBA to write a scheduled task to purge tombstone records on their determined interval. You can expect us to do more in the "future", tease, tease, tease...
On the client, SQLce purges deleted records once it confirms data has been sent to the server.
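For the server side, the scheduled purge a DBA might write could be as simple as the sketch below. It uses Python and SQLite for illustration only; the customers_tombstone table, its deleted_at column, and the 30 day retention are assumptions. Whatever interval you pick presumably needs to be longer than the longest time a client goes between syncs, or that client would never see the delete.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_tombstones(conn, now, retention_days):
    """Delete tombstone rows older than the retention window; return the count."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:  # run the purge in its own transaction
        cursor = conn.execute(
            "DELETE FROM customers_tombstone WHERE deleted_at < ?", (cutoff,)
        )
    return cursor.rowcount


server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE customers_tombstone (id INTEGER, deleted_at TEXT)")
server.executemany(
    "INSERT INTO customers_tombstone VALUES (?, ?)",
    [
        (1, datetime(2007, 1, 1).isoformat()),   # older than the window
        (2, datetime(2007, 3, 10).isoformat()),  # still inside the window
    ],
)

purged = purge_tombstones(server, now=datetime(2007, 3, 16), retention_days=30)
remaining = [row[0] for row in server.execute("SELECT id FROM customers_tombstone")]
print(purged, remaining)
```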
Q: Can I purge old data on the client without triggering a delete on the server?
A: Yes. While we don't have a simple API to do this today, you can delete a bunch of rows on the client based on whatever criteria you decide, then simply "AcceptChanges" on the client prior to these changes being sent to the server. Of course, you could also intercept these on the server and toss the deletes to protect your server data.
Q: Does Sync Services support low bandwidth type sync scenarios?
A: If by low bandwidth you mean "can I synchronize only the important things now, and catch up later", then yes. You can either upload only, download only, or synchronize just a particular SyncGroup based on your own logic at the time.
Q: When will Sync Services ship?
A: Sync Services for ADO.NET will ship at the same time Visual Studio Orcas ships. This is currently scheduled for Q4 2007. Note: This is not meant to be the official place to get the timeframe for Orcas, but rather just saying that our current plan is to ship Sync Services within SQL Server Compact Edition 3.5, which will ship with Orcas.
Q: Will Sync Services ship on both the desktop framework and the .NET Compact Framework?
A: Yes, but at different times. As of now, March 16th '07, we are scheduled to ship for the full framework, but we are not planning on shipping Sync Services for the device platform in the Orcas product. We do plan to ship the client components for Sync Services soon after Orcas, but are still working out the schedule. The problem is that the various .NET Compact Framework teams, including the Visual Studio for Devices teams and Sync teams, have a lot of work to manage with many different device platforms and a very short schedule, and we haven't been able to get all the appropriate ship level test coverage complete. We have designed, and done preliminary testing with, the client components working and syncing over Web Services. We are still hopeful we can pull it in, but at this point, we're just not ready to commit to Orcas, but rather shortly thereafter.
Q: Will Sync Services or SQL Server Compact Edition be in the .NET Framework, or .NET Compact Framework?
A: No, we are not shipping within either framework, but rather shipping as an add-on component. Why? Because we wanted more flexibility with our ship schedule. SQLce will ship 2-3 times between .NET 2.0 and .NET 3.5. While it would be nice to ride the distribution of the frameworks, with the embedded/private deployment options of SQLce and Sync Services, we felt it was better to have more flexibility with our schedule.
Q: Are these the only questions?
A: I doubt it. So, keep them coming, and I'll update this FAQ as I receive them.
Thanks for all the great questions. Keep them coming as they help us make sure we're shipping the right features in the right order.