VSTO 2.0 and Cached Data: Goodbye Hidden Sheets, Hello Server!

Paul Cornell blogs about cached data in VSTO 2.0.


I gotta tell you: cached data is one of the features I’m most excited about in VSTO 2.0. First, some background:


Office programming is different from traditional WinForms programming in that you are writing code that is associated with a document, and that document can move from person to person, from client to server, from server to client, and so on. People don’t typically e-mail WinForms programs around, but they e-mail Office documents constantly. When they e-mail an Office document, they don’t expect to have to mail anything but the document. They don’t want to mail the document and, say, a related config or data file at the same time.


Also, it is assumed that an Office document can be used when you are offline; we often talk about the “on the plane” scenario. People expect Office solutions to work offline, whereas they don’t always expect WinForms or Web applications to.


Finally, a lot of developers want to manipulate the contents of an Office document on the server without starting up Word or Excel. The new Office XML formats give you one way to do this (which is very cool and sexy), but generating Word XML or Excel XML on the server is complex. We wanted to provide developers with another way to populate an Office document on the server that doesn’t require them to learn the Word or Excel XML formats.


The cached data feature in VSTO 2.0 was designed around these ideas: you always have a document associated with your application, documents move around, a document is expected to work offline, and developers want an easier way to generate Office documents on the server.


So what is cached data? Quite simply, cached data lets you embed arbitrary data in a data island that VSTO 2.0 creates and manages for you inside the Office document. We see Excel developers do something similar to this all the time—they often use a “hidden sheet” to store additional data needed by their application.


We are now giving you the equivalent of this “hidden sheet” to store any data you want to associate with your Office document. Some of the data you store in the document will be data bound into the Office document and displayed all the time. Other data you store in the document might not be displayed all the time in the document or might be used for other purposes. For example, you can imagine that you might have a customers data set cached in a document, but you only data bind the customer’s name, address, and phone into the spreadsheet or document. Still, you keep the entire data set cached because you need to have the customer ID or other information.


It is super easy to start using cached data in VSTO 2.0. Here is the simplest code example in VB.NET:


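Imports System.Data
Imports Microsoft.VisualStudio.Tools.Applications.Runtime

Public Class ThisDocument

    ' Marked as cached so the VSTO runtime persists it in the document's data island.
    <Cached()> Public myCachedData As DataSet

    Private Sub ThisDocument_Initialize(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Initialize

        ' Nothing on the first run; restored from the data island on later runs.
        If myCachedData Is Nothing Then

            myCachedData = New DataSet()

            ' Fill the data set from the customer XML file (the file location is assumed here).
            myCachedData.ReadXml("customers.xml")

        End If

    End Sub

End Class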

This is some simple code behind a Word document. In my code, you will notice that I declared a dataset (myCachedData). Three critical things to note:


1) myCachedData is declared as a field. Only fields can be used with the VSTO 2.0 cached data feature—a temporary variable cannot be cached.

2) I declared myCachedData but did not create an instance (that is, I didn’t write <Cached()> Public myCachedData As DataSet = New DataSet()). This is also important, as we will see later.

3) Finally, I added an attribute to my declaration, the “Cached()” attribute. This is a way to mark this field so the VSTO 2.0 runtime knows to cache it in the document's data island.


Now, on to my code to handle the Initialize event (which fires each time the document is opened).


The first thing I do in my event handler is check whether myCachedData Is Nothing. Why would I do this? Well, the first time this document runs, myCachedData will be Nothing, as I would expect. So I create a new instance of the data set and, in this example, fill it from an XML file containing customer information.


Here is what happens next: when the user closes this document, Word detects that the cached data has changed and prompts the user to save the document. When the user saves, the state of my data set (and of any other fields I have marked as cached) is saved into the document along with its visible contents.


The second time the document is run (imagine it is now on an airplane), the VSTO runtime starts up the customization again, but this time it notices that “myCachedData” has been cached in the document’s data island. So before any of my code runs, the VSTO runtime re-creates myCachedData from the data island. Now, when my Initialize code runs the second time and I check whether myCachedData Is Nothing, the check evaluates to False because the runtime has already created myCachedData for me, already filled with the contents of customers.xml that I loaded the first time I ran the document.


So you can imagine how you can use this feature to support offline use. You can refresh the state of myCachedData if you have a connection available. If you don’t, you can just use the last cached version of myCachedData.
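
To make the offline idea concrete, here is a minimal sketch of the refresh-or-fall-back decision. It is plain VB.NET with no VSTO dependencies. The names LoadCustomersFromServer and RefreshOrUseCache, and the online flag, are hypothetical stand-ins that are not part of VSTO; a real solution would call its actual data source instead.

```vbnet
Imports System
Imports System.Data

Module OfflineFallbackSketch

    ' Hypothetical stand-in for a real data source call; it throws when "offline".
    Function LoadCustomersFromServer(ByVal online As Boolean) As DataSet
        If Not online Then Throw New InvalidOperationException("No connection available")
        Dim fresh As New DataSet("Customers")
        Dim table As DataTable = fresh.Tables.Add("Customer")
        table.Columns.Add("Name", GetType(String))
        table.Rows.Add(New Object() {"Fresh Customer"})
        Return fresh
    End Function

    ' Returns fresh data when the source is reachable, otherwise the cached copy.
    Function RefreshOrUseCache(ByVal cached As DataSet, ByVal online As Boolean) As DataSet
        Try
            Return LoadCustomersFromServer(online)
        Catch ex As InvalidOperationException
            Return cached
        End Try
    End Function

    Sub Main()
        Dim cached As New DataSet("CachedCustomers")
        ' Offline: the cached instance comes back unchanged.
        Console.WriteLine(RefreshOrUseCache(cached, False) Is cached)
        ' Online: a freshly loaded data set is returned instead.
        Console.WriteLine(RefreshOrUseCache(cached, True).Tables("Customer").Rows.Count)
    End Sub

End Module
```

In a customization, you could assign the result back to the cached field in the Initialize handler. Note that if nothing has been cached yet and there is no connection, this sketch returns Nothing, so a real solution would need to handle that case.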


Finally, you can imagine how this now enables simple server scenarios. Imagine I’ve taken the myCachedData dataset and databound it to some of the contents of the document. Now what I want to do is fill the myCachedData dataset on the server when I serve this document up to a client on a web request. That way, the document is prepopulated with the data it needs before the user ever gets it. Also, this allows me to access the databases on the server without having to set up a way for the client to reach my data sources. When the user opens the document on the client, the cached data is loaded, data binding pulls the cached data into the document contents, and whammo: cached data populated on the server appears in my document.


On the server, we provide an object model that lets you read or write cached data from the data island without starting up Word or Excel. The code below opens mydoc.doc (without starting Word on the server). It then sets the cached data set “myCachedData”, which is a member of the ThisDocument view class, to a new value by supplying new XML for the data set. The document is then saved and closed. You could run this code on the server to fill a document with data before sending it to the client.


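Dim myServerDocument As New ServerDocument("c:\mydoc.doc")

' The "ThisDocument" host item key is an assumption; it may need to be namespace-qualified.
myServerDocument.CachedData.HostItems("ThisDocument").CachedData("myCachedData").Xml = "<Customers>...</Customers>"

myServerDocument.Save()
myServerDocument.Close()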




Finally, you might ask, what can I cache? Just data sets?


Well, we certainly support data sets and typed data sets. But you can also cache any type that is XML serializable. This means you can create your own custom types (lookup tables, property bags, K-D-B-trees, whatever you want) and cache them in your document. Also, you can populate the contents of these custom types on the server using the ServerDocument class as shown above. Pretty cool, huh?
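
If you are wondering whether one of your own types qualifies, one quick check is to round-trip it through XmlSerializer yourself. The sketch below is plain VB.NET with no VSTO dependencies. RegionLookup and RoundTrip are hypothetical names made up for this illustration, and passing this check is only a rough indicator, not a guarantee of how the VSTO runtime will treat the type.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical custom type: a small lookup table of region codes.
' XmlSerializer needs a public class, a public parameterless constructor,
' and public read/write members.
Public Class RegionLookup
    Public Codes As New List(Of String)
End Class

Module XmlSerializableCheck

    ' Serializes the value to XML text and reads it back into a new instance.
    Function RoundTrip(ByVal value As RegionLookup) As RegionLookup
        Dim serializer As New XmlSerializer(GetType(RegionLookup))
        Dim writer As New StringWriter()
        serializer.Serialize(writer, value)
        Dim reader As New StringReader(writer.ToString())
        Return DirectCast(serializer.Deserialize(reader), RegionLookup)
    End Function

    Sub Main()
        Dim original As New RegionLookup()
        original.Codes.Add("NW")
        original.Codes.Add("SE")
        Dim copy As RegionLookup = RoundTrip(original)
        ' The copy should carry the same codes as the original.
        Console.WriteLine(String.Join(",", copy.Codes.ToArray()))
    End Sub

End Module
```

A type like this could then be declared as a field with the Cached() attribute, following the same pattern as myCachedData earlier in this post.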