Document Serialization and Storage

Microsoft .NET Framework provides a powerful environment for creating and displaying high quality documents. Enhanced features that support both fixed-documents and flow-documents, advanced viewing controls, combined with powerful 2D and 3D graphic capabilities take .NET Framework applications to a new level of quality and user experience. Being able to flexibly manage an in-memory representation of a document is a key feature of .NET Framework, and being able to efficiently save and load documents from a data store is a need of almost every application. The process of converting a document from an internal in-memory representation to an external data store is termed serialization. The reverse process of reading a data store and recreating the original in-memory instance is termed deserialization.

About Document Serialization

Ideally the process of serializing and deserializing a document from and then back into memory is transparent to the application. The application calls a serializer "write" method to save the document, while a deserializer "read" method accesses the data store and recreates the original instance in memory. The specific format that the data is stored in is generally not a concern of the application as long as the serialize and deserialize process recreates the document back to its original form.

Applications often provide multiple serialization options which allow the user to save documents to different medium or to a different format. For example, an application might offer "Save As" options to store a document to a disk file, database, or web service. Similarly, different serializers could store the document in different formats such as in HTML, RTF, XML, XPS, or alternately to a third-party format. To the application, serialization defines an interface that isolates the details of the storage medium within the implementation of each specific serializer. In addition to the benefits of encapsulating storage details, the .NET Framework System.Windows.Documents.Serialization APIs provide several other important features.

Features of .NET Framework 3.0 Document Serializers

  • Direct access to the high-level document objects (logical tree and visuals) enable efficient storage of paginated content, 2D/3D elements, images, media, hyperlinks, annotations, and other support content.

  • Synchronous and asynchronous operation.

  • Support for plug-in serializers with enhanced capabilities:

    • System-wide access for use by all .NET Framework applications.

    • Simple application plug-in discoverability.

    • Simple deployment, installation, and update for custom third-party plug-ins.

    • User interface support for custom run-time settings and options.

XPS Print Path

The Microsoft .NET Framework XPS print path also provides an extensible mechanism for writing documents through print output. XPS serves as both a document file format and is the native print spool format for Windows Vista. XPS documents can be sent directly to XPS-compatible printers without the need for conversion to an intermediate format. See the Printing Overview for additional information on print path output options and capabilities.

Plug-in Serializers

The System.Windows.Documents.Serialization APIs provide support for both plug-in serializers and linked serializers that are installed separately from the application, bind at run time, and are accessed by using the SerializerProvider discovery mechanism. Plug-in serializers offer enhanced benefits for ease of deployment and system-wide use. Linked serializers can also be implemented for partial trust environments such as XAML browser applications (XBAPs) where plug-in serializers are not accessible. Linked serializers, which are based on a derived implementation of the SerializerWriter class, are compiled and linked directly into the application. Both plug-in serializers and linked serializers operate through identical public methods and events which make it easy to use either or both types of serializers in the same application.

Plug-in serializers aid application developers by providing extensibility to new storage designs and file formats without having to code directly for every potential format at build time. Plug-in serializers also benefit third-party developers by providing a standardized means to deploy, install, and update system accessible plug-ins for custom or proprietary file formats.

Using a Plug-in Serializer

Plug-in serializers are simple to use. The SerializerProvider class enumerates a SerializerDescriptor object for each plug-in installed on the system. The IsLoadable property filters the installed plug-ins based on the current configuration and verifies that the serializer can be loaded and used by the application. The SerializerDescriptor also provides other properties, such as DisplayName and DefaultFileExtension, which the application can use to prompt the user in selecting a serializer for an available output format. A default plug-in serializer for XPS is provided with .NET Framework and is always enumerated. After the user selects an output format, the CreateSerializerWriter method is used to create a SerializerWriter for the specific format. The SerializerWriter.Write method can then be called to output the document stream to the data store.

The following example illustrates an application that uses the SerializerProvider method in a "PlugInFileFilter" property. PlugInFileFilter enumerates the installed plug-ins and builds a filter string with the available file options for a SaveFileDialog.

// ------------------------ PlugInFileFilter --------------------------
/// <summary>
///   Gets a filter string for installed plug-in serializers.</summary>
/// <remark>
///   PlugInFileFilter is used to set the SaveFileDialog or
///   OpenFileDialog "Filter" property when saving or opening files
///   using plug-in serializers.</remark>
private string PlugInFileFilter
    {   // Create a SerializerProvider for accessing plug-in serializers.
        SerializerProvider serializerProvider = new SerializerProvider();
        string filter = "";

        // For each loadable serializer, add its display
        // name and extension to the filter string.
        foreach (SerializerDescriptor serializerDescriptor in
            if (serializerDescriptor.IsLoadable)
                // After the first, separate entries with a "|".
                if (filter.Length > 0)   filter += "|";

                // Add an entry with the plug-in name and extension.
                filter += serializerDescriptor.DisplayName + " (*" +
                    serializerDescriptor.DefaultFileExtension + ")|*" +

        // Return the filter string of installed plug-in serializers.
        return filter;

After an output file name has been selected by the user, the following example illustrates use of the CreateSerializerWriter method to store a given document in a specified format.

// Create a SerializerProvider for accessing plug-in serializers.
SerializerProvider serializerProvider = new SerializerProvider();

// Locate the serializer that matches the fileName extension.
SerializerDescriptor selectedPlugIn = null;
foreach ( SerializerDescriptor serializerDescriptor in
                serializerProvider.InstalledSerializers )
    if ( serializerDescriptor.IsLoadable &&
         fileName.EndsWith(serializerDescriptor.DefaultFileExtension) )
    {   // The plug-in serializer and fileName extensions match.
        selectedPlugIn = serializerDescriptor;
        break; // foreach

// If a match for a plug-in serializer was found,
// use it to output and store the document.
if (selectedPlugIn != null)
    Stream package = File.Create(fileName);
    SerializerWriter serializerWriter =
    IDocumentPaginatorSource idoc =
        flowDocument as IDocumentPaginatorSource;
    serializerWriter.Write(idoc.DocumentPaginator, null);
    return true;

Installing Plug-in Serializers

The SerializerProvider class supplies the upper-level application interface for plug-in serializer discovery and access. SerializerProvider locates and provides the application a list of the serializers that are installed and accessible on the system. The specifics of the installed serializers are defined through registry settings. Plug-in serializers can be added to the registry by using the RegisterSerializer method; or if .NET Framework is not yet installed, the plug-in installation script can directly set the registry values itself. The UnregisterSerializer method can be used to remove a previously installed plug-in, or the registry settings can be reset similarly by an uninstall script.

Creating a Plug-in Serializer

Both plug-in serializers and linked serializers use the same exposed public methods and events, and similarly can be designed to operate either synchronously or asynchronously. There are three basic steps normally followed to create a plug-in serializer:

  1. Implement and debug the serializer first as a linked serializer. Initially creating the serializer compiled and linked directly in a test application provides full access to breakpoints and other debug services helpful for testing.

  2. After the serializer is fully tested, an ISerializerFactory interface is added to create a plug-in. The ISerializerFactory interface permits full access to all .NET Framework objects which includes the logical tree, UIElement objects, IDocumentPaginatorSource, and Visual elements. Additionally ISerializerFactory provides the same synchronous and asynchronous methods and events used by linked serializers. Since large documents can take time to output, asynchronous operations are recommended to maintain responsive user interaction and offer a "Cancel" option if some problem occurs with the data store.

  3. After the plug-in serializer is created, an installation script is implemented for distributing and installing (and uninstalling) the plug-in (see above, "Installing Plug-in Serializers").

See Also

Documents in WPF
Printing Overview
XML Paper Specification: Overview