CreateXlsx sample program
This post covers the code for a CreateXlsx program that creates a simple Open XML spreadsheet from scratch using the .NET Framework 3.0 packaging API (System.IO.Packaging), as well as two of the Open XML code snippets that are available on MSDN. Full source code for this sample is provided in the attached ZIP file.
This is the second in a series on how to create Open XML documents natively in C# managed code without using the Office client applications. (Here's the DOCX version that I already posted.)
Let's get into the details ...
The sample is a simple console application that takes a filename to be created as the first command-line argument, followed by any number of values to be written into cells. The values to be written into cells must be integers or strings. (Other numeric types are supported by Open XML, of course, but I've kept this example simple.)
The specified document is created and the values are written to it. The resulting spreadsheet is a valid Open XML document, and can be opened by Excel 2007 as shown here. If you drill down into the package (by renaming it to a ZIP file or using a utility such as the Package Explorer), you'll see that it has a workbook with one worksheet, and the various relationships, content types and other parts that are necessary for an Open XML document.
There are five methods in CreateXlsx that provide the core functionality:
- Main - processes the comand-line arguments, calling CreateEmptyXlsx for the filename and WriteCell for each of the values to be written
- CreateEmptyXlsx - creates an empty Open XML spreadsheet
- WriteCell - calls the appropriate XLInset* method (below), based on whether the value is an integer or a string
- XLInsertNumberIntoCell - insert a number (Open XML snippet from MSDN)
- XLInsertStringIntoCell - insert a string (Open XML snippet from MSDN)
The last two methods are inserted from the published Open XML code snippets, and Main and WriteCell are simple wrappers. The real work is done in CreateEmptyXlsx, which creates the SpreadsheetML document.
First we set up some constants for the various namespaces and content types that we need:
Then we create the minimal parts necessaryto define the structure of a SpreadsheetML document. These are created in memory as XmlDocument objects:
Now it's time to create the package itself. We also create a temporary empty part that we'll delete when we're done, which has a content type of application/xml; his makes application/xml the default content type for XML parts. Here's the code:
Now we write the parts (which we created earlier as XmlDocuments) to the package. First we save the "start part," the workbook itself:
Then we do the same for the worksheet, stylesheet, and shared-strings table. Finally, we create the relationships that hold it all together, remove the temporary part that was created to set the default content type, and close the document:
Recommended Architecture. This is not a recommended architecture for generating anything other than simple proof-of-concept documents, because the package is opened and closed for every cell written. Performance would be very bad if you did it this way for a large complex spreadsheet. For that scenario, you'd want to create a package object once and then write to it until you're done and save it once. It's a trivial matter to change this example to work that way, but I wanted to keep this sample very simple, and I also wanted to demonstrate use of the published code snippets, which open and close the file every time.
XmlDocument vs. XmlTextWriter. Another thing you'd probably want to change for a production application is to use XmlTextWriter to create the parts. That will provide much better performance, but I thought XmlDocument provided a cleaner demo for these purposes. Feel free to have your own opinion about this. :-)
Shared Strings Table. This sample creates an empty shared-strings table and puts it in the package, even if there are no strings written by your test case. (I.e., you don't provide any strings as command-line arguments.) I did this so that you can use the XLInsertStringIntoCell snippet to write to the generated spreadsheet if you want to extend the sample and play around with it. (That snippet assumes the existence of a shared-strings table.) Note that if you create a spreadsheet with this utility that has no strings in it, then open and save it from Excel 2007, Excel will remove the empty shared strings table.
Themes. I didn't bother to create a default theme for the spreadsheet. It's not required, but I wanted to mention this because it's the primary difference between a spreadsheet created by this approach and one created by Excel, which always writes a default theme.
.NET Framework 3.0. Since doing the CreateDocx sample, I've upgraded to the latest publicly available version of .NET 3.0, and you should do the same if you're doing Open XML development. You can get it here. (I used the February CTP of WinFX for the CreateDocx sample.)
That's about it. I'll do the CreatePptx sample next, to complete the series, and then I'll post the set of these samples on the OpenXmlDeveloper.org web site. We'll also be putting together a framework for Open XML snippets soon that shows some best practices for optimizing performance in the generation of large complex documents.
Thanks to Mauricio Ordonez and Kevin Boske for their help in working through the details of this sample.
Mon 7/17 11:00AM: modified attached source code to define the relationship namespace as a namespace with prefix "r" rather than simply setting attributes. No change to the output, but this is the preferred approach for handling namespaces in an XmlDocument.