Office Open XML Formats: Inserting Values into Excel 2007 Cells

Summary:  Learn how you can add information to a specific cell in a Microsoft Office Excel 2007 workbook or set of workbooks without having to open Excel.

Office Visual How To

Applies to:  2007 Microsoft Office System, Microsoft Office Excel 2007, Microsoft Visual Studio 2005

Ken Getz, MCW Technologies

May 2007

Overview

Imagine that you have to add information to a specific cell in a Microsoft Office Excel 2007 workbook or set of workbooks. The ability to do this without having to load Excel 2007 and then load the workbooks, one after another, can be a great time saver.

The Office Open XML File Formats make this task possible. Working with the Office Open XML File Formats requires knowledge of the way Excel stores its content, the System.IO.Packaging API, and XML programming.

To help you get started, you can download the 2007 Office System Sample: Open XML File Format Code Snippets for Visual Studio 2005. This download contains a set of forty code examples for Microsoft Visual Studio 2005. The code examples demonstrate various techniques of working with the Office Open XML File Formats.

See It Office Open XML Formats: Inserting a Value splash

Watch the Video

Length: 12:58 | Size: 9.48 MB | Type: WMV file

Code It | Read It | Explore It

Code It

After you install the 2007 Office System Sample: Open XML File Format Code Snippets for Visual Studio 2005, create a sample Excel 2007 workbook to use for testing. (See the Read It section for reference). Create a new Windows Application project in Visual Studio 2005, open the code editor, right-click the project, and select Insert Snippet. Select the Excel: Insert a number into a cell snippet from the list of available code snippets.

If you are using Microsoft Visual Basic, inserting the snippet inserts a reference to WindowsBase.dll and adds the following Imports statements.

Imports System.IO.Packaging
Imports System.Xml
Imports System.IO

If you are using C#, you must add the reference to the WindowsBase.dll assembly and the corresponding using statements so that you can compile the code. (Code snippets in C# cannot set references and insert using statements for you.)

If the Windowsbase.dll reference does not appear on the .NET tab in the Add Reference dialog box, click the Browse tab, locate the C:\Program Files\Reference assemblies\Microsoft\Framework\v3.0 folder, and then click WindowsBase.dll.

The XLInsertNumberIntoCell snippet programmatically goes into the various parts and relations between the parts to set a single numeric cell value. To test it, store your sample workbook somewhere that is easy to find (for example, C:\Test.xlsx). In a Windows application, insert the XLInsertNumberIntoCell snippet, and then call it as shown in the following example, modifying the names to meet your needs. When you are finished, load the workbook to verify that you placed the numeric value into the specified cell.

XLInsertNumberIntoCell("C:\Test.xlsx", "Sheet3", "C3", 14)
XLInsertNumberIntoCell("C:\\Test.xlsx", "Sheet3", "C3", 14);

The code starts with the following block.

Public Function XLInsertNumberIntoCell( _
 ByVal fileName As String, ByVal sheetName As String, _
 ByVal addressName As String, ByVal value As Integer) As Boolean

  Const documentRelationshipType As String = _
   "http://schemas.openxmlformats.org/officeDocument/2006/" & _
   "relationships/officeDocument"
  Const worksheetSchema As String = _
   "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

  ' Retrieve the stream containing the requested
  ' worksheet's info:
  Dim documentPart As PackagePart = Nothing
  Dim documentUri As Uri = Nothing
  Dim returnValue As Boolean = False

  Using xlPackage As Package = Package.Open(fileName, _
      FileMode.Open, FileAccess.ReadWrite)

    ' Get the main document part (workbook.xml).
    For Each relationship As PackageRelationship In _
     xlPackage.GetRelationshipsByType(documentRelationshipType)

      ' There should only be one document part in the package. 
      documentUri = PackUriHelper.ResolvePartUri( _
       New Uri("/", UriKind.Relative), relationship.TargetUri)
      documentPart = xlPackage.GetPart(documentUri)

      ' There should only be one instance, but get out no matter what.
      Exit For
    Next

  ' Code removed here.

  End Using
  Return returnValue
End Function
public bool XLInsertNumberIntoCell(string fileName, 
  string sheetName, string addressName, int value)
{
  //  Given a file, a sheet, and a cell, insert a specified value.
  //  For example: InsertNumberIntoCell("C:\Test.xlsx", "Sheet3", "C3", 14)

  const string documentRelationshipType = 
    "http://schemas.openxmlformats.org/officeDocument/2006/ + 
    "relationships/officeDocument";
  const string worksheetSchema = 
    "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

  //  Retrieve the stream containing the requested
  //  worksheet's info:
  PackagePart documentPart = null;
  Uri documentUri = null;
  bool returnValue = false;

  using (Package xlPackage = 
    Package.Open(fileName, FileMode.Open, FileAccess.ReadWrite))
  {
    //  Get the main document part (workbook.xml).
    foreach (System.IO.Packaging.PackageRelationship relationship 
      in xlPackage.GetRelationshipsByType(documentRelationshipType))
    {
      //  There should only be one document part in the package. 
      documentUri = PackUriHelper.ResolvePartUri(new Uri("/", 
        UriKind.Relative), relationship.TargetUri);
      documentPart = xlPackage.GetPart(documentUri);

      //  There should only be one instance, 
      // but get out no matter what.
      break;
    }

    // Code removed here.

    return returnValue;
  }
}

This code starts by creating constants that it later uses to refer to the various schemas and namespaces that the procedure requires. The code then finds the document part by calling the Package.GetRelationshipsByType method, passing in the constant that contains the document relationship name (see Figure 2). The code then loops through all the returned relationships and retrieves the document URI, relative to the root of the package. You must loop through the PackageRelationship objects to retrieve the one you want. In every case, this loop only executes once.

The following code replaces the "Code removed here" comment in the previous example.

If documentPart IsNot Nothing Then

  ' Load the contents of the workbook.
  Dim doc As XmlDocument = New XmlDocument()
  doc.Load(documentPart.GetStream())

  ' Create a NamespaceManager to handle the default namespace, 
  ' and create a prefix for the default namespace:
  Dim nsManager As New XmlNamespaceManager(doc.NameTable)
  nsManager.AddNamespace("d", doc.DocumentElement.NamespaceURI)

  ' Code removed here.

End If
if (documentPart != null)
{
  // Load the contents of the workbook.
  XmlDocument doc = new XmlDocument();
  doc.Load(documentPart.GetStream());

  // Create a NamespaceManager to handle the default namespace, 
  // and create a prefix for the default namespace:
  XmlNamespaceManager nsManager = 
    new XmlNamespaceManager(doc.NameTable);
  nsManager.AddNamespace("d", doc.DocumentElement.NamespaceURI);

  // Code removed here

}
//  Save the XML back to its part.
xDoc.Save(sheetPart.GetStream(FileMode.Create, FileAccess.Write));

Determining that it found the document part, this code creates an XmlDocument instance to contain the contents of the workbook, loads the XML content, and creates an XmlNamespaceManager instance loaded with the namespace that later code must have to perform searches.

The following code replaces the "Code removed here" comment in the previous example.

Dim searchString As String = _
  String.Format("//d:sheet[@name='{0}']", sheetName)
Dim sheetNode As XmlNode = _
  doc.SelectSingleNode(searchString, nsManager)
If sheetNode IsNot Nothing Then

  ' Get the relId attribute:  
  Dim relationAttribute As XmlAttribute = _
   sheetNode.Attributes("r:id")
  If relationAttribute IsNot Nothing Then
    Dim relId As String = relationAttribute.Value

    Dim sheetRelation As PackageRelationship = _
      documentPart.GetRelationship(relId)
    Dim sheetUri As Uri = PackUriHelper. _
      ResolvePartUri(documentUri, sheetRelation.TargetUri)
    Dim sheetPart As PackagePart = xlPackage.GetPart(sheetUri)

    ' Load the contents of the sheet into an XML document.
    Dim xDoc As New XmlDocument()
    xDoc.Load(sheetPart.GetStream())

  ' Code removed here.

End If

' Save the XML back to its part.
xDoc.Save(sheetPart.GetStream(FileMode.Create, FileAccess.Write))
string searchString = 
  string.Format("//d:sheet[@name='{0}']", sheetName);
XmlNode sheetNode = doc.SelectSingleNode(searchString, nsManager);
if (sheetNode != null)
{
  //  Get the relId attribute:
  XmlAttribute relationAttribute = sheetNode.Attributes["r:id"];
  if (relationAttribute != null)
  {
    string relId = relationAttribute.Value;

    //  First, get the relation between the document and the sheet.
    PackageRelationship sheetRelation = 
      documentPart.GetRelationship(relId);
    Uri sheetUri = PackUriHelper.
      ResolvePartUri(documentUri, sheetRelation.TargetUri);
    PackagePart sheetPart = xlPackage.GetPart(sheetUri);

    //  Load the contents of the sheet into an XML document.
    XmlDocument xDoc = new XmlDocument();
    xDoc.Load(sheetPart.GetStream());

    // Code removed here.
}

This code creates a search string, based on the sheet name that you supplied. It searches the document's XML (see Figure 3) for the sheet name so that it can retrieve the r:id value. Given the r:id value, the code gets the corresponding relationship (see Figure 4), locates the specified sheet, retrieves it by using the GetPart method, and loads the contents of the sheet into a new XmlDocument instance.

The following code replaces the "Code removed here" comment in the previous example.

' Use regular expressions to get the row number.
' If the parameter wasn't well formed, this code
' will fail:
Dim r As New System.Text.RegularExpressions. _
  Regex("^(?<col>\D+)(?<row>\d+)")
Dim rowNumber As String = r.Match(addressName).Result("${row}")

' Search for the existing cell:
Dim cellnode As XmlNode = xDoc.SelectSingleNode( _
 String.Format("//d:sheetData/d:row/d:c[@r='{0}']", addressName), _
 nsManager)
If cellnode Is Nothing Then
  ' Code removed here.
Else

  ' Code removed here.
End If
//  Use regular expressions to get the row number.
//  If the parameter wasn't well formed, this code
//  will fail:
System.Text.RegularExpressions.Regex r = 
  new System.Text.RegularExpressions.Regex(@"^(?<col>\D+)(?<row>\d+)");
string rowNumber = r.Match(addressName).Result("${row}");

//  Search for the existing cell:
XmlNode cellnode = xDoc.SelectSingleNode(
  string.Format("//d:sheetData/d:row/d:c[@r='{0}']", addressName), 
  nsManager);
if (cellnode == null)
{
  // Code removed here.
}
Else
{
  // Code removed here.
}

This code uses a regular expression to parse the address that you specified when you called the procedure (like C3), retrieving the row and column parts of the name. Given the cell address, the code tries to find the existing cell node (which only exists if the cell already contains a value, as shown in Figure 5). If code does not find the cell, it must create it.

The following code replaces the if block's "Code removed here" comment in the previous example, which runs only if the cell did not already exist.

Dim cellElement As XmlElement = _
  xDoc.CreateElement("c", worksheetSchema)
cellElement.Attributes.Append(xDoc.CreateAttribute("r"))
cellElement.Attributes("r").Value = addressName

Dim valueElement As XmlElement = _
  xDoc.CreateElement("v", worksheetSchema)
valueElement.InnerText = value.ToString()
cellElement.AppendChild(valueElement)

' Default style is "0"
cellElement.Attributes.Append(xDoc.CreateAttribute("s"))
cellElement.Attributes("s").Value = "0"

Dim rowNode As XmlNode = xDoc.SelectSingleNode( _
 String.Format("//d:sheetData/d:row[@r='{0}']", rowNumber), nsManager)
If rowNode Is Nothing Then

  ' Code removed here…
Else

  ' Code removed here…

End If
XmlElement cellElement = xDoc.CreateElement("c", worksheetSchema);
cellElement.Attributes.Append(xDoc.CreateAttribute("r"));
cellElement.Attributes["r"].Value = addressName;

XmlElement valueElement = xDoc.CreateElement("v", worksheetSchema);
valueElement.InnerText = value.ToString();
cellElement.AppendChild(valueElement);

//  Default style is "0"
cellElement.Attributes.Append(xDoc.CreateAttribute("s"));
cellElement.Attributes["s"].Value = "0";

XmlNode rowNode = xDoc.SelectSingleNode(string.Format(
  "//d:sheetData/d:row[@r='{0}']", rowNumber), nsManager);
if (rowNode == null)
{
  // Code removed here…
}
Else
{
  // Code removed here…
}

This code creates the cell element and appends the "r" attribute, which contains the cell's address. The code also creates the "v" attribute, which contains the cell's value. Finally, the code creates the cell's "s" attribute, which contains the cell's style information. At this point, the code tries to find the element that corresponds to the row where the cell goes. If the row already exists, the code can insert it. If not, the code must also create the row.

The following example replaces the first "Code removed here…" comment in the previous example.

' Didn't find the row, either. Just add a new row element:
Dim sheetDataNode As XmlNode = _
 xDoc.SelectSingleNode("//d:sheetData", nsManager)
If sheetDataNode IsNot Nothing Then
  Dim rowElement As XmlElement = _
   xDoc.CreateElement("row", worksheetSchema)
  rowElement.Attributes.Append(xDoc.CreateAttribute("r"))
  rowElement.Attributes("r").Value = rowNumber
  rowElement.AppendChild(cellElement)
  sheetDataNode.AppendChild(rowElement)
  returnValue = True
End If
//  Didn't find the row, either. Just add a new row element:
XmlNode sheetDataNode = xDoc.SelectSingleNode("//d:sheetData", nsManager);
if (sheetDataNode != null)
{
  XmlElement rowElement = xDoc.CreateElement("row", worksheetSchema);
  rowElement.Attributes.Append(xDoc.CreateAttribute("r"));
  rowElement.Attributes["r"].Value = rowNumber;
  rowElement.AppendChild(cellElement);
  sheetDataNode.AppendChild(rowElement);
  returnValue = true;
}

If the code did not find the correct row element, it must create the row. This is easy: the code creates the row element, sets the "r" attribute correctly, appends the child cell element, appends the row element as a child element of the sheet, and sets the procedure's return value.

The following code replaces the else block's "Code removed here…" comment in the earlier example. This code is more complex, as you must insert new cell elements in the correct order within the existing row element.

Dim styleAttr As XmlAttribute = _
 CType(rowNode.Attributes.GetNamedItem("s"), XmlAttribute)
If styleAttr IsNot Nothing Then

  ' You know cellElement has an "s" attribute -- you
  ' added it yourself.
  cellElement.Attributes("s").Value = styleAttr.Value
End If

' You must insert the new cell at the correct location.
' Loop through the children, looking for the first cell that is
' beyond the cell you're trying to insert. Insert before that cell.
Dim biggerNode As XmlNode = Nothing
Dim cellNodes As XmlNodeList = _
 rowNode.SelectNodes("./d:c", nsManager)
If cellNodes IsNot Nothing Then
  For Each node As XmlNode In cellNodes
    Dim rowAttr As XmlAttribute = node.Attributes("r")
    If rowAttr IsNot Nothing Then
      If node.Attributes("r").Value > addressName Then
        biggerNode = node
        Exit For
      End If
    End If
  Next
End If
If biggerNode Is Nothing Then
  rowNode.AppendChild(cellElement)
Else
  rowNode.InsertBefore(cellElement, biggerNode)
End If
returnValue = True
XmlAttribute styleAttr = 
  ((XmlAttribute)(rowNode.Attributes.GetNamedItem("s")));
if (styleAttr != null)
{
  //  You know cellElement has an "s" attribute -- you
  //  added it yourself.
  cellElement.Attributes["s"].Value = styleAttr.Value;
}

// You must insert the new cell at the correct location.
// Loop through the children, looking for the first cell that is
// beyond the cell you're trying to insert. Insert before that cell.
XmlNode biggerNode = null;
XmlNodeList cellNodes = rowNode.SelectNodes("./d:c", nsManager);
if (cellNodes != null)
{
  foreach (XmlNode node in cellNodes)
  {
    if (String.Compare(node.Attributes["r"].Value, addressName) > 0)
    {
      biggerNode = node;
      break;
    }
  }
}
if (biggerNode == null)
{
  rowNode.AppendChild(cellElement);
}
else
{
  rowNode.InsertBefore(cellElement, biggerNode);
}
returnValue = true;

This code starts by retrieving the "s" element from the existing row and setting its value in the new cell. This is how cells inherit the row's style information. To insert the cell, the code must loop through existing cells in the row and look for a column value larger than the one in which you are inserting. When the code finds either the final cell in the row or a cell with a larger column, the code inserts the cell into the XML content at the appropriate location.

The following block of code replaces the "Code removed here…" comment in an example you saw earlier. In this case, the code runs if the cell already exists (that is, in the else block for the code that checked to see whether cellnode was a null reference).

cellnode.Attributes.RemoveNamedItem("t")
Dim valueNode As XmlNode = _
 cellnode.SelectSingleNode("d:v", nsManager)
If valueNode Is Nothing Then

  ' Cell with deleted value. Add a value element now.
  valueNode = xDoc.CreateElement("v", worksheetSchema)
  cellnode.AppendChild(valueNode)
End If
valueNode.InnerText = value.ToString()
returnValue = True
cellnode.Attributes.RemoveNamedItem("t");
XmlNode valueNode = cellnode.SelectSingleNode("d:v", nsManager);
if (valueNode == null)
{
  //  Cell with deleted value. Add a value element now.
  valueNode = xDoc.CreateElement("v", worksheetSchema);
  cellnode.AppendChild(valueNode);
}
valueNode.InnerText = value.ToString();
returnValue = true;

This code starts by removing the "t" attribute from the cell. The attribute exists only if the cell currently contains a text value. Because you know that the new cell value is a number, you no longer need this value. The cell might exist, but its value does not (perhaps its value was deleted in Excel). The code handles this scenario by adding a "v" element. Finally, after establishing that the cell exists, and that it contains a "v" element, the code sets the value of the "v" element, setting the value of the cell.

Read It

In this example, you inserted a numeric value into a cell. Inserting a string value requires an extra step: writing the shared string value into the SharedStrings part. The code to do this is just more of the same type of code. To see exactly how it works, investigate the contents of the XLInsertStringIntoCell snippet (in the list of code snippets, select Excel:Insert string into cell).

It is important that you understand the file structure of an Excel workbook so that you can find the data you need. In this case, you are trying to insert the value of a cell. To do that, create an Excel worksheet with a numeric value in it. This example assumes that you named your document Test.xlsx, and that it contains a single numeric value, as shown in Figure 1.

Figure 1. Sample document containing a single numeric value

The sample document contains a single numerical

To investigate the contents of the document, follow these steps:

  1. In Windows Explorer, rename the document "Test.xlsx.zip."

  2. Open the .zip file, using Window Explorer or a ZIP-management application.

  3. View the _rels\.rels file, as shown in Figure 2.

    This document contains information about the relationships between the parts in the document. The value for the workbook.xml part is highlighted in Figure 2. This information enables you to find specific parts.

    Figure 2. Using relations between parts to find specific parts

    In _rels\.rels, You use relations between parts

  4. Open xl\workbook.xml, as shown in Figure 3.

    The highlighted element contains a reference to Sheet3. You will use one of these references to retrieve information about a particular sheet.

    Figure 3. Using r:id attribute to find the worksheet

    In xl\workbook.xml, the document part

  5. Open xl\_rels\workbook.xml.rels, as shown in Figure 4.

    This document contains information about the relations between the document part and all the subsidiary parts. The code example uses this information to find the specified worksheet so that it can retrieve the value of a cell. For example, the sheet whose relationship ID is rId3 refers to worksheets/sheet3.xml.

    Figure 4. Sheet relationships in the workbook.xml.rels file

    Each sheet relationship appears workbook.xml.rels

  6. Open xl\worksheets\sheet3.xml, as shown in Figure 5.

    This part contains the sheet named Sheet3. The code example uses XML-searching techniques to insert cell values in the XML content.

    Figure 5. Finding cells in the sheet part

    In the sheet part, you can find cells

    The workbook displays XML content only for rows that contain data, and only for cells in those rows that contain data. If you are adding data to a cell that already contains content, you can overwrite the existing value. If you are adding content to cells that have never contained data, you must create both the row and cell elements.

  7. Close the tool that you are using to investigate the workbook, and rename the file with an .XLSX file name extension.

Explore It