Identifying Open XML Word-Processing Documents with Tracked Revisions

Summary: If you know that an Open XML WordprocessingML document contains no tracked revisions, you can significantly simplify the code that processes it. This article describes how to determine whether a document contains tracked revisions.

Applies to: Office 2010 | Open XML | Visual Studio Tools for Microsoft Office | Word 2007 | Word 2010

In this article
Introduction
Accepting Revisions by Using PowerTools for Open XML
Determining Existence of Tracked Changes
Determining Tracked Changes by Using the Open XML SDK 2.0
Determining Tracked Changes Using LINQ to XML
Conclusion
Additional Resources

Published: May 2010

Provided by: Eric White, Microsoft Corporation

Introduction

Processing tracked changes (also known as tracked revisions) is an important task that you should fully understand when you write Open XML applications. If you accept all tracked revisions first, processing or transforming the WordprocessingML becomes significantly easier.
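To see why tracked revisions complicate processing, consider a minimal sketch (not part of the original sample code) that reads paragraph text the naive way, from runs that are direct children of the paragraph. The class and method names are hypothetical. Because an inserted run is wrapped in a w:ins element, this approach silently skips it.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NaiveTextSketch
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Concatenates the text of runs that are DIRECT children of the paragraph.
    // Runs nested inside revision wrappers such as w:ins are not visited.
    public static string DirectRunText(XElement paragraph)
    {
        return string.Concat(
            paragraph.Elements(w + "r")
                     .Elements(w + "t")
                     .Select(t => t.Value));
    }
}
```

For a paragraph that contains one ordinary run with the text "Hello" followed by a w:ins element that wraps a second run, DirectRunText returns only "Hello". After the revision is accepted, the w:ins wrapper is gone and the same code sees both runs.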

Accepting Revisions by Using PowerTools for Open XML

To review in detail the semantics of the WordprocessingML elements and attributes that hold tracked-change information, see Accepting Revisions in Open XML Word-Processing Documents. In addition, you can download the code sample, RevisionAccepter.zip, from the PowerTools for Open XML project on CodePlex (CodePlex.com/PowerTools). To download it, go to the Downloads tab, and then click RevisionAccepter.zip.

Determining Existence of Tracked Changes

There are other scenarios where you want to process only documents that you know contain no tracked changes but, because of business requirements, you do not want to accept tracked changes automatically. For example, perhaps you have a SharePoint document library that must not contain any documents with tracked changes. Before users add a document to that library, you want them to consciously and intentionally review and resolve all tracked revisions. Accepting revisions automatically as part of check-in would circumvent this manual process, in which each person examines their document and resolves any issues.

As an alternative to accepting revisions with the RevisionAccepter class, you can validate that the document contains no tracked revisions, and refuse to let the document be checked into the document library until all tracked changes have been resolved.

The code is not complex. It defines an array of revision-tracking element names; if any of these elements occurs in any part that can contain tracked revisions, the document contains tracked revisions. A LINQ query determines whether any of the revision-tracking elements exist in the markup. This article presents four versions of the code:

  • Using C# and LINQ to XML.

  • Using C# and the Open XML SDK strongly-typed object model.

  • Using Visual Basic and LINQ to XML.

  • Using Visual Basic and the Open XML SDK strongly-typed object model.
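The core idea can be sketched on an in-memory XML fragment, without opening a package. This is an illustration only, under stated assumptions: the class and method names are hypothetical, the set of names is deliberately abbreviated to four elements, and a HashSet is used in place of the array so that each lookup is a hash probe rather than a linear scan.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RevisionQuerySketch
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Abbreviated, illustrative subset of the revision-tracking element names.
    static readonly HashSet<XName> revisionNames = new HashSet<XName>
    {
        w + "ins",
        w + "del",
        w + "moveFrom",
        w + "moveTo",
    };

    // True if the root element or any of its descendants is a revision element.
    public static bool ContainsRevisions(XElement root)
    {
        return root.DescendantsAndSelf()
                   .Any(e => revisionNames.Contains(e.Name));
    }
}
```

Any() stops at the first match, so a document with revisions near the start of a part is rejected without scanning the rest of that part.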

Determining whether a document contains tracked revisions is slightly more involved than checking a single part, because five kinds of parts in an Open XML WordprocessingML document can contain tracked revisions:

  • Main document part

  • Header parts. There can be multiple header parts for each section. A document can contain multiple sections. Therefore, there may be a fair number of header parts.

  • Footer parts. Again, there can be multiple footer parts for each section.

  • Endnotes part. There is either zero or one endnotes part.

  • Footnotes part. There is either zero or one footnotes part.
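One way to handle these five kinds of parts uniformly is to gather them into a single sequence. The following is a minimal sketch, not the approach used in the listings in this article. The class and method names are hypothetical, and it assumes a project reference to the Open XML SDK. It uses only the MainDocumentPart properties that the listings themselves use.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;

public static class RevisionPartsSketch
{
    // Yields every part of the document that can contain tracked revisions.
    public static IEnumerable<OpenXmlPart> PartsThatCanHoldRevisions(
        WordprocessingDocument doc)
    {
        MainDocumentPart main = doc.MainDocumentPart;
        if (main == null)
            yield break;

        yield return main;

        foreach (HeaderPart header in main.HeaderParts)
            yield return header;

        foreach (FooterPart footer in main.FooterParts)
            yield return footer;

        // The endnotes and footnotes parts are optional (zero or one each).
        if (main.EndnotesPart != null)
            yield return main.EndnotesPart;

        if (main.FootnotesPart != null)
            yield return main.FootnotesPart;
    }
}
```

With such a helper, a document-level check could reduce to PartsThatCanHoldRevisions(doc).Any(PartHasTrackedRevisions), where PartHasTrackedRevisions is the per-part test from the listings in this article and System.Linq is imported.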

After the code presented in this article confirms that a document contains no tracked revisions, your processing code can ignore the elements and attributes that hold revision information. This simplifies processing WordprocessingML.

Determining Tracked Changes by Using the Open XML SDK 2.0

The following two examples use the strongly-typed object model of the Open XML SDK 2.0 for Microsoft Office to determine whether a document contains tracked revisions. The first example is written in C#, and the second in Visual Basic.

To build these examples, download and install the Open XML SDK 2.0. Then, in your project, add a reference to the Open XML SDK assembly and a reference to the WindowsBase assembly.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class Program
{
    public static System.Type[] trackedRevisionsElements = new System.Type[] {
        typeof(CellDeletion),
        typeof(CellInsertion),
        typeof(CellMerge),
        typeof(CustomXmlDelRangeEnd),
        typeof(CustomXmlDelRangeStart),
        typeof(CustomXmlInsRangeEnd),
        typeof(CustomXmlInsRangeStart),
        typeof(Deleted),
        typeof(DeletedFieldCode),
        typeof(DeletedMathControl),
        typeof(DeletedRun),
        typeof(DeletedText),
        typeof(Inserted),
        typeof(InsertedMathControl),
        typeof(InsertedRun),
        typeof(MoveFrom),
        typeof(MoveFromRangeEnd),
        typeof(MoveFromRangeStart),
        typeof(MoveFromRun),
        typeof(MoveTo),
        typeof(MoveToRangeEnd),
        typeof(MoveToRangeStart),
        typeof(MoveToRun),
        typeof(NumberingChange),
        typeof(ParagraphMarkRunPropertiesChange),
        typeof(ParagraphPropertiesChange),
        typeof(RunPropertiesChange),
        typeof(SectionPropertiesChange),
        typeof(TableCellPropertiesChange),
        typeof(TableGridChange),
        typeof(TablePropertiesChange),
        typeof(TablePropertyExceptionsChange),
        typeof(TableRowPropertiesChange),
    };

    public static bool PartHasTrackedRevisions(OpenXmlPart part)
    {
        return part.RootElement.Descendants()
            .Any(e => trackedRevisionsElements.Contains(e.GetType()));
    }

    public static bool HasTrackedRevisions(WordprocessingDocument doc)
    {
        if (PartHasTrackedRevisions(doc.MainDocumentPart))
            return true;
        foreach (var part in doc.MainDocumentPart.HeaderParts)
            if (PartHasTrackedRevisions(part))
                return true;
        foreach (var part in doc.MainDocumentPart.FooterParts)
            if (PartHasTrackedRevisions(part))
                return true;
        if (doc.MainDocumentPart.EndnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.EndnotesPart))
                return true;
        if (doc.MainDocumentPart.FootnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.FootnotesPart))
                return true;
        return false;
    }

    static void Main(string[] args)
    {
        foreach (var documentName in Directory.GetFiles(".", "*.docx"))
        {
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(documentName, false))
            {
                if (HasTrackedRevisions(wordDoc))
                    Console.WriteLine("{0} contains tracked revisions", documentName);
                else
                    Console.WriteLine("{0} does not contain tracked revisions", documentName);
            }
        }
    }
}

' Visual Basic version of the preceding example.
Imports System
Imports System.IO
Imports System.Linq
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing

Module Module1
    Public trackedRevisionsElements As System.Type() = { _
        GetType(CellDeletion), _
        GetType(CellInsertion), _
        GetType(CellMerge), _
        GetType(CustomXmlDelRangeEnd), _
        GetType(CustomXmlDelRangeStart), _
        GetType(CustomXmlInsRangeEnd), _
        GetType(CustomXmlInsRangeStart), _
        GetType(Deleted), _
        GetType(DeletedFieldCode), _
        GetType(DeletedMathControl), _
        GetType(DeletedRun), _
        GetType(DeletedText), _
        GetType(Inserted), _
        GetType(InsertedMathControl), _
        GetType(InsertedRun), _
        GetType(MoveFrom), _
        GetType(MoveFromRangeEnd), _
        GetType(MoveFromRangeStart), _
        GetType(MoveFromRun), _
        GetType(MoveTo), _
        GetType(MoveToRangeEnd), _
        GetType(MoveToRangeStart), _
        GetType(MoveToRun), _
        GetType(NumberingChange), _
        GetType(ParagraphMarkRunPropertiesChange), _
        GetType(ParagraphPropertiesChange), _
        GetType(RunPropertiesChange), _
        GetType(SectionPropertiesChange), _
        GetType(TableCellPropertiesChange), _
        GetType(TableGridChange), _
        GetType(TablePropertiesChange), _
        GetType(TablePropertyExceptionsChange), _
        GetType(TableRowPropertiesChange) }

    Public Function PartHasTrackedRevisions(ByVal part As OpenXmlPart) As Boolean
        Return part.RootElement.Descendants() _
            .Any(Function(e) trackedRevisionsElements.Contains(e.GetType()))
    End Function
     
    Public Function HasTrackedRevisions(ByVal doc As WordprocessingDocument) As Boolean
        If PartHasTrackedRevisions(doc.MainDocumentPart) Then
            Return True
        End If
        For Each part In doc.MainDocumentPart.HeaderParts
            If PartHasTrackedRevisions(part) Then
                Return True
            End If
        Next
        For Each part In doc.MainDocumentPart.FooterParts
            If PartHasTrackedRevisions(part) Then
                Return True
            End If
        Next
        If doc.MainDocumentPart.EndnotesPart IsNot Nothing Then
            If PartHasTrackedRevisions(doc.MainDocumentPart.EndnotesPart) Then
                Return True
            End If
        End If
        If doc.MainDocumentPart.FootnotesPart IsNot Nothing Then
            If PartHasTrackedRevisions(doc.MainDocumentPart.FootnotesPart) Then
                Return True
            End If
        End If
        Return False
    End Function

    Sub Main()
        For Each documentName In Directory.GetFiles(".", "*.docx")
            Using wordDoc As WordprocessingDocument = _
                WordprocessingDocument.Open(documentName, False)
                If HasTrackedRevisions(wordDoc) Then
                    Console.WriteLine("{0} contains tracked revisions", documentName)
                Else
                    Console.WriteLine("{0} does not contain tracked revisions", documentName)
                End If
            End Using
        Next
    End Sub
End Module

Determining Tracked Changes Using LINQ to XML

The following two examples use LINQ to XML to determine whether a document contains tracked revisions. The first example is written in C#, and the second in Visual Basic.

To build these examples, download and install the Open XML SDK 2.0. Then, in your project, add a reference to the Open XML SDK assembly and a reference to the WindowsBase assembly.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument partXDocument = part.Annotation<XDocument>();
        if (partXDocument != null)
            return partXDocument;
        using (Stream partStream = part.GetStream())
        using (XmlReader partXmlReader = XmlReader.Create(partStream))
            partXDocument = XDocument.Load(partXmlReader);
        part.AddAnnotation(partXDocument);
        return partXDocument;
    }
}

public static class W
{
    // The namespace name must match exactly; it uses the http scheme, not https.
    public static XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static XName cellDel = w + "cellDel";
    public static XName cellIns = w + "cellIns";
    public static XName cellMerge = w + "cellMerge";
    public static XName customXmlDelRangeEnd = w + "customXmlDelRangeEnd";
    public static XName customXmlDelRangeStart = w + "customXmlDelRangeStart";
    public static XName customXmlInsRangeEnd = w + "customXmlInsRangeEnd";
    public static XName customXmlInsRangeStart = w + "customXmlInsRangeStart";
    public static XName del = w + "del";
    public static XName delInstrText = w + "delInstrText";
    public static XName delText = w + "delText";
    public static XName ins = w + "ins";
    public static XName moveFrom = w + "moveFrom";
    public static XName moveFromRangeEnd = w + "moveFromRangeEnd";
    public static XName moveFromRangeStart = w + "moveFromRangeStart";
    public static XName moveTo = w + "moveTo";
    public static XName moveToRangeEnd = w + "moveToRangeEnd";
    public static XName moveToRangeStart = w + "moveToRangeStart";
    public static XName numberingChange = w + "numberingChange";
    public static XName pPrChange = w + "pPrChange";
    public static XName rPrChange = w + "rPrChange";
    public static XName sectPrChange = w + "sectPrChange";
    public static XName tblGridChange = w + "tblGridChange";
    public static XName tblPrChange = w + "tblPrChange";
    public static XName tblPrExChange = w + "tblPrExChange";
    public static XName tcPrChange = w + "tcPrChange";
    public static XName trPrChange = w + "trPrChange";
}

class Program
{
    public static XName[] trackedRevisionsElements = new[]
    {
        W.cellDel,
        W.cellIns,
        W.cellMerge,
        W.customXmlDelRangeEnd,
        W.customXmlDelRangeStart,
        W.customXmlInsRangeEnd,
        W.customXmlInsRangeStart,
        W.del,
        W.delInstrText,
        W.delText,
        W.ins,
        W.moveFrom,
        W.moveFromRangeEnd,
        W.moveFromRangeStart,
        W.moveTo,
        W.moveToRangeEnd,
        W.moveToRangeStart,
        W.numberingChange,
        W.pPrChange,
        W.rPrChange,
        W.sectPrChange,
        W.tblGridChange,
        W.tblPrChange,
        W.tblPrExChange,
        W.tcPrChange,
        W.trPrChange,
    };

    public static bool PartHasTrackedRevisions(OpenXmlPart part)
    {
        return part.GetXDocument()
            .Descendants()
            .Any(e => trackedRevisionsElements.Contains(e.Name));
    }

    public static bool HasTrackedRevisions(WordprocessingDocument doc)
    {
        if (PartHasTrackedRevisions(doc.MainDocumentPart))
            return true;
        foreach (var part in doc.MainDocumentPart.HeaderParts)
            if (PartHasTrackedRevisions(part))
                return true;
        foreach (var part in doc.MainDocumentPart.FooterParts)
            if (PartHasTrackedRevisions(part))
                return true;
        if (doc.MainDocumentPart.EndnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.EndnotesPart))
                return true;
        if (doc.MainDocumentPart.FootnotesPart != null)
            if (PartHasTrackedRevisions(doc.MainDocumentPart.FootnotesPart))
                return true;
        return false;
    }

    static void Main(string[] args)
    {
        foreach (var documentName in Directory.GetFiles(".", "*.docx"))
        {
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(documentName, false))
            {
                if (HasTrackedRevisions(wordDoc))
                    Console.WriteLine("{0} contains tracked revisions", documentName);
                else
                    Console.WriteLine("{0} does not contain tracked revisions", documentName);
            }
        }
    }
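}

' Visual Basic version of the preceding example.
Imports System
Imports System.IO
Imports System.Linq
Imports System.Xml
Imports System.Xml.Linq
Imports DocumentFormat.OpenXml.Packaging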

Module Module1
    <System.Runtime.CompilerServices.Extension()> _
    Private Function GetXDocument(ByVal part As OpenXmlPart) As XDocument
        Dim partXDocument As XDocument = part.Annotation(Of XDocument)()
        If partXDocument IsNot Nothing Then
            Return partXDocument
        End If
        Using partStream As Stream = part.GetStream()
            Using partXmlReader As XmlReader = XmlReader.Create(partStream)
                partXDocument = XDocument.Load(partXmlReader)
                part.AddAnnotation(partXDocument)
                Return partXDocument
            End Using
        End Using
    End Function

    Public Class W
        ' The namespace name must match exactly; it uses the http scheme, not https.
        Public Shared w As XNamespace = _
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
        Public Shared cellDel As XName = w + "cellDel"
        Public Shared cellIns As XName = w + "cellIns"
        Public Shared cellMerge As XName = w + "cellMerge"
        Public Shared customXmlDelRangeEnd As XName = w + "customXmlDelRangeEnd"
        Public Shared customXmlDelRangeStart As XName = w + "customXmlDelRangeStart"
        Public Shared customXmlInsRangeEnd As XName = w + "customXmlInsRangeEnd"
        Public Shared customXmlInsRangeStart As XName = w + "customXmlInsRangeStart"
        Public Shared del As XName = w + "del"
        Public Shared delInstrText As XName = w + "delInstrText"
        Public Shared delText As XName = w + "delText"
        Public Shared ins As XName = w + "ins"
        Public Shared moveFrom As XName = w + "moveFrom"
        Public Shared moveFromRangeEnd As XName = w + "moveFromRangeEnd"
        Public Shared moveFromRangeStart As XName = w + "moveFromRangeStart"
        Public Shared moveTo As XName = w + "moveTo"
        Public Shared moveToRangeEnd As XName = w + "moveToRangeEnd"
        Public Shared moveToRangeStart As XName = w + "moveToRangeStart"
        Public Shared numberingChange As XName = w + "numberingChange"
        Public Shared pPrChange As XName = w + "pPrChange"
        Public Shared rPrChange As XName = w + "rPrChange"
        Public Shared sectPrChange As XName = w + "sectPrChange"
        Public Shared tblGridChange As XName = w + "tblGridChange"
        Public Shared tblPrChange As XName = w + "tblPrChange"
        Public Shared tblPrExChange As XName = w + "tblPrExChange"
        Public Shared tcPrChange As XName = w + "tcPrChange"
        Public Shared trPrChange As XName = w + "trPrChange"
    End Class

    Public trackedRevisionsElements As XName() = { _
        W.cellDel, _
        W.cellIns, _
        W.cellMerge, _
        W.customXmlDelRangeEnd, _
        W.customXmlDelRangeStart, _
        W.customXmlInsRangeEnd, _
        W.customXmlInsRangeStart, _
        W.del, _
        W.delInstrText, _
        W.delText, _
        W.ins, _
        W.moveFrom, _
        W.moveFromRangeEnd, _
        W.moveFromRangeStart, _
        W.moveTo, _
        W.moveToRangeEnd, _
        W.moveToRangeStart, _
        W.numberingChange, _
        W.pPrChange, _
        W.rPrChange, _
        W.sectPrChange, _
        W.tblGridChange, _
        W.tblPrChange, _
        W.tblPrExChange, _
        W.tcPrChange, _
        W.trPrChange}

    Public Function PartHasTrackedRevisions(ByVal part As OpenXmlPart) As Boolean
        Return part.GetXDocument() _
            .Descendants() _
            .Any(Function(e) trackedRevisionsElements.Contains(e.Name))
    End Function

    Public Function HasTrackedRevisions(ByVal doc As WordprocessingDocument) As Boolean
        If PartHasTrackedRevisions(doc.MainDocumentPart) Then
            Return True
        End If
        For Each part In doc.MainDocumentPart.HeaderParts
            If PartHasTrackedRevisions(part) Then
                Return True
            End If
        Next
        For Each part In doc.MainDocumentPart.FooterParts
            If PartHasTrackedRevisions(part) Then
                Return True
            End If
        Next
        If doc.MainDocumentPart.EndnotesPart IsNot Nothing Then
            If PartHasTrackedRevisions(doc.MainDocumentPart.EndnotesPart) Then
                Return True
            End If
        End If
        If doc.MainDocumentPart.FootnotesPart IsNot Nothing Then
            If PartHasTrackedRevisions(doc.MainDocumentPart.FootnotesPart) Then
                Return True
            End If
        End If
        Return False
    End Function

    Sub Main()
        For Each documentName In Directory.GetFiles(".", "*.docx")
            Using wordDoc As WordprocessingDocument = _
                WordprocessingDocument.Open(documentName, False)
                If HasTrackedRevisions(wordDoc) Then
                    Console.WriteLine("{0} contains tracked revisions", documentName)
                Else
                    Console.WriteLine("{0} does not contain tracked revisions", documentName)
                End If
            End Using
        Next
    End Sub
End Module

Conclusion

Determining whether an Open XML WordprocessingML document contains tracked revisions enables certain advanced scenarios. You can prevent a document from being processed if it contains tracked revisions, which may be important for your business processes. You can also make sure that a document contains no tracked revisions before you transform it to another format, which significantly simplifies the code that you must write to process word-processing documents.

Additional Resources

To get started with Open XML, see Open XML Developer Center on MSDN. This includes articles, how-to videos, and links to many blog posts. In particular, the following links provide important information to start to work with the Open XML SDK 2.0: