Working with Numbered Lists in Open XML WordprocessingML

Summary: Learn about lists in Open XML. Word 2010 documents often contain numbered and bulleted lists. This area of WordprocessingML is justifiably complex. Numbered lists and bulleted lists have many features, each used by a different set of users.

Applies to: Office 2010 | Open XML | Visual Studio Tools for Microsoft Office | Word 2007 | Word 2010

In this article
Overview
Markup for Simple Numbered Lists
Markup for Simple Bulleted Lists
Markup for Multi-Level Lists
Markup for Numbering that is Linked to Styles
Overriding Number Formats in Multi-Level Lists with New Number Formats
Overriding Number Formats in Multi-Level Lists, Creating Multi-Level List Formats
Processing the w:startOverride Element
Defining List Styles
Alternatives for List Item Suffixes
Processing the w:lvlRestart Element
Processing the w:isLgl Element
Algorithm to Assemble List Item Text
Conclusion
Additional Resources

Published: March 2010

Provided by: Eric White, Microsoft Corporation

Overview

When implementing a conversion of Open XML word processing documents to HTML, one of the interesting issues is accurately converting numbered and bulleted lists. You must write specific code to process them, because they affect the text that the document contains, but that text is not directly in the markup. If you are accurately extracting the text of the document, you must process some elements and attributes to assemble the correct text.

Note

This article applies to both Microsoft Word 2010 and Microsoft Office Word 2007.

Numbered items and bulleted lists are complex, and justifiably so. There are many features of numbered and bulleted lists, each used by a different set of users. These features are represented by elements in the markup. However, you do not have to pay attention to all elements. Some aspects of the markup are there only to affect the user interface, and you do not have to pay attention to those elements when you determine the textual representation of a numbered or bulleted item. This article presents just the essentials that you must know to work with numbered and bulleted items.

The easiest way to talk about numbering markup is to relate the markup to the user interface of Word. You are concerned only with the document modifications that you can make with these three buttons:

Figure 1. Toolbar buttons for numbered lists

Numbering Buttons

There is a fair amount of indirection for WordprocessingML numbering markup. There are three patterns for this indirection in the markup:

  1. Direct numbering for simple numbered or bulleted lists

  2. Style-based numbering; for example, Heading1 is at the first level of indentation and Heading2 is at the second level of indentation

  3. Named numbering styles

This article examines each of these patterns.

Markup for Simple Numbered Lists

The following figure shows the markup that is generated if you create a simple numbered list.

Figure 2. Creating a simple numbered list

Create Simple Bulleted List

To discuss the markup accurately, we must differentiate between two aspects of numbered or bulleted lists. For each numbered or bulleted item, there are two components: the list item, and the paragraph text.

Figure 3. Bulleted list items

List Items - Bullets

We must differentiate between these items because the list item is what we need to assemble for each paragraph. In addition, there is separate formatting markup for the list item. The same difference applies to a numbered list.

Figure 4. Numbered list items

List Items - Text

The following diagram shows the indirection for a simple numbered or bulleted list.

Figure 5. Indirection for a simple numbered or bulleted list

Simple numbering indirection

Next, examine the markup for the following simple numbered list.

Figure 6. Simple numbered list

Simple Numbered List

The markup in the main document part looks as follows.

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="0"/>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>Paragraph one.</w:t>
  </w:r>
</w:p>

The w:numPr element contains the numbering elements that we are concerned with. The w:ilvl element is a zero-based number that indicates the indentation level. The numbered items are at the least indented level, so the value of w:ilvl is zero. The w:numId element is an index into the w:num elements, which are located in the numbering part.

Important

The w:numId element can contain a value of 0, a special value that indicates that numbering was removed at this level of the style hierarchy. While processing this markup, if the w:val attribute is "0", the paragraph does not have a list item.

The numbering part markup looks as follows.

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <!-- These affect the user interface. We can ignore them. -->
    <w:nsid w:val="5FE17486"/>
    <w:multiLevelType w:val="hybridMultilevel"/>
    <w:tmpl w:val="1084E0BA"/>
    <w:lvl w:ilvl="0"
           w:tplc="0409000F">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:lvlText w:val="%1."/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="720"
               w:hanging="360"/>
      </w:pPr>
    </w:lvl>
    <w:lvl w:ilvl="1"
           w:tplc="04090019"
           w:tentative="1">
      <w:start w:val="1"/>
      <w:numFmt w:val="lowerLetter"/>
      <w:lvlText w:val="%2."/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="1440"
               w:hanging="360"/>
      </w:pPr>
    </w:lvl>
    <!-- a number of other w:lvl elements elided -->
  </w:abstractNum>
  <w:num w:numId="1">
    <w:abstractNumId w:val="0"/>
  </w:num>
</w:numbering>

The index in the w:numPr element in the main document part refers to the w:num element. In this case, that element contains a single element, w:abstractNumId, which refers to the w:abstractNum element that appears before it. The w:abstractNum element contains the required information to format the list item. Note that the w:abstractNum element does not contain formatting information for the paragraph (except for indentation information, which pertains to both the list item and the paragraph). Formatting for the paragraph itself is stored in the main document and styles part, as usual. There are three elements (the w:nsid element, the w:multiLevelType element, and the w:tmpl element) that only affect the user interface in the word processing application, and you can ignore them.
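
The indirection described above can be sketched in a few lines of code. The following is a minimal sketch in Python (my choice for illustration; any language with an XML parser works), using only the standard library. The function name find_lvl and the helper w are hypothetical, and the sample numbering part is a trimmed copy of the listing above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def w(name):
    """Return the namespace-qualified name for a w: element or attribute."""
    return "{%s}%s" % (W, name)

NUMBERING_XML = """
<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:lvlText w:val="%1."/>
    </w:lvl>
    <w:lvl w:ilvl="1">
      <w:start w:val="1"/>
      <w:numFmt w:val="lowerLetter"/>
      <w:lvlText w:val="%2."/>
    </w:lvl>
  </w:abstractNum>
  <w:num w:numId="1">
    <w:abstractNumId w:val="0"/>
  </w:num>
</w:numbering>
"""

def find_lvl(numbering, num_id, ilvl):
    """Follow w:numId -> w:num -> w:abstractNum -> w:lvl; None if unresolved."""
    if num_id == "0":  # special value: numbering was removed, no list item
        return None
    for num in numbering.findall("w:num", NS):
        if num.get(w("numId")) != num_id:
            continue
        abstract_id = num.find("w:abstractNumId", NS).get(w("val"))
        for abstract in numbering.findall("w:abstractNum", NS):
            if abstract.get(w("abstractNumId")) != abstract_id:
                continue
            for lvl in abstract.findall("w:lvl", NS):
                if lvl.get(w("ilvl")) == ilvl:
                    return lvl
    return None

numbering = ET.fromstring(NUMBERING_XML)
lvl = find_lvl(numbering, "1", "1")
print(lvl.find("w:numFmt", NS).get(w("val")))  # lowerLetter
```

This sketch ignores overrides and style links, which later sections describe.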

Note

The remainder of this article eliminates these elements from markup listings.

The w:lvl elements and the w:lvl child elements define the formatting for each indentation level of the list items. There are additional w:lvl elements (removed from the listing) that define the formatting for list items for each level at level two and higher. The w:tplc attribute and w:tentative attribute of the w:lvl element are there for the user interface only. I'll remove them from additional listings.

Tip

The indirection from the w:num element to the w:abstractNum element allows for markup that overrides formatting of list items. This markup follows later in the article.

The elements that you are really interested in are children of the w:lvl element: the w:start, w:numFmt, w:lvlText, and w:lvlJc elements, and the complex element w:pPr.

The w:start element specifies the starting number for the indentation level. Do you start to count at 0, 1, or some other number? Negative start numbers are not allowed (and are not very useful).

In this example, the w:numFmt element indicates that the document uses a decimal in the list item for indentation level 0, and that the document uses a lowercase letter for indentation level 1. There are a fair number of options for this element:

  • Bullets

  • Decimal (1, 2, 3)

  • Decimal Zero (01, 02, 03)

  • Upper Roman (I, II, III)

  • Lower Roman (i, ii, iii)

  • Upper Letter (A, B, C)

  • Lower Letter (a, b, c)

  • Ordinal (1st, 2nd, 3rd)

  • Cardinal Text (One, Two, Three)

  • Ordinal Text (First, Second, Third)

Note

There are other options for other languages that are not included. Various Asian languages have numbering systems that are not included in this list.
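
To make the options concrete, here is a rough Python sketch that converts a level number to text for several of the w:numFmt values in the list above. The function names are hypothetical. The sketch is English-only, it omits cardinalText and ordinalText, and it assumes that letter formats continue after Z by repeating the letter (AA, BB, and so on); verify that against the documents you process.

```python
def to_roman(n):
    """Convert a positive integer to an uppercase Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)

def to_letter(n):
    """1 -> A, 26 -> Z, 27 -> AA, 28 -> BB (assumed: the letter repeats)."""
    return chr(ord("A") + (n - 1) % 26) * ((n - 1) // 26 + 1)

def to_ordinal(n):
    """English ordinal such as 1st, 2nd, 3rd, 4th, 11th, 21st."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "%d%s" % (n, suffix)

FORMATTERS = {
    "decimal": str,
    "decimalZero": lambda n: "%02d" % n,
    "upperRoman": to_roman,
    "lowerRoman": lambda n: to_roman(n).lower(),
    "upperLetter": to_letter,
    "lowerLetter": lambda n: to_letter(n).lower(),
    "ordinal": to_ordinal,
}

def format_number(n, num_fmt):
    """Format n for a w:numFmt value; unknown formats fall back to decimal."""
    return FORMATTERS.get(num_fmt, str)(n)

for fmt in ("decimal", "decimalZero", "upperRoman", "lowerLetter", "ordinal"):
    print(fmt, format_number(3, fmt))
```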

You use the w:lvlText element as a template to construct the list item. For level 0, this example uses "%1.", which indicates that whatever w:numFmt text was determined for the least indented item replaces the %1 in the format string. For level 1, this example uses "%2.", which indicates that whatever text was determined for the first indentation replaces the %2. Note that for the w:lvl elements, you specify the indentation level with a zero-based index, whereas in the w:lvlText template, you specify replacement tokens using a one-based index. This makes sense. The w:lvl elements are used only by developers, and zero-based indexes are easier for developers, whereas the template is used and specified by users, so a one-based index is more natural there. These two elements are very powerful, and enable you to create hierarchical list formats to satisfy almost any need. More examples of these two elements appear later in the article.

Important

For bulleted lists, even if the w:lvlText element contains replacement tokens (such as %1 and %2), the replacement tokens are not replaced with the level number. This does not usually affect our code, because the w:lvlText element for a bulleted list normally does not contain replacement tokens.
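
The template rules can be sketched as follows in Python. The function name expand_lvl_text is hypothetical, and the caller is assumed to supply the already-formatted text for each level, least indented first. The bullet rule is applied by returning the template unchanged.

```python
import re

def expand_lvl_text(lvl_text, formatted_numbers, num_fmt="decimal"):
    """Replace %1..%9 in a w:lvlText template with per-level text.

    formatted_numbers[0] is the text for w:ilvl 0, which the template
    calls %1 (tokens are one-based, levels are zero-based).
    """
    if num_fmt == "bullet":
        return lvl_text  # tokens in bulleted lists are never replaced

    def replace(match):
        index = int(match.group(1)) - 1
        if index < len(formatted_numbers):
            return formatted_numbers[index]
        return ""  # token for a level that has no number available

    return re.sub(r"%([1-9])", replace, lvl_text)

print(expand_lvl_text("%1.", ["3"]))            # 3.
print(expand_lvl_text("%1.%2)", ["2", "b"]))    # 2.b)
print(expand_lvl_text("%1", ["3"], "bullet"))   # %1
```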

The w:lvlJc element controls whether the list items are right-justified or left-justified and other options. The following shows the difference between left justification and right justification for numbered items that use text. It is easy to see in this example, but is more difficult to notice when you use bulleted lists or numbered lists that use digits.

Figure 6. Justified list items

List Item Justification

Finally, there are paragraph properties that apply to the list item, such as specifying the indentation and hanging indentation.

The w:lvl element can also contain run properties that apply to the list item. If you change the font of the list item to Courier, it looks as follows.

Figure 7. List items formatted with the courier font

List Item with Font Set

Here is an example of the w:lvl element that contains run properties that define the font for the list item.

<w:lvl w:ilvl="0">
  <w:start w:val="1"/>
  <w:numFmt w:val="ordinalText"/>
  <w:lvlText w:val="%1)"/>
  <w:lvlJc w:val="left"/>
  <w:pPr>
    <w:ind w:left="720"
           w:hanging="360"/>
  </w:pPr>
  <w:rPr>
    <w:rFonts w:ascii="Courier New"
              w:hAnsi="Courier New"
              w:hint="default"/>
  </w:rPr>
</w:lvl>

As usual, the paragraph in the main document part also contains a reference to a style.

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="0"/>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>One</w:t>
  </w:r>
</w:p>

We can find that style in the styles part, which specifies the formatting for the paragraph text.

<w:style w:type="paragraph"
         w:styleId="ListParagraph">
  <w:name w:val="List Paragraph"/>
  <w:basedOn w:val="Normal"/>
  <w:pPr>
    <w:ind w:left="720"/>
    <w:contextualSpacing/>
  </w:pPr>
</w:style>

This is the typical pattern for markup for styled paragraphs.

Markup for Simple Bulleted Lists

This section describes the markup that is generated if you create a simple bulleted list as shown in the following figure.

Figure 8. Creating a simple bulleted list

Bulleted List

The markup indirection pattern for a simple bulleted list is the same as the pattern for a simple numbered list.

For the simple bulleted list, the main document part (document.xml) contains paragraphs that are identical to the simple numbered list.

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="0"/>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>One</w:t>
  </w:r>
</w:p>

Here is an example of the numbering part.

<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0">
      <w:start w:val="1"/>
      <w:numFmt w:val="bullet"/>
      <w:lvlText w:val="o"/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="720"
               w:hanging="360"/>
      </w:pPr>
      <w:rPr>
        <w:rFonts w:ascii="Symbol"
                  w:hAnsi="Symbol"
                  w:hint="default"/>
      </w:rPr>
    </w:lvl>
    <!-- several w:lvl elements elided -->
  </w:abstractNum>
  <w:num w:numId="1">
    <w:abstractNumId w:val="0"/>
  </w:num>
</w:numbering>

The w:val attribute of the w:lvlText element here is saved as the character 0xB7, which is the bullet symbol for this level (from the Symbol font). The main point about this example is that you can write a generalized method to assemble text for the list item, and this method works for both numbered lists and bulleted lists.

The w:start element does not affect the list item because the w:lvlText template element does not contain a replacement token (such as %1 and %2). But even if it did, we would not replace the token.

Markup for Multi-Level Lists

You can see that this framework easily allows for multi-level lists. You can change the source document to look like the following figure.

Figure 9. A simple multi-level list

List with Hierarchy

The paragraphs in the main document part look as follows. The w:ilvl element for the second paragraph is set to "1".

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="0"/>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>One</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="1"/>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>Two</w:t>
  </w:r>
</w:p>

The numbering and style parts are not modified in this example.

Markup for Numbering that is Linked to Styles

There is a feature of numbering that is very convenient for users but makes the markup a little more involved. You can link a style to a numbering type and level, and all paragraphs of that style are then represented with a list item. The canonical example is linking the Heading1 style to the first level of indentation and the Heading2 style to the second level. A document that uses this approach looks like the following figure.

Figure 10. Numbering that is linked to styles

Styles with Numbering

The markup indirection pattern for numbering that is linked to styles looks as follows.

Figure 11. Indirection for numbering that is linked to styles

Styled numbering indirection

In this case, the main document part does not contain any markup related to numbering.

<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading1"/>
  </w:pPr>
  <w:r>
    <w:t>Overview of Numbering</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading2"/>
  </w:pPr>
  <w:r>
    <w:t>Markup of a Simple Numbered List</w:t>
  </w:r>
</w:p>

The markup in the style part for the Heading1 style contains a reference to the w:num element in the numbering part.

<w:style w:type="paragraph"
         w:styleId="Heading1">
  <w:name w:val="heading 1"/>
  <w:basedOn w:val="Normal"/>
  <w:pPr>
    <w:numPr>
      <w:numId w:val="1"/>
    </w:numPr>
    <w:spacing w:before="480"
               w:after="0"/>
    <w:outlineLvl w:val="0"/>
  </w:pPr>
  <w:rPr>
    <w:rFonts w:asciiTheme="majorHAnsi"
              w:eastAsiaTheme="majorEastAsia"
              w:hAnsiTheme="majorHAnsi"
              w:cstheme="majorBidi"/>
    <w:b/>
    <w:bCs/>
    <w:color w:val="365F91"
             w:themeColor="accent1"
             w:themeShade="BF"/>
    <w:sz w:val="28"/>
    <w:szCs w:val="28"/>
  </w:rPr>
</w:style>

The numbering part looks as follows:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:pStyle w:val="Heading1"/>
      <w:lvlText w:val="%1"/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="432"
               w:hanging="432"/>
      </w:pPr>
    </w:lvl>
    <w:lvl w:ilvl="1">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:pStyle w:val="Heading2"/>
      <w:lvlText w:val="%1.%2"/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="576"
               w:hanging="576"/>
      </w:pPr>
    </w:lvl>
    <!-- Remaining w:lvl elements elided -->
  </w:abstractNum>
  <w:num w:numId="1">
    <w:abstractNumId w:val="0"/>
  </w:num>
</w:numbering>

The w:pStyle element links back to the style in the style part. You determine the indentation level by finding the w:pStyle element that matches the style of the paragraph. The parent w:lvl complex element defines the indentation level. As usual, you assemble the text of the list item from the other child elements of the w:lvl element.
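
As a hedged illustration of this lookup, the following Python sketch starts from a paragraph style, follows the w:numId in the style to the w:num element, and then searches the referenced abstract numbering format for the w:lvl element whose w:pStyle matches. The function name lvl_for_style is hypothetical, and the sample parts are trimmed to the elements that matter here.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def w(name):
    return "{%s}%s" % (W, name)

STYLES = ET.fromstring("""
<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="paragraph" w:styleId="Heading2">
    <w:pPr><w:numPr><w:numId w:val="1"/></w:numPr></w:pPr>
  </w:style>
</w:styles>
""")

NUMBERING = ET.fromstring("""
<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0">
      <w:pStyle w:val="Heading1"/>
      <w:lvlText w:val="%1"/>
    </w:lvl>
    <w:lvl w:ilvl="1">
      <w:pStyle w:val="Heading2"/>
      <w:lvlText w:val="%1.%2"/>
    </w:lvl>
  </w:abstractNum>
  <w:num w:numId="1"><w:abstractNumId w:val="0"/></w:num>
</w:numbering>
""")

def lvl_for_style(styles, numbering, style_id):
    """Return (indentation level, w:lvl element) for a numbered style, or None."""
    style = next((s for s in styles.findall("w:style", NS)
                  if s.get(w("styleId")) == style_id), None)
    if style is None:
        return None
    num_id_el = style.find("w:pPr/w:numPr/w:numId", NS)
    if num_id_el is None or num_id_el.get(w("val")) == "0":
        return None  # the style has no numbering
    num = next((n for n in numbering.findall("w:num", NS)
                if n.get(w("numId")) == num_id_el.get(w("val"))), None)
    if num is None:
        return None
    abstract_id = num.find("w:abstractNumId", NS).get(w("val"))
    for abstract in numbering.findall("w:abstractNum", NS):
        if abstract.get(w("abstractNumId")) != abstract_id:
            continue
        for lvl in abstract.findall("w:lvl", NS):
            p_style = lvl.find("w:pStyle", NS)
            if p_style is not None and p_style.get(w("val")) == style_id:
                return int(lvl.get(w("ilvl"))), lvl
    return None

ilvl, lvl = lvl_for_style(STYLES, NUMBERING, "Heading2")
print(ilvl, lvl.find("w:lvlText", NS).get(w("val")))  # 1 %1.%2
```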

Overriding Number Formats in Multi-Level Lists with New Number Formats

You can override the numbering format for any level in a multi-level list by using a new number format. This affects the markup. To create this markup, first create a multi-level list:

Figure 12. Create a multi-level list

Hierarchical List Style

Then select a paragraph, and define a new number format:

Figure 13. Override the number format

Define new number format

In the Define new Number Format dialog box, change some aspect of the format. In this example, we changed Heading2 so that the list item is the cardinal number followed by a parenthesis: "One)".

Figure 14. Define New Number Format dialog box

Ordinal number format

The markup for the main document part now looks as follows:

<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading1"/>
  </w:pPr>
  <w:r>
    <w:t>Paragraph One</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading2"/>
    <w:numPr>
      <w:ilvl w:val="1"/>
      <w:numId w:val="2"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>Paragraph Two</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading3"/>
  </w:pPr>
  <w:r>
    <w:t>Paragraph Three</w:t>
  </w:r>
</w:p>

The markup in the numbering part now has two w:num elements, each pointing to its own abstract numbering format:

<w:num w:numId="1">
  <w:abstractNumId w:val="1"/>
</w:num>
<w:num w:numId="2">
  <w:abstractNumId w:val="0"/>
</w:num>

The abstract numbering markup can then be processed as usual. The main point here is that when you format numbering using styles, the markup follows the indirection detailed in this section. If a paragraph of that style contains a w:numPr element, this overrides the link from the abstract numbering format to the style.

Overriding Number Formats in Multi-Level Lists, Creating Multi-Level List Formats

The user can override the numbering format for any level in a multi-level list, creating a multi-level list format, and this affects the markup. To create this markup, first create a multi-level list.

Figure 15. Create a multi-level list

Current list

Then select a paragraph, and define a new multi-level list.

Figure 16. Define a new multi-level list

Define new multi-level list dialog box

In the Define new multilevel list dialog box, change some aspect of the list level. My example changes the format so that list items at the second level of indentation use uppercase alpha for the item marker.

Figure 17. Define new multi-level list dialog box

Define new multi-level list dialog box 2

The markup for the main document part contains a w:numPr element that overrides the link from the abstract numbering format to the style, exactly as it does when you define a new number format.

<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading1"/>
  </w:pPr>
  <w:r>
    <w:t>Paragraph One</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading2"/>
    <w:numPr>
      <w:ilvl w:val="1"/>
      <w:numId w:val="3"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>Paragraph Two</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Heading3"/>
  </w:pPr>
  <w:r>
    <w:t>Paragraph Three</w:t>
  </w:r>
</w:p>

The w:numPr element in the paragraph properties overrides the link between the abstract numbering format and the Heading2 style. The numbering part now contains a new w:num element that defines w:lvlOverride elements.

<w:num w:numId="1">
  <w:abstractNumId w:val="0"/>
</w:num>
<w:num w:numId="2">
  <w:abstractNumId w:val="1"/>
</w:num>
<w:num w:numId="3">
  <w:abstractNumId w:val="0"/>
  <w:lvlOverride w:ilvl="0">
    <w:lvl w:ilvl="0">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:pStyle w:val="Heading1"/>
      <w:lvlText w:val="%1"/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="432"
               w:hanging="432"/>
      </w:pPr>
      <w:rPr>
        <w:rFonts w:hint="default"/>
      </w:rPr>
    </w:lvl>
  </w:lvlOverride>
  <w:lvlOverride w:ilvl="1">
    <w:lvl w:ilvl="1">
      <w:start w:val="1"/>
      <w:numFmt w:val="upperLetter"/>
      <w:pStyle w:val="Heading2"/>
      <w:lvlText w:val="%1.%2"/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="576"
               w:hanging="576"/>
      </w:pPr>
      <w:rPr>
        <w:rFonts w:hint="default"/>
      </w:rPr>
    </w:lvl>
  </w:lvlOverride>
  . . .

Two of the w:num elements (w:numId="1" and w:numId="3") now refer to the same abstract numbering format. However, the w:num element with w:numId="3" overrides the list item formatting. You can see that its level override for level 1 specifies that the value for the w:numFmt element is "upperLetter".
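
A minimal Python sketch of this lookup follows. It prefers a w:lvl element inside a matching w:lvlOverride element and otherwise falls back to the abstract numbering format. The function name effective_lvl is hypothetical, and the sample numbering part is a trimmed version of the markup above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def w(name):
    return "{%s}%s" % (W, name)

NUMBERING = ET.fromstring("""
<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0"><w:numFmt w:val="decimal"/></w:lvl>
    <w:lvl w:ilvl="1"><w:numFmt w:val="decimal"/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="1"><w:abstractNumId w:val="0"/></w:num>
  <w:num w:numId="3">
    <w:abstractNumId w:val="0"/>
    <w:lvlOverride w:ilvl="1">
      <w:lvl w:ilvl="1"><w:numFmt w:val="upperLetter"/></w:lvl>
    </w:lvlOverride>
  </w:num>
</w:numbering>
""")

def effective_lvl(numbering, num_id, ilvl):
    """Return the w:lvl that applies for (numId, ilvl), honoring overrides."""
    num = next((n for n in numbering.findall("w:num", NS)
                if n.get(w("numId")) == num_id), None)
    if num is None:
        return None
    # An override that carries its own w:lvl wins over the abstract format.
    for override in num.findall("w:lvlOverride", NS):
        if override.get(w("ilvl")) == ilvl:
            lvl = override.find("w:lvl", NS)
            if lvl is not None:
                return lvl
    abstract_id = num.find("w:abstractNumId", NS).get(w("val"))
    for abstract in numbering.findall("w:abstractNum", NS):
        if abstract.get(w("abstractNumId")) == abstract_id:
            for lvl in abstract.findall("w:lvl", NS):
                if lvl.get(w("ilvl")) == ilvl:
                    return lvl
    return None

def num_fmt(lvl):
    return lvl.find("w:numFmt", NS).get(w("val"))

print(num_fmt(effective_lvl(NUMBERING, "1", "1")))  # decimal
print(num_fmt(effective_lvl(NUMBERING, "3", "1")))  # upperLetter
```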

Processing the w:startOverride Element

The w:lvlOverride element can contain a child w:startOverride element.

  <w:num w:numId="2">
    <w:abstractNumId w:val="2"/>
    <w:lvlOverride w:ilvl="0">
      <w:startOverride w:val="5"/>
    </w:lvlOverride>
    <w:lvlOverride w:ilvl="1">
      <w:startOverride w:val="1"/>
    </w:lvlOverride>
    . . .
  </w:num>

The w:startOverride element causes the starting number of the indent level to be the value specified. This is sometimes used when the user specifically sets this value in the user interface to force counting to start from a specific number. This is one way that markup forms a document that skips a number in a numbered list:

Figure 18. Numbered list that skips a number

Number 4 is skipped

There is another way that markup can specify a starting number. This markup is shown shortly.
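
As a small sketch of how the w:startOverride element might be applied, the following Python function returns the starting number for a level. The name effective_start is hypothetical, and the caller is assumed to pass in the w:start value that it already read from the applicable w:lvl element.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def w(name):
    return "{%s}%s" % (W, name)

NUM = ET.fromstring("""
<w:num xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
       w:numId="2">
  <w:abstractNumId w:val="2"/>
  <w:lvlOverride w:ilvl="0">
    <w:startOverride w:val="5"/>
  </w:lvlOverride>
  <w:lvlOverride w:ilvl="1">
    <w:startOverride w:val="1"/>
  </w:lvlOverride>
</w:num>
""")

def effective_start(num, ilvl, lvl_start):
    """Return the w:startOverride value for ilvl if present, else lvl_start."""
    for override in num.findall("w:lvlOverride", NS):
        if override.get(w("ilvl")) == ilvl:
            start = override.find("w:startOverride", NS)
            if start is not None:
                return int(start.get(w("val")))
    return lvl_start

print(effective_start(NUM, "0", 1))  # 5
print(effective_start(NUM, "2", 1))  # 1 (no override for this level)
```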

Defining List Styles

One option in Word is that you can define a new named list style, and then use that style elsewhere in the document. This is a quick and easy way to implement your own customized numbering format. To do this, select the text to number, and select Define New List Style from the multi-level list.

Figure 19. Define a new list style

Define new list style

This brings up the Define New List Style dialog box.

Figure 20. Define New List Style dialog box

Define new list style 2

After completing this dialog, the selected paragraphs have the numbering format that you specify. You can then also apply this same style in other locations in the document. This feature affects the markup. Here is the indirection when you use this pattern.

Figure 21. Indirection for list styles

List style indirection

The document part looks the same as when you apply other forms of numbering.

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="0"/>
      <w:numId w:val="2"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>1</w:t>
  </w:r>
</w:p>

We can find the referenced w:num element in the numbering part.

<w:num w:numId="2">
  <w:abstractNumId w:val="0"/>
</w:num>

But if we look at the abstract numbering element, it looks very different. It does not contain child w:lvl elements, and uses the w:numStyleLink element to refer to the style defined previously.

<w:abstractNum w:abstractNumId="0">
  <w:nsid w:val="03916EF0"/>
  <w:multiLevelType w:val="multilevel"/>
  <w:tmpl w:val="0409001D"/>
  <w:numStyleLink w:val="EricsListStyle"/>
</w:abstractNum>

In the styles part, you can find the newly defined style, which then refers back to a w:num element in the numbering part.

<w:style w:type="numbering"
         w:customStyle="1"
         w:styleId="EricsListStyle">
  <w:name w:val="EricsListStyle"/>
  <w:uiPriority w:val="99"/>
  <w:rsid w:val="00B1427D"/>
  <w:pPr>
    <w:numPr>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
</w:style>

The w:num element in the numbering part looks as follows.

<w:num w:numId="1">
  <w:abstractNumId w:val="1"/>
</w:num>

This refers to a w:abstractNum element, which contains the level formats, as usual.

<w:abstractNum w:abstractNumId="1">
  <w:nsid w:val="2D235466"/>
  <w:multiLevelType w:val="multilevel"/>
  <w:tmpl w:val="0409001D"/>
  <w:styleLink w:val="EricsListStyle"/>
  <w:lvl w:ilvl="0">
    <w:start w:val="1"/>
    <w:numFmt w:val="decimal"/>
    <w:lvlText w:val="%1)"/>
    <w:lvlJc w:val="left"/>
    <w:pPr>
      <w:ind w:left="360"
             w:hanging="360"/>
    </w:pPr>
  </w:lvl>
  <w:lvl w:ilvl="1">
    <w:start w:val="1"/>
    <w:numFmt w:val="lowerLetter"/>
    <w:lvlText w:val="%2)"/>
    <w:lvlJc w:val="left"/>
    <w:pPr>
      <w:ind w:left="720"
             w:hanging="360"/>
    </w:pPr>
  </w:lvl>
  . . .
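
The extra hop through the numbering style can be sketched as follows in Python. The function names are hypothetical. The sketch follows a single w:numStyleLink hop and does not guard against malformed or circular links.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def w(name):
    return "{%s}%s" % (W, name)

NUMBERING = ET.fromstring("""
<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:numStyleLink w:val="EricsListStyle"/>
  </w:abstractNum>
  <w:abstractNum w:abstractNumId="1">
    <w:styleLink w:val="EricsListStyle"/>
    <w:lvl w:ilvl="0"><w:lvlText w:val="%1)"/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="1"><w:abstractNumId w:val="1"/></w:num>
  <w:num w:numId="2"><w:abstractNumId w:val="0"/></w:num>
</w:numbering>
""")

STYLES = ET.fromstring("""
<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="numbering" w:customStyle="1" w:styleId="EricsListStyle">
    <w:pPr><w:numPr><w:numId w:val="1"/></w:numPr></w:pPr>
  </w:style>
</w:styles>
""")

def find_abstract(numbering, abstract_id):
    for abstract in numbering.findall("w:abstractNum", NS):
        if abstract.get(w("abstractNumId")) == abstract_id:
            return abstract
    return None

def resolve_abstract_num(numbering, styles, abstract_id):
    """Return the w:abstractNum that holds the w:lvl elements."""
    abstract = find_abstract(numbering, abstract_id)
    if abstract is None:
        return None
    link = abstract.find("w:numStyleLink", NS)
    if link is None:
        return abstract  # levels are defined right here
    # Hop: numbering style -> w:numId -> w:num -> w:abstractNum
    for style in styles.findall("w:style", NS):
        if (style.get(w("type")) == "numbering"
                and style.get(w("styleId")) == link.get(w("val"))):
            num_id = style.find("w:pPr/w:numPr/w:numId", NS).get(w("val"))
            for num in numbering.findall("w:num", NS):
                if num.get(w("numId")) == num_id:
                    target = num.find("w:abstractNumId", NS).get(w("val"))
                    return find_abstract(numbering, target)
    return None

resolved = resolve_abstract_num(NUMBERING, STYLES, "0")
print(resolved.get(w("abstractNumId")))  # 1
```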

Alternatives for List Item Suffixes

The list suffix is the content between the list item and the paragraph text. There are three possible values: tab (the default), space, and nothing. We can alter a list definition to use a space for the list item suffix instead of a tab.

Figure 22. List with space as a suffix instead of tab

One two three

The markup uses the w:suff element to specify this.

<w:abstractNum w:abstractNumId="0">
  <w:nsid w:val="21E167DA"/>
  <w:multiLevelType w:val="multilevel"/>
  <w:tmpl w:val="5BE82C98"/>
  <w:lvl w:ilvl="0">
    <w:start w:val="1"/>
    <w:numFmt w:val="decimal"/>
    <w:pStyle w:val="Heading1"/>
    <w:suff w:val="space"/>
    <w:lvlText w:val="%1"/>
    <w:lvlJc w:val="left"/>
    <w:pPr>
      <w:ind w:left="432"
             w:hanging="432"/>
    </w:pPr>
    <w:rPr>
      <w:rFonts w:hint="default"/>
    </w:rPr>
  </w:lvl>
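
When you extract text, the suffix can be mapped to a string with a few lines of code. This Python sketch uses a hypothetical function name, suffix_text, and treats a missing w:suff element as a tab.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

def w(name):
    return "{%s}%s" % (W, name)

SUFFIXES = {"tab": "\t", "space": " ", "nothing": ""}

def suffix_text(lvl):
    """Return the text that separates the list item from the paragraph text."""
    suff = lvl.find("w:suff", NS)
    value = suff.get(w("val")) if suff is not None else "tab"
    return SUFFIXES[value]

LVL_WITH_SPACE = ET.fromstring("""
<w:lvl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
       w:ilvl="0">
  <w:suff w:val="space"/>
  <w:lvlText w:val="%1"/>
</w:lvl>
""")

LVL_DEFAULT = ET.fromstring("""
<w:lvl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
       w:ilvl="0">
  <w:lvlText w:val="%1"/>
</w:lvl>
""")

print(repr("1" + suffix_text(LVL_WITH_SPACE) + "One"))  # '1 One'
print(repr("1" + suffix_text(LVL_DEFAULT) + "One"))     # '1\tOne'
```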

Processing the w:lvlRestart Element

The w:lvlRestart element allows the markup to control when a particular level restarts counting. In the following example, the level four list item starts recounting when there is an intervening level one item:

Figure 23. Controlling level restart

Renumbering restarting

The markup in the previous example for the w:lvl element where the w:ilvl attribute is equal to "3" looks as follows.

  <w:abstractNum w:abstractNumId="0">
    <w:nsid w:val="065A6A4E"/>
    <w:multiLevelType w:val="multilevel"/>
    <w:tmpl w:val="06A2CF44"/>
    <!-- several w:lvl elements elided -->
    <w:lvl w:ilvl="3">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:lvlRestart w:val="1"/>
      <w:lvlText w:val="%1.%2.%3.%4."/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="1728"
               w:hanging="648"/>
      </w:pPr>
      <w:rPr>
        <w:rFonts w:hint="default"/>
      </w:rPr>
    </w:lvl>
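
The following Python sketch shows one way to keep per-level counters while honoring w:lvlRestart. It reflects my reading of the element, which you should check against the specification: the value is a one-based level, a level restarts only when a paragraph at that level or a less indented one appears, a value of 0 means never restart, and a missing element means restart after any less indented paragraph. The name advance and its parameters are hypothetical.

```python
def advance(counters, ilvl, starts, restarts):
    """Return new per-level counters after a paragraph at level ilvl.

    counters: one entry per level; None means "not counting yet".
    starts:   the w:start value for each level.
    restarts: {zero-based level: w:lvlRestart value} for levels that have one.
    """
    new = list(counters)
    new[ilvl] = starts[ilvl] if new[ilvl] is None else new[ilvl] + 1
    for deeper in range(ilvl + 1, len(new)):
        # Default: restart after the previous level (one-based index == deeper).
        restart_after = restarts.get(deeper, deeper)
        if restart_after != 0 and ilvl + 1 <= restart_after:
            new[deeper] = None
    return new

# Level four (w:ilvl="3") restarts only after a level one paragraph.
starts = [1, 1, 1, 1]
restarts = {3: 1}
counters = [None] * 4
level_four = []
for ilvl in [0, 1, 2, 3, 3, 2, 3, 0, 1, 2, 3]:
    counters = advance(counters, ilvl, starts, restarts)
    if ilvl == 3:
        level_four.append(counters[3])
print(level_four)  # [1, 2, 3, 1]
```

Without the w:lvlRestart entry, the intervening level three paragraph would reset the level four counter.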

Processing the w:isLgl Element

The w:isLgl element changes the textual representation of a list item so that every level number in it is written with decimal digits, regardless of the number format specified for each level. It applies only when you format text for the indentation level whose w:lvl element contains it. Consider the following document, where items at the third indentation level use legal numbering.

Figure 24. A list that uses legal numbering for level three

Legal numbering

If this list did not use legal numbering for the third level of indentation, it would appear as follows.

Figure 25. A list that does not use legal numbering

Not legal numbering

The markup for the w:isLgl element looks as follows.

<w:abstractNum w:abstractNumId="0">
  . . .
  <w:lvl w:ilvl="2">
    <w:start w:val="1"/>
    <w:numFmt w:val="decimal"/>
    <w:pStyle w:val="Heading3"/>
    <w:isLgl/>
    <w:lvlText w:val="%1.%2.%3)"/>
    <w:lvlJc w:val="left"/>
    <w:pPr>
      <w:ind w:left="720"
             w:hanging="720"/>
    </w:pPr>
    <w:rPr>
      <w:rFonts w:hint="default"/>
    </w:rPr>
  </w:lvl>
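
The effect of w:isLgl on text assembly can be sketched as follows in Python. The function names are hypothetical, and only three number formats are handled to keep the example short.

```python
import re

def to_letter(n, upper):
    """1 -> A (or a); assumes the letter repeats after Z (AA, BB, ...)."""
    base = ord("A") if upper else ord("a")
    return chr(base + (n - 1) % 26) * ((n - 1) // 26 + 1)

def format_number(n, num_fmt):
    if num_fmt == "upperLetter":
        return to_letter(n, True)
    if num_fmt == "lowerLetter":
        return to_letter(n, False)
    return str(n)  # decimal and, in this sketch, anything else

def list_item_text(lvl_text, numbers, num_fmts, is_lgl):
    """Expand a w:lvlText template; with is_lgl, every level is decimal."""
    def replace(match):
        index = int(match.group(1)) - 1  # %1 refers to level 0
        fmt = "decimal" if is_lgl else num_fmts[index]
        return format_number(numbers[index], fmt)
    return re.sub(r"%([1-9])", replace, lvl_text)

numbers = [2, 1, 3]
num_fmts = ["upperLetter", "lowerLetter", "decimal"]
print(list_item_text("%1.%2.%3)", numbers, num_fmts, is_lgl=False))  # B.a.3)
print(list_item_text("%1.%2.%3)", numbers, num_fmts, is_lgl=True))   # 2.1.3)
```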

Algorithm to Assemble List Item Text

To assemble the list item text, you must determine:

  1. The w:lvl formatting for the paragraph.

  2. The indentation level of the paragraph.

  3. The number for each parent level of indentation for the paragraph. For example, if a paragraph is at the fourth indentation level, you must determine the level numbers of the first, second, third, and fourth levels associated with that paragraph.

After you have these three items, it is a simple text formatting exercise to assemble the list item text.

A paragraph may contain numbering properties that contain the w:ilvl and w:numId elements.

<w:p>
  <w:pPr>
    <w:pStyle w:val="ListParagraph"/>
    <w:numPr>
      <w:ilvl w:val="0"/>
      <w:numId w:val="1"/>
    </w:numPr>
  </w:pPr>
  <w:r>
    <w:t>Paragraph one.</w:t>
  </w:r>
</w:p>

In this case, you use the w:ilvl value as the indentation level. Go to the numbering part and look up the w:num element whose w:numId attribute matches. If that element contains a w:lvlOverride element for the indentation level, and the override contains a w:lvl element, that w:lvl element applies. If there is no w:lvlOverride element for the level, or if the override does not contain a w:lvl element, find the associated w:abstractNum element and use the w:lvl formatting that it specifies for the indentation level.

If the paragraph does not contain a w:numPr complex element, and the style of the paragraph contains a w:numPr element, there is an abstract numbering element that refers to the style. You use the w:lvl that refers to the style. This also gives you the indentation level.

After you determine the w:lvl formatting element and the indentation level, you can determine the list number for each level. You must count the paragraphs at the same level before the current paragraph, paying attention to the w:lvlRestart element and to the starting number given by the w:start element or a w:startOverride element. This is where functional programming works well. You can build a stateless expression to calculate the item number for each indentation level. This may not be the most efficient approach, and it may use more CPU time than an alternative, but it is easy to debug and probably fast enough for most scenarios. In my informal testing on my slowest computer, it is still basically instantaneous.

After you have the appropriate w:lvl element and the list of item numbers for each level of indentation, it is a simple text assembly process to create the list item text.
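
To illustrate the stateless approach, here is a simplified Python sketch. It is not the code behind this article. It assumes that all paragraphs belong to one list, that every level starts at the same number and uses decimal formatting, and that a level restarts after any less indented paragraph (no w:lvlRestart). All function names are hypothetical.

```python
def item_number(levels, index, start=1):
    """Number of paragraph `index` at its own level, computed by counting
    earlier paragraphs at the same level since the last shallower one."""
    ilvl = levels[index]
    count = 0
    for earlier in reversed(levels[:index]):
        if earlier < ilvl:
            break  # a less indented paragraph restarts this level
        if earlier == ilvl:
            count += 1
    return start + count

def find_anchor(levels, index, k):
    """Index of the paragraph that carries the current number for level k."""
    for i in range(index, -1, -1):
        if levels[i] == k:
            return i
        if levels[i] < k:
            return None  # level k has not been used since it was reset
    return None

def level_numbers(levels, index, start=1):
    """Numbers for level 0 through the level of paragraph `index`."""
    numbers = []
    for k in range(levels[index] + 1):
        anchor = find_anchor(levels, index, k)
        numbers.append(start if anchor is None
                       else item_number(levels, anchor, start))
    return numbers

def list_item_text(levels, index, templates):
    """Expand the w:lvlText template for the paragraph's own level."""
    text = templates[levels[index]]
    for k, number in enumerate(level_numbers(levels, index)):
        text = text.replace("%" + str(k + 1), str(number))
    return text

levels = [0, 1, 1, 0, 1, 2]  # the w:ilvl of each paragraph, in order
templates = ["%1.", "%1.%2.", "%1.%2.%3."]
items = [list_item_text(levels, i, templates) for i in range(len(levels))]
print(items)  # ['1.', '1.1.', '1.2.', '2.', '2.1.', '2.1.1.']
```

Because each result depends only on the paragraph list and an index, every function can be tested in isolation, at the cost of rescanning earlier paragraphs.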

Conclusion

Numbering in Open XML WordprocessingML is complex, and justifiably so. Understanding numbering is important when accurately extracting the text of a document. In addition, generating documents that use numbering markup is a powerful approach in document assembly solutions.

Additional Resources

To get started with Open XML, see Open XML Developer Center on MSDN. There is lots of content there. This includes articles, how-to videos, and links to many blog posts. In particular, the following links provide important information to start to work with the Open XML SDK 2.0: