This document describes the elements in the WordprocessingML Schema that are important to document developers and to application developers whose programs will read and write WordprocessingML documents. The text assumes that you have a basic understanding of XML 1.0, XML namespaces, and the functionality of Microsoft® Office Word. Each major section of this document introduces new features of the language and describes those features in the context of concrete examples.
In this document, you'll see how to:
Following this introduction to WordprocessingML is a reference to the WordprocessingML elements that are most useful to developers.
After an initial overview of WordprocessingML and document-level properties and information, this white paper looks at WordprocessingML topics in the order that developers will presumably need them. This structure means that some elements are not discussed in detail in one location. For instance, the documentProperties element contains elements that affect how fields and headers are handled. As a result, the child elements of the documentProperties element are discussed in two different places in the document.
The top-level elements in a WordprocessingML document are:
docSuppData element (Microsoft Visual Basic® for Applications [VBA] code)
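However, the simplest Word document consists of just five elements (and a single namespace). The five elements are wordDocument (the root element), body (the container for the document's content), p (a paragraph), r (a run of text within a paragraph), and t (a piece of text).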
The namespace for the root WordprocessingML Schema (also known as the XML Document 2003 Schema) is "http://schemas.microsoft.com/office/word/2003/wordml". This namespace is normally associated with the WordprocessingML elements by using the prefix "w". The simplest possible WordprocessingML document looks like this:
<?xml version="1.0"?>
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
</w:wordDocument>
In Figure 1, you can see the resulting document, displayed in Microsoft Office Word.
Figure 1. A WordprocessingML document in Microsoft Office Word
If you save a Word document with the .xml extension, Windows will treat the file like any other XML file. Double-clicking the file, for instance, will open it in the standard XML processor (usually Microsoft Internet Explorer). However, adding the mso-application processing instruction specifies Word as the preferred application for processing the file. As a result, Word will open the XML document when the user double-clicks the document's icon. This example shows the sample document with the mso-application processing instruction added:
<?xml version="1.0"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
</w:wordDocument>
A side effect of this automatic behavior, however, is that it prevents the display in Internet Explorer of the XML markup of XML files saved by Word. You can temporarily disable this behavior by deleting the following registry entry and value
Word.Document = "application/msword"
from the following subkey:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\11.0\Common\Filter\text/xml
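For application developers who generate such files from code, the following is a minimal sketch in Python that writes the same five-element document, including the mso-application processing instruction. It assumes only the Python standard library; the function name build_minimal_document is hypothetical and is not part of any Word or Office API.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the text above; "w" is the conventional prefix.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def build_minimal_document(text):
    """Return a wordDocument/body/p/r/t document as a string (hypothetical helper)."""
    ET.register_namespace("w", W_NS)
    root = ET.Element(f"{{{W_NS}}}wordDocument")
    body = ET.SubElement(root, f"{{{W_NS}}}body")
    paragraph = ET.SubElement(body, f"{{{W_NS}}}p")
    run = ET.SubElement(paragraph, f"{{{W_NS}}}r")
    ET.SubElement(run, f"{{{W_NS}}}t").text = text
    # ElementTree does not write the processing instruction for us,
    # so the prolog is added by hand.
    prolog = '<?xml version="1.0"?>\n<?mso-application progid="Word.Document"?>\n'
    return prolog + ET.tostring(root, encoding="unicode")


minimal_xml = build_minimal_document("Hello, World.")
```

Saving minimal_xml to a file with the .xml extension should give a file equivalent to the sample above, apart from line breaks and indentation.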
The document's content is held in the body element. Text within the body element is kept in a nested set of three elements: t (a piece of text), r (a run of text within a paragraph), and p (a paragraph).
The lowest level of this hierarchy is the t element, which is the container for the text that makes up the document's content. You can put as much text as you want in a t element.
A t element must be enclosed by an r element.
In a WordprocessingML document, the layout of the page that your text appears in is controlled by the properties for that section of the document. However, there is no container element for sections in WordprocessingML. Instead, the information about a section is kept inside a sectPr (section properties) element that appears at the end of each section. Though a sectPr element isn't necessary in a WordprocessingML document, Word always inserts a sectPr element at the end of any new document that it creates. Here is a typical sectPr element generated by Word when a document is created:
<w:sectPr>
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:line-pitch="360"/>
</w:sectPr>
When new sections are added to a WordprocessingML document, the new sectPr elements must appear inside pPr elements (which are discussed later) inside p elements. This example shows a sectPr element added to a document to mark the end of a section:
<w:p>
<w:pPr>
<w:sectPr>
<w:pgSz w:w="6120" w:h="7420" />
<w:pgMar w:top="720" w:right="720" w:bottom="720"
w:left="720" w:header="0" w:footer="0" w:gutter="0" />
</w:sectPr>
</w:pPr>
</w:p>
Each sectPr element marks the end of a section and the start of a new section. The child elements of the sectPr element provide the definition of the section just ended. All the child elements for the sectPr element are listed in Table 4.
While WordprocessingML does not have a container for sections, Word does generate sect elements that act as containers for sections. These are not part of WordprocessingML but belong to the Auxiliary XML Document 2003 namespace ("http://schemas.microsoft.com/office/word/2003/auxHint"). The sect elements (and other auxiliary elements) are discussed later in this document.
The following example has multiple t elements inside an r element (for the following examples, only the body element and its children are shown):
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
<w:t> How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
Although this document is valid, duplicating the t element isn't necessary. Therefore, the following example would give the same result as the previous one:
<w:body>
<w:p>
<w:r>
<w:t>Hello, World. How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
Typically, if you have multiple t elements in an r element, it's because you need to insert some other element in between the pieces of text. In the following example, a br element appears between the two t elements. The br element will force the second t element to a new line when the text is displayed in Word:
<w:body>
<w:p>
<w:r>
<w:t>Hello, World. </w:t>
<w:br w:type="text-wrapping"/>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
The br element's type attribute allows you to specify the kind of break ("page", "column", "text-wrapping"). Because the default is "text-wrapping" (a new line), the type attribute in the previous example could have been omitted. Figure 2 shows the results of using a br element between t elements.
Figure 2. A Word document with a br element between t elements
You use p elements to define new paragraphs (a br element with text-wrapping is equivalent to the "soft break" in Word that's created by pressing SHIFT+ENTER and doesn't start a new paragraph). A WordprocessingML document with text in two separate paragraphs would look like this:
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
<w:p>
<w:r>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
The resulting document can be seen in Figure 3. As a comparison of Figures 2 and 3 shows, the difference between using br elements and p elements may not be visible, depending on your formatting options. The display of a WordprocessingML document in Word may not reveal the underlying structure of the document.
Figure 3. A Word document with multiple p elements
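Because the display does not reveal the structure, a program that reads a document has to walk the p, r, and t elements itself. The following Python sketch shows one way to do that with the standard library. The helper name paragraph_texts and the sample document are illustrative assumptions, and the sketch treats every br element as a line break regardless of its type attribute.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
W = f"{{{W_NS}}}"  # namespace in the {uri} form that ElementTree expects

SAMPLE = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:r>
      <w:t>Hello, World. </w:t>
      <w:br w:type="text-wrapping"/>
      <w:t>How are you, today?</w:t>
    </w:r></w:p>
    <w:p><w:r><w:t>Second paragraph.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraph_texts(xml_text):
    """Return one string per p element; each br becomes a newline."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.iter(W + "p"):
        pieces = []
        # Only runs that are direct children of the paragraph are read.
        for run in p.findall(W + "r"):
            for child in run:
                if child.tag == W + "t":
                    pieces.append(child.text or "")
                elif child.tag == W + "br":
                    pieces.append("\n")
        paragraphs.append("".join(pieces))
    return paragraphs


texts = paragraph_texts(SAMPLE)
```

Here texts holds two strings: the first contains an embedded newline where the br element appeared, and the second is a separate paragraph.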
The tab element allows you to position text horizontally on a line. Tab elements move the following text to the next tab stop. Exactly where on the line that will be depends on how tab stops are defined in the document.
In this example, the text will appear on a single line but with each t element's text positioned at a separate tab stop:
<w:p>
<w:r>
<w:tab/>
<w:t>Hello, World.</w:t>
</w:r>
<w:r>
<w:tab/>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
Tab stops are defined in the pPr element, which is also a child of the p element. Within the pPr element, you can set the tab stops for the paragraph by using tab elements within the tabs element. Three attributes on the tab element define the tab stop:
For example, this paragraph has three tab stops at 1 inch (1,440 twips), 3 inches (4,320 twips), and 5 inches (7,200 twips), with each tab stop being a different type. In the example, the two tab elements at the start of the r element move the text to the second tab stop:
<w:p>
<w:pPr>
<w:tabs>
<w:tab w:val="center" w:pos="1440"/>
<w:tab w:val="left" w:pos="4320"/>
<w:tab w:val="decimal" w:pos="7200"/>
</w:tabs>
</w:pPr>
<w:r>
<w:tab/>
<w:tab/>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
Table 7 lists the attributes for the tab element and the options that you can use.
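Tab positions, like the other measurements in this document, are given in twips, where 1 inch equals 1,440 twips (72 points per inch, 20 twips per point). The following Python sketch shows the conversion and uses it to produce the tab elements from the example above. The function names are hypothetical helpers for illustration only.

```python
TWIPS_PER_INCH = 1440  # 72 points per inch * 20 twips per point


def inches_to_twips(inches):
    """Convert inches to the whole-number twips used in attribute values."""
    return round(inches * TWIPS_PER_INCH)


def twips_to_inches(twips):
    """Convert a twips value back to inches."""
    return twips / TWIPS_PER_INCH


def tab_stop_elements(stops):
    """Render (alignment, inches) pairs as tab elements for a tabs element."""
    return "\n".join(
        f'<w:tab w:val="{alignment}" w:pos="{inches_to_twips(inches)}"/>'
        for alignment, inches in stops
    )


tab_xml = tab_stop_elements([("center", 1), ("left", 3), ("decimal", 5)])
```

The same conversion applies to the page sizes and margins shown elsewhere in this document; for instance, inches_to_twips(8.5) gives the 12240 used for a letter-width page.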
The most powerful formatting tool discussed in this section is WordprocessingML styles. Although you can format your document by setting individual properties at the paragraph and run level, this approach may not be your best choice. If you're doing more than setting bold, underline, or italics for a single run, using styles to format your document makes it easier to manage the appearance of your document.
The rPr (run property) element is a container that holds the property elements that define how a run is to be treated by Word. Only one rPr element is allowed within an r element. Table 2 lists all of the elements that can be included inside the rPr element with their description (taken from the WordprocessingML schema).
Most of the children of the rPr element have a single val attribute that is limited to a specific set of values. For instance, the b (bold) element causes the text that follows it to be bold when the b element has its val attribute set to "on". In this example, both "Hello, World." and "How are you, today?" will be bold because both sets of text are in the same run and follow the rPr element with the b element.
Note: The prefix "w:" (which denotes the namespace) on the val attribute is not optional.
<w:r>
<w:rPr>
<w:b w:val="on"/>
</w:rPr>
<w:t>Hello, World.</w:t>
<w:br/>
<w:t>How are you, today?</w:t>
</w:r>
Figure 4 shows the result of this change.
Figure 4. Text in an r element with the b element used in the rPr element
If the val attribute isn't present for the b element, it defaults to "on". Therefore, the element <w:b/> is equivalent to the element <w:b w:val="on"/>.
If the style applied to a run (or a paragraph) has the bold property turned on, you can suppress the bold formatting by setting the val attribute to "off" like this:
<w:r>
<w:rPr>
<w:b w:val="off"/>
</w:rPr>
<w:t>Hello, World.</w:t>
<w:t>How are you, today?</w:t>
</w:r>
While most rPr elements use just the val attribute, there are exceptions (the asianLayout property, for instance, takes several attributes). Table 2 provides the values for the val attribute for each of the rPr properties, provided that the list of values is short. Where the element has multiple attributes, doesn't use the val attribute, or has a large number of values, the table gives the name of the type definition in the WordprocessingML schema that describes the element.
For example, the underline element uses the val attribute but offers more choices than "on" and "off". This example gives the text a single, continuous underline (other options include "words", "double", and "thick"):
<w:r>
<w:rPr>
<w:u w:val="single"/>
</w:rPr>
<w:t>How are you today?</w:t>
</w:r>
The result appears in Figure 5.
Figure 5. Applying the u element to text
The pPr element defines the properties for a paragraph. Table 3 lists the permitted child elements. For example, within the pPr element, the jc element is used to control the paragraph's alignment. In this document, the text in the paragraph will be centered (see Figure 6):
<w:p>
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
<w:br/>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
Figure 6. Centered text
Styles allow you to create a group of style properties that can be applied as a unit either to individual paragraphs (within the pPr element) or runs (within the rPr element). Styles reduce the amount of WordprocessingML text that you have to produce and the amount of work required to make changes to your document's appearance. With styles, changing the appearance of all the pieces of text that share a common style has to be done in only one place: the style definition.
The pStyle element inside the pPr element specifies which style is to be used for all runs in the paragraph. In the rPr elements, the rStyle element specifies the style for individual runs. The text inside the t elements will reflect a merging of the styles set at the pPr level and at the rPr level. There are no child elements in common between the pPr and rPr elements, so merging the two property sets is straightforward.
In this example:
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="MyStyle"/>
</w:pPr>
<w:r>
<w:rPr>
<w:rStyle w:val="MyFirstRunStyle"/>
</w:rPr>
<w:t>Hello, World.</w:t>
</w:r>
<w:r>
<w:rPr>
<w:rStyle w:val="MySecondRunStyle"/>
</w:rPr>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
Styles are defined in the WordprocessingML styles element, which is a top-level element under the wordDocument element. Within the styles element, each style element defines a single style. A style element is a container element for elements that define the style (all the children for the style element are listed in Table 6).
The style element itself takes three attributes: type, styleId, and default:
The type attribute allows you to indicate what kind of style you're defining: paragraph, character, table, or list. Styles used in the pPr element must be paragraph styles; styles in the rPr element must be character styles.
The styleId attribute gives your style the name that you use to invoke the style in your WordprocessingML document.
When the default attribute is set to "on", it indicates that this style is the default style for its type of style: paragraph, character, table, or list.
In the following example, three styles are defined:
<w:styles>
<w:style w:type="paragraph" w:styleId="MyParagraphStyle"
w:default="on"/>
<w:style w:type="paragraph" w:styleId="AnotherParagraph"
w:default="off"/>
<w:style w:type="character" w:styleId="EmphasisStyle"
w:default="off"/>
</w:styles>
The following sample applies those styles. "AnotherParagraph" is used for the first paragraph in the document. In the second paragraph, no paragraph style is specified, so the second paragraph will be formatted using the default style ("MyParagraphStyle"). However, within the r element in the second paragraph, a character style is used to control the appearance of the text:
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="AnotherParagraph"/>
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
<w:p>
<w:r>
<w:rPr>
<w:rStyle w:val="EmphasisStyle"/>
</w:rPr>
<w:t>How are you, today?</w:t>
</w:r>
</w:p>
</w:body>
To create a style by extending another style, you use the basedOn element. The basedOn element allows you to create variations on a style by adding or overriding the properties of the base style. This example defines an "Italic" style and then uses it as the base for an "ItalicBold" style:
<w:styles>
<w:style w:type="paragraph" w:styleId="Italic" >
<w:rPr>
<w:i w:val="on"/>
</w:rPr>
</w:style>
<w:style w:type="paragraph" w:styleId="ItalicBold" >
<w:basedOn w:val="Italic"/>
<w:rPr>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="ItalicBold" />
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
The order of the style elements within the styles element doesn't matter: a style that uses the basedOn element can extend a style that precedes or follows it.
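To illustrate how a basedOn chain combines properties, here is a simplified Python sketch that collects the rPr settings of a style and its base styles, with the derived style's settings overriding the base. It is an assumption-laden model for illustration, not Word's actual algorithm: the function name resolve_run_properties is hypothetical, only rPr children are considered, and a child without a val attribute is recorded as "on", which is only meaningful for on/off properties such as b and i.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
W = f"{{{W_NS}}}"

# The derived style is deliberately listed before its base style.
STYLES = f"""<w:styles xmlns:w="{W_NS}">
  <w:style w:type="paragraph" w:styleId="ItalicBold">
    <w:basedOn w:val="Italic"/>
    <w:rPr><w:b w:val="on"/></w:rPr>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Italic">
    <w:rPr><w:i w:val="on"/></w:rPr>
  </w:style>
</w:styles>"""


def resolve_run_properties(styles_root, style_id):
    """Merge rPr settings along the basedOn chain; derived styles win."""
    by_id = {s.get(W + "styleId"): s for s in styles_root.findall(W + "style")}
    chain = []
    seen = set()
    while style_id in by_id and style_id not in seen:
        seen.add(style_id)  # guard against a circular basedOn chain
        style = by_id[style_id]
        chain.append(style)
        based_on = style.find(W + "basedOn")
        style_id = based_on.get(W + "val") if based_on is not None else None
    merged = {}
    for style in reversed(chain):  # apply the base first, then the overrides
        for prop in style.findall(W + "rPr/*"):
            name = prop.tag[len(W):]
            merged[name] = prop.get(W + "val", "on")  # missing val treated as "on"
    return merged


resolved = resolve_run_properties(ET.fromstring(STYLES), "ItalicBold")
```

For the sample styles, resolved contains both the italic setting inherited from "Italic" and the bold setting added by "ItalicBold", even though the base style appears later in the styles element.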
Other useful child elements of the style element include:
As a more comprehensive example, the following style element establishes:
<w:style w:type="paragraph" w:styleId="ReferenceName" >
<w:name w:val="DisplayName" />
<w:locked w:val="on" />
<w:hidden w:val="off"/>
<w:next w:val="ItalicBold"/>
<w:rPr>
<w:i w:val="on"/>
</w:rPr>
</w:style>
This paragraph uses the style just defined, by setting the val attribute to the name specified in the style's styleId attribute:
<w:p>
<w:pPr>
<w:pStyle w:val="ReferenceName"/>
</w:pPr>
<w:r>
<w:t>Hello, World</w:t>
</w:r>
</w:p>
Figure 7 shows the style applied to the first paragraph in the document. On the Formatting toolbar in Word, the Style drop-down list shows the name established for the style through the name element in the style element. The second paragraph in Figure 7 was created by pressing the ENTER key at the end of the first paragraph and is in the "ItalicBold" style, as specified by the next element.
Figure 7. The "DisplayName" style in use
You define a style by adding child elements to the style element (all of the children are listed in Table 6). Within the style element, rPr and pPr elements allow you to define the formatting to be used at the r and p levels. The only limitation is that pPr elements used in a character style are ignored (and, as mentioned before, you can only refer to paragraph styles within a pPr element and only to character styles within an rPr element).
Putting it all together, this document defines a style that sets the justification for the paragraph (in the pPr element of the style) and combines bold and italic (in the rPr element of the style). The style is then used to format a paragraph:
<w:styles>
<w:style w:type="paragraph" w:styleId="ItalicBold">
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:rPr>
<w:i w:val="on"/>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="ItalicBold" />
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
The result of applying this paragraph style using the pPr element inside the body element is that the text will be italic, bold, and centered.
If you violate the restrictions that Word puts on using styles, Word won't raise an error but Word also won't apply your styles. Consider this example, which is similar to the previous example but has some key changes that prevent the style from being applied:
<w:styles>
<w:style w:type="character" w:styleId="ItalicBold" >
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:rPr>
<w:i w:val="on"/>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="ItalicBold" />
</w:pPr>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
In this example, the "ItalicBold" style has its type attribute set to character. The result is that Word will ignore the use of the style in the pPr element inside the body element.
In this example, the character version of the style is used correctly inside the rPr element but the result will still not reflect all of the settings made in the "ItalicBold" style:
<w:body>
<w:p>
<w:r>
<w:rPr>
<w:rStyle w:val="ItalicBold" />
</w:rPr>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
Because the style is specified as being a character style, the pPr element in the style definition (where the center justification is specified) will be ignored. The rPr element inside the style is applied, though. As a result, the text will be bold and italic but not centered.
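Because Word does not raise an error for these mistakes, a program that generates WordprocessingML may want to check its own output. The following Python sketch flags pStyle references to styles that are not paragraph styles and rStyle references to styles that are not character styles. The function name find_style_problems is hypothetical, and the sketch assumes that every style the body uses is defined in the styles element of the same file, so it also reports references to undefined style names.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
W = f"{{{W_NS}}}"

DOC = f"""<w:wordDocument xmlns:w="{W_NS}">
  <w:styles>
    <w:style w:type="character" w:styleId="ItalicBold"/>
  </w:styles>
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="ItalicBold"/></w:pPr>
      <w:r>
        <w:rPr><w:rStyle w:val="Missing"/></w:rPr>
        <w:t>Hello, World.</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:wordDocument>"""

# The style type that each kind of reference is expected to name.
EXPECTED_TYPE = {"pStyle": "paragraph", "rStyle": "character"}


def find_style_problems(xml_text):
    """Return a list of messages about style references that look wrong."""
    root = ET.fromstring(xml_text)
    types = {
        s.get(W + "styleId"): s.get(W + "type")
        for s in root.findall(f"{W}styles/{W}style")
    }
    problems = []
    body = root.find(W + "body")
    for ref_name, expected in EXPECTED_TYPE.items():
        for ref in body.iter(W + ref_name):
            style_id = ref.get(W + "val")
            if style_id not in types:
                problems.append(f"{ref_name} '{style_id}' is not defined")
            elif types[style_id] != expected:
                problems.append(
                    f"{ref_name} '{style_id}' is a {types[style_id]} style, "
                    f"expected a {expected} style"
                )
    return problems


problems = find_style_problems(DOC)
```

For the sample document, problems contains one message about the character style used in a pStyle element and one about the undefined "Missing" style.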
You could also make text centered, bold, and italic by making the "ItalicBold" style the default paragraph style and by not specifying a style at the paragraph level:
<w:styles>
<w:style w:type="paragraph" w:styleId="ItalicBold"
w:default="on">
<w:pPr>
<w:jc w:val="center"/>
</w:pPr>
<w:rPr>
<w:i w:val="on"/>
<w:b w:val="on"/>
</w:rPr>
</w:style>
</w:styles>
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
Because properties can be set at the style, p, and r element levels, Word must deal with conflicts between the three levels. In this example, for instance, Word must reconcile:
<w:body>
<w:p>
<w:pPr>
<w:pStyle w:val="MyStyle"/>
</w:pPr>
<w:r>
<w:rPr>
<w:rStyle w:val="MyFirstRunStyle"/>
<w:b w:val="on"/>
</w:rPr>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
Three rules are used to reconcile settings for properties that are either "on" or "off":
Applying the rules to the previous example, the text "Hello, World." will be bold because of the b element in the rPr element of the run. If the b element hadn't been used in the rPr element, then if either "MyStyle" or "MyFirstRunStyle" turned on bold formatting, the text would be bold.
WordprocessingML provides two separate kinds of support for fonts:
You can specify which fonts are used in your document by using the fonts element. For each font that you use, you can specify a variety of properties that allow Word to manage the document's fonts and make appropriate substitutions when the requested font isn't available. Setting these properties requires an understanding of font information that's beyond the scope of this document. The fonts element has no direct effect on your document's appearance but is simply a place to supply Word with information about the fonts used by the document.
Within the fonts element, the defaultFonts element specifies the default fonts for the document. This element directly controls which font is to be used to display the text in the document (unless overridden by a style or an rPr child element). The defaultFonts element has a set of attributes that let you specify the default fonts for four character sets: ascii, fareast, h-ansi, and cs (complex scripts, for example, those scripts that allow bidirectional rendering). The defaultFonts element is one of the elements that control what font is to be used in displaying the document.
You can override the default font by using the rFonts element in the rPr element. This can be done either in the rPr element preceding the t element with the text, in an rPr element inside a pPr element, or in a style. The rFonts element takes the same attributes as the defaultFonts element. For example, the following element sets the font for a run to Tahoma:
<w:r>
<w:rPr>
<w:rFonts w:ascii="Tahoma" w:h-ansi="Tahoma" w:cs="Tahoma"/>
</w:rPr>
<w:t>Hello, World</w:t>
</w:r>
The font that you use to display your text doesn't have to be listed in the fonts element at the start of the document. However, without the information in the fonts element, if the font that you specify in the rFonts element isn't available on the computer where Word is displaying the document, Word may not make the best choice in selecting a substitute font.
At the section level, formatting information is held in the sectPr element at the end of the section. Within the sectPr element, child elements allow you to control the page's size and margins and to define columns for the page.
In the sectPr elements, there are two elements that control your page's layout:
The following pgSz element uses the w attribute to set a page width of 12,240 twips (8.5 inches) and the h attribute to set the height at 15,840 twips (11 inches):
<w:sectPr>
<w:pgSz w:w="12240" w:h="15840" w:code="1"/>
</w:sectPr>
The following pgMar element sets the top and bottom margins at 1,440 twips (1 inch) and the left and right margins at 1,800 twips (1.25 inches). In addition, the header and footer are 720 twips (0.5 inches) each.
<w:sectPr>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" />
</w:sectPr>
The pgMar element also lets you specify how much space is to be set aside for the gutter, which is the part of the page that is lost to the binding process when pages are bound together. In the previous example, no space has been left for the gutter. This next example sets aside 360 twips (0.25 inches) for the gutter:
<w:sectPr>
<w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
w:header="720" w:footer="720" w:gutter="360"/>
</w:sectPr>
Typically, documents are bound down their inside edges. If your documents are bound along the top, you'll need to specify that in the docPr element:
<w:docPr>
<w:gutterAtTop w:val="on"/>
</w:docPr>
To force the gutter to the right side of the page, set the rtlGutter element to "on" in the sectPr element:
<w:sectPr>
<w:rtlGutter w:val="on"/>
</w:sectPr>
You can define columns in the sectPr element by using the cols element. If your columns are all the same width, you need only to specify the number of columns (in the num attribute) and the space between columns (in the space attribute):
<w:sectPr>
<w:cols w:num="4" w:space="720"/>
</w:sectPr>
If the columns have different widths, you must insert col elements inside the cols element. However, you must still specify the number of columns on the cols element. You must also turn off the equalWidth attribute.
<w:sectPr>
<w:cols w:num="4" w:sep="on" w:space="1440" w:equalWidth="off">
<!-- col elements go here -->
</w:cols>
</w:sectPr>
For each col element except the last one, you specify the width of the column and the space following it.
<w:cols w:num="4" w:sep="on" w:space="1440" w:equalWidth="off">
<w:col w:w="1440" w:space="500"/>
<w:col w:w="2880" w:space="500"/>
<w:col w:w="1080" w:space="750"/>
<w:col w:w="1080"/>
</w:cols>
You do not have to do anything further. Word will make the content of the t elements in the document's body flow, or "snake," through the columns.
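As an illustration of the rules for unequal columns, the following Python sketch builds a cols element from a list of column widths and a list of the spaces between them, all in twips. The function name build_cols is a hypothetical helper; the sketch sets only the num and equalWidth attributes on the cols element and leaves out the sep and space attributes shown in the example above.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
W = f"{{{W_NS}}}"


def build_cols(widths, spaces):
    """Build a cols element for unequal columns; all values are in twips.

    widths has one entry per column; spaces has one entry per gap, so it
    must be exactly one shorter than widths.
    """
    if len(spaces) != len(widths) - 1:
        raise ValueError("need one space value for every column but the last")
    ET.register_namespace("w", W_NS)
    cols = ET.Element(W + "cols", {
        W + "num": str(len(widths)),
        W + "equalWidth": "off",
    })
    for index, width in enumerate(widths):
        attrs = {W + "w": str(width)}
        if index < len(spaces):  # the last column has no following space
            attrs[W + "space"] = str(spaces[index])
        ET.SubElement(cols, W + "col", attrs)
    return cols


cols = build_cols([1440, 2880, 1080, 1080], [500, 500, 750])
cols_xml = ET.tostring(cols, encoding="unicode")
```

The values passed in the last two lines reproduce the four col elements from the example above; cols_xml holds the serialized element, ready to be placed inside a sectPr element.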
This section shows how to add lists, tables, headers, footers, and title page elements to a WordprocessingML document. You'll also see how to add both document properties and document information to your document.
In WordprocessingML, lists are a series of paragraphs that have a list style applied to them, with each item in the list in a separate paragraph. What distinguishes a "list paragraph" from an ordinary paragraph is the presence of a listPr element in the pPr element in the paragraph. The listPr element specifies the list style to be used with the paragraph's content and the level of the list. Here is a sample of a list with two items, which are represented by the two paragraphs with listPr elements:
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="1" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Item 1</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="1" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Item 2</w:t>
</w:r>
</w:p>
Only two elements can appear inside the listPr element:
As an example of the ilvl element in action, consider the list shown in Figure 8:
Figure 8. Example of the ilvl element in use
There are actually three lists in the example. First, there is an outer list with two items ("Types of Web sites" and "WayFinding", numbered 1 and 2). Within those items are two nested lists. The first is the list consisting of "Applications", "Content", and "Hybrid"; the second list consists of "Planning strategies" and "Executing plans with feedback". In WordprocessingML, this example consists of seven paragraphs, one for each list item. The paragraphs in different lists are at different paragraph levels and have different list styles assigned to them.
For the first three paragraphs, the WordprocessingML would look like this:
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="2" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Types of Web sites</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="1" />
<w:ilfo w:val="2" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Applications</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="1" />
<w:ilfo w:val="2" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Content</w:t>
</w:r>
</w:p>
The entry in the outermost list ("Types of Web sites") has an ilvl element with a val attribute of "0". The next paragraph, which is the first item of the nested list ("Applications") has an ilvl element with a val attribute of "1", indicating that the paragraph is nested one level deep. All of the paragraphs use the same list style, specified in the ilfo element as "2".
The value in the ilfo element refers to a list element that appears inside the lists element before the body element. The list element, in turn, associates the ilfo element id with a particular list definition by using the ilst element. The following list element, for instance, defines list "1" as using list definition "2":
<w:lists>
<w:list w:ilfo="1">
<w:ilst w:val="2" />
</w:list>
</w:lists>
The list element can contain one other element, lvlOverride. The lvlOverride element contains elements that override settings in the list definition. These override settings can include a new starting number for the list and different formatting. By using the lvlOverride element, you can specify settings for a particular list (or part of a list) without having to create a whole new list definition.
The actual list definitions are defined inside the listDef element, which also appears inside the lists element. Within the listDef element, the listDefId attribute (which must be numeric) specifies the identifier that is referenced by the ilst element of the list element. All of the children of the listDef element are given in Table 16.
Within the listDef element, lvl elements define how the list items at each level are to be formatted. The format information inside an lvl element can include a pPr element (containing formatting for p elements) and an rPr element (containing formatting for r elements), among other elements. The pPr and rPr settings will automatically be applied to the p and r elements that make up the list item's paragraph.
Also within the lvl element, the start element specifies the starting number for the list.
The following sample listDef style definition defines two levels of a list (Word typically generates eight levels of definition for a listDef element). The list definition is tied to the actual list in the body of the document through a list element. The example demonstrates that linkage. Starting from the listDef element:
<w:lists>
<w:listDef w:listDefId="2">
<w:lvl w:ilvl="0">
<w:start w:val="1" />
<w:lvlText w:val="%1." />
<w:lvlJc w:val="left" />
<w:pPr>
<w:tabs>
<w:tab w:val="list" w:pos="1080" />
</w:tabs>
<w:ind w:left="1080" w:hanging="720" />
</w:pPr>
<w:rPr>
<w:rFonts w:hint="default" />
</w:rPr>
</w:lvl>
<w:lvl w:ilvl="1" w:tplc="56325532">
<w:start w:val="1" />
<w:nfc w:val="4" />
<w:lvlText w:val="%2." />
<w:lvlJc w:val="left" />
<w:pPr>
<w:tabs>
<w:tab w:val="list" w:pos="1800" />
</w:tabs>
<w:ind w:left="1800" w:hanging="720" />
</w:pPr>
<w:rPr>
<w:rFonts w:hint="default" />
</w:rPr>
</w:lvl>
</w:listDef>
<w:list w:ilfo="1">
<w:ilst w:val="2" />
</w:list>
</w:lists>
<w:body>
<w:p>
<w:pPr>
<w:listPr>
<w:ilvl w:val="0" />
<w:ilfo w:val="1" />
</w:listPr>
</w:pPr>
<w:r>
<w:t>Types of Web sites</w:t>
</w:r>
</w:p>
</w:body>
Using the list element as an intermediary between the list definition in the listDef element and the listPr element in the body makes it easy to change a style for a group of lists. Rather than rewrite the list definition or set all the lists in the document to use a different list definition, all that's necessary is to update the val attribute of the ilst element in the list element so that it points to a different listDef element. All the listPr elements that use that list element will now be displayed according to the new listDef element.
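The chain of references from a list paragraph to its formatting can be followed in code. The Python sketch below starts from the ilfo and ilvl values found in a paragraph's listPr element, finds the matching list element, follows its ilst element to the listDef element, and returns the lvl element for the requested level. The function name find_level is hypothetical, the sample is a trimmed version of the definition above, and lvlOverride elements are ignored.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
W = f"{{{W_NS}}}"

LISTS = f"""<w:lists xmlns:w="{W_NS}">
  <w:listDef w:listDefId="2">
    <w:lvl w:ilvl="0"><w:start w:val="1"/><w:lvlText w:val="%1."/></w:lvl>
    <w:lvl w:ilvl="1"><w:start w:val="1"/><w:lvlText w:val="%2."/></w:lvl>
  </w:listDef>
  <w:list w:ilfo="1"><w:ilst w:val="2"/></w:list>
</w:lists>"""


def find_level(lists_root, ilfo, ilvl):
    """Follow ilfo -> list -> ilst -> listDef and return the matching lvl."""
    for lst in lists_root.findall(W + "list"):
        if lst.get(W + "ilfo") == ilfo:
            list_def_id = lst.find(W + "ilst").get(W + "val")
            break
    else:
        return None  # no list element carries this ilfo value
    for list_def in lists_root.findall(W + "listDef"):
        if list_def.get(W + "listDefId") == list_def_id:
            for lvl in list_def.findall(W + "lvl"):
                if lvl.get(W + "ilvl") == ilvl:
                    return lvl
    return None


level = find_level(ET.fromstring(LISTS), ilfo="1", ilvl="1")
level_text = level.find(W + "lvlText").get(W + "val")
```

Pointing the ilst element at a different listDef element would change what find_level returns for every paragraph that uses ilfo "1", which is the indirection described above.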
WordprocessingML lets you add headers, footers, and a title page to your document. In WordprocessingML, headers and footers are just another kind of paragraph.
Headers and footers are defined in the sectPr element that marks the end of the section. In the sectPr element, the hdr elements contain the definitions of the headers for the section, and the ftr elements contain the definitions for the footers. Within the hdr and ftr elements, the content of the element is treated like the content of the body element: p, r, and t elements are used to hold the text that makes up the header or footer.
Here's an example of the definition of a header and a footer:
<w:sectPr>
<w:hdr w:type="odd" >
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Header</w:t>
</w:r>
</w:p>
</w:hdr>
<w:ftr w:type="odd">
<w:p>
<w:pPr>
<w:pStyle w:val="Footer"/>
</w:pPr>
<w:r>
<w:t>My Footer</w:t>
</w:r>
</w:p>
</w:ftr>
</w:sectPr>
You can use any style that you want to control the formatting of a header or footer. A typical header style might look like this:
<w:style w:type="paragraph" w:styleId="Header" >
<w:name w:val="header"/>
<w:basedOn w:val="Normal"/>
<w:pPr>
<w:pStyle w:val="Header"/>
<w:tabs>
<w:tab w:val="center" w:pos="4320"/>
<w:tab w:val="right" w:pos="8640"/>
</w:tabs>
</w:pPr>
</w:style>
The hdr and ftr elements have a type attribute that takes one of three values: "even", "odd", and "first". If you're only using one hdr or ftr element, the type attribute must be set to "odd".
To have a different header (or footer) on even and odd pages, you will need two hdr elements, one with its type attribute set to "even" and one with its type attribute set to "odd". For example:
<w:sectPr>
<w:hdr w:type="odd">
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Odd Header</w:t>
</w:r>
</w:p>
</w:hdr>
<w:hdr w:type="even">
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Even Header</w:t>
</w:r>
</w:p>
</w:hdr>
</w:sectPr>
You must also add the evenAndOddHeaders element to the docPr element at the top of the document:
<w:docPr>
<w:evenAndOddHeaders/>
</w:docPr>
If you set the type attribute of a hdr or ftr element to "first", the hdr or ftr will be used only on the first page (even if it's the only hdr or ftr element in the document). You don't have to add any elements to the document properties to use this option, but you do need to add the titlePg element to the end of the sectPr element, following the definition of your headers and footers:
<w:sectPr>
<w:hdr w:type="first">
<w:p>
<w:pPr>
<w:pStyle w:val="Header"/>
</w:pPr>
<w:r>
<w:t>My Title Page Header</w:t>
</w:r>
</w:p>
</w:hdr>
<w:titlePg/>
</w:sectPr>
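The rules above for choosing among the "odd", "even", and "first" headers can be summarized in a few lines of Python. This is a minimal sketch, not a description of Word's actual layout engine: the function name pick_header and its parameters are hypothetical, and it assumes you already know the page number within the section.

```python
def pick_header(page_number, header_types, even_and_odd=False, title_page=False):
    """Return the hdr type ("first", "even", or "odd") expected on a page.

    page_number:  1-based page number within the section (an assumption of
                  this sketch; real pagination is decided by Word).
    header_types: set of w:type values found on the section's hdr elements.
    even_and_odd: True if docPr contains the evenAndOddHeaders element.
    title_page:   True if the sectPr element contains the titlePg element.
    """
    # A "first" header is only honored when titlePg is present.
    if title_page and page_number == 1 and "first" in header_types:
        return "first"
    # An "even" header is only honored when evenAndOddHeaders is present.
    if even_and_odd and page_number % 2 == 0 and "even" in header_types:
        return "even"
    # Otherwise the "odd" header serves as the default for the page.
    return "odd" if "odd" in header_types else None


print(pick_header(2, {"odd", "even"}, even_and_odd=True))
print(pick_header(1, {"first", "odd"}, title_page=True))
```

The same logic applies to ftr elements, since they take the same type attribute.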
To ensure that your headers and footers display correctly, you should allocate the space on the page to display them. For this, you'll need to control your page's layout, as described earlier in "Formatting a Section."
In WordprocessingML, tables are defined with the tbl element (Table 10 lists the high-level table elements). The following elements are used within the tbl element:
The following example shows a table with two columns and a single row. The tbl element is followed by a tblPr element, which contains a set of table properties. As is typical in WordprocessingML, each property is an empty element with a single val attribute that contains the value for the property. In this example:
<w:tbl>
<w:tblPr>
<w:tblStyle w:val="TableGrid"/>
<w:tblW w:w="0" w:type="auto"/>
<w:tblLook w:val="000001E0"/>
</w:tblPr>
The next element inside the tbl element is the tblGrid element, which contains one gridCol element for each column in the table. The w attribute of the gridCol element gives the width of the column in twips. In this example, there are two columns, one 1770 twips and one 1400 twips wide:
<w:tblGrid>
<w:gridCol w:w="1770"/>
<w:gridCol w:w="1400"/>
</w:tblGrid>
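If you are reading a document and need the column widths in more familiar units, remember that a twip is one-twentieth of a point, so there are 1,440 twips to an inch. The following Python sketch reads the tblGrid element shown above with the standard xml.etree.ElementTree module; the helper name column_widths_twips is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
TWIPS_PER_INCH = 1440  # 20 twips per point, 72 points per inch

grid_xml = f"""<w:tblGrid xmlns:w="{W}">
  <w:gridCol w:w="1770"/>
  <w:gridCol w:w="1400"/>
</w:tblGrid>"""


def column_widths_twips(tbl_grid):
    """Return the w attribute of each gridCol child as an integer."""
    return [
        int(col.get(f"{{{W}}}w"))
        for col in tbl_grid.findall(f"{{{W}}}gridCol")
    ]


widths = column_widths_twips(ET.fromstring(grid_xml))
total_inches = sum(widths) / TWIPS_PER_INCH
print(widths, round(total_inches, 2))
```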
With the table now defined, tr elements are added to contain the cells with the table's content. The tr element can contain a trPr element, which holds the properties for the row (for example, the row's height and whether it can be split across a page). The following example omits the trPr element.
Within the tr element, the row's cells, which are defined by tc elements, contain the table's content. Within a tc element, the tcPr element contains the properties for the cell. In the following example:
Also within the tc element is the cell's content. In this example, the content is a p element with a single run with a single piece of text:
<w:tbl>
<w:tr>
<w:tc>
<w:tcPr>
<w:tcW w:w="1770" w:type="dxa"/>
</w:tcPr>
<w:p>
<w:r>
<w:t>Hello, World</w:t>
</w:r>
</w:p>
</w:tc>
</w:tr>
</w:tbl>
You can merge cells by using the vmerge (merge cells vertically) and hmerge (merge cells horizontally) elements in the tcPr element. An empty vmerge or hmerge element with its val attribute set to "restart" marks the start of a merged range. A vmerge or hmerge element with no attributes (or with the val attribute set to "continue") marks a cell that is part of the merged region. In this example, the last cell in the first row starts a merge that is completed in the cell below it:
<w:tr>
<w:tc>
<w:p>
<w:r>
<w:t>First cell, first row</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:vmerge w:val="restart"/>
</w:tcPr>
<w:p>
<w:r>
<w:t>Last cell, first row </w:t>
</w:r>
</w:p>
</w:tc>
</w:tr>
<w:tr>
<w:tc>
<w:p>
<w:r>
<w:t>First cell, second row</w:t>
</w:r>
</w:p>
</w:tc>
<w:tc>
<w:tcPr>
<w:vmerge />
</w:tcPr>
<w:p>
<w:r>
<w:t>Last cell, second row </w:t>
</w:r>
</w:p>
</w:tc>
</w:tr>
Figure 9 shows the results of merging the two cells. The space that the second cell of the last row would occupy is now just a continuation of the cell above it and can have no separate content. The content specified in the WordprocessingML file for the second cell of the last row disappears when the WordprocessingML is displayed in Word. The content is still present, though, and can be retrieved through Word's object model.
Figure 9. The results of merging two cells
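When you process a table outside of Word, you may want to treat continuation cells the way Word displays them, that is, as having no content of their own. The following Python sketch classifies each cell by its vmerge element and reports None for cells that are covered by the cell above. The function names are hypothetical, the sample table is a shortened version of the example above, and hmerge is not handled.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

sample = f"""<w:tbl xmlns:w="{W}">
<w:tr>
<w:tc><w:p><w:r><w:t>A1</w:t></w:r></w:p></w:tc>
<w:tc><w:tcPr><w:vmerge w:val="restart"/></w:tcPr><w:p><w:r><w:t>B1</w:t></w:r></w:p></w:tc>
</w:tr>
<w:tr>
<w:tc><w:p><w:r><w:t>A2</w:t></w:r></w:p></w:tc>
<w:tc><w:tcPr><w:vmerge/></w:tcPr><w:p><w:r><w:t>B2</w:t></w:r></w:p></w:tc>
</w:tr>
</w:tbl>"""


def vmerge_state(tc):
    """Return "restart", "continue", or None for a tc element."""
    vmerge = tc.find("w:tcPr/w:vmerge", NS)
    if vmerge is None:
        return None
    # A vmerge element with no val attribute continues the merge.
    return vmerge.get(f"{{{W}}}val", "continue")


def visible_cell_text(tbl):
    """Return one list per row; cells hidden by a vertical merge are None."""
    rows = []
    for tr in tbl.findall("w:tr", NS):
        row = []
        for tc in tr.findall("w:tc", NS):
            if vmerge_state(tc) == "continue":
                row.append(None)
            else:
                row.append("".join(t.text or "" for t in tc.iterfind(".//w:t", NS)))
        rows.append(row)
    return rows


rows = visible_cell_text(ET.fromstring(sample))
print(rows)
```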
For more information, Table 11 lists the table property elements, Table 12 lists the table positioning elements, Table 13 lists the child elements for table row properties, and Table 14 lists the child elements for table cell properties.
The document has a set of properties, held in the docPr element. Table 9 lists the properties that can be set at the document level. Some useful settings for developers who are creating documents include:
The following example shows a set of document properties that set the user's view to normal, zoom the view to full page, and prevent the user from changing the formatting in the document:
<w:docPr>
<w:view w:val="normal"/>
<w:zoom w:val="full-page" w:percent="100"/>
<w:documentProtection w:formatting="on" w:enforcement="on"/>
</w:docPr>
The DocumentProperties element performs a different function from the docPr element. Like docPr, DocumentProperties is a container for other elements. The DocumentProperties element, however, is not part of the WordprocessingML namespace but is part of the Microsoft Office Common Properties namespace ("urn:schemas-microsoft-com:office:office"), a set of elements common to all Office applications.
The DocumentProperties element contains meta-information about the document, including the document's title, version, and author. Some statistics about the document are also kept in the DocumentProperties element, including the number of characters, pages, lines, and paragraphs. Here's a sample DocumentProperties element:
<o:DocumentProperties>
<o:Title>Sample Document</o:Title>
<o:Author>Jane Doe</o:Author>
<o:Pages>1</o:Pages>
<o:Words>2</o:Words>
<o:Characters>15</o:Characters>
<o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:Version>11.5606</o:Version>
</o:DocumentProperties>
WordprocessingML stores graphics as a combination of Vector Markup Language (VML) and a binary representation of the image. A discussion of VML is outside the scope of this document, but this section shows how picture data fits into the structure of a WordprocessingML document.
Some shapes are very easy to add. For instance, to add a rectangle to your document, you only need the VML rect (rectangle) element. The element's style attribute holds the information to draw a rectangle in the right place at the right size:
<v:rect id="_x0000_s1032"
style="position:absolute;margin-left:63pt;margin-top:4.2pt;width:54pt;height:45pt;z-index:1" />
To use the VML rect element, you must add the Vector Markup Language (VML) namespace ("urn:schemas-microsoft-com:vml") to the namespaces declared in your document. The Common Properties namespace ("urn:schemas-microsoft-com:office:office") may also be required if you intend to include anything more than the simplest AutoShapes. Typically, you'll establish these namespaces in the document's root element:
<w:wordDocument
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xml:space="preserve">
Your graphic must appear inside a pict element inside an r element inside a p element. This example uses the VML rect element to add a simple rectangle to the document:
<w:p>
<w:r>
<w:pict>
<v:rect id="_x0000_s1032"
style="position:absolute;margin-left:63pt;margin-top:4.2pt;width:54pt;height:45pt;z-index:1" />
</w:pict>
</w:r>
</w:p>
If a graphic consists of more than just a simple shape, you will also need to include a base64-encoded version of the graphic:
<w:p>
<w:r>
<w:pict>
<v:shapetype id="_x0000_t75" coordsize="21600,21600"
o:spt="75" o:preferrelative="t" />
<w:binData w:name="http://01000001.gif">R0lGODlhQQAzAKI
...more base64-encoded data...
q18Ldi1baGzZt1/nZr07dW/Tv0cHDz3cc3HOxzMnt7x8
</w:binData>
<v:shape id="_x0000_i1025" type="_x0000_t75"
style="width:48.75pt;height:38.25pt">
<v:imagedata src="http://01000001.gif"
o:title="FolderN" />
</v:shape>
</w:pict>
</w:r>
</w:p>
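To recover the image from a binData element, decode its text content as base64. The following Python sketch uses only the standard base64 module. It decodes the first eight characters of the sample above, which are enough to reveal the file signature; the helper name decode_bin_data is hypothetical, and stripping whitespace assumes the data is wrapped across lines as it is in the sample.

```python
import base64

# First eight characters of the base64 text in the sample binData element.
prefix = "R0lGODlh"
signature = base64.b64decode(prefix)
print(signature)


def decode_bin_data(text):
    """Decode the text of a binData element, ignoring embedded whitespace."""
    return base64.b64decode("".join(text.split()))
```

The decoded prefix is the six-byte GIF signature, which is consistent with the ".gif" name given in the binData element's name attribute.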
Bookmarks are not part of the WordprocessingML namespace but are part of the Annotation Markup Language namespace ("http://schemas.microsoft.com/aml/2001/core"), which is conventionally prefixed with "aml". In a WordprocessingML document, annotation elements, which are empty elements, bracket the area that is bookmarked. An annotation element with its type attribute set to "Word.Bookmark.Start" marks the start of a bookmark area; an annotation element with its type attribute set to "Word.Bookmark.End" marks the end of the bookmark.
In this example, a complete paragraph (containing the text "Inside bookmark") has been bookmarked with a bookmark called "MyBookmark":
<w:p>
<w:r>
<w:t>Before bookmark</w:t>
</w:r>
</w:p>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start"
w:name="MyBookmark" />
<w:p>
<w:r>
<w:t>Inside bookmark</w:t>
</w:r>
</w:p>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
<w:p>
<w:r>
<w:t>After bookmark</w:t>
</w:r>
</w:p>
If a bookmark is inserted without enclosing any text, the "Word.Bookmark.Start" annotation element will be immediately followed by the corresponding "Word.Bookmark.End" annotation element:
<w:p>
<w:r>
<w:t>text sur</w:t>
</w:r>
<aml:annotation aml:id="1" w:type="Word.Bookmark.Start"
w:name="MyOtherBookmark" />
<aml:annotation aml:id="1" w:type="Word.Bookmark.End" />
<w:r>
<w:t>rounding bookmark</w:t>
</w:r>
</w:p>
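Because the start and end annotation elements are matched by their id attribute rather than by nesting, an application that reads bookmarks has to walk the document in order and keep track of which bookmarks are open. The following Python sketch shows one way to do that with xml.etree.ElementTree; the function name bookmark_text is hypothetical, and the sample combines the two examples above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"
AML = "http://schemas.microsoft.com/aml/2001/core"

sample = f"""<w:body xmlns:w="{W}" xmlns:aml="{AML}">
<w:p><w:r><w:t>Before bookmark</w:t></w:r></w:p>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="MyBookmark"/>
<w:p><w:r><w:t>Inside bookmark</w:t></w:r></w:p>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End"/>
<w:p><w:r><w:t>text sur</w:t></w:r>
<aml:annotation aml:id="1" w:type="Word.Bookmark.Start" w:name="MyOtherBookmark"/>
<aml:annotation aml:id="1" w:type="Word.Bookmark.End"/>
<w:r><w:t>rounding bookmark</w:t></w:r></w:p>
</w:body>"""


def bookmark_text(root):
    """Map each bookmark name to the text between its start and end markers."""
    open_ids = {}  # aml:id -> bookmark name, for bookmarks not yet closed
    result = {}    # bookmark name -> collected text
    for el in root.iter():  # iter() visits elements in document order
        if el.tag == f"{{{AML}}}annotation":
            kind = el.get(f"{{{W}}}type")
            ident = el.get(f"{{{AML}}}id")
            if kind == "Word.Bookmark.Start":
                name = el.get(f"{{{W}}}name")
                open_ids[ident] = name
                result[name] = ""
            elif kind == "Word.Bookmark.End":
                open_ids.pop(ident, None)
        elif el.tag == f"{{{W}}}t":
            for name in open_ids.values():
                result[name] += el.text or ""
    return result


marks = bookmark_text(ET.fromstring(sample))
print(marks)
```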
In addition to the type attribute, which identifies an annotation element as being used as a bookmark, two attributes of the annotation element are used with bookmarks:
A field is, effectively, a kind of declarative programming. A field is a set of instructions on how part of the document is to be processed. Also included in the field definition are any input parameters and the results of the processing.
A typical Word document with several form fields can be seen in Figure 10.
Figure 10. A Word document with several fields to enter
WordprocessingML supports two kinds of fields:
Simple fields are defined with the fldSimple element. The fldSimple element has an instr (instruction) attribute whose contents define the field's behavior. Within the fldSimple element, an r element holds the results of processing the instructions. For instance, this example creates a simple field that will insert the name of the author from the document properties into the text:
<w:p>
<w:fldSimple w:instr="AUTHOR \* Upper \* MERGEFORMAT">
<w:r>
<w:t>JANE DOE</w:t>
</w:r>
</w:fldSimple>
</w:p>
Complex fields appear in WordprocessingML as a series of r elements inside a paragraph. Each r element contains one part of the field's definition. Three r elements contain fldChar elements, which mark the three parts of a complex field definition:
The fldChar element is used to mark each of these three parts. The fldCharType attribute of the fldChar element is set to "begin", "separate", or "end" to mark the parts of the field definition. The field instructions are placed in the instrText elements. The instrText elements appear between the r element that marks the beginning of the field definition and the r element that marks the end of the instructions. The results of the field's processing are placed between the r element that marks the end of the instructions and the r element that marks the end of the field definition. A small set of fields require additional information. For example, form fields require that the definition include a fldData element, which holds binary data required by the field.
To make it easier to find the form field when processing the document, you can add a bookmark to identify the field.
One kind of complex field is a form text field. A set of WordprocessingML elements that creates a single form field inside a p element would look like this:
Note To make it easier to find the form field when processing the document, a bookmark has been used to identify the field in this example.
<w:p>
<w:r>
<w:fldChar w:fldCharType="begin">
<w:fldData>////</w:fldData>
</w:fldChar>
</w:r>
<w:r>
<w:instrText>FORMTEXT</w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start"
w:name="MyField" />
<w:t> </w:t>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
Looking at the previous example in detail:
For Word to process form fields correctly, the document must be protected for form editing only. You can turn on this level of protection by adding a documentProtection element to the docPr element at the start of the WordprocessingML document. The edit attribute of the documentProtection element must be set to "forms" and the enforcement attribute must be set to "on". Here's an example:
<w:docPr>
<w:documentProtection w:edit="forms" w:enforcement="on" />
</w:docPr>
After the field has been filled in by the user, the r element that contained the original value will hold the value entered by the user. The result would look like this if the user entered "My Data Entered" into the form field:
<w:p>
<w:r>
<w:fldChar w:fldCharType="begin">
<w:fldData>////</w:fldData>
</w:fldChar>
</w:r>
<w:r>
<w:instrText>FORMTEXT</w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start"
w:name="MyField" />
<w:t>My Data Entered</w:t>
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
</w:r>
<w:r>
<w:fldChar w:fldCharType="end" />
</w:r>
</w:p>
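To read the value back out of a filled-in form, an application can walk the runs of the paragraph and use the fldChar markers to tell the instructions from the result. The following Python sketch does this with xml.etree.ElementTree. The function name field_results is hypothetical, the sample omits the bookmark for brevity, and nested fields are not handled.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"

sample = f"""<w:p xmlns:w="{W}">
<w:r><w:fldChar w:fldCharType="begin"><w:fldData>////</w:fldData></w:fldChar></w:r>
<w:r><w:instrText>FORMTEXT</w:instrText></w:r>
<w:r><w:fldChar w:fldCharType="separate"/></w:r>
<w:r><w:t>My Data Entered</w:t></w:r>
<w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>"""


def field_results(paragraph):
    """Yield (instruction, result) for each complex field in a p element."""
    state = None  # None outside a field, then "instr", then "result"
    instr, result = [], []
    for el in paragraph.iter():
        if el.tag == f"{{{W}}}fldChar":
            kind = el.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                state, instr, result = "instr", [], []
            elif kind == "separate":
                state = "result"
            elif kind == "end" and state is not None:
                yield "".join(instr).strip(), "".join(result)
                state = None
        elif el.tag == f"{{{W}}}instrText" and state == "instr":
            instr.append(el.text or "")
        elif el.tag == f"{{{W}}}t" and state == "result":
            result.append(el.text or "")


fields = list(field_results(ET.fromstring(sample)))
print(fields)
```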
A hyperlink has two components: the hyperlink itself (the text the user will click) and the target for the link. Potential targets include external files, e-mail addresses, and bookmarks. If you are creating a hyperlink in Microsoft Office Word, other targets are supported (for example, the top of the document and headings). However, all of those targets are implemented by adding a bookmark at the appropriate location in the document. In this section, you'll see how to create a bookmark for a target within the document.
For a bookmark to be the target of a hyperlink, it must be a complete bookmark pair and have a name assigned to it. For instance, in Word if the user creates a hyperlink to the top of the document, a bookmark called "_top" is inserted at the top of the document. The resulting WordprocessingML looks like this:
<aml:annotation aml:id="0" w:type="Word.Bookmark.Start" w:name="_top" />
<aml:annotation aml:id="0" w:type="Word.Bookmark.End" />
The hyperlink that points to this bookmark is represented in WordprocessingML by an hlink element that has "_top" in its bookmark attribute. The text that Word displays as the hyperlink must be inside an r element between the hlink element's opening and closing tags (see Figure 11 for how the link appears in Word):
<w:p>
<w:hlink w:bookmark="_top">
<w:r>
<w:rPr>
<w:rStyle w:val="Hyperlink" />
</w:rPr>
<w:t>Go To Top</w:t>
</w:r>
</w:hlink>
</w:p>
Figure 11. A hyperlink in Microsoft Office Word
You can use any style that you want with your hyperlink. However, the "Hyperlink" style that is generated by Microsoft Office Word is what most users will recognize as the visual clue for a hyperlink (underlined blue text). For consistency's sake, you should consider adding this style to your document and using it with your hyperlinks:
<w:style w:type="character" w:styleId="Hyperlink">
<w:name w:val="Hyperlink" />
<w:basedOn w:val="DefaultParagraphFont" />
<w:rsid w:val="365462" />
<w:rPr>
<w:color w:val="0000FF" />
<w:u w:val="single" />
</w:rPr>
</w:style>
Two other attributes of the hlink element can be useful in generating a WordprocessingML hyperlink that will be read by Word:
<w:hlink w:bookmark="_top" w:screenTip="a ScreenTip">
Figure 12. A Word hyperlink with a ScreenTip
A document can also contain Visual Basic for Applications (VBA) code, toolbar modifications, OLE custom controls (OCX) and other "active" components. All of these items can be represented in WordprocessingML. In this section, you'll be introduced to how WordprocessingML stores VBA code and OCX controls. You'll also see how Word ensures that software can detect whether these components are present in the document so that the component can, for instance, be scanned for viruses. Word also ensures that if components are not made visible in WordprocessingML, they will not be executed.
For VBA code, a base64-encoded version of the binary file generated by the VBA editor is held in the binData element inside the docSuppData element. The binData element has a name attribute whose value must be set to "editdata.mso". The docSuppData element is a top-level element under the wordDocument root element, and follows the styles element in a document created by Word.
A typical VBA module in a WordprocessingML document looks like this:
<w:docSuppData>
<w:binData w:name="editdata.mso">
QWN0aXZlTWltZQAAAfAEAAAA/////wAAB/AbDwAABA
...more base64-encoded data...
LgBNAFkATQBPAEQAVQBMAEUAAABAAAAL8AQAAAASNFZ4
</w:binData>
</w:docSuppData>
Representing an OCX control in WordprocessingML is more complicated than storing VBA code because an OCX control also has a graphical representation in the document. For OCX controls, a binData element within a docOleData element is used to hold the OLE data. For OCX controls, the name attribute of the binData element must be set to "oledata.mso".
<w:docOleData>
<w:binData w:name="oledata.mso">
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAA
...more base64-encoded data...
C4zcL+WTKDhJozVltEGRkTOwQAROjpejLDyT5d+/F5BeLt5n3wv4P/Cl4BK=
</w:binData>
</w:docOleData>
Later in the document, a set of VML-related elements will handle the display of the component.
Two attributes of the wordDocument element are used to indicate the presence of the VBA code and OCX controls: macrosPresent for VBA code and embeddedObjectPresent for OCX controls.
The macrosPresent attribute is used to indicate that macros are present in the document. If the attribute is missing or if it's set to "no", Word won't load a document that has a docSuppData element. This attribute is strictly enforced. If, for instance, the attribute is present and is set to "yes" (indicating that macros are supposed to be present), and Word doesn't find a docSuppData element before it finds the body element, Word will not load the document.
Note Once the document is loaded, Word's security settings will control whether the VBA code will be allowed to execute.
The second attribute is the embeddedObjectPresent attribute, which indicates that an OCX control may have been used in the document. If the attribute is missing or if it's set to "no", Word won't load a document that has a docOleData element. This attribute is not, however, strictly enforced. If the attribute is present and is set to "yes", but Word doesn't find a docOleData element before the body element, Word will still load the document.
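If your application generates WordprocessingML, it can be worth checking these two rules before handing the document to Word. The following Python sketch encodes the rules as stated above. It is a simplification: the function names are hypothetical, it assumes both attributes are in the WordprocessingML namespace and spelled as in this text, and it ignores the requirement that docSuppData appear before the body element.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.microsoft.com/office/word/2003/wordml"


def load_problems(root):
    """List the reasons, per the rules above, that Word would refuse to load."""
    has_vba = root.find(f"{{{W}}}docSuppData") is not None
    has_ole = root.find(f"{{{W}}}docOleData") is not None
    # Assumption: attribute names and namespace follow the text above.
    macros = root.get(f"{{{W}}}macrosPresent", "no")
    embedded = root.get(f"{{{W}}}embeddedObjectPresent", "no")
    problems = []
    if has_vba and macros != "yes":
        problems.append("docSuppData found but macrosPresent is not 'yes'")
    if macros == "yes" and not has_vba:
        problems.append("macrosPresent is 'yes' but docSuppData is missing")
    if has_ole and embedded != "yes":
        problems.append("docOleData found but embeddedObjectPresent is not 'yes'")
    # embeddedObjectPresent="yes" without docOleData is tolerated: no check.
    return problems


def make_doc(attrs="", children=""):
    """Build a tiny test document (hypothetical helper for this sketch)."""
    return ET.fromstring(
        f'<w:wordDocument xmlns:w="{W}" {attrs}>{children}<w:body/></w:wordDocument>'
    )


print(load_problems(make_doc('w:macrosPresent="yes"')))
print(load_problems(make_doc('w:embeddedObjectPresent="yes"')))
```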
When a WordprocessingML document is created in Word, a number of elements are included that provide information to any applications used to read the document. These auxHint elements, from the Auxiliary XML Document 2003 namespace ("http://schemas.microsoft.com/office/word/2003/auxHint"), provide information about how Word handled various elements. Setting the auxHint attributes and elements has no effect on how Word behaves. These elements are provided for the use of other WordprocessingML processing tools and provide a convenient way to access information that would otherwise be difficult to determine.
When you are creating a document, there is no problem with using the WordprocessingML sectPr elements and omitting the auxHint section elements in your document. However, when a WordprocessingML document is read, the sect elements provide containers for the sections of your document. These containers can be very useful to the application that is processing the document, especially when XSL transformations (XSLTs) are used, because XSLTs are oriented towards processing child elements inside container elements.
WordprocessingML does not use a container element for a section but, instead, marks the end of a section with a sectPr element. However, Word does generate sect elements to enclose the p elements that make up a section whenever possible, creating a true XML container for sections. Nonetheless, if the inclusion of a sect element would generate invalid XML (for example, if a section break occurs within a list or table), Word does not write out the sect element.
Within a sect element, each table of contents heading generates sub-section elements that enclose content at a particular heading level or lower.
A WordprocessingML document may contain any number of sect elements. If the document contains multiple sectPr elements, which define multiple sections, the body will consist of a series of sect elements. Taking the sect elements into account, there are three possible structures for the body element: a single sect element, a series of sect elements, or no sect element at all:
<w:body>
<wx:sect>
<w:p>
</w:p>
...etc. ...
</wx:sect>
</w:body>
<w:body>
<wx:sect>
<w:p>
</w:p>
...etc. ...
</wx:sect>
<wx:sect>
<w:p>
</w:p>
...etc. ...
</wx:sect>
...etc. ...
</w:body>
<w:body>
<w:p>
</w:p>
...etc. ...
</w:body>
A sub-section element is generated by Word whenever a paragraph is found that has an outlineLvl element assigned in the p element's pPr element. In this example, for instance, the paragraph is assigned to the third level of the outline (the lowest level is 0):
<w:p>
<w:pPr>
<w:outlineLvl w:val="2" />
</w:pPr>
<w:r>
<w:t>x</w:t>
</w:r>
</w:p>
Outline levels are frequently assigned through styles. In Word, the "Heading 1" style has an outline level of 0 set in its pPr element. Any text formatted with the "Heading 1" style picks up that outline level and generates a sub-section element.
Word nests sub-section elements within each other, depending on the outline level. When Word finds a paragraph with an outlineLvl element assigned to it, Word generates an opening sub-section element. If the outlineLvl element just found is higher than the previous outlineLvl element, the new sub-section element will be nested within the sub-section created for the earlier outlineLvl; if the previous outlineLvl was equal to or higher than the outlineLvl just found, closing elements for all the higher level sub-section elements are generated before the new sub-section element is opened.
In this example, for instance, there are four headings at two heading levels:
Heading Level 1
Paragraph1
Paragraph2
Heading Level 2
Paragraph3
Paragraph4
Heading Level 2
Paragraph5
Paragraph6
Heading Level 1
Paragraph7
Omitting all other WordprocessingML elements, the auxiliary sect and sub-section elements that Word would generate would look like this:
<wx:sect>
<wx:sub-section>
Heading Level 1
Paragraph1
Paragraph2
<wx:sub-section>
Heading Level 2
Paragraph3
Paragraph4
</wx:sub-section>
<wx:sub-section>
Heading Level 2
Paragraph5
Paragraph6
</wx:sub-section>
</wx:sub-section>
<wx:sub-section>
Heading Level 1
Paragraph7
</wx:sub-section>
</wx:sect>
Inserting a section break will create a new sect element in the document and close all open sub-section elements. In this sample, a section break has been added after paragraph4:
Heading Level 1
Paragraph1
Paragraph2
Heading Level 2
Paragraph3
Paragraph4
Section Break
Heading Level 1
Paragraph5
The resulting sect and sub-section elements would look like this:
<wx:sect>
<wx:sub-section>
Heading Level 1
Paragraph1
Paragraph2
<wx:sub-section>
Heading Level 2
Paragraph3
Paragraph4
</wx:sub-section>
</wx:sub-section>
</wx:sect>
<wx:sect>
<wx:sub-section>
Heading Level 1
Paragraph5
</wx:sub-section>
</wx:sect>
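The nesting rule described above amounts to keeping a stack of open outline levels. The following Python sketch is an illustration of that rule, not Word's implementation: the names sketch_structure and SECTION_BREAK are hypothetical, and each paragraph is reduced to its text plus an outline level (None for body text). It reproduces the section-break listing above.

```python
SECTION_BREAK = object()  # hypothetical marker for a section break


def sketch_structure(items):
    """Return the sect/sub-section skeleton for a list of paragraphs.

    items: (text, outline_level) tuples, with outline_level None for
           ordinary paragraphs, or SECTION_BREAK between sections.
    """
    out, open_levels = ["<wx:sect>"], []

    def close_down_to(level):
        # Close every open sub-section at this outline level or deeper.
        while open_levels and open_levels[-1] >= level:
            open_levels.pop()
            out.append("</wx:sub-section>")

    for item in items:
        if item is SECTION_BREAK:
            close_down_to(0)  # a section break closes all sub-sections
            out += ["</wx:sect>", "<wx:sect>"]
            continue
        text, level = item
        if level is not None:
            close_down_to(level)
            open_levels.append(level)
            out.append("<wx:sub-section>")
        out.append(text)
    close_down_to(0)
    out.append("</wx:sect>")
    return out


doc = [
    ("Heading Level 1", 0), ("Paragraph1", None), ("Paragraph2", None),
    ("Heading Level 2", 1), ("Paragraph3", None), ("Paragraph4", None),
    SECTION_BREAK,
    ("Heading Level 1", 0), ("Paragraph5", None),
]
skeleton = sketch_structure(doc)
print("\n".join(skeleton))
```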
The tab element takes three attributes from the auxiliary namespace: wTab, tlc, and cTlc. If you're reading a document and need to determine where some text that is positioned on a tab stop will fall horizontally on the page, these properties provide useful information.
In WordprocessingML, the tab element moves the text following it to the next tab stop in the document. The wTab attribute that Word adds to the tab element provides the distance (in twips) between the previous character in the document and the first character of the text at the tab stop. In this example, the word "Hello" is 2,880 twips from the end of the previous text:
<w:tab wx:wTab="2880" wx:tlc="none" wx:cTlc="14"/><w:t>Hello</w:t>
To get the absolute distance between tab stops, you should reference the settings in the tab elements of the pPr (paragraph properties) element.
The tlc attribute reports on how the space before the tab is filled. Values for this attribute are:
The cTlc attribute states how many dots were used in the leader before the tab stop. Going back to the previous example, the word "Hello" would have 14 dots between it and the previous text. However, in the example no leader was shown, as indicated by the tlc setting of "none".
The font element describes the font used by Word for part of the document. In this example, the run is displayed in the MS Mincho font:
<w:rPr>
<wx:font wx:val="MS Mincho"/>
</w:rPr>
<w:t>Hello, World</w:t>
You can't use the font element to set which font is used; like the other auxHint elements, it only reports what Word did.
The estimate attribute can appear as an attribute on a number of elements that hold numerical information. Where the estimate attribute appears, it will be set to either "true" or "false" and indicates whether Word has estimated the value in the element ("true" indicates that the value has been estimated).
Notes
Table 1. WordprocessingML Elements
Name | Description |
---|---|
fonts | A container element containing information about the fonts used in the document. Also contains the defaultFont element, which specifies the default fonts for the document. |
lists | A container element for list definitions and the assignments of a list id to a list definition. |
list | Associates a list id with a particular list definition. |
listDef | A container element for the definition of a list. See Table 15. |
DocumentProperties | A container element for Office-related elements containing information and statistics about the document. |
docPr | A container element for elements that set properties for the document as a whole. See Table 9. |
styles | A container element for the styles defined within the document. |
style | A container element that defines a specific style. Styles are referred to by other elements in the document using the styleId attribute. Table 5 lists the attributes of the style element; Table 6 lists children of the style element. |
body | Contains the portion of the document that holds the text that will be displayed to the user. |
p | A paragraph containing one or more runs. |
r | A run of one or more t elements to be displayed with a consistent set of properties. |
t | Contains the text to be displayed. |
pPr | Container for paragraph properties. For the child elements, see Table 3. |
tabs | Container element holding tab elements that define tab stops for a paragraph or style. |
tab | Defines a single tab stop. Attributes are listed in Table 7. |
br | Inserts a break between t elements inside an r element. The type attribute controls what kind of break is inserted: "page", "column", or "text-wrapping" (the default). |
tbl | A container element for a table. |
tblPr | A container element for properties of a table. See Table 11. |
tblpPr | A container element for the elements that control the position of a floating table. See Table 12. |
tr | A container element for the cells in a table that make up a table row. |
trPr | A container element for the properties of a row in a table. See Table 13. |
tc | Contains the content for one cell in a table. |
tcPr | A container element for the properties for a cell in a table. See Table 14. |
sectPr | A container element for the definition of the section of the document preceding the sectPr element. |
ftr | Container element in a sectPr element for the text to be displayed in the page footer. |
hdr | Container element in a sectPr element for the text to be displayed in the page header. |
titlePg | Used in the sectPr element to indicate that a separate header and footer for the first page of this section is allowed. |
docSuppData | Container for VBA code. |
binData | Child element of the docSuppData element. The binData element holds the binary representation of the VBA project. |
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 2. rPr Child Elements (Run Properties)
Element | Description | Definition |
---|---|---|
rStyle | Character style for this run. | String. The character style is set in the styles section. |
rFonts | Fonts for this run. | String. A font named in the fonts section, or "default", "fareast", or "cs". |
b | Sets Latin and Asian characters to bold. | "on", "off" |
b-cs | Sets complex scripts characters to bold. | "on", "off" |
i | Sets Latin and Asian characters to italic. | "on", "off" |
i-cs | Sets complex scripts characters to italic. | "on", "off" |
caps | Formats lowercase text as capital letters (does not affect numbers, punctuation, non-alphabetic characters, or uppercase letters). | "on", "off" |
smallcaps | Formats lowercase text as capital letters and reduces their size (does not affect numbers, punctuation, non-alphabetic characters, or uppercase letters). | "on", "off" |
strike | Draws a line through the text. | "on", "off" |
dstrike | Draws a double line through the text. | "on", "off" |
outline | Displays the inner and outer borders of each character. | "on", "off" |
shadow | Adds a shadow behind the text, beneath and to the right of the text. | "on", "off" |
emboss | Makes text appear as if it is raised off the page in relief. | "on", "off" |
imprint | Makes selected text appear to be imprinted or pressed into page (also referred to as "engrave"). | "on", "off" |
noproof | Formats the text so that spelling and grammar errors are ignored in this run. | "on", "off" |
snaptogrid | Sets the number of characters per line to match the number of characters specified in the docGrid element of the current section's properties. | "on", "off" |
vanish | Prevents the text in this run from being displayed or printed. | "on", "off" |
webHidden | Prevents the text in this run from being displayed when this document is saved as a Web page. | "on", "off" |
color | Specifies either an automatic color or a hexadecimal color code for this run. | 3-byte hexBinary (for example, "0000FF") or "auto" |
spacing | The amount by which the spacing between characters is expanded or condensed. | Integer |
w | Stretches or compresses text horizontally as a percentage of its current size. | Integer |
kern | The smallest font size for which kerning should be automatically adjusted. | unsignedInt |
position | The amount by which text should be raised or lowered in relation to the baseline. | Integer |
sz | Font size for the Asian and Latin fonts in this run, in half-points. | Integer |
sz-cs | Font size for complex script fonts in this run, in half-points. | Integer |
highlight | Highlights text so it stands out from the surrounding text. | A highlight color name (for example, "yellow") or "none"; see the schema. |
u | Underline formatting for this run. | underlineValue |
effect | Animated text effect for this run. | textEffectValues |
bdr | Border for characters in this run. | borderValues |
shd | Shading for characters in this run. | shdValues |
fitText | Width of the space that this run fits into. | unsignedInt |
vertAlign | Adjusts the vertical position of the text relative to the baseline and changes the font size if possible (to raise or lower the text without reducing the font size, use the position element). | "baseline", "superscript", or "subscript" |
rtl | Sets the alignment and reading order for this run to right-to-left. | "on", "off" |
cs | True if text in this run is complex scripts text. | "on", "off" |
em | Sets the type of emphasis mark for this run. | "none", "dot", "comma", "circle", or "under-dot" |
hyphen | Hyphenation style for this run. | String |
lang | Languages for this run. | 2-byte hexBinary or a string |
asianLayout | Special Asian layout formatting properties. | See the schema. |
specVanish | Property that makes text in this run always hidden. | "on", "off" |
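As an illustration of how the run properties in the table above combine, the following fragment is a minimal sketch of a run that is bold, italic, red, 12 points (sz is expressed in half-points, so 24), and underlined. The text and the specific values are assumptions chosen for the example, and the namespace declaration is repeated on the w:r element only so that the fragment is well-formed on its own; in a full document it is declared once on w:wordDocument.

```xml
<w:r xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <!-- rPr holds the formatting for this run only -->
  <w:rPr>
    <w:b w:val="on"/>
    <w:i w:val="on"/>
    <w:color w:val="FF0000"/>
    <w:sz w:val="24"/>
    <w:u w:val="single"/>
  </w:rPr>
  <w:t>Bold, italic, red, underlined text.</w:t>
</w:r>
```

The child elements are written in the same order as the table rows; consult the schema for the ordering it requires.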
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 3. pPr Child Elements (Paragraph Properties)
Element | Description | Definition |
---|---|---|
pStyle | Paragraph style. | String (a style defined in the styles element of the document). |
keepNext | Keep with next paragraph: Prevents a page break between this paragraph and the next. | "on", "off" |
keepLines | Keep lines together: Prevents a page break in this paragraph. | "on", "off" |
pageBreakBefore | Forces a page break before this paragraph. | "on", "off" |
framePr | Text frame and drop cap properties. | FramePrProperty |
widowControl | Prevents Word from printing the last line of a paragraph by itself at the top of the page (widow) or the first line of a paragraph at the bottom of a page (orphan). | "on", "off" |
listPr | List properties. | listPrElt |
supressLineNumbers | Prevents line numbers from appearing next to the paragraph. This setting has no effect in documents or sections with no line numbers. | "on", "off" |
pBdr | Borders for the paragraph. | pBdrElt |
shd | Paragraph shading. | ShdValues |
tabs | A container holding a list of tab elements. | tabsElt |
suppressAutoHyphens | Prevents automatic hyphenation. | "on", "off" |
bidi | Sets the alignment and reading order for a paragraph to right-to-left. | "on", "off" |
adjustRightInd | Automatically adjusts the right indent when you are using the document grid. | "on", "off" |
snapToGrid | Aligns text to document grid (when defined). | "on", "off" |
spacing | Spacing between lines and paragraphs. | Two attributes: before, after. Each contains spacing distance in twips. |
ind | Paragraph indentation. | Integer (twips) |
contextualSpacing | Don't add space between paragraphs of the same style. | "on", "off" |
suppressOverlap | Don't allow this frame to overlap. | "on", "off" |
jc | Paragraph alignment. | "left", "right", "center", "both", "medium-kashida", "distribute", "list-tab", "high-kashida", "low-kashida", "thai-distribute" |
textDirection | Orientation for the paragraph in the current cell, text box, or text frame. | "lr-tb", "tb-rl", "bt-lr", "lr-tb-v", "tb-rl-v" |
outlineLvl | Outline level. | Integer |
divId | ID of HTML DIV element this paragraph is currently in. | Integer |
rPr | Run properties for the paragraph mark. | Properties for all r elements within this p element |
sectPr | Section properties for the section that terminates at this paragraph mark. | Contains the properties for the section. Appears in the last paragraph in the section. |
kinsoku | Asian typography: Use East Asian typography and line-breaking rules to determine which characters begin and end a line on a page. | "on", "off" |
wordWrap | Asian typography: Allows a line to break in the middle of a Latin word. | "on", "off" |
overflowPunct | Asian typography: Allows punctuation to continue one character beyond the alignment of other lines in the paragraph. If you do not use this option, all lines and punctuation must be perfectly aligned. | "on", "off" |
topLinePunct | Asian typography: Allows punctuation to compress at the start of a line, which lets subsequent characters move in closer. | "on", "off" |
autoSpaceDE | Asian typography: Automatically adjusts character spacing between East Asian and Latin text. | "on", "off" |
autoSpaceDN | Asian typography: Automatically adjusts character spacing between East Asian text and numbers. | "on", "off" |
textAlignment | Asian typography: Determines the vertical alignment of all text in a line. | "top", "center", "baseline", "bottom", "auto" |
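The paragraph properties in the table above are placed in a pPr element at the start of the paragraph. The following fragment is a minimal sketch, with assumed values, of a centered paragraph that stays with the next paragraph and has 6 points (120 twips) of space before and after it. The left attribute on the ind element is an assumption based on typical Word output; the namespace declaration appears on w:p only so that the fragment is well-formed on its own.

```xml
<w:p xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:pPr>
    <w:keepNext w:val="on"/>
    <!-- spacing distances are in twips (1/20 of a point) -->
    <w:spacing w:before="120" w:after="120"/>
    <!-- assumed attribute: indent the paragraph 720 twips (0.5 inch) from the left -->
    <w:ind w:left="720"/>
    <w:jc w:val="center"/>
  </w:pPr>
  <w:r>
    <w:t>A centered paragraph that stays with the next one.</w:t>
  </w:r>
</w:p>
```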
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 4. sectPr Child Elements (Section Properties)
Element | Description | Definition |
---|---|---|
hdr | Headers that appear at the top of the page in this section. | hdrElt |
ftr | Footers that appear at the bottom of the page in this section. | ftrElt |
footnotePr | Footnote properties for this section. | ftnEdnPropsElt |
endnotePr | Endnote properties for this section. | ftnEdnPropsElt |
type | Section type. | sectTypeElt |
pgSz | Specifies the size and orientation of this page. | pageSzType |
pgMar | Specifies the page margins. | pageMarType |
paperSrc | Specifies where the paper is located in the printer. | paperSourceType |
pgBorders | Specifies the page borders. | pageBordersType |
lnNumType | Specifies the line numbering. | lineNumberType |
pgNumType | Specifies the page numbering options. | pageNumberType |
cols | Specifies the column properties for this section. | columnsType |
formProt | Turns form protection on for this section alone. | "on", "off" |
vAlign | Sets alignment for text vertically between the top and bottom margins. | "top", "center", "both", "bottom" |
noEndnote | Suppresses endnotes that would ordinarily appear at the end of this section. | "on", "off" |
titlePg | Specifies that the first page of this section is different and will have different headers and footers. | "on", "off" |
textFlow | Specifies text flow. | "lr-tb", "tb-rl", "bt-lr", "lr-tb-v", "tb-rl-v" |
bidi | Specifies that this section contains bidirectional (complex scripts) text. | "on", "off" |
rtlGutter | Positions the gutter at the right of the document. | "on", "off" |
docGrid | Specifies the type of document grid. | "default", "lines", "lines-and-chars", "snap-to-chars" |
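To show how several of the section properties above fit together, here is a hedged sketch of a sectPr element for a US Letter page (12240 by 15840 twips) with a different first page. The attribute names on pgSz, pgMar, and cols follow what Word typically writes and are assumptions here, since the table gives only their type names; consult the schema for the full attribute lists.

```xml
<w:sectPr xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <!-- page size in twips: 8.5 x 11 inches -->
  <w:pgSz w:w="12240" w:h="15840"/>
  <!-- margins in twips: 1 inch top and bottom, 1.25 inches left and right -->
  <w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800"
           w:header="720" w:footer="720" w:gutter="0"/>
  <w:cols w:space="720"/>
  <w:vAlign w:val="top"/>
  <w:titlePg w:val="on"/>
</w:sectPr>
```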
Table 5. style Element Attributes
Attribute | Description | Values |
---|---|---|
type | Type of style. | "paragraph", "character", "table", "list" |
styleId | Name used to refer to this style within XML. Unique within the file. | String |
default | Specifies whether this style is the default for this type of style. | "on", "off" |
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 6. style Child Elements (Style Definitions)
Element | Description | Definition |
---|---|---|
name | Primary name of style; built-in style names are converted to a language-independent form. | String |
aliases | Secondary names of style, separated by commas. | String |
sti | Built-in style unique numerical identifier. | DecimalNumberProperty |
basedOn | styleId (name of style) this style is based on. | String |
next | styleId of the next-paragraph style; used only for paragraph styles. | String |
link | styleId of the linked style; used only for linked paragraph and character styles. | StringProperty |
hidden | Don't show this style to the user. | "on", "off" |
semiHidden | Don't show this style to the user unless they request to see it. | "on", "off" |
locked | Restricts use of this style by the end user. | "on", "off" |
rsidHex | Revision Save Id for this style: a unique identifier used to track when the style was last changed. | NumberProperty |
pPr | Paragraph properties for the style, if any. | See Table 3. |
rPr | Character properties for the style, if any. | See Table 2. |
tblPr | Table properties. | See Table 11. |
trPr | Table row properties. | See Table 13. |
tcPr | Table cell properties. | See Table 14. |
tblStylePr | Conditional override properties for table styles. | tblStylePrElt |
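The style attributes and child elements above can be combined as in the following sketch of a paragraph style definition. The style name "Callout" and its formatting are hypothetical, and the example assumes that a style with the styleId "Normal" is defined elsewhere in the same styles element. The namespace declaration is included only so that the fragment is well-formed on its own.

```xml
<w:styles xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <!-- "Callout" is a hypothetical style used only for illustration -->
  <w:style w:type="paragraph" w:styleId="Callout">
    <w:name w:val="Callout"/>
    <w:basedOn w:val="Normal"/>
    <w:next w:val="Normal"/>
    <w:pPr>
      <w:keepNext w:val="on"/>
      <w:jc w:val="center"/>
    </w:pPr>
    <w:rPr>
      <w:b w:val="on"/>
      <w:sz w:val="28"/>
    </w:rPr>
  </w:style>
</w:styles>
```

A paragraph would then refer to this style through the pStyle element in its pPr, using the styleId value.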
Table 7. tab Element Attributes
Attribute | Description | Values |
---|---|---|
val | The type of tab stop. | "clear", "left", "right", "center", "decimal", "bar", "list" |
leader | How empty space between tab stops is to be filled. | "none", "dot", "hyphen", "underscore", "heavy", "middle-dot" |
pos | Position of the tab stop from the left edge, in twips. | Integer |
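As a sketch of how these attributes are used, the following paragraph defines a left tab stop at 1 inch (1440 twips) and a right tab stop at 6 inches (8640 twips) with a dot leader, as in a table of contents line. The positions and text are assumptions chosen for the example, and the attribute names are written in lowercase because XML names are case-sensitive.

```xml
<w:p xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:pPr>
    <!-- tab stop definitions for this paragraph -->
    <w:tabs>
      <w:tab w:val="left" w:pos="1440"/>
      <w:tab w:val="right" w:leader="dot" w:pos="8640"/>
    </w:tabs>
  </w:pPr>
  <w:r>
    <w:t>Chapter 1</w:t>
  </w:r>
  <w:r>
    <!-- inside a run, an empty tab element is a tab character -->
    <w:tab/>
    <w:t>12</w:t>
  </w:r>
</w:p>
```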
Table 8. WordprocessingML Auxiliary Elements and Attributes
Name | Description |
---|---|
Elements | |
sect | An arbitrary section of a WordprocessingML document. The sect element encloses all text between inserted section breaks. |
sub-section | A container for all elements at the same heading level. Recursive. |
font | The font being used in the paragraph. |
allowEmptyCollapse | Hint to transforms to allow this paragraph to autocollapse if empty (for use with HTML). |
font | The font that was actively applied by Word for display. |
sym | Hint to transforms that this run resolves to a single symbol, described herein. |
bgcolor | The background color applied at this point. |
bdrwidth | The HTML equivalent of the border width, in points. This element takes into account different internal border styles and represents the appropriate final presentation width. |
hintShdProperty | The HTML equivalent of the background color. This element takes into account various shading settings and represents the appropriate final presentation color. |
t | The text Word displayed for this object. |
uiName | The style name as shown to the user at save time, only exported if different than the name element's value. |
Attributes | |
wTab | Space between start of text at the tab stop and end of previous text. |
tlc | Type of leader to use before text at a tab stop. |
cTlc | Number of dots in the leader used. |
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 9. docPr Child Elements (Document Properties)
Element | Description | Definition |
---|---|---|
validateAgainstSchema | Templates and Add-Ins XML Schema option: Validate document against attached schemas. | "on", "off" |
saveInvalidXML | Templates and Add-Ins XML Schema option: Allow saving as XML even if the XML is not valid. | "on", "off" |
ignoreMixedContent | Templates and Add-Ins XML Schema option: Save and validate ignores all text not in leaf nodes. | "on", "off" |
alwaysShowPlaceholderText | Turns on display of placeholder text for all empty leaf elements. | "on", "off" |
doNotUnderlineInvalidXML | Templates and Add-Ins XML Schema option: Turns off wavy underline of schema violations in document. | "on", "off" |
removeWordSchemaOnSave | XML Save option: Save data only, removing all elements in the WordprocessingML Schema when saving as XML. | "on", "off" |
useXSLTWhenSaving | XML Save option: Apply a custom transform when saving the document as XML. | "on", "off" |
saveThroughXSLT | XML Save option: The custom transform to apply when saving document as XML. | saveThroughXsltElt |
showXMLElements | Turns on display of XML elements in document. | "on", "off" |
alwaysMergeEmptyNamespace | Controls how empty namespace elements that do not belong to a schema are handled. If set to "on", these elements will not be removed. If set to "off", they will be removed. | "on", "off" |
hdrShapeDefaults | Wrapper for the shape defaults of the headers and footers. | shapeDefaultsElt |
footnotePr | Document-wide footnote properties, including footnote separators. | ftnDocPropsElt |
endnotePr | Document-wide endnote properties, including endnote separators. | ednDocPropsElt |
compat | Container for compatibility options (that is, the user preferences entered on the Compatibility tab of the Options dialog in Word). | compatElt |
docVars | Container for document variables from documents created in Word version 6.0/95 or earlier. | docVarsElt |
drawingGridHorizontalSpacing | Drawing Grid option: The amount of horizontal space between vertical gridlines. | twipsMeasureProperty |
drawingGridVerticalSpacing | Drawing Grid option: The amount of vertical space between horizontal gridlines. | twipsMeasureProperty |
displayHorizontalDrawingGridEvery | Drawing Grid option: The amount of space between horizontal gridlines drawn on the screen. | decimalNumberProperty |
displayVerticalDrawingGridEvery | Drawing Grid option: The amount of space between vertical gridlines drawn on the screen. | decimalNumberProperty |
useMarginsForDrawingGridOrigin | Drawing Grid option: If set to "on" overrides the settings for drawingGridHorizontalOrigin and drawingGridVerticalOrigin and sets the upper-left corner of the document area within the margins as the grid origin. | "on", "off" |
drawingGridHorizontalOrigin | Drawing Grid option: The point at the left edge of the page where you want the invisible grid to begin. This setting is ignored when useMarginsForDrawingGridOrigin is set to "on". | twipsMeasureProperty |
drawingGridVerticalOrigin | Drawing Grid option: The point at the top edge of the page where you want the invisible grid to begin. This setting is ignored when useMarginsForDrawingGridOrigin is set to "on". | twipsMeasureProperty |
doNotShadeFormData | Specifies whether to turn off the gray shading on form fields. | "on", "off" |
printTwoOnOne | Page Setup Margins option: For multiple page documents, prints two pages per sheet. | "on", "off" |
punctuationKerning | Asian Typography option: When kerning for Latin text is turned on, also kern punctuation text. | "on", "off" |
characterSpacingControl | Asian Typography option: Sets the blank-space compression option you want for Asian characters. The equivalent in HTML is setting text-justify-trim on the BODY element. | characterSpacingProperty |
strictFirstAndLastChars | Asian Typography option: Use standard characters to start and end lines of text. | "on", "off" |
noLineBreaksAfter | Asian Typography option: Specifies which characters are restricted from ending a line. | kinsokuProperty |
noLineBreaksBefore | Asian Typography option: Specifies which characters are restricted from starting a line. | kinsokuProperty |
webPageEncoding | Web option: The encoding you want to use when you save as a Web page. | stringProperty |
optimizeForBrowser | Web option: Specifies whether to disable features not supported by Web browsers. | "on", "off" |
relyOnVML | Web option: Rely on Vector Markup Language (VML) for displaying graphics in browsers. | "on", "off" |
allowPNG | Web option: Allow Portable Network Graphics (PNG) as a graphic format. | "on", "off" |
doNotRelyOnCSS | Web option: Turns off cascading style sheets (CSS) for font formatting of Web pages. | "on", "off" |
doNotSaveWebPagesAsSingleFile | Web option: When saving this file as a Web page, does not save as a single-file Web page (MHTML). | "on", "off" |
doNotOrganizeInFolder | When saving as a Web page, causes all supporting files such as bullets, background textures, and graphics to be stored in the same folder as the Web page. | "on", "off" |
doNotUseLongFileNames | Web option: Disables long file names of Web pages, forcing a filename of no more than eight characters. | "on", "off" |
pixelsPerInch | Web option: The number of pixels per inch that you want for the display of pictures in Web pages. The size that you select affects the size of graphics relative to the size of text on the screen. | decimalNumberProperty |
targetScreenSz | Web option: The monitor resolution (screen size) that you are optimizing your Web pages for. The screen size that you specify can affect the size and layout of images on Web pages. | targetScreenSzElt |
savePreviewPicture | Document Properties Summary option: Saves a picture of the first page of the file for previewing. (This option has no effect when the document is saved as XML.) | "on", "off" |
alignBordersAndEdges | Page Border option: Aligns paragraph borders and tables with the page border throughout the document. Setting this element to "on" eliminates any gaps between adjoining borders. However, Word aligns, or "snaps," text to the edge of a table only if the text is one character width (10.5 points) or less from the page border. | "on", "off" |
bordersDontSurroundHeader | Page Border option: Causes the page border to exclude the header. | "on", "off" |
bordersDontSurroundFooter | Page Border option: Causes the page border to exclude the footer. | "on", "off" |
gutterAtTop | Page Setup Margins option: Positions the gutter at the top of a document. If you have set up your document with facing pages or two pages per sheet (by selecting the Mirror margins, Book fold, or 2 pages per sheet setting for the Multiple Pages list in the Page Setup dialog box), gutterAtTop is ignored. | "on", "off" |
hideSpellingErrors | Spelling and Grammar option: Hides the wavy red line under possible spelling errors in your document. | "on", "off" |
hideGrammaticalErrors | Spelling and Grammar option: Hides the wavy green line under possible grammatical errors in your document. | "on", "off" |
activeWritingStyle | Spelling and Grammar option: The writing style you want Word to use when checking grammar in this document. | writingStyleProperty |
proofState | The state of the proofing tools in this document: "clean" (no errors found) or "dirty" (errors present in the document). | proofProperty |
formsDesign | Specifies whether the document is in forms design mode. In this mode, you can edit or create a form by using the ActiveX® controls in the Control Toolbox toolbar. | "on", "off" |
attachedTemplate | Templates and Add-Ins option: The template that's attached to this document. | stringProperty |
linkStyles | Templates and Add-Ins option: Updates the styles in this document to match the styles in the attached template each time you open the document. This ensures that your document contains up-to-date style formatting. | "on", "off" |
stylePaneFormatFilter | Bitmask controlling the display of styles in the Styles and Formatting task pane. | shortHexNumberProperty |
documentType | Document type used by the AutoFormat feature. | docTypeProperty |
mailMerge | Container for elements holding mail merge information for this document. | mailMergeElt |
revisionView | Determines how document revisions are viewed. | trackChangesViewElt |
trackRevisions | Marks changes in the current document and keeps track of each change by reviewer name. | "on", "off" |
documentProtection | Protect Document option: Helps prevent unintentional changes to all or part of an online form or document, as specified. | docProtectProperty |
autoFormatOverride | Protect Document option: Allows the AutoFormat feature to override formatting restrictions. | "on", "off" |
defaultTabStop | Format Tabs option: The default spacing between tab stops. | twipsMeasureProperty |
autoHyphenation | Language Hyphenation option: Automatically hyphenates the document as you type. | "on", "off" |
consecutiveHyphenLimit | Language Hyphenation option: The maximum number of consecutive lines of text that can end with a hyphen. | decimalNumberProperty |
hyphenationZone | Language Hyphenation option: The distance from the right margin within which you want to hyphenate your document. Word hyphenates words that fall into the hyphenation zone. A smaller zone reduces the raggedness of the right margin, but more words may require hyphens. A larger zone increases the raggedness of the right margin, but fewer words may require hyphens. | decimalNumberProperty |
doNotHyphenateCaps | Language Hyphenation option: Causes Word to not hyphenate words written in all capital letters. | "on", "off" |
showEnvelope | Displays the Microsoft Office Outlook® e-mail header in a document. | "on", "off" |
summaryLength | AutoSummary option: Size for automatic document summary. | decimalNumberProperty |
clickAndTypeStyle | Edit option: Style to be used when automatically formatting paragraphs as a result of double-clicking any open area in the document. | docPrStyleProperty |
defaultTableStyle | Table AutoFormat option: Default table style for new documents. | docPrStyleProperty |
evenAndOddHeaders | Page Setup Layout option: Creates one header or footer for even-numbered pages and a different header or footer for odd-numbered pages. | "on", "off" |
bookFoldRevPrinting | Page Setup Margin option: For multiple-page documents, specifies whether to print the document as a reverse book fold. | "on", "off" |
bookFoldPrinting | Page Setup Margin option: For multiple-page documents, specifies whether to print the document as a book fold. | "on", "off" |
bookFoldPrintingSheets | Page Setup Margin option: For multiple-page documents with book fold and reverse book fold printing, sets the number of sheets per booklet. | decimalNumberProperty |
view | Controls the view mode in Word. | "none", "print", "outline", "master-pages", "normal", "web" |
zoom | Controls how large or small the document appears on the screen in Word. | "none", "full-page", "best-fit", "text-fit" |
removePersonalInformation | If set to "on", helps avoid unintentionally distributing hidden information, such as the document's author or the names associated with comments or tracked changes. | "on", "off" |
dontDisplayPageBoundaries | View option: Turns off display of the space between the top of the text and the top edge of the page. | "on", "off" |
displayBackgroundShape | View option: Controls display of the background shape in print layout view. | "on", "off" |
printPostScriptOverText | Print option: Allows PostScript code in PRINT fields in a document to print on top of the document text instead of underneath it. This element's setting has no effect if a document does not contain PRINT fields. | "on", "off" |
printFractionalCharacterWidth | Print option: Word for the Macintosh setting that has no effect in other versions of Word. | "on", "off" |
printFormsData | Print option: Prints the data entered into an online form without printing the online form. | "on", "off" |
embedTrueTypeFonts | Save option: Stores the TrueType fonts used to create this document along with the document. Others who open the document will be able to view and print it with the fonts used to create it, even if those fonts aren't installed on their computer. (NOTE: TrueType fonts are not embedded in XML files.) | "on", "off" |
doNotEmbedSystemFonts | Save option: For the TrueType fonts in your document, does not embed fonts that are likely to already be installed on a computer. This option takes effect only when the Embed TrueType fonts option is on. | "on", "off" |
saveSubsetFonts | Save option: For the TrueType fonts in your document, embeds only the font styles you actually used in the document, which may decrease the file size of your document. If you used 32 or fewer characters of a font, Word embeds only those characters. This option takes effect only when the Embed TrueType fonts option is on. | "on", "off" |
saveFormsData | Saves the data entered in an online form as a single, tab-delimited record so you can use it in a database. Word saves the file in Text Only file format. | "on", "off" |
mirrorMargins | Page Setup Margins option: For multiple page documents, swaps left and right margins on facing pages. | "on", "off" |
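The document properties above are written as children of a single docPr element. The following is a minimal sketch with assumed values: print layout view, default tab stops every half inch (720 twips), and schema validation turned on without permission to save invalid XML. The order of the children here follows what Word typically emits rather than the order of the table rows, which is an assumption; the schema defines the required order.

```xml
<w:docPr xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:view w:val="print"/>
  <!-- default tab stop spacing in twips -->
  <w:defaultTabStop w:val="720"/>
  <w:validateAgainstSchema w:val="on"/>
  <w:saveInvalidXML w:val="off"/>
</w:docPr>
```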
Table 10. Table-Related Elements
Element | Description |
---|---|
tbl | Identifies a table. |
tblPr | Container for table properties (see Table 11). |
tblGrid | Container for column definitions (gridCol element). |
gridCol | Defines a column's width in twips; the table will have one column for each gridCol element. |
tr | A row in the table. |
trPr | Container for properties for a row in the table (see Table 13). |
tc | A cell within a row. |
tcPr | Container for properties for a cell (see Table 14). |
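The following sketch shows how the elements in Table 10 nest to form a table of one row and two columns. The widths and cell text are assumed values, and the w and type attributes on tblW and tcW follow typical Word output rather than anything stated in the tables, so treat them as assumptions and check the schema. The namespace declaration is included only so that the fragment is well-formed on its own.

```xml
<w:tbl xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:tblPr>
    <w:tblW w:w="0" w:type="auto"/>
  </w:tblPr>
  <!-- one gridCol per column; widths are in twips -->
  <w:tblGrid>
    <w:gridCol w:w="4320"/>
    <w:gridCol w:w="4320"/>
  </w:tblGrid>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="4320" w:type="dxa"/>
      </w:tcPr>
      <w:p><w:r><w:t>First cell</w:t></w:r></w:p>
    </w:tc>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="4320" w:type="dxa"/>
      </w:tcPr>
      <w:p><w:r><w:t>Second cell</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
</w:tbl>
```

Each cell contains at least one paragraph, which in turn holds the runs and text, exactly as in the document body.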
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 11. tblPr Child Elements (Table Properties)
Element | Description | Definition |
---|---|---|
tblStyle | The style applied to this table. | Name of a table style. |
tblW | Preferred width of the table. | Integer |
jc | Table alignment. | "left", "center", "right", "both", "medium-kashida", "distribute", "list-tab", "high-kashida", "low-kashida", "thai-distribute" |
tblCellSpacing | HTML cellspacing attribute for the table (the spacing between individual cells). | Integer |
tblInd | Width that the table should be indented by. | Integer |
tblBorders | The border definitions for the table. | tblBordersElt |
shd | Table shading; applies to the "cellspacing" gaps. | shdValues |
tblLayout | Specifies whether the table is of fixed width. If not specified, the contents of the table will be taken into account during layout. | "Fixed" |
tblOverlap | Should this table avoid overlapping another table during layout? If this element is not specified, floating tables will be allowed to overlap. | "Never" |
tblLook | What aspects of the table styles should be included? | Bitmask. 0x0020 (Apply header row formatting) 0x0040 (Apply last row formatting) 0x0080 (Apply header column formatting) 0x0100 (Apply last column formatting) |
tblpPr | Table-positioning properties (for floating tables). | tblpPrElt (see Table 12) |
tblCellMar | Cell margin defaults for this table's cells. | tblCellMarElt |
tblRtl | Is this a right-to-left table? (Logical right-to-left, not visual right-to-left.) This element is used only to persist settings from Word 9.0/2000 and is not recommended for use. Use bidiVisual instead. | "on", "off" |
bidiVisual | Is this a visually right-to-left table? (Visual right-to-left, not logical right-to-left.) | "on", "off" |
tblStyleRowBandSize | When a style specifies the format for a band (a contiguous set) of rows in a table, this element specifies the number of rows in a band. | Integer |
tblStyleColBandSize | When a style specifies the format for a band (a contiguous set) of columns in a table, this element specifies the number of columns in a band. | Integer |
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 12. tblpPr Child Elements (Table Positioning Properties)
Element | Description | Definition |
---|---|---|
leftFromText | Distance between the left table border and the surrounding text (for wrapping tables). | Integer |
rightFromText | Distance between the right table border and the surrounding text (for wrapping tables). | Integer |
topFromText | Distance between the top table border and the surrounding text (for wrapping tables). | Integer |
bottomFromText | Distance between bottom table border and the surrounding text (for wrapping tables). | Integer |
vertAnchor | Defines how this table is vertically anchored. | "text", "margin", "page" |
horzAnchor | Defines how this table is horizontally anchored. | "text", "margin", "page" |
tblpXSpec | Horizontal alignment (for example, center, left, or right); overrides position set by other formatting options (for example, page layout settings). | "left", "center", "right", "inside", "outside" |
tblpX | Horizontal distance from anchor. | Integer |
tblpYSpec | Vertical alignment (for example, top or bottom); overrides position set by other formatting options (for example, page layout settings). | "inline", "top", "center", "bottom", "inside", "outside" |
tblpY | Vertical distance from anchor. | Integer |
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 13. trPr Child Elements (Table Row Properties)
Element | Description | Definition |
---|---|---|
divId | Defines what HTML DIV element this row belongs within. | Integer |
gridBefore | Number of grid units consumed before the first cell; assumed to be zero. | Integer |
gridAfter | Number of grid units consumed after the last cell; assumed to be zero. | Integer |
wBefore | Preferred width before the table row. | Integer |
wAfter | Preferred width after the table row. | Integer |
cantSplit | If specified, a page cannot split this row. | "on", "off" |
trHeight | The height of this row. | val attribute: Height in twips; h-rule attribute: "exact", "at-least" |
tblHeader | If specified, this row belongs to the collection of "header" rows (which will repeat at the top of every page and will get any special header row formatting from the table style). This property is ignored unless the row is contiguously connected with the first row of the table (that is, unless it is the first row itself, or all of the rows between this row and the first row are marked as header rows). | "on", "off" |
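As an illustration of the row properties above, the following sketch marks the first row of a table as a repeating header row that cannot be split across pages and is at least 480 twips (24 points) high. The widths, height, and text are assumed values, and the attributes on tcW follow typical Word output.

```xml
<w:tbl xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:tblGrid>
    <w:gridCol w:w="8640"/>
  </w:tblGrid>
  <!-- first row: header row, so tblHeader takes effect -->
  <w:tr>
    <w:trPr>
      <w:cantSplit w:val="on"/>
      <w:trHeight w:val="480" w:h-rule="at-least"/>
      <w:tblHeader w:val="on"/>
    </w:trPr>
    <w:tc>
      <w:tcPr><w:tcW w:w="8640" w:type="dxa"/></w:tcPr>
      <w:p><w:r><w:t>Column heading</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr><w:tcW w:w="8640" w:type="dxa"/></w:tcPr>
      <w:p><w:r><w:t>Body row</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
</w:tbl>
```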
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 14. tcPr Child Elements (Table Cell Properties)
Element | Description | Definition |
---|---|---|
tcW | Preferred width for this cell. | Integer |
gridSpan | Number of grid units this cell consumes; assumed to be one if omitted. | Integer |
vmerge | Is this cell part of (or the beginning of) a vertically merged region? | "continue", "restart" |
hmerge | Is this cell part of (or the beginning of) a horizontally merged region? | "continue", "restart" |
tcBorders | Defines the borders for this cell. Overrides the definitions given by the table borders. | tcBordersElt |
shd | Underlying shading for this cell. | shdValues |
noWrap | If present, specifies that the contents of this cell should never wrap. | "on", "off" |
tcMar | Margins for this cell (maps to CSS padding property). Overrides any definitions given in table properties. | tcMarElt |
textFlow | Defines the text flow for this cell. | "lr-tb": left to right, top to bottom; "tb-rl": top to bottom, right to left; "bt-lr": bottom to top, left to right; "lr-tb-v": left to right, top to bottom, rotated; "tb-rl-v": top to bottom, right to left, rotated |
tcFitText | Causes text to be sized to fit in cell. | "on", "off" |
vAlign | Vertical alignment. | "top", "center", "both", "bottom" |
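To make the spanning and merging properties in Table 14 more concrete, here is a minimal sketch of a two-column table in which the first row's cell spans both columns and the first column of the next two rows is merged vertically. It assumes that the merge element is written in lowercase as w:vmerge; the cell text is illustrative, and the table-level properties are left out to keep the fragment short.

```xml
<w:tbl xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <!-- w:tblPr and w:tblGrid omitted for brevity -->
  <w:tr>
    <w:tc>
      <w:tcPr>
        <!-- One cell consuming two grid units -->
        <w:gridSpan w:val="2"/>
        <w:vAlign w:val="center"/>
      </w:tcPr>
      <w:p><w:r><w:t>Spans both columns</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <!-- Begins a vertically merged region -->
        <w:vmerge w:val="restart"/>
      </w:tcPr>
      <w:p><w:r><w:t>Merged cell</w:t></w:r></w:p>
    </w:tc>
    <w:tc>
      <w:p><w:r><w:t>Row 2</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc>
      <w:tcPr>
        <!-- Continues the merged region started in the row above -->
        <w:vmerge w:val="continue"/>
      </w:tcPr>
      <w:p/>
    </w:tc>
    <w:tc>
      <w:p><w:r><w:t>Row 3</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
</w:tbl>
```

Each cell still contains a paragraph, even the continuation cell, whose content is not expected to be displayed separately.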
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 15. listDef Child Elements (List Definitions)
Element | Description | Definition |
---|---|---|
lsid | List id. | hexNumberProperty |
plt | Description of the type of list. | "SingleLevel", "MultiLevel", "HybridMultiLevel" |
tmpl | List template for formatting the list. | hexNumberProperty |
name | Name of the list definition. | String |
styleLink | Name of the list style defined in the styles element. | String |
listStyleLink | Name of the list style that the list is referencing. | String |
lvl | Container for level properties. An lvl element is required for each level in the list. Includes pPr, tabs, and rPr elements. | See Table 16. |
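The listDef children in Table 15 fit together roughly as in the following sketch of a single-level list definition. Several details are assumptions rather than facts taken from the table: the w:lists container, the w:listDefId and w:ilvl attributes, the hexadecimal values (which are purely illustrative), and the use of %1 in lvlText as a placeholder for the level's current number.

```xml
<w:lists xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <!-- Assumption: w:listDefId identifies this definition -->
  <w:listDef w:listDefId="0">
    <w:lsid w:val="1A2B3C4D"/>    <!-- illustrative hex value -->
    <w:plt w:val="SingleLevel"/>
    <w:tmpl w:val="5E6F7A8B"/>    <!-- illustrative hex value -->
    <!-- A single-level list needs exactly one lvl element -->
    <w:lvl w:ilvl="0">
      <w:start w:val="1"/>
      <w:lvlText w:val="%1."/>
    </w:lvl>
  </w:listDef>
</w:lists>
```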
Note Some, but not all, of these elements are empty elements with only a val attribute. Where the list of acceptable values for the val attribute is short, those values are given in the Definition column in the following table. However, some elements may take additional attributes besides the val attribute, so you should always consult the WordprocessingML Schema for a full understanding of each element. In some cases, additional notes regarding the element, including its type definition, are also listed in the Definition column.
Table 16. lvl Child Elements (List-Level Definitions)
Element | Description | Definition |
---|---|---|
start | First number in numbering series for the list. | Integer |
nfc | Specifies the number style used for a list. | Integer (see Table 17) |
lvlRestart | When present, causes numbering to be restarted at 1. | Integer |
pStyle | Name of a style as defined in the styles element. | String |
isLgl | Is this level following "legal numbering" rules? | "on", "off" |
lvlText | Text to use as a basis for the number of an item in the list. | String |
suff | Text to follow the number of the item in the list. | "Tab", "Space", "Nothing" |
lvlPicBulletId | The number of the built-in graphic to be used as the bullet for an item in the list. | Integer |
legacy | List level is from Word 6.0/95 or earlier. | lvlLegacyElt |
lvlJc | Justification of the actual number. | "left", "center", "right", "both" |
pPr | The p element properties. | pPrElt (see Table 3) |
rPr | The r element properties. | rPrElt (see Table 2) |
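As a sketch of how the lvl children in Table 16 combine, the following level is intended to start numbering at 3, use uppercase roman numerals, right-align the number, indent the paragraph, and format the number in bold. The w:ilvl attribute, the %1 placeholder in lvlText, and the indentation values are assumptions made for illustration; the namespace declaration is repeated on the fragment's root only so that it parses on its own.

```xml
<w:lvl w:ilvl="0"
       xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:start w:val="3"/>            <!-- first number in the series -->
  <w:nfc w:val="1"/>              <!-- 1 = uppercase roman (Table 17) -->
  <w:lvlText w:val="%1)"/>        <!-- number followed by a parenthesis -->
  <w:lvlJc w:val="right"/>        <!-- justification of the number itself -->
  <w:pPr>
    <!-- Hanging indent, in twips, for paragraphs at this level -->
    <w:ind w:left="720" w:hanging="360"/>
  </w:pPr>
  <w:rPr>
    <w:b/>                        <!-- bold formatting for the number -->
  </w:rPr>
</w:lvl>
```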
Table 17. nfc Element Integer Values
Internal name for nfc code | Integer value | Description |
---|---|---|
nfcArabic | 0 | Arabic: 1, 2, 3, 4, ... |
nfcUCRoman | 1 | Uppercase roman: I, II, III, IV, ... |
nfcLCRoman | 2 | Lowercase roman: i, ii, iii, iv, ... |
nfcUCLetter | 3 | Uppercase alpha: A, B, C, D, ... |
nfcLCLetter | 4 | Lowercase alpha: a, b, c, d, ... |
nfcOrdinal | 5 | Ordinal: 1st, 2nd, 3rd |
nfcCardtext | 6 | Cardinal: One, Two, Three |
nfcOrdtext | 7 | Ordinal Text: First, Second, Third |
nfcHex | 8 | Hexadecimal: 8, 9, A, B, C, D, E, F, 10, 11, 12 |
nfcChiManSty | 9 | Chicago Manual of Style: *, †, ‡ |
nfcDbNum1 | 10 | Ideograph-digital |
nfcDbNum2 | 11 | Japanese counting |
nfcAiueo | 12 | Aiueo |
nfcIroha | 13 | Iroha |
nfcDbChar | 14 | Full-width Arabic: 1, 2, 3, 4 |
nfcSbChar | 15 | Half-width Arabic: 1, 2, 3, 4 |
nfcDbNum3 | 16 | Japanese legal |
nfcDbNum4 | 17 | Japanese digital ten thousand |
nfcCirclenum | 18 | Enclosed circles |
nfcDArabic | 19 | Decimal full width2: 1, 2, 3, 4 |
nfcDAiueo | 20 | Aiueo full width |
nfcDIroha | 21 | Iroha full width |
nfcArabicLZ | 22 | Leading zero: 01, 02, ..., 09, 10, 11, ... |
nfcBullet | 23 | Bullet character |
nfcGanada | 24 | Korean Ganada |
nfcChosung | 25 | Korean Chosung |
nfcGB1 | 26 | Enclosed full stop |
nfcGB2 | 27 | Enclosed parenthesis |
nfcGB3 | 28 | Enclosed circle Chinese |
nfcGB4 | 29 | Ideograph enclosed circle |
nfcZodiac1 | 30 | Ideograph traditional |
nfcZodiac2 | 31 | Ideograph Zodiac |
nfcZodiac3 | 32 | Ideograph Zodiac traditional |
nfcTpeDbNum1 | 33 | Taiwanese counting |
nfcTpeDbNum2 | 34 | Ideograph legal traditional |
nfcTpeDbNum3 | 35 | Taiwanese counting thousand |
nfcTpeDbNum4 | 36 | Taiwanese digital |
nfcChnDbNum1 | 37 | Chinese counting |
nfcChnDbNum2 | 38 | Chinese legal simplified |
nfcChnDbNum3 | 39 | Chinese counting thousand |
nfcChnDbNum4 | 40 | Chinese (not implemented) |
nfcKorDbNum1 | 41 | Korean digital |
nfcKorDbNum2 | 42 | Korean counting |
nfcKorDbNum3 | 43 | Korean legal |
nfcKorDbNum4 | 44 | Korean digital 2 |
nfcHebrew1 | 45 | Hebrew-1 |
nfcArabic1 | 46 | Arabic alpha |
nfcHebrew2 | 47 | Hebrew-2 |
nfcArabic2 | 48 | Arabic abjad |
nfcHindi1 | 49 | Hindi vowels |
nfcHindi2 | 50 | Hindi consonants |
nfcHindi3 | 51 | Hindi numbers |
nfcHindi4 | 52 | Hindi descriptive (cardinals) |
nfcThai1 | 53 | Thai letters |
nfcThai2 | 54 | Thai numbers |
nfcThai3 | 55 | Thai descriptive (cardinals) |
nfcViet1 | 56 | Vietnamese descriptive (cardinals) |
nfcNumInDash | 57 | Page number format: - 1 -, - 2 -, - 3 -, - 4 - |
nfcLCRus | 58 | Lowercase Russian alphabet |
nfcUCRus | 59 | Uppercase Russian alphabet |
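The integer codes in Table 17 are what appear in the val attribute of the nfc element. The following sketch of a three-level outline uses 0 (Arabic), 4 (lowercase alpha), and 2 (lowercase roman) for successive levels. The w:listDefId and w:ilvl attributes, the lsid value, and the %1, %2, and %3 placeholders in lvlText are assumptions made for illustration.

```xml
<w:listDef w:listDefId="1"
           xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
  <w:lsid w:val="2B3C4D5E"/>      <!-- illustrative hex value -->
  <w:plt w:val="MultiLevel"/>
  <w:lvl w:ilvl="0">
    <w:start w:val="1"/>
    <w:nfc w:val="0"/>            <!-- nfcArabic: 1, 2, 3 -->
    <w:lvlText w:val="%1."/>
  </w:lvl>
  <w:lvl w:ilvl="1">
    <w:start w:val="1"/>
    <w:nfc w:val="4"/>            <!-- nfcLCLetter: a, b, c -->
    <w:lvlText w:val="%2."/>
  </w:lvl>
  <w:lvl w:ilvl="2">
    <w:start w:val="1"/>
    <w:nfc w:val="2"/>            <!-- nfcLCRoman: i, ii, iii -->
    <w:lvlText w:val="%3."/>
  </w:lvl>
</w:listDef>
```

Using the numeric code rather than the internal name matters here: names such as nfcLCRoman are documentation labels, and only the integer is written to the document.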