Several groups of elements are involved in working with the text of a shape.
The following table gives a brief description of these elements and their relationships.
Element | Description |
---|---|
<Shape> | Opening tag for the Shape element. |
<Text> | Opening tag for the shape's text. |
<cp/> | Marks character runs. |
<pp/> | Marks paragraph runs. |
<tp/> | Marks tab runs. |
<fld/> | Marks Field position. |
</Text> | Closing tag for the shape's text. |
<Char> | Contains character properties. |
<Para> | Contains paragraph properties. |
<Tabs> | Contains tab properties. |
<Field> | Contains the shape's text field properties. |
</Shape> | Closing tag for the Shape element. |
For example, consider the following shape. It has text that consists of two character runs, one in bold and the other in italic.
The portion of the shape's sheet that describes character text runs is called the Character section. The Character rows for this shape as viewed in the ShapeSheet window are as follows:
Note that the column on the left indicates the length of each run, 4 characters and 7 characters, respectively. The only cells that vary between the two rows are the Style cell values. The Style cell determines if the run is bold, italic, or another style. The XML code for these rows follows:
<Char IX='0'>
<Font>4</Font>
<Color>0</Color>
<Style>17</Style>
<Case>0</Case>
<Pos>0</Pos>
<FontScale>1</FontScale>
<Size Unit='PT'>0.1666666666666667</Size>
<DblUnderline>0</DblUnderline>
<Overline>0</Overline>
<Strikethru>0</Strikethru>
<Highlight>0</Highlight>
<DoubleStrikethrough>0</DoubleStrikethrough>
<RTLText>0</RTLText>
<UseVertical>0</UseVertical>
<Letterspace>0</Letterspace>
<ColorTrans>0</ColorTrans>
<AsianFont>0</AsianFont>
<ComplexScriptFont>0</ComplexScriptFont>
<LocalizeFont>0</LocalizeFont>
<ComplexScriptSize>-1</ComplexScriptSize>
<LangID>1033</LangID>
</Char>
<Char IX='1'>
<Font>4</Font>
<Color>0</Color>
<Style>34</Style>
<Case>0</Case>
<Pos>0</Pos>
<FontScale>1</FontScale>
<Size Unit='PT'>0.1666666666666667</Size>
<DblUnderline>0</DblUnderline>
<Overline>0</Overline>
<Strikethru>0</Strikethru>
<Highlight>0</Highlight>
<DoubleStrikethrough>0</DoubleStrikethrough>
<RTLText>0</RTLText>
<UseVertical>0</UseVertical>
<Letterspace>0</Letterspace>
<ColorTrans>0</ColorTrans>
<AsianFont>0</AsianFont>
<ComplexScriptFont>0</ComplexScriptFont>
<LocalizeFont>0</LocalizeFont>
<ComplexScriptSize>-1</ComplexScriptSize>
<LangID>1033</LangID>
</Char>
Note: Although the size of the text is 12 points, Visio stores the Size value internally in inches (1 point = 1/72 inch), so a point size of 12 is written as 0.1666666666666667.
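The conversion described in the note can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and it assumes the usual 72 points per inch.

```python
POINTS_PER_INCH = 72  # assumption: standard typographic point


def points_to_inches(points: float) -> float:
    """Convert a font size in points to the inch value stored in <Size>."""
    return points / POINTS_PER_INCH


def inches_to_points(inches: float) -> float:
    """Convert a stored <Size> value (inches) back to points."""
    return inches * POINTS_PER_INCH


# The Size cell from the sample Char rows, read back as points.
size_cell = 0.1666666666666667
print(round(inches_to_points(size_cell), 6))
```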
The two Char elements represent the two Character rows. Each has an IX attribute describing the relative order of the rows. Each Char element contains child elements that correspond to the cells viewed in the ShapeSheet window.
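As a rough illustration of reading those child elements, the following Python sketch collects the cells of each Char row and decodes the Style cell. It makes several assumptions: the sample is trimmed to the Style cell and carries no XML namespace (real files may declare one), the function names are hypothetical, and only the low-order Style bits (1 = bold, 2 = italic, 4 = underline, 8 = small caps) are interpreted. Higher bits, such as those present in 17 and 34, are ignored here because their meaning is not described in this section.

```python
import xml.etree.ElementTree as ET

# Low-order bits of the Style cell; higher bits are deliberately ignored.
STYLE_FLAGS = {1: "bold", 2: "italic", 4: "underline", 8: "small caps"}


def char_rows(shape_xml: str) -> dict[int, dict[str, str]]:
    """Map each Char row index (IX) to a dict of its cell names and values."""
    root = ET.fromstring(shape_xml)
    rows = {}
    for char in root.iter("Char"):
        cells = {cell.tag: (cell.text or "").strip() for cell in char}
        rows[int(char.get("IX"))] = cells
    return rows


def style_names(style: int) -> list[str]:
    """Return the names of the low-order style flags set in a Style value."""
    return [name for bit, name in STYLE_FLAGS.items() if style & bit]


# Trimmed-down version of the two Char rows shown above.
sample = """<Shape>
<Char IX='0'><Style>17</Style></Char>
<Char IX='1'><Style>34</Style></Char>
</Shape>"""

for ix, cells in sorted(char_rows(sample).items()):
    print(ix, style_names(int(cells["Style"])))
```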
The Shape element also contains an element called Text, which contains the characters of the text and special elements (cp, pp, tp, and fld) that mark the end of one run and the beginning of the next. The Text element is somewhat unusual because it contains both data (the text characters) and child elements (cp, pp, and so on).
The Text element that describes the text on the preceding shape follows:
<Text><cp IX='0'/>Bold <cp IX='1'/>Italic</Text>
The <cp IX='0'/> tag indicates that the character properties from the first Char element (<Char IX='0'>) are to be applied to the text that follows. The <cp IX='1'/> tag indicates that the preceding character run (<Char IX='0'>) has ended and the run attributed to <Char IX='1'> has begun.
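Because the Text element mixes character data with marker elements, a reader has to walk both. The following Python sketch splits a Text element into character runs. The function name is hypothetical, the sample has no XML namespace, and the sketch assumes that text appearing before any cp marker belongs to row 0.

```python
import xml.etree.ElementTree as ET


def character_runs(text_xml: str) -> list[tuple[int, str]]:
    """Split a Text element into (Char row index, text) runs."""
    text = ET.fromstring(text_xml)
    runs: list[tuple[int, str]] = []
    current_ix = 0  # assumption: text before any cp marker uses row 0

    def add(segment: str) -> None:
        # Merge with the previous run when the Char row has not changed.
        if runs and runs[-1][0] == current_ix:
            runs[-1] = (current_ix, runs[-1][1] + segment)
        else:
            runs.append((current_ix, segment))

    if text.text:
        add(text.text)
    for child in text:
        if child.tag == "cp":
            current_ix = int(child.get("IX"))
        # Text following a marker is stored in the marker's tail.
        if child.tail:
            add(child.tail)
    return runs


print(character_runs("<Text><cp IX='0'/>Bold <cp IX='1'/>Italic</Text>"))
# [(0, 'Bold '), (1, 'Italic')]
```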
Inheritance for text elements follows the standard inheritance rules. Text row elements (Char, Para, Tabs, and Field) need to contain child elements only for cells whose values differ from the inherited values. For example, the statement <Char IX='0'><Color>4</Color></Char> is sufficient to specify that a Char element should have blue text; the remaining values are inherited automatically.
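One way to picture this is as a dictionary of inherited cell values that a row overrides only where it supplies a child element. The Python sketch below is illustrative only: the function name and the inherited values are hypothetical, and it does not model how Visio resolves inheritance.

```python
import xml.etree.ElementTree as ET


def effective_cells(row_xml: str, inherited: dict[str, str]) -> dict[str, str]:
    """Overlay the cells present in a text row on top of inherited values."""
    row = ET.fromstring(row_xml)
    cells = dict(inherited)  # start from the inherited values
    for cell in row:
        cells[cell.tag] = (cell.text or "").strip()  # local value wins
    return cells


# Hypothetical inherited values, for illustration only.
inherited = {"Font": "4", "Color": "0", "Style": "0"}

print(effective_cells("<Char IX='0'><Color>4</Color></Char>", inherited))
# {'Font': '4', 'Color': '4', 'Style': '0'}
```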
Note Microsoft® Office Visio® writes out all values
The following special cases apply:
When Visio loads text from an XML file that is created or edited outside of Visio (an untrusted file), the first thing it does is normalize the text data, that is, it fills in, adds, or reorders missing elements and removes unused elements.
Following are some of the ways in which Visio normalizes untrusted text data in a VDX file at load time.
Run markers are put in index order: a <cp IX='0'/> tag always precedes a <cp IX='1'/> tag, and so on. For example, consider the following data:
<Char IX='0'> <Style> 0 </Style> </Char>
<Char IX='1'> <Style> 1 </Style> </Char>
<Para IX='0'> <HorzAlign> 0 </HorzAlign> </Para>
<Para IX='1'> <HorzAlign> 1 </HorzAlign> </Para>
<Text><cp IX='0'/><pp IX='0'/>The quick <cp IX='1'/>brown fox
<pp IX='1'/>jumped <cp IX='0'/>over the lazy dog.</Text>
The code describes a text block that contains three character runs and two paragraph property runs. There is a single paragraph break after the word "fox." The block would be rendered like this:
In the preceding example, the Char row 0 is referenced twice (<cp IX='0'/>). Visio normalizes this by creating a third Char row (<Char IX='2'>), which is identical to Char row 0.
When the file is resaved, it contains the third Char row and an additional cp element marking the third character run, which is demonstrated as follows. In addition, the XML file will contain all the child elements (Font, Size, FontScale, and so on) for all the Char and Para elements.
<Char IX='0'>...<Style> 0 </Style>...</Char>
<Char IX='1'>...<Style> 1 </Style>...</Char>
<Char IX='2'>...<Style> 0 </Style>...</Char>
<Para IX='0'>...<HorzAlign> 0 </HorzAlign>...</Para>
<Para IX='1'>...<HorzAlign> 1 </HorzAlign>...</Para>
<Text><cp IX='0'/><pp IX='0'/>The quick <cp IX='1'/>brown fox
<pp IX='1'/>jumped <cp IX='2'/>over the lazy dog.</Text>
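The effect of this renumbering on character runs can be approximated in Python. The sketch below is not Visio's algorithm, which is not specified here. It simply gives every cp marker its own Char row, in document order, by copying the row the marker referred to. The function name is hypothetical, the sample has no XML namespace, and Para rows are left untouched.

```python
import copy
import xml.etree.ElementTree as ET


def normalize_char_runs(shape_xml: str) -> str:
    """Give each cp marker its own sequentially numbered Char row."""
    shape = ET.fromstring(shape_xml)
    chars = shape.findall("Char")
    text = shape.find("Text")
    if not chars or text is None:
        return shape_xml  # nothing to normalize in this sketch
    by_ix = {int(c.get("IX")): c for c in chars}

    new_rows = []
    for cp in text.iter("cp"):
        clone = copy.deepcopy(by_ix[int(cp.get("IX"))])
        new_ix = str(len(new_rows))
        clone.set("IX", new_ix)  # renumber the copied row
        cp.set("IX", new_ix)     # point the marker at the copy
        new_rows.append(clone)

    # Swap the original Char rows for the renumbered ones, in place.
    insert_at = list(shape).index(chars[0])
    for old in chars:
        shape.remove(old)
    for offset, row in enumerate(new_rows):
        shape.insert(insert_at + offset, row)
    return ET.tostring(shape, encoding="unicode")


sample = (
    "<Shape>"
    "<Char IX='0'><Style>0</Style></Char>"
    "<Char IX='1'><Style>1</Style></Char>"
    "<Text><cp IX='0'/><pp IX='0'/>The quick <cp IX='1'/>brown fox\n"
    "<pp IX='1'/>jumped <cp IX='0'/>over the lazy dog.</Text>"
    "</Shape>"
)
print(normalize_char_runs(sample))
```

Run on the sample, the result contains three Char rows (the third a copy of row 0) and cp markers numbered 0, 1, and 2, matching the resaved data shown above.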
Although it is not an error to omit the marker for the first run, it is recommended that you include it. Visio always emits the initial marker element when the DatadiagramML data is round-tripped.
Paragraph run markers (<pp IX='1'/>) and tab run markers (<tp IX='1'/>) are valid only at the beginning of a paragraph. If such a marker is encountered in the middle of a paragraph, it is ignored. A new line is required before each pp and tp element.
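A simple checker can flag markers that would be ignored under this rule. The Python sketch below assumes that a paragraph ends with a newline character in the text data and that the very start of the Text element also counts as the beginning of a paragraph. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def paragraph_markers(text_xml: str) -> list[tuple[str, int, bool]]:
    """Return (tag, IX, honored) for every pp and tp marker in a Text element."""
    text = ET.fromstring(text_xml)
    results = []
    at_paragraph_start = True  # assumption: start of text begins a paragraph
    if text.text:
        at_paragraph_start = text.text.endswith("\n")
    for child in text:
        if child.tag in ("pp", "tp"):
            results.append((child.tag, int(child.get("IX")), at_paragraph_start))
        # Any text after the marker decides whether we are mid-paragraph.
        if child.tail:
            at_paragraph_start = child.tail.endswith("\n")
    return results


sample = (
    "<Text><cp IX='0'/><pp IX='0'/>The quick <cp IX='1'/>brown fox\n"
    "<pp IX='1'/>jumped <pp IX='2'/>over the lazy dog.</Text>"
)
for marker in paragraph_markers(sample):
    print(marker)
```

In this sample the third pp marker follows "jumped " in the middle of a paragraph, so the sketch reports it as not honored.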
Visio normalizes untrusted text data when it is loaded into the application. This means that Visio performs further processing