GSUB — Glyph Substitution Table

The Glyph Substitution (GSUB) table provides data for substition of glyphs for appropriate rendering of scripts, such as cursively-connecting forms in Arabic script, or for advanced typographic effects, such as ligatures.

Overview

The Glyph Substitution table (GSUB) contains information for substituting glyphs to render the scripts and language systems supported in a font. Many language systems require glyph substitutes. For example, in the Arabic script, the glyph shape that depicts a particular character varies according to its position in a word or text string (see figure 1). In other language systems, glyph substitutes are aesthetic options for the user, such as the use of ligature glyphs in the English language (see Figure 2).

gsub_fig3a.gif
Figure 1. Isolated, initial, medial, and final forms of the Arabic character HAH
gsub_fig3b.gif
Figure 2. Two Latin glyphs and their associated ligature

OpenType fonts use character encoding standards, such as the Unicode Standard, that assumes a distinction between characters and glyphs: text is encoded as sequences of characters, and the 'cmap' table provides a mapping from that character to a single default glyph. Multiple characters are not directly mapped to a single glyph, as needed for ligatures; and a single character is not mapped directly to multiple glyphs, as may be needed for some complex-script scenarios. The GSUB table provides a way to describe such substititions, enabling applications to apply such substitions during text layout and rendering to achieve desired results.

To access substitute glyphs, GSUB maps from the glyph index or indices defined in a 'cmap' subtable to the glyph index or indices of the substitute glyphs. For example, if a font has three alternative forms of an ampersand glyph, the 'cmap' table associates the ampersand’s character code with only one of these glyphs. In GSUB, the indices of the other ampersand glyphs are then referenced from this one default index.

The text-processing client uses the GSUB data to manage glyph substitution actions. GSUB identifies the glyphs that are input to and output from each glyph substitution action, specifies how and where the client uses glyph substitutes, and regulates the order of glyph substitution operations. Any number of substitutions can be defined for each script or language system represented in a font.

The GSUB table supports seven types of glyph substitutions that are widely used in international typography:

  • A single substitution replaces a single glyph with another single glyph. This is used to render positional glyph variants in Arabic and vertical text in the Far East (see Figure 3).

    gsub_fig3c.gif
    Figure 3. Alternative forms of parentheses used when positioning Kanji vertically
  • A multiple substitution replaces a single glyph with more than one glyph. This is used to specify actions such as ligature decomposition (see Figure 4).

    gsub_fig3d.gif
    Figure 4. Decomposing a Latin ligature glyph into its individual glyph components
  • An alternate substitution identifies functionally equivalent but different looking forms of a glyph. These glyphs are often referred to as aesthetic alternatives. For example, a font might have five different glyphs for the ampersand symbol, but one would have a default glyph index in the cmap table. The client could use the default glyph or substitute any of the four alternatives (see Figure 5).

    gsub_fig3e.gif
    Figure 5. Alternative ampersand glyphs in a font
  • A ligature substitution replaces several glyph indices with a single glyph index, as when an Arabic ligature glyph replaces a string of separate glyphs (see Figure 6). When a string of glyphs can be replaced with a single ligature glyph, the first glyph is substituted with the ligature. The remaining glyphs in the string are deleted, this does not include those glyphs that are skipped as a result of lookup flags.

    gsub_fig3f.gif
    Figure 6. Three Arabic glyphs and their associated ligature glyph
  • Contextual substitution, the most powerful type, describes glyph substitutions in context-that is, a substitution of one or more glyphs within a certain pattern of glyphs. Each substitution describes one or more input glyph sequences and one or more substitutions to be performed on that sequence. Contextual substitutions can be applied to specific glyph sequences, glyph classes, or sets of glyphs.

  • Chaining contextual substitution, extends the capabilities of contextual substitution. With this, one or more substitutions can be performed on one or more glyphs within a pattern of glyphs (input sequence), by chaining the input sequence to a ‘backtrack’ and/or ‘lookahead’ sequence. Each such substitution can be applied in three formats to handle glyphs, glyph classes or glyph sets in the input sequence. Each of these formats can describe one or more of the backtrack, input and lookahead sequences.

  • Reverse Chaining contextual single substitution, allows one glyph to be substituted with another by chaining input glyph to a ‘backtrack’ and/or ‘lookahead’ sequence. The difference between this and other lookup types is that processing of input glyph sequence goes from end to start.