Tablet PC

Add Support for Digital Ink to Your Windows Applications

Paul Yao

This article discusses:

  • Windows XP Tablet PC Edition
  • Setting up your Tablet PC dev environment
  • Writing code to process ink
This article uses the following technologies:
C#, .NET Framework

Code download available at: TabletPC.exe (129 KB)

Contents

The Ink Input Function
Tablet PC Hardware
Speech Input
Programming with Ink
Ink Classes
Gestures
Ink Data
The Stroke Class and the Strokes Collection
Appearance and Display of Ink
Analyzing Ink
Converting Ink to Text
TicTacToe: An Ink-Enabled Sample
InkEdit Controls
A Full-Screen InkCollector
Nine InkCollectors
Assembly and Namespace References
TicTacToe Initialization and Cleanup
InkCollector Events
What's Next

Two of the most innovative features of the Tablet PC are the pen and microphone. While these devices are fairly novel for most PC users, they are intended to provide new choices for users, not to eliminate traditional input devices. In fact, a user may opt to use voice or pen input with some programs, mouse and keyboard with other programs, or voice when in transit.

By speaking commands a user can, for example, run a program or open or save a file. Speech can be used in other ways as well, most of which are not new, including conversion of spoken word to text and the recording of speech for later playback.

The pen can be used as a pointing device, substituting for a mouse. But the widest use of pen input has been for handwriting recognition. The handwriting recognition engine on the Tablet PC uses a neural network to analyze pen movement and a word list to cross-reference its results. Unlike earlier recognition engines, the Tablet PC recognizers work better with cursive than with block letters. The current generation of handwriting recognizers does have some limitations, however; it does not have the ability to learn from its mistakes.

So, how good is the handwriting recognition? Figure 1 shows examples of my handwriting as I wrote the phrase "Can you read this now?" I am sometimes surprised how well it can convert garbled script into the correct text and at other times I am amazed that simple words are not interpreted the way I expect them to be. The recognizer relies on context, so with its letters all connected, cursive produces the best results. Single characters and block letters produce less consistent results. Without context, the recognition engine is unsure about whether to interpret a single vertical bar as the number 1, a lowercase L, or an uppercase I.

Figure 1 Can You Read This Now?

The recognizer works in the background and its results are displayed as they are produced within the Tablet PC Input Panel. As you can see in Figure 1, some of the words I wrote were not recognized correctly. When that occurs, a user can click on a word and a correction window appears.

The Ink Input Function

Pen input does not have to be converted into text. It can be stored as a set of points in a new data type called Ink. The Ink type can contain either text or non-text information such as a flow chart, an org chart, a street map, or even a combination of text and pictures.

In fact, Windows® Journal, which is an accessory that comes with Windows XP Tablet PC Edition, could be described as an ink-enabled cross between Notepad and Paint. It allows text input as well as freehand drawing, plus a whole lot more.

Windows Journal demonstrates the advantages of electronic ink. Physical ink cannot move to allow more room to add notes in between lines of existing notes; electronic ink can. Inside Windows Journal, handwriting recognition does not occur unless a user requests it; the default is for ink to remain as ink. And this is a core improvement in Tablet PC technology.

Tablet PC Hardware

I worked with four Tablet PCs while writing this article: the Acer TravelMate C100 Convertible Tablet PC, the Compaq Tablet PC TC1000, a ViewSonic TPCV1250, and a Toshiba Portege M200. I found all of them easy to set up and use.

The hardware in a Tablet PC represents a rethinking of the elements needed in a portable computer. To save weight, serial ports, parallel ports, and floppy drives are excluded. Most have two or more USB ports and many have 1394 Firewire ports. Most of them have built-in Wi-Fi (802.11b) support; most also have built-in wired network cards.

A USB mouse and keyboard convert even the lightest slate tablet into a familiar PC. While Tablet PC display hardware is not at all unique, Tablet PC display drivers are unique because of the support for multiple display orientations. For example, on the systems I tested, two landscape and two portrait orientations were supported. Behind the display screen is another unique feature, a high-resolution digitizer.

The digitizer picks up pen movements, but not other physical pressure, so a user can rest a hand on the screen without accidentally creating unwanted input. Even when there is no physical contact between the pen and the screen, a pen waved over a Tablet PC screen causes the mouse cursor to scurry along like a puppy at meal time. The digitizer's ability to detect the pen tip is the magic that makes the pen work.

A Tablet PC pen has at least one user-activated switch and can also include an eraser switch and buttons on the pen barrel. The pen tip contains the one required switch, which triggers the equivalent of a left-button click when it makes contact with the screen. On some Tablet PC pens, a second switch is provided on the "eraser end" of the pen; pen-enabled drawing applications often use that second switch to erase on-screen objects. A right-click button can sometimes be found on the barrel of the pen.

Context menus summoned with right-clicks are commonplace, but barrel buttons are not standard on all Tablet PC pens. To address this, the Tablet PC shell incorporates a user action similar to that found on the Pocket PC: tap-and-hold. When using a Tablet PC pen without a barrel button, you can tap and hold to summon context menus.

There are six distinct ways for a pen to generate character input on a Tablet PC:

  • Input Panel Keyboard
  • Input Panel Writing Pad
  • Input Panel Character Pad
  • Pen Input Panel
  • InkEdit control
  • Ink-Enabled Window

Of these, the first three can be used in any application. The last three are only supported for ink-enabled programs.

Using a pen, a user can enter character data using the onscreen keyboard (see Figure 2). The Input Panel appears when a user clicks an icon in the system taskbar and can be hidden when not needed.

Figure 2 Tablet PC Input Panel Keyboard

The Input Panel can be docked or undocked. The three mode keys—Control, Shift, and Alt—are "sticky" so that multiple-key input is possible using the pen by itself. Incidentally, the Input Panel keyboard is not the same as the on-screen accessibility keyboard that comes with both Windows XP and a Tablet PC.

The Input Panel also supplies a handwriting area, which collects ink input. After a slight pause, or when the user clicks the Send button, the ink is recognized (converted to text). The resulting text is then sent to the window with focus in the foreground application (see Figure 3).

Figure 3 Input Panel Writing Pad

The Input Panel Character Pad is identical in appearance to the Input Panel Writing Pad except for the presence of small vertical ticks (see Figure 4). Using the guides, text can be entered one letter at a time. For users who insist on printing individual characters, the Character Pad uses the guides to help improve the results of handwriting recognition. It is so oriented towards individual characters that it is unable to interpret cursive characters. Instead, each stroke is assumed to belong to only one character.

Figure 4 Input Panel Character Pad

When undocked, the Tablet Input Panel is referred to as the Pen Input Panel. Ink-enabled programs are able to control the location and appearance of the undocked Input Panel through the PenInputPanel class. Through an instance of this class, an ink-enabled program can incorporate a pen input panel to capture character input. Unlike the system's Input Panel, which stays docked to the top or bottom of the screen, a Pen Input Panel is positioned next to the window that is to receive the user's handwriting input.
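
As a minimal sketch (my own illustration, assuming a form with a TextBox named textBox1 and a reference to Microsoft.Ink.dll), attaching a PenInputPanel to an existing control takes just a couple of lines:

PenInputPanel pip = new PenInputPanel(textBox1);  // attach to the control that will receive text
pip.AutoShow = true;                              // show the panel when the control gets pen focus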

Figure 5 Ink in Notepad

The Pen Input Panel can be used even when a program is not explicitly ink-enabled. The top of Figure 5 shows how a button appears inside Notepad (which is not ink-enabled) when a pen is pointed at its input window. When the button is clicked, the undocked Tablet Input Panel appears. The Pen Input Panel creates a unified pen-based text entry for all programs on a Tablet PC, not just those that are ink-enabled.

Figure 6 Text Drawn

InkEdit is the built-in control that supports handwriting recognition. Your program can treat InkEdit like a regular edit control, making this the simplest way to ink-enable an application. The user sees a window that directly accepts and recognizes ink as text. Figures 6 and 7 show ink drawn in an edit control and the resulting text that has been recognized and placed into that control.

Figure 7 Text Recognized

InkEdit replaces a standard edit control, but is built on the rich text control, not on the standard edit control. This means that an ink edit control can display text drawn using different font sizes and styles. (A standard edit control supports one font size and style.)
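
Because InkEdit behaves like any other Windows Forms control, using it takes only a few lines. The following sketch is my own minimal illustration (the control and form names are hypothetical):

InkEdit inkEdit = new InkEdit();
inkEdit.Dock = DockStyle.Fill;
this.Controls.Add(inkEdit);        // add to the current form

// After the user writes and pauses, the recognized text shows up here.
string recognized = inkEdit.Text;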

The greatest level of application integration with pen input involves ink-enabled windows. An ink-enabled window is an application-specific window that calls the ink support libraries to create and manipulate Ink objects. An ink-enabled window can convert ink to text, which satisfies its inclusion here as a character data collector. An ink-enabled window can also support pen-based command input called gestures. In addition to text, ink-enabled windows also support picture drawing. When text and pictures are placed in the same window, the Microsoft® Ink libraries support a Divider object that helps separate ink-based text data from ink-based pictures.

Speech Input

The Tablet PC incorporates two important uses of a microphone: text recognition and command input. To create speech-enabled programs, use the Microsoft Speech SDK, version 5.1. Even without adding this support, the Tablet PC supports speech as input for any Windows XP-compatible program. Speech input is controlled from the Input Panel (see Figure 8).

Figure 8 Speech-Related Buttons

As of this writing, four speech recognizers are available: Chinese (Simplified), Chinese (Traditional), English, and Japanese. For best results, a user should spend some time training the recognizer in the nuances of her speech. I also suggest purchasing a high-quality microphone, which helps as well.

Figure 9 Available Voice Commands

When speech is enabled on a Tablet PC, a set of speech-related buttons appears on the Input Panel. The buttons marked Dictation and Command allow a user to decide whether speech should be interpreted as text to be treated as input data, interpreted as commands, or (when both buttons are off) ignored. When a user clicks the Command button in the Input Panel, speech is interpreted as commands. You can, for example, tell your computer to "launch Notepad," "select paragraph," or summon the Start menu by simply saying "Start menu." If you are unsure of the available commands, ask the system "What can I say?" to summon the window showing available voice commands (see Figure 9).

Programming with Ink

The ability to store ink as ink is probably the most important innovation in the Tablet PC. In addition, ink can also be used to annotate documents, spreadsheets, Web pages, and, of course, other ink drawings and images.

You can build ink-enabled applications in either managed or unmanaged code. A set of ActiveX® controls and COM components supports unmanaged ink-enabled programs. A pair of managed code assemblies supports the creation of ink-enabled programs in a managed language, such as C# or Visual Basic® .NET. The .NET Framework 1.0 is included with Windows XP Tablet PC Edition, which has the distinction of being the first Microsoft operating system with built-in support for managed applications. The sidebar "Tablet PC Development and Test Systems" describes three options for setting up a development system to build Tablet PC software.

Ink Classes

The managed code interfaces are organized by a pair of managed assemblies: Microsoft.Ink.dll and Microsoft.Ink.resources.dll. These assemblies are added to the global assembly cache so they can be shared by different managed code programs. Figure 10 summarizes the core classes for building ink-enabled programs. Not every class listed is needed to create ink-enabled programs.

Figure 10 Core Ink Programming Classes

Class Name Description
Divider Analyzes ink to distinguish text from pictures.
DrawingAttributes Controls appearance of ink, such as color, line width, and so on.
Gesture Ink interpreted as a command.
Ink The main container to hold ink. Holds the collection of strokes that make up ink input. Supports moving ink between memory and disk. Supports clipboard actions on ink.
InkCollector Snap-on support for creating a basic ink-enabled window.
InkOverlay Snap-on support for adding ink-enabled support to existing application windows.
PenInputPanel An in-place input window for adding in-place ink input for existing controls.
Recognizer Provides language-specific conversion of ink strokes to text, along with a system dictionary of common words in a given language.
RecognizerContext Organizes the input elements needed for ink-to-text conversion: a recognizer, hints about data type (factoid), and an application-specific dictionary (word list).
Renderer Draws ink onto a drawing surface.
Stroke A set of (x,y) point values generated by a pen. A stroke starts when a pen makes contact with the drawing surface, and includes the locations traversed by the pen until the pen is lifted from the drawing surface.
Strokes collection Holds a set of strokes.

The ink classes operate by connecting to a window. InkCollector and InkOverlay attach to an application window and use that window's message stream to ink-enable it. When building an ink-enabled application, you have to pick one of these two classes; you then connect the class to one of your program windows.

InkCollector provides support for a pure-ink window. It is the simpler of the two and it was created to support ink-enabled programs that are written from scratch. In a pure-ink window, the InkCollector provides the primary support for managing the contents of the window.

InkOverlay is more powerful than InkCollector. With it, you can let users perform such actions as editing an onscreen graphic layout by drawing circles around items and scribbling comments in the margin. The ink overlay class allows the ink drawing model to coexist with an application-specific drawing model.

Through a CollectionMode property both InkCollector and InkOverlay accept a setting for how pen input is to be interpreted. The possible values are available in the CollectionMode enumeration. An application can choose to receive ink input, gesture input, or both, as shown in the following code.

enum CollectionMode { InkOnly = 0, GestureOnly = 1, InkAndGesture = 2 };

So what is the difference between ink input and gesture input? Ink is drawn and collected; it produces data. Gestures, on the other hand, are not drawn (or if they are drawn they are immediately erased) and represent commands.
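
A minimal sketch of wiring one up, assuming a Windows Forms panel named panel1 and a reference to the Microsoft.Ink assembly:

InkCollector inkCollector = new InkCollector(panel1.Handle);
inkCollector.CollectionMode = CollectionMode.InkAndGesture;  // collect ink and watch for gestures
inkCollector.Enabled = true;   // set last; the collector now receives pen input for the panel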

Gestures

One way to think about gestures is as pen-driven commands. The ink libraries support two types of gestures: system and application. Each contains a fixed number of elements; available system gestures are defined in the SystemGesture enumeration.

enum SystemGesture { Tap = 0x10, DoubleTap = 0x11, RightTap = 0x12, Drag = 0x13, RightDrag = 0x14, HoldEnter = 0x15, HoldLeave = 0x16, HoverEnter = 0x17, HoverLeave = 0x18 };

Application gestures are defined in the ApplicationGesture enumeration. A complete list of available application gestures can be found at ApplicationGesture Enumeration.

Whereas application gestures are gestures you can choose to have your application support, system gestures are surrogates for mouse events. When not processed, system gestures become mouse events. They help support non-ink-enabled programs by bridging the gap between pen input and mouse input. Application gestures generate input intended for ink-enabled programs. Traditional programs can still benefit from some of the application gestures through the Input Panel.
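A minimal sketch of asking for one application gesture follows; it assumes an InkCollector named inkCollector already attached to a window, and the choice of Scratchout is just an example:

// During initialization:
inkCollector.CollectionMode = CollectionMode.InkAndGesture;
inkCollector.SetGestureStatus(ApplicationGesture.Scratchout, true);   // express interest
inkCollector.Gesture += new InkCollectorGestureEventHandler(OnGesture);

// Event handler:
private void OnGesture(object sender, InkCollectorGestureEventArgs e)
{
    // The first element is the recognizer's best guess for the gesture.
    if (e.Gestures[0].Id == ApplicationGesture.Scratchout)
    {
        // Treat the gesture as a command, for example delete the selection.
    }
}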

What if a program wants a gesture not supported in either the system or application gesture tables? There are several options. When a desired gesture corresponds to a supported letterform (say an X to mark an object for deletion) a program could collect ink and use a RecognizerContext to watch for this command. As a command, the gesture would need to be erased to avoid confusion with ink as data.

But what if a desired gesture is not built in and has no corresponding letterform? Support for such a feature could be provided by creating a custom gesture recognizer, which could either be used by itself for total control of recognized gestures, or used to supplement the gestures recognized by the Microsoft-provided gesture recognizer.

Ink Data

As I mentioned earlier, the Tablet PC development team decided to treat ink as its very own data type. Ink created by an ink-enabled program is stored in its own portable graphic format called Ink Serialized Format (ISF). This format can be serialized and saved to disk in either binary format or XML, and it can be placed on the clipboard. Ink can also be Web-enabled (and made accessible to non-ink-enabled programs) when converted to a GIF with embedded ISF data.

The key managed code container class for holding ink data is the class Ink. Both the InkCollector and the InkOverlay classes hold an Ink object, which manages all of the collected ink data. The Ink class provides the primary support for serializing ink and for supporting the clipboard.

The Ink class serves up ink persistence through a Save method. Despite its name, this method does not write data to disk, but rather converts ink data into a byte array. Your program must take this byte array and copy it to disk. The class also supports a Load method, which copies a byte array—one created from a previous call to Save—into an empty Ink object.

ISF data is available in four different encodings; in a sense, ISF is four formats masquerading as one. One encoding is a simple ISF stream. A second encoding is a GIF image with a simple ISF stream embedded as metadata. This second encoding makes an ink image available to applications that are not ink-enabled (on a Web page, for example), while ink-enabled programs can still access the simple ISF stream inside the GIF. The third encoding is a simple ISF stream stored as base-64 data in an XML-compatible manner. The fourth encoding is a GIF similarly stored as base-64 data.
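
A minimal sketch of saving and reloading ink, assuming an existing Ink object named ink and a hypothetical file name (the file I/O uses System.IO):

// Convert the ink to a plain ISF byte stream and write it to disk.
byte[] isf = ink.Save(PersistenceFormat.InkSerializedFormat);
using (FileStream fs = new FileStream("notes.isf", FileMode.Create))
{
    fs.Write(isf, 0, isf.Length);
}

// A GIF with embedded ISF, handy for Web pages, is one line away.
byte[] gif = ink.Save(PersistenceFormat.Gif);

// Load requires an empty Ink object.
byte[] bytes;
using (FileStream fs = new FileStream("notes.isf", FileMode.Open))
{
    bytes = new byte[(int)fs.Length];
    fs.Read(bytes, 0, bytes.Length);
}
Ink inkCopy = new Ink();
inkCopy.Load(bytes);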

Ink data can be transferred between programs via the clipboard. Data sharing has always been an important feature of Windows; the creation of a new data format raises the issue of how to allow that format to be shared. At the most basic level, one or more clipboard formats need to be supported (see Figure 11).

Figure 11 Supported Clipboard Formats for Ink

Name CF_ Flag Description
HTML Format   Supports more than a single Ink object in a format that is compatible with both ink-enabled and non-ink-enabled programs
Bitmap CF_BITMAP A device-dependent bitmap, for non-ink-enabled programs
Device Independent Bitmap CF_DIB A device-independent bitmap, for non-ink-enabled programs
Enhanced Metafile CF_ENHMETAFILE Draw ink into an enhanced metafile, which can be drawn by non-ink-enabled programs
Ink Serialized Format (ISF)   Native ink format
Metafile CF_METAFILEPICT Draw ink into a metafile, for use by non-ink-enabled programs
Sketch Ink   An embeddable OLE object containing ink that is expected to hold a drawing
Text Ink   An embeddable OLE object containing ink that is expected to hold handwritten text that may subsequently be converted to recognized text

When ink is copied to the clipboard, the possible recipients for that data are of two types: programs that are ink-enabled and programs that are not. An ink-enabled program puts a range of different data types on the clipboard. When a user pastes an Ink object into an ink-enabled program, the recipient can request ISF from the clipboard. And when the paste operation is performed in a non-ink-enabled program, a variety of alternative graphic formats, such as bitmaps and metafiles, allow the recipient to display the ink.

The Ink object supports the creation of embeddable OLE objects. This is a clever approach because it allows ink to be incorporated into applications that are otherwise not ink-enabled. In effect, these embeddable objects allow any OLE-compatible program to host ink-enabled objects. For the present, there are two basic types of embeddable OLE Ink objects: sketch ink and text ink. Sketch ink is expected to contain a drawing; text ink is expected to contain data that may be converted to text via an ink-compatible recognizer.
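
Here is a minimal sketch of copying and pasting ink, assuming an Ink object named ink; the particular combination of format flags is my own choice, not the only reasonable one:

// Put ISF plus an enhanced metafile on the clipboard, so that both
// ink-enabled and non-ink-enabled recipients can paste something useful.
ink.ClipboardCopy(ink.Strokes,
    InkClipboardFormats.InkSerializedFormat | InkClipboardFormats.EnhancedMetafile,
    InkClipboardModes.Copy);

// In the receiving program, paste the ink if a compatible format is present.
if (ink.CanPaste())
{
    Strokes pasted = ink.ClipboardPaste();
}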

The Stroke Class and the Strokes Collection

Inside the Ink class is the Strokes property, which holds a collection of Stroke objects. A stroke is a set of points generated by the movement of a pen and can contain thousands of points. But a stroke is more than that; it can carry other data, such as pressure values and drawing attributes. When converting ink to text, the US-English recognizer appears to prefer fewer long strokes to many short strokes. That's why the recognizer appears to get more accurate results when recognizing cursive text than block-letter text.

As a collection class, the Strokes class supports the basic operations expected of any collection class. For example, the Count property returns the number of elements, the Add and Remove methods work as expected, and individual strokes can be accessed by index (0 to Count-1). The Strokes class supports ink-specific operations as well. Some methods apply a coordinate transform to the stroke data—scaling, shearing, rotating, and scrolling (or translating). Other methods change the appearance of ink when drawn or perform tests on stroke data.

Contained within a Strokes collection is a set of references to individual Stroke objects. As the pen moves to create a stroke, intermediate data known as Packets are delivered to an Ink object. The resulting stroke data is extracted from the packets.

The Stroke class makes it possible to perform several types of operations on stroke data. You can modify individual points, test whether a mouse-click (or pen-tap) touches a stroke, and apply a coordinate transform to an individual stroke. You can compare one stroke to another to find their intersections, or find the places where a stroke intersects itself through the SelfIntersections property.

The Stroke class supports copying the contents of a stroke object to an external Point array. This data is available in several forms, including as a polyline array and a Bezier array. The former can be drawn with a call to a DrawLines (or Win32 Polyline) method; the latter can be drawn with a call to a DrawBeziers (or Win32 PolyBezier) method. The benefit of a Bezier curve is that rough edges in a stroke are gently smoothed out.
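
A minimal sketch of drawing one stroke by hand, assuming a Stroke named stroke, a Renderer named renderer, and a Graphics object g for the ink-enabled window; the points come back in ink space, so each one is converted before drawing:

Point[] pts = stroke.GetPoints();              // polyline points in ink space
for (int i = 0; i < pts.Length; i++)
{
    renderer.InkSpaceToPixel(g, ref pts[i]);   // convert each point to pixel coordinates
}
g.DrawLines(Pens.Black, pts);                  // or use DrawBeziers with stroke.BezierPoints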

Appearance and Display of Ink

Built into the InkCollector and InkOverlay classes is support for the collection and drawing of ink. The appearance of ink can be controlled by modifying various properties. The DefaultDrawingAttributes property holds a reference to a DrawingAttributes object. Changes to this object are reflected in changes to all the strokes in the current Ink object. Ink can be set to be drawn with a round or rectangular pen tip and with a given color and width. Ink can even be drawn with a Boolean raster operation to change the appearance of ink based on the color of a window (XOR, AND, and so on).
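
A minimal sketch of changing how new ink is drawn, assuming an InkCollector named inkCollector:

DrawingAttributes da = inkCollector.DefaultDrawingAttributes;
da.Color = Color.Blue;           // ink color
da.Width = 100f;                 // pen width, expressed in ink-space (HIMETRIC) units
da.PenTip = PenTip.Rectangle;    // rectangular rather than round tip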

A Renderer object is also built into the InkCollector and InkOverlay classes. This object draws ink as a pen is moved, and it also provides automatic redraw of ink in response to a Paint event (or Win32 WM_PAINT message). In addition, an ink-enabled program can call on this object to redraw any combination of individual strokes. A Renderer object can draw onto a bitmap or into a metafile, and it can also be used to create hardcopy output of ink on a printer.

Figure 12 shows the containment relationship between some of the key ink-related objects for a window that uses an InkCollector. The InkCollector itself is connected to a portion of a window, as defined by the window input rectangle. The InkCollector contains an Ink object, which itself contains a Strokes collection. The Strokes collection, in turn, contains a set of Stroke objects, one for each user stroke. Outside of the Ink object is a Renderer object, which provides the drawing support for ink.

Tablet PC Development and Test Systems

Tablet PC development requires a Windows XP-compatible compiler. In order to build native code executables, Visual Basic 6.0 or Visual C++® 6.0 work just fine. For building managed code executables, you need Microsoft Visual Studio® 7.0 or later (or any .NET-compatible compiler).

In addition to the compiler, you may need to install an SDK. Two SDKs support Tablet PC development; the one you need depends on the type of application you are building. For ink-enabled applications, install the Tablet PC SDK; for speech-enabled applications, install the Microsoft Speech SDK.

Test ink-enabled programs using one of three system configurations: Tablet PC operating system on a Tablet PC system, Tablet PC operating system on a standard PC, and Windows XP Professional with Tablet PC SDK. The ideal test system is obviously a Tablet PC, since it comes equipped with the hardware and software that you need. The other two do provide an easier and possibly lower cost point of entry.

Testing using a standard PC with the Tablet PC operating system is available to MSDN® subscribers. This configuration serves up most software-based features, including handwriting recognizers and Tablet PC applications. Obviously, the limitations are hardware-dependent. For example, screen display rotation is not supported. The mouse can simulate pen input, but the user experience is not going to be the same as on a Tablet PC.

If you look at the installation disks for the Tablet PC operating system, you will see that Disk 1 of the Tablet PC Edition is the same as Disk 1 of Windows XP Professional. The Tablet PC-specific files are on Disk 2. You must make sure to enter the product key for Windows XP Tablet PC Edition. If you enter a Windows XP product key, then Windows XP installs without the Tablet PC enhancements.

The test environment that is the easiest to set up, but delivers the fewest features, is available by installing the Tablet PC SDK on Windows XP Professional. This configuration supports basic ink collection, which means that most of the sample programs in the SDK can run, but none of the handwriting recognizers are installed by default. With the recently released recognizer pack, you can now add recognition engines to machines running other operating systems for development purposes. None of the accessories, such as Windows Journal, are available in this configuration.

The ink-enabling components themselves are written as a set of automation-enabled ActiveX controls; they are, in short, written in unmanaged code. This means you can build ink-enabled applications using your favorite COM-compatible language: Visual Basic 6.0; Visual C++ 6.0; and the various C/C++-compatible APIs including Win32®, MFC, ATL, and WTL. My focus is on managed code development, but for each of the managed code classes, an unmanaged code equivalent is available.

Figure 12 Containment Relationship of Ink

An important feature provided by a Renderer object is support for coordinate conversion. Ink is collected in a set of coordinates known as ink space, which is fixed as HIMETRIC units. In this mapping mode, each logical unit is set to .01 millimeters. By contrast, mouse input always arrives as pixel coordinates. The default drawing coordinates system is also pixels (GraphicsUnit set to Pixel in managed code, and in Win32 the mapping mode is set to MM_TEXT). Figure 13 summarizes the available coordinate conversion functions provided by the Renderer class.

Figure 13 Coordinate Conversion Methods in Renderer Class

Method Description
InkSpaceToPixel Converts a single point from ink space to pixel coordinates
InkSpaceToPixelFromPoints Converts an array of points from ink space to pixel coordinates
PixelToInkSpace Converts a single point from pixel coordinates to ink space
PixelToInkSpaceFromPoints Converts an array of points from pixel coordinates to ink space
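
A minimal sketch of converting a pixel location (from a mouse or pen event) into ink space so it can be compared against stroke data; it assumes an InkCollector named inkCollector attached to a panel named panel1, inside that panel's event handler:

using (Graphics g = panel1.CreateGraphics())
{
    Point pt = new Point(e.X, e.Y);                    // pixel coordinates from the event args
    inkCollector.Renderer.PixelToInkSpace(g, ref pt);  // pt is now in HIMETRIC ink space
    // pt can now be handed to hit-testing methods such as Ink.HitTest.
}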

The Tablet PC supports three distinct coordinate mapping schemes, two of which are provided in the Renderer (see Figure 14). The two coordinate mapping schemes supported by the Renderer change the way ink is drawn, without changing the underlying ink data: the object and view mappings. The third mapping scheme, stroke mapping, modifies the actual coordinates stored by individual strokes. Stroke mapping is implemented by the Stroke class and also by the Strokes collection class to modify all the strokes in a given collection.

Figure 14 Coordinate Mapping Supported by Tablet PC

Name Class Methods Description
Object Renderer GetObjectTransform, SetObjectTransform Modifies coordinate mapping; leaves ink and pen width unchanged
View Renderer GetViewTransform, Move, Rotate, Scale, SetViewTransform Modifies coordinate mapping; changes ink and pen (cursor) width
Stroke Stroke and Strokes collection Move, Rotate, Scale, ScaleToRectangle, ScaleTransform, Shear, Transform Changes data in underlying ink strokes

When drawing pictures or creating annotations, the storing and retrieving of ink might be all that is needed. When program data must be converted to text, an additional step is required: the analysis of ink and the conversion (or recognition) of ink. This is the final item that a programmer needs to understand to build an ink-enabled program.

Analyzing Ink

The freedom to mix text and pictures presents a challenge for developers. If a user wants to edit the hand-drawn text as a set of characters (inside an edit control, for example), the program must convert the ink to text. But first a program must distinguish pictures from text.

The Tablet PC ink libraries provide the Divider class to do this. This object uses data in a Strokes collection to help it decipher what is text and what is a picture. The analysis is actually performed by the Divide method, which returns an object of type DivisionResult. That class, in turn, returns a collection of type DivisionUnits, which contains one or more DivisionUnit objects, each holding data of one of four types, as defined in the InkDivisionType enumeration:

enum InkDivisionType { Segment = 0, Line = 1, Paragraph = 2, Drawing = 3 };

Three of the ink division types are text: Segment, Line, and Paragraph. For each of these types, the ToString method returns the associated text string. The fourth division type, Drawing, is a set of strokes that contain non-text ink. The associated strokes are stored in a Strokes collection.
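
To make this concrete, here is a minimal sketch of using the Divider; the variable names are my own, and it assumes an existing Ink object named ink whose strokes mix writing and drawing:

Divider divider = new Divider();
divider.Strokes = ink.Strokes;               // strokes to analyze
DivisionResult result = divider.Divide();    // perform the layout analysis

// Walk the paragraphs of handwritten text found by the analysis.
foreach (DivisionUnit unit in result.ResultByType(InkDivisionType.Paragraph))
{
    string text = unit.ToString();           // associated text, as described above
}

// Strokes classified as drawings stay available as ink.
DivisionUnits drawings = result.ResultByType(InkDivisionType.Drawing);

The Divider also exposes a RecognizerContext property that can be set to influence how the text units are recognized.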

Converting Ink to Text

Converting ink strokes to text requires a recognizer, which runs only on Windows XP Tablet PC Edition or on a system with the Tablet SDK installed. The SDK for the Tablet PC allows ink to be drawn and saved on systems running Windows XP Professional (or Windows 2000 with SP2), but the SDK does not provide a recognizer for anything other than a development machine.

Handwriting recognizers are language specific. To be more precise, recognizers are identified both by primary language and locale. The Recognizer class contains a Languages property, which has an array of LCID (locale-identifiers) values that identify the languages a given recognizer supports. The recognizer uses a locale-specific (not language-specific) dictionary to improve its results. Thus there are US English (locale = 0x409) and UK English (locale = 0x809) recognizers.

Multiple recognizers can be installed on a single Tablet PC. To help organize available recognizers, the ink library provides a collection class named Recognizers. A program can let the system provide a default recognizer, or it can select a specific one available from the Recognizer collection.
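
As a minimal sketch (the LCID comparison is my own illustration and assumes the US English recognizer is installed):

Recognizers recognizers = new Recognizers();
Recognizer selected = recognizers.GetDefaultRecognizer();   // let the system choose

// Or pick a recognizer explicitly, based on the languages it supports.
foreach (Recognizer r in recognizers)
{
    foreach (short lcid in r.Languages)
    {
        if (lcid == 0x0409)    // US English
        {
            selected = r;
        }
    }
}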

Microsoft provides a number of handwriting recognizers, some of which come with the Tablet PC; others are available for download and installation as part of the Microsoft Tablet PC Recognizer Pack. Each recognizer generally supports a single language, although support for several related languages is possible. As of this writing, Microsoft provides nine handwriting recognizers in the Recognizer Pack: Chinese (Simplified), Chinese (Traditional), English (UK version), English (US version), French, German, Korean, Japanese, and Spanish.

Third parties can create custom recognizers. The handwriting recognizer interfaces are public and documented in the Tablet PC SDK. Custom recognizers could be used to support other languages or sets of specialized symbols such as musical notes or architectural or engineering symbols.

Once a recognizer has been selected, text recognition services can be made available by creating an instance of the Recognizer object—or, more precisely, of a RecognizerContext. Call the CreateRecognizerContext method on the Recognizer object, and it returns an object of type RecognizerContext.

A RecognizerContext provides an opportunity for an ink-enabled program to give hints to the recognizer. The selection of a recognizer by available language/locale leads to the most important hint—a dictionary of common words. The set of words in the dictionary helps improve the recognizer's accuracy.

A RecognizerContext contains three properties to improve text recognition: Factoid, Guide, and WordList. A Factoid indicates the expected type of data (see Figure 15). Some Factoids indicate that numeric data is expected; others suggest that a date, time, or telephone number is available. This level of precision assumes that a program has gathered a very precise set of pen strokes from a well-defined input area.

Figure 15 Supported Tablet PC Factoids

NONE DIGIT FILENAME HIRAGANA
DEFAULT NUMSIMPLE UPPERCHAR KATAKANA
SYSDICT CURRENCY LOWERCHAR KANJI_COMMON
WORDLIST POSTALCODE PUNCCHAR KANJI_RARE
EMAIL PERCENT JPN_COMMON BOPOMOFO
WEB DATE CHS_COMMON JAMO
ONECHAR TIME CHT_COMMON HANGUL_COMMON
NUMBER TELEPHONE KOR_COMMON HANGUL_RARE

A second property in the RecognizerContext is the Guide property, a structure of type RecognizerGuide. This property provides a set of coordinates (in ink space) that describe the size of the drawing area and the size of any drawn guide box, along with row and column information that a program supplies to assist the user. The recognizer can use these coordinates to determine the relationship of strokes to the available drawing area, thereby improving certain types of recognition.

The third property in the RecognizerContext is the WordList property. This identifies an application-specific set of words to supplement the system dictionary. The Factoid property controls whether the system dictionary is used (when set to SYSDICT), whether the word list is used (when set to WORDLIST), or whether neither is used (NONE).
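
A minimal sketch that ties these hints together; it assumes a Recognizer named selected (chosen as shown earlier) and an Ink object named ink with strokes to convert:

RecognizerContext ctx = selected.CreateRecognizerContext();
ctx.Factoid = Factoid.Telephone;        // hint: expect a telephone number
ctx.Strokes = ink.Strokes;              // strokes to recognize

RecognitionStatus status;
RecognitionResult result = ctx.Recognize(out status);
if (status == RecognitionStatus.NoError)
{
    string text = result.TopString;     // best available interpretation
}
ctx.Dispose();                          // the context wraps unmanaged resources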

TicTacToe: An Ink-Enabled Sample

The Tablet PC Platform SDK offers a nice set of samples that represent a starting point for your Tablet PC programming. To supplement them, I wrote TicTacToe (see Figure 16).

Figure 16 Ink-Enabled TicTacToe

While it's a simple game, seasoned developers can tell you that outward simplicity often hides inner complexity. And sometimes, as with my game, the problem involves finding the best match between the elements of a programming interface and the real world. I arrived at the final design after exploring other designs and writing some prototype code. A review of my discarded alternative designs reveals as much about ink programming as the one that I eventually adopted.

InkEdit Controls

My first approach involved using an InkEdit control provided by the Tablet PC libraries. Built-in controls are a good place to get rich features with a minimum amount of coding. UI elements often appear simple, but the myriad of details required often make this portion of code inherently complex. Offloading the work to a built-in control helps to simplify your project.

The InkEdit control is a rich text control that has been enabled to support ink input, which then gets converted to text. By contrast, the InkPicture control is for non-text ink. Since any TicTacToe application must allow players to enter two letters, I decided to try using the InkEdit control.

The first version of my game used nine InkEdit controls. InkEdit controls are primarily built to accept and display multiple characters, and even multiple lines, of text. An edit control is less usable when you only want to accept a single character, which is what my program required. It also does not work well when you want the character drawn in a particular way, which in my case meant having an X or an O occupy the whole client area of a single InkEdit window. I experimented with increasing the font size and playing other tricks. As I increased the font size, I also increased the chances that entered text would get scrolled outside the viewing area.

Also, I wanted the X or O to be centered both horizontally and vertically in the game square. But the InkEdit class tries to align text top-left. I could probably have made accommodations for this, but it occurred to me that the appearance of Arial or Times New Roman characters had the wrong feeling for a game that I used to play with crayons on scratch paper.

The text recognition worked well, but for my purposes it worked too well. A user could write anything in a game square, which meant that I would have to validate the input, requiring more code.

At the start, I thought I could adjust for these shortcomings, but in the end I decided to try another approach.

A Full-Screen InkCollector

My next attempt was to ink-enable my program's main window. Because this was a managed code program, the main window was an instance of the Form class. The most important thing I needed was a blank canvas to draw my game grid, and the ability to connect an InkCollector object to the window to retrieve pen events, manage the ink, and convert the input to the required game input.

The results of this experiment demonstrate a key design element of ink input. All the ink-as-text seen by an InkCollector is expected to be part of a continuous text block. So when my program went to retrieve the game input, it received a string that contained a set of X and O characters that were clumped together in a single string. For example, on the third move of a game, the string "XOX" might be returned. While this may have been correct for the letters that were being written, this string excluded all information about the specific game squares that were being played. Location information for text entered was lost because the recognizer did not provide any place holders for empty squares.

Because my game needed various functionalities that were quite different from what could easily be achieved with built-in controls, I decided that I would have to build more custom elements than I had originally anticipated.

Nine InkCollectors

My final design used nine windows and nine InkCollectors. I used the panel control for each game square because, like a form, a panel provides a blank customizable canvas. While I used a recognizer to determine what had been drawn in a game square, I did not replace the ink on the display screen with a text character. Keeping the ink as ink retained the flavor of the game as experienced when played on paper.

There are other alternatives that I did not try. One would be to use nine InkPicture controls. This would have allowed me to retain the ink feel, with perhaps less work on my part. A second alternative would have been to have a single input window with nine InkCollectors. Multiple InkCollectors can be connected to a single window (as set with the SetWindowInputRectangle method), provided no overlap exists between input rectangles.
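
That second alternative could look roughly like this; it is a sketch under my own assumptions: a single panel named panel1 that holds the whole grid, the same m_inkcoll array, and simple integer arithmetic for the 3-by-3 layout:

int cx = panel1.ClientRectangle.Width / 3;
int cy = panel1.ClientRectangle.Height / 3;
for (int i = 0; i < 9; i++)
{
    m_inkcoll[i] = new InkCollector(panel1.Handle);
    // Give each collector its own non-overlapping input rectangle.
    m_inkcoll[i].SetWindowInputRectangle(
        new Rectangle((i % 3) * cx, (i / 3) * cy, cx, cy));
    m_inkcoll[i].Enabled = true;
}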

Ink-enabling a custom input window is useful when you need multiple input areas. When a program creates such a window, it has the option of leaving the ink as ink, or converting the ink to text and displaying the results as text. After all, a data entry form might consist of both ink and text windows. Ink windows would be appropriate for signatures and hand-drawn diagrams. Text windows would be appropriate for other entry fields.

My program owns the custom input windows, which required a bit more work than I had hoped. But it provided the greatest amount of control over the input, processing, and display of game data.

In the TicTacToe game, all ink input is handled by InkCollector objects. The UI consists of nine white panels arranged on a black form, which produces the effect of having lines between game squares. Each panel has its own InkCollector object.

Assembly and Namespace References

For ink programming with managed code, you need to reference the Microsoft.Ink.dll, which ships with the Tablet PC SDK. You reference the proper namespace like this:

using Microsoft.Ink;

The Ink assembly that ships with the Tablet PC SDK is an update to the version that ships with the first generation of Tablet PCs. If your Tablet PC programs reference the updated assemblies that ship with version 1.5 of the SDK, you must also ship updated assemblies with your Tablet PC programs, or you must ensure that all of your users already have the updated assemblies installed. For details on distributing the updated merge modules, refer to Knowledge Base article 331956 at Tablet PC Object Model Components Leak Resources.

The version of Microsoft.Ink.dll that ships with the first generation of devices is version 1.0.2201.0. The assembly that comes with the SDK version 1.7 (which is available online as of October 2004) has a version number of 1.7.2600.2180. (The newer version is installed on development systems during the installation of the Tablet PC SDK.)

On the subject of versions of assemblies, the first generation of Tablet PCs ships with the .NET Framework 1.0. This is not a major issue, but when building Tablet PC programs with Visual Studio .NET 2003, you must be sure to set the runtime version to 1.0; otherwise, the default is the .NET Framework 1.1.

TicTacToe Initialization and Cleanup

Figure 17 shows the initialization of the InkCollectors when the form receives a Load event. References to the nine game square panels are stored in the m_panels array, which parallels the nine InkCollectors stored in the m_inkcoll array. The two arrays allow each set of objects to be put into loops when all objects must be searched or operated on in other ways.

Figure 17 Load Event Handler for TicTacToe Form

private void FormMain_Load(object sender, EventArgs e)
{
    // Put panels into array.
    m_panels[0] = panel1; m_panels[1] = panel2; m_panels[2] = panel3;
    m_panels[3] = panel4; m_panels[4] = panel5; m_panels[5] = panel6;
    m_panels[6] = panel7; m_panels[7] = panel8; m_panels[8] = panel9;

    for (int i = 0; i < 9; i++)
    {
        m_inkcoll[i] = new InkCollector(m_panels[i].Handle);
        m_inkcoll[i].Enabled = true;
        m_inkcoll[i].DefaultDrawingAttributes.Width *= 3;
        m_inkcoll[i].DefaultDrawingAttributes.Color = Color.Red;
        m_inkcoll[i].Stroke +=
            new InkCollectorStrokeEventHandler(InkCollector_Stroke);
    }

    // Record original size of panels to help in scaling.
    m_sizefPanelStart = new SizeF(panel1.ClientRectangle.Width,
                                  panel1.ClientRectangle.Height);
}

Each InkCollector object generates various ink-specific events. As each InkCollector object is created and initialized, my initialization code connects an event handler for the Stroke event to each InkCollector. The Stroke event is triggered when ink has been collected from a sequence of user actions that include touching the screen with the pen tip, moving the pen, and removing the pen tip from the screen (described later).

The last action taken by the load event handler is to record the original size of a game square panel's client area. My game uses this initial size when the form is resized, when the ink inside each panel must be scaled to maintain a consistent look no matter how large or small the individual game squares become (see the Resize sketch below).
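
As a rough sketch of the idea (my own approach, not necessarily the exact code in the download; it assumes the fields shown in Figure 17), a Resize handler can compute scale factors from the saved starting size and apply them as a view transform, which rescales the drawing without touching the stroke data:

private void FormMain_Resize(object sender, EventArgs e)
{
    if (panel1.ClientRectangle.Width == 0)   // minimized
        return;

    float fX = panel1.ClientRectangle.Width  / m_sizefPanelStart.Width;
    float fY = panel1.ClientRectangle.Height / m_sizefPanelStart.Height;

    System.Drawing.Drawing2D.Matrix m = new System.Drawing.Drawing2D.Matrix();
    m.Scale(fX, fY);

    for (int i = 0; i < 9; i++)
    {
        m_inkcoll[i].Renderer.SetViewTransform(m);   // view mapping only; ink data unchanged
        m_panels[i].Invalidate();
    }
}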

A key benefit of managed code is support for automatic garbage collection for managed memory. There are times, however, when objects must be explicitly freed. Ink objects are implemented in native code as COM objects, and so care must be taken to release unmanaged resources in a timely fashion. These include the InkCollector object.

Figure 18 shows the Dispose method for my form. Notice that I first removed the event handler on the InkCollector and then disposed of the InkCollector itself. This is described in more detail in Knowledge Base article 331956, which I referenced earlier.

Figure 18 Dispose Method for TicTacToe

protected override void Dispose(bool disposing)
{
    if (disposing)
    {
        if (components != null)
        {
            components.Dispose();
        }

        // See Knowledge Base Article #331956
        for (int i = 0; i < 9; i++)
        {
            if (m_inkcoll[i] != null)
            {
                // Remove event handler.
                m_inkcoll[i].Stroke -=
                    new InkCollectorStrokeEventHandler(InkCollector_Stroke);

                // Clean up ink collector object
                m_inkcoll[i].Dispose();
                m_inkcoll[i] = null;
            }
        }
    }
    base.Dispose(disposing);
} // method: Dispose

InkCollector Events

The InkCollector does all the work of capturing the input related to ink entry, in much the same way that an edit/textbox control captures keyboard input. For each of these controls, events are generated by the control in response to changes in the input state.

Some of the input events for the InkCollector are surrogates for mouse input. Others are specific to ink input, such as Gesture, NewInAirPackets, NewPackets, Stroke, and SystemGesture. The two "packet" events let you know when raw input is being received; they're useful, but not for my game.

The two "gesture" events let you know when an application or system gesture has been detected. I considered using gestures in this game, but was not able to find a gesture that matched the X (although the O did have a related gesture).

The final ink input event is the Stroke event. A stroke corresponds to all of the points collected from the time the user pushes down with the pen, through all movements of the pen and until he lifts the pen.

My game looks for the Stroke event to detect when the user has generated ink. The nine InkCollector objects share a single Stroke event handler, named InkCollector_Stroke. I figured out which InkCollector sent the event by looping through the InkCollectors in the m_inkcoll array and comparing each to the event handler's sender parameter. The result is iWindow, an index to an InkCollector in the array of InkCollectors. The same index also refers to the associated panel that the InkCollector occupies.

I created a bit mask for the current square by shifting the value 1 left by the window number, iWindow. The resulting value, iSquare, is added to m_sqX when a player enters an X, and to m_sqO when a player enters an O. This approach makes it easy to search for a winning combination. Figure 19 shows the table of winning values that my program uses.

Figure 19 Winning Results

// Table of winning square combinations
// Assumes the following layout of squares
//
//  1 | 2 | 3
// ===|===|===
//  4 | 5 | 6
// ===|===|===
//  7 | 8 | 9
private static int [] m_aiWins = {
    0x0007, // 123
    0x0038, // 456
    0x01C0, // 789
    0x0049, // 147
    0x0092, // 258
    0x0124, // 369
    0x0111, // 159
    0x0054, // 357
    0
};

// Check for a winning combination
// from table of possible wins.
private bool CheckForWin(int sqInput)
{
    for (int i = 0; m_aiWins[i] != 0; i++)
    {
        if ((sqInput & m_aiWins[i]) == m_aiWins[i])
            return true;
    }
    return false;
} // method: CheckForWin

Once stroke data has been entered, the final step in my game is to recognize text (see Figure 20). The game looks for either X or O, and ignores anything else. My game uses the simplest possible approach to text recognition: it calls the ToString method for the Strokes collection in a single line of code, and then my program can detect what text has been written.

Figure 20 Stroke Event Handler Method for TicTacToe

private void InkCollector_Stroke(object sender, InkCollectorStrokeEventArgs e)
{
    if (m_bGameOver)
    {
        MessageBox.Show("Game over — select " +
            "Game->Restart for a new game", strAppName);
        return;
    }

    int iWindow = -1;
    for (int i = 0; i < 9; i++)
    {
        if (sender.Equals(m_inkcoll[i]))
        {
            iWindow = i;
        }
    }

    // Create bit-mask for current square.
    int iSquare = 1 << iWindow;

    // Check whether square is available.
    if ((iSquare & m_sqFree) == 0)
    {
        MessageBox.Show("Square already occupied", strAppName);
        e.Cancel = true;
        return;
    }

    // Recognize input in window number iWindow.
    string str = m_inkcoll[iWindow].Ink.Strokes.ToString().ToUpper();
    if (str == "X")
    {
        // Set flags.
        m_sqX += iSquare;
        m_sqFree -= iSquare;

        // Check for winning.
        if (CheckForWin(m_sqX))
        {
            MessageBox.Show("Game over — winner is 'X'", strAppName);
            m_bGameOver = true;
        }
    }
    else if (str == "O")
    {
        // Set flags.
        m_sqO += iSquare;
        m_sqFree -= iSquare;

        // Check for winning.
        if (CheckForWin(m_sqO))
        {
            MessageBox.Show("Game over — winner is 'O'", strAppName);
            m_bGameOver = true;
        }
    }
    else
    {
        if (m_inkcoll[iWindow].Ink.Strokes.Count > 1)
        {
            MessageBox.Show("Cannot recognize input - try again.", strAppName);
            m_inkcoll[iWindow].Ink.DeleteStrokes();
            m_panels[iWindow].Invalidate();
        }
    }
} // method: InkCollector_Stroke

That one line of code taps into the ink recognition, but there is much more below the surface. The Tablet PC recognition engine uses a static neural network to convert the strokes to text, which means that recognition is accomplished by assigning probabilities to different potential results. Recognizers are language specific, and rely on a set of dictionaries of possible good results. My one line of code shows how easy it is to get ink converted into text.

Programs that need more sophisticated control over text recognition can get that control. Such needs are addressed with a set of recognizer classes, including the Recognizers collection, which lets a program get a list of the available recognizers. For example, a program that accepted multi-language input could choose a recognizer based on its support for a specific language.

Once a specific recognizer has been selected, the RecognizerContext class owns the actual recognition work. My program did not have to create an instance of this class, because the ToString method in the Strokes class did it for me. A program can access the results of the recognition engine by creating an instance of this class, and then asking for a list of alternative recognition results and the degree of confidence calculated for each.
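
As a sketch of that pattern (it assumes a RecognitionResult named result, obtained from a RecognizerContext as shown earlier):

RecognitionAlternates alternates = result.GetAlternatesFromSelection();
foreach (RecognitionAlternate alternate in alternates)
{
    // Each alternate carries its own text and a coarse confidence value.
    string text = alternate.ToString();
    RecognitionConfidence confidence = alternate.Confidence;
}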

As mentioned earlier, text recognition requires the assistance of an installable component known as a handwriting recognizer. Handwriting recognizers ship with the Tablet PC operating system, but not with the Tablet PC SDK. For this reason, this game requires an actual Tablet PC—or a standard PC that has Windows XP Tablet PC Edition installed.

What's Next

Before you write your first real-world Tablet PC program, I suggest you spend time using a Tablet PC to do work in the real world. This gives you a sense of all the functionality of the input devices. A Tablet PC is not just a PC without the familiar input devices. It is a new form factor, which allows a computer to go places that a computer traditionally could not go. It allows for new types of interactions between people and machines. And it allows for the creation of new types of applications. For anyone who has been thinking about how the use of computers can be expanded, the Tablet PC is worth serious consideration.

Paul Yao is a programming guru, specializing in Windows CE, and coauthor of the first book published on Windows programming, Programmer’s Guide to Windows (Sybex, 1987). His company specializes in training programmers in Windows and Windows-related technologies. Visit his Web site at www.paulyao.com.