The "Avalon" Input System

 

Nick Kramer
Microsoft Corporation

March 2004

Summary: The presentation subsystem in "Longhorn" (code-named "Avalon") provides powerful new APIs for input. This article gives an overview of these APIs: what services are provided to applications, the architecture of the input system, and how new input devices can be supported.

Contents

Introduction
Input in a Tree
Text Input
Keyboard Sample
Commands
Input Core Architecture
Adding New Kinds of Devices
Conclusion

Introduction

The presentation subsystem in "Longhorn" (code-named "Avalon") provides new APIs for input. The primary input APIs are on the Element class. Note that in this article, I am using the term "element" as shorthand for "FrameworkElement" or "ContentFrameworkElement". Although they are distinct classes, from an input point of view they are identical. Elements have all the mouse and keyboard functionality you've come to expect from the Windows® operating system—keypresses, mouse buttons, mouse movement, focus management, mouse capture, and so on. Elements have the following input-related properties, methods, and events:

    class Element
    {
        // non-input APIs omitted

        // Mouse          
        event MouseButtonEventHandler MouseLeftButtonDown;
        event MouseButtonEventHandler MouseLeftButtonUp; 
        event MouseButtonEventHandler MouseRightButtonDown;
        event MouseButtonEventHandler MouseRightButtonUp; 
        event MouseEventHandler MouseMove;  

        bool IsMouseOver { get; }         
        bool IsMouseDirectlyOver { get; }         
        event MouseEventHandler MouseEnter;
        event MouseEventHandler MouseLeave;         
        event MouseEventHandler GotMouseCapture;
        event MouseEventHandler LostMouseCapture;        

        bool IsMouseCaptured { get; }
        bool CaptureMouse();
        void ReleaseMouseCapture();         
    
        event MouseEventHandler MouseHover;  
        event MouseWheelEventHandler MouseWheel;

        // Keyboard         
        event KeyEventHandler KeyDown; 
        event KeyEventHandler KeyUp;         
        event TextInputEventHandler TextInput;         

        bool IsFocused { get; }
        bool Focus();                         
        event FocusChangedEventHandler GotFocus;
        event FocusChangedEventHandler LostFocus;
        bool Focusable { get; set; }
        bool IsFocusWithin { get; }
        bool KeyboardActive { get; set; }

        bool IsEnabled { get; }
    }

In addition, the Mouse and Keyboard classes provide:

    class Keyboard
    {
        static Element Focused { get; }
        static bool Focus(Element elt);
        static ModifierKeys Modifiers { get; }
        static bool IsKeyDown(Key key);
        static bool IsKeyUp(Key key);
        static bool IsKeyToggled(Key key);
        static KeyState GetKeyState(Key key);
        static KeyboardDevice PrimaryDevice { get; }
    }

    class Mouse
    {
        static Element DirectlyOver { get; }
        static Element Captured { get; }
        static bool Capture(Element elt);
        static Cursor OverrideCursor { get; set; }
        static bool SetCursor(Cursor cursor);
        static MouseButtonState LeftButton { get; }
        static MouseButtonState RightButton { get; }
        static MouseButtonState MiddleButton { get; }
        static MouseButtonState XButton1 { get; }
        static MouseButtonState XButton2 { get; }
        static Point GetPosition(Element relativeTo);
        static void Synchronize(bool force);
        static MouseDevice PrimaryDevice { get; }

        static void AddAnyButtonDown(Element element,
            MouseButtonEventHandler handler);
        static void RemoveAnyButtonDown(Element element,
            MouseButtonEventHandler handler);
    }

Avalon also has integrated support for the stylus. The stylus is pen input, made popular by the Tablet PC. Avalon applications can treat the stylus as a mouse, using mouse APIs. But Avalon also exposes stylus APIs on par with keyboard and mouse:

        // stylus APIs on Element

        event StylusEventHandler StylusDown;
        event StylusEventHandler StylusUp;
        event StylusEventHandler StylusMove;
        event StylusEventHandler StylusInAirMove;

        bool IsStylusOver { get; }
        bool IsStylusDirectlyOver { get; }
        event StylusEventHandler StylusEnter;
        event StylusEventHandler StylusLeave;
        event StylusEventHandler StylusInRange;
        event StylusEventHandler StylusOutOfRange;

        event StylusSystemGestureEventHandler StylusSystemGesture;

        event StylusEventHandler GotStylusCapture;
        event StylusEventHandler LostStylusCapture;
        bool IsStylusCaptured { get; }
        bool CaptureStylus();
        void ReleaseStylusCapture();

The stylus is also capable of acting as a mouse, so applications that only recognize mice automatically get some level of stylus support. When the stylus is used in such a manner, the application first gets the appropriate stylus event, and then it gets the corresponding mouse event—we say that the stylus event gets promoted to a mouse event. (I briefly discuss the concept of promotion in Adding New Kinds of Devices.)

Additional, higher-level services such as inking are also available, although they are beyond the scope of this article.

Input in a Tree

Elements contain other elements (their children), forming a tree of elements that is typically several layers deep. In Avalon, a parent element can always participate in input directed to its child elements (or grandchild elements, and so on). This is particularly useful for control composition: building controls out of smaller controls.

Avalon uses event routing to notify parent elements. Routing is the process of delivering an event to multiple elements until one of them marks the event as handled. Events use one of three routing mechanisms: direct-only (also known as "no routing"), tunneling, and bubbling. Direct-only means that only the target element is notified; this is what Windows Forms and other .NET libraries use. Bubbling works up the element tree, first notifying the target element, then the target's parent, then the parent's parent, and so on. Tunneling is the opposite process: it starts at the root of the element tree and works down, ending with the target element.

Avalon input events generally come in pairs—a tunneling event followed by a bubbling event. For instance, PreviewMouseMove is the tunneling event that goes with the bubbling MouseMove event. As an example, suppose in the following tree "leaf element #2" is the target of MouseDown/PreviewMouseDown:

[Figure: element tree in which leaf element #2 is a child of intermediate element #1, which is a child of the root element]

The order of event processing will be:

  1. PreviewMouseDown (tunnel) on root element
  2. PreviewMouseDown (tunnel) on intermediate element #1
  3. PreviewMouseDown (tunnel) on leaf element #2
  4. MouseDown (bubble) on leaf element #2
  5. MouseDown (bubble) on intermediate element #1
  6. MouseDown (bubble) on root element
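The ordering above can be reproduced with a small standalone model. This is a sketch only, written in C++ to match the Win32 sample later in this article. `Node` and `RouteOrder` are hypothetical names invented for illustration and are not part of Avalon. The model ignores the handled flag and simply lists who would be visited.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an element in the tree; not the Avalon Element class.
struct Node {
    std::string name;
    const Node* parent;  // nullptr for the root
};

// Returns the visit order for a tunneling/bubbling event pair aimed at `target`.
std::vector<std::string> RouteOrder(const Node& target, const std::string& eventName) {
    // Walk from the target up to the root: target, parent, grandparent, ...
    std::vector<const Node*> path;
    for (const Node* n = &target; n != nullptr; n = n->parent) {
        path.push_back(n);
    }
    assert(!path.empty());  // the path always contains at least the target

    std::vector<std::string> order;
    // Tunnel: the Preview event starts at the root and ends at the target.
    for (auto it = path.rbegin(); it != path.rend(); ++it) {
        order.push_back("Preview" + eventName + " on " + (*it)->name);
    }
    // Bubble: the regular event starts at the target and ends at the root.
    for (const Node* n : path) {
        order.push_back(eventName + " on " + n->name);
    }
    return order;
}
```

Calling `RouteOrder` for a three-level chain (root, intermediate, leaf) with the event name "MouseDown" yields six entries in the same order as the numbered list.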

Here is a list of Preview input events on Element:

        // Preview events on Element
        event MouseButtonEventHandler PreviewMouseLeftButtonDown;
        event MouseButtonEventHandler PreviewMouseLeftButtonUp; 
        event MouseButtonEventHandler PreviewMouseRightButtonDown;
        event MouseButtonEventHandler PreviewMouseRightButtonUp; 
        event MouseEventHandler PreviewMouseMove;  
        event MouseWheelEventHandler PreviewMouseWheel;
        event MouseEventHandler PreviewMouseHover;  
        event MouseEventHandler PreviewMouseEnter;
        event MouseEventHandler PreviewMouseLeave;
        event KeyEventHandler PreviewKeyDown; 
        event KeyEventHandler PreviewKeyUp;
        event FocusChangedEventHandler PreviewGotFocus;
        event FocusChangedEventHandler PreviewLostFocus;
        event TextInputEventHandler PreviewTextInput;
        event StylusEventHandler PreviewStylusDown;
        event StylusEventHandler PreviewStylusUp;
        event StylusEventHandler PreviewStylusMove;
        event StylusEventHandler PreviewStylusInAirMove;
        event StylusEventHandler PreviewStylusEnter;
        event StylusEventHandler PreviewStylusLeave;
        event StylusEventHandler PreviewStylusInRange;
        event StylusEventHandler PreviewStylusOutOfRange;
        event StylusSystemGestureEventHandler PreviewStylusSystemGesture;

Usually, after the event is marked handled, further handlers are not invoked. However, when you attach a handler, you can ask that it receive handled events as well as unhandled ones by using the AddHandler method (pass "true" for the handledEventsToo parameter).
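The effect of the handled flag can be sketched with another standalone C++ model. `RoutedEvent`, `Registration`, and `Dispatch` are hypothetical names, not Avalon types. Only the `handledEventsToo` flag mirrors the AddHandler parameter named above, and the route is assumed to be already computed.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical event record; only the handled flag matters in this sketch.
struct RoutedEvent {
    bool handled = false;
};

// One handler attached somewhere along the route.
struct Registration {
    std::string owner;      // element the handler is attached to
    bool handledEventsToo;  // mirrors the AddHandler parameter
    std::function<void(RoutedEvent&)> handler;
};

// Invokes handlers in route order and returns the owners whose handlers ran.
std::vector<std::string> Dispatch(const std::vector<Registration>& route, RoutedEvent& e) {
    std::vector<std::string> invoked;
    for (const Registration& r : route) {
        assert(static_cast<bool>(r.handler));  // every registration needs a callable
        if (e.handled && !r.handledEventsToo) {
            continue;  // skipped: the event is already handled
        }
        r.handler(e);
        invoked.push_back(r.owner);
    }
    return invoked;
}
```

In this model, a handler registered with `handledEventsToo` set to true still runs after an earlier handler has marked the event handled, while an ordinary handler on the same route is skipped.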

Because of tunneling and bubbling, parents will receive events originally targeted to their children. Usually, it's not important who the target was—after all, the event is unhandled. But when it is important to know (particularly MouseEnter/MouseLeave and GotFocus/LostFocus), InputEventArgs.Source will tell you.

Another interesting issue is coordinate spaces. Coordinate (0,0) is the upper left, but the upper left of what: the element that is the input target, the element you attached your event handler to, or something else? To avoid confusion, Avalon input APIs require you to specify your frame of reference when dealing with coordinates. For example, the MouseEventArgs.GetPosition method takes an Element as a parameter, and the (0,0) coordinate returned by GetPosition is the upper left corner of that element.
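To make the frame-of-reference idea concrete, here is a minimal standalone C++ sketch. It assumes that each element is positioned by a plain translation within its parent, which is a simplification of real layout. `Box`, `OriginInRoot`, and this free-standing `GetPosition` are hypothetical and only borrow the name of the Avalon method.

```cpp
#include <cassert>

struct Point {
    double x;
    double y;
};

// Hypothetical element with a translation-only layout offset.
struct Box {
    Point offsetInParent;  // upper left corner, measured in the parent's coordinates
    const Box* parent;     // nullptr for the root
};

// Upper left corner of `box`, expressed in root coordinates.
Point OriginInRoot(const Box& box) {
    Point origin{0.0, 0.0};
    for (const Box* b = &box; b != nullptr; b = b->parent) {
        origin.x += b->offsetInParent.x;
        origin.y += b->offsetInParent.y;
    }
    return origin;
}

// Re-expresses a root-relative position relative to `relativeTo`.
Point GetPosition(Point positionInRoot, const Box* relativeTo) {
    assert(relativeTo != nullptr);  // the caller must name a frame of reference
    const Point origin = OriginInRoot(*relativeTo);
    return Point{positionInRoot.x - origin.x, positionInRoot.y - origin.y};
}
```

The same mouse position produces different coordinates depending on which element is passed in, which is why the caller has to choose the frame of reference explicitly.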

Text Input

The TextInput event allows a component or application to listen for text input in a device-independent fashion. The keyboard is the primary means of TextInput, but speech, handwriting, and other input devices can generate TextInput.

For keyboard input, Avalon first sends the appropriate KeyDown/KeyUp events; if those are not handled and the key is textual, it then sends a TextInput event. There is not always a simple one-to-one mapping between KeyDown/KeyUp and TextInput events: multiple keystrokes can generate a single character of TextInput, and a single keystroke can generate a multicharacter string. This is particularly true for Chinese, Japanese, and Korean, which use Input Method Editors (IMEs) to generate the thousands of different characters in their writing systems.

When Avalon sends a KeyDown/KeyUp event, if the keystrokes could become part of a TextInput event, KeyEventArgs.Key will be set to Key.TextInput, so applications don't accidentally process keystrokes that are part of larger TextInput. In these cases, KeyEventArgs.TextInputKey will reveal the real keystroke. Similarly, if an IME is active, KeyEventArgs.Key will be Key.ImeProcessed, and KeyEventArgs.ImeProcessedKey will give the actual keystroke.
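The lack of a one-to-one mapping between keystrokes and text can be illustrated with a toy dead-key composer. This standalone C++ sketch is far simpler than a real keyboard layout or IME. The `Composer` class and its single composition rule (apostrophe followed by e) are invented for illustration and do not describe how Avalon generates TextInput.

```cpp
#include <cassert>
#include <string>

// Hypothetical dead-key composer; the key table is invented for illustration.
class Composer {
public:
    // Feeds one keystroke and returns the text to report (empty if none yet).
    std::string Feed(char key) {
        assert(key >= 0x20 && key < 0x7F);  // this sketch models printable ASCII keys only
        if (pendingAccent_) {
            pendingAccent_ = false;
            if (key == 'e') {
                return "\xC3\xA9";  // UTF-8 for e-acute: two keystrokes, one character
            }
            return std::string("'") + key;  // no composition: one keystroke, two characters
        }
        if (key == '\'') {
            pendingAccent_ = true;  // wait for the next keystroke before reporting text
            return "";
        }
        return std::string(1, key);
    }

private:
    bool pendingAccent_ = false;
};
```

Feeding an apostrophe produces no text at first. A following e produces a single accented character, and any other key produces a two-character string from that one keystroke.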

Keyboard Sample

Let's take a simple example where pressing CTRL+O opens a file (no matter what control has focus), and pressing the Open button also performs that action:

[Figure: the sample window, containing an Open button and a text box]

In Win32, one would define an accelerator table, and handle WM_COMMAND, usually with a switch statement. (You could try to instead handle WM_KEYDOWN inside the window's WndProc, but unless you also modified the button and edit box's WndProc, you'd only get keystrokes if focus wasn't on the button or edit box.)

// sample.rc
…
IDC_INPUTSAMPLE2 ACCELERATORS
BEGIN
    "O",            ID_ACCELERATOR_O,       VIRTKEY, CONTROL, NOINVERT
END

// sample.cpp
…

int APIENTRY _tWinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPTSTR    lpCmdLine,
                     int       nCmdShow)
{
    . . .
    MyRegisterClass(hInstance);
    InitInstance(hInstance, nCmdShow);

    HACCEL hAccelTable = LoadAccelerators(hInstance, (LPCTSTR)IDC_INPUTSAMPLE2);

    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0)) 
    {
        if (!TranslateAccelerator(window, hAccelTable, &msg)) 
        {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
    }

    return (int) msg.wParam;
}

ATOM MyRegisterClass(HINSTANCE hInstance) { . . . }

BOOL InitInstance(HINSTANCE hInstance, int nCmdShow)
{
   window = CreateWindow(szWindowClass, szTitle, WS_OVERLAPPEDWINDOW,
      CW_USEDEFAULT, 0, CW_USEDEFAULT, 0, NULL, NULL, hInstance, NULL);

   if (!window)
   {
      return FALSE;
   }

   button = CreateWindow("BUTTON", "Open", 
       WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON, 
       40, 40, 90, 30, window, (HMENU) ID_BUTTON, hInstance, NULL);
   if (!button)
   {
      return FALSE;
   }

    DWORD dwStyle = WS_CHILD | WS_VISIBLE
                | WS_BORDER | ES_LEFT | ES_NOHIDESEL
                | ES_AUTOHSCROLL | ES_AUTOVSCROLL;  

    edit = CreateWindow("EDIT", "...", dwStyle,
                40, 80, 150, 40,
                window, (HMENU) 6, hInstance, NULL); 

   ShowWindow(window, nCmdShow);
   UpdateWindow(window);

   return TRUE;
}

LRESULT CALLBACK WndProc(HWND hWnd, UINT message, 
                                           WPARAM wParam, LPARAM lParam)
{
    switch (message) 
    {
    case WM_COMMAND:
    {
        switch (LOWORD(wParam))
        {
        case ID_ACCELERATOR_O:
        case ID_BUTTON:
            MessageBox(NULL, "Pretend this opens a file", "", 0);
            return 0;
        }
        break;
    }
    case WM_DESTROY:
        PostQuitMessage(0);
        return 0;
    }
    return DefWindowProc(hWnd, message, wParam, lParam);
}

In Windows Forms, one would set KeyPreview on the form to true, and handle the KeyDown event on the form:

using System;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;
using System.Data;

public class Form1 : Form
{
    static void Main() 
    {
        Application.Run(new Form1());
    }

    private Button button1;
    private TextBox textBox1;

    public Form1()
    {
        this.button1 = new Button();
        this.textBox1 = new TextBox();
        this.SuspendLayout();
        // 
        // button1
        // 
        this.button1.Location = new Point(8, 40);
        this.button1.Name = "button1";
        this.button1.TabIndex = 0;
        this.button1.Text = "Open";
        this.button1.Click += new EventHandler(this.button1_Click);
        // 
        // textBox1
        // 
        this.textBox1.Location = new Point(8, 88);
        this.textBox1.Name = "textBox1";
        this.textBox1.TabIndex = 1;
        this.textBox1.Text = "...";
        // 
        // Form1
        // 
        this.AutoScaleBaseSize = new Size(6, 15);
        this.ClientSize = new System.Drawing.Size(292, 260);
        this.Controls.AddRange(new Control[] {
                                                 this.textBox1,
                                                 this.button1});
        this.Name = "Form1";
        this.Text = "Input Sample";
        this.KeyPreview = true;
        this.KeyDown += new KeyEventHandler(this.Form1_KeyDown);
        this.ResumeLayout(false);
    }

    private void button1_Click(object sender, EventArgs e)
    {
        handle();
    }

    private void Form1_KeyDown(object sender, KeyEventArgs e)
    {
        if (e.KeyCode == Keys.O && e.Modifiers == Keys.Control)
        {
            handle();
            e.Handled = true;
        }
    }

    void handle()
    {
        MessageBox.Show("Pretend this opens a file");
    }
}

In Avalon, one would define a handler for the Button's Click event (btn_Click), and a handler for KeyDown (fp_KeyDown):

<Window  
    xmlns="https://schemas.microsoft.com/2003/xaml"
    xmlns:def="Definition"
    Text="Application1" Visible="True"
    >
<FlowPanel KeyDown="fp_KeyDown">
    <Button Click="btn_Click"> Open </Button>
    <TextBox> ... </TextBox>

    <def:Code> <![CDATA[
    void fp_KeyDown(object sender, KeyEventArgs e) {
        if (e.Key == Key.O
                && Keyboard.Modifiers == ModifierKeys.Control) {
            handle();
            e.Handled = true;
        }
    }

    void btn_Click(object sender, ClickEventArgs e) {
        handle();
        e.Handled = true;
    }

    void handle() {
        MessageBox.Show("Pretend this opens a file");
    }
    ]]> </def:Code>
</FlowPanel>
</Window>

Note that the KeyDown handler is attached to the FlowPanel near the root of the tree. (We use FlowPanel instead of Window because the Window class doesn't get input.) Because input bubbles up the tree, the FlowPanel will get the input no matter which element has focus.

The examples differ in one subtle point—suppose the edit control wants to handle CTRL+O? In the Win32 and Windows Forms samples, the edit control never receives a WM_KEYDOWN or equivalent notification, because the event is handled in the message loop by way of TranslateAccelerator. In the Avalon sample, the TextBox control is notified first, and our fp_KeyDown handler is called only if the TextBox didn't handle the input. Or, we could handle PreviewKeyDown instead of KeyDown, in which case our fp_KeyDown handler will be called first.

In the Avalon example above, we ended up writing handling logic twice—once for CTRL+O, and a second time for the button click. We can simplify this using Avalon commands.

Commands

Note: Commands are only partially implemented in the PDC 2003 Longhorn prerelease build.

Commands allow you to handle input at a more semantic level than device input. Commands are simple directives, like "cut," "copy," "paste," or "open." Avalon will provide a library of common commands, but you can also define your own.

Commands are useful for centralizing your handling logic. The same command might be accessible from a menu, on a toolbar, or through a keyboard shortcut, and by using commands you write a single piece of code that works for all the different input cases. Commands also provide a mechanism for graying out menu items and toolbar buttons when the command becomes unavailable.

The common commands provided by Avalon come with a set of default input bindings built in, so when you specify that your application handles Copy, you automatically get the CTRL+C = Copy binding. You also get bindings for other input devices, such as Tablet pen gestures and speech information. Finally, many of the common commands come with their own icons, making toolbars look more consistent and professional.

Many controls have built-in support for certain commands. For example, TextBox understands Cut, Copy, and Paste. And since each of those commands provides the default key binding, TextBox automatically supports those shortcuts.
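The idea of centralizing handling logic behind a command can be sketched without any Avalon APIs. The following standalone C++ model is an assumption-laden illustration: `Command`, `CommandBindings`, and the gesture strings are hypothetical, and the real command library is only partially implemented in the prerelease build. It shows one handler reached from several inputs, plus an availability check that a menu or toolbar could use for graying out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical command: one handler plus an availability check.
struct Command {
    std::function<bool()> canExecute;
    std::function<void()> execute;
};

// Maps input gestures (keyboard shortcut, button click, menu item) to commands.
class CommandBindings {
public:
    void Bind(const std::string& gesture, const Command* command) {
        assert(command != nullptr);
        bindings_[gesture] = command;
    }

    // True when the gesture is bound and its command is currently available;
    // a toolbar or menu could use this to gray out its item.
    bool IsEnabled(const std::string& gesture) const {
        auto it = bindings_.find(gesture);
        return it != bindings_.end() && it->second->canExecute();
    }

    // Runs the bound command if it is available; returns whether it ran.
    bool Invoke(const std::string& gesture) const {
        if (!IsEnabled(gesture)) {
            return false;
        }
        bindings_.at(gesture)->execute();
        return true;
    }

private:
    std::map<std::string, const Command*> bindings_;
};
```

Binding both a "Ctrl+O" gesture and a button-click gesture to the same `Command` means the open logic is written once, which is the duplication the keyboard sample could not avoid.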

Input Core Architecture

[Figure: input core architecture, spanning kernel-mode and user-mode components]

The input system consists of both kernel-mode and user-mode pieces. Input originates in a device driver and, for most input devices, is then sent to win32k.sys, the kernel-mode component of USER and GDI. Win32k.sys does some processing on the input and decides which application process to send it to. Inside a Longhorn application, Avalon performs further processing on the input and sends notifications to the application.

Like Win32 programs, Avalon programs have a message loop that polls the external world for new notifications. Avalon can integrate with a standard Win32 message loop, which is connected with the rest of Avalon through a dispatcher. The dispatcher abstracts details of the specific loop, as well as providing services for dealing with nested message loops. To receive messages from Win32, Avalon has an hwnd known as the HwndSource. Message processing is synchronous—the HwndSource WndProc doesn't return until Avalon has fully processed the input message. This enables integration with Win32 in cases where WndProc is expected to return a value.

Inside a Longhorn application, input processing looks like this:

[Figure: input processing inside a Longhorn application, with the core input system shown as gray boxes]

To the core input system (represented by the gray boxes), input begins when an IInputProvider notifies its corresponding InputProviderSite about an available input report. The site tells the InputManager, which puts the input report in the staging area. Various monitors and filters are then run over the staging area, turning the input report into a series of events. Finally, the events are routed through the element tree and handlers are invoked.

The input providers for the keyboard and mouse get their input from Win32 USER by way of the HwndSource (not pictured). Other devices can choose to do the same, or can choose an entirely different mechanism. The stylus is an example of input that does not come from the HwndSource: the input provider for the stylus gets its input from wisptis.exe, which in turn talks to the device driver through HID (the Human Interface Device API). InputManager provides APIs for registering new input providers.

A filter is any code that listens to the InputManager.PreProcessInput or InputManager.PostProcessInput events. Filters can modify the staging area. Canceling PreProcessInput will remove the input event from the staging area. PostProcessInput exposes the staging area as a stack—items can be popped off or pushed to the top of the staging area.

A monitor is any code that listens to InputManager.PreNotifyInput or InputManager.PostNotifyInput. Monitors cannot modify the staging area.
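The interplay of staging area, filters, and monitors can be modeled in a few lines of standalone C++. This is a sketch under stated assumptions, not the InputManager implementation. `StagingModel` is a hypothetical class, input reports are reduced to strings, PostNotifyInput is omitted, and the order in which the callbacks run (pre-process, pre-notify, route, post-process) is my assumption rather than something this article specifies.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of the staging area; input reports are plain strings here.
class StagingModel {
public:
    using Staging = std::vector<std::string>;  // back() is the top of the stack

    // Filters: may cancel an item (pre) or push follow-up items (post).
    std::vector<std::function<bool(const std::string&)>> preProcess;
    std::vector<std::function<void(const std::string&, Staging&)>> postProcess;
    // Monitors: observe only, so they are never handed the staging area.
    std::vector<std::function<void(const std::string&)>> preNotify;

    // Runs one input report through the pipeline and returns the items
    // that would have been routed to the element tree, in order.
    std::vector<std::string> Process(const std::string& report) {
        assert(!report.empty());
        Staging staging{report};
        std::vector<std::string> routed;
        while (!staging.empty()) {
            const std::string item = staging.back();
            staging.pop_back();

            bool canceled = false;
            for (const auto& filter : preProcess) {
                if (!filter(item)) {
                    canceled = true;
                    break;
                }
            }
            if (canceled) {
                continue;  // a filter removed the item from the staging area
            }

            for (const auto& monitor : preNotify) {
                monitor(item);
            }
            routed.push_back(item);  // stand-in for routing the event
            for (const auto& filter : postProcess) {
                filter(item, staging);
            }
        }
        return routed;
    }
};
```

A post-process filter that pushes "MouseDown" whenever it sees "StylusDown" gives a rough picture of promotion: one input report enters the staging area and two events come out, in that order.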

Adding New Kinds of Devices

Note: We are in the early stages of designing the features discussed in this section, and appreciate your feedback.

Listed below are some of the scenarios that device extensibility is intended to enable:

  • Adding new keys to keyboards, adding buttons to mice
  • Adding new kinds of devices that expose an API to applications
  • Adding new kinds of devices that emulate existing devices (usually mouse and keyboard)
  • Adding new kinds of devices that both expose an API and emulate existing devices, for compatibility with applications that don't understand that type of device
  • Using HID to add new devices or device functionality
  • Enabling applications using HID to get lower-level "raw" input
  • Enabling globally binding input sequences to actions; for example, the "mail" key will launch an e-mail program (including scenarios where the foreground application doesn't understand the mail key)

Conclusion

Avalon gives full access to the mouse, keyboard, and stylus, while providing higher-level services for text input and commands.