Developing Windows Applications in C++: Windows Basics

What is a Windows application?

What does it mean to write a Windows application? If you have a Windows computer, aren’t all the applications that run on it “Windows applications”? Not really, no. Here are some things that are not Windows applications:

  • Console applications like “Hello World” and most of the samples from any Introduction to C++ course. These applications can be run from a command prompt. They write to stdout and don’t use Windows controls or graphics.
  • Services, also called Windows Services, that don’t have any user interface and just run in the background, monitoring or reacting.
  • Web Services, which respond to calls over HTTP.
  • Web applications designed to run inside a Web browser, which may contain code that executes on the end user machine (JavaScript or Silverlight, for example) but don’t rely on the operating system directly.
  • Plug-ins, add-ons, and customizations of various applications (Office, Visual Studio, and many others) that rely on the hosting application for all that they need.

A Windows application executes on a Windows machine, has a windowed user interface that can usually be resized, and makes calls to Windows to provide some of its functionality. That’s the kind of application you’ll learn how to make with this material. Many of the skills you’ll learn writing Windows applications in C++ are applicable to writing other kinds of applications as well, even though there are no examples of those other kinds of applications in this material.

There are three main ways that Windows and your application will interact. As you’ll see, the interactions go in both directions: sometimes you call Windows because you want some information or want something done, and sometimes Windows calls you. The categories of interaction are:

  • Windows messages
  • Flat C-style calls
  • COM

There are some other forms of interaction that are used less often, but these will cover what you need to get started.

One very important difference between these Windows applications and some of the other kinds of applications listed above is that a Windows application is essentially always waiting to do something in response to a signal from Windows. This is a different paradigm than you use in a console app or a straight procedural program.

Because a Windows application is different in a lot of ways from other applications you may have used or written, it involves different concepts, nomenclature, and related technologies. When you see a “first Windows application” like the one in the next chapter, it may appear to be entirely foreign. This chapter covers some of that background so that you can get started. If you prefer, you can go straight to the next chapter and come back here to look up any unfamiliar concept or word.

What is a Window?

One of the hallmarks of a Windows application is, of course, the main window that it has. Take a look at Notepad, Visual Studio, and whatever web browser you regularly use. They have plenty in common, right?

There’s a title bar at the top, which typically contains the name of the document you have open, a dash, and then the name of the application. There are buttons to minimize, to maximize or restore, and to close the application. There is also a frame around the window. All of this is often called “the chrome” by developers, but is technically called the non-client area, to distinguish it from the client area, which is where your application will display its user interface. In the case of Notepad, that UI is a single menu bar and a large area where you can type text. In the case of Visual Studio, the client area is a lot more complicated, with multiple panes, lots of toolbars, and controls everywhere that you can use to access all the functionality of the IDE. Together, the non-client area and the client area make up the main window of a Windows application.

Not all windows have all these parts. Every control in a Windows application, such as a button, a checkbox, an edit box, or the large white area of Notepad that you can type in, is itself a window. In fact, your code may interact more with these windows than with the main window.

A window has a number of properties and behaviors. A full list would not be useful here. The most important aspects of a window are these:

  • It has a location on the screen, and a size
  • It may be visible, hidden, or partly visible at any particular time
  • If it is not a main window, it knows its parent or owner
  • It can handle messages and can send messages

In fact, the sending and handling of messages is what really makes a Windows application so different from a “hello world” program that runs through once and then is done.

A Windows application is essentially always waiting to do something in response to a message. Messages are one of the crucial concepts in Windows programming. They come from the operating system to a window, from a window to the operating system, and from one window to another. For example, when the user clicks a button, the button receives a “click” message, which causes it to send out another message (typically, to the button’s parent window) that makes the application do something the user wants to be done. When the user shuts down the computer, Windows will send a message to all the open applications so that they will shut down. Processing Windows messages, and sending appropriate messages to other windows, is a large part of writing a Windows application. Windows messages can carry parameters, the meaning of which varies from message to message.
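The “always waiting for a message” idea can be illustrated with a small, portable simulation. This is only a sketch: it does not call the Windows API, and every name in it (SimMessage, SimMessageQueue, kMsgButtonClicked, kMsgQuit) is hypothetical, invented for illustration. A real application asks Windows for its next message and blocks until one arrives, whereas this sketch simply drains a queue.

```cpp
#include <cstdint>
#include <queue>

// Hypothetical message IDs for this simulation only. Real message IDs
// come from the Windows SDK headers.
constexpr std::uint32_t kMsgButtonClicked = 1;
constexpr std::uint32_t kMsgQuit = 2;

// A message is an ID plus a parameter whose meaning depends on the ID.
struct SimMessage {
    std::uint32_t id;
    std::uintptr_t param;
};

class SimMessageQueue {
public:
    // Adds a message to the back of the queue.
    void Post(SimMessage message) { pending_.push(message); }

    // Passes queued messages to the handler, in order, until a quit
    // message is seen or the queue is empty. Returns how many messages
    // the handler received. A real application would block here waiting
    // for the next message instead of returning on an empty queue.
    template <typename Handler>
    int Run(Handler&& handler) {
        int handled = 0;
        while (!pending_.empty()) {
            SimMessage message = pending_.front();
            pending_.pop();
            if (message.id == kMsgQuit) {
                break;  // stop processing; later messages are ignored
            }
            handler(message);
            ++handled;
        }
        return handled;
    }

private:
    std::queue<SimMessage> pending_;
};
```

The point of the sketch is the shape of the program: the application’s own logic lives in the handler, and the loop decides when that logic runs.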

Every window has a handle, and this handle is vital to communication among windows. It is called a handle because it is what you use to “pick up” a window and work with it in some way. You should not treat it as a pointer or as a reference. You’re not supposed to know how it works inside – it’s opaque – or to rely on things like the numerical value you find in a handle variable. There’s a special typedef set up for window handles, HWND, and many of the functions you’ll call take or return HWNDs. (If you need to say HWND aloud, just say the letter H, then “wind”.) Don’t try to dereference handles, increment them, or manipulate them in any way. Just hang on to them and use them when you want to tell a window to do something or ask a window for some information.

Where do window handles come from? Well, your application will ask Windows to create its main window when it first runs, by calling a global function, either CreateWindow() or CreateWindowEx(), that returns an HWND. Then if there are any controls or other child windows in your application, they too will be created and you will have HWNDs for them so that your other code can communicate with them as needed. The next chapter walks through a simple Windows application, including creating the main window.
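To make the idea of an opaque handle concrete, here is a minimal portable simulation. It is not how Windows implements HWND, and it does not call the Windows API; SimWindowHandle and SimWindowManager are hypothetical names used only for this sketch. The caller receives a handle from a creation function, never looks inside it, and passes it back whenever it wants something done.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical opaque handle for this simulation. Callers should store
// it and pass it back, never interpret or modify the value inside.
struct SimWindowHandle {
    std::uint32_t value;
};

class SimWindowManager {
public:
    // Creates a window record and returns a handle that identifies it.
    SimWindowHandle Create(const std::string& title) {
        SimWindowHandle handle{next_value_++};
        titles_[handle.value] = title;
        return handle;
    }

    // Looks up the window by handle. Returns false if the handle does
    // not identify a live window.
    bool GetTitle(SimWindowHandle handle, std::string& title_out) const {
        auto found = titles_.find(handle.value);
        if (found == titles_.end()) {
            return false;
        }
        title_out = found->second;
        return true;
    }

    // Destroys the window. The handle is no longer valid afterwards.
    bool Destroy(SimWindowHandle handle) {
        return titles_.erase(handle.value) == 1;
    }

private:
    std::uint32_t next_value_ = 1;
    std::unordered_map<std::uint32_t, std::string> titles_;
};
```

Because only the manager knows what the value means, it is free to change its internal bookkeeping without breaking callers, which is the same reason you should not rely on the contents of an HWND.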

Windows helps your application

A Windows application turns to Windows, the operating system, for almost everything that it does. If your application wants to read from a file, or write to a file, it’s the operating system that actually does the reads and writes. The same holds for almost everything the application does, including all the graphics and text it displays, many of the dialogs (such as the File Open dialog) it uses, and the way it reacts to the keyboard and mouse, or even touch for a touch-enabled system.

Windows messages

As mentioned above, Windows messages are the lifeblood of your application. Each has a message ID, which is just an integer. Each possible message ID also has a name, such as WM_PAINT or BCM_SETSHIELD, and these names are defined in header files included with the Windows SDK. The names follow a convention that generally makes it simpler to remember the message and its purpose: the prefix identifies the category of window the message applies to. For example, WM_PAINT can be sent to any kind of window, so it starts with the general prefix WM (W for window, M for message), while BCM_SETSHIELD is a message for button controls. Most messages that start with WM_ are sent by the operating system to a window. (For a full list of the prefixes used in the names of system-defined messages, see About Messages and Message Queues. A notification, which generally has an N in the prefix instead of an M, is a message from a window rather than a message to a window.) Remember that everything in your user interface, including the main window, dialogs, and controls, is a window.

WM_MOVE is sent by the operating system to the main window when the user drags your application around on the screen with the mouse (or does the equivalent using cursor keys or touch). WM_KEYDOWN is sent whenever the user presses a key on the keyboard and your application has the focus. WM_MOUSEWHEEL is sent when the user spins the mouse wheel (typically to scroll or zoom). WM_TIMECHANGE is sent when the system clock changes. There are hundreds of such messages, and it can take months to learn them all. If you want to handle a particular kind of user interaction or react to a particular event happening on the machine where your application is running, you can look up the message and the meaning of its parameters as needed. The MSDN documentation includes pages describing collections of messages, such as Mouse Input Notifications for messages notifying applications of mouse activity, as well as pages describing specific messages, such as WM_LBUTTONDOWN. It is more likely that you will use MSDN to find out how to do a particular task, and the documentation will include links to the detailed page for any message you need to handle.

One special message, WM_COMMAND, is used for communication from windows other than the main window (typically controls) to the main window. This is how the main window can react to a button being clicked, or a menu item being chosen, inside the application.
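As an example of message parameters whose meaning depends on the message, WM_COMMAND sent from a control packs two values into its first parameter (wParam): the low-order 16 bits carry the control’s identifier and the high-order 16 bits carry a notification code. The Windows SDK provides the LOWORD and HIWORD macros to pull these apart. The sketch below reproduces that unpacking in portable C++ so it can be read without the SDK; LowWord, HighWord, CommandInfo, DecodeCommand, and PackCommand are hypothetical names, not SDK names.

```cpp
#include <cstdint>

// Portable equivalents of the SDK's LOWORD and HIWORD macros.
constexpr std::uint16_t LowWord(std::uintptr_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFF);
}

constexpr std::uint16_t HighWord(std::uintptr_t value) {
    return static_cast<std::uint16_t>((value >> 16) & 0xFFFF);
}

// The two values a control packs into wParam for WM_COMMAND.
struct CommandInfo {
    std::uint16_t control_id;         // which control sent the message
    std::uint16_t notification_code;  // what happened (for example, a click)
};

// Splits a WM_COMMAND-style wParam into its two halves.
constexpr CommandInfo DecodeCommand(std::uintptr_t wparam) {
    return CommandInfo{LowWord(wparam), HighWord(wparam)};
}

// Packs the halves back together, in the spirit of the SDK's MAKEWPARAM
// macro. Included only so the sketch can build its own sample input.
constexpr std::uintptr_t PackCommand(std::uint16_t control_id,
                                     std::uint16_t notification_code) {
    return (static_cast<std::uintptr_t>(notification_code) << 16) | control_id;
}
```

In a real window procedure you would apply LOWORD and HIWORD to the wParam you receive, then compare the identifier against the IDs you assigned to your controls.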

Microsoft Foundation Classes, MFC, are a set of C++ classes (some involving templates) that represent the basic building blocks of a Windows application: windows, views, controls, menus, and so on. MFC is as much a framework as a library; it sets up a structure for your application, and gives you spots to put your own code, by writing your own classes or by implementing particular functions that the framework expects to call. Microsoft continues to support MFC and made many updates in 2010 to enable the latest user interface elements including support for Windows 7 capabilities such as taskbar integration and a ribbon.

Many successful applications have been built on MFC. The larger the application, the more useful the structure provided by MFC can be. To use MFC effectively, you should understand the basics of Windows application development, even though MFC hides some of those basics from you. So, for getting started, this material will steer clear of MFC and show exactly what’s happening “behind the scenes.” This is a benefit to developers who use Visual C++ Express, because MFC is not available with Express.

Flat C-style calls

The original method for interacting with Windows, and still the way a large amount of Windows functionality is offered, is through plain C-style function calls. You can call these from your C++ applications because they are declared as extern "C" in the appropriate header files. These header files can be hard to read because, for most Windows functionality that involves strings, they include declarations for two API functions: one that works with narrow (ANSI code page) strings, suffixed with A, and another that works with Unicode strings, suffixed with W. (The letters stand for ANSI and Wide.) For example, these lines are from WinUser.h:

    WINUSERAPI BOOL WINAPI SetUserObjectInformationA(
        __in HANDLE hObj,
        __in int nIndex,
        __in_bcount(nLength) PVOID pvInfo,
        __in DWORD nLength);

    WINUSERAPI BOOL WINAPI SetUserObjectInformationW(
        __in HANDLE hObj,
        __in int nIndex,
        __in_bcount(nLength) PVOID pvInfo,
        __in DWORD nLength);

    #ifdef UNICODE
    #define SetUserObjectInformation SetUserObjectInformationW
    #else
    #define SetUserObjectInformation SetUserObjectInformationA
    #endif // !UNICODE

This approach allows you to just call SetUserObjectInformation() within your code. If your project is Unicode (as it should be for a modern application), the W version will be used. These definitions rely on typedefs and macros set up elsewhere in this and other header files. WINUSERAPI will be translated to DECLSPEC_IMPORT whenever WinUser.h is included into any file except the one that actually implements these functions. DECLSPEC_IMPORT is itself a macro, defined as __declspec(dllimport). This compiler-reserved storage-class attribute identifies a function that is being imported from a DLL. (You may recognize the convention of using leading double underscores to identify a compiler-specific extension. __declspec is for the Visual C++ compiler specifically.) BOOL is just a typedef for int, from a time before the bool type was available. Finally, WINAPI is a macro that expands to __stdcall, a compiler-specific calling-convention keyword that determines how parameters are passed on the stack when a function is called and which side cleans up the stack afterward. Almost all Windows API functions are __stdcall functions, and the macro WINAPI ensures they are declared correctly.
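The A and W pattern is easier to see on a tiny function pair of your own. The following portable sketch mimics the technique with a hypothetical pair, CountCharsA and CountCharsW, and a hypothetical SKETCH_TEXT macro that plays the role of the SDK’s TEXT macro; none of these names exist in the Windows headers. UNICODE is defined inside the sketch only so that its behavior is predictable; in a real project that definition comes from the build settings.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Normally supplied by the project settings, not by source code.
// Defined here only so the sketch always selects the W variant.
#ifndef UNICODE
#define UNICODE
#endif

// Hypothetical function pair, not part of the Windows API.
// The A variant takes narrow strings, the W variant takes wide strings.
inline std::size_t CountCharsA(const char* text) {
    return std::strlen(text);
}

inline std::size_t CountCharsW(const wchar_t* text) {
    return std::wcslen(text);
}

// The unsuffixed name is a macro that resolves to one variant at
// compile time, which is the same trick the Windows headers use.
#ifdef UNICODE
#define CountChars CountCharsW
#define SKETCH_TEXT(literal) L##literal
#else
#define CountChars CountCharsA
#define SKETCH_TEXT(literal) literal
#endif
```

Code written as CountChars(SKETCH_TEXT("hello")) compiles against either variant without change. One consequence of the macro approach is worth knowing: because the unsuffixed name is a preprocessor macro, it replaces that identifier everywhere, including in your own classes if you happen to reuse the name.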

You don’t need to know this to call Windows API functions, of course – they’re declared for you in header files like windows.h and all the files it includes. The explanation is just included so these API declarations don’t seem too opaque.

Sometimes, you can call a Windows API function to get something taken care of, and you get an immediate return giving you the information you wanted, or reporting that your request succeeded or failed. But often, an asynchronous pattern is used. You ask Windows to do something, and later it will call back to you as part of fulfilling that request. (For example, if your application makes an API call to register for Restart and Recovery, the operating system will call back into your application just before it is terminated, to allow unsaved data to be stored before the termination.) While there may be some Windows API functions that send you a Windows Message as a way of finishing the interaction, the more common approach is for Windows to call one of your own functions, using a function pointer you provided. These function pointers are generally referred to as callbacks.

You probably won’t be surprised to learn there’s a macro to help you declare a callback function. CALLBACK expands to __stdcall, and the compiler won’t care whether you use CALLBACK or WINAPI when you’re declaring your callback. You’ll see examples using either of these. Here are some excerpts from an application that uses Restart and Recovery:

    DWORD WINAPI Recover(PVOID pContext);
    . . .
    RegisterApplicationRecoveryCallback(Recover, &g_dwRecordId, RECOVERY_DEFAULT_PING_INTERVAL, 0);

These two lines declare a function called Recover – the signature is established by Windows, and you would read the documentation to know what signature you’re required to implement – and then call a Windows API function, passing in a function pointer to that Recover function as the first parameter. Windows can now call Recover when it needs to. Any time documentation tells you that an API call “needs a callback function” or “takes a callback function” this is what it refers to.
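The mechanics of a callback can be shown without Windows at all. In this portable sketch, a hypothetical SimSystem class stands in for the operating system: it stores a function pointer and a context pointer at registration time and calls the function later. SimRecoveryCallback, SimSystem, SimulateCrash, and SaveRecord are all invented names; the signature is only loosely modeled on the Recover declaration above and omits the __stdcall calling convention that the real API requires.

```cpp
#include <cstdint>

// Hypothetical callback signature: takes the context pointer supplied
// at registration and returns a status value.
using SimRecoveryCallback = std::uint32_t (*)(void* context);

// Stands in for the operating system in this simulation.
class SimSystem {
public:
    // Remembers the function pointer and context for later use,
    // as a registration API would.
    void RegisterRecoveryCallback(SimRecoveryCallback callback, void* context) {
        callback_ = callback;
        context_ = context;
    }

    // Represents the moment the system decides to call back.
    // Returns false if no callback was registered.
    bool SimulateCrash(std::uint32_t& result_out) {
        if (callback_ == nullptr) {
            return false;
        }
        result_out = callback_(context_);
        return true;
    }

private:
    SimRecoveryCallback callback_ = nullptr;
    void* context_ = nullptr;
};

// The application's callback. It receives the same context pointer the
// application passed when registering, here a pointer to a record ID.
inline std::uint32_t SaveRecord(void* context) {
    auto* record_id = static_cast<std::uint32_t*>(context);
    return *record_id;
}
```

The context pointer is how the application gets its own data back when the callback runs, so the object it points to must still exist at that time. That is one reason the excerpt above passes the address of a global variable.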

COM

The Component Object Model, COM, is used for Windows functionality as an alternative to the flat API calls. A collection of related methods is gathered together into an interface, and a COM component implements one or more interfaces. Consuming a COM interface, that is, using COM to get Windows to do something for you, is reasonably easy. You can write a simple Windows application that makes a few calls to the same COM interface in just a few dozen lines. This is covered in chapter 5 of this material. Implementing a COM interface, offering a component to Windows or to other applications, is more difficult and will not be discussed here.

COM is a binary standard, meaning that it defines how two executables can interact, whatever language they were written in. At first, COM components and the applications that used them were written in either C++ or Visual Basic. Today, interop with .NET applications and components is also possible using the binary standard of COM. COM supports a separation of implementation from interface, and provides a mechanism to bind in a component at runtime, even after an application has already started running. Because you are interacting with a component, rather than just making a single function call and returning, you need to put a little effort into managing the lifetime of the component you are using. For example, you might make some calls that set state in the component, then another call that relies on the state you set. It’s important, then, that the component isn’t cleaned up before you make that last call.
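COM handles component lifetime through reference counting: every COM interface derives from IUnknown, whose AddRef and Release methods raise and lower a count, and the component destroys itself when the count reaches zero. The portable sketch below shows only that idea. ISimComponent, SimSpeaker, and CreateSimSpeaker are hypothetical names; the interface leaves out QueryInterface, and the counter is not thread-safe, unlike a real COM component.

```cpp
// Simplified, hypothetical interface. Real COM interfaces derive from
// IUnknown, which also declares QueryInterface.
struct ISimComponent {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual void SetVolume(int volume) = 0;
    virtual int GetVolume() const = 0;

protected:
    // Protected so that callers must use Release rather than delete.
    virtual ~ISimComponent() = default;
};

class SimSpeaker final : public ISimComponent {
public:
    // destroyed_flag lets the caller observe when the object is freed.
    explicit SimSpeaker(bool* destroyed_flag) : destroyed_flag_(destroyed_flag) {}

    unsigned long AddRef() override { return ++ref_count_; }

    // Drops one reference. The object deletes itself at zero.
    unsigned long Release() override {
        const unsigned long remaining = --ref_count_;
        if (remaining == 0) {
            *destroyed_flag_ = true;
            delete this;
        }
        return remaining;
    }

    void SetVolume(int volume) override { volume_ = volume; }
    int GetVolume() const override { return volume_; }

private:
    unsigned long ref_count_ = 1;  // the creator holds the first reference
    int volume_ = 0;
    bool* destroyed_flag_;
};

// Factory: returns an interface pointer that already owns one reference.
inline ISimComponent* CreateSimSpeaker(bool* destroyed_flag) {
    return new SimSpeaker(destroyed_flag);
}
```

A caller that sets state and then relies on it keeps its reference until after the last call, and only then calls Release. Each AddRef must be balanced by exactly one Release.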

Generally speaking, you don’t have a choice about whether to consume COM or not. If you want to use a particular capability offered by Windows, such as Text-to-Speech for example, you will have to use COM in order to gain access to that functionality.

The Active Template Library, ATL, was designed to simplify COM programming for C++ developers. It makes extensive use of C++ templates to enable developers to both consume and implement COM components quickly and easily. Some developers, especially those who weren’t using templates in their code already, found that the templates made COM harder to learn than it otherwise would have been. ATL is not included in Visual C++ Express, and therefore this material will not use ATL in any of its COM coverage. Once you know the basic COM concepts, you may want to look into ATL as a way to reduce the amount of “boilerplate” code you write.

Where does .NET fit into all of this?

The .NET Framework, which consists of the Common Language Runtime (CLR) and the languages that adhere to the Common Language Specification, is another way of building Windows applications. There are probably dozens of ways to build Windows applications just with Microsoft tools and languages: C++ without MFC or ATL, C++ with MFC, C++ with ATL, C# with Windows Forms, C# with the Windows Presentation Foundation, Visual Basic with Windows Forms, Visual Basic with Windows Presentation Foundation, and so on. This material concentrates on C++ without MFC or ATL. A solid understanding of the way that Windows works and the way that an application interacts with it will stand you in good stead if you want to look into other ways of building Windows applications in the future.

If your C++ application needs to interact with a .NET application, there are a number of ways to make that happen. The .NET assembly can be exposed to COM, and you can interact with it using the COM techniques in chapter 5. Alternatively, the .NET assembly might call into your code using a facility called Platform Invoke or P/Invoke. You could also use the .NET variant of C++, C++/CLI, to create a wrapper class that understands how to interact with your C++ code but exposes CLR interfaces to be called from C# or Visual Basic. These situations most often arise when you write a class library in C++ rather than a Windows application. These techniques are covered elsewhere because they are not specific to Windows development.

Applications written in C# or Visual Basic are often referred to as managed applications, because the Common Language Runtime provides a number of services to them including component services (similar to COM), security, and memory management. Some developers then refer to C++ applications that don’t use the CLR as unmanaged applications. The preferred term is native applications. The executables from an ordinary C++ application run directly on the operating system, not in a runtime the way managed applications do.