Windows Vista Security Series: Building Plug-ins for Network Access Protection

Dan Griffin

JWSecure, Inc.

October 2007

Applies to:

   Microsoft Windows Vista for Developers

Summary: The third in a series of articles about some of the new security-related features in Microsoft Windows Vista, this article discusses the implementation of plug-ins for Microsoft Network Access Protection (NAP). (12 printed pages)


Contents

Introduction

Network Access Protection

Sample Business Problem

Solution Architecture

Solution Implementation

Operational Security Considerations

Conclusion

Resources

About the Author

Introduction

This article discusses the implementation of plug-ins for Microsoft Network Access Protection (NAP). I'll discuss NAP in detail in the next section.

This is the third in a series of articles about some of the new security-related features in Microsoft Windows Vista. The first article demonstrated how to add a new symmetric cipher algorithm to "Crypto API: Next Generation" (CNG) and the Cryptographic Message Syntax (CMS) APIs. The second article demonstrated programmatic configuration of the Microsoft Windows Firewall. For additional resources and downloads that are related to this series, see the Resources section of this document.

This article is written for any developer who is familiar with programming on Windows. Most of the topics that I'm addressing relate to new functionality in Windows Vista that's available only with native APIs (as opposed to with the Microsoft .NET Framework), as of the time of this writing. However, whenever I can tie in to the managed code side of things, I will.

In summary, if you've previously done some Windows programming by using Visual C, Visual C++, and/or Visual C#, you'll find this series to be accessible and (I hope) interesting, too.

Network Access Protection

NAP is a feature that helps to protect network resources from insecure computers. Although NAP is a complex and extensible technology, it's easy to gain a basic understanding of it by considering a common deployment scenario.

Suppose that an enterprise that is called Contoso has deployed a Remote Access (RAS) server to allow traveling employees to access the corporate network from the road. While RAS access is critical for the productivity of Contoso's employees, and the underlying protocols that are used by RAS provide robust authentication and encryption, exposing the corporate network to roaming computers nevertheless increases its attack surface. For example, roaming computers are frequently exposed to high-risk hotel and coffee-shop wireless networks. Such machines, therefore, have higher malicious-software (malware) and virus exposure.

To help combat the security risk that is introduced by allowing roaming computers to connect, Contoso's network administrators would like to ensure that roaming machines are as "healthy" as possible, before allowing them to connect. For example, they want to confirm that every client has installed the latest patches from Microsoft Update. Contoso can accomplish this by enabling NAP on the RAS server and on the RAS/VPN clients. In fact, NAP has built-in support for verifying that clients have the latest Microsoft security updates.

With NAP enabled in the RAS scenario, clients who attempt to connect first must pass the health check that is administered by the RAS server. If the health check succeeds (that is, the NAP client reports that the latest security patches have been installed), the virtual private network (VPN) connection is established. If the health check fails, NAP allows clients to automatically attempt to correct the problem (in this case, by downloading and installing the latest security patches) and to reattempt the RAS connection. If the health problem is not corrected, the client is not allowed to connect, thus keeping unpatched systems off the corporate network.

With that scenario in mind, consider that NAP is actually more broadly applicable. For one thing, RAS/VPN isn't the only NAP-enabled network service. The other NAP-enabled network services in Microsoft Windows Server 2008 include DHCP, IPsec, and 802.1X. For more information, see Network Access Protection (NAP) for Windows Server 2008.

Also broadening NAP's applicability is the fact that the health-check architecture supports a plug-in model. This is the primary subject of this article, and it is what's demonstrated in the accompanying sample code.

As an introduction to the sample NAP plug-ins that are discussed in the remainder of the article, consider the following business problem.

Sample Business Problem

This sample business problem builds upon the deployment scenario that I introduced earlier. In this example, suppose that I'm a programmer for a line-of-business (LOB) application-development team at a large enterprise. The company for which I work, Contoso, has a large sales department, and many of those employees spend a lot of time on the road. As a result, Contoso is heavily dependent upon its remote-access infrastructure, which allows its sales team to access data and applications reliably on the corporate network while they travel.

The nature of this environment is such that many of Contoso's employees are accessing corpnet resources from networks with high-risk profiles. For example, hotel and coffee-shop networks are not known to be the most secure, especially when compared to Contoso's. The workstations that are used by Contoso's traveling sales force face heavy virus and malware exposure.

Furthermore, Contoso's administrators recognize that, as connectivity needs increase, enterprise network perimeters are becoming more permeable. For example, the machines (laptops, typically) that are used by traveling employees are at times "docked," connected directly to the corporate network, behind the firewall. Thus, the VPN servers aren't the only point of entry for high-risk clients; on-site services such as DHCP also are at risk.

Therefore, Contoso's system administrators are deploying NAP in order to better protect corporate resources from these potentially unhealthy client machines. As mentioned previously, NAP is available across multiple scenarios, including VPN and DHCP. In addition, NAP's extensibility allows the sysadmins to determine which client health policies to enforce, and when to modify those policies over time.

However, while NAP is indeed extensible, extending it generally involves implementing certain COM plug-in interfaces in Visual C++ code. This is not an area in which the average sysadmin is particularly strong! This creates a dependency on me and the rest of Contoso's LOB dev team to respond to the sysadmins in a timely manner each time that a new health policy is required for enforcement, or each time that a significant change to an existing policy plug-in is required. The relative complexity of most COM/Visual C++ interfaces, as well as the increased testing burden, is such that this dependency is likely to result in deployment delays that Contoso's administrators would prefer to avoid.

Fortunately for the LOB devs and the sysadmins, the observation was made that many decisions about client-workstation health are made based (either directly or indirectly) on information that is read from the Windows registry. As a result, a generic NAP solution could be created that is based simply on a list of target registry keys, as well as expected values, provided at run time. Those clients without the expected registry settings, whatever they happen to be, will be considered unhealthy.

Both teams agree: The generic registry NAP plug-in is a great idea! The remainder of the article discusses its implementation.

Solution Architecture

As discussed in the preceding section, the system administrators and LOB programmers at Contoso have agreed on a generic solution for NAP extensibility. By allowing a to-be-defined list of client registry keys to be checked, a broad set of configuration policies can be enforced without incurring a downstream dependency upon low-level programming skill.

Let's dig further into the NAP architecture. This solution will consist of the following pieces:

  • Registry System Health Agent (SHA)—The SHA interface is what exposes NAP client-side extensibility; the Registry SHA is thus a COM DLL that implements this interface. The Registry SHA implements the policy reporting that was discussed earlier: Accept an arbitrary list of registry keys, and report their values.
  • Registry SHA service—This Microsoft Windows NT service will host the Registry SHA on the client, registering its interface upon startup.
  • Registry System Health Validator (SHV)—The SHV interface is what exposes NAP server-side extensibility. Like its SHA counterpart on the client, the Registry SHV is a COM DLL. In summary, the SHV allows the administrator to define a list of registry keys and expected values. Based on the actual values that are reported by the client SHA, the SHV determines whether the client should be considered healthy or otherwise.

Message Flow

Figure 1 shows the message flow between logical components for a typical NAP-enabled scenario—DHCP, in this case. A description of the messages and principals that are involved follows.

Figure 1. Message flow for typical NAP-enabled scenario

Figure 1 shows the following scenario:

  1. The client workstation prepares to request an IP address from the DHCP server. The DHCP client is configured with Group Policy to be NAP-enabled. As a result, it will contact the NAP client service. Note that, in Figure 1, the DHCP client, NAP client, and SHA are all components that are running on the client workstation.
  2. The NAP client service requests a Statement of Health (SoH) from each registered SHA—possibly, including those provided by Microsoft, as well as any third-party plug-ins, such as the Registry SHA that is discussed in this article.
  3. The SoH request is returned by each SHA and gathered by the NAP client.
  4. The collective client SoH request data is returned by the NAP client to the DHCP client.
  5. Now, the workstation DHCP client makes its request for an IP address lease from the network DHCP server. The client SoH is included in this request.
  6. Upon receipt of the IP lease request, the DHCP server must consult a Network Policy Server (NPS) to evaluate the client SoH. The NPS may be running on a separate server, but not necessarily.
  7. For each SHA that is running on the client, a corresponding SHV is expected to be present on the NPS. Thus, NAP's extensibility consists of SHA/SHV pairs. For example, if the demo Registry SHA were installed on the client workstation, its SoH would now be passed to the Registry SHV to be evaluated.
  8. In this scenario, suppose that the SoH information that is provided by the client does not meet the policy requirements that are configured on the NPS.
    For example, in the sample Registry SHA/SHV case, the SoH may have shown a required security-related registry key to be missing.
    In the not-healthy case, an SHV optionally can return remediation instructions to the client. The sample Registry SHV implements this feature, which allows the security posture of the client workstation to be corrected automatically without direct user intervention.
    This "auto-remediation" is an important feature of NAP; it reduces the administrative overhead of providing client connectivity, while maintaining network security.
  9. The SoH response passes back through each component in the logical chain (shown in Figure 1), eventually making its way to each corresponding SHA on the client workstation.
  10. If the SoH response indicates that the client is unhealthy, any of the SHAs that are configured to handle auto-remediation can now act upon the fix instructions that are provided by the SHV. Of course, in general, remediation could succeed or fail for any number of reasons. Either way, if remediation was requested, the result of that operation is returned to the NAP client, and then back to the NPS and SHV.
  11. Only after the client is determined to be healthy is the request for an IP address satisfied.
  12. Some time subsequent to the completion of DHCP, the security status of the client may change.
    For example, in the case of the Registry SHA, a piece of client software could make a registry change to one of the keys that are being monitored. In response, the SHA prepares a new SoH request and notifies the NAP client.
    As an aside, there are conceivable scenarios in which it doesn't make sense for an SHA to monitor client health changes dynamically. For example, suppose that an SHA/SHV pair checks on the processor architecture of the client machine. That information is unlikely to change, at least without a reboot.
  13. In response to the updated SoH request, if the client is deemed unhealthy and remediation fails, the DHCP server terminates the client IP lease.
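To make the round trip in steps 2 through 11 more concrete, here is a minimal, portable sketch of the logic. It is not the NAP API and not the sample code: ValueMap, BuildSoH, Validate, and RequestLease are hypothetical names, the client registry and the server policy are modeled as plain in-memory maps, and the sketch assumes that the SHA monitors exactly the identifiers that the policy names.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins: the client "registry" and the server policy are
// both plain maps from a value identifier to a DWORD.
typedef std::map<std::string, std::uint32_t> ValueMap;

enum LeaseResult { LeaseGranted, LeaseDenied };

// Steps 2-4: the SHA reports the current value of each monitored identifier.
// Assumption: the SHA monitors the same identifiers that the policy names.
ValueMap BuildSoH(const ValueMap& clientRegistry, const ValueMap& policy) {
    ValueMap soh;
    for (ValueMap::const_iterator p = policy.begin(); p != policy.end(); ++p) {
        ValueMap::const_iterator c = clientRegistry.find(p->first);
        if (c != clientRegistry.end()) {
            soh[p->first] = c->second;
        }
    }
    return soh;
}

// Steps 6-8: the SHV compares the SoH with its policy and returns the
// expected value for every missing or mismatched entry (empty = healthy).
ValueMap Validate(const ValueMap& soh, const ValueMap& policy) {
    ValueMap fixes;
    for (ValueMap::const_iterator p = policy.begin(); p != policy.end(); ++p) {
        ValueMap::const_iterator s = soh.find(p->first);
        if (s == soh.end() || s->second != p->second) {
            fixes[p->first] = p->second;
        }
    }
    return fixes;
}

// Steps 5-11: one lease attempt, with optional auto-remediation.
LeaseResult RequestLease(ValueMap& clientRegistry, const ValueMap& policy,
                         bool autoRemediate) {
    ValueMap fixes = Validate(BuildSoH(clientRegistry, policy), policy);
    if (fixes.empty()) return LeaseGranted;
    if (!autoRemediate) return LeaseDenied;
    // Step 10: the SHA applies the fixes, and the health check is repeated.
    for (ValueMap::const_iterator f = fixes.begin(); f != fixes.end(); ++f) {
        clientRegistry[f->first] = f->second;
    }
    fixes = Validate(BuildSoH(clientRegistry, policy), policy);
    return fixes.empty() ? LeaseGranted : LeaseDenied;
}
```

In this model, a client with an empty registry is denied a lease when auto-remediation is off, and is granted one (with the missing value written) when it is on. The real components exchange these messages across process and machine boundaries; the sketch collapses that into direct function calls.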

Architecture Diagram

This section presents the architectural details of the implementation of the NAP SHA and SHV plug-ins. Figure 2 illustrates the NAP architecture.

Figure 2. NAP-architecture scenario

Figure 2 depicts the following scenario:

  1. The Windows Vista client has been configured to use NAP-enabled DHCP. As a result, when the workstation needs to request an IP address, the DHCP client first requests a SoH from the NAP client.
  2. System Health Agents are registered with NAP on the client workstation.
    See the sample SHA COM registration logic in sha\dll\sdkshamodule.cpp!CSdkShaModule::RegisterSdkSha in the accompanying sample code.
    The NAP client consults each registered SHA for its SoH. See sha\exe\callback.cpp!ShaCallback::GetSoHRequest in the sample code.
  3. The sample Registry SHA queries the local registry keys and values, per its configuration. (See sha\exe\callback.cpp!ShaCallback::FillSoHRequest).
  4. As soon as the NAP client has prepared a SoH, the DHCP client sends its request to the DHCP server.
  5. The DHCP server sends the client SoH to the NPS.
    The NPS in Figure 2 is shown on the same server as the DHCP service, but they can be separate.
    The NPS consults the SHV plug-ins that correspond to each SHA on the client. Based on the data that is reported by each SHA, the SHVs determine if the client is healthy or otherwise. (See shv\sampleshv.cpp!CSampleShv::CheckRequestSoHHealth in the sample code.)

Handling of auto-remediation is not shown in Figure 2. See shv\sampleshv.cpp!CSampleShv::FillResponseSoH for the sample server-side logic. See sha\exe\callback.cpp!ShaCallback::ProcessSoHResponse and ::DoPatch for the sample client-side logic.

Solution Implementation

In this section, I'll highlight some of the important implementation details of the sample code that accompanies this article. The code is based on an existing sample implementation in the Windows Vista version of the Windows SDK. For reference, the existing sample can be found in the Samples\NetDs\NAP subdirectory of a default SDK installation.

The Registry SHA/SHV sample that is described herein builds upon the original SDK sample in a number of important ways. These include:

  • Auto-remediation.
  • Server-side configuration user interface.
  • Build and test with run-time security enhancements.

I'll discuss each in the following sections.

Auto-Remediation

As mentioned previously, auto-remediation is an important feature of NAP, wherein unhealthy clients can automatically be brought into compliance by the underlying plug-in implementation without requiring error-prone end-user intervention.

For example, suppose that the Registry SHA and SHV have been configured to monitor the fictitious registry value EnableSecureMode that is located at HKEY_LOCAL_MACHINE\Software\Contoso. The system administrators want to enforce that only clients that have that value present and set to 1 (enabled) are able to acquire an IP address with DHCP.

What happens in the code when a client that lacks that registry value attempts to obtain an IP address? In order to answer that question, there are a few abstraction layers to be aware of in the sample SHA. The sample SHA implements the INapSystemHealthAgentCallback interface (see napsystemhealthagent.h in the Windows SDK) via its ShaCallback class (see sha\exe\callback.cpp). Also, to simplify interacting with the registry, and to allow that code to be shared between the SHA and SHV, a helper class that is named CRegistryKeyValue is used (see sdkcommon\src\sdkcommon.cpp).

When the sample SHA's host service first starts (see sha\exe\shasvc.cpp), it immediately reads its current registry policy values (using ShaCallback::Init). It then registers for change notifications for each target registry key (by using ShaCallback::ListenRegChange).
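The sample relies on registry change notifications for this step. As a simplified, portable stand-in for that mechanism (a different technique, named plainly: snapshot comparison rather than notification), the following sketch shows the decision that the SHA ultimately has to make: has any monitored value changed since the last SoH was produced? The Snapshot type and the ChangeDetector class are hypothetical and do not appear in the sample code.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical snapshot of the monitored values, keyed by an identifier
// such as "Contoso\\EnableSecureMode". A missing entry means "not present".
typedef std::map<std::string, std::uint32_t> Snapshot;

// Remembers the values that were last reported and flags any difference.
class ChangeDetector {
public:
    explicit ChangeDetector(const Snapshot& initial) : last_(initial) {}

    // Returns true (and remembers the new snapshot) when anything differs,
    // which is the point at which a new SoH would be prepared.
    bool Update(const Snapshot& current) {
        if (current == last_) {
            return false;
        }
        last_ = current;
        return true;
    }

private:
    Snapshot last_;
};
```

A notification-based design avoids the cost of repeatedly reading the registry, which is presumably why the sample uses it; the comparison above is only meant to show what counts as a change.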

Subsequently, when the NAP client queries the registry SHA for a SoH request, the target registry values are serialized into a byte array that can be interpreted by the SHV. On the client, the following call stack shows how this serialization is accomplished:

CRegistryKeyValue::Serialize
ShaCallback::Poll
ShaCallback::FillSoHRequest
ShaCallback::GetSoHRequest

The SHV uses a similar architecture, wherein the helper CSampleShv class (see shv\sampleshv.cpp) implements the INapSystemHealthValidator interface (from the public header, napsystemhealthvalidator.h). After the SHV receives the SoH request and the embedded serialized registry-setting information through CSampleShv::Validate, it deserializes the registry data and compares each value against the expected setting.

However, as a performance enhancement, the bulk of the sample SHV validate implementation is asynchronous. This allows the host NPS to achieve greater throughput across multiple SHV plug-ins. The following call stack shows how this check is accomplished:

CRegistryKeyValue::ExtractData (followed by ::MatchedKey)
CSampleShv::CheckRequestSoHHealth
CSampleShv::HandleRequestSoH
CSampleShv::QShvRespondSHVHost
CSampleShv::AsyncThreadHandlerMain
CSampleShv::AsyncThreadHandler

In this example, because the client workstation is lacking a required registry value, the preceding call to CheckRequestSoHHealth in the SHV results in a status of QUAR_E_NOTPATCHED. The CheckRequestSoHHealth function also serializes the expected values corresponding to the client registry locations that were determined to be noncompliant. This serialized output can be thought of as the remediation instructions for the client.

The resulting status is then returned by HandleRequestSoH (refer again to the preceding call stack), at which point QShvRespondSHVHost calls ::HandleResponseSoH, which in turn calls ::FillResponseSoH to attach the serialized remediation instructions to the SoH response.

Next, the SoH response is returned by NAP to the registry SHA on the client. In this example, the SHA must notify the NAP client that, per instructions that are received from the SHV, remediation actions are now pending. As a result, ProcessSoHResponse calls HandleSoHResponse, which sets a return status of FIXESINPROGRESS.

As an aside, note that an SHA implementation could also determine a noncompliant status in which auto-remediation would not be performed (that is, something requiring direct user intervention to fix). In that case, FIXESNEEDED would be returned, instead.
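The choice between the two statuses can be summarized in a few lines. This sketch uses a hypothetical enum and summary structure rather than the real NAP status codes and SoH types, and it assumes that the SHA knows whether auto-remediation is enabled and how many fix entries the SHV attached.

```cpp
#include <cstddef>

// Hypothetical status values that mirror the names used in the text.
enum ShaStatus { StatusHealthy, StatusFixesInProgress, StatusFixesNeeded };

// Hypothetical summary of what the SHA learned from the SoH response.
struct SohResponseSummary {
    bool compliant;        // true if the SHV found nothing wrong
    std::size_t fixCount;  // remediation entries attached by the SHV
};

// Decides what the SHA reports back to the NAP client.
ShaStatus ClassifyResponse(const SohResponseSummary& response,
                           bool autoRemediationEnabled) {
    if (response.compliant) {
        return StatusHealthy;
    }
    if (autoRemediationEnabled && response.fixCount > 0) {
        return StatusFixesInProgress;  // the SHA will apply the fixes itself
    }
    return StatusFixesNeeded;  // the user or an administrator must act
}
```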

The SHA remediation interface is driven by ::GetFixupInfo, which is the next interface call that is made by the NAP client. Because health-remediation patches are to be applied in this example, GetFixupInfo calls ShaCallback::DoPatch, which will validate the registry information that is returned by the SHV and call CRegistryKeyValue::SetTargetValue to make the mandated correction to the example EnableSecureMode value under HKEY_LOCAL_MACHINE\Software\Contoso. Thus, the client system is patched and may now obtain an IP address!

Finally, it's worth noting that the NAP client interface allows helpful messages to be displayed to the user via systray balloon (a pop-up window in the notification area). As an example, see how the FixupInfo structure (from the public naptypes.h) is populated, following successful remediation in GetFixupInfo.

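FixupInfo * pStatus;
...
// Report a completed, successful remediation to the NAP client.
pStatus->fixupMsgId = MSG_ID_FIXUP_SUCCESS;  // resource string shown in the balloon
pStatus->percentage = 100;                   // remediation is fully applied
pStatus->state = fixupStateSuccess;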

That particular code block informs the NAP client that the client was successfully patched and causes the resource string that corresponds to MSG_ID_FIXUP_SUCCESS to be shown briefly in a notification balloon, so that the user knows what's going on behind the scenes. Other SHA implementations should take advantage of this mechanism in order to keep the user informed of important state changes (including failure cases, such as an unhealthy client or a failed patch).
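To illustrate how the failure and in-progress cases might be reported in the same style, here is a sketch that uses a hypothetical structure with the same three fields as the snippet above. SketchFixupInfo, the SketchState and SketchMsg values, and DescribeFixup are all assumptions made for illustration; they are not the definitions from naptypes.h or the message IDs from the sample.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the state and message identifiers.
enum SketchFixupState { SketchStateSuccess, SketchStateInProgress, SketchStateFailed };
enum SketchMessageId { SketchMsgSuccess = 1, SketchMsgInProgress = 2, SketchMsgFailure = 3 };

// Hypothetical structure with the same three fields that the snippet sets.
struct SketchFixupInfo {
    SketchFixupState state;
    std::uint8_t percentage;
    SketchMessageId fixupMsgId;
};

// Builds the status to report after 'applied' of 'total' fixes were made.
SketchFixupInfo DescribeFixup(unsigned applied, unsigned total, bool failed) {
    if (applied > total) {
        applied = total;  // clamp so that the percentage never exceeds 100
    }
    SketchFixupInfo info;
    info.percentage = static_cast<std::uint8_t>(
        total == 0 ? 100 : (applied * 100u) / total);
    if (failed) {
        info.state = SketchStateFailed;
        info.fixupMsgId = SketchMsgFailure;  // tell the user that patching failed
    } else if (applied == total) {
        info.state = SketchStateSuccess;
        info.fixupMsgId = SketchMsgSuccess;
    } else {
        info.state = SketchStateInProgress;
        info.fixupMsgId = SketchMsgInProgress;
    }
    return info;
}
```

Keeping a distinct message for the failure case matters most here, because that is when the user is otherwise left without network access and without an explanation.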

Server-Side Configuration User Interface

The preceding auto-remediation walkthrough section mentions that the CSampleShv class implements the NAP INapSystemHealthValidator interface. Observe in shv\sampleshv.h that the CSampleShv class also implements the INapComponentConfig interface.

The INapComponentConfig interface is an important consideration for SHV implementers, because it exposes a mechanism for the system administrator to configure an SHV by using whatever user interface is appropriate, while still maintaining a common management environment across all SHVs.

The sample Registry SHV includes an implementation of that interface (see shv\sampleshv.cpp!CSampleShv::InvokeUI). To see this interface at work, perform the following steps:

  1. Configure a Windows Server 2008 (Beta) machine with the Network Policy and Access Services role. See the Resources section for more information about this step.
  2. Copy RegistrySHV.dll to the System32 directory.
  3. Run regsvr32.exe RegistrySHV.dll
  4. Click Start, click Run, type nps.msc, and then click OK.
  5. In the console tree (left pane), expand Network Access Protection, and then expand System Health Validators.
  6. In the details pane (right pane), right-click Sample Registry SHV, and then click Properties.
  7. Press the Configure button.

Next, you'll see the following dialog box, which is created by the Registry SHV:

Figure 3. Configure Registry Key dialog box

Implementation Security

By virtue of their architecture and role, the various components of NAP are exposed to data from the network, potentially including data that is received over an unencrypted channel across the Internet. This observation simply reinforces the point that all code, particularly system-level code that is written in Visual C++ and exposed to untrusted network data, must be reviewed and made as robust as possible.

To this end, the sample Registry SHA and SHV have been configured and tested with compile-, link-, and run-time protections in place. More detail about these options is available online and in Writing Secure Code for Windows Vista (Howard and LeBlanc).

Operational Security Considerations

When implementing and/or deploying NAP plug-ins, it's important to be aware of two security considerations. These should not be construed as design flaws, but instead as limitations of the environment in which NAP must operate.

Remediation

NAP plug-ins that implement auto-remediation should take into account the fact that not all NAP scenarios can guarantee a trusted path between the NAP client and the NPS.

DHCP is an example of a NAP scenario in which remediation data is not protected by a cryptographic integrity check. As a result, it's possible in that scenario that an attacker could modify the SoH response data en route from the SHV, and that the SHA would blindly apply changes that could actually compromise the security of the client.

There are two potential mitigations to this situation. The first is that SHAs should be implemented in such a way that, if they can auto-remediate, they include logic to determine if a requested setting is worse than what's already present. Of course, that's not always possible (the sample Registry SHA is a good example of how hard that logic can be to implement in a general way). Thus, the second potential mitigation is simply not to deploy auto-remediation plug-ins in NAP scenarios that don't guarantee a trusted path.
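One way to approximate the first mitigation for a registry-based SHA is a client-side allow-list: the SHA applies a requested fix only if the location and the value both appear in a table that it already trusts. The sketch below is an assumption-laden illustration, not part of the sample code. FixList, AllowList, IsFixAllowed, and FilterFixes are hypothetical names, and the sketch assumes that the allow-list reaches the client over a trusted path (for example, at installation time).

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical types: fixes requested by the SHV, and the values that the
// client is willing to write for each identifier.
typedef std::map<std::string, std::uint32_t> FixList;
typedef std::map<std::string, std::set<std::uint32_t> > AllowList;

// A fix is allowed only if both the identifier and the value are listed.
bool IsFixAllowed(const AllowList& allowed, const std::string& id,
                  std::uint32_t value) {
    AllowList::const_iterator it = allowed.find(id);
    return it != allowed.end() && it->second.count(value) != 0;
}

// Returns only the requested fixes that pass the allow-list; the rest are
// dropped, so a tampered SoH response cannot write arbitrary settings.
FixList FilterFixes(const FixList& requested, const AllowList& allowed) {
    FixList safe;
    for (FixList::const_iterator f = requested.begin(); f != requested.end(); ++f) {
        if (IsFixAllowed(allowed, f->first, f->second)) {
            safe.insert(*f);
        }
    }
    return safe;
}
```

The trade-off is that the allow-list reintroduces some of the client-side configuration that the generic plug-in was meant to avoid, so it fits best when the set of enforceable settings changes rarely.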

NAP scenarios that do implement a secure channel include IPsec, 802.1X, and VPN.

Client Trust

The second security consideration of the NAP operational environment is that the client is implicitly trusted by NPS.

To take an example, suppose that the Registry SHV returns remediation instructions to set the fictitious EnableSecureMode registry value that was described earlier. Suppose further that the client workstation is one on which the end user has local administrator rights. In this example, the end user has installed a custom SHA that looks and acts just like the sample Registry SHA, except that it silently discards remediation changes. As a result, the NAP client reports that patching has occurred, and the client is deemed healthy, even though it's actually not. The NPS has no way to determine otherwise.

The usual security mantra applies here: Only give administrator access to trustworthy users, and only to the machines where the users truly need it. However, some NAP scenarios, such as VPN, are inherently mobile-user-oriented, in which case the client is likely a laptop. Historically, mobile laptop users have required administrator access; otherwise, they risk not being able to make certain unforeseen configuration changes while they travel. Thus, those users have the level of access that is required to make the kinds of changes that could compromise the NAP client.

Conclusion

Returning to the business problem with which I started, the Registry SHA and SHV address the collective LOB need: For any configuration settings that are accessible via the Windows system registry, network administrators can exercise their prerogative to enforce those policies by using NAP. And they can do so without having to bother the programmers!

Resources

NAP Resources

Previous Articles in This Series

Other Windows Vista Security Code Samples

About the Author

Dan Griffin is a software-security consultant in Seattle, WA. He previously spent seven years at Microsoft on the Windows Security development team. You can contact Dan at https://www.jwsecure.com.