XBAP deployment failures and collecting diagnostic information

[Update 6/12/08] A tool that repairs the most common permission problems that cause XBAP deployment to fail has been released: https://www.microsoft.com/downloads/details.aspx?FamilyID=adb47247-4e27-4490-a153-39d8334172d9. The Watson service will shortly start offering it to users of affected computers.

We occasionally hear from users (mostly developers) about XBAPs inexplicably failing to start on particular computers. This is troubling when everything is installed correctly on the system and the XBAP itself is known to be good. Many pieces are involved in XBAP deployment and activation, some of them external to WPF, and only one has to be slightly out of whack for the whole process to fail. Unfortunately, there isn't just one known cause, and the failure symptoms often make it hard, or outright impossible for end users, to figure out what exactly is broken. As a core infrastructure feature, XBAP deployment should be highly reliable. That's why we are trying to identify the most common causes and then implement solutions in the next version or service pack of the framework. To this end, we need feedback from "the real world", where computer configurations vary greatly and OS and software installations are anything but fresh or done in a particular order.

First, whenever you get a Dr. Watson dialog saying "Windows Presentation Foundation Host has encountered a problem and needs to close", please send the report. (PresentationHost.exe is the process that hosts an XBAP. Even though the application visually runs within the browser's window, it's loaded in a separate process for better isolation and reliability.) This is the best way for us to learn about deployment failures in the field, measure their frequency, and possibly request additional system information to be sent automatically with subsequent reports of a particular failure. If the hosted application (the XBAP) crashes, the managed exception should be caught and shown formatted in an HTML page, in place of the application's content. A Watson dialog therefore almost always indicates a platform bug or a critical system configuration problem.

Because currently all managed exceptions are caught and shown in the XBAP error page, we are preventing Watson from reporting failures due to bugs in the managed part of the framework. We'll likely add a mechanism in a future release to enable sending a report of any managed exception. This will help both us and XBAP vendors to improve reliability. Also, in SP1 (and .NET v3.5), a number of possible deployment error conditions and the majority of previously silent failures will trigger a Watson report.

For now, you can just look at the exception call stack shown in the error page and try a keyword search using symbols from the top few frames. Solutions for some common causes can easily be found this way.

In particular, errors from ClickOnce occur frequently. You can recognize them by the assembly name and namespace, System.Deployment. Here's a sample:

[Image: sample XBAP deployment error from ClickOnce]

Certain ClickOnce errors tend to occur mostly or only on developers' computers, where multiple builds of an application are deployed, in particular from different locations (URLs) and without incrementing the version number. In that case the ClickOnce application store (cache) may be corrupt. To clear it, run mage -cc from an SDK command window, and/or delete the store folder directly:

  • x:\Documents and Settings\username\Local Settings\Apps - on Windows XP
  • x:\Users\username\AppData\Local\Apps - on Windows Vista.
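As a rough illustration of the manual cleanup, here is a minimal Python sketch that builds the store path for a given profile folder and optionally deletes it. The function names and the dry_run parameter are hypothetical, made up for this example. The only facts taken from above are the two folder locations. The sketch assumes Windows XP reports major version 5 and Windows Vista reports 6.

```python
import ntpath
import os
import shutil


def clickonce_store_path(profile_dir, windows_major_version):
    """Return the per-user ClickOnce store folder for a profile directory.

    Assumption: major version 5 means Windows XP, 6 or higher means Vista or later.
    """
    if windows_major_version >= 6:
        # Windows Vista layout: <profile>\AppData\Local\Apps
        return ntpath.join(profile_dir, "AppData", "Local", "Apps")
    # Windows XP layout: <profile>\Local Settings\Apps
    return ntpath.join(profile_dir, "Local Settings", "Apps")


def clear_clickonce_store(store_path, dry_run=True):
    """Delete the store folder if it exists.

    Returns True if the folder was found (and removed unless dry_run is set),
    False if there was nothing to delete.
    """
    if not os.path.isdir(store_path):
        return False
    if not dry_run:
        shutil.rmtree(store_path)
    return True
```

On a real machine, the profile folder would typically come from the USERPROFILE environment variable and the version from sys.getwindowsversion().major. Keeping dry_run on by default is a deliberate choice, since deleting the store removes every cached ClickOnce application for that user.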

If this doesn't help, you may have a case of a bad framework installation or bad system configuration...

One group of known issues involves NT permissions on certain system objects. The most frequent case seems to be the HKCU\Software\Classes registry key. The problem, an experimental workaround, and a manual fix are discussed in this Forum post. The gist is that the individual user's account has to be explicitly given Full Control permission. Getting this permission via membership in the Administrators group is not sufficient, because PresentationHost runs with a restricted process token. Similar, mostly isolated, problems have been observed with these objects as well:

  • "x:\Documents and Settings\username\Local Settings\Application Data" folder and/or Deployment subfolder
  • "x:\Documents and Settings\username\Local Settings\Apps" folder. This is the main storage for the ClickOnce application cache. mage -cc clears it, but that may not resolve NT permission problems.
  • HKCR\Interface\{79EAC9C9-BAF9-11CE-8C82-00AA004BA90B} key

PresentationHost.exe released with .NET 3.0 SP1 and .NET v3.5 will detect these specific configuration problems and trigger Dr. Watson. If the user chooses to send a report, we'll present a response page that leads to a patch utility. Unfortunately, for a number of reasons it is not possible to automatically resolve each of these possible permission problems at the time the framework is installed, so we have to rely on this indirect method.

Collecting diagnostic information 

If you have a case of XBAP deployment failure where the XBAP itself is known to be good (it runs on another computer, other than the one on which it was built), and you are not seeing the ACL/permission anomaly on any of the file/registry objects listed above, the failure may be due to a yet-unknown cause. Providing us with specific diagnostic information can help us figure out the cause and possibly address it in a future release. Besides the Windows, .NET Framework, and browser version information, two types of logs are usually sufficient.

I. From Process Monitor. Set up two filters: 'Process name is PresentationHost.exe' and 'Process name is iexplore.exe'. (Keep the default filters; they eliminate a lot of noise.) For Firefox, the process name is Firefox.exe. (Note that only .NET v3.5 and later will support XBAPs in Firefox.)
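If you want a quick look at the capture yourself before sending it, the same process-name filter can be applied to a log saved from Process Monitor as CSV. The following Python sketch is only an illustration. It assumes the export has the default "Process Name", "Operation", "Path", and "Result" column headers and that permission failures appear with the result text ACCESS DENIED. The function name and the process set are made up for this example.

```python
import csv

# Process names from the filters above, lowercased for comparison.
HOST_PROCESSES = {"presentationhost.exe", "iexplore.exe", "firefox.exe"}


def denied_events(csv_lines, processes=HOST_PROCESSES):
    """Yield (process, operation, path) for ACCESS DENIED rows.

    csv_lines is any iterable of text lines, such as an open file.
    Assumption: the column headers match a default Process Monitor CSV export.
    """
    for row in csv.DictReader(csv_lines):
        name = row.get("Process Name") or ""
        result = (row.get("Result") or "").strip().upper()
        if name.lower() in processes and result == "ACCESS DENIED":
            yield name, row.get("Operation") or "", row.get("Path") or ""
```

When reading a real export, open the file with encoding="utf-8-sig" so that a leading byte-order mark, if present, does not become part of the first column name. Rows that show up here for the registry keys or folders listed earlier would point to the known permission problems.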

[Update, 10/10] It really helps if Process Monitor captures stack traces for logged events. To enable this, ProcMon needs dbghelp.dll v6 or newer. The file path is given to it in the Configure Symbols dialog, from the Options menu. You can get the right dbghelp.dll in at least two ways:

  • It's part of Visual Studio: "c:\Program Files\Microsoft Visual Studio 8\Common7\IDE\dbghelp.dll"
  • You can download Debugging Tools for Windows.

II. ETW traces. Both IExplore (v7) and PresentationHost are instrumented to trace significant events (when tracing is enabled). Unfortunately, the Windows SDK tools for trace collection and manipulation are not very user-friendly. For collection alone, the WPFPerf tool shipped with the .NET Framework SDK suffices.

1. From an SDK command window, start wpfperf.exe. (The tool is documented here: https://msdn2.microsoft.com/en-us/library/aa969767.aspx) On Windows Vista, the tool has to be run elevated.

2. If that’s the first time you are starting it, the Add Tool dialog will pop up. (If not, get it from the File menu.) Select the Event Trace tool. (This is a consumer/logger for ETW events. WpfPerf combines a bunch of different tools in one shell.)

3. In the middle-right part of the Event Trace window, click Add.

a. In Add New Logger, enter: Logger name = PH [just a name], GUID = “WPF – All” [from the drop-down list], some file path with .etl extension. Click Add.

b. While still in Add New Logger, change GUID to “NT Kernel Logger”. Enter a different filename (.etl). Click Add.

c. If you are using IE 7, also add a logger for it with these parameters:
Logger name: IE7
GUID: 797FABAC-7B58-4796-B924-D51178A59CE4
Log file: c:\temp\IE.etl
Level: 5
Flags: 4294967295

d. Click Done.

4. Select each logger in the list box and click Start.

5. In Internet Explorer, enter the URL of an XBAP in the address bar. Wait until the error occurs or the browser just goes idle without any response. (The navigation globe/wheel may keep spinning on error, though. This happens when the navigation sequence is aborted abnormally in the middle.)

6. Go back to WPFPerf and stop the loggers.

(Since you should be doing the Process Monitor capturing in parallel, stop/pause it at this time as well.)

7. To make sure events were captured, look at the .etl file sizes. They should be several hundred KB each. The IE one may be smaller.

8. Compress the resulting .etl files, and put the archive on a web server from which we can copy it.
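The last two steps (checking the file sizes and compressing the logs) can also be scripted. Below is a minimal Python sketch using only the standard library. The function names are hypothetical, and the 100 KB threshold is an assumption standing in for "several hundred KB". Since the IE log may legitimately be smaller, a flagged file is a hint to double-check, not proof that the capture failed.

```python
import os
import zipfile

# Assumption: anything under 100 KB is worth a second look.
MIN_EXPECTED_BYTES = 100 * 1024


def small_logs(etl_paths, min_bytes=MIN_EXPECTED_BYTES):
    """Return the paths whose file size is below min_bytes."""
    return [path for path in etl_paths if os.path.getsize(path) < min_bytes]


def zip_logs(etl_paths, archive_path):
    """Write the given log files into one compressed zip archive.

    Files are stored under their base names. Returns archive_path.
    """
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in etl_paths:
            archive.write(path, arcname=os.path.basename(path))
    return archive_path
```

A typical use would be to call small_logs on the .etl files you named in the Add New Logger dialog, repeat the capture if the WPF or kernel log is flagged, and then call zip_logs to produce the archive to upload.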

Notes:

· It’s best to act quickly during logging so that we don’t accumulate too many kernel events. Start logging just when you are ready to navigate to the XBAP in the browser.

· The Open Logs button in the tool probably won’t show you anything from the WPF trace file. There is a bug in the first release that prevents the events generated by the hosting code (PresentationHost) from being shown. But they are in the .etl file. And you should be able to display the kernel events, although not in a very digestible form.

 

When done with the log collection, please send a description of your problem, the symptoms, and your system information, along with a link to the logs, to xbap-sos at micrsoft . com [Microsoft intentionally misspelled here]. A team member will try to help. Please understand that we cannot guarantee a response in every case. For general help with XBAPs, please post to the WPF Forum.