Monitoring Applications

Article
06/17/2011

This section describes how to use the AppFabric Dashboard to monitor the health and lifetime of your .NET Framework applications that include WCF and/or WF services. The AppFabric Dashboard is the primary place within Windows Server AppFabric to monitor and troubleshoot .NET Framework version 4 services. Data presented within the AppFabric Dashboard provides you with both live (real-time) and historical metrics of your services. The live metrics provide current information about your durable workflows and allow their states to be controlled. The dashboard’s historical metrics provide visibility into the health of your services over a specific time period.

The AppFabric Dashboard presents the following information to help you more completely comprehend the state of the .NET Framework services managed by AppFabric:

The AppFabric Dashboard tracks instances of durable workflows and presents status data about how many are running (active or idle) or suspended. The AppFabric Dashboard also provides drill-down capabilities to observe individual persisted WF instances and enables you to issue your own commands on persisted workflows to control their execution.
WF services are tracked at various levels of verbosity by AppFabric by storing the events they emit during normal execution. The AppFabric Dashboard provides visibility into the historic health of WF services that have monitoring enabled at the level of Health Monitoring or higher.
All WCF and WF services in .NET Framework use WCF for communication with clients and other services. The AppFabric Dashboard monitors and displays cumulative totals of successes and exceptions as a result of received WCF calls. It also displays information about failed or faulted calls associated with service exceptions.

The AppFabric Dashboard provides metrics for the services deployed to the local AppFabric server as well as on any remote AppFabric servers in a server farm that are configured to use the same persistence and monitoring stores. AppFabric allows you to filter metrics by local server or all servers. You can also use the AppFabric Dashboard to adjust the time range for displayed data in hours, days, weeks, or the entire time history of the monitoring store for the selected servers.

AppFabric Dashboard Sections

The AppFabric Dashboard is divided into three primary sections: Persisted WF Instances, WCF Call History, and WF Instance History. Each section provides a specific function, and the sections can be combined logically to present a more detailed picture of the lifetime of a service or of a problem that has occurred. Live or historical data are presented in the various sections. You can collapse or expand a section by clicking the corresponding up and down arrows on its top-right side.

Each section displays its own set of summary metrics. For example, in the WCF Call History section, there are three summary call metrics: Completed, Errors, and Throttle Hits. Clicking a summary value takes you to the drill-down page that relates to that section. In the WCF Call History section, clicking any of the three summary metrics takes you to the Tracked Events page. While all three metrics take you to the same page, the data displayed will most likely differ because each metric maps to a specific query against the event data stored in the monitoring store. The following list describes each section and the relationships between its summary values and metrics pages.

Persisted WF Instances. This section displays a “live” summary of Active, Idle, or Suspended durable workflow service instances by reading the status of the persistence store at the time the AppFabric Dashboard is displayed. For a durable workflow that is additionally configured to use monitoring, its historic metrics are also reflected in the other two AppFabric Dashboard sections. Clicking any of these summary links takes you to the corresponding Persisted WF Instances Page. Expanding the Persisted WF Instances section displays the top five .NET Framework 4 workflow services with the most Active, Idle, or Suspended instances. For more information about how persisted workflows are monitored, see Real-Time Monitoring of Durable Workflows.

Note

AppFabric does not support persistence of WCF service instances. Only workflow (WF) service instances can use the AppFabric persistence feature.
WCF Call History. This section displays a summary of WCF call history for .NET Framework WCF and WF services that have monitoring enabled. It provides a summary of all WCF Completed Calls, Errors, and Throttle Hits within the time frame selected in the Time Period drop-down list. Clicking any of these summary links takes you to the corresponding Tracked Events Page with query result data specific to the originating category. Expanding the WCF Call History section displays the top five services with the most WCF Completed Calls and WCF Service Exceptions. It also provides a breakdown of Errors by Service Exceptions (mostly caused by Failed or Faulted Calls) and User Defined Errors. For more information, see Historical Monitoring Using WCF Call Metrics.
WF Instance History. This section displays a historical summary of Activations, Failures, and Completions for .NET Framework 4 workflow service instances with monitoring enabled within the timeframe selected in the Time Period drop-down list. Clicking any of these summary links takes you to the corresponding Tracked WF Instances Page with query result data specific to the originating category. Expanding the WF Instance History section displays the top five WF services with the highest number of Instance Activations and Instances with Failures. It also breaks down the number of instances with failures by outcome (recovered versus not recovered). For more information, see Historical Monitoring Using Workflow Metrics.

AppFabric Dashboard Metrics

The AppFabric Dashboard metrics are displayed for Windows Process Activation service (WAS)-hosted .NET Framework 4 WCF and WF services at the selected level, or “scope” in the IIS hierarchy. The different levels of scope are server, site, and application. Scope is determined by selecting a server, site, or application in the IIS hierarchy in the IIS Manager Connections pane (left pane). The collection of metrics displayed in the sections is the same for every scope. That means you see the same metric names at every scope, but the values change according to what is included in the scope. By changing the selected level in the IIS hierarchy, you can display metrics from instances of all services on the server or site, or you can display metrics related to only a selected application. The amount of monitoring data displayed at a specific view corresponds to values configured on the Monitoring tab within the Configure WCF and WF dialog box for that specific scope.

Note

If the Monitoring level for participating applications is set to Health Monitoring or higher, the amount of data shown by the AppFabric Dashboard does not change. However, changing the scope to include a different number of services with monitoring enabled will cause the metrics to change.

For more information about configuring scope and metrics, see Configure WCF and WF for Server, Site, or Application: Monitoring Tab and Configuration Dialog Box for a Server, Site, Application, and Virtual Directory.

Monitoring and Persistence Defaults

When a .NET Framework 4 service is installed into AppFabric, the following two monitoring defaults are configured automatically on its behalf. You can change their settings by using the Monitoring tab in the server, site, application, or service configuration dialog boxes. For more information, see Configure WCF and WF for Server, Site, or Application: Monitoring Tab and Configure Service: Monitoring Tab.

Monitoring Level. By default, monitoring is enabled for all services. The default level of monitoring is configured as Health Monitoring, which is the middle of five settings for Monitoring (Troubleshooting, End-To-End Monitoring, Health Monitoring, Errors Only, and Off). Health Monitoring is the best performance choice for everyday health monitoring of an application’s metrics. It is also the minimum requirement for all metrics on the dashboard to be utilized. These include tracking of message flow between services, WCF and WF events, and other events. It also includes errors from the less verbose Errors Only level to assist in simplified troubleshooting. If a problem occurs, you can increase the amount of monitoring data by enabling a more verbose level of monitoring, solve the problem, and then restore the monitoring level to the default Health Monitoring setting. For more information about monitoring levels and how to choose the most appropriate level for your monitoring requirements, see Configure Monitoring.

Monitoring data is collected by the Event Collection service and written to the default monitoring store by using the DefaultMonitoringConnectionString connection string. This data corresponds to what is displayed on the Tracked Events Page. On the Monitoring tab, Enable database event collection is by default enabled, and the Tracked Events page will display all data available from any configured monitoring stores. If you disable event collection, you will no longer see any new events going forward. However, if there are events tracked from the past still in the store, you will still see them on the Tracked Events page. To prevent viewing of these old events, you must manually remove the existing connection string from the configuration. For more information, see Configure the Event Collection Service.

When you configure a certain level of monitoring, the corresponding default tracking profile for that level is enabled. A tracking profile is a declarative definition of filters against event type and the desired information to be obtained from the workflow instance. You can also write custom tracking profiles if the default profiles don’t meet your monitoring requirements. For more information about tracking profiles and how to configure them, see Configure Tracking.
Diagnostic Message Logging and Tracing. Unlike Database Event Collection and Monitoring Level, Diagnostic Message Logging and Tracing is disabled by default. Instead of sending data to the monitoring store, this function sends it to a configurable file that can be viewed by the Service Trace Viewer utility. The Diagnostic Message Logging and Tracing setting has no effect on what is displayed in the dashboard. Rather, it is an additional mechanism beyond the AppFabric Dashboard to assist in troubleshooting by using .NET Framework tracing and logging mechanisms. For more information about configuring this feature, refer to Configure Diagnostic Tracing and Message Logging Dialog Box.

In addition to default monitoring capabilities, AppFabric also provides default persistence functionality. When a .NET Framework 4 WF service is installed into AppFabric, persistence is configured automatically on its behalf by default. Like the monitoring settings, you can change persistence configuration for a workflow by using the Persistence tab in the server, site, application, or service configuration dialog boxes. The workflow persistence data is written to the default persistence store by using the DefaultPersistenceConnectionString connection string. For more information, see Configure WCF and WF for Server, Site, Application, or Virtual Directory: Workflow Persistence Tab and Configure Service: Workflow Persistence Tab.

Monitoring and Persistence Stores and Dashboard Metrics

The AppFabric Dashboard metrics are obtained from both the monitoring and persistence stores. There can be more than one monitoring or persistence store in the current scope, depending on how persistence and monitoring are configured. If services are configured to use different stores, the dashboard shows the combined metrics for all the stores associated with the services in the current scope. Persisted WF Instance metrics are a summary of workflow state data from one or more persistence stores. Tracked WF instances and WCF Call History metrics are a summary of data from one or more monitoring stores.

Important

Under load, the staging table in the monitoring database can build a backlog of records to be processed by the SQL Agent jobs. This causes information displayed on the AppFabric Dashboard to be outdated by ten minutes or more. Also, if you restrict the timeframe to a more limited recent period (rather than the default 24 hours), you won't see any new transactions since these are also backlogged.

Persisted WF Instances. These metrics show the current status of persisted workflow instances from one or more persistence stores when the dashboard is invoked. For a workflow to have its information presented in this section, it must be designed to use persistence. Long-running workflows, or those that operate on sensitive or calculated critical data that needs to be preserved during the workflow’s lifetime, are more likely to use .NET Framework 4 persistence. Typically, workflows that do not use persistence run quickly, and preserving their state is not critical if the process exits unexpectedly.

Additionally, persistence must be configured for a WF service from within AppFabric to make use of the AppFabric persistence functionality. AppFabric provides the ability to host workflows written to be durable with persistence capabilities through its persistence store and management tools. For information about how to enable AppFabric persistence for a service, see Configuring Workflow Persistence.
WCF Call History. These metrics are a historical summary of the number of WCF completed calls, errors, and throttle hits from one or more monitoring stores for the services within the selected AppFabric Dashboard scope. To track this data, AppFabric event collection must be enabled for that service. For information about how to enable event collection for a service, see Configure the Event Collection Service.
WF Instance History. These metrics are a historical summary of tracked WF instances from one or more monitoring stores. Activations, Failures, and Completions for .NET Framework 4 workflow service instances are summarized. To track this data, an AppFabric monitoring level of Health Monitoring or higher must be enabled. For information about how to enable a monitoring level for a service, see Configuring Monitoring.

The following table summarizes when AppFabric Dashboard metrics are displayed for a WF service based upon its configuration.

.NET Framework Service Type	Persistence Configured	Valid Monitoring Level Configured	Persisted WF Instances Section Metrics (Persisted WF Instances Page)	WCF Call History Section Metrics (Tracked Events Page)	WF Instance History Section Metrics (Tracked WF Instances Page)
WF Service	NO	NO	NO	NO	NO
WF Service	YES	NO	YES	NO	NO
WF Service	YES	YES	YES	YES	YES
WF Service	NO	YES	NO	YES	YES

The following table summarizes when AppFabric Dashboard metrics are displayed for a pure WCF service (no workflow) based upon its configuration. Because AppFabric does not offer any support for persistence of WCF services, the only section that displays data for it is WCF Call History.

.NET Framework Service Type	Persistence Configured	Valid Monitoring Level Configured	Persisted WF Instances Section Metrics (Persisted WF Instances Page)	WCF Call History Section Metrics (Tracked Events Page)	WF Instance History Section Metrics (Tracked WF Instances Page)
Pure WCF Service	N/A	NO	NO	NO	NO
Pure WCF Service	N/A	YES	NO	YES	NO
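
The two tables above can be condensed into a small decision function. The following Python sketch is only an illustration of the table logic; the function name and parameter names are hypothetical and are not part of AppFabric.

```python
from typing import Dict


def visible_sections(is_wf_service: bool,
                     persistence_configured: bool,
                     monitoring_configured: bool) -> Dict[str, bool]:
    """Return which dashboard sections show metrics for a service.

    Encodes the two tables above. For a pure WCF service, persistence
    is not applicable, so persistence_configured is ignored.
    """
    return {
        # Only WF services can use AppFabric persistence.
        "Persisted WF Instances": is_wf_service and persistence_configured,
        # WCF call metrics need a valid monitoring level for any service type.
        "WCF Call History": monitoring_configured,
        # WF instance history needs a WF service with monitoring configured.
        "WF Instance History": is_wf_service and monitoring_configured,
    }


if __name__ == "__main__":
    # WF service with persistence but no valid monitoring level (table row 2).
    print(visible_sections(True, True, False))
    # Pure WCF service with a valid monitoring level.
    print(visible_sections(False, False, True))
```

Each row of the tables corresponds to one call of the function, which makes it easy to check a planned configuration against the sections you expect to see populated.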

WCF User-Defined Events

The .NET Framework 4 provides the capability to programmatically insert Windows Communication Foundation (WCF) user events into the ETW (Event Tracing for Windows) event stream provided by the .NET Framework. All user events are emitted and captured by default for applications configured to use at least the Health Monitoring level of monitoring. At the less verbose Errors Only level, only the WCF error user event will be emitted and captured. AppFabric collects these WCF user events and stores them in its Monitoring data store. User-defined event information can be displayed on two pages.

The AppFabric Dashboard page reflects the count of user events emitted at the Error severity level in a given time period in the Errors summary metric counter.
The Tracked Events page displays all user-defined events, error-related or not, when the Events field has the All WCF events option selected, or when no Events field is specified in the Query Control. The Query Builder also provides the WCF user-defined errors sub-option under the All WCF Errors option for the Events condition. When a user-defined error event is selected on the Tracked Events page, the error is displayed in the Errors tab in the Details pane.

For a sample that shows how to programmatically add user events to the ETW stream, refer to WCF Analytic Tracing (https://go.microsoft.com/fwlink/?LinkId=184956).
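
The capture rules for user events at each monitoring level can be modeled as follows. This Python sketch is a reading aid under stated assumptions, not AppFabric code: the enum and function names are hypothetical, the level order follows the five settings listed earlier in this topic, and the behavior at the Off level (nothing captured) is an assumption that the text does not state explicitly.

```python
from enum import IntEnum


class MonitoringLevel(IntEnum):
    """The five monitoring settings, ordered from least to most verbose."""
    OFF = 0
    ERRORS_ONLY = 1
    HEALTH_MONITORING = 2
    END_TO_END_MONITORING = 3
    TROUBLESHOOTING = 4


def user_event_captured(level: MonitoringLevel, is_error: bool) -> bool:
    """Model whether a WCF user event is emitted and captured."""
    if level >= MonitoringLevel.HEALTH_MONITORING:
        # Health Monitoring or higher: all user events are captured.
        return True
    if level == MonitoringLevel.ERRORS_ONLY:
        # Errors Only: only error user events are captured.
        return is_error
    # Assumption: with monitoring set to Off, no user events are captured.
    return False


if __name__ == "__main__":
    for level in MonitoringLevel:
        print(level.name,
              user_event_captured(level, is_error=True),
              user_event_captured(level, is_error=False))
```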

AppFabric Dashboard Support for Multiple Stores

The AppFabric Dashboard supports the display of data across multiple persistence and monitoring data stores. The AppFabric Dashboard assumes that persistence data for a given service resides in a single persistence store, and monitoring data for a given application should reside in a single monitoring data store. When using more than one monitoring or persistence store, old data should be removed from the original store when an application or service switches stores. In scenarios when the original store is still used by other applications or services in your environment, failing to do this could lead to unexpected or inconsistent results.

An example may help you understand the issue. Suppose applications 1 and 2, each of which includes WCF and/or WF services, are configured to use monitoring data store X. Application 1 is later reconfigured to use monitoring data store Y instead of X, and its old data remains in monitoring data store X. When viewing the AppFabric Dashboard at the application scope, the metrics for application 1 correctly display the data from its current store Y because only one monitoring store can be associated with an application. However, when viewing the AppFabric Dashboard at the server or site level, the counters include both the correct data for application 1 from its current store Y and the old data from its old store X.

In this example, the proper procedure is to clear out the data from application 1 in old monitoring store X when configuring your application to use new monitoring store Y. This ensures that the AppFabric Dashboard provides the correct information for application 1. You can perform this cleanup at the database level by using the appropriate database tools and methods.
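
The effect described in this example can be reproduced with a toy model. The Python sketch below is only an illustration of the counting behavior as described above; it does not reflect how AppFabric actually queries its stores, and all names and event counts are hypothetical.

```python
from typing import Dict

Stores = Dict[str, Dict[str, int]]  # store name -> application -> event count


def application_scope_count(stores: Stores,
                            current_store: Dict[str, str],
                            app: str) -> int:
    """Application scope reads only the store the application uses now."""
    return stores[current_store[app]].get(app, 0)


def server_scope_count(stores: Stores, current_store: Dict[str, str]) -> int:
    """Server scope combines everything in every store that is in use."""
    in_scope = set(current_store.values())
    return sum(sum(stores[name].values()) for name in in_scope)


def purge(stores: Stores, store_name: str, app: str) -> Stores:
    """Return a copy of the stores with one application's rows removed."""
    cleaned = {name: dict(apps) for name, apps in stores.items()}
    cleaned[store_name].pop(app, None)
    return cleaned


if __name__ == "__main__":
    # Application 1 moved from store X to store Y; 40 old rows remain in X.
    stores = {"X": {"app1": 40, "app2": 25}, "Y": {"app1": 10}}
    current = {"app1": "Y", "app2": "X"}

    print(application_scope_count(stores, current, "app1"))  # 10
    print(server_scope_count(stores, current))               # 75, includes stale rows
    print(server_scope_count(purge(stores, "X", "app1"), current))  # 35
```

In the model, the server-scope total drops from 75 to 35 once the stale rows for application 1 are removed from store X, which matches the sum of the current data for both applications.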

Case-Sensitive Queries

When the SQL Server monitoring database is configured to use binary collation, parameters for any queries using the AppFabric Dashboard Query Builder are case-sensitive. In that case, the exact case of the string must be specified for Computer Name, Site, and Virtual Path when providing query clauses in the AppFabric Dashboard Query Builder. To avoid the case-sensitivity issue, you can manually change the ASEventSourcesTable, which contains fields such as Computer, Site, VirtualPath, ApplicationVirtualPath, and ServiceVirtualPath, to use a case-insensitive collation.
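
The difference between the two behaviors can be approximated in a few lines. This Python sketch is a rough analogy only: real SQL Server collations apply more rules than a simple case fold, and the function names and the sample virtual path are hypothetical.

```python
def matches_binary(stored: str, query: str) -> bool:
    """Approximate a binary collation: strings must match exactly."""
    return stored == query


def matches_case_insensitive(stored: str, query: str) -> bool:
    """Approximate a case-insensitive collation by folding case first."""
    return stored.casefold() == query.casefold()


if __name__ == "__main__":
    stored_path = "/MyApp"  # hypothetical value of a VirtualPath field
    typed_path = "/myapp"   # what a user might type in the Query Builder

    print(matches_binary(stored_path, typed_path))            # False
    print(matches_case_insensitive(stored_path, typed_path))  # True
```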
