Real-Time Monitoring of Durable Workflows

The Persisted WF Instances section displays “live” metrics on persisted instances of durable workflow services. The metrics are taken at the time the Dashboard is invoked and represent the current status of durable workflow instances as stored in the persistence store. Unlike the metrics displayed in the other two sections (WCF Call History and WF Instance History), these metrics are not historical. The summary workflow instance metrics are classified as Active, Idle, or Suspended. Expanding the Persisted WF Instances section (by clicking the down arrow or the Persisted WF Instances section name) displays the top five services by number of Active or Idle instances and the top five services by number of Suspended instances. You can use these summary values, and the detailed views they link to, to do real-time monitoring of AppFabric durable workflow instances.

Persisted Workflow Instance Metrics

The Persisted WF Instances section on the Monitoring Dashboard provides a summary view of all persisted workflow instances (Active, Idle, or Suspended) that have not yet reached the Completed state. These summary metrics appear in the shaded header box that contains the Persisted WF Instances title. The following summary metrics highlight the key states or conditions of a persisted workflow:

  • Active. The Running (Active) state where a workflow is locked in memory.

  • Idle. The Running (Idle) state where a workflow is in memory and waiting for a message.

  • Suspended. Execution of the workflow was interrupted by an exception, or it was persisted to the persistence store as a normal part of its long-running lifetime.

Below the summary metrics are the following service metrics that group services within specific categories in descending order:

  • Active or Idle Instances - Grouped by Service (top 5). The top five services with the highest total number of active or idle instances within the specified time period.

  • Suspended Instances - Grouped by Service (top 5). The top five services with the highest total number of suspended instances within the specified time period.
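The way the summary counts and the two "top 5" groupings relate to each other can be sketched in a few lines of Python. This is only an illustration of the aggregation, not how the Dashboard is implemented: the service names, the (service, status) tuple layout, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical snapshot of persisted instances as (service, status) pairs.
# Service names and the tuple layout are illustrative assumptions.
instances = [
    ("/OrderService.xamlx", "Active"),
    ("/OrderService.xamlx", "Idle"),
    ("/OrderService.xamlx", "Suspended"),
    ("/BillingService.xamlx", "Idle"),
    ("/BillingService.xamlx", "Suspended"),
    ("/BillingService.xamlx", "Suspended"),
    ("/ShippingService.xamlx", "Active"),
]


def summary_counts(rows):
    """Count instances per status, like the values in the shaded header box."""
    counts = Counter(status for _, status in rows)
    return {name: counts.get(name, 0) for name in ("Active", "Idle", "Suspended")}


def top_services(rows, statuses, limit=5):
    """Return up to `limit` (service, count) pairs in descending count order."""
    counts = Counter(service for service, status in rows if status in statuses)
    return counts.most_common(limit)


print(summary_counts(instances))
print(top_services(instances, {"Active", "Idle"}))  # Active or Idle column
print(top_services(instances, {"Suspended"}))       # Suspended column
```

A service can appear in both columns, because each column counts a different set of states.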

Persisted WF Instances Page

You can use the Persisted WF Instances page to get a real-time view of persisted workflow instances in various states of persistence. Clicking any of the summary metrics (say Suspended), or one of the service links under a column (say Suspended Instances - Grouped by Service (top 5)), takes you to the Persisted WF Instances page.


The link that you click to go from the Dashboard page to the Persisted WF Instances page is used to filter the persisted workflow instances. This ensures that what is enumerated on the Persisted WF Instances page is specific to that originating link. For example, clicking a service link under the Suspended Instances - Grouped by Service (top 5) column takes you to the Persisted WF Instances page and displays only the workflow service instances that have the Suspended value in the Status column of the query results window. You can, however, change the value of one or more fields (say Status) within the Query Summary frame to change the initial output and do further troubleshooting on a specific workflow. For example, if the original status that took you to the Persisted WF Instances page was Suspended, you could change the value of that field to Running - Active, and then click Run Query to see different results.
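The filtering behavior described above amounts to applying a set of optional conditions to the enumerated instances and then rerunning the query with a changed condition. The following Python sketch models that idea only. The PersistedInstance fields, the run_query function, and the sample data are hypothetical and do not correspond to any AppFabric API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PersistedInstance:
    # Hypothetical fields that loosely mirror columns on the page.
    instance_id: str
    service: str
    status: str


def run_query(rows: List[PersistedInstance],
              service: Optional[str] = None,
              status: Optional[str] = None) -> List[PersistedInstance]:
    """Keep rows that match every filter that is set; None means 'any'."""
    return [
        row for row in rows
        if (service is None or row.service == service)
        and (status is None or row.status == status)
    ]


rows = [
    PersistedInstance("a1", "/OrderService.xamlx", "Suspended"),
    PersistedInstance("a2", "/OrderService.xamlx", "Running - Active"),
    PersistedInstance("b1", "/BillingService.xamlx", "Suspended"),
]

# Initial view: the filter implied by the link that was clicked.
initial = run_query(rows, service="/OrderService.xamlx", status="Suspended")

# Changing the Status field and running the query again.
changed = run_query(rows, service="/OrderService.xamlx", status="Running - Active")

print([r.instance_id for r in initial])
print([r.instance_id for r in changed])
```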

Within the Persisted WF Instances page, AppFabric provides an enumerated set of state values for the Status column. You can use the following values to easily identify the state of an instance while also sorting or grouping at a more granular level:

  • The Completed state is divided into different values (-Successfully, -Cancelled, and -Terminated) to provide additional state-related context on how an instance reached the Completed state.

  • The Running state is divided into different values (-Active and -Idle) to provide additional state-related information for running instances.

  • The Suspended state is divided into different values (-Exception and -UserSuspension) to indicate why an instance was suspended.
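Because each Status value combines a state with a more specific qualifier, the values can be sorted or grouped at either level. The sketch below illustrates that two-level structure in Python. The exact display strings are an assumption: only "Running - Active" appears verbatim in this topic, and the other strings are built from the suffixes listed above using the same pattern.

```python
from collections import defaultdict

# ASSUMPTION: display strings follow the "State - Detail" pattern seen in
# "Running - Active"; the other six strings are inferred, not quoted.
STATUS_VALUES = [
    "Completed - Successfully",
    "Completed - Cancelled",
    "Completed - Terminated",
    "Running - Active",
    "Running - Idle",
    "Suspended - Exception",
    "Suspended - UserSuspension",
]


def split_status(value):
    """Split 'State - Detail' into a (state, detail) pair."""
    state, _, detail = value.partition(" - ")
    return state, detail


def group_by_state(values):
    """Group the detail part of each status value under its state."""
    groups = defaultdict(list)
    for value in values:
        state, detail = split_status(value)
        groups[state].append(detail)
    return dict(groups)


for state, details in group_by_state(STATUS_VALUES).items():
    print(state, details)
```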

Here are some key points about the differences between the Running (Idle) and the Suspended workflow states. Their subtle differences can be a source of confusion, and understanding them may help you to more easily understand the metrics used in the Dashboard.

  • Idle and Suspended do not have the same meaning. Idle is when there is no more scheduled work, but if an event arrives, the workflow will resume.

  • A workflow never gets suspended during normal execution. It is suspended only when a host manager such as AppFabric explicitly suspends it, or when an unhandled exception occurs and the user has configured the service in AppFabric to "Abandon and Suspend" the instance in that case.

  • When a workflow is suspended, it stops executing and does no more work until it is explicitly resumed by the host.
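The difference between the two states can be made concrete with a toy model. The following Python sketch is not AppFabric or Workflow Foundation code. The WorkflowSketch class and its methods are hypothetical, and the sketch simply refuses a message sent to a suspended instance without modeling what a real host does with that message.

```python
class WorkflowSketch:
    """Toy model of the Idle versus Suspended distinction."""

    def __init__(self):
        self.state = "Running (Idle)"
        self.processed = []

    def deliver(self, message):
        """An arriving event resumes an idle workflow; a suspended one does no work."""
        if self.state == "Suspended":
            return False
        self.state = "Running (Active)"
        self.processed.append(message)
        # No more scheduled work, so the workflow goes idle again.
        self.state = "Running (Idle)"
        return True

    def suspend(self):
        """Explicit suspension, for example by a host manager."""
        self.state = "Suspended"

    def resume(self):
        """Only an explicit resume brings a suspended workflow back."""
        if self.state == "Suspended":
            self.state = "Running (Idle)"


wf = WorkflowSketch()
print(wf.deliver("first"))   # idle workflow wakes up and handles the event
wf.suspend()
print(wf.deliver("second"))  # suspended workflow does not process it
wf.resume()
print(wf.deliver("third"))
print(wf.processed)
```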

Orphaned Workflow Instances

AppFabric provides support for enumerating and controlling orphaned workflow instances. However, the only control operation that can be applied to an orphaned instance is the Delete operation. An orphaned workflow instance belongs to a service that is no longer deployed on the computer to which the user was connected when that instance was in a Running or Suspended state.

Workflow Instance Control

For workflow instances displayed as a result of a query on the Persisted WF Instances page, you can right-click an instance to bring up a context-dependent control command menu. From this menu you can select only the control actions that apply to the current state of the workflow. For example, if you have a workflow in either the Running (Idle) or Running (Active) state, the context-dependent actions are Suspend, Cancel, Terminate, and Delete. The Resume operation is disabled because it does not apply to a workflow in the Running state.
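The context-dependent menu can be thought of as a lookup from instance state to the set of enabled commands. The Python sketch below captures the cases this topic states: the actions for Running instances, and Delete as the only action for orphaned instances. The action set for Suspended instances is an assumption that is not stated in this topic, and the function and constant names are hypothetical.

```python
# Stated in this topic: actions for Running (Active) and Running (Idle).
RUNNING_ACTIONS = ("Suspend", "Cancel", "Terminate", "Delete")

# ASSUMPTION: this topic does not list the actions for Suspended instances.
# Resume is assumed to be enabled and Suspend disabled.
SUSPENDED_ACTIONS = ("Resume", "Cancel", "Terminate", "Delete")

# Stated in this topic: Delete is the only operation for orphaned instances.
ORPHANED_ACTIONS = ("Delete",)


def allowed_actions(state, orphaned=False):
    """Return the control actions that would be enabled for an instance."""
    if orphaned:
        return ORPHANED_ACTIONS
    if state in ("Running (Active)", "Running (Idle)"):
        return RUNNING_ACTIONS
    if state == "Suspended":
        return SUSPENDED_ACTIONS
    raise ValueError(f"unknown state: {state!r}")


print(allowed_actions("Running (Idle)"))
print("Resume" in allowed_actions("Running (Active)"))
print(allowed_actions("Suspended", orphaned=True))
```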

All control commands that result in a completed state of an instance are accompanied by a standard warning confirmation dialog box. Each dialog box not only asks for confirmation but also explains the effect that the selected command will have on the selected instances. If you change your mind, or begin a control action erroneously, this is your opportunity to cancel that operation.

For more information, see Persisted WF Instances Page.

Troubleshooting by Monitoring Durable Workflow Metrics

You can assemble the preceding information into a troubleshooting approach by using the Persisted WF Instances section to monitor the persisted state of durable workflows. When you initially view the Persisted WF Instances section, you get a high-level summary view of the status of persisted workflow instances. The presence of any Suspended workflows quickly tells you whether there is a problem at the persisted workflow level. If the Suspended Instances - Grouped by Service (top 5) summary metric contains a non-zero value, it indicates where a problem may have occurred. All summary metrics are linked to the Persisted WF Instances page, where you can see the detailed persisted workflow instance data that the initial Dashboard page summarized for you at the higher level. This raw data gives you additional information when working to isolate a problem with persisted WF instances.

Let’s take a scenario where you are using the Persisted WF Instances section to monitor the services at a given scope for any problems. If the Suspended summary metric is non-zero, expanding the widget shows a breakdown of the Suspended instances by the top five services. This allows you to focus on the services with the greatest number of potential issues. You can then drill into a specific problem service and narrow the results by going to the Persisted WF Instances page and changing the query values.

Suppose the Suspended summary header displays a non-zero value to show that some durable workflow instances were suspended. You can expand the Persisted WF Instances widget and look under the Suspended Instances - Grouped by Service (top 5) column to see the top five services that have the most suspended workflow instances during the selected time period. Clicking any of the services listed here would take you to the Persisted WF Instances page.

Note

If you are looking for a particular service that is not shown as one of the top five services, you can click the Suspended summary column to go to the Persisted WF Instances page, and then locate the service by changing the scope query condition and rerunning the query.

The Persisted WF Instances page is populated with a real-time view of the persisted workflow instances at the specific scope in the IIS hierarchy... You can click one of these workflow instances in the middle pane (still within the Persisted WF Instances page) to display specifics of that workflow in the Details pane at the bottom of the page. Within the Details pane you can view information about the persisted workflow instance on the Overview tab. This tab contains information about the persisted workflow instance, such as the Service Virtual Path, its Workflow Instance ID, number of Tracked Events for its lifetime, Creation Time, and other information. You can use this information to better understand the lifetime of a persisted workflow instance.

You can use the Persisted WF Instances page to issue instance-control commands for its enumerated durable workflow instances. After you determine from the higher, more abstract levels that there is an issue, and troubleshoot to determine what the problem is and whether it can be fixed, you can then issue control operations such as suspending or terminating an instance.

Note

A persisted WF instance does not directly correlate to a tracked WF instance because you can enable tracking (monitoring) and persistence independently of each other.

If you need additional context to help solve a problem surrounding a persisted workflow instance, you can right-click the instance in the middle pane and select View Tracked Events. This takes you to the Tracked Events page and displays information for that workflow instance ID. If this workflow also supports tracking, you can select View Tracked Instance from the context menu for a workflow instance. The Tracked WF Instances page is displayed and populated with persisted workflow information related to the original workflow instance ID.

Note

To enable the View Tracked Instance and the View Tracked Events options, the application containing the persisted WF service instance must be configured to use persistence and have tracking enabled.

For additional information about how to obtain more specific information about a persisted workflow instance to help you solve a problem, see Tracked Events Page and Tracked WF Instances Page.

See Also

Reference

Persisted WF Instances Page
Tracked Events Page
Tracked WF Instances Page