Opalis 6.3: Incident Escalation using the System Center Service Manager IP
System Center Service Manager provides a rich environment for customers to implement their Incident Management processes. The connectors provided with Service Manager to Operations Manager, Configuration Manager and Active Directory greatly simplify getting these processes running.
Opalis with the System Center Service Manager Integration Pack (SCSMIP) is a natural fit for customers looking to use automation to customize and/or automate Incident Management processes. This Integration Pack provides an IT Pro the flexibility and ease-of-use required to interact with Service Manager without requiring deep knowledge of API-style programming.
One area I get asked about on a fairly regular basis is Incident Escalation. The requirement usually is time-based in nature (“active incidents older than X minutes must be escalated”). These scenarios are very easy to implement in Opalis. Let’s walk through the building blocks of how such a workflow would be authored.
Scenario: Escalate the Support Group by one tier for High Urgency incidents that remain active after 1 hour.
So let’s think about what we are going to do BEFORE we start building a workflow… what a concept! OK, Service Manager incidents have a few properties that we will need to work with for this scenario:
- Status: “Active”, “Closed”, “Resolved” and “Pending”
- Support Group: “Tier 1”, “Tier 2” and “Tier 3”
- Created Date: Date/Time field. Example: “12/03/2010 07:56:11 AM”
- Escalated: Boolean (it’s a checkbox on the SM GUI and “True” or “False” in Opalis)
Our workflow will have to look for all “Active” incidents of High urgency that do NOT have the Escalated flag set and that are older than one hour. Then we need to “bump” the Support Group by one tier until we get to Tier 3. Now that we know what we need to do, let’s do it…
First we need to determine the condition which triggers our workflow. Our scenario is time-based so we simply use a “Monitor Date/Time” activity and indicate the frequency with which we want to check. I have selected 1 hour as the interval in this sample. Notice I have checked “Trigger immediately” which runs the workflow as soon as I start it. This is key if you want to debug the workflow in the testing console, since if you didn’t do this you would have to wait for the trigger condition… I don’t know about you but I am seldom that patient! Obviously one would need to tweak the interval to match the number of incidents one might have to manage as well as the desired escalation interval.
Now that we have a trigger condition we need to look for incidents that satisfy our escalation criteria. That means we need to do two things: First we need to compute the timestamp 1-hour prior to the current date/time (so we can query for records older than 1 hour) and then we need to filter for records that meet our selection criteria.
Formatting date/times in Opalis is easy. We select the “Format Date/Time” activity for this purpose. For the current date/time we subscribe to the “Object end time” from the Monitor Date/Time (it’s a sneaky way to get the approximate current time without being dependent on an Opalis Global Variable… but that’s a topic for another blog post). I have accepted the default Format for the date/time stamp and have copied and pasted this into the “Output” value since I don’t want to reformat the date/time, only shift it by 1 hour. Finally, I set the offset to subtract 1 hour from the current time.
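For readers who prefer to see the date arithmetic spelled out, here is a rough Python analogue of what the “Format Date/Time” step does: parse a timestamp, shift it back 1 hour, and emit it in the same format. This is only a sketch and not anything Opalis runs; the function name `shift_timestamp` and the month/day/year reading of the sample timestamp are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Format matching the sample "12/03/2010 07:56:11 AM".
# Assumption: month comes before day; the sample alone is ambiguous.
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def shift_timestamp(stamp, hours=-1):
    """Parse `stamp`, shift it by `hours`, and return it in the same format.

    Input and output share one format, mirroring the idea of pasting the
    input format into the output format so that only the offset changes.
    """
    parsed = datetime.strptime(stamp, STAMP_FORMAT)
    return (parsed + timedelta(hours=hours)).strftime(STAMP_FORMAT)


# Hypothetical usage: the cutoff for "older than 1 hour".
cutoff = shift_timestamp("12/03/2010 07:56:11 AM")
```

With the sample timestamp from the property list above, `cutoff` comes out as “12/03/2010 06:56:11 AM”.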
Now that we have the time 1 hour ago we can compose our query to Service Manager. We do this using the “Get Object” activity in the Opalis SM IP. This activity allows a workflow author to query Service Manager for objects of a given class that satisfy certain criteria. Our process says we want to escalate all incidents of “High” urgency that are still active after 1 hour and have not already been escalated. So that’s four conditions we will want to apply to the query. The process for composing this query is really easy because the SCSMIP discovers the classes in SM (in this sample we want the default “Incident” class, but if you had your own custom classes these would be discovered as well). Likewise, the pick-list values for “Escalated”, “Status” and “Urgency” are all discovered. We are using out-of-the-box List values but if you created Lists of your own then these would likewise be discovered by Opalis. So we’ll filter for incidents whose Status is “Active”, whose Urgency is “High”, that are not Escalated and whose Created Date is “before” the timestamp from 1 hour ago (that is, incidents more than 1 hour old). We get that timestamp from a published data subscription to the prior “Format Date/Time” activity.
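The four filter conditions can be written down as a simple predicate. The Python below is only an illustration of the selection logic, not the Service Manager or Opalis API: the `Incident` class, its field names, and the helper functions are hypothetical stand-ins for the properties discussed in this post.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Incident:
    """Hypothetical stand-in for the incident properties used in this post."""
    id: str
    status: str          # "Active", "Closed", "Resolved" or "Pending"
    urgency: str         # e.g. "High"
    escalated: bool      # the Escalated checkbox
    support_group: str   # "Tier 1", "Tier 2" or "Tier 3"
    created: datetime    # Created Date


def needs_escalation(incident, cutoff):
    """True when all four conditions from the query hold."""
    return (
        incident.status == "Active"
        and incident.urgency == "High"
        and not incident.escalated
        and incident.created < cutoff   # created "before" 1 hour ago
    )


def find_candidates(incidents, cutoff):
    """Return zero to N incidents that satisfy the escalation criteria."""
    return [i for i in incidents if needs_escalation(i, cutoff)]
```

Note that the date comparison is “created before the cutoff”, which is what selects incidents older than 1 hour rather than those created within the last hour.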
The results of this query will be 0 (zero) to N objects injected into the Opalis data bus… sort of a spreadsheet of incident data with each row being a different incident and each column being a different property for that incident. The first thing we’ll want to do is filter OUT the case where 0 records were returned. That means there are no incidents to escalate. We do this with link-level logic (the logic applied to the arrows between workflow activities). So we need to put another activity on the palette so we can draw an arrow. Our next activity is a “Map Published Data” activity (I’ll explain why in a moment) so we add this and then drag an arrow from “Get Object” to “Map Published Data”. Now we double-click on the arrow (the “Link”) and configure the link logic. In this case let’s add a new condition that says “Number of records returned” does not equal 0. So as long as the “Get Object” runs and the number of records returned <> 0 we will proceed to the next step (“Map Published Data”). If we get back zero records we won’t progress to the next step since we don’t satisfy the link-level logic.
Note that this is the link I edited (the arrow labeled “Records Found”). You do this by double-clicking on the link itself. Also note that I labeled the link, which is a good practice. You won’t see the link labels unless you turn this feature on from the “Options” menu in the Opalis designer…
I said I would explain why I used a “Map Published Data” activity as my next step, so let me do this. Our process wants us to change the Support Group of the incident based on its age. Let’s say we want the following rule applied:
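To make the “spreadsheet on the data bus” picture and the “Records Found” link condition concrete, here is a small Python sketch. It is an analogy only: rows are modeled as dictionaries, and the names `records_found` and `follow_link` are invented for this example rather than taken from Opalis.

```python
def records_found(rows):
    """Link condition: "Number of records returned" does not equal 0."""
    return len(rows) != 0


def follow_link(rows, next_step):
    """Run `next_step` once per row, but only if the link condition holds.

    With zero rows the link is not followed, so the next activity never
    runs and this branch of the workflow simply stops.
    """
    if not records_found(rows):
        return []
    return [next_step(row) for row in rows]


# Hypothetical query result: one row per incident, one key per property.
rows = [
    {"ID": "IR1", "Support Group": "Tier 1"},
    {"ID": "IR2", "Support Group": "Tier 3"},
]
groups = follow_link(rows, lambda row: row["Support Group"])
```

Here `groups` holds the Support Group of each returned incident, and calling `follow_link` with an empty list returns an empty list without ever invoking the next step.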
- Tier 1 becomes Tier 2
- Tier 2 becomes Tier 3
- Tier 3 stays Tier 3 but let’s send an executive an email just for fun and bonus points…
This is exactly the kind of scenario the “Map Published Data” activity is for… translating one set of static values into another set. I’ll configure the activity to map “Tier 1” to “Tier 2”, “Tier 2” to “Tier 3” and “Tier 3” to “Email”. We’ll branch our workflow when we bump a “Tier 3” incident by looking for a value of “Email” from “Map Published Data” (which would be invalid in the default Service Manager configuration).
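In code terms, this mapping is just a lookup table. The Python sketch below mirrors the three mappings described above; the names `TIER_MAP` and `map_support_group` are illustrative, and returning `None` for an unmapped value is an assumption made for this sketch, not a statement about how the Opalis activity behaves.

```python
# Static translation table: current Support Group -> mapped value.
# "Email" is a sentinel, not a real Support Group; it marks incidents
# that are already at Tier 3 and should trigger an email instead.
TIER_MAP = {
    "Tier 1": "Tier 2",
    "Tier 2": "Tier 3",
    "Tier 3": "Email",
}


def map_support_group(current):
    """Translate a Support Group into its escalated value.

    Returns None for values outside the table (an assumption for this
    sketch; the post only covers the three out-of-the-box tiers).
    """
    return TIER_MAP.get(current)
```

Using a sentinel that is not a valid Support Group keeps the later branching simple, since a single mapped value tells the workflow which path to take.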
Now we use link-level logic to configure the branches in the workflow. One branch for new “Tier 2” and “Tier 3” incidents will simply update the incident and change the value of “Support Group” and “Escalated” accordingly. The other branch will only permit “Email” values through and we’ll not update the Support Group of these incidents (they will remain “Tier 3”) but we’ll send an email to an executive.
Updating the incident in our new “Tier 2” or “Tier 3” branch is trivial. The Opalis SM IP has an activity “Update Object” that does this nicely. We could update other properties as well (notes, etc.) but we’ll keep this simple for now. Here we set the new “Support Group” and also set the “Escalated” flag. Notice how we get the new Support Group from the Map Published Data activity… this is the “bump” of the support group to the new value.
The other branch for “Email” (remember this is for “Tier 3” incidents) will use the Send Email activity from the Opalis Foundation Activity palette. Again, this is very simple but it illustrates how one would use branching to provide different escalation paths for different rules. Really, the possibilities are quite extensive! After you send the email make sure to set the “Escalated” flag… otherwise the executive will get an email every hour!
That’s it! Really this is a very simple workflow and I am only putting this on the blog to provide a sample of how one can fill a simple Incident Management need with Opalis. Obviously every environment will have different needs and time-based escalation is only one. For example you might want to escalate based on the age of the incident plus the business impact of the service associated with it… again something easy to do but I’ll make that the topic of another blog post on CMDB operations and working with service models. Stay tuned…
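The two branches can be summarized in a few lines of Python. As with any sketch like this, it only illustrates the decision logic: `escalate` and the `send_email` callback are hypothetical names, and the incident is modeled as a plain dictionary rather than a Service Manager object.

```python
def escalate(incident, mapped_value, send_email):
    """Apply one of the two escalation branches and return the updated row.

    `mapped_value` is the mapped Support Group ("Tier 2", "Tier 3", or the
    "Email" sentinel). `send_email` is a caller-supplied function standing
    in for the email step.
    """
    updated = dict(incident)  # leave the original row untouched
    if mapped_value == "Email":
        # Already at Tier 3: keep the Support Group, notify an executive.
        send_email(incident)
    else:
        # Bump the Support Group to the mapped tier.
        updated["Support Group"] = mapped_value
    # Set the flag on BOTH branches so the incident is not picked up
    # (and the executive is not emailed) again on the next hourly run.
    updated["Escalated"] = True
    return updated
```

Setting the flag outside the `if`/`else` is a deliberate choice in this sketch: it makes it hard to forget the flag on the email branch, which is the mistake that would cause hourly emails.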