BRE: Performance Consideration - Documentation in Development

Article
09/15/2006

Performance Considerations

Introduction

This topic discusses how the rule engine performs in various scenarios and with different values for the configuration/tuning parameters.

Fact Types

The rule engine takes less time to access .NET facts than it takes to access XML or database facts. If a policy can use a .NET fact, an XML fact, or a database fact, consider using .NET facts for higher performance.

Data Table vs. Data Connection

When the data set is small (fewer than roughly 10 rows), the TypedDataTable binding performs better than the DataConnection binding. When the data set is large (roughly 10 rows or more), the DataConnection binding performs better. Therefore, choose between the DataConnection binding and the TypedDataTable binding based on the estimated size of the data set.

Fact Retrievers

You can write a fact retriever, an object that implements standard methods and typically uses them to supply long-term and slowly changing facts to the rule engine before the policy is executed. The engine caches these facts and uses them over multiple execution cycles. Instead of submitting a static or fairly static fact each time you invoke the rule engine, you should create a fact retriever that submits the fact the first time, and then updates the fact in memory only when needed.
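The load-once, refresh-when-needed pattern described above can be sketched as follows. This is a minimal illustration in Python, not the rule engine's .NET interface: the class name CachingFactRetriever, the update_facts method, the handle layout, and the age-based staleness test are all assumptions made for the example.

```python
import time


class CachingFactRetriever:
    """Hypothetical retriever: loads facts once, reloads only when stale."""

    def __init__(self, load_facts, max_age_seconds, clock=time.monotonic):
        self._load_facts = load_facts      # expensive call, e.g. a database read
        self._max_age = max_age_seconds    # assumed staleness rule for this sketch
        self._clock = clock
        self.load_count = 0                # how many times facts were really loaded

    def update_facts(self, facts_handle):
        """Return the handle to keep using; pass None on the first call."""
        now = self._clock()
        if facts_handle is not None and now - facts_handle["loaded_at"] < self._max_age:
            return facts_handle            # facts are still fresh: reuse them
        self.load_count += 1
        return {"facts": self._load_facts(), "loaded_at": now}


# Three policy executions in a row trigger only one real load.
retriever = CachingFactRetriever(lambda: {"discount_rate": 0.1}, max_age_seconds=3600)
handle = None
for _ in range(3):
    handle = retriever.update_facts(handle)
```

The caller passes back the handle it received on the previous cycle, so the retriever itself decides whether the cached facts are still good enough.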

Rule Priority

The priority setting for a rule can range on either side of 0, with larger numbers having higher priority. Actions are executed in order from the highest priority to the lowest priority. When the policy implements forward-chaining behavior by using Assert/Update calls, you can optimize the chaining by using the priority setting. For example, assume that Rule2 depends on a value set by Rule1. Giving Rule1 a higher priority means that Rule2 executes only after Rule1 fires and updates the value. Conversely, if Rule2 were given a higher priority, it could fire once, and then fire again after Rule1 fires and updates the fact that Rule2 uses in a condition. The final results may or may not be correct, but firing twice clearly costs more than firing once.
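The effect of the two priority orderings can be illustrated with a toy model. The sketch below is a simplified agenda loop written in Python for illustration only; it is not how the rule engine is implemented, and the run function, the fact layout, and the condition value >= 0 are assumptions chosen so that Rule2 matches both before and after the update.

```python
def run(priorities):
    """Count firings per rule for a two-rule policy under the given priorities."""
    fact = {"value": 0}
    fired = {"Rule1": 0, "Rule2": 0}
    # The agenda holds rules whose conditions match and that are waiting to fire.
    agenda = {"Rule1", "Rule2"}
    while agenda:
        # Fire the highest-priority rule on the agenda first.
        rule = max(agenda, key=lambda name: priorities[name])
        agenda.remove(rule)
        fired[rule] += 1
        if rule == "Rule1":
            fact["value"] = 10
            # Update(fact): rules that test the fact are reevaluated and,
            # if they still match, go back on the agenda.
            if fact["value"] >= 0:
                agenda.add("Rule2")
    return fired


print(run({"Rule1": 1, "Rule2": 0}))  # {'Rule1': 1, 'Rule2': 1}
print(run({"Rule1": 0, "Rule2": 1}))  # {'Rule1': 1, 'Rule2': 2}
```

With Rule1 at the higher priority, the update happens while Rule2 is still waiting on the agenda, so Rule2 fires once. With the priorities reversed, Rule2 has already fired by the time the update puts it back on the agenda.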

Update Calls

The Update function updates a fact that exists in the working memory of the rule engine and causes all the rules that use the updated fact in their conditions to be reevaluated. Update calls can be expensive, especially if a large set of rules must be reevaluated because of the update. In some situations, they can be avoided. For example, consider the following rules:

Rule1:

IF PurchaseOrder.Amount > 5

THEN StatusObj.Flag = true; Update(StatusObj)

Rule2:

IF PurchaseOrder.Amount <= 5

THEN StatusObj.Flag = false; Update(StatusObj)

All remaining rules of the policy use StatusObj.Flag in their conditions. Therefore, when Update is called on the StatusObj object, all of those rules are reevaluated. Whatever the value of the Amount field is, all the rules other than Rule1 and Rule2 are evaluated twice: once before the Update call and once after it.

Instead, you could set the value of the Flag field to false before invoking the policy, and then use only Rule1 in the policy to set the flag. In this case, Update is called only if the value of the Amount field is greater than 5; it is not called if the amount is less than or equal to 5. Therefore, the rules other than Rule1 are evaluated twice only if the value of the Amount field is greater than 5.
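The saving can be made concrete with a small counting model. The following Python sketch only tallies condition evaluations under the behavior described above; the function name dependent_evaluations and its parameters are hypothetical, and the model assumes each dependent rule is evaluated once when the facts are asserted and once more after every Update call.

```python
def dependent_evaluations(amount, remaining_rules, preset_flag):
    """Count condition evaluations of the rules that test StatusObj.Flag."""
    evaluations = remaining_rules        # first evaluation, when facts are asserted
    if preset_flag:
        update_called = amount > 5       # only Rule1 remains in the policy
    else:
        update_called = True             # either Rule1 or Rule2 always fires
    if update_called:
        evaluations += remaining_rules   # Update(StatusObj) reevaluates them all
    return evaluations


# A policy with 20 rules that depend on StatusObj.Flag, and Amount = 3:
print(dependent_evaluations(3, 20, preset_flag=False))  # 40
print(dependent_evaluations(3, 20, preset_flag=True))   # 20
```

For amounts greater than 5 both designs cost the same, so the benefit depends on how often the amount is 5 or less.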

Usage of Logical OR Operators

Using an increasing number of logical OR operators in conditions creates additional permutations that expand the analysis network of the rule engine. From a performance standpoint, you are better off splitting the conditions into atomic rules that do not contain logical OR operators.
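As an illustration of the split, the sketch below turns one rule whose condition is an OR of several alternatives into one atomic rule per alternative. It is written in Python purely to show the shape of the transformation; the rule representation, the split_or_rule helper, and the sample conditions are assumptions, not part of the rule engine.

```python
def split_or_rule(name, alternatives, action):
    """Turn 'IF a OR b OR ... THEN action' into one atomic rule per alternative."""
    return [
        {"name": f"{name}_{index}", "condition": alternative, "action": action}
        for index, alternative in enumerate(alternatives, start=1)
    ]


# IF Amount > 1000 OR Customer.IsNew THEN RequireApproval
atomic_rules = split_or_rule(
    "Approval",
    ["PurchaseOrder.Amount > 1000", "Customer.IsNew == true"],
    "RequireApproval",
)
for rule in atomic_rules:
    print(rule["name"], "IF", rule["condition"], "THEN", rule["action"])
```

Keep in mind that when more than one alternative is true, each matching atomic rule fires, so the action runs more than once. The split is therefore simplest when the action is safe to repeat.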

Caching Settings

The rule engine uses two caches. The first is in the update service, and the second is in each BizTalk process. The first time a policy is used, the BizTalk process requests the policy information from the update service. The update service retrieves the policy information from the rule engine database, caches it, and returns it to the BizTalk process. The BizTalk process creates a policy object based on that information and stores the policy object in its cache when the associated rule engine instance finishes executing the policy. When the same policy is invoked again, the BizTalk process reuses the policy object from its cache if one is available. Similarly, when a BizTalk process requests information about a policy from the update service, the update service first looks for the policy information in its own cache. Every 60 seconds (1 minute), the update service also checks the database for updates to the policy. If there are any, the update service retrieves the updated information and caches it.

There are three tuning parameters for the rule engine related to these caches: CacheEntries, CacheTimeout, and PollingInterval. You can specify their values either in the registry or in a configuration file. The value of CacheEntries is the maximum number of entries in the cache. The default value is 32. You may want to increase the value of CacheEntries to improve performance in some cases. For example, if you use 40 policies repeatedly, you may want to increase CacheEntries to 40. This allows the update service to cache details of up to 40 policies in memory, and allows the BizTalk service to cache up to 40 policy instances in memory. The cache of the BizTalk service may hold more than one instance of the same policy.

The value of CacheTimeout is the time (in seconds) for entries to age out of the update service cache. In other words, CacheTimeout is how long a cache entry for a policy is kept without being referenced. The default value is 3600 seconds (1 hour), which means that a cache entry that is not referenced within an hour is deleted. In some cases, you may want to increase the value to improve performance. For example, if a policy is invoked every 2 hours, you could improve the performance of the policy execution by increasing CacheTimeout to a value higher than 2 hours (7200 seconds).
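The two-hour example can be checked with a small model of an idle timeout. This Python sketch is an illustration only: count_reloads is a hypothetical helper, and it models nothing but the rule that an entry not referenced within the timeout is dropped. It ignores the CacheEntries limit and any other behavior of the real cache.

```python
def count_reloads(invocation_times, cache_timeout):
    """Count how often policy information must be fetched again.

    invocation_times: ascending times, in seconds, at which the policy is used.
    cache_timeout: idle time, in seconds, after which the entry ages out.
    """
    reloads = 0
    last_referenced = None
    for now in invocation_times:
        expired = last_referenced is None or now - last_referenced >= cache_timeout
        if expired:
            reloads += 1           # never cached, or aged out: fetch it again
        last_referenced = now      # every reference restarts the idle timer
    return reloads


every_two_hours = [0, 7200, 14400, 21600]
print(count_reloads(every_two_hours, cache_timeout=3600))   # 4
print(count_reloads(every_two_hours, cache_timeout=10800))  # 1
```

With the default of 3600 seconds, every invocation in this schedule finds the entry gone. With a timeout longer than the invocation interval, only the first invocation pays the loading cost.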

The PollingInterval parameter of the rule engine defines the time, in seconds, between checks that the update service makes against the rule engine database for updates. The default value of PollingInterval is 60 seconds (1 minute). If you know that the policies are never updated or are updated only rarely, you could increase this value to improve performance.

Side Effects

The ClassMemberBinding, DatabaseColumnBinding, and XmlDocumentFieldBinding classes have a property named SideEffects. This property determines whether the value of the bound field, member, or column is cached. The default value of SideEffects in the DatabaseColumnBinding and XmlDocumentFieldBinding classes is false, whereas the default value in the ClassMemberBinding class is true. Therefore, when a field of an XML document or a column of a database table is accessed for the second time or later within the policy, the value is retrieved from the cache. However, when a member of a .NET object is accessed for the second time or later, the value is retrieved from the .NET object, not from the cache. Setting the SideEffects property of a .NET ClassMemberBinding to false improves performance because the value is retrieved from the cache from the second access onward. You can only do this programmatically; the Business Rule Composer tool does not expose the SideEffects property.
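The difference between the two settings can be illustrated by counting how often the underlying member is actually read. The sketch below is a Python model of the caching behavior described above, not the .NET binding classes; BoundMember, its read method, and the getter_calls counter are hypothetical names introduced for the example.

```python
class BoundMember:
    """Hypothetical stand-in for a binding to a member of a .NET fact."""

    def __init__(self, getter, side_effects=True):
        self._getter = getter
        self.side_effects = side_effects   # True mirrors the ClassMemberBinding default
        self._cached = False
        self._value = None
        self.getter_calls = 0              # how often the real member was read

    def read(self):
        if not self.side_effects and self._cached:
            return self._value             # second and later reads hit the cache
        self.getter_calls += 1
        self._value = self._getter()
        self._cached = True
        return self._value


uncached = BoundMember(lambda: 42, side_effects=True)
cached = BoundMember(lambda: 42, side_effects=False)
for _ in range(3):
    uncached.read()
    cached.read()
print(uncached.getter_calls, cached.getter_calls)  # 3 1
```

Caching is only safe when reading the member has no side effects and returns the same value for the duration of the policy execution, which is what a false setting asserts.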

Instances and Selectivity

The XmlDocumentBinding, ClassBinding, and DatabaseBinding classes have two properties, Instances and Selectivity. The value of the Instances property is the expected number of instances of the class in working memory. The value of the Selectivity property is the percentage of the class instances that will successfully pass the rule conditions. The rule engine uses these values to optimize condition evaluation so that the smallest possible number of instances is used in condition evaluations first, followed by the remaining instances. If you know the number of instances of the object in advance, setting the Instances property to that value improves performance. Similarly, if you know the percentage of these objects that pass the conditions, setting the Selectivity property to that value improves performance. You can set these properties only programmatically; the Business Rule Composer tool does not expose them.
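To give an intuition for why these hints help, the sketch below orders bindings by the number of instances expected to pass their conditions, computed as Instances multiplied by Selectivity. This is an assumed heuristic written in Python for illustration; the document does not describe the optimization the rule engine actually performs, and evaluation_order and the sample figures are hypothetical.

```python
def evaluation_order(bindings):
    """Order bindings so those expected to pass the fewest instances come first."""

    def expected_matches(binding):
        # Selectivity is a percentage, so divide by 100 to get a fraction.
        return binding["instances"] * binding["selectivity"] / 100.0

    return sorted(bindings, key=expected_matches)


bindings = [
    {"name": "Order", "instances": 1000, "selectivity": 50},     # about 500 pass
    {"name": "Customer", "instances": 10, "selectivity": 100},   # about 10 pass
    {"name": "LineItem", "instances": 5000, "selectivity": 1},   # about 50 pass
]
print([binding["name"] for binding in evaluation_order(bindings)])
# ['Customer', 'LineItem', 'Order']
```

Under this assumption, a large class with very low selectivity can still be cheap to evaluate early, which is why both values matter rather than the instance count alone.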