Case of the never-ending Full Collection Evaluation

I was recently working with a customer who had a large number of collections set for incremental evaluation. The customer had tuned the settings on the primary sites to run incremental evaluation only every 30 minutes; their incremental evaluation normally takes 10 to 15 minutes to complete.


Recently, many of the collections began showing the hourglass icon in the console on a consistent basis. Upon investigation, it appeared that one of the primary sites was running an almost constant full evaluation cycle of all its collections.


In the colleval.log, we were able to pick out a message that stood out:

spCollBeginIncEvaluation: Too many changes, marking incremental collections for full evaluation


We started looking into why this site was seeing more changes than the others. We could not determine any logical reason.


I decided to go to my lab and take a closer look at the spCollBeginIncEvaluation stored procedure. I found that it runs two queries to determine which items are eligible for incremental evaluation, and then combines their results.

  • First, it counts the collections that have flag 4 (Incremental) set, have a query rule, and are not currently being evaluated.

  • Next, a query identifies computer objects from the CollectionNotifications table and gathers the machine IDs of devices that match the tables used in the query rules of the collections from the previous query.

  • Finally, the stored procedure multiplies the number of collections from the first query by the number of machine IDs from the second query. If the result is bigger than 10,000,000 (yes, 10 million), full evaluation is triggered instead of incremental.
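
To make the check easier to reason about, here is a minimal Python sketch of the decision described above. It is not the stored procedure's actual code; the function and parameter names are my own, and the only values taken from the stored procedure are the multiplication and the 10,000,000 limit.

```python
# Limit described above; everything else in this sketch is illustrative.
FULL_EVAL_THRESHOLD = 10_000_000


def needs_full_evaluation(incremental_collections: int,
                          changed_machine_ids: int,
                          threshold: int = FULL_EVAL_THRESHOLD) -> bool:
    """Return True when collections x machine IDs is bigger than the threshold.

    incremental_collections: count from the first query (hypothetical name).
    changed_machine_ids: count from the second query (hypothetical name).
    """
    return incremental_collections * changed_machine_ids > threshold


if __name__ == "__main__":
    # 500 collections and 2,000 machine IDs stay well under the limit.
    print(needs_full_evaluation(500, 2_000))
    # Slightly more than 500 collections and 20,000 machine IDs go over it.
    print(needs_full_evaluation(501, 20_001))
```

The point of the sketch is that the limit is on a product, so a site with many incremental collections needs far fewer changed machines to tip over into full evaluation.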


I could not imagine how we could have more than 10 million “changes” that needed to be updated, but I started watching the data, and the number of machine IDs returned was growing astronomically between cycles (up to 20,000 machines would show up on the list).


In this case, I had more than 500 collections that needed incremental evaluation and 20,000+ machine IDs (500 x 20,000 = 10,000,000). That put me over 10 million “changes”, so the stored procedure was triggering full evaluation.

So now we needed to understand why 20,000+ machines were updating data within a 30-minute window.


We took the data from the CollectionNotifications table and copied it into Excel, and found that of the 20,000 record updates, 18,000 were associated with the table name v_CH_ClientSummary.
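
The same breakdown can be done without Excel. Below is a small Python sketch that counts exported rows per table name. The row layout (machine ID, table name) and the sample values are made up for illustration; they are not the real CollectionNotifications schema or the customer's data.

```python
from collections import Counter


def count_by_table(rows):
    """Count notification rows per table name.

    rows: iterable of (machine_id, table_name) tuples, a hypothetical
    export format chosen for this sketch.
    """
    return Counter(table_name for _machine_id, table_name in rows)


if __name__ == "__main__":
    # Made-up sample rows, only to show the shape of the result.
    sample = [
        (101, "v_CH_ClientSummary"),
        (102, "v_CH_ClientSummary"),
        (103, "SomeOtherTable"),
    ]
    for table, count in count_by_table(sample).most_common():
        print(table, count)
```

Sorting by count puts the noisiest table at the top, which is how v_CH_ClientSummary stood out in our data.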


So the question became: what is updating this view for so many clients every 30 minutes? It turns out that this view/table updates each time a client requests policy. So, if the policy interval is set to 60 minutes, on average half of the clients are going to request policy within any 30-minute window. With 40,000 clients, that means 20,000 clients are going to update policy every 30 minutes.
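
The estimate above can be written as a quick back-of-the-envelope calculation. This Python sketch assumes policy requests are spread evenly across the policy interval, which is a simplification of my own; the function and parameter names are also mine.

```python
def clients_requesting_policy(total_clients: int,
                              policy_interval_min: float,
                              eval_interval_min: float) -> int:
    """Estimate how many clients request policy within one evaluation window.

    Assumes requests are evenly distributed over the policy interval, so the
    fraction of clients seen in a window is window / interval, capped at 1.
    """
    fraction = min(1.0, eval_interval_min / policy_interval_min)
    return round(total_clients * fraction)


if __name__ == "__main__":
    # 40,000 clients, 60-minute policy interval, 30-minute incremental cycle.
    print(clients_requesting_policy(40_000, 60, 30))
```

With the numbers from this case, the estimate lines up with the roughly 20,000 machine IDs I was seeing per cycle.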


Now I wanted to find the collections that use incremental updates and have queries against the v_CH_ClientSummary view, with this query:


Once I had found the collection(s), I cleared the incremental evaluation check box on each one, and the number of machine IDs that needed updating every 30 minutes went from 20,000 to fewer than 2,000.


The moral of the story: don't create collections that use incremental evaluation against the v_CH_ClientSummary view (WMI class SMS_G_System_CH_ClientSummary, or 'Client Status' in the query builder).


