njadric asked CyrAz commented

PowerShell SCOM monitor using too much resources

Hi all, I hope you can help me.

I have built custom IIS Web Site and IIS App Pool availability monitors to replace the original ones. The reason is that the original monitors have no option to alert only after an object has been down for a certain period, and we have a lot of IIS restarts that cause alerts.

I have used Kevin Holman's fragments in Visual Studio, and the whole thing consists of two nearly identical PowerShell script-based monitors, one for App Pools and one for Web Sites. At first I built them without cookdown, and I thought that was the reason. Once implemented, cookdown reduced resource usage a lot, but I still get a lot of strain on the agents.
For example, the agents on all my IIS web servers restart all the time, and MonitoringHost private bytes look like this:


[Image: 109878-mon-host-priv-bytes.png (graph of MonitoringHost private bytes)]


If I disable one of these monitors, the performance impact drops slightly, but only after disabling both does it return to normal.

The scripts are very simple. I am including only the logic part, not the whole script.


App Pool Availability Monitor

 # Begin MAIN script section
 #=================================================================================
    
 $strCondition = Get-WebAppPoolState
    
 foreach ($strCond in $strCondition) {
    
     $bag = $momapi.CreatePropertyBag()
     $PoolID = $strCond.ItemXPath.split("'")[1]
     $AppPoolStatus = $strCond.Value
     $bag.AddValue('PoolID',$PoolID)
    
     #Check the value of $strCond
     IF ($AppPoolStatus -eq 'Started') {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Good Condition Found')
         $bag.AddValue('Result','Running')
     } else {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Bad Condition Found')
         $bag.AddValue('Result','NotRunning')
     }
    
     $bag
 }
    
 # End of script section


Web Site Availability Monitor

 # Begin MAIN script section
 #=================================================================================
    
 $strCondition = Get-WebSiteState
    
 foreach ($strCond in $strCondition) {
    
     $bag = $momapi.CreatePropertyBag()
     $DisplayName = $strCond.ItemXPath.split("'")[1]
     $WebSiteStatus = $strCond.Value
     $bag.AddValue('DisplayName',$DisplayName)
    
     #Check the value of $strCond
     IF ($WebSiteStatus -eq 'Started') {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Good Condition Found')
         $bag.AddValue('Result','Running')
     } else {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Bad Condition Found')
         $bag.AddValue('Result','NotRunning')
     }
    
     $bag
 }
    
 # End of script section



I am wondering why this uses so many resources, and whether there is a way for me to pinpoint where it happens.

I have also tried overriding the agent to raise the Private Bytes and Handle Count thresholds, but the values just kept rising until the agent restarted.


Also, are there alternatives that solve the initial problem, i.e. another way to get Web Site and App Pool availability monitors whose alert suppression you can control?

Thanks!
Niksa










msc-operations-manager

njadric answered CyrAz commented

Okay, I have found a solution, or rather a workaround.

Instead of using the IIS WebAdministration PowerShell module cmdlets, I have found alternative ways to get IIS App Pool and IIS Web Site status data.

This is what the PowerShell part looks like in each:

App Pool Availability Monitor

 # Begin MAIN script section
 #=================================================================================
    
    
 $counters = Get-Counter -Counter "\APP_POOL_WAS(*)\Current Application Pool State"
 $strCondition = $counters.countersamples | where InstanceName -ne "_total"
    
 foreach ($strCond in $strCondition) {
    
     $bag = $momapi.CreatePropertyBag()
     $PoolID = $strCond.InstanceName
     $AppPoolStatus = $strCond.RawValue
     $bag.AddValue('PoolID',$PoolID)
    
     #Check the value of $strCond
     IF ($AppPoolStatus -eq '3') {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Good Condition Found')
         $bag.AddValue('Result','Running')
     } else {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Bad Condition Found')
         $bag.AddValue('Result','NotRunning')
     }
                                        
     $bag
 }                                    
    
    
 # End of script section
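
A note on where the `3` comes from: the `Current Application Pool State` counter reports a numeric code, and to the best of my knowledge the documented values run from 1 (Uninitialized) to 7 (Delete Pending), with 3 meaning Running. Below is a minimal sketch of turning the raw value into a readable name. `ConvertTo-AppPoolStateName` is a hypothetical helper, not part of any module, and the table should be checked against the counter description on your own servers.

```powershell
# Hypothetical helper: maps the raw value of
# "\APP_POOL_WAS(*)\Current Application Pool State" to a readable name.
# The table is an assumption based on the counter's documented values;
# verify it against the counter description on your own servers.
function ConvertTo-AppPoolStateName {
    param([Parameter(Mandatory)][long]$RawValue)

    $states = @{
        1 = 'Uninitialized'
        2 = 'Initialized'
        3 = 'Running'
        4 = 'Disabling'
        5 = 'Disabled'
        6 = 'Shutdown Pending'
        7 = 'Delete Pending'
    }

    # The hashtable keys above are [int], so cast before the lookup.
    $key = [int]$RawValue
    if ($states.ContainsKey($key)) { $states[$key] } else { "Unknown ($RawValue)" }
}
```

Adding the state name to the property bag or the logged event could make a "NotRunning" result easier to interpret, for example to tell a disabled pool from one that is still initializing.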


Web Site Availability Monitor

 # Begin MAIN script section
 #=================================================================================
    
                                        
 $strCondition = & "$env:SystemRoot\system32\inetsrv\AppCmd.exe" list site
    
 foreach ($strCond in $strCondition) {
    
     $bag = $momapi.CreatePropertyBag()
     $DisplayName = $strCond.Split('"')[1]
     $WebSiteStatus = $strCond.Split(':')[-1]
     $bag.AddValue('DisplayName',$DisplayName)
    
     #Check the value of $strCond
     IF ($WebSiteStatus -eq 'Started)') {     # Started) is okay because that's the result of splitting the string
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Good Condition Found')
         $bag.AddValue('Result','Running')
     } else {
         $momapi.LogScriptEvent($ScriptName,$EventID,0,'Bad Condition Found')
         $bag.AddValue('Result','NotRunning')
     }
    
     $bag
 }
    
                                        
 # End of script section
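
The `Split(':')[-1]` approach relies on `state:` being the last field on each line. A slightly more defensive sketch, assuming `AppCmd.exe list site` prints lines of the form `SITE "name" (id:1,bindings:...,state:Started)`, is to pull the name and state out with a regular expression. `ConvertFrom-AppCmdSiteLine` is a hypothetical helper name, and the sample line in the usage is made up for illustration.

```powershell
# Hypothetical helper: parses one line of "AppCmd.exe list site" output.
# Assumed format: SITE "<name>" (id:<n>,bindings:<...>,state:<State>)
function ConvertFrom-AppCmdSiteLine {
    param([Parameter(Mandatory)][string]$Line)

    if ($Line -match '^SITE\s+"(?<Name>[^"]+)"\s+\(.*state:(?<State>\w+)\)\s*$') {
        [pscustomobject]@{
            DisplayName = $Matches['Name']
            State       = $Matches['State']
        }
    }
    # Lines that do not match return nothing, so callers can skip them.
}

# Usage with an illustrative (made-up) line:
$site = ConvertFrom-AppCmdSiteLine -Line 'SITE "Default Web Site" (id:1,bindings:http/*:80:,state:Started)'
$site.DisplayName   # Default Web Site
$site.State         # Started
```

With this, the comparison in the monitor can be against `'Started'` rather than `'Started)'`, and unexpected lines are skipped instead of producing a property bag with an empty name.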


I still don't understand why those two cmdlets cause so much overhead; hopefully I will find out one day.


Weird one indeed. It would be interesting to see how they behave when run multiple times in a row in a regular PowerShell session. Would memory usage increase dramatically there as well?
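
One way to run that experiment is sketched below: call a script block repeatedly in an ordinary PowerShell session and record the process's private bytes after each run. `Measure-PrivateBytesGrowth` is a hypothetical helper; only the commented usage line touches the WebAdministration cmdlet, and whether a plain console reproduces what MonitoringHost does is an open question.

```powershell
# Hypothetical helper: runs a script block several times and records the
# private bytes of the current PowerShell process after each run.
function Measure-PrivateBytesGrowth {
    param(
        [Parameter(Mandatory)][scriptblock]$ScriptBlock,
        [int]$Iterations = 10
    )

    $process = [System.Diagnostics.Process]::GetCurrentProcess()

    for ($i = 1; $i -le $Iterations; $i++) {
        # Discard the output; only the memory side effect matters here.
        $null = & $ScriptBlock

        # Process objects cache their counters, so refresh before reading.
        $process.Refresh()

        [pscustomobject]@{
            Iteration      = $i
            PrivateBytesMB = [math]::Round($process.PrivateMemorySize64 / 1MB, 1)
        }
    }
}

# Usage on an IIS server (needs the WebAdministration module):
# Measure-PrivateBytesGrowth -ScriptBlock { Get-WebSiteState } -Iterations 20
```

If the numbers climb steadily with `Get-WebSiteState` but stay flat with a trivial script block, that would point at the cmdlet rather than at the SCOM workflow.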

CyrAz answered

You don't show how you handle the "after a certain period of the object being down" part of your monitoring. Just a wild guess, but that could be the cause of your issue if there is some kind of never-exiting loop in your code.

njadric answered njadric published

@CyrAz,

It's in the fragment. Below is the whole "IIS Web Site Availability" fragment from the management pack. See the "MatchCount" part.

I should add that the monitors are working as intended.


 <ManagementPackFragment SchemaVersion="2.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!--
 %%
 Description:
  This monitor supports cookdown.
  The PowerShell script returns multiple property bags, one for each IIS Web Site
  Each property bag contains the Web Site name (DisplayName) and status
  Later on, in Condition Detection, we filter these property bags and "connect" each one to a SCOM IIS Web Site object
    
   A MONITOR which runs a timed PowerShell script as the DataSource and outputs a propertybag as GOOD or BAD to drive Monitor state and ALERT
   This is a simple script monitor where the script is not passed in any external parameters
   CompanyID - is a short abbreviation for your company with NO SPACES OR SPECIAL CHARACTERS ALLOWED
   AppName - is a short name for your app with NO SPACES OR SPECIAL CHARACTERS ALLOWED  
   ClassID - is the targeted class such as your custom class or Windows!Microsoft.Windows.Server.OperatingSystem
   UniqueID - Is a unique short description of the monitor purpose (NO SPACES OR SPECIAL CHARACTERS ALLOWED) such as "MonitorFilesInFolder"  
      
 Version: 1.4
 LastModified: 25-May-2019
 %%
    
 In this fragment you need to replace:
   CORP
   IIS8
   MWI2!Microsoft.Windows.InternetInformationServices.6.2.WebSite
   IIS8WebSiteAvailabilityCustom  
      
 This fragment depends on references:
   RequiredReference: Alias="System", ID="System.Library"
   RequiredReference: Alias="Windows", ID="Microsoft.Windows.Library"
   RequiredReference: Alias="Health", ID="System.Health.Library"
      
 @@Author=Kevin Holman@@  
 -->
  <TypeDefinitions>
  <ModuleTypes>
  <DataSourceModuleType ID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.DataSource" Accessibility="Internal" Batching="false">
  <Configuration>
  <xsd:element minOccurs="1" type="xsd:integer" name="IntervalSeconds" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  <xsd:element minOccurs="0" type="xsd:string" name="SyncTime" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  <xsd:element minOccurs="1" type="xsd:integer" name="TimeoutSeconds" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  </Configuration>
  <OverrideableParameters>
  <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
  <OverrideableParameter ID="SyncTime" Selector="$Config/SyncTime$" ParameterType="string" />
  <OverrideableParameter ID="TimeoutSeconds" Selector="$Config/TimeoutSeconds$" ParameterType="int" />
  </OverrideableParameters>
  <ModuleImplementation Isolation="Any">
  <Composite>
  <MemberModules>
  <DataSource ID="Scheduler" TypeID="System!System.Scheduler">
  <Scheduler>
  <SimpleReccuringSchedule>
  <Interval Unit="Seconds">$Config/IntervalSeconds$</Interval>
  <SyncTime>$Config/SyncTime$</SyncTime>
  </SimpleReccuringSchedule>
  <ExcludeDates />
  </Scheduler>
  </DataSource>
  <ProbeAction ID="PA" TypeID="Windows!Microsoft.Windows.PowerShellPropertyBagTriggerOnlyProbe">
  <ScriptName>CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.DataSource.ps1</ScriptName>
  <ScriptBody>
  #=================================================================================
  #  Describe Script Here
  #
  #  Author:
  #  v1.0
  #=================================================================================
    
    
  # Constants section - modify stuff here:
  #=================================================================================
  # Assign script name variable for use in event logging.
  $ScriptName = "CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.DataSource.ps1"
  $EventID = "1234"
  #=================================================================================
    
    
  # Starting Script section - All scripts get this
  #=================================================================================
  # Gather the start time of the script
  $StartTime = Get-Date
  #Set variable to be used in logging events
  $whoami = whoami
  # Load MOMScript API
  $momapi = New-Object -comObject MOM.ScriptAPI
  # Load PropertyBag function
  # $bag = $momapi.CreatePropertyBag()  WE ARE CREATING PROPERTY BAG FOR EACH OBJECT BELOW
  #Log script event that we are starting task
  $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script is starting. `n Running as ($whoami).")
  #=================================================================================
    
    
  # Begin MAIN script section
  #=================================================================================
    
  $strCondition = Get-WebSiteState
    
  foreach ($strCond in $strCondition) {
    
  $bag = $momapi.CreatePropertyBag()
  $DisplayName = $strCond.ItemXPath.split("'")[1]
  $WebSiteStatus = $strCond.Value
  $bag.AddValue('DisplayName',$DisplayName)
    
  #Check the value of $strCond
  IF ($WebSiteStatus -eq 'Started') {
  $momapi.LogScriptEvent($ScriptName,$EventID,0,'Good Condition Found')
  $bag.AddValue('Result','Running')
  } else {
  $momapi.LogScriptEvent($ScriptName,$EventID,0,'Bad Condition Found')
  $bag.AddValue('Result','NotRunning')
  }
    
  $bag
  }
    
  # End of script section
  #=================================================================================
  #Log an event for script ending and total execution time.
  $EndTime = Get-Date
  $ScriptTime = ($EndTime - $StartTime).TotalSeconds
  $momapi.LogScriptEvent($ScriptName,$EventID,0,"`n Script Completed. `n Script Runtime: ($ScriptTime) seconds.")
  #=================================================================================
  # End of script
  </ScriptBody>
  <TimeoutSeconds>$Config/TimeoutSeconds$</TimeoutSeconds>
  </ProbeAction>
  </MemberModules>
  <Composition>
  <Node ID="PA">
  <Node ID="Scheduler" />
  </Node>
  </Composition>
  </Composite>
  </ModuleImplementation>
  <OutputType>System!System.PropertyBagData</OutputType>
  </DataSourceModuleType>
  </ModuleTypes>
  <MonitorTypes>
  <UnitMonitorType ID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.MonitorType" Accessibility="Internal">
  <MonitorTypeStates>
  <MonitorTypeState ID="Running" NoDetection="false" />
  <MonitorTypeState ID="NotRunning" NoDetection="false" />
  </MonitorTypeStates>
  <Configuration>
  <xsd:element minOccurs="1" type="xsd:integer" name="IntervalSeconds" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  <xsd:element minOccurs="0" type="xsd:string" name="SyncTime" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  <xsd:element minOccurs="1" type="xsd:integer" name="TimeoutSeconds" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  <xsd:element minOccurs="1" type="xsd:integer" name="MatchCount" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  <xsd:element minOccurs="1" type="xsd:string" name="DisplayName" xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
  </Configuration>
  <OverrideableParameters>
  <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
  <OverrideableParameter ID="SyncTime" Selector="$Config/SyncTime$" ParameterType="string" />
  <OverrideableParameter ID="TimeoutSeconds" Selector="$Config/TimeoutSeconds$" ParameterType="int" />
  <OverrideableParameter ID="MatchCount" Selector="$Config/MatchCount$" ParameterType="int" />
  </OverrideableParameters>
  <MonitorImplementation>
  <MemberModules>
  <DataSource ID="DS" TypeID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.DataSource">
  <IntervalSeconds>$Config/IntervalSeconds$</IntervalSeconds>
  <SyncTime>$Config/SyncTime$</SyncTime>
  <TimeoutSeconds>$Config/TimeoutSeconds$</TimeoutSeconds>
  </DataSource>
  <!-- added this condition detection for cookdown BEGIN -->
  <ConditionDetection ID="CookDownFilter" TypeID="System!System.ExpressionFilter">
  <Expression>
  <SimpleExpression>
  <ValueExpression>
  <XPathQuery Type="String">Property[@Name="DisplayName"]</XPathQuery>
  </ValueExpression>
  <Operator>Equal</Operator>
  <ValueExpression>
  <Value Type="String">$Config/DisplayName$</Value>
  </ValueExpression>
  </SimpleExpression>
  </Expression>
  </ConditionDetection>
  <!-- added this condition detection for cookdown END -->
  <ConditionDetection ID="RunningFilter" TypeID="System!System.ExpressionFilter">
  <Expression>
  <SimpleExpression>
  <ValueExpression>
  <XPathQuery Type="String">Property[@Name="Result"]</XPathQuery>
  </ValueExpression>
  <Operator>Equal</Operator>
  <ValueExpression>
  <Value Type="String">Running</Value>
  </ValueExpression>
  </SimpleExpression>
  </Expression>
  </ConditionDetection>
  <ConditionDetection ID="NotRunningFilter" TypeID="System!System.ExpressionFilter">
  <Expression>
  <SimpleExpression>
  <ValueExpression>
  <XPathQuery Type="String">Property[@Name="Result"]</XPathQuery>
  </ValueExpression>
  <Operator>Equal</Operator>
  <ValueExpression>
  <Value Type="String">NotRunning</Value>
  </ValueExpression>
  </SimpleExpression>
  </Expression>
  <SuppressionSettings>
  <MatchCount>$Config/MatchCount$</MatchCount>
  </SuppressionSettings>
  </ConditionDetection>
  </MemberModules>
  <RegularDetections>
  <RegularDetection MonitorTypeStateID="Running">
  <Node ID="RunningFilter">
  <Node ID="CookDownFilter">
  <Node ID="DS" />
  </Node>
  </Node>
  </RegularDetection>
  <RegularDetection MonitorTypeStateID="NotRunning">
  <Node ID="NotRunningFilter">
  <Node ID="CookDownFilter">
  <Node ID="DS" />
  </Node>
  </Node>
  </RegularDetection>
  </RegularDetections>
  <OnDemandDetections>
  <OnDemandDetection MonitorTypeStateID="Running">
  <Node ID="RunningFilter">
  <Node ID="CookDownFilter">
  <Node ID="DS" />
  </Node>
  </Node>
  </OnDemandDetection>
  <OnDemandDetection MonitorTypeStateID="NotRunning">
  <Node ID="NotRunningFilter">
  <Node ID="CookDownFilter">
  <Node ID="DS" />
  </Node>
  </Node>
  </OnDemandDetection>
  </OnDemandDetections>
  </MonitorImplementation>
  </UnitMonitorType>
  </MonitorTypes>
  </TypeDefinitions>
  <Monitoring>
  <Monitors>
  <UnitMonitor ID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor" Accessibility="Public" Enabled="true" Target="MWI2!Microsoft.Windows.InternetInformationServices.6.2.WebSite" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.MonitorType" ConfirmDelivery="true">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.AlertMessage">
  <AlertOnState>Error</AlertOnState>
  <!-- Warning or Error should match OperationalStates below  -->
  <AutoResolve>true</AutoResolve>
  <AlertPriority>Normal</AlertPriority>
  <AlertSeverity>MatchMonitorHealth</AlertSeverity>
  <!-- Common options for AlertSeverity are MatchMonitorHealth, Information, Warning, Error -->
  </AlertSettings>
  <OperationalStates>
  <OperationalState ID="Running" MonitorTypeStateID="Running" HealthState="Success" />
  <OperationalState ID="NotRunning" MonitorTypeStateID="NotRunning" HealthState="Error" />
  <!-- HealthState = Warning or Error -->
  </OperationalStates>
  <Configuration>
  <!-- has to go in order like it goes in MonitorType / Configuration at line 144  -->
  <IntervalSeconds>120</IntervalSeconds>
  <SyncTime></SyncTime>
  <TimeoutSeconds>120</TimeoutSeconds>
  <MatchCount>3</MatchCount>
  <!-- This is the number of consecutive matches that must be met before the monitor will change state.  Also a good example of passing in Integer data. -->
  <DisplayName>$Target/Property[Type="System!System.Entity"]/DisplayName$</DisplayName>
  </Configuration>
  </UnitMonitor>
  </Monitors>
  </Monitoring>
  <Presentation>
  <StringResources>
  <StringResource ID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.AlertMessage" />
  </StringResources>
  </Presentation>
  <LanguagePacks>
  <LanguagePack ID="ENU" IsDefault="true">
  <DisplayStrings>
  <DisplayString ElementID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor">
  <Name>Web Site Availability - IIS8 Custom Monitor</Name>
  <Description></Description>
  </DisplayString>
  <DisplayString ElementID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor" SubElementID="Running">
  <Name>Good Condition</Name>
  </DisplayString>
  <DisplayString ElementID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor" SubElementID="NotRunning">
  <Name>Bad Condition</Name>
  </DisplayString>
  <DisplayString ElementID="CORP.IIS8.IIS8WebSiteAvailabilityCustom.Monitor.AlertMessage">
  <Name>Web Site Availability - IIS8 Custom Monitor: Failure</Name>
  <Description>Web Site Availability - IIS8 Custom Monitor: detected a bad condition</Description>
  </DisplayString>
  </DisplayStrings>
  </LanguagePack>
  </LanguagePacks>
 </ManagementPackFragment>
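
To make the suppression behaviour concrete: per the comment in the fragment, `MatchCount` is the number of consecutive matches needed before the monitor changes state. With `IntervalSeconds` of 120 and `MatchCount` of 3, a site has to be seen down on three runs in a row, which is roughly four to six minutes after it stops, before the monitor turns unhealthy. The sketch below only simulates that counting in plain PowerShell. `Test-ConsecutiveMatch` is a hypothetical name, and this is not how the SCOM module itself is implemented.

```powershell
# Hypothetical simulation of "N consecutive matches" suppression.
# Returns $true once $MatchCount consecutive 'NotRunning' results are seen.
function Test-ConsecutiveMatch {
    param(
        [Parameter(Mandatory)][string[]]$Results,
        [int]$MatchCount = 3
    )

    $streak = 0
    foreach ($result in $Results) {
        # Any other result breaks the streak and restarts the count.
        if ($result -eq 'NotRunning') { $streak++ } else { $streak = 0 }
        if ($streak -ge $MatchCount) { return $true }
    }
    return $false
}

# Isolated bad samples (e.g. short IIS restarts) do not trip the monitor:
Test-ConsecutiveMatch -Results 'Running','NotRunning','Running','NotRunning'      # False
# Three bad samples in a row do:
Test-ConsecutiveMatch -Results 'Running','NotRunning','NotRunning','NotRunning'   # True
```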

CyrAz answered

I can't say I see anything extraordinarily wrong here, but I would try the following adjustments:
- Set a SyncTime instead of leaving it blank, to ensure cookdown is properly enforced
- Set a Timeout shorter than the running interval

If the issue remains the same, I guess you'll have to take a memory dump of monitoringhost.exe and try to find what objects are staying in memory...

njadric answered

@CyrAz

I have removed SyncTime completely and set a shorter Timeout than Interval, but there was no change.
I took a process dump, but I have no idea how to debug it, and nothing obvious stands out.

I did manage to pinpoint that the problem is in the PowerShell part, specifically in the two pieces of code I shared first.
When I comment out those pieces, Private Bytes returns to normal behavior. Furthermore, I have narrowed the problem down to the lines where the $strCondition variables are assigned:

For App Pool Availability Monitor:

 $strCondition = Get-WebAppPoolState

For Web Site Availability Monitor:

 $strCondition = Get-WebSiteState

Why do these two commands leak so much? I have tried to use Remove-Module WebAdministration at the end of the script to unload the module, but it didn't help.

One more thing: on every interval, the private bytes of MonitoringHost increase by around 70 MB.


CyrAz answered CyrAz edited

I would have set SyncTime instead of removing it just to make sure cookdown would always happen, but that doesn't seem to be the cause of your issue anyway.

Unfortunately I'm not great at debugging memory dumps either, I usually just tinker around until I find something relevant - or pass it to a colleague who can handle that better than I can.

I wouldn't say exactly that these commands are "leaking"; however, it does look like the variables are somehow never cleaned up.
Could you try adding Remove-Variable strCondition (the name without the $) at the end of your script?
Also, you can have a look at this article: https://getblogpost.pendlenet.co.uk/garbage-collection-in-powershell-scripts-in-scom/

I must add that I have never bothered with manually cleaning up variables or handling garbage collection in my PowerShell workflows, and I have never faced that kind of issue either. Nor have I ever seen any other MP doing it.
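
For anyone trying this, note that `Remove-Variable` takes the variable name without the `$`. A minimal sketch of an end-of-script cleanup follows. The sample data stands in for the cmdlet output, and whether forcing a garbage collection actually helps inside MonitoringHost is an assumption that has not been tested here.

```powershell
# Stand-in for the real data; in the monitor this would be the cmdlet output.
$strCondition = 1..1000

# ... main script logic would run here ...

# Remove the variable by name (no leading '$'); ignore it if already gone.
Remove-Variable -Name strCondition -ErrorAction SilentlyContinue

# Ask the .NET runtime to collect unreferenced objects now.
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
```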
