When our Dev/Test (non-prod) AKS clusters are not in use, we would like to shut them down. However, we have verified the following:
“az vmss deallocate” triggers a missing-Monitoring compliance alert.
“az aks stop” causes the AKS PLE to be removed in the Regional Bastion. In addition, the state of a stopped AKS cluster is preserved for only up to 12 months; if a cluster stays stopped for more than 12 months, its state cannot be recovered (see https://docs.microsoft.com/en-us/azure/aks/start-stop-cluster).
So the best way to reduce the costs of an AKS cluster while it isn't being used looks like manually scaling down the node count.
Each of our NPRD AKS clusters has only a system node pool defined (e.g. agentpool), and "Every AKS cluster must contain at least one system node pool with at least one node." (according to https://docs.microsoft.com/en-us/azure/aks/use-system-pools).
So we can't scale the system node pool to 0. Scaling to 0 seems to be allowed only when a user node pool is defined alongside the system node pool: the user node pool can be scaled to 0, but the system node pool can't. We would still need to verify possible side effects.
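To make the idea concrete, this is roughly the manual scale-down we have in mind. It is only a sketch under assumptions: the resource group, cluster name, and user pool name are placeholders, it assumes a user node pool has already been added next to the system pool, and it assumes the cluster autoscaler is not enabled on either pool. By default it only prints the az commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: all names below are placeholders, not our real resources.
RG="${RG:-my-nprd-rg}"                    # hypothetical resource group
CLUSTER="${CLUSTER:-my-nprd-aks}"         # hypothetical cluster name
SYSTEM_POOL="${SYSTEM_POOL:-agentpool}"   # system node pool
USER_POOL="${USER_POOL:-userpool}"        # hypothetical user node pool
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Scale one node pool of the cluster to the given node count.
scale_pool() {
  run az aks nodepool scale \
    --resource-group "$RG" \
    --cluster-name "$CLUSTER" \
    --name "$1" \
    --node-count "$2"
}

# "Stand-by": the system pool keeps its mandatory single node,
# the user pool goes down to zero nodes.
standby() {
  scale_pool "$SYSTEM_POOL" 1
  scale_pool "$USER_POOL" 0
}

# "Resume": bring the user pool back to the requested size (default 2).
resume() {
  scale_pool "$USER_POOL" "${1:-2}"
}
```

After sourcing the script, calling `standby` in the evening and `resume 3` in the morning would be the intended usage; setting `DRY_RUN=0` makes it actually call az. The one system node would of course still be billed.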
Could you please confirm that what is reported above is the expected behavior?
Is there any other way to put our AKS clusters into a sort of stand-by state, so that we save costs when they are not in use, without risks?
Any hints would be appreciated.
Thanks!