I have several Pods protected by a PodDisruptionBudget (maxUnavailable is 1) in an AKS node pool, with one Pod per Node. I upgraded the node pool right after the control plane upgrade finished (Kubernetes v1.18 to v1.19), and more than one Node then restarted at the same time, which caused workload downtime.
According to the AKS CSS team, once the control plane has been upgraded, an additional reconciliation loop is triggered by design through a PUT request. It checks whether the configuration of the VMSS instances matches what is expected; if it does not, a restart or reimage may be carried out. As a result, some Nodes were unexpectedly restarted twice during my node pool upgrade.
My question is: which kinds of AKS reconciliation restart Nodes regardless of the PDB? I suspect that certain operations are risky, because Nodes are sometimes restarted directly without being drained first.
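
To make the concern concrete, here is a toy model of how I understand the difference between a drain that respects the PDB and a restart that bypasses it. This is not AKS or Kubernetes code; the class `Pdb` and the functions `try_evict`, `restart_node_directly` and `simulate` are hypothetical names made up for this sketch, and the only assumption taken from my setup is maxUnavailable = 1.

```python
from dataclasses import dataclass


@dataclass
class Pdb:
    """Toy stand-in for a PodDisruptionBudget with maxUnavailable (hypothetical)."""
    max_unavailable: int
    unavailable: int = 0

    def try_evict(self) -> bool:
        """Model a drain: the eviction is refused once the budget is used up."""
        if self.unavailable >= self.max_unavailable:
            return False
        self.unavailable += 1
        return True

    def restart_node_directly(self) -> None:
        """Model a restart/reimage without a drain: the budget is never consulted."""
        self.unavailable += 1


def simulate():
    pdb = Pdb(max_unavailable=1)
    first = pdb.try_evict()      # upgrade drains the first Node: allowed
    second = pdb.try_evict()     # upgrade tries a second Node: blocked by the PDB
    pdb.restart_node_directly()  # a reconciliation restarts the second Node anyway
    return first, second, pdb.unavailable


if __name__ == "__main__":
    print(simulate())
```

In this model the second eviction is refused, yet two Pods still end up unavailable at once, which matches the downtime I observed.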