Microsoft 365 change management

To maintain its security posture, Microsoft 365 enforces change management procedures for both code and non-code changes to its systems. Any configuration drift from the initial posture can introduce vulnerabilities, break functionality, or disrupt availability. Once an information system within Microsoft 365 has been deployed with a robust security posture, detailed change management processes are enforced to maintain system integrity.

There are many drivers of change in Microsoft 365, including new functional or security requirements, feedback from customers, identified vulnerabilities, and audit findings. Regardless of the driver for change, service teams use ticketing or source control tools to document evidence of approval and track all changes.

Source code changes

Changes are deployed through Microsoft's Security Development Lifecycle (SDL), which all engineering and development projects in Microsoft 365 follow. The SDL is a software development model that includes specific security considerations related to code reviews, tests, and approvals. Changes must satisfy these requirements before they're systematically released into the Microsoft 365 environment.

[Figure: Code change process]

The SDL acts as a framework and includes the identification of possible risks to the finished development project and mitigation strategies that can be implemented and tested during the development phases. Critical security review and approval checkpoints are included as well.

Change identification and planning

Service teams meet regularly to discuss proposed changes, including justification, scope, security impact, priority, dependencies, deployment plans, roles, and responsibilities. This information is documented in the change management tracking system. If the change is rejected, the justification is explicitly documented in the ticket for future reference.

Personnel code reviews

Developers tasked with implementing a code change work in a branch that replicates the main branch's code, which allows them to make the necessary modifications, and then submit the change as a pull request. Before any new code can be included in a new build and deployed, it must pass personnel code review. These reviews are enforced by an automated code pipeline attached to each code repository and can't be circumvented. Once the required approval is received, the code can move on to the next phase.

Code reviewers check for coding errors, verify that the changes meet the requirements, and perform a security impact analysis. Reviews must be conducted by someone other than the people who developed the code, which enforces the principle of separation of duties. Preventing people from approving their own code is a critical control that Microsoft strictly enforces. It greatly reduces the possibility that one person can single-handedly release harmful or buggy code, whether intentionally or unintentionally. If reviewers find problems during the code review, they halt the change and have developers resubmit the code with the suggested changes and additional testing. Code reviewers may also reject the check-in entirely if the code doesn't meet the identified requirements. Once the reviewer deems the code satisfactory, approval is provided, and the code is checked into the main branch as a commit.
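The separation-of-duties rule described above can be sketched as a simple merge check. This is a minimal illustration, not Microsoft's pipeline: the `PullRequest` type, its field names, and the default of one required approval are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical pull request record; the field names are illustrative."""
    authors: set[str]
    approvals: set[str] = field(default_factory=set)


def independent_approvals(pr: PullRequest) -> set[str]:
    """Return only the approvals given by people who did not author the change."""
    return pr.approvals - pr.authors


def can_merge(pr: PullRequest, required: int = 1) -> bool:
    """Allow check-in only when enough non-authors have approved (assumed threshold)."""
    return len(independent_approvals(pr)) >= required


self_approved = PullRequest(authors={"alice"}, approvals={"alice"})
peer_approved = PullRequest(authors={"alice"}, approvals={"alice", "bob"})

print(can_merge(self_approved))  # False: the only approval comes from the author
print(can_merge(peer_approved))  # True: bob is an independent reviewer
```

Subtracting the author set before counting means a self-approval never contributes to the required total, regardless of how many other approvals exist.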

Automated build pipeline and security checks

Once all changes for the sprint are committed to the main branch, the automated build process begins. This is where the code is subjected to various automated security checks. These checks include static code analysis, binary analysis, and encryption scanning. Microsoft 365 defines a set of essential tests that each build must pass prior to deployment to pre-production environments. Builds that don't pass are rejected and sent back to the development team where the necessary adjustments are made until they can reach the threshold pass rate. Successful builds proceed to the pre-production environment via an automated deployment pipeline.
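As a rough sketch of the gating idea in this step, the following example runs a set of named checks against a build and decides where the build goes next. The check functions are stand-ins that return fixed results, the names `gate_build` and `next_step` are hypothetical, and the example assumes every listed check must pass, which simplifies the threshold pass rate mentioned above.

```python
from typing import Callable

# A check takes a build identifier and returns True when the build passes.
Check = Callable[[str], bool]


def gate_build(build_id: str, checks: dict[str, Check]) -> list[str]:
    """Run every check and return the names of the checks that failed."""
    return [name for name, check in checks.items() if not check(build_id)]


def next_step(failed: list[str]) -> str:
    """Route the build: forward if nothing failed, otherwise back to the developers."""
    if failed:
        return "return to development team"
    return "deploy to pre-production"


# Stand-in checks with fixed results; real scanners would inspect the build.
checks: dict[str, Check] = {
    "static code analysis": lambda build: True,
    "binary analysis": lambda build: True,
    "encryption scanning": lambda build: False,
}

failed = gate_build("build-001", checks)
print(failed)             # ['encryption scanning']
print(next_step(failed))  # return to development team
```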

Build release

Builds are initially released only to the service team that developed the feature. The build must function without issue before it's released to progressively larger test groups in logically isolated cloud environments called rings. After the service team, the build is released to all internal Microsoft 365 groups, and then to all internal Microsoft groups. This testing, often referred to internally as dogfooding, allows Microsoft to identify bugs in the true production environment before the build is released to external customers. These testing methods help ensure that Microsoft's code is secure and functioning as expected before it reaches customers in the worldwide deployment. Previous builds are always retained for rollback purposes.

The engineering teams determine how long a build must spend in each ring, including during high-load periods, before it proceeds to the next ring. If all testing is successful in each internal ring, the build is released to customers worldwide, first as a Targeted Release to customer tenants who have opted into that ring, followed by a Worldwide Standard Release.
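The ring progression described in this section can be illustrated with a short sketch. The ring names come from the text, but the `roll_out` function and the `is_healthy` callback are hypothetical, and the example leaves out the time a build spends in each ring.

```python
from typing import Callable

# Ring order as described in the text, from smallest audience to largest.
RINGS = [
    "Service team",
    "All internal Microsoft 365 groups",
    "All internal Microsoft groups",
    "Targeted Release",
    "Worldwide Standard Release",
]


def roll_out(rings: list[str], is_healthy: Callable[[str], bool]) -> list[str]:
    """Release ring by ring and return every ring the build was released to.

    The rollout stops at the first ring where the build is not healthy,
    so later rings never receive that build.
    """
    released = []
    for ring in rings:
        released.append(ring)
        if not is_healthy(ring):
            break
    return released


# Assume a problem surfaces once the build reaches all internal Microsoft groups.
released = roll_out(RINGS, lambda ring: ring != "All internal Microsoft groups")
print(released)
print("Targeted Release" in released)  # False: customers never saw this build
```

Because previous builds are retained, a rollout that stops early can fall back to the last build that passed the affected ring.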

Non-code changes

Non-code changes are defined as any modifications to Microsoft 365 systems that don't involve creating or editing service source code. Examples include opening ports, changing Access Control Lists (ACLs), or making other changes to the underlying system. Non-code changes occur less frequently than code changes but still require a high level of scrutiny.

[Figure: Non-code change process]

A description of the change is documented along with implementation steps, validation steps, and a rollback plan. Before the change is implemented, at least one person peer reviews the plans for accuracy and security impact. Once approved, the documented plans are implemented. If all validation steps pass, the results are documented in the ticket, and the ticket is marked as resolved.

If the implementation of the change is unsuccessful, the rollback plans are triggered, and the team returns to the planning phase and repeats the process until successful.
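The implement, validate, and roll back sequence for non-code changes can be sketched as follows. The `ChangePlan` type, the `apply_change` function, the status strings, and the port example are assumptions made for illustration and do not represent an actual Microsoft 365 tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChangePlan:
    """Hypothetical documented plan for a non-code change."""
    description: str
    implement: Callable[[], None]
    validate: Callable[[], bool]
    rollback: Callable[[], None]
    reviewers: tuple[str, ...] = ()


def apply_change(plan: ChangePlan) -> str:
    """Apply a peer-reviewed plan, rolling back if validation fails."""
    if not plan.reviewers:
        raise ValueError("plan needs at least one peer reviewer")
    plan.implement()
    if plan.validate():
        return "resolved"
    plan.rollback()
    return "back to planning"


# Toy system state standing in for a real firewall configuration.
state = {"port_443_open": False}

plan = ChangePlan(
    description="Open port 443",
    implement=lambda: state.update(port_443_open=True),
    validate=lambda: state["port_443_open"],
    rollback=lambda: state.update(port_443_open=False),
    reviewers=("bob",),
)

print(apply_change(plan))  # resolved
```

Requiring the rollback step as part of the plan itself means a failed validation can always return the system to its prior state before the team replans.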

Resources