Summary

In this module, you've looked at the five key principles of the Reliability pillar of the Azure Well-Architected Framework.

Outages and malfunctions are expected events for workloads that run in the cloud on distributed systems. Workload design should therefore anticipate their effects and build in reliability measures that strengthen the resiliency of the system. Reliability design focuses on maximizing availability through redundancy, scalability, the application of design patterns, and proper operational procedures. When malfunctions and outages do happen, the design also focuses on minimizing the effects, or blast radius, of the event through industry-proven design patterns. A robust observability platform is necessary so that the teams supporting the workload can react efficiently to potential or active events.

Strong reliability comes with tradeoffs against other Well-Architected Framework pillars, such as performance efficiency and cost optimization, so carefully balancing and prioritizing the pillars is essential to success.

Learn more

To learn more about the Azure Well-Architected Framework and Azure services that improve the reliability of your architectures, see the following resources: