Get started: Improve reliability with the right controls
How do you apply the right controls to improve reliability? This article helps you minimize disruptions related to:
- Inconsistencies in configuration.
- Resource organization.
- Security baselines.
- Resource protection.
The steps in this article help the operations team balance reliability and cost across the IT portfolio. This article also helps the governance team ensure that balance is applied consistently. Reliability also depends on other roles and functions. This article maps those supporting functions to help you create alignment among the teams involved.
Operations management and governance are equal partners in enterprise reliability. The decisions you make about operational practices set the baseline for reliability. The approaches used to govern the overall environment ensure consistency across all resources.
The first two steps in this article help both teams get started. They're listed sequentially, but you can perform them in parallel. The subsequent steps help you get the entire enterprise started on a shared journey toward more reliable solutions.
Step 1: Establish operations management requirements
Not all workloads are created equal. In any environment, there are workloads that have a direct and constant impact on the business. There are also supporting business processes and workloads that have a smaller impact on the overall business. In this step, the cloud operations team identifies and implements initial requirements to support the overall IT portfolio.
Deliverables:
- Implement a management baseline to define the standard operations that are required for all production workloads.
- Negotiate business commitments with the cloud strategy team to develop a plan for advanced operations and resiliency requirements.
- Expand your management baseline if additional operations are required for the majority of workloads.
- Apply advanced operations requirements to landing zones and resources that support the most critical workloads.
- Document operations decisions across the IT portfolio in the operations management workbook.
Guidance to support deliverable completion:
- Inventory and visibility: Cloud-native tools can help you collect data and configure alerts. The tools can also help you implement the monitoring platform that best fits your operating model.
- Operational compliance: The highest percentages of outages tend to come from changes to resource configuration or poor maintenance practices. Follow the Azure server management guide to implement cloud-native tools to manage patching and changes to resource configuration.
- Protection and recovery: Outages are inevitable on any platform. When a disruption occurs, be prepared with backup and recovery solutions to minimize its duration.
- Advanced operations: Use the management baseline as the foundation for your business alignment conversations. It helps you clearly discuss criticality, business impact, and operations commitments. Business alignment helps quantify and validate requests for an enhanced baseline, management of specific technology platforms, or workload-specific operations.
- Guide an architecture review: Architecture changes at the workload level might be required to meet operations requirements. The Microsoft Azure Well-Architected Framework and Microsoft Azure Well-Architected Review can help guide those conversations with the technical owner of a specific workload.
Accountable team | Responsible and supporting teams |
---|---|
Step 2: Consistently apply the management baseline
Enterprise reliability requires consistent application of the management baseline. That consistency comes from appropriate corporate policy, IT processes, and automated tools. These resources govern the implementation of the management baseline for all affected resources.
Deliverables:
- Ensure proper application of the management baseline for all affected systems.
- Document your Resource Consistency policies, processes, and design guidance in the Resource Consistency discipline template.
Guidance to support deliverable completion:
- Ensure all workloads and resources follow proper naming and tagging conventions. Enforce tagging conventions by using Azure Policy, with a specific emphasis on tags for criticality.
- If you're new to cloud governance, establish governance policies, processes, and disciplines by using the Govern methodology.
- If you're new to the Cost Management discipline, follow the guidance in the Cost Management discipline improvements article. Focus on the Implementation section.
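As a rough illustration of the tagging guidance above, the following Python sketch builds the rule portion of an Azure Policy definition that reacts when a resource has no criticality tag. The tag name `Criticality`, the `deny` effect, and both helper functions are assumptions made for this example; the article doesn't prescribe them. The local check only mimics the rule's condition and isn't part of any Azure SDK.

```python
import json

# Hypothetical tag name: the article asks for criticality tags but doesn't name one.
REQUIRED_TAG = "Criticality"


def build_require_tag_rule(tag_name: str, effect: str = "deny") -> dict:
    """Return a policy rule that applies `effect` when the tag is missing."""
    return {
        "if": {"field": f"tags['{tag_name}']", "exists": "false"},
        "then": {"effect": effect},
    }


def missing_required_tag(resource_tags: dict, tag_name: str) -> bool:
    """Local stand-in for the rule's condition: True when the tag is absent."""
    return tag_name not in resource_tags


rule = build_require_tag_rule(REQUIRED_TAG)
print(json.dumps(rule, indent=2))

# Quick local check against two example tag sets.
print(missing_required_tag({"Owner": "ops"}, REQUIRED_TAG))
print(missing_required_tag({"Criticality": "High"}, REQUIRED_TAG))
```

Starting with an `audit` effect instead of `deny` might be a gentler way to surface untagged resources before you block deployments.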
Note
Steps to start reliability partnerships with other teams: Various decisions throughout the cloud adoption lifecycle can have a direct impact on reliability. The following steps outline the partnerships and supporting efforts required to deliver consistent reliability across the IT portfolio.
Accountable team | Responsible and supporting teams |
---|---|
Step 3: Define your strategy
Strategic decisions directly affect reliability. They ripple through the adoption lifecycle and into long-term operations. Strategic clarity improves reliability efforts.
Deliverables:
- Record motivations, outcomes, and business justification in the strategy and plan template.
- Ensure the management baseline provides operational support that aligns to the strategic direction of cloud adoption.
Guidance to support deliverable completion:
- Understand motivations: Critical business events and some migration motivations tend to be cost sensitive. These areas can increase the importance of cost control for all later efforts. Other forward-looking motivations related to innovation or growth through migration might be focused more on top-line revenue. Understanding motivations helps you prioritize your cost management.
- Business outcomes: Some fiscal outcomes tend to be extremely cost sensitive. When the desired outcomes map to fiscal metrics, you should invest early in the Cost Management governance discipline.
- Business justification: The business justification serves as a high-level view of the overall financial plan for cloud adoption. It can be a good source for initial budgeting efforts.
Accountable team | Responsible and supporting teams |
---|---|
Step 4: Develop a cloud adoption plan
The digital estate (or analysis of the existing IT portfolio) can help you validate the business justification. It can provide a refined view of the overall IT portfolio. The adoption plan provides clarity on the timeline of activities during adoption.
When you align the adoption plan with the digital estate analysis, you can plan for future operations management dependencies. Understanding the adoption plan also invites the cloud operations team into the development cycles. They can evaluate and plan for any changes to the management baseline that are required to provide workload operations.
Deliverables:
Update the strategy and plan template to reflect changes that are needed to achieve the desired strategy. The changes recorded can include:
- An assessment of the existing digital estate.
- A cloud adoption plan that reflects the required changes and the work involved.
- The organizational change that's required to deliver on the plan.
- A plan for addressing the skills that the existing team needs to successfully complete the required work.
Work with the governance team to align cost models and forecast models. This process includes efforts to start optimizing spend through quantitative analysis.
Guidance to support deliverable completion:
- Gather inventory: Establish a source of data for analysis of the digital estate prior to adoption.
- Best practice: Azure Migrate: Use Azure Migrate to gather inventory.
- Incremental rationalization: During incremental rationalization, a quantitative analysis can identify cloud candidates for budgeting purposes.
- Align cost models and forecast models: Use Azure Cost Management + Billing to align cost and forecast models by creating budgets.
- Build your cloud adoption plan: Build a plan with actionable workload, assets, and timeline details.
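To make the idea of aligning cost and forecast models through budgets more concrete, here is a minimal Python sketch of the underlying arithmetic. It doesn't call Azure Cost Management + Billing. The `Budget` class, the threshold values, and the straight-line forecast are all assumptions chosen for illustration, and a real forecast model would likely be more sophisticated.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical budget record: an amount plus alert thresholds."""
    name: str
    amount: float             # budgeted spend for the period
    alert_thresholds: tuple   # fractions of the budget that raise an alert


def triggered_alerts(budget: Budget, spend: float) -> list:
    """Return the thresholds that the given spend has reached or passed."""
    return [t for t in budget.alert_thresholds if spend >= t * budget.amount]


def linear_forecast(spend_to_date: float, days_elapsed: int, days_in_period: int) -> float:
    """Project end-of-period spend by assuming the daily rate stays constant."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spend_to_date / days_elapsed * days_in_period


# Example values are made up for the sketch.
budget = Budget(name="landing-zone-a", amount=10_000.0, alert_thresholds=(0.5, 0.8, 1.0))
forecast = linear_forecast(spend_to_date=6_000.0, days_elapsed=15, days_in_period=30)

print(triggered_alerts(budget, 6_000.0))   # thresholds reached by actual spend
print(triggered_alerts(budget, forecast))  # thresholds the forecast would reach
```

Comparing the alerts for actual spend with the alerts for forecasted spend is one simple way to spot a budget that is on track today but likely to be exceeded by the end of the period.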
Accountable team | Responsible and supporting teams |
---|---|
Step 5: Implement landing zone best practices
The Ready methodology of the Cloud Adoption Framework focuses heavily on the development of landing zones to host workloads in the cloud. During landing zone implementation, multiple decisions could affect operations. Consult the cloud operations team to help review the landing zone for operations improvements. Also consult the cloud governance team to understand Resource Consistency policies and design guidance that might affect the landing zone design.
Deliverables:
- Deploy one or more landing zones capable of hosting workloads in the short-term adoption plan.
- Ensure that all landing zones meet operations decisions and resource consistency requirements.
Guidance to support deliverable completion:
- Improve landing zone operations: Best practices for improving operations within a given landing zone.
Accountable team | Responsible and supporting teams |
---|---|
Step 6: Complete waves of adoption effort and change
Long-term operations can be affected by the decisions made during migration and innovation efforts. Maintaining consistent alignment early in adoption processes helps remove barriers to production releases. It also reduces the effort that's required to introduce new solutions into operations management practices.
Deliverables:
- Test operational readiness of production deployments by using Resource Consistency policies.
- Validate adherence to resource consistency design guidance and operations requirements.
- Document any advanced operations requirements in the operations management workbook.
Guidance to support deliverable completion:
- Environmental readiness checklist
- Pre-promotion checklist
- Production release checklist
Accountable team | Responsible and supporting teams |
---|---|
Value statement
These steps help you implement the controls and processes that are needed to ensure reliability across the enterprise and all hosted resources.