
High availability in Azure Database for MySQL

The Azure Database for MySQL service provides a guaranteed high level of availability, with a financially backed service level agreement (SLA) of 99.99% uptime. It provides high availability during planned events, such as a user-initiated scale compute operation, and during unplanned events, such as underlying hardware, software, or network failures. Azure Database for MySQL can quickly recover from most critical circumstances, so applications that use the service see virtually no downtime.
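To put the 99.99% figure in concrete terms, the following is a minimal sketch that converts an uptime percentage into a downtime budget. The function name and the 30-day and 365-day periods are illustrative assumptions; how uptime is actually measured is defined by the SLA itself, not by this calculation.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float) -> float:
    """Return the downtime budget, in minutes, for an uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Illustrative periods (assumptions, not taken from the SLA text).
monthly = allowed_downtime_minutes(99.99, 30)   # about 4.32 minutes
yearly = allowed_downtime_minutes(99.99, 365)   # about 52.56 minutes
print(f"30 days: {monthly:.2f} min, 365 days: {yearly:.2f} min")
```

Under these assumptions, 99.99% uptime leaves a budget of a little over four minutes in a 30-day month.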

Azure Database for MySQL is suitable for running mission-critical databases that require high uptime. Built on Azure architecture, the service has inherent high availability, redundancy, and resiliency capabilities that mitigate database downtime from planned and unplanned outages, without requiring you to configure any additional components.

Components in Azure Database for MySQL

Component | Description
MySQL Database Server | Azure Database for MySQL provides security, isolation, resource safeguards, and fast restart capability for database servers. These capabilities allow operations such as scaling, and database server recovery after an outage, to complete in seconds. Data modifications in the database server typically occur in the context of a database transaction. All database changes are recorded synchronously in the form of write-ahead logs (ib_log) on Azure Storage, which is attached to the database server. During the database checkpoint process, data pages from the database server memory are also flushed to the storage.
Remote Storage | All MySQL physical data files and log files are stored on Azure Storage, which is architected to store three copies of data within a region to ensure data redundancy, availability, and reliability. The storage layer is also independent of the database server. It can be detached from a failed database server and reattached to a new database server within a few seconds. Azure Storage also continuously monitors for storage faults. If a block corruption is detected, it is automatically fixed by instantiating a new storage copy.
Gateway | The Gateway acts as a database proxy and routes all client connections to the database server.

Planned downtime mitigation

Azure Database for MySQL is architected to provide high availability during planned downtime operations.

Figure: Elastic scaling in Azure Database for MySQL

Here are some planned maintenance scenarios:

Scenario | Description
Compute scale up/down | When the user performs a compute scale up/down operation, a new database server is provisioned using the scaled compute configuration. On the old database server, active checkpoints are allowed to complete, client connections are drained, any uncommitted transactions are canceled, and then the server is shut down. The storage is then detached from the old database server and attached to the new database server. When the client application retries the connection, or tries to make a new connection, the Gateway directs the connection request to the new database server.
Scaling up storage | Scaling up the storage is an online operation and does not interrupt the database server.
New software deployment (Azure) | Rollouts of new features and bug fixes happen automatically as part of the service's planned maintenance. For more information, refer to the documentation, and also check your portal.
Minor version upgrades | Azure Database for MySQL automatically patches database servers to the minor version determined by Azure. This happens as part of the service's planned maintenance. It incurs a short downtime, measured in seconds, and the database server is automatically restarted with the new minor version. For more information, refer to the documentation, and also check your portal.

Unplanned downtime mitigation

Unplanned downtime can occur as a result of unforeseen failures, including underlying hardware faults, networking issues, and software bugs. If the database server goes down unexpectedly, a new database server is automatically provisioned in seconds. The remote storage is automatically attached to the new database server. The MySQL engine performs the recovery operation using the write-ahead logs (WAL) and database files, and then opens the database server to allow clients to connect. Uncommitted transactions are lost and have to be retried by the application. Although unplanned downtime cannot be avoided, Azure Database for MySQL mitigates it by automatically performing recovery operations at both the database server and storage layers, without requiring human intervention.

Figure: High availability in Azure Database for MySQL

Unplanned downtime: failure scenarios and service recovery

Here are some failure scenarios and how Azure Database for MySQL automatically recovers:

Scenario | Automatic recovery
Database server failure | If the database server is down because of an underlying hardware fault, active connections are dropped and any in-flight transactions are aborted. A new database server is automatically deployed, and the remote data storage is attached to the new database server. After the database recovery is complete, clients can connect to the new database server through the Gateway. Applications that use the MySQL databases need to be built so that they detect and retry dropped connections and failed transactions. When the application retries, the Gateway transparently redirects the connection to the newly created database server.
Storage failure | Applications do not see any impact from storage-related issues such as a disk failure or a physical block corruption. Because the data is stored in three copies, a surviving copy serves the data. Block corruptions are automatically corrected. If a copy of the data is lost, a new copy is automatically created.
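The retry guidance for database server failures can be sketched as follows. This is a minimal, driver-agnostic illustration, not the service's or any driver's own mechanism: `run_with_retry`, its parameters, and the use of Python's built-in `ConnectionError` as the "transient" error are all assumptions. A real application would pass the exception types its MySQL driver raises for dropped connections.

```python
import random
import time


def run_with_retry(operation, transient=(ConnectionError,), attempts=5,
                   base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are used up.

    operation should open a fresh connection and run the whole
    transaction, because uncommitted work is lost on failover.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except transient:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

Retrying the whole transaction, rather than the single failed statement, matches the behavior described above: in-flight transactions are aborted, so the application has to replay them from the start on a new connection.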

Here are some failure scenarios that require user action to recover:

Scenario | Recovery plan
Region failure | Failure of a region is a rare event. However, if you need protection from a region failure, you can configure one or more read replicas in other regions for disaster recovery (DR). See the article about creating and managing read replicas for details. In the event of a region-level failure, you can manually promote the read replica configured in the other region to be your production database server.
Logical/user errors | Recovery from user errors, such as accidentally dropped tables or incorrectly updated data, involves performing a point-in-time recovery (PITR), which restores and recovers the data up to the time just before the error occurred. If you want to restore only a subset of databases or specific tables rather than all databases in the database server, you can restore the database server to a new instance, export the tables with mysqldump, and then restore those tables into your database.
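The export-and-restore step for a subset of tables can be sketched as follows. This is only an illustration under stated assumptions: the helper names, host names, and file path are hypothetical, and the restored instance is assumed to already exist. The `mysqldump` and `mysql` options used here are standard client options; `--password` with no value makes the client prompt for it.

```python
import subprocess


def dump_command(host, user, database, tables, out_file):
    """Build a mysqldump command that exports selected tables."""
    return ["mysqldump", f"--host={host}", f"--user={user}", "--password",
            "--single-transaction", f"--result-file={out_file}",
            database, *tables]


def load_command(host, user, database):
    """Build a mysql client command that reads SQL from standard input."""
    return ["mysql", f"--host={host}", f"--user={user}", "--password",
            database]


def copy_tables(restored_host, target_host, user, database, tables,
                out_file="tables.sql"):
    """Export tables from the restored server and load them into the target."""
    subprocess.run(dump_command(restored_host, user, database, tables,
                                out_file), check=True)
    with open(out_file, "rb") as dump:
        subprocess.run(load_command(target_host, user, database),
                       stdin=dump, check=True)
```

Loading a dump replaces the tables it contains on the target, so it is worth reviewing the dump file before importing it into a production database.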

Summary

Azure Database for MySQL provides fast restart capability for database servers, redundant storage, and efficient routing from the Gateway. For additional data protection, you can configure backups to be geo-replicated and deploy one or more read replicas in other regions. With its inherent high availability capabilities, Azure Database for MySQL protects your databases from the most common outages and offers an industry-leading, financially backed 99.99% uptime SLA. Together, these availability and reliability capabilities make Azure an ideal platform for running mission-critical applications.

Next steps