Scale unit node actions in Azure Stack Hub - Modular Data Center (MDC)

This article describes how to view the status of a scale unit and its nodes. You can run node actions such as power on, power off, shut down, drain, resume, and repair. Typically, you use these node actions during field replacement of parts, or to help recover a node.

Important

Run every node action described in this article against only one node at a time.

View the node status

In the administrator portal, you can view the status of a scale unit and its associated nodes.

To view the status of a scale unit:

  1. On the Region management tile, select the region.

  2. On the left, under Infrastructure resources, select Scale units.

  3. In the results, select the scale unit.

  4. On the left, under General, select Nodes.

    View the following information:

    • The list of individual nodes

    • Operational state (see the list below)

    • Power state (Running or Stopped)

    • Server model

    • IP address of the baseboard management controller (BMC)

    • Total number of cores

    • Total amount of memory


Node operational states

Status                 Description
Running                The node is actively participating in the scale unit.
Stopped                The node is unavailable.
Adding                 The node is actively being added to the scale unit.
Repairing              The node is actively being repaired.
Maintenance            The node is paused, and no active user workload is running.
Requires Remediation   An error has been detected that requires the node to be repaired.
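
The same details can usually be read from PowerShell instead of the portal. The following is a minimal sketch, assuming the Azs.Fabric.Admin module is installed and the session is already signed in to the administrator environment. The property names selected here (ScaleUnitNodeStatus and PowerState) are assumptions about the returned object and may differ between module versions.

```powershell
# Placeholder value; replace with your own region name.
$Location = '<RegionName>'

# List every node in the region together with its operational and power state.
# ScaleUnitNodeStatus and PowerState are assumed property names.
Get-AzsScaleUnitNode -Location $Location |
    Select-Object Name, ScaleUnitNodeStatus, PowerState
```

If the selected columns come back empty, pipe one node to Get-Member to see which properties your module version exposes.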

Scale unit node actions

When you view information about a scale unit node, you can also perform node actions like:

  • Start and stop (depending on the current power state).
  • Disable and resume (depending on the operational state).
  • Repair.
  • Shut down.

The operational state of the node determines which options are available.

You need to install the Azure Stack Hub PowerShell modules. The cmdlets for these node actions are in the Azs.Fabric.Admin module. To install or verify your installation of PowerShell for Azure Stack Hub, see Install PowerShell for Azure Stack Hub.
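
As a quick check before you run any node action, you can confirm that the module is present on the machine. This sketch uses only the built-in Get-Module cmdlet. It shows whether the module is installed, not whether the installed version matches your Azure Stack Hub build.

```powershell
# Lists the installed versions of the module, if any.
# No output means the module was not found on this machine.
Get-Module -ListAvailable -Name Azs.Fabric.Admin |
    Select-Object Name, Version
```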

Stop

The stop action turns off the node. It's the same as pressing the power button. It doesn't send a shutdown signal to the operating system. For planned stop operations, always try the shutdown operation first.

This action is typically used when a node is in an unresponsive state.

To run the stop action, open an elevated PowerShell prompt, and run the following cmdlet:

  Stop-AzsScaleUnitNode -Location <RegionName> -Name <NodeName>

In the unlikely case that the stop action doesn't work, retry the operation. If it fails a second time, use the BMC web interface instead.

For more information, see Stop-AzsScaleUnitNode.
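
The retry guidance above can be sketched as a short loop. This is an illustration under stated assumptions, not a prescribed procedure. The variable names are hypothetical, the two-attempt limit mirrors the guidance in this section, and the cmdlet may prompt for confirmation depending on your module version.

```powershell
# Placeholder values; replace with your own region and node names.
$Location = '<RegionName>'
$NodeName = '<NodeName>'

$maxAttempts = 2      # one initial try plus one retry
$stopped     = $false

for ($attempt = 1; ($attempt -le $maxAttempts) -and (-not $stopped); $attempt++) {
    try {
        # -ErrorAction Stop turns a reported failure into a catchable exception.
        Stop-AzsScaleUnitNode -Location $Location -Name $NodeName -ErrorAction Stop
        $stopped = $true
    }
    catch {
        Write-Warning "Stop attempt $attempt failed: $($_.Exception.Message)"
    }
}

if (-not $stopped) {
    Write-Warning 'The stop action failed twice. Use the BMC web interface instead.'
}
```

The same pattern applies to the start action by substituting Start-AzsScaleUnitNode.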

Start

The start action turns on the node. It's the same as pressing the power button.

To run the start action, open an elevated PowerShell prompt, and run the following cmdlet:

  Start-AzsScaleUnitNode -Location <RegionName> -Name <NodeName>

In the unlikely case that the start action doesn't work, retry the operation. If it fails a second time, use the BMC web interface instead.

For more information, see Start-AzsScaleUnitNode.

Drain

The drain action moves all active workloads to the remaining nodes in that particular scale unit.

This action is typically used during field replacement of parts, like the replacement of an entire node.

Important

Drain a node only during a planned maintenance window that users have been notified about. Under some conditions, active workloads can experience interruptions.

To run the drain action, open an elevated PowerShell prompt, and run the following cmdlet:

  Disable-AzsScaleUnitNode -Location <RegionName> -Name <NodeName>

For more information, see Disable-AzsScaleUnitNode.
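
Before you touch the hardware, it can help to read the node back and confirm its state after the drain. The following is a minimal sketch, assuming Get-AzsScaleUnitNode accepts the same -Location and -Name values as the drain cmdlet. The property names ScaleUnitNodeStatus and PowerState are assumptions about the returned object. Based on the state list earlier in this article, a drained node would be expected to report Maintenance, but verify this in your own environment.

```powershell
# Placeholder values; replace with your own region and node names.
$Location = '<RegionName>'
$NodeName = '<NodeName>'

# Drain the node: move its active workloads to the remaining nodes.
Disable-AzsScaleUnitNode -Location $Location -Name $NodeName

# Read the node back and inspect its state (assumed property names).
Get-AzsScaleUnitNode -Location $Location -Name $NodeName |
    Select-Object Name, ScaleUnitNodeStatus, PowerState
```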

Resume

The resume action resumes a disabled node and marks it active for workload placement. Workloads that were previously running on the node don't fail back. (If you drain a node, be sure to power it off. When you power the node back on, it's not marked as active for workload placement. When ready, you must use the resume action to mark the node as active.)

To run the resume action, open an elevated PowerShell prompt, and run the following cmdlet:

  Enable-AzsScaleUnitNode -Location <RegionName> -Name <NodeName>

For more information, see Enable-AzsScaleUnitNode.

Repair

The repair action repairs a node. Use it only in either of the following scenarios:

  • Full node replacement (with or without new data disks).
  • After hardware component failure and replacement (if advised in the field replaceable unit (FRU) documentation).

Important

See your OEM hardware vendor's FRU documentation for exact steps when you need to replace a node or individual hardware components. The FRU documentation specifies whether you need to run the repair action after replacing a hardware component.

When you run the repair action, you need to specify the BMC IP address.

To run the repair action, open an elevated PowerShell prompt, and run the following cmdlet:

  Repair-AzsScaleUnitNode -Location <RegionName> -Name <NodeName> -BMCIPv4Address <BMCIPv4Address>
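
Because the repair action depends on the BMC IP address, a mistyped value is worth catching before the cmdlet runs. The following sketch adds an optional local check with the .NET System.Net.IPAddress class. The variable names are hypothetical, and the check only confirms that the string parses as an IPv4 address. It does not confirm that the address belongs to the node being repaired.

```powershell
# Placeholder values; replace with your own region, node, and BMC address.
$Location   = '<RegionName>'
$NodeName   = '<NodeName>'
$bmcAddress = '<BMCIPv4Address>'

# Parse the string and require the IPv4 address family.
$parsed = $null
$isIPv4 = [System.Net.IPAddress]::TryParse($bmcAddress, [ref]$parsed) -and
          ($parsed.AddressFamily -eq [System.Net.Sockets.AddressFamily]::InterNetwork)

if (-not $isIPv4) {
    throw "'$bmcAddress' is not a valid IPv4 address."
}

Repair-AzsScaleUnitNode -Location $Location -Name $NodeName -BMCIPv4Address $bmcAddress
```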

Shutdown

The shutdown action first moves all active workloads to the remaining nodes in the same scale unit. Then the action gracefully shuts down the scale unit node.

After you start a node that was shut down, you need to run the resume action. Workloads that were previously running on the node don't fail back.

If the shutdown operation fails, attempt the drain operation followed by the shutdown operation.

To run the shutdown action, open an elevated PowerShell prompt, and run the following cmdlet:

  Stop-AzsScaleUnitNode -Location <RegionName> -Name <NodeName> -Shutdown
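
Taken together, the shutdown, start, and resume actions form a typical planned-maintenance sequence. The following is a sketch of that order using only the cmdlets shown in this article. The variable names are hypothetical. Run each step only after the previous one has finished, and only against one node at a time.

```powershell
# Placeholder values; replace with your own region and node names.
$Location = '<RegionName>'
$NodeName = '<NodeName>'

# 1. Move workloads off the node, then shut it down gracefully.
Stop-AzsScaleUnitNode -Location $Location -Name $NodeName -Shutdown

# 2. Perform the planned hardware work while the node is powered off.

# 3. Power the node back on.
Start-AzsScaleUnitNode -Location $Location -Name $NodeName

# 4. Mark the node active for workload placement again.
#    Workloads that previously ran on the node don't fail back.
Enable-AzsScaleUnitNode -Location $Location -Name $NodeName
```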

Next steps

Learn about the Azure Stack Hub Fabric operator module.