Health Service in Windows Server 2016

Applies to: Windows Server 2016

The Health Service is a new feature in Windows Server 2016 that improves the day-to-day monitoring and operational experience for clusters running Storage Spaces Direct.

Prerequisites

The Health Service is enabled by default with Storage Spaces Direct. No additional action is required to set it up or start it. To learn more about Storage Spaces Direct, see Storage Spaces Direct in Windows Server 2016.

Reports

See Health Service reports.

Faults

See Health Service faults.

Actions

See Health Service actions.

Automation

This section describes the disk lifecycle workflows that the Health Service automates.

Disk Lifecycle

The Health Service automates most stages of the physical disk lifecycle. Assume that the initial state of your deployment is perfect health: all physical disks are working properly.

Retirement

Physical disks are automatically retired when they can no longer be used, and a corresponding Fault is raised. There are several cases:

  • Media Failure: the physical disk has definitively failed or is broken, and must be replaced.

  • Lost Communication: the physical disk has lost connectivity for over 15 consecutive minutes.

  • Unresponsive: the physical disk has exhibited latency of over 5.0 seconds three or more times within an hour.

Note

If connectivity is lost to many physical disks at once, or to an entire node or storage enclosure, the Health Service will not retire these disks, since they are unlikely to be the root problem.

If the retired disk was serving as the cache for many other physical disks, these are automatically reassigned to another cache disk if one is available. No special user action is required.
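The three retirement cases above can be expressed as a small decision function. This is only an illustrative sketch: the thresholds are the ones stated above, but the function name, its parameters, and its return values are hypothetical and do not correspond to any Health Service API. It also does not model the exception for connectivity lost to many disks at once.

```powershell
# Illustrative model only. Get-RetirementReason, its parameters, and its
# return strings are hypothetical; only the thresholds come from the text.
function Get-RetirementReason {
    param(
        [bool]$MediaFailed,               # disk has definitively failed
        [double]$MinutesDisconnected,     # length of the current connectivity loss
        [datetime[]]$SlowIoTimes = @(),   # times at which latency exceeded 5.0 seconds
        [datetime]$Now = (Get-Date)
    )

    if ($MediaFailed) { return 'Media Failure' }
    if ($MinutesDisconnected -gt 15) { return 'Lost Communication' }

    # Count the high-latency events that fall inside the last hour.
    $windowStart = $Now.AddHours(-1)
    $recent = @($SlowIoTimes | Where-Object { $_ -gt $windowStart -and $_ -le $Now })
    if ($recent.Count -ge 3) { return 'Unresponsive' }

    return $null   # none of the cases applies
}
```

For example, a disk with three high-latency events in the past 50 minutes yields 'Unresponsive', while a disk disconnected for exactly 15 minutes does not yet qualify as 'Lost Communication'.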

Restoring resiliency

Once a physical disk has been retired, the Health Service immediately begins copying its data onto the remaining physical disks to restore full resiliency. Once this has completed, the data is completely safe and fault tolerant again.

Note

This immediate restoration requires sufficient available capacity among the remaining physical disks.

Blinking the indicator light

If possible, the Health Service will begin blinking the indicator light on the retired physical disk or its slot. This continues indefinitely, until the retired disk is replaced.

Note

In some cases, the disk may have failed in a way that prevents even its indicator light from functioning, for example, a total loss of power.

Physical replacement

You should replace the retired physical disk when possible. Most often, this consists of a hot-swap; that is, powering off the node or storage enclosure is not required. See the Fault for helpful location and part information.

Verification

When the replacement disk is inserted, it is verified against the Supported Components Document (see the next section).

Pooling

If allowed, the replacement disk is automatically substituted into its predecessor's pool to enter use. At this point, the system is returned to its initial state of perfect health, and the Fault disappears.

Supported Components Document

The Health Service provides an enforcement mechanism to restrict the components used by Storage Spaces Direct to those on a Supported Components Document provided by the administrator or solution vendor. This can be used to prevent mistaken use of unsupported hardware by you or others, which may help with warranty or support contract compliance. This functionality is currently limited to physical disk devices, including SSDs, HDDs, and NVMe drives. The Supported Components Document can restrict on model, manufacturer (optional), and firmware version (optional).

Usage

The Supported Components Document uses an XML-inspired syntax. We recommend using your favorite text editor, such as Visual Studio Code (available for free) or Notepad, to create an XML document that you can save and reuse.

Sections

The document has two independent sections: Disks and Cache.

If the Disks section is provided, only the drives listed are allowed to join pools. Any unlisted drives are prevented from joining pools, which effectively precludes their use in production. If this section is left empty, any drive is allowed to join pools.

If the Cache section is provided, only the drives listed are used for caching. If this section is left empty, Storage Spaces Direct attempts to guess based on media type and bus type. For example, if your deployment uses solid-state drives (SSD) and hard disk drives (HDD), the former are automatically chosen for caching; however, if your deployment is all-flash, you may need to specify here the higher-endurance devices you'd like to use for caching.

Important

The Supported Components Document does not apply retroactively to drives already pooled and in use.
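To make the Disks section rules concrete, the following sketch evaluates a drive against a parsed document. It is a model for reasoning about the rules, not the Health Service's own matching logic: the function name and parameters are hypothetical, and the element names (Components, Disks, Disk, Model, Manufacturer, AllowedFirmware, Version) are taken from the document's example. Case-sensitive comparison is an assumption.

```powershell
# Hypothetical helper that models the Disks-section rules described above.
function Test-DiskAllowed {
    param(
        [xml]$Document,            # parsed Supported Components Document
        [string]$Model,
        [string]$Manufacturer,
        [string]$FirmwareVersion
    )

    $entries = $Document.SelectNodes('/Components/Disks/Disk')
    # Empty or missing Disks section: any drive may join pools.
    if ($entries.Count -eq 0) { return $true }

    foreach ($entry in $entries) {
        # Model must match exactly (case sensitivity is an assumption).
        $modelNode = $entry.SelectSingleNode('Model')
        if ($null -eq $modelNode -or $modelNode.InnerText -cne $Model) { continue }

        # Manufacturer is optional; compare only when the entry specifies it.
        $makerNode = $entry.SelectSingleNode('Manufacturer')
        if ($null -ne $makerNode -and $makerNode.InnerText -cne $Manufacturer) { continue }

        # Firmware is optional; when listed, the version must be one of the allowed values.
        $versions = $entry.SelectNodes('AllowedFirmware/Version')
        if ($versions.Count -gt 0) {
            $allowed = @($versions | ForEach-Object { $_.InnerText })
            if ($allowed -cnotcontains $FirmwareVersion) { continue }
        }
        return $true
    }
    return $false   # section is populated and no entry matched
}
```

Because the document is not applied retroactively, a result of false here says nothing about drives that are already pooled.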

Example

<Components>

  <Disks>
    <Disk>
      <Manufacturer>Contoso</Manufacturer>
      <Model>XYZ9000</Model>
      <AllowedFirmware>
        <Version>2.0</Version>
        <Version>2.1</Version>
        <Version>2.2</Version>
      </AllowedFirmware>
      <TargetFirmware>
        <Version>2.1</Version>
        <BinaryPath>\\path\to\image.bin</BinaryPath>
      </TargetFirmware>
    </Disk>
  </Disks>

  <Cache>
    <Disk>
      <Manufacturer>Fabrikam</Manufacturer>
      <Model>QRSTUV</Model>
    </Disk>
  </Cache>

</Components>

To list multiple drives, simply add additional <Disk> tags within either section.

To inject this XML when deploying Storage Spaces Direct, use the -XML flag:

Enable-ClusterS2D -XML <MyXML>

To set or modify the Supported Components Document once Storage Spaces Direct has been deployed (that is, once the Health Service is already running), use the following PowerShell cmdlets:

$MyXML = Get-Content <\\path\to\file.xml> | Out-String  
Get-StorageSubSystem Cluster* | Set-StorageHealthSetting -Name "System.Storage.SupportedComponents.Document" -Value $MyXML  
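Before applying a document with the commands above, it can help to confirm that the file at least parses as XML. The sketch below is one way to do that; the function name is hypothetical, and the check for a Components root element is an assumption based on the example in this article, not a documented schema.

```powershell
# Hypothetical helper: returns $true when the file parses as XML and its
# root element is <Components>, as in this article's example.
function Test-SupportedComponentsXml {
    param([string]$Path)

    try {
        # The [xml] cast throws a terminating error on malformed input.
        $doc = [xml](Get-Content -Path $Path -Raw -ErrorAction Stop)
    } catch {
        Write-Warning "Not well-formed XML: $($_.Exception.Message)"
        return $false
    }
    return $doc.DocumentElement.Name -eq 'Components'
}
```

A typical use would be to run this check first and apply the document only when it returns true. It catches unbalanced or mistyped tags but does not validate the contents of individual entries.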

Note

The model, manufacturer, and firmware version properties should exactly match the values that you get using the Get-PhysicalDisk cmdlet. These may differ from your "common sense" expectation, depending on your vendor's implementation. For example, rather than "Contoso", the manufacturer may be "CONTOSO-LTD", or it may be blank while the model is "Contoso-XYZ9000".

You can verify using the following PowerShell cmdlet:

Get-PhysicalDisk | Select Model, Manufacturer, FirmwareVersion  
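Since the values must match exactly, one option is to generate the <Disk> entries from the reported values rather than typing them by hand. The following is a minimal sketch under that assumption: ConvertTo-DiskEntry is a hypothetical helper, it emits only the Manufacturer and Model elements shown in the example, and it omits Manufacturer when the reported value is blank.

```powershell
# Hypothetical helper: turns objects with Manufacturer and Model properties
# into <Disk> entries, copying the reported strings verbatim.
function ConvertTo-DiskEntry {
    param([Parameter(ValueFromPipeline)]$Disk)
    process {
        $lines = @('<Disk>')
        if ($Disk.Manufacturer) {
            # Escape XML special characters such as & and <.
            $maker = [System.Security.SecurityElement]::Escape($Disk.Manufacturer)
            $lines += "  <Manufacturer>$maker</Manufacturer>"
        }
        $model = [System.Security.SecurityElement]::Escape($Disk.Model)
        $lines += "  <Model>$model</Model>"
        $lines += '</Disk>'
        $lines -join "`n"
    }
}

# Possible use on a cluster node, one entry per distinct model:
# Get-PhysicalDisk | Sort-Object Manufacturer, Model -Unique | ConvertTo-DiskEntry
```

Review the output before pasting it into the Disks or Cache section, and add firmware restrictions by hand if you need them.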

Settings

See Health Service settings.

See also