Health Service faults

Applies to Windows Server 2016

What are faults

The Health Service constantly monitors your Storage Spaces Direct cluster to detect problems and generate "faults". One new cmdlet displays any current faults, allowing you to easily verify the health of your deployment without looking at every entity or feature in turn. Faults are designed to be precise, easy to understand, and actionable.

Each fault contains five important fields:

  • Severity
  • Description of the problem
  • Recommended next step(s) to address the problem
  • Identifying information for the faulting entity
  • Its physical location (if applicable)

For example, here is a typical fault:

Severity: MINOR                                         
Reason: Connectivity has been lost to the physical disk.                           
Recommendation: Check that the physical disk is working and properly connected.    
Part: Manufacturer Contoso, Model XYZ9000, Serial 123456789                        
Location: Seattle DC, Rack B07, Node 4, Slot 11

Note

The physical location is derived from your fault domain configuration. For more information about fault domains, see Fault Domains in Windows Server 2016. If you do not provide this information, the location field will be less helpful; for example, it may only show the slot number.

Root cause analysis

The Health Service can assess the potential causality among faulting entities to identify and combine faults that are consequences of the same underlying problem. Recognizing these chains of effect makes for less chatty reporting. For example, if a server is down, it is expected that any drives within the server will also be without connectivity. Therefore, only one fault is raised for the root cause, in this case the server.

Usage in PowerShell

To see any current faults in PowerShell, run this cmdlet:

Get-StorageSubSystem Cluster* | Debug-StorageSubSystem  

This returns any faults that affect the overall Storage Spaces Direct cluster. Most often, these faults relate to hardware or configuration. If there are no faults, this cmdlet returns nothing.

Note

In a non-production environment, and at your own risk, you can experiment with this feature by triggering faults yourself, for example by removing one physical disk or shutting down one node. Once the fault has appeared, re-insert the physical disk or restart the node and the fault will disappear again.

You can also view faults that are affecting only specific volumes or file shares with the following cmdlets:

Get-Volume -FileSystemLabel <Label> | Debug-Volume  

Get-FileShare -Name <Name> | Debug-FileShare  

This returns any faults that affect only the specific volume or file share. Most often, these faults relate to capacity planning, data resiliency, or features like Storage Quality-of-Service or Storage Replica.

Usage in .NET and C#

Connect

In order to query the Health Service, you will need to establish a CimSession with the cluster. To do so, you will need some things that are only available in full .NET, meaning you cannot readily do this directly from a web or mobile app. These code samples use C#, the most straightforward choice for this data access layer.

...
using System.Security;
using Microsoft.Management.Infrastructure;
using Microsoft.Management.Infrastructure.Options;

public CimSession Connect(string Domain = "...", string Computer = "...", string Username = "...", string Password = "...")
{
    SecureString PasswordSecureString = new SecureString();
    foreach (char c in Password)
    {
        PasswordSecureString.AppendChar(c);
    }

    CimCredential Credentials = new CimCredential(
        PasswordAuthenticationMechanism.Default, Domain, Username, PasswordSecureString);
    WSManSessionOptions SessionOptions = new WSManSessionOptions();
    SessionOptions.AddDestinationCredentials(Credentials);
    CimSession Session = CimSession.Create(Computer, SessionOptions);
    return Session;
}

The provided Username should be a local Administrator of the target Computer.

It is recommended that you construct the Password SecureString directly from user input in real-time, so their password is never stored in memory in cleartext. This helps mitigate a variety of security concerns. In practice, however, constructing it as above is common for prototyping purposes.
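As a minimal sketch of that recommendation, the helper below appends each typed character straight into a SecureString, so the whole password never exists as an ordinary string. The PasswordPrompt class and its readKey parameter are hypothetical names introduced here for illustration; they are not part of the original sample.

```csharp
using System;
using System.Security;

public static class PasswordPrompt
{
    // Reads key presses one at a time until Enter is pressed.
    // readKey is a hypothetical seam that makes the helper testable;
    // an interactive console app would pass () => Console.ReadKey(true).
    public static SecureString ReadPassword(Func<ConsoleKeyInfo> readKey)
    {
        SecureString password = new SecureString();
        while (true)
        {
            ConsoleKeyInfo key = readKey();
            if (key.Key == ConsoleKey.Enter)
            {
                break;
            }
            if (key.Key == ConsoleKey.Backspace)
            {
                // Drop the last character, if there is one
                if (password.Length > 0)
                {
                    password.RemoveAt(password.Length - 1);
                }
                continue;
            }
            if (!char.IsControl(key.KeyChar))
            {
                password.AppendChar(key.KeyChar);
            }
        }
        return password;
    }
}
```

In a console app this could be called as PasswordPrompt.ReadPassword(() => Console.ReadKey(true)), with the result passed to the CimCredential constructor in place of PasswordSecureString.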

Discover objects

With the CimSession established, you can query Windows Management Instrumentation (WMI) on the cluster.

Before you can get Faults or Metrics, you will need to get instances of several relevant objects. First, the MSFT_StorageSubSystem, which represents Storage Spaces Direct on the cluster. Using that, you can get every MSFT_StorageNode in the cluster, and every MSFT_Volume (the data volumes). Finally, you will also need MSFT_StorageHealth, the Health Service itself.

...
using System.Collections.Generic;
using System.Linq;

CimInstance Cluster;
List<CimInstance> Nodes;
List<CimInstance> Volumes;
CimInstance HealthService;

public void DiscoverObjects(CimSession Session)
{
    // Get MSFT_StorageSubSystem for Storage Spaces Direct
    Cluster = Session.QueryInstances(@"root\microsoft\windows\storage", "WQL", "SELECT * FROM MSFT_StorageSubSystem")
        .First(Instance => (Instance.CimInstanceProperties["FriendlyName"].Value.ToString()).Contains("Cluster"));

    // Get MSFT_StorageNode for each cluster node
    Nodes = Session.EnumerateAssociatedInstances(Cluster.CimSystemProperties.Namespace,
        Cluster, "MSFT_StorageSubSystemToStorageNode", null, "StorageSubSystem", "StorageNode").ToList();

    // Get MSFT_Volumes for each data volume
    Volumes = Session.EnumerateAssociatedInstances(Cluster.CimSystemProperties.Namespace,
        Cluster, "MSFT_StorageSubSystemToVolume", null, "StorageSubSystem", "Volume").ToList();

    // Get MSFT_StorageHealth itself
    HealthService = Session.EnumerateAssociatedInstances(Cluster.CimSystemProperties.Namespace,
        Cluster, "MSFT_StorageSubSystemToStorageHealth", null, "StorageSubSystem", "StorageHealth").First();
}

These are the same objects you get in PowerShell using cmdlets like Get-StorageSubSystem, Get-StorageNode, and Get-Volume.

You can access all the same properties, documented at Storage Management API Classes.

...
using System.Diagnostics;

foreach (CimInstance Node in Nodes)
{
    // For illustration, write each node's Name to the console. You could also write State (up/down), or anything else!
    Debug.WriteLine("Discovered Node " + Node.CimInstanceProperties["Name"].Value.ToString());
}

Query faults

Invoke Diagnose to get any current faults scoped to the target CimInstance, which can be the cluster or any volume.

The complete list of faults available at each scope in Windows Server 2016 is documented below.

public void GetFaults(CimSession Session, CimInstance Target)
{
    // Set Parameters (None)
    CimMethodParametersCollection FaultsParams = new CimMethodParametersCollection();
    // Invoke API
    CimMethodResult Result = Session.InvokeMethod(Target, "Diagnose", FaultsParams);
    IEnumerable<CimInstance> DiagnoseResults = (IEnumerable<CimInstance>)Result.OutParameters["DiagnoseResults"].Value;
    // Unpack
    if (DiagnoseResults != null)
    {
        foreach (CimInstance DiagnoseResult in DiagnoseResults)
        {
            // TODO: Whatever you want!
        }
    }
}

Optional: MyFault class

It may make sense for you to construct and persist your own representation of faults. For example, this MyFault class stores several key properties of faults, including the FaultId, which can be used later to associate update or remove notifications, or to deduplicate in the event that the same fault is detected multiple times, for whatever reason.

public class MyFault {
    public String FaultId { get; set; }
    public String Reason { get; set; }
    public String Severity { get; set; }
    public String Description { get; set; }
    public String Location { get; set; }

    // Constructor
    public MyFault(CimInstance DiagnoseResult)
    {
        CimKeyedCollection<CimProperty> Properties = DiagnoseResult.CimInstanceProperties;
        FaultId     = Properties["FaultId"                  ].Value.ToString();
        Reason      = Properties["Reason"                   ].Value.ToString();
        Severity    = Properties["PerceivedSeverity"        ].Value.ToString();
        Description = Properties["FaultingObjectDescription"].Value.ToString();
        Location    = Properties["FaultingObjectLocation"   ].Value.ToString();
    }
}
List<MyFault> Faults = new List<MyFault>();

foreach (CimInstance DiagnoseResult in DiagnoseResults)
{
    Faults.Add(new MyFault(DiagnoseResult));
}

The complete list of properties in each fault (DiagnoseResult) is documented below.

Fault events

When Faults are created, removed, or updated, the Health Service generates WMI events. These are essential to keeping your application state in sync without frequent polling, and can help with things like determining when to send email alerts. To subscribe to these events, this sample code uses the Observer Design Pattern.

First, subscribe to MSFT_StorageFaultEvent events.

public void ListenForFaultEvents(CimSession Session)
{
    IObservable<CimSubscriptionResult> Events = Session.SubscribeAsync(
        @"root\microsoft\windows\storage", "WQL", "SELECT * FROM MSFT_StorageFaultEvent");
    // Subscribe the Observer
    FaultsObserver<CimSubscriptionResult> Observer = new FaultsObserver<CimSubscriptionResult>();
    // Disposing the returned IDisposable ends the subscription
    IDisposable Disposable = Events.Subscribe(Observer);
}

Next, implement an Observer whose OnNext() method will be invoked whenever a new event is generated.

Each event contains ChangeType indicating whether a fault is being created, removed, or updated, and the relevant FaultId.

In addition, they contain all the properties of the fault itself.

class FaultsObserver<T> : IObserver<T>
{
    public void OnNext(T Event)
    {
        // Cast
        CimSubscriptionResult SubscriptionResult = Event as CimSubscriptionResult;

        if (SubscriptionResult != null)
        {
            // Unpack            
            CimKeyedCollection<CimProperty> Properties = SubscriptionResult.Instance.CimInstanceProperties;
            String ChangeType = Properties["ChangeType"].Value.ToString();
            String FaultId = Properties["FaultId"].Value.ToString();

            // Create
            if (ChangeType == "0")
            {
                MyFault MyNewFault = new MyFault(SubscriptionResult.Instance);
                // TODO: Whatever you want!
            }
            // Remove
            if (ChangeType == "1")
            {
                // TODO: Use FaultId to find and delete whatever representation you have...
            }
            // Update
            if (ChangeType == "2")
            {
                // TODO: Use FaultId to find and modify whatever representation you have...
            }
        }
    }
    public void OnError(Exception e)
    {
        // Handle Exceptions
    }
    public void OnCompleted()
    {
        // Nothing
    }
}

Understand fault lifecycle

Faults are not intended to be marked "seen" or resolved by the user. They are created when the Health Service observes a problem, and they are removed automatically and only when the Health Service can no longer observe the problem. In general, this reflects that the problem has been fixed.

However, in some cases, faults may be rediscovered by the Health Service (e.g. after failover, or due to intermittent connectivity). For this reason, it may make sense to persist your own representation of faults, so you can easily deduplicate. This is especially important if you send email alerts or equivalent.
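One way to deduplicate is to key your own records by FaultId and alert only the first time an ID is seen. The following is a minimal sketch under that assumption; FaultStore and its members are hypothetical names, not part of the Health Service API, and a real application would likely persist the records rather than keep them in memory.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory store keyed by FaultId (illustration only).
public class FaultStore
{
    private readonly Dictionary<string, string> reasonsById =
        new Dictionary<string, string>();

    // Records or refreshes a fault. Returns true only the first time a
    // FaultId is seen, which is when an alert would be worth sending.
    public bool AddOrUpdate(string faultId, string reason)
    {
        bool isNew = !reasonsById.ContainsKey(faultId);
        reasonsById[faultId] = reason;
        return isNew;
    }

    // Forgets a fault once a remove event arrives for it.
    // Returns false if the FaultId was not known.
    public bool Remove(string faultId)
    {
        return reasonsById.Remove(faultId);
    }

    // Number of faults currently tracked.
    public int Count
    {
        get { return reasonsById.Count; }
    }
}
```

With this shape, a rediscovered fault that arrives with a FaultId already in the store updates the existing record and AddOrUpdate returns false, so no second alert is sent.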

Properties of faults

This table presents several key properties of the fault object. For the full schema, inspect the MSFT_StorageDiagnoseResult class in storagewmi.mof.

Property                    Example
FaultId                     {12345-12345-12345-12345-12345}
FaultType                   Microsoft.Health.FaultType.Volume.Capacity
Reason                      "The volume is running out of available space."
PerceivedSeverity           5
FaultingObjectDescription   Contoso XYZ9000 S.N. 123456789
FaultingObjectLocation      Rack A06, RU 25, Slot 11
RecommendedActions          {"Expand the volume.", "Migrate workloads to other volumes."}

FaultId: Unique within the scope of one cluster.

PerceivedSeverity: { 4, 5, 6 } = { "Informational", "Warning", "Error" }, or equivalent colors such as blue, yellow, and red.

FaultingObjectDescription: Part information for hardware, typically blank for software objects.

FaultingObjectLocation: Location information for hardware, typically blank for software objects.

RecommendedActions: List of recommended actions, which are independent and in no particular order. Today, this list is often of length 1.
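For display purposes, the PerceivedSeverity codes described above could be translated roughly as follows. This is a sketch only: SeverityDisplay is a hypothetical helper, and the "Unknown" and "gray" fallbacks for codes outside 4 to 6 are assumptions, not documented behavior.

```csharp
public static class SeverityDisplay
{
    // Maps a PerceivedSeverity code to the name given in the property notes.
    public static string Name(int perceivedSeverity)
    {
        switch (perceivedSeverity)
        {
            case 4: return "Informational";
            case 5: return "Warning";
            case 6: return "Error";
            default: return "Unknown (" + perceivedSeverity + ")"; // assumption
        }
    }

    // Maps a PerceivedSeverity code to the suggested color.
    public static string Color(int perceivedSeverity)
    {
        switch (perceivedSeverity)
        {
            case 4: return "blue";
            case 5: return "yellow";
            case 6: return "red";
            default: return "gray"; // assumption: neutral color for unknown codes
        }
    }
}
```

Because the samples in this article read properties with Value.ToString(), the code would first need converting, for example with int.Parse(Severity).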

Properties of fault events

This table presents several key properties of the fault event. For the full schema, inspect the MSFT_StorageFaultEvent class in storagewmi.mof.

Note the ChangeType, which indicates whether a fault is being created, removed, or updated, and the FaultId. An event also contains all the properties of the affected fault.

Property                    Example
ChangeType                  0
FaultId                     {12345-12345-12345-12345-12345}
FaultType                   Microsoft.Health.FaultType.Volume.Capacity
Reason                      "The volume is running out of available space."
PerceivedSeverity           5
FaultingObjectDescription   Contoso XYZ9000 S.N. 123456789
FaultingObjectLocation      Rack A06, RU 25, Slot 11
RecommendedActions          {"Expand the volume.", "Migrate workloads to other volumes."}

ChangeType: { 0, 1, 2 } = { "Create", "Remove", "Update" }.
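Rather than comparing ChangeType against the strings "0", "1", and "2", an observer could parse it into an enum once. The sketch below assumes the value arrives as the string produced by Value.ToString(); FaultChangeType and FaultChangeTypeParser are hypothetical names introduced for illustration.

```csharp
using System;

// Mirrors the documented ChangeType values.
public enum FaultChangeType
{
    Create = 0,
    Remove = 1,
    Update = 2
}

public static class FaultChangeTypeParser
{
    // Returns true and sets changeType when raw is "0", "1", or "2";
    // returns false for anything else, including null.
    public static bool TryParse(string raw, out FaultChangeType changeType)
    {
        int code;
        if (int.TryParse(raw, out code) && Enum.IsDefined(typeof(FaultChangeType), code))
        {
            changeType = (FaultChangeType)code;
            return true;
        }
        changeType = default(FaultChangeType);
        return false;
    }
}
```

A switch over the parsed FaultChangeType then replaces the chain of string comparisons, and unexpected values can be logged instead of silently ignored.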

Coverage

In Windows Server 2016, the Health Service provides the following fault coverage:

PhysicalDisk (8)

FaultType: Microsoft.Health.FaultType.PhysicalDisk.FailedMedia

  • Severity: Warning
  • Reason: "The physical disk has failed."
  • RecommendedAction: "Replace the physical disk."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.LostCommunication

  • Severity: Warning
  • Reason: "Connectivity has been lost to the physical disk."
  • RecommendedAction: "Check that the physical disk is working and properly connected."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.Unresponsive

  • Severity: Warning
  • Reason: "The physical disk is exhibiting recurring unresponsiveness."
  • RecommendedAction: "Replace the physical disk."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.PredictiveFailure

  • Severity: Warning
  • Reason: "A failure of the physical disk is predicted to occur soon."
  • RecommendedAction: "Replace the physical disk."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.UnsupportedHardware

  • Severity: Warning
  • Reason: "The physical disk is quarantined because it is not supported by your solution vendor."
  • RecommendedAction: "Replace the physical disk with supported hardware."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.UnsupportedFirmware

  • Severity: Warning
  • Reason: "The physical disk is in quarantine because its firmware version is not supported by your solution vendor."
  • RecommendedAction: "Update the firmware on the physical disk to the target version."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.UnrecognizedMetadata

  • Severity: Warning
  • Reason: "The physical disk has unrecognised meta data."
  • RecommendedAction: "This disk may contain data from an unknown storage pool. First make sure there is no useful data on this disk, then reset the disk."

FaultType: Microsoft.Health.FaultType.PhysicalDisk.FailedFirmwareUpdate

  • Severity: Warning
  • Reason: "Failed attempt to update firmware on the physical disk."
  • RecommendedAction: "Try using a different firmware binary."

Virtual Disk (2)

FaultType: Microsoft.Health.FaultType.VirtualDisks.NeedsRepair

  • Severity: Informational
  • Reason: "Some data on this volume is not fully resilient. It remains accessible."
  • RecommendedAction: "Restoring resiliency of the data."

FaultType: Microsoft.Health.FaultType.VirtualDisks.Detached

  • Severity: Critical
  • Reason: "The volume is inaccessible. Some data may be lost."
  • RecommendedAction: "Check the physical and/or network connectivity of all storage devices. You may need to restore from backup."

Pool Capacity (1)

FaultType: Microsoft.Health.FaultType.StoragePool.InsufficientReserveCapacityFault

  • Severity: Warning
  • Reason: "The storage pool does not have the minimum recommended reserve capacity. This may limit your ability to restore data resiliency in the event of drive failure(s)."
  • RecommendedAction: "Add additional capacity to the storage pool, or free up capacity. The minimum recommended reserve varies by deployment, but is approximately 2 drives' worth of capacity."

Volume Capacity (2) [1]

FaultType: Microsoft.Health.FaultType.Volume.Capacity

  • Severity: Warning
  • Reason: "The volume is running out of available space."
  • RecommendedAction: "Expand the volume or migrate workloads to other volumes."

FaultType: Microsoft.Health.FaultType.Volume.Capacity

  • Severity: Critical
  • Reason: "The volume is running out of available space."
  • RecommendedAction: "Expand the volume or migrate workloads to other volumes."

Server (3)

FaultType: Microsoft.Health.FaultType.Server.Down

  • Severity: Critical
  • Reason: "The server cannot be reached."
  • RecommendedAction: "Start or replace server."

FaultType: Microsoft.Health.FaultType.Server.Isolated

  • Severity: Critical
  • Reason: "The server is isolated from the cluster due to connectivity issues."
  • RecommendedAction: "If isolation persists, check the network(s) or migrate workloads to other nodes."

FaultType: Microsoft.Health.FaultType.Server.Quarantined

  • Severity: Critical
  • Reason: "The server is quarantined by the cluster due to recurring failures."
  • RecommendedAction: "Replace the server or fix the network."

Cluster (1)

FaultType: Microsoft.Health.FaultType.ClusterQuorumWitness.Error

  • Severity: Critical
  • Reason: "The cluster is one server failure away from going down."
  • RecommendedAction: "Check the witness resource, and restart as needed. Start or replace failed servers."

Network Adapter/Interface (4)

FaultType: Microsoft.Health.FaultType.NetworkAdapter.Disconnected

  • Severity: Warning
  • Reason: "The network interface has become disconnected."
  • RecommendedAction: "Reconnect the network cable."

FaultType: Microsoft.Health.FaultType.NetworkInterface.Missing

  • Severity: Warning
  • Reason: "The server {server} has missing network adapter(s) connected to cluster network {cluster network}."
  • RecommendedAction: "Connect the server to the missing cluster network."

FaultType: Microsoft.Health.FaultType.NetworkAdapter.Hardware

  • Severity: Warning
  • Reason: "The network interface has had a hardware failure."
  • RecommendedAction: "Replace the network interface adapter."

FaultType: Microsoft.Health.FaultType.NetworkAdapter.Disabled

  • Severity: Warning
  • Reason: "The network interface {network interface} is not enabled and is not being used."
  • RecommendedAction: "Enable the the network interface."

Enclosure (6)

FaultType: Microsoft.Health.FaultType.StorageEnclosure.LostCommunication

  • Severity: Warning
  • Reason: "Communication has been lost to the storage enclosure."
  • RecommendedAction: "Start or replace the storage enclosure."

FaultType: Microsoft.Health.FaultType.StorageEnclosure.FanError

  • Severity: Warning
  • Reason: "The fan at position {position} of the storage enclosure has failed."
  • RecommendedAction: "Replace the fan in the storage enclosure."

FaultType: Microsoft.Health.FaultType.StorageEnclosure.CurrentSensorError

  • Severity: Warning
  • Reason: "The current sensor at position {position} of the storage enclosure has failed."
  • RecommendedAction: "Replace a current sensor in the storage enclosure."

FaultType: Microsoft.Health.FaultType.StorageEnclosure.VoltageSensorError

  • Severity: Warning
  • Reason: "The voltage sensor at position {position} of the storage enclosure has failed."
  • RecommendedAction: "Replace a voltage sensor in the storage enclosure."

FaultType: Microsoft.Health.FaultType.StorageEnclosure.IoControllerError

  • Severity: Warning
  • Reason: "The IO controller at position {position} of the storage enclosure has failed."
  • RecommendedAction: "Replace an IO controller in the storage enclosure."

FaultType: Microsoft.Health.FaultType.StorageEnclosure.TemperatureSensorError

  • Severity: Warning
  • Reason: "The temperature sensor at position {position} of the storage enclosure has failed."
  • RecommendedAction: "Replace a temperature sensor in the storage enclosure."

Firmware Rollout (3)

FaultType: Microsoft.Health.FaultType.FaultDomain.FailedMaintenanceMode

  • Severity: Warning
  • Reason: "Currently unable to make progress while performing firmware roll out."
  • RecommendedAction: "Verify all storage spaces are healthy, and that no fault domain is currently in maintenance mode."

FaultType: Microsoft.Health.FaultType.FaultDomain.FirmwareVerifyVersionFaile

  • Severity: Warning
  • Reason: "Firmware roll out was cancelled due to unreadable or unexpected firmware version information after applying a firmware update."
  • RecommendedAction: "Restart firmware roll out once the firmware issue has been resolved."

FaultType: Microsoft.Health.FaultType.FaultDomain.TooManyFailedUpdates

  • Severity: Warning
  • Reason: "Firmware roll out was cancelled due to too many physical disks failing a firmware update attempt."
  • RecommendedAction: "Restart firmware roll out once the firmware issue has been resolved."

Storage QoS (3) [2]

FaultType: Microsoft.Health.FaultType.StorQos.InsufficientThroughput

  • Severity: Warning
  • Reason: "Storage throughput is insufficient to satisfy reserves."
  • RecommendedAction: "Reconfigure Storage QoS policies."

FaultType: Microsoft.Health.FaultType.StorQos.LostCommunication

  • Severity: Warning
  • Reason: "The Storage QoS policy manager has lost communication with the volume."
  • RecommendedAction: "Please reboot nodes {nodes}"

FaultType: Microsoft.Health.FaultType.StorQos.MisconfiguredFlow

  • Severity: Warning
  • Reason: "One or more storage consumers (usually Virtual Machines) are using a non-existent policy with id {id}."
  • RecommendedAction: "Recreate any missing Storage QoS policies."

[1] Indicates the volume has reached 80% full (minor severity) or 90% full (major severity).
[2] Indicates some .vhd(s) on the volume have not met their Minimum IOPS for over 10% (minor), 30% (major), or 50% (critical) of a rolling 24-hour window.
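To make the volume capacity thresholds in the first footnote concrete, the sketch below classifies a volume by how full it is. It only illustrates the 80% and 90% figures, for example for your own dashboards; it does not claim to reproduce how the Health Service evaluates capacity, and VolumeCapacityCheck is a hypothetical name.

```csharp
using System;

public static class VolumeCapacityCheck
{
    // Returns "major" at 90% full or more, "minor" at 80% full or more,
    // and null when the volume is below both thresholds.
    public static string Classify(ulong sizeBytes, ulong usedBytes)
    {
        if (sizeBytes == 0)
        {
            throw new ArgumentException("Volume size must be greater than zero.", "sizeBytes");
        }

        double fractionUsed = (double)usedBytes / sizeBytes;
        if (fractionUsed >= 0.9)
        {
            return "major";
        }
        if (fractionUsed >= 0.8)
        {
            return "minor";
        }
        return null;
    }
}
```

For a 1,000-byte volume, 850 bytes used falls in the "minor" band and 950 bytes used falls in the "major" band.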

Note

The health of storage enclosure components such as fans, power supplies, and sensors is derived from SCSI Enclosure Services (SES). If your vendor does not provide this information, the Health Service cannot display it.

See also