Replace a physical disk in Azure Stack

Applies to: Azure Stack integrated systems and Azure Stack Development Kit

This article describes the general process to replace a physical disk in Azure Stack. If a physical disk fails, you should replace it as soon as possible.

You can use this procedure for integrated systems, and for development kit deployments that have hot-swappable disks.

Actual disk replacement steps will vary based on your original equipment manufacturer (OEM) hardware vendor. See your vendor's field replaceable unit (FRU) documentation for detailed steps that are specific to your system.

Review disk alert information

When a disk fails, you receive an alert that tells you that connectivity has been lost to a physical disk.

Alert showing connectivity lost to physical disk

If you open the alert, the alert description contains the scale unit node and the exact physical slot location for the disk that you must replace. Azure Stack further helps you to identify the failed disk by using LED indicator capabilities.

Replace the disk

Follow your OEM hardware vendor's FRU instructions for actual disk replacement.

Note

Replace disks for one scale unit node at a time. Wait for the virtual disk repair jobs to complete before moving on to the next scale unit node.

To prevent the use of an unsupported disk in an integrated system, the system blocks disks that are not supported by your vendor. If you try to use an unsupported disk, a new alert tells you that a disk has been quarantined because of an unsupported model or firmware.

After you replace the disk, Azure Stack automatically discovers the new disk and starts the virtual disk repair process.
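Because disks should be replaced one scale unit node at a time, it can help to group the drives that need attention by node before you schedule the work. The following is a minimal sketch, not an official tool. `Group-UnhealthyDriveByNode` is a hypothetical helper. It assumes that drive objects expose the `StorageNode` and `HealthStatus` properties that appear in the `Get-AzsDrive` output later in this article, and that a healthy drive reports `Healthy`. The sample data is made up for illustration.

```powershell
# Hypothetical helper (not part of Azure Stack PowerShell).
# Assumption: each input object has StorageNode and HealthStatus properties,
# and a healthy drive reports HealthStatus 'Healthy'.
function Group-UnhealthyDriveByNode {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object]$Drive
    )
    begin { $unhealthy = @() }
    process {
        # Collect every drive whose health status is anything other than 'Healthy'.
        if ("$($Drive.HealthStatus)" -ne 'Healthy') { $unhealthy += $Drive }
    }
    end {
        # One group per node, so each node can be serviced in turn.
        $unhealthy | Group-Object -Property StorageNode | Sort-Object -Property Name
    }
}

# Mock data for illustration only; in practice, pipe the Get-AzsDrive output instead.
$mockDrives = @(
    [pscustomobject]@{ StorageNode = 'node01'; HealthStatus = 'Healthy' }
    [pscustomobject]@{ StorageNode = 'node02'; HealthStatus = 'Warning' }
    [pscustomobject]@{ StorageNode = 'node02'; HealthStatus = 'Unhealthy' }
)
$plan = @($mockDrives | Group-UnhealthyDriveByNode)
$plan | ForEach-Object { '{0}: {1} drive(s) to replace' -f $_.Name, $_.Count }
```

Work through the resulting groups one node at a time, and confirm that the repair jobs for one node have finished before you start on the next.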

Check the status of virtual disk repair using Azure Stack PowerShell

After you replace the disk, you can monitor the virtual disk health status and repair job progress by using Azure Stack PowerShell.

  1. Check that you have Azure Stack PowerShell installed. For more information, see Install PowerShell for Azure Stack.

  2. Connect to Azure Stack with PowerShell as an operator. For more information, see Connect to Azure Stack with PowerShell as an operator.

  3. Run the following cmdlets to verify the virtual disk health and repair status:

    # Get the scale unit and its storage subsystem, then list volume health and repair status.
    $scaleunit = Get-AzsScaleUnit
    $StorageSubSystem = Get-AzsStorageSubSystem -ScaleUnit $scaleunit.Name
    Get-AzsVolume -StorageSubSystem $StorageSubSystem.Name -ScaleUnit $scaleunit.Name | Select-Object VolumeLabel, OperationalStatus, RepairStatus
    

    Azure Stack volumes health

  4. Validate the Azure Stack system state. For instructions, see Validate Azure Stack system state.

  5. Optionally, run the following commands to verify the status of the replaced physical disk:

    $scaleunit = Get-AzsScaleUnit
    $StorageSubSystem = Get-AzsStorageSubSystem -ScaleUnit $scaleunit.Name
    Get-AzsDrive -StorageSubSystem $StorageSubSystem.Name -ScaleUnit $scaleunit.Name | Format-Table StorageNode, HealthStatus, PhysicalLocation, Model, MediaType, CapacityGB, CanPool, CannotPoolReason

Replaced physical disks in Azure Stack
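The volume check in step 3 returns every volume, so a small filter can make it easier to spot the ones that still need attention. The following is a minimal sketch under stated assumptions. `Get-VolumeNeedingAttention` is a hypothetical helper, and it assumes that a fully repaired volume reports an `OperationalStatus` of `OK`. The sample values are illustrative and are not captured from a real system.

```powershell
# Hypothetical helper (not part of Azure Stack PowerShell).
# Assumption: a volume that needs no attention reports OperationalStatus 'OK'.
function Get-VolumeNeedingAttention {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyCollection()]
        [object[]]$Volume
    )
    # Keep only volumes whose operational status is anything other than 'OK'.
    $Volume |
        Where-Object { "$($_.OperationalStatus)" -ne 'OK' } |
        Select-Object VolumeLabel, OperationalStatus, RepairStatus
}

# Mock data for illustration only; in practice, pass the Get-AzsVolume output instead.
$mockVolumes = @(
    [pscustomobject]@{ VolumeLabel = 'Volume_A'; OperationalStatus = 'OK';       RepairStatus = '' }
    [pscustomobject]@{ VolumeLabel = 'Volume_B'; OperationalStatus = 'Degraded'; RepairStatus = '' }
)
$pendingVolumes = @(Get-VolumeNeedingAttention -Volume $mockVolumes)
$pendingVolumes | Format-Table -AutoSize
```

An empty result suggests that no volume is reporting a problem. Confirm with the system state validation in step 4 before you move on to the next scale unit node.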

Check the status of virtual disk repair using the privileged endpoint

After you replace the disk, you can monitor the virtual disk health status and repair job progress by using the privileged endpoint. Follow these steps from any computer that has network connectivity to the privileged endpoint.

  1. Open a Windows PowerShell session and connect to the privileged endpoint.

        $cred = Get-Credential
        Enter-PSSession -ComputerName <IP_address_of_ERCS> `
          -ConfigurationName PrivilegedEndpoint -Credential $cred
    
  2. Run the following command to view virtual disk health:

        Get-VirtualDisk -CimSession s-cluster
    

    PowerShell output of Get-VirtualDisk command

  3. Run the following command to view current storage job status:

        Get-VirtualDisk -CimSession s-cluster | Get-StorageJob
    

    PowerShell output of Get-StorageJob command

  4. Validate the Azure Stack system state. For instructions, see Validate Azure Stack system state.
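The storage job check in step 3 can be reduced to a single yes-or-no answer: are any repair jobs still outstanding? The following is a minimal sketch of that logic. `Test-StorageJobComplete` is a hypothetical helper, and it assumes that job objects expose a `JobState` property that reads `Completed` when a job has finished. The privileged endpoint is a restricted session, so you might not be able to define a function inside it. Treat this as an illustration of the check rather than a script to run there. The sample data is made up.

```powershell
# Hypothetical helper (not part of any Azure Stack or Storage module).
# Assumption: each job object has a JobState property, and a finished job reports 'Completed'.
function Test-StorageJobComplete {
    param(
        [AllowNull()]
        [AllowEmptyCollection()]
        [object[]]$Job
    )
    # Any job that is not 'Completed' counts as outstanding. No jobs at all means nothing is pending.
    $pending = @($Job | Where-Object { $_ -and "$($_.JobState)" -ne 'Completed' })
    return ($pending.Count -eq 0)
}

# Mock data for illustration only; real input would be the Get-StorageJob output.
$mockJobs = @(
    [pscustomobject]@{ Name = 'Repair'; JobState = 'Running';   PercentComplete = 42 }
    [pscustomobject]@{ Name = 'Repair'; JobState = 'Completed'; PercentComplete = 100 }
)
$repairDone = Test-StorageJobComplete -Job $mockJobs
"All storage jobs complete: $repairDone"
```

A result of `True` only means that no listed job is still outstanding. The virtual disk health output from step 2 remains the check that tells you whether the repair succeeded.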

Troubleshoot virtual disk repair using the privileged endpoint

If the virtual disk repair job appears stuck, run the following command to restart the job:

      Get-VirtualDisk -CimSession s-cluster | Repair-VirtualDisk