Add additional scale unit nodes in Azure Stack Hub
You can increase the overall capacity of an existing scale unit by adding a physical computer, also referred to as a scale unit node. Each new scale unit node must match the nodes already present in the scale unit in CPU type, memory, and disk number and size. Because of architectural limitations, Azure Stack Hub doesn't support removing scale unit nodes to scale down. You can only expand capacity by adding nodes.
To add a scale unit node, sign in to Azure Stack Hub and run tooling from your original equipment manufacturer (OEM) hardware vendor. The OEM tooling runs on the hardware lifecycle host (HLH) to make sure the new physical computer matches the firmware level of the existing nodes.
The following flow diagram shows the general process to add a scale unit node:
Whether your OEM hardware vendor performs the physical server rack placement and updates the firmware depends on your support contract.
The operation to add a new node can take several hours or days to complete. Running workloads on the system aren't affected while a scale unit node is added.
Note
Don't attempt any of the following operations while an add scale unit node operation is already in progress:
- Update Azure Stack Hub
- Rotate certificates
- Stop Azure Stack Hub
- Repair scale unit node
- Add another node (a previous add-node operation that failed is also considered in progress)
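Before starting any of the operations listed above, it can help to confirm that every scale unit and node reports a healthy state. The following is a minimal sketch, not an official pre-check. It assumes an authenticated operator PowerShell session in which the `Get-AzsScaleUnit` and `Get-AzsScaleUnitNode` cmdlets are available, and it assumes that a healthy state is returned as the literal string `Running`. The check is deliberately conservative: it flags any state other than `Running`, including states unrelated to an add-node operation.

```powershell
# Sketch: flag scale units and nodes that aren't in the Running state.
# Assumption: State and ScaleUnitNodeStatus return the literal string 'Running' when healthy.
# Other values may be formatted differently from the display names used in this article.
$busyUnits = @(Get-AzsScaleUnit | Where-Object { $_.State -ne 'Running' })
$busyNodes = @(Get-AzsScaleUnitNode | Where-Object { $_.ScaleUnitNodeStatus -ne 'Running' })

if ($busyUnits.Count -gt 0 -or $busyNodes.Count -gt 0) {
    Write-Warning 'A scale unit or node is not Running. Postpone update, certificate rotation, stop, repair, and add-node operations.'
    $busyUnits | Format-Table Name, State
    $busyNodes | Format-Table Name, ScaleUnitNodeStatus
}
else {
    Write-Output 'All scale units and nodes report Running.'
}
```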
Add scale unit nodes
The following steps are a high-level overview of how to add a node. Don't follow these steps without first referring to your OEM-provided capacity expansion documentation.
- Place the new physical server in the rack and cable it appropriately.
- Enable physical switch ports and adjust access control lists (ACLs) if applicable.
- Configure the correct IP address in the baseboard management controller (BMC) and apply all BIOS settings per your OEM-provided documentation.
- Apply the current firmware baseline to all components by using the hardware manufacturer's tools that run on the HLH.
- Run the add node operation in the Azure Stack Hub administrator portal.
- Validate that the add node operation succeeds. To do so, check the Status of the Scale Unit.
Add the node
You can use the administrator portal or PowerShell to add new nodes. The add node operation first adds the new scale unit node as available compute capacity and then automatically extends the storage capacity. The capacity expands automatically because Azure Stack Hub is a hyperconverged system where compute and storage scale together.
- Sign in to the Azure Stack Hub administrator portal as an Azure Stack Hub operator.
- Navigate to + Create a resource > Capacity > Scale Unit Node.
- On the Add node pane, select the Region, and then select the Scale unit that you want to add the node to. Also specify the BMC IP ADDRESS for the scale unit node you're adding. You can only add one node at a time.
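The same operation can be sketched in PowerShell. This example assumes that the Azure Stack Hub admin PowerShell modules (including `Azs.Fabric.Admin`) are installed and that the session is already signed in as an operator. The node name, BMC IP address, and scale unit name are placeholders. Cmdlet parameters can differ between module versions, so confirm them with `Get-Help Add-AzsScaleUnitNode -Full` before running anything.

```powershell
# Sketch: add one node to an existing scale unit.
# All three values below are placeholders to replace with values from your environment.
$nodeName      = '<name_of_new_node>'
$bmcIpAddress  = '<BMC_IP_address_of_new_node>'
$scaleUnitName = '<name_of_scale_unit>'

# Describe the new node, then submit the add-node operation for that single node.
$newNode = New-AzsScaleUnitNodeObject -ComputerName $nodeName -BMCIPv4Address $bmcIpAddress
Add-AzsScaleUnitNode -NodeList $newNode -ScaleUnit $scaleUnitName
```

As with the portal, submit only one node per operation and wait for it to finish before adding another.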
Monitor add node operations
Use the administrator portal or PowerShell to get the status of the add node operation. Add node operations can take several hours to days to complete.
Use the administrator portal
To monitor the addition of a new node, review the scale unit or scale unit node objects in the administrator portal. To do so, go to Region management > Scale units. Next, select the scale unit or scale unit node you want to review.
Use PowerShell
The status for scale unit and scale unit nodes can be retrieved using PowerShell as follows:
```powershell
# Retrieve the status of the scale unit
Get-AzsScaleUnit | Select-Object Name, State

# Retrieve the status of each scale unit node
Get-AzsScaleUnitNode | Select-Object Name, ScaleUnitNodeStatus
```
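Because an add node operation can run for hours or days, you might prefer to poll the status instead of checking it manually. The following is a minimal sketch under stated assumptions: the polling interval and time limit are arbitrary illustrative values, and the exact strings returned by the `State` property might not match the display names used in this article, so the loop only tests for `Running` and for any value that contains `Remediation`.

```powershell
# Sketch: poll the scale unit state until it returns to Running or a time limit is reached.
$pollIntervalSeconds = 300                  # illustrative value, adjust as needed
$deadline = (Get-Date).AddDays(3)           # illustrative upper bound

while ((Get-Date) -lt $deadline) {
    $units = @(Get-AzsScaleUnit | Select-Object Name, State)
    foreach ($unit in $units) {
        Write-Output ("{0}  {1}: {2}" -f (Get-Date -Format s), $unit.Name, $unit.State)
    }

    $notRunning = @($units | Where-Object { $_.State -ne 'Running' })
    if ($units.Count -gt 0 -and $notRunning.Count -eq 0) {
        Write-Output 'All scale units report Running.'
        break
    }
    if (@($notRunning | Where-Object { "$($_.State)" -match 'Remediation' }).Count -gt 0) {
        Write-Warning 'A scale unit requires remediation. Stop polling and investigate.'
        break
    }
    Start-Sleep -Seconds $pollIntervalSeconds
}
```

A scale unit can still report `Running` for a short time after the operation is submitted, so start polling only after the state has changed at least once.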
Status for the add node operation
For a scale unit:
Status | Description |
---|---|
Running | All nodes are actively participating in the scale unit. |
Stopped | The scale unit node is either down or unreachable. |
Expanding | One or more scale unit nodes are currently being added as compute capacity. |
Configuring Storage | The compute capacity has been expanded and the storage configuration is running. |
Requires Remediation | An error has been detected that requires one or more scale unit nodes to be repaired. |
For a scale unit node:
Status | Description |
---|---|
Running | The node is actively participating in the scale unit. |
Stopped | The node is unavailable. |
Adding | The node is actively being added to the scale unit. |
Repairing | The node is actively being repaired. |
Maintenance | The node is paused, and no active user workload is running. |
Requires Remediation | An error has been detected that requires the node to be repaired. |
Troubleshooting
The following are common issues seen when adding a node.
Scenario 1: The add scale unit node operation fails but one or more nodes are listed with a status of Stopped.
- Remediation: Use the repair operation to repair one or more nodes. Only a single repair operation can run at one time.
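The repair can be started from PowerShell as well as from the portal. This sketch assumes the `Repair-AzsScaleUnitNode` cmdlet from the `Azs.Fabric.Admin` module with the parameters shown. The region name, node name, and BMC IP address are placeholders. Verify the parameters with `Get-Help Repair-AzsScaleUnitNode -Full` for your module version, and follow your OEM documentation before repairing a node.

```powershell
# Sketch: repair a single node that reports Stopped after a failed add-node operation.
# All values are placeholders. Run one repair at a time and wait for it to complete.
$regionName   = '<region_name>'
$nodeName     = '<name_of_stopped_node>'
$bmcIpAddress = '<BMC_IP_address_of_node>'

Repair-AzsScaleUnitNode -Location $regionName -Name $nodeName -BMCIPv4Address $bmcIpAddress
```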
Scenario 2: One or more scale unit nodes have been added but the storage expansion failed. In this scenario, the scale unit node object reports a status of Running but the Configuring Storage task hasn't started.
- Remediation: Use the privileged endpoint to review the storage health by running the following PowerShell cmdlet:
```powershell
Get-VirtualDisk -CimSession s-cluster | Get-StorageJob
```
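If you haven't opened a privileged endpoint session before, the following interactive steps show one way to do it. This is a sketch that assumes you run it from a machine that can reach the privileged endpoint, that the machine already trusts the endpoint address for PowerShell remoting, and that you have CloudAdmin credentials. The IP address is a placeholder.

```powershell
# Sketch: open an interactive session to the privileged endpoint (run these steps by hand).
$pepAddress = '<IP_address_of_privileged_endpoint_VM>'      # placeholder
$credential = Get-Credential -Message 'Enter CloudAdmin credentials'

Enter-PSSession -ComputerName $pepAddress -ConfigurationName PrivilegedEndpoint -Credential $credential

# At the remote prompt, run the storage health command from this article:
#   Get-VirtualDisk -CimSession s-cluster | Get-StorageJob
# When finished, leave the session:
#   Exit-PSSession
```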
Scenario 3: You received an alert that indicates the storage scale-out job failed.
- Remediation: The storage configuration task has failed. Contact support to resolve this problem.