Hi,
we wanted to test the optional scenario of extending nested resiliency to cache drives on our S2D cluster, following this MS doc:
https://learn.microsoft.com/en-us/windows-server/storage/storage-spaces/nested-resiliency
Unfortunately, tolerating the loss of a cache drive doesn't work for us.
Our scenario:
We have a 2-node S2D cluster with SSDs and HDDs (Windows Server 2019).
SSDs are used for cache and HDDs for capacity (set up automatically, 1:4 ratio).
The nested resiliency storage tier is configured with the following command:
New-StorageTier -StoragePoolFriendlyName S2D* -FriendlyName NestedMirror -ResiliencySettingName Mirror -MediaType HDD -NumberOfDataCopies 4
Two nested resiliency volumes were created:
New-Volume -FriendlyName csv1 -FileSystem CSVFS_ReFS -StoragePoolFriendlyName "S2D on cluster1" -StorageTierFriendlyNames NestedMirror -StorageTierSize 400GB
New-Volume -FriendlyName csv2 -FileSystem CSVFS_ReFS -StoragePoolFriendlyName "S2D on cluster1" -StorageTierFriendlyNames NestedMirror -StorageTierSize 400GB
We set the cache mode to Read+Write for both SSDs and HDDs according to the MS doc: https://learn.microsoft.com/en-us/windows-server/storage/storage-spaces/understand-the-cache
Lastly, we wanted the (write) cache to be disabled automatically if one node goes down:
Get-StorageSubSystem Cluster* | Set-StorageHealthSetting -Name "System.Storage.NestedResiliency.DisableWriteCacheOnNodeDown.Enabled" -Value "True"
Then we ran a test:
-VM on the cluster is running.
--shut down node 1
-VM still running.
--after 30 min, Get-ClusterS2D shows CacheModeHDD + CacheModeSSD changed automatically to ReadOnly
-VM still running
--Then we tried to remove 1 SSD from node 2.
-VM stopped!
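For reference, the state at each step of the test above can be captured with read-only queries like the following. This is only a sketch of what could be checked, not the exact output we collected; the property selections are an assumption about what is useful to look at, and the wildcard "Cluster*" is the same one used in the command above.

```powershell
# Read-only checks; nothing here changes the cluster configuration.

# Current cache modes as reported by S2D (expected ReadOnly after a node is down)
Get-ClusterStorageSpacesDirect | Select-Object CacheModeHDD, CacheModeSSD

# Confirm the nested resiliency health setting is actually set
Get-StorageSubSystem Cluster* |
    Get-StorageHealthSetting -Name "System.Storage.NestedResiliency.DisableWriteCacheOnNodeDown.Enabled"

# Physical disks: cache drives normally show Usage = Journal
Get-PhysicalDisk | Sort-Object MediaType |
    Format-Table FriendlyName, SerialNumber, MediaType, Usage, OperationalStatus, HealthStatus

# Virtual disks: check the number of data copies and the state after the SSD is pulled
Get-VirtualDisk |
    Format-Table FriendlyName, ResiliencySettingName, NumberOfDataCopies, OperationalStatus, HealthStatus
```

Running these before the node shutdown, after the 30 minutes, and again after pulling the SSD should show whether the volumes go offline or only become degraded.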
Any ideas why our VM stops?
According to the MS doc, the cluster should tolerate the loss of a cache drive.
Thanks