Install and enable Data Deduplication

Applies to: Windows Server (Semi-Annual Channel), Windows Server 2016

This topic explains how to install Data Deduplication, evaluate workloads for deduplication, and enable Data Deduplication on specific volumes.

Note

If you're planning to run Data Deduplication in a Failover Cluster, every node in the cluster must have the Data Deduplication server role installed.
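For example, the role can be installed on every node in one pass from any management machine. The following is a minimal sketch, assuming the FailoverClusters module is available; "CONTOSO-CLUSTER" is a placeholder for your cluster's name:

```powershell
# Sketch: install the Data Deduplication role on every node of a
# failover cluster. "CONTOSO-CLUSTER" is a placeholder cluster name.
$nodes = Get-ClusterNode -Cluster "CONTOSO-CLUSTER"
foreach ($node in $nodes) {
    Install-WindowsFeature -ComputerName $node.Name -Name FS-Data-Deduplication
}
```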

Install Data Deduplication

Important

KB4025334 contains a roll-up of fixes for Data Deduplication, including important reliability fixes, and we strongly recommend installing it when using Data Deduplication with Windows Server 2016.

Install Data Deduplication by using Server Manager

  1. In the Add Roles and Features Wizard, select Server Roles, and then select Data Deduplication.
  2. Click Next until the Install button is active, and then click Install.

Install Data Deduplication by using PowerShell

To install Data Deduplication, run the following PowerShell command as an administrator:
Install-WindowsFeature -Name FS-Data-Deduplication

To install Data Deduplication in a Nano Server installation:

  1. Create a Nano Server installation with the Storage role installed, as described in Getting Started with Nano Server.
  2. From a server running Windows Server 2016 in any mode other than Nano Server, or from a Windows PC with the Remote Server Administration Tools (RSAT) installed, install Data Deduplication with an explicit reference to the Nano Server instance (replace 'MyNanoServer' with the real name of the Nano Server instance):

    Install-WindowsFeature -ComputerName <MyNanoServer> -Name FS-Data-Deduplication
    


    -- OR --

    Connect remotely to the Nano Server instance with PowerShell remoting, and then install Data Deduplication by using DISM:

    Enter-PSSession -ComputerName MyNanoServer 
    dism /online /enable-feature /featurename:dedup-core /all
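With either installation method, you can confirm afterward that the feature is present. This is a quick check, assuming an ordinary Windows Server PowerShell session:

```powershell
# Verify that the Data Deduplication role is installed
Get-WindowsFeature -Name FS-Data-Deduplication |
    Select-Object Name, InstallState
```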
    

Enable Data Deduplication

Determine which workloads are candidates for Data Deduplication

Data Deduplication can effectively minimize the costs of a server application's data consumption by reducing the amount of disk space consumed by redundant data. Before enabling deduplication, it is important that you understand the characteristics of your workload to ensure that you get the maximum performance out of your storage. There are two classes of workloads to consider:

  • Recommended workloads, which have been proven to have datasets that benefit highly from deduplication and resource consumption patterns that are compatible with Data Deduplication's post-processing model. We recommend that you always enable Data Deduplication on these workloads:
    • General purpose file servers (GPFS) serving shares such as team shares, user home folders, work folders, and software development shares.
    • Virtualized desktop infrastructure (VDI) servers.
    • Virtualized backup applications, such as Microsoft Data Protection Manager (DPM).
  • Workloads that might benefit from deduplication but aren't always good candidates for it. For example, the following workloads could work well with deduplication, but you should evaluate the benefits of deduplication first:
    • General purpose Hyper-V hosts
    • SQL servers
    • Line-of-business (LOB) servers

Evaluate workloads for Data Deduplication

Important

If you are running a recommended workload, you can skip this section and go to Enable Data Deduplication for your workload.

To determine whether a workload works well with deduplication, answer the following questions. If you're unsure about a workload, consider doing a pilot deployment of Data Deduplication on a test dataset for your workload to see how it performs.

  1. Does my workload's dataset have enough duplication to benefit from enabling deduplication?
    Before enabling Data Deduplication for a workload, investigate how much duplication your workload's dataset has by using the Data Deduplication Savings Evaluation tool (DDPEval). After installing Data Deduplication, you can find this tool at C:\Windows\System32\DDPEval.exe. DDPEval can evaluate the potential for optimization against directly connected volumes (including local drives or Cluster Shared Volumes) and mapped or unmapped network shares.
      
    Running DDPEval.exe returns output similar to the following:
     
    Data Deduplication Savings Evaluation Tool
    Copyright 2011-2012 Microsoft Corporation. All Rights Reserved.
      
    Evaluated folder: E:\Test
    Processed files: 34
    Processed files size: 12.03MB
    Optimized files size: 4.02MB
    Space savings: 8.01MB
    Space savings percent: 66
    Optimized files size (no compression): 11.47MB
    Space savings (no compression): 571.53KB
    Space savings percent (no compression): 4
    Files with duplication: 2
    Files excluded by policy: 20
    Files excluded by error: 0
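    Output like the above can be produced by pointing DDPEval at a folder, a volume, or a UNC path. For example (the paths below are placeholders for your own data):

```powershell
# Evaluate potential savings for a local folder and for a network share.
# E:\Test and \\fileserver\share are placeholder paths.
& C:\Windows\System32\DDPEval.exe E:\Test
& C:\Windows\System32\DDPEval.exe \\fileserver\share
```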

  2. What do my workload's I/O patterns to its dataset look like? What performance do I have for my workload?
    Data Deduplication optimizes files as a periodic job rather than when a file is written to disk. As a result, it is important to examine a workload's expected read patterns to the deduplicated volume. Because Data Deduplication moves file content into the Chunk Store and attempts to organize the Chunk Store by file as much as possible, read operations perform best when they are applied to sequential ranges of a file.

    Database-like workloads typically have more random read patterns than sequential read patterns, because databases do not typically guarantee that the database layout will be optimal for all possible queries that may be run. Because sections of the Chunk Store may exist all over the volume, accessing data ranges in the Chunk Store for database queries may introduce additional latency. High-performance workloads are particularly sensitive to this extra latency, but other database-like workloads might not be.

    Note

    These concerns primarily apply to storage workloads on volumes made up of traditional rotational storage media (also known as hard disk drives, or HDDs). All-flash storage infrastructure (also known as solid-state drives, or SSDs) is less affected by random I/O patterns, because one of the properties of flash media is equal access time to all locations on the media. Therefore, deduplication does not introduce the same amount of read latency for a workload's dataset stored on all-flash media as it would on traditional rotational storage media.

  3. What are the resource requirements of my workload on the server?
    Because Data Deduplication uses a post-processing model, it periodically needs sufficient system resources to complete its optimization and other jobs. This means that workloads with idle time, such as in the evening or on weekends, are excellent candidates for deduplication, and workloads that run all day, every day may not be. Workloads that have no idle time may still be good candidates for deduplication if they do not have high resource requirements on the server.
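Deduplication jobs run on a schedule that can be inspected, and additional windows can be added to line up with a workload's idle time. The following is a sketch; verify the New-DedupSchedule parameter names with Get-Help on your system, and treat the "23:00" window as an example only:

```powershell
# Inspect the built-in deduplication job schedule to see when
# optimization and other jobs will compete for resources
Get-DedupSchedule

# Sketch: add a nightly optimization window for a workload that is
# idle in the evenings (start time and duration are examples)
New-DedupSchedule -Name "NightlyOptimization" -Type Optimization `
    -Start (Get-Date "23:00") -DurationHours 6 `
    -Days Monday,Tuesday,Wednesday,Thursday,Friday
```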

Enable Data Deduplication

Before enabling Data Deduplication, you must choose the Usage Type that most closely resembles your workload. There are three Usage Types included with Data Deduplication:

  • Default - tuned specifically for general purpose file servers
  • Hyper-V - tuned specifically for VDI servers
  • Backup - tuned specifically for virtualized backup applications, such as Microsoft DPM

Enable Data Deduplication by using Server Manager

  1. In Server Manager, select File and Storage Services.
  2. In File and Storage Services, select Volumes.
  3. Right-click the desired volume, and then select Configure Data Deduplication.
  4. Select the desired Usage Type from the drop-down box, and then select OK.
  5. If you are running a recommended workload, you're done. For other workloads, see Other considerations.

Note

You can find more information about excluding file extensions or folders and selecting the deduplication schedule, including why you would want to do this, in Configuring Data Deduplication.

Enable Data Deduplication by using PowerShell

  1. In an administrator context, run the following PowerShell command:

    Enable-DedupVolume -Volume <Volume-Path> -UsageType <Selected-Usage-Type>
    
  2. If you are running a recommended workload, you're done. For other workloads, see Other considerations.
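As a concrete sketch, enabling deduplication on a hypothetical D: volume and then checking its state and savings might look like this (the drive letter and Usage Type are examples):

```powershell
# Enable deduplication on the D: volume with the general purpose profile
Enable-DedupVolume -Volume "D:" -UsageType Default

# Check the volume's deduplication state and space savings
Get-DedupVolume -Volume "D:" |
    Select-Object Volume, Enabled, UsageType, SavedSpace
Get-DedupStatus -Volume "D:"
```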

Note

The Data Deduplication PowerShell cmdlets, including Enable-DedupVolume, can be run remotely by appending the -CimSession parameter with a CIM session. This is particularly useful for running the Data Deduplication PowerShell cmdlets remotely against a Nano Server instance. To create a new CIM session, run New-CimSession.
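A minimal sketch of that remote pattern ("MyNanoServer" and "D:" are placeholders):

```powershell
# Create a CIM session to the remote machine, then run the dedup
# cmdlet against it via -CimSession. Placeholders: MyNanoServer, D:
$session = New-CimSession -ComputerName MyNanoServer
Enable-DedupVolume -Volume "D:" -UsageType Default -CimSession $session
Remove-CimSession $session
```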

Other considerations

Important

If you are running a recommended workload, you can skip this section.

Frequently asked questions (FAQ)

I want to run Data Deduplication on the dataset for X workload. Is this supported?
Aside from workloads that are known not to interoperate with Data Deduplication, we fully support the data integrity of Data Deduplication with any workload. Recommended workloads are supported by Microsoft for performance as well. The performance of other workloads depends greatly on what they are doing on your server. You must determine what performance impact Data Deduplication has on your workload, and whether that impact is acceptable for the workload.

What are the volume sizing requirements for deduplicated volumes?
In Windows Server 2012 and Windows Server 2012 R2, volumes had to be carefully sized to ensure that Data Deduplication could keep up with the churn on the volume. This typically meant that the average maximum size of a deduplicated volume for a high-churn workload was 1-2 TB, and the absolute maximum recommended size was 10 TB. In Windows Server 2016, these limitations were removed. For more information, see What's new in Data Deduplication.

Do I need to modify the schedule or other Data Deduplication settings for recommended workloads?
No, the provided Usage Types were created to provide reasonable defaults for recommended workloads.

What are the memory requirements for Data Deduplication?
At a minimum, Data Deduplication should have 300 MB + 50 MB for each TB of logical data. For instance, if you are optimizing a 10 TB volume, you would need a minimum of 800 MB of memory allocated for deduplication (300 MB + 50 MB * 10 = 300 MB + 500 MB = 800 MB). While Data Deduplication can optimize a volume with this low amount of memory, having such constrained resources will slow down Data Deduplication's jobs.

Optimally, Data Deduplication should have 1 GB of memory for every 1 TB of logical data. For instance, if you are optimizing a 10 TB volume, you would optimally need 10 GB of memory allocated for Data Deduplication (1 GB * 10). This ratio ensures the maximum performance for Data Deduplication jobs.
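The two rules of thumb above can be written out as a small helper. This is purely illustrative; the function name is made up, not a built-in cmdlet:

```powershell
# Estimate minimum and optimal memory for deduplicating a volume,
# per the rules above: 300 MB + 50 MB per TB minimum, 1 GB per TB
# optimal. Get-DedupMemoryEstimate is an illustrative name only.
function Get-DedupMemoryEstimate {
    param([double]$LogicalDataTB)
    [pscustomobject]@{
        MinimumMB = 300 + 50 * $LogicalDataTB
        OptimalGB = 1 * $LogicalDataTB
    }
}

# For a 10 TB volume: minimum 800 MB, optimal 10 GB
Get-DedupMemoryEstimate -LogicalDataTB 10
```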

What are the storage requirements for Data Deduplication?
In Windows Server 2016, Data Deduplication can support volume sizes up to 64 TB. For more information, see What's new in Data Deduplication.