From the MVPs: Windows Server 2012’s Data Deduplication feature

Here’s the 18th post in our series of guest posts by Microsoft Most Valuable Professionals (MVPs). Since the early 1990s, Microsoft has recognized technology champions around the world with the MVP Award. MVPs freely share their knowledge, real-world experience, and impartial and objective feedback to help people enhance the way they use technology. Of the millions of individuals who participate in technology communities, around 4,000 are recognized as Microsoft MVPs. You can read more original MVP-authored content on the Microsoft MVP Award Program Blog.

This post is by Paul Clement, a Microsoft Directory Services MVP.

Paul here! For as long as there have been file servers running in our organizations, there has been a need to control data sprawl to conserve expensive storage space. As disks have grown larger and cheaper, this issue has moved from critical to more of an annoyance for IT staff to manage. Larger disks meant more space to save data and less urgency to deal with duplicate files.

Solutions for what is known as “deduplication” have existed for many years, in both software and hardware; however, they were expensive and not always as simple as they claimed to be.

With the newly minted Windows Server 2012, one of the many under-the-hood improvements and additions is a service called Data Deduplication. Finally, a free tool built into the operating system lets us realize some pretty significant storage savings without turning it into a capital project.

Some of you may have read my previous blog post explaining how to install and configure this feature in the Beta version of Server 8; however, the process has changed in the release version.

As before, the service cannot be configured on the System or Boot volumes, remotely mapped drives, or removable media, but any other directly connected volume is a good target. If you are running a file server and want to use this feature, you will need to configure a separate drive, SAN LUN, NAS or iSCSI target as your data repository and enable Data Deduplication on it.

All new features carry inherent risk. While Data Deduplication should not cause any major problems, there are some standard practices you should follow before turning this feature loose on your data repositories. Start with a small volume of non-critical data and make sure you have a working backup of it. Test the feature on this volume thoroughly before enabling it on larger, more critical data stores.
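The volume restrictions mentioned above can be written down as a simple pre-flight checklist. The sketch below is only an illustration in Python: the `Volume` class, its field names, and `dedup_blockers` are hypothetical and do not query Windows in any way. The NTFS check reflects the formatting requirement that comes up later in the walkthrough.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Volume:
    """Hypothetical description of a volume; all field names are illustrative."""
    letter: str
    is_system: bool = False
    is_boot: bool = False
    is_removable: bool = False
    is_remote_mapped: bool = False
    file_system: str = "NTFS"


def dedup_blockers(vol: Volume) -> List[str]:
    """Return the reasons a volume is not a deduplication target (empty if none)."""
    reasons = []
    if vol.is_system:
        reasons.append("system volume")
    if vol.is_boot:
        reasons.append("boot volume")
    if vol.is_removable:
        reasons.append("removable media")
    if vol.is_remote_mapped:
        reasons.append("remotely mapped drive")
    if vol.file_system.upper() != "NTFS":
        reasons.append("not formatted NTFS")
    return reasons


if __name__ == "__main__":
    # Made-up inventory, purely for demonstration.
    volumes = [
        Volume("C", is_system=True, is_boot=True),
        Volume("E"),
        Volume("Z", is_remote_mapped=True),
    ]
    for vol in volumes:
        reasons = dedup_blockers(vol)
        verdict = "candidate" if not reasons else "skip (" + ", ".join(reasons) + ")"
        print(f"{vol.letter}: {verdict}")
```

Returning the list of reasons, not just a yes/no answer, makes it easier to see why a given volume was ruled out when you are reviewing a server with many drives.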

Below are the instructions on how to install the service and configure it on one or more volumes in the release version of Windows Server 2012.

1) From within Server Manager, ensure the focus is on the Dashboard entry in the left navigation pane. Select Add Roles and Features from the Manage menu item or directly from the Quick Start pane’s hot link.


2) At the Add Roles and Features Wizard, select Next.


3) On the Installation Type screen, ensure Role-based or Feature-based installation is selected. Press Next.


4) On the Server Selection screen, if more than one server is listed, select the server you are adding the feature to, then select Next.


5) On the Select Server Roles screen, expand File and Storage Services (Installed) by clicking the small arrow. Expand File and iSCSI Services and then select the Data Deduplication checkbox.


6) As soon as you select the checkbox, a prompt appears asking you to add the required services or features. Select the Add Features button.


7) Once the required services or features are added, you will be returned to the Server Roles screen with the options now selected. Press Next.


8) On the Features screen, select Next.


9) On the Confirmation screen you have the option of selecting the checkbox to Restart the destination server automatically if required. If you are adding roles that you know may require a restart, you can select this box. Use caution if the server is remote since there are no prompts and you will get disconnected. In our case, we don’t need a restart. Press Install.


10) During the installation, a progress screen is displayed.


11) On the results screen, select Close. You will be returned to Server Manager.


12) The service is now installed but not yet configured. To configure it on a Volume, select File and Storage Services from the left navigation pane.


13) From the File and Storage Services screen, select Volumes in the left navigation pane and right-click the drive you intend to enable Data Deduplication on. If it’s a newly added drive, it must first have a volume created on it and then be formatted NTFS. From the right-click context menu, select Configure Data Deduplication.


14) From the Deduplication Settings screen, select Enable Data Deduplication.


15) The Deduplication Settings screen from Step 14 also lets you exclude certain file types from the service and offers the option to Set Deduplication Schedule. If you select the schedule option, the schedule screen opens.


As you can see, you can enable background optimization (which should be done if this server is doing more than just running this service) and schedule when it can be allowed to run in normal priority and consume whatever resources it needs. This is a nice feature since it will run in the background as a low priority service during times when the server is being used, but it can be ramped up during periods of quiet time, like overnight or during weekends.

16) When you are finished setting up Deduplication on your Volume, you can see the results in the File and Storage Services > Volumes screen. Notice that the Deduplication Rate and Deduplication Savings columns now show values for the volume.
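To get a feel for what a rate and a savings figure mean, here is a minimal Python sketch. It is a toy model, not how Windows Server does it: it splits files into fixed 4 KB chunks (an assumed size chosen for illustration), stores each distinct chunk once, and assumes the rate is the space saved as a percentage of the original data size. The function name `dedup_stats` and the sample file names are made up.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size; an assumption, not what Windows uses


def dedup_stats(files, chunk_size=CHUNK_SIZE):
    """Return (logical_bytes, stored_bytes, savings_bytes, rate_percent).

    `files` maps a file name to its contents as bytes.
    """
    unique_chunks = {}  # SHA-256 digest -> length of the chunk it identifies
    logical = 0
    for data in files.values():
        logical += len(data)
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            # Each distinct chunk is counted once, however often it appears.
            unique_chunks.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique_chunks.values())
    savings = logical - stored
    rate = 100.0 * savings / logical if logical else 0.0
    return logical, stored, savings, rate


if __name__ == "__main__":
    report = b"A" * 8192 + b"B" * 4096
    files = {
        "report.doc": report,
        "copy of report.doc": report,  # exact duplicate of the first file
        "notes.txt": b"C" * 4096,
    }
    logical, stored, savings, rate = dedup_stats(files)
    print(f"logical={logical} stored={stored} savings={savings} rate={rate:.1f}%")
```

Note that the duplicate file costs nothing extra, and the repeated block inside the report is stored only once too, which is why chunk-level deduplication can save more than simply removing duplicate files.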


You have now configured Deduplication Services on your Volume of choice. Be certain you have a solid backup of the data before turning the service loose on your volume—just in case!

For further reading on Data Deduplication, see the following references: