Data deduplication (dedupe for short) is a process in which the system responsible for the deduplication scans the files in one or more specific locations for duplicates and, where duplicates are found, replaces the duplicate data with a reference to the “original” data. In essence it is a data compression technique designed to save space by reducing the amount of data actually stored, and it aims to provide single-instance storage (storing only one copy of the data, no matter how many places it’s located in).
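To make the idea of “replace duplicates with a reference” a little more concrete, here is a minimal Python sketch of single-instance storage at whole-file granularity. It is only an illustration and not how Windows Server implements the feature; the function name `single_instance` and the sample file names are made up for the example.

```python
import hashlib


def single_instance(files):
    """Store each distinct content once and map every name to a reference.

    `files` is a dict of {name: bytes}. Returns (store, references), where
    `store` maps a SHA-256 digest to the content and `references` maps each
    file name to the digest of its content.
    """
    store = {}
    references = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy seen
        references[name] = digest          # every name points at that copy
    return store, references


# Hypothetical sample data: two of the three files have identical content.
files = {
    "reports/q1.docx": b"quarterly report",
    "archive/q1-copy.docx": b"quarterly report",
    "notes.txt": b"meeting notes",
}
store, references = single_instance(files)
print(len(files), "files,", len(store), "stored copies")
# prints: 3 files, 2 stored copies
```

Both copies of the report end up pointing at the same stored content, which is where the space saving comes from.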
How this is achieved depends on the system used: it can be done at block level, file level or other levels, again depending on the system and how it is implemented.
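As a rough illustration of the block-level approach, the Python sketch below splits data into chunks, stores each unique chunk once, and keeps a “recipe” of chunk references per file. It assumes fixed-size chunks purely for simplicity; real systems choose their own chunking scheme, and the names `dedupe_blocks`, `rebuild` and `CHUNK_SIZE` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny so the example is easy to follow


def dedupe_blocks(data, store, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks, store unique ones, return the recipe."""
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # a chunk already in the store is not stored again
        recipe.append(digest)            # the file becomes an ordered list of references
    return recipe


def rebuild(recipe, store):
    """Reassemble the original bytes from a recipe of chunk references."""
    return b"".join(store[digest] for digest in recipe)


store = {}
file_a = b"AAAABBBBCCCC"
file_b = b"AAAABBBBDDDD"  # shares its first two chunks with file_a
recipe_a = dedupe_blocks(file_a, store)
recipe_b = dedupe_blocks(file_b, store)

stored = sum(len(chunk) for chunk in store.values())
print("logical bytes:", len(file_a) + len(file_b), "stored bytes:", stored)
# prints: logical bytes: 24 stored bytes: 16
```

Unlike the file-level approach, two files that are only partly identical still share storage for the chunks they have in common.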
In this article we are going to enable deduplication on a Windows Server 2012 R2 server. Keep in mind that this changes data on disk and could cause data damage or loss, so make sure you have a working backup BEFORE continuing.
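One possible way to gain some confidence in a backup is to restore it to a scratch location and compare checksums against the live data. The Python sketch below does that comparison; it is a simple check and not a substitute for your backup product’s own verification, and the function names and example paths are hypothetical.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its content."""
    digests = {}
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            sha = hashlib.sha256()
            with open(path, "rb") as handle:
                # Read in 1 MiB blocks so large files do not fill memory.
                for block in iter(lambda: handle.read(1024 * 1024), b""):
                    sha.update(block)
            digests[os.path.relpath(path, root)] = sha.hexdigest()
    return digests


def backup_problems(source_root, backup_root):
    """Return relative paths that are missing from, or differ in, the backup."""
    source = tree_digests(source_root)
    backup = tree_digests(backup_root)
    return sorted(path for path, digest in source.items()
                  if backup.get(path) != digest)
```

A call such as `backup_problems(r"D:\Shares", r"E:\RestoreTest")` (both paths invented for the example) returns an empty list when every source file has an identical copy in the restored tree.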
First we need to access the server you are planning to configure deduplication on; I will leave it up to you how you achieve that. Once you have access to the server we can begin.
On the server, open “Server Manager” if it is not already open, then start the “Add Roles and Features” wizard from the “Manage” menu.
If it gives you the default splash page, simply click “Next” (I suggest telling it to skip that page in future using the checkbox). Once we are on the “Installation Type” page we need to select “Role-based or feature-based installation” and click “Next”.
On the “Server Selection” page select the server you want to install the service on (commonly the one you’re using) and click “Next”.
Next up is the “Server Roles” page, which is where the change needs to be made. In the list of checkboxes (titled “Roles”) scroll down until you see “File and Storage Services”, expand it, then expand “File and iSCSI Services”, and further down the list check the “Data Deduplication” checkbox. Click “Next”, accepting any additional features it wants to install.
On the “Features” page simply click “Next”.
On the “Confirmation” page, check that you are installing what is required and click “Install”.
Wait for the installation to finish, then close the wizard. Restart if your server requires it.
Once the install and any associated tasks have completed, re-open “Server Manager” and select “File and Storage Services” in the left-hand column.
This will change “Server Manager” to a three-column layout; in the middle column select “Volumes”.
With the volumes now displayed in the right-hand column, right-click the volume you want to configure deduplication on and select “Configure Data Deduplication”.
This will bring up the “Deduplication Settings” screen for the volume you right-clicked. Unless Data Deduplication has been configured before, the “Data deduplication” setting will be “Disabled”.
As I am configuring this on a file server, I am going to select the “General purpose file server” option and leave the rest as defaults. I am then going to click on the “Set Deduplication Schedule” button.
The “Deduplication Schedule” screen will now open. I suggest checking the “Enable background optimization” checkbox, as this allows the server to optimise data in the background. I also elected to create two schedules that allow more aggressive use of system resources. The first runs after most people have left for the day and stops before the server’s scheduled backup; the second runs all weekend but again stops for backups. Please note that these are SYSTEM settings: they apply to all data deduplication jobs on the system and are not unique to each individual deduplication job.
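When planning windows like these, it can help to write them down and check them against the backup time before entering them in the dialog. The Python sketch below is one way to do that; every day and time in it is a hypothetical example (the actual schedule is whatever suits your environment), and it assumes a nightly backup starting at 23:00.

```python
from datetime import datetime, time

# Hypothetical windows as (days, start, end). Monday is 0 and Sunday is 6.
# Both windows end at 23:00, the assumed start of the nightly backup.
SCHEDULE_WINDOWS = [
    ({0, 1, 2, 3, 4}, time(18, 0), time(23, 0)),  # weekday evenings
    ({5, 6}, time(0, 0), time(23, 0)),            # weekends, minus the backup
]


def in_window(moment, windows=SCHEDULE_WINDOWS):
    """Return True if `moment` falls inside any of the scheduled windows."""
    return any(
        moment.weekday() in days and start <= moment.time() < end
        for days, start, end in windows
    )


print(in_window(datetime(2024, 1, 3, 19, 30)))  # Wednesday evening: True
print(in_window(datetime(2024, 1, 3, 23, 30)))  # during the assumed backup: False
print(in_window(datetime(2024, 1, 6, 10, 0)))   # Saturday morning: True
```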
Click “Apply” on the “Deduplication Schedule” screen, and then “Apply” on the “Deduplication Settings” screen. This will drop you back to the “File and Storage Services > Volumes” screen, and you are now done: Data Deduplication is configured.
Have fun, and don’t forget that backup.