Oh Shit!!! The Data is Gone…. Or Is It?

Yep, I screwed up. I made assumptions, didn’t double and triple check things, and made a mess of something I was working on, professionally no less. I did fix it, but it was a stupid screw-up nonetheless. The irony of how regularly I harp on about backups to everyone is not lost on me.

The other day I ended up with one of those Oh!! SHIT moments. I was migrating an older 2012 R2 file server to 2016, and whilst I was doing this I decided to kick the old server that was due to return to the leasing company out of the Failover Cluster it was in. As standard I paused the node, removed all the references from it, and then hit the evict function on the node to evict it from the cluster. It was from this that it all went to shit; doing this whilst doing migrations was the first part of my mistake. What happened was that the Failover Cluster borked itself and crashed on the remaining servers, and it would not restart. To this day I have not got it to restart.

After spending an hour or so trying to get the cluster to restart, I relented and went to the backups to restore the offending server. Hitting the backups, I go to the server I want to restore and notice that it’s only 16GB. WTF!!!! The server should be several TB in size (it is a file server after all).

Upon further investigation it seems that I had been misreading the backup reports. The old server has the same name on the old Hyper-V cluster as the new server does on the new implementation, and it was the new one that was getting backed up, not the old one. I misread the report and assumed that it was backing up the old server, mistake number 0 (this had been happening for the 6 weeks before the backup failure), and the old restore points, being older than our retention limit, were gone. OK, I will hit the long term off-site backups; it might take a while but the data is safe. Well, it was not, or so it seems: the other technician at the offsite location had removed the offsite backups for the file server from the primary site. Why? Because they were taking up too much space on that site’s primary backup disk (the storage at each site is partitioned to provide onsite backup for that site, with the second partition being the offsite backup for the other site).

Damn, so this copy of the data is the only one.

OK, so I killed the cluster server that everything was on, and using the old evicted node I rebuilt a single node “cluster”, mounted the CSV, mounted the VHDX, and everything appeared as it should. Whoo hoo, access to the data! Well, not so fast there buddy.

After moving some data an error popped up stating that the data was inaccessible. OK, no problem, the loss of a single file is not a real issue. Then it popped up again, then again... the second Oh! Shit! moment within several hours.

2017-02-02 - Dedupe Error

I recovered and moved the data I could access, leaving me purely with the data I couldn’t. I tried chkdsk and other tools, and after several hours I took a break from it, needing to clear my mind.

Coming back to it later I looked at the error, looked at what was happening, and recalled seeing an article on another blog about Data Deduplication corrupting files on Server 2016. With this I began wondering if it had affected Server 2012 R2, and then lightning struck: deduplication. This process leaves redirects in place of the files and essentially keeps a database of the data that those redirects link to. The server the VHDX was mounted on did not know about the deduplication, the database, or how to access it.
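
To illustrate why the files looked fine but would not open, here is a minimal Python sketch of the idea. It is a toy model and not how Windows Server actually implements Data Deduplication: the ChunkStore class, the optimise and read_file functions, and the 4-byte chunk size are all made up for illustration. The only point is that an optimised file is a list of references, and a host that cannot resolve those references cannot read the data.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny chunks so the example is easy to follow


class ChunkStore:
    """Toy stand-in for the deduplication "database" of stored chunks."""

    def __init__(self):
        self.chunks = {}  # chunk digest -> chunk bytes

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)  # identical chunks are stored once
        return digest

    def get(self, digest):
        return self.chunks[digest]


def optimise(data, store):
    """Replace the file contents with a list of chunk references (the "redirect")."""
    return [store.put(data[i:i + CHUNK_SIZE]) for i in range(0, len(data), CHUNK_SIZE)]


def read_file(stub, store=None):
    """Rebuild the file from its references; fails if the host has no way to resolve them."""
    if store is None:
        raise OSError("data is inaccessible: this host cannot resolve the references")
    return b"".join(store.get(digest) for digest in stub)


store = ChunkStore()
stub = optimise(b"AAAABBBBAAAA", store)

# A host that understands the deduplication gets the original bytes back.
restored = read_file(stub, store)

# A host that only sees the stub cannot read anything.
try:
    read_file(stub)
    readable_without_store = True
except OSError:
    readable_without_store = False
```

In this sketch the three chunks of the file collapse to two stored chunks, and reading only works when the reader knows about the store, which mirrors what I was seeing with the mounted VHDX.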

Up until now I had only mounted the data VHDX. Now I rebuilt the server utilising the original Operating System VHDX to run the server. I let it install the new devices and boot.

Upon the server booting I opened a file I could not access before, and it instantly popped onto my screen. Problem solved.

Note to remember: if you are doing data recovery or trying to copy data from a VHDX (or other disk, virtual or physical) that was part of a deduplicated file server, you need to do it from that server, because of the deduplication database. You may be able to import the database to another server; I really have no idea, and I am not going to try to find out.

Enabling Data Deduplication on Server 2012 R2


Data deduplication (or dedupe for short) is a process by which the system responsible for the deduplication scans the files in one or more specific locations for duplicates, and where duplicates are found it replaces all the duplicate data with a reference to the “original” data. This in essence is a data compression technique designed to save space by reducing the data actually stored, as well as aiming to provide single-instance data storage (storing only one copy of the data, no matter how many places it is located in).

How this is achieved depends on the system used: it can be done at block level, file level or other levels, again depending on the system and how it is implemented.
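
As a rough illustration of the difference between file level and block level deduplication, here is a small Python sketch. Everything in it is an assumption made for the example: the file names, the contents, the function names and the fixed 4-byte block size (real systems typically use much larger blocks, and often variable-size ones). It simply counts how many bytes would need to be stored under each approach.

```python
import hashlib


def stored_bytes_file_level(files):
    """Keep one copy of each distinct whole file."""
    unique = {hashlib.sha256(data).digest(): len(data) for data in files.values()}
    return sum(unique.values())


def stored_bytes_block_level(files, block_size=4):
    """Keep one copy of each distinct fixed-size block across all files."""
    unique = {}
    for data in files.values():
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            unique[hashlib.sha256(block).digest()] = len(block)
    return sum(unique.values())


# Hypothetical example data: an exact copy and a slightly edited version.
files = {
    "report_v1.txt": b"HEADbodyTAIL",
    "report_copy.txt": b"HEADbodyTAIL",
    "report_v2.txt": b"HEADbodyEND!",
}

logical = sum(len(data) for data in files.values())  # what the users "see"
file_level = stored_bytes_file_level(files)          # exact copies collapse
block_level = stored_bytes_block_level(files)        # shared blocks collapse too

print(logical, file_level, block_level)  # prints "36 24 16"
```

File level deduplication only helps with exact copies, while block level also catches the parts that near-identical files have in common, which is why it tends to save more on a typical file server.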

In this article we are going to enable deduplication on a Windows Server 2012 R2 server. Keep in mind that this changes data and could quite possibly cause data damage or loss, so make sure you have a working backup BEFORE continuing.

Firstly we need to access the server that you are planning to configure deduplication on; I will leave it up to you how you achieve that. Once you have access to the server we can begin.

On the server open “Server Manager” if it is not already open, and start the “Add Roles and Features” wizard from it.

2014-09-19-01-ServerManager

If it gives you the default splash page, simply click “Next” (and I suggest telling it to skip that page in future by use of the checkbox). Once we are on the “Installation Type” page we need to select “Role-based or feature-based installation” and click “Next”.

2014-09-19-02-AddRoleorFeature

In the “Server Selection” page select the server you want to install the service on (commonly the one you’re using) and click “Next”.

2014-09-19-03-SelectServer

Next up is the “Server Roles” page; this is where the configuration changes need to take place. In the list of checkboxes (titled “Roles”) scroll down till you see “File and Storage Services”, expand it, then expand “File and iSCSI Services”, and further down the list check the “Data Deduplication” checkbox. Click “Next”, accepting any additional features it wants to install.

2014-09-19-04-SelectService

In the “Features” page simply click “Next”.

2014-09-19-05-IgnoreFeatures

On the “Confirmation” page check you are installing what is required and click “Install”.

2014-09-19-06-Install

Wait for the system to install, exit the installer, and restart if your server requires it.

Upon completion of the install and any tasks associated with the installation, re-open “Server Manager” and in the left hand column select “File and Storage Services”.

2014-09-19-07-ServerManager

This will change the screen in “Server Manager” to a three column layout. In the middle column select “Volumes”.

2014-09-19-08-ServerVolumes

With the volumes now displaying in the right hand of the three columns, right click on the volume you want to configure deduplication on and select “Configure Data Deduplication”.

2014-09-19-09-ServerVolumesRightClick

This will bring up the “Deduplication Settings” screen for the volume you right clicked on. Unless Data Deduplication has been configured before, the “Data deduplication” setting will be “Disabled”.

2014-09-19-10-DuplicationSettings-Initial

As I am configuring this on a file server, I am going to select the “General purpose file server” option and leave the rest as defaults. I am then going to click on the “Set Deduplication Schedule” button.

2014-09-19-11-DuplicationSettings-Enable

The “Deduplication Schedule” will now open. I suggest checking the “Enable background optimization” checkbox, as this will allow the server to optimise data in the background. I also elected to create schedules to allow for more aggressive use of system resources: the first one allows it to run after most people have left for the day and before the server’s scheduled backup, and the second one allows it to run all weekend but again stops for backups. Please note that these settings are SYSTEM settings and apply to all data deduplication jobs on the system; they are not unique to each individual deduplication job.
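
If you want to sanity check that a schedule really does finish before the backup starts, the arithmetic is simple enough to sketch in Python. All of the times and durations below are hypothetical examples (I have not listed my actual schedule here), and the overlaps function is just an illustration, not anything the server provides.

```python
from datetime import datetime, timedelta


def overlaps(start_a, duration_a, start_b, duration_b):
    """Return True if two time windows share any time at all."""
    end_a = start_a + duration_a
    end_b = start_b + duration_b
    return start_a < end_b and start_b < end_a


# Hypothetical weekday: dedupe from 19:00, backup from 23:00 for 2 hours.
day = datetime(2014, 9, 19)
dedupe_start = day.replace(hour=19)
backup_start = day.replace(hour=23)
backup_length = timedelta(hours=2)

# A 3 hour dedupe window ends at 22:00, so it is clear of the backup.
short_window_clashes = overlaps(dedupe_start, timedelta(hours=3), backup_start, backup_length)

# A 5 hour dedupe window runs until midnight, so it collides with the backup.
long_window_clashes = overlaps(dedupe_start, timedelta(hours=5), backup_start, backup_length)
```

With these example values the first check comes back False and the second True, so the shorter window is the one to use.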

Click “Apply” on the “Deduplication Schedule” screen, and then “Apply” on the “Deduplication Settings” screen. This will drop you back to the “File and Storage Services > Volumes” screen, and you are now done: data deduplication is configured.

Have fun, and don’t forget that backup

Justin