VMFS File System Reconstruction (Part 4)
In the last three installments we took an in-depth look at the VMFS file system. As we have seen, the file system is made up of many components, and each one not only does its own job but also depends on the other components doing theirs. This dependency often takes the form of staging: one component may need to complete a certain task before the others can continue. Other components generate data that is needed in order to mount the VMFS volume.
The synergy between components is the capstone of almost every file system. It is therefore essential that each component and its data be maintained so that the chance of corruption is kept to a minimum and, in many cases, eliminated. Even so, there is always a possibility that one of the VMFS components will fail and halt the operating system and/or the boot handler. When the mounting process is interrupted or halted, there still needs to be a method for recovering data. The probability of not being able to mount the VMFS file system increases severalfold if it must be pieced together using the tools that we have provided. It is therefore prudent that DTI offer an alternate method of recovering the data for the case of an unmountable file system. The following is an outline of how to do that.
The inode, as defined earlier, is the heart of a file. It is the roadmap of the file's metadata and also maintains a block map of where the data is stored. With that in mind, an inode can be isolated and then read with the VMFS tool set and library to recover a file. The following are the steps to accomplish that task.
- First of all use the tool provided by DTI for finding where the volume begins (VMFS Volume Finder). Be sure to write down the sector number where the software found the volume. This is our starting point for defining some of the data necessary for tweaking the VMFS tool set source code.
- Second, use the tool provided by DTI to build a GPT/EFI partition (VMFS GPT Builder). Hopefully this will be enough for a Linux distribution to mount the file system. It is not necessary for the operating system to recognize the file system but it must mount it as an unknown entity.
- Third, download and compile the VMFS file system handler created by Mike Hommey, found here: VMFS Tools. The library and the tool set need to be compiled on your Linux distribution. The instructions provided within the package are thorough and simple to follow. Read the documentation on the different tools carefully; two of them can be used to copy files from the VMFS partition to another mounted device.
These three steps should hopefully be enough to recover your data from a VMFS volume that will not mount normally. However, if the file system is in a state that makes it impossible to mount even this way, then the next steps should be taken.
- First, using the tool provided by DTI (VMFS Inode Scanner) scan the entire raw device for every inode within the file system. If you know the file size and approximate date of the file then you can use the software to pinpoint which files you would like to recover. Write down the sector where each of the inodes is located.
- Second, download the source code that I have provided from this location: ReadInode. Create a folder called ‘inode’ as a sibling of the other tools. In other words, ‘inode’ should sit alongside the other tool folders such as debugvmfs and fsck.vmfs. Copy the source code into that folder.
- Third, and this is of primary importance: the downloaded file is named ReadInode.c. It contains three sections marked with Roman numerals, and each one has a set of data that must be updated before the tool can be compiled. Follow the instructions at each Roman numeral in the source code.
- Fourth, the VMFS tool set assumes that the VMFS volume is mounted. The file system may not be recognized, but it is mounted. This means that the volume offset is zero, since the mounted volume makes the physical offset transparent. In our case the volume is not mounted, so we must supply the volume offset ourselves. In the file utils.c there is a function called m_pread. That function has a parameter called ‘offset’. On the first line of the function, after the variable definitions, add this line:
offset += FOUND_VOLUME_OFFSET;
where FOUND_VOLUME_OFFSET is the value found by using the DTI tool that tells us where the volume starts. Assuming m_pread passes ‘offset’ on as a byte position, a value reported as a sector number must first be converted to bytes (sector number times the sector size, typically 512).
- Fifth, recompile the software and then run the newly built tool, called ‘inode’. Pass the raw device name of the storage medium as a parameter to the executable. If all goes well, you will be able to follow the diagnostics and watch your file being created.
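To make the volume-offset adjustment from the fourth step more concrete, here is a minimal standalone sketch of the same idea. It is not the m_pread function from the VMFS tool set: the names demo_pread_volume, demo_sector_to_bytes and DEMO_SECTOR_SIZE are made up for illustration, and the 512-byte sector size is an assumption that should be checked against your device.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* makes pread() visible in strict modes */
#endif
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: the device uses 512-byte sectors. */
#define DEMO_SECTOR_SIZE 512ULL

/* Convert a sector number (as written down from the volume finder)
 * into a byte offset on the raw device. */
static off_t demo_sector_to_bytes(uint64_t sector)
{
    return (off_t)(sector * DEMO_SECTOR_SIZE);
}

/* Hypothetical helper: read 'count' bytes at a volume-relative byte
 * offset from a raw device on which the volume starts at
 * 'volume_start_sector'. The only difference from a plain pread() is
 * that the physical start of the volume is added to the offset first. */
ssize_t demo_pread_volume(int fd, void *buf, size_t count,
                          off_t offset, uint64_t volume_start_sector)
{
    offset += demo_sector_to_bytes(volume_start_sector);
    return pread(fd, buf, count, offset);
}
```

On a 32-bit build it would be wise to compile with -D_FILE_OFFSET_BITS=64 so that off_t can address positions beyond 2 GB on a large disk.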
That is about it when it comes to recovering your data. There is one small fly in the ointment: some Linux distributions showed much better I/O performance than others. On Ubuntu we found the file copying to be agonizingly slow. When copying large files such as ‘vmdk’ files, fast I/O is imperative, and we found the CentOS distribution to be a much better choice. You will also notice a ‘flush’ command at the end of the ReadInode.c source. It makes sure that the file is written to the disk as it is produced, rather than leaving a large cache still to be written after the file has been closed.
It is my sincere hope that this group of articles and the set of tools help get you or your client back online. If you have any questions, please post them on the blog, as I check it every day. I will be more than happy to answer any and all of them.
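For readers curious about what flushing involves, the sketch below shows one common way to push written data all the way to the disk in C. It is only an illustration under my own assumptions and is not copied from ReadInode.c; the function name demo_write_and_sync is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* makes fileno() and fsync() visible */
#endif
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write one chunk of recovered data and force it
 * out to the storage device. Returns 0 on success, -1 on any error. */
int demo_write_and_sync(FILE *out, const void *data, size_t len)
{
    if (fwrite(data, 1, len, out) != len)
        return -1;
    if (fflush(out) != 0)            /* stdio buffer -> kernel */
        return -1;
    if (fsync(fileno(out)) != 0)     /* kernel cache -> storage device */
        return -1;
    return 0;
}
```

Syncing after every chunk keeps the cache small but costs throughput, so a tool might reasonably sync only every so many chunks, or once before closing the file.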