
VMFS File System Reconstruction (Part 2)

In the first installment of our explanation of rebuilding the VMFS file system from a damaged volume, we covered some of the attributes of the volume's on-disk data structure. We also reviewed some tools and methods for finding the placement of a VMFS volume in order to reconstruct the volume boot handler. Finally, we discussed the role that the GPT, together with the EFI specification, plays in managing the ESXi boot handler and its pre- and post-loader sequences. This installment covers the on-disk format of the GUID Partition Table and the theory behind the Extensible Firmware Interface. It also offers a tool to help build a GPT so that one can mount the file system and run the open source tools against it. If the tools do not work, we will offer a more comprehensive method for retrieving data from a corrupted ESXi VMFS GPT.

First of all, let’s take a look at the on-disk format of the Master Boot Record (MBR) of a GPT partition handler. At first glance the MBR looks, for all practical purposes, like a standard MBR. However, the processing differs a great deal. The MBR on a GPT disk is more of a legacy placeholder than an actual partition table storage and processing mechanism. The partition delineation tables are stored in the sectors that follow the MBR and can hold as many as 128 different partition definitions. In addition, the MBR offers a pre-loader type of logic that is used as a preliminary staging area for mounting the data stored in the GPT. The standard partition table does point to the start of the GPT, and that partition entry is marked by the partition type 0xEE. Although the MBR looks like a standard MBR and a pre-loader for the operating system, it is merely a handler for mounting the ESXi partitions as well as any VMFS data partitions. Let’s now take a look at the on-disk format of the GPT and how we can use the data from the first tool to help build a valid GPT.
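Before moving on, the check for the 0xEE entry described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 512 byte logical sector, the classic MBR layout (four 16 byte entries starting at offset 446, signature 0x55AA at offset 510), and a hypothetical helper name, find_protective_entry, that is not part of any tool mentioned in this series. The demo runs against a synthetic sector built in memory rather than a real disk.

```python
import struct

SECTOR_SIZE = 512          # assumption: 512 byte logical sectors
MBR_TABLE_OFFSET = 446     # the four legacy partition entries start here
MBR_ENTRY_FORMAT = "<B3sB3sII"  # status, CHS start, type, CHS end, first LBA, sector count

def find_protective_entry(mbr: bytes):
    """Return (index, first_lba, sector_count) of the 0xEE entry, or None."""
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector (missing 0x55AA signature)")
    for index in range(4):
        offset = MBR_TABLE_OFFSET + 16 * index
        _status, _chs_start, ptype, _chs_end, first_lba, count = struct.unpack(
            MBR_ENTRY_FORMAT, mbr[offset:offset + 16]
        )
        if ptype == 0xEE:  # protective entry that points at the GPT
            return index, first_lba, count
    return None

# Synthetic protective MBR for demonstration only (not read from a disk).
demo_mbr = bytearray(SECTOR_SIZE)
demo_mbr[446:462] = struct.pack(
    MBR_ENTRY_FORMAT, 0x00, b"\x00\x02\x00", 0xEE, b"\xff\xff\xff", 1, 0xFFFFFFFF
)
demo_mbr[510:512] = b"\x55\xaa"

protective = find_protective_entry(bytes(demo_mbr))
print(protective)
```

On a real device the same function could be fed the first 512 bytes read from the disk or from an image file. A first LBA of 1 in the returned tuple is where the GPT header is expected to begin.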
The GPT is separated into two distinct sections. First there is a GPT header. This header is a general information store for the physical device on which it resides. The header has several data items that help in processing the GPT entry portion of the table. The following is a list of the most pertinent and useful of the header on disk data items:
  • GPT Magic – Oddly enough the magic number for the GPT header is “EFI PART”. This is, in a certain respect, misleading in that you can have a GPT without an EFI BIOS handler. The GPT and EFI are two separate entities and are not dependent upon each other.
  • Revision Number – This is used to determine whether the on-disk data format contains any new entities. It can also convey whether existing data types within the structure have changed. As an example, an LBA sector address might once have fit in an unsigned int, but with larger drives it has become apparent that an INT64 (long long) needs to be used in order to address not only all sectors but all bytes as well, since many file systems address the storage device in bytes and not in sectors. As the sector size of a drive goes from 512 to 4096 this becomes even more important.
    [Figure: GPT EFI Magic]

    As an aside, it is becoming increasingly apparent that the larger hard drives are coming out with a 4096 byte physical block size in lieu of the old industry standard of 512 bytes. This move will help increase performance, as most file systems base their block or cluster sizes on 4096 bytes. In addition, this will aid in 48 bit addressing, as the addressing scheme for the ATA specification has always been at the sector level and not the byte level. The move to a 4096 byte sector size multiplies the maximum addressable drive size by a factor of eight.
  • First Useable and Last Useable Sector – These parameters delineate the boundaries of the data area. Anything outside of that range is part of the partition metadata or backup metadata. The group of sectors immediately after the Last Useable Sector stores a backup of the GPT Entries.
  • GPT Header Primary and Backup Placement – These parameters speak for themselves. What is interesting, however, is that the backup GPT header is only the header and not the GPT entry data. The GPT Backup Entry Data is stored immediately after the Last Useable Sector as defined in the GPT Header.
  • Partition Entry Sector – This parameter marks the first sector of the GPT entries. As of this writing that is normally sector 2, which is right after the GPT header data. The number of entries that follow is given by the next field.
  • Total Partition Entries – This value holds the maximum number of entries allowed for this particular version of the GUID Partition Table specification.
  • Partition Entry Size – This is the size of a GPT partition entry. As of this writing that size is 128 bytes. For a 512 byte sector there are four entries per sector.
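The header fields listed above can be decoded with a short Python sketch. The field order and widths below follow the 92 byte header layout as commonly documented for the UEFI specification, which is an assumption on my part rather than something taken from the tool offered in this series. The names GptHeader and parse_gpt_header are hypothetical, and the demo header is synthetic (a made-up 2048 sector disk with its CRC fields left at zero).

```python
import struct
import uuid
from collections import namedtuple

# Little-endian, 92 bytes: signature, revision, header size, header CRC32,
# reserved, current LBA, backup LBA, first usable, last usable, disk GUID,
# partition entry LBA, number of entries, entry size, entries CRC32.
GPT_HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"

GptHeader = namedtuple(
    "GptHeader",
    "signature revision header_size header_crc32 reserved current_lba "
    "backup_lba first_usable_lba last_usable_lba disk_guid entries_lba "
    "num_entries entry_size entries_crc32",
)

def parse_gpt_header(sector: bytes) -> GptHeader:
    """Decode the fixed part of a GPT header from one raw sector."""
    size = struct.calcsize(GPT_HEADER_FORMAT)
    header = GptHeader._make(struct.unpack(GPT_HEADER_FORMAT, sector[:size]))
    if header.signature != b"EFI PART":
        raise ValueError("GPT magic 'EFI PART' not found")
    # GUIDs are stored in mixed-endian form, hence bytes_le.
    return header._replace(disk_guid=uuid.UUID(bytes_le=header.disk_guid))

# Synthetic header for a hypothetical 2048 sector disk (demonstration only).
raw = struct.pack(
    GPT_HEADER_FORMAT, b"EFI PART", 0x00010000, 92, 0, 0,
    1, 2047, 34, 2014, uuid.uuid4().bytes_le, 2, 128, 128, 0,
).ljust(512, b"\x00")

hdr = parse_gpt_header(raw)
entries_per_sector = 512 // hdr.entry_size
entry_array_sectors = hdr.num_entries * hdr.entry_size // 512
print(hdr.entries_lba, hdr.num_entries, entries_per_sector, entry_array_sectors)
```

With 128 entries of 128 bytes each, the entry array occupies 32 sectors of 512 bytes, which is why the first useable sector in the demo is 34 (sector 2 plus 32).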
Now that we have defined and explained the items in the GPT header, the following is a list of the items in a GPT Entry record:
  • Start and ending sector of the volume – These are the parameters within the device that define the volume or volume portion residing on this device. I say portion as there may be multiple partition entries within a defined volume that when grouped together define one volume.
  • Partition Type GUID – This GUID defines the type of volume. As an example, there might be a Microsoft Basic partition, a VMFS data partition, a VMFS core partition, etc.
  • Partition Unique GUID – A GUID that is specific to the partition. This is a unique id that is used by the file system in many of its metadata operations. It is used as a safeguard to ensure that the file system remains constant and in a faultless state.
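A single entry record can be decoded in the same way. The sketch below assumes the usual 128 byte entry layout (type GUID, unique GUID, first LBA, last LBA, attributes, then a 72 byte UTF-16LE name). The type GUID used in the demo is the value commonly listed for VMFS data partitions; treat it as an assumption and verify it against a known good ESXi disk. parse_gpt_entry is a hypothetical helper name, and the demo entry is synthetic.

```python
import struct
import uuid

GPT_ENTRY_FORMAT = "<16s16sQQQ72s"  # 128 bytes, little-endian

# Assumption: commonly documented partition type GUID for VMFS data.
VMFS_TYPE_GUID = uuid.UUID("AA31E02A-400F-11DB-9590-000C2911D1B8")

def parse_gpt_entry(raw: bytes) -> dict:
    """Decode one 128 byte GPT entry record into a dictionary."""
    type_guid, unique_guid, first_lba, last_lba, attributes, name = struct.unpack(
        GPT_ENTRY_FORMAT, raw[:128]
    )
    return {
        "type_guid": uuid.UUID(bytes_le=type_guid),      # all zero means unused
        "unique_guid": uuid.UUID(bytes_le=unique_guid),
        "first_lba": first_lba,
        "last_lba": last_lba,                            # inclusive
        "attributes": attributes,
        "name": name.decode("utf-16-le").rstrip("\x00"),
    }

# Synthetic entry for demonstration only.
raw_entry = struct.pack(
    GPT_ENTRY_FORMAT,
    VMFS_TYPE_GUID.bytes_le,
    uuid.uuid4().bytes_le,
    2048,
    4095,
    0,
    "datastore1".encode("utf-16-le"),   # struct pads the name with zero bytes
)

entry = parse_gpt_entry(raw_entry)
sector_count = entry["last_lba"] - entry["first_lba"] + 1
print(entry["type_guid"], entry["name"], sector_count)
```

Because the ending sector is inclusive, the partition length in sectors is the last LBA minus the first LBA plus one.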

VMFS Partition Rebuild Software

Once again, there are other parameters tied to the GPT entry, but the ones listed above are of the greatest importance and pertinence. In order for a VMFS file system to be mountable, these entities must be defined and written to the disk in the format the operating system expects. To mount a VMFS file system one would otherwise need to go in by hand and define each item through a hex editor. This method would be a painstaking exercise in probable futility and is better left to a piece of software. In fact DTI offers a piece of software that will do just that. The software can be downloaded here: VMFS GPT Builder
Once the software has been used to build a valid VMFS partition, the next step is to try to use the Linux tools to recover the data from the volume. There are very clear and concise instructions for compiling and using each tool. I have compiled and used these tools on CentOS as well as Ubuntu. Ubuntu offered no resistance whatsoever, and CentOS required the libuuid-devel package in order to compile the tools. If, however, for whatever reason the tools cannot be used to recover data from the reconstructed file system, the next installment will explain another method for recovering data using the inodes. The inode is the heart and soul of a file. It defines where the file is stored, its size and its dates, and it is the cornerstone for recovering the file. We will take a close look at the inode structure as well as its use and how we can get it to work with the VMFS tools provided by the Linux community.
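To give a sense of what defining these items by hand involves, here is a minimal Python sketch that assembles a primary GPT header and entry array in memory, including the two CRC32 values. It is not the DTI tool and makes no claim about how that tool works. It assumes 512 byte sectors, 128 entries of 128 bytes, the commonly documented VMFS type GUID, and the usual rule that the header CRC is computed with its own CRC field set to zero. The function names are hypothetical, and nothing is written to a disk.

```python
import struct
import uuid
import zlib

HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"   # 92 byte GPT header
ENTRY_FORMAT = "<16s16sQQQ72s"         # 128 byte GPT entry
SECTOR, NUM_ENTRIES, ENTRY_SIZE = 512, 128, 128
# Assumption: commonly documented partition type GUID for VMFS data.
VMFS_TYPE_GUID = uuid.UUID("AA31E02A-400F-11DB-9590-000C2911D1B8")

def build_primary_gpt(total_sectors, part_first_lba, part_last_lba):
    """Return (header_sector, entry_array) for a disk with one VMFS entry."""
    array_sectors = NUM_ENTRIES * ENTRY_SIZE // SECTOR      # 32 sectors
    first_usable = 2 + array_sectors                        # after MBR, header, entries
    last_usable = total_sectors - 2 - array_sectors         # before backup entries and header
    if not first_usable <= part_first_lba <= part_last_lba <= last_usable:
        raise ValueError("partition lies outside the useable sector range")

    entry = struct.pack(
        ENTRY_FORMAT, VMFS_TYPE_GUID.bytes_le, uuid.uuid4().bytes_le,
        part_first_lba, part_last_lba, 0, "vmfs".encode("utf-16-le"),
    )
    entries = entry.ljust(NUM_ENTRIES * ENTRY_SIZE, b"\x00")
    entries_crc = zlib.crc32(entries) & 0xFFFFFFFF
    disk_guid = uuid.uuid4().bytes_le

    def pack_header(header_crc):
        return struct.pack(
            HEADER_FORMAT, b"EFI PART", 0x00010000, 92, header_crc, 0,
            1, total_sectors - 1, first_usable, last_usable, disk_guid,
            2, NUM_ENTRIES, ENTRY_SIZE, entries_crc,
        )

    # The header CRC covers the header with its own CRC field zeroed.
    header_crc = zlib.crc32(pack_header(0)) & 0xFFFFFFFF
    return pack_header(header_crc).ljust(SECTOR, b"\x00"), entries

def header_crc_ok(header_sector: bytes) -> bool:
    """Recompute the header CRC32 and compare it with the stored value."""
    size, stored = struct.unpack_from("<II", header_sector, 12)
    zeroed = header_sector[:16] + b"\x00" * 4 + header_sector[20:size]
    return zlib.crc32(zeroed) & 0xFFFFFFFF == stored

# Hypothetical 2048 sector image with one partition filling the useable area.
header, entries = build_primary_gpt(2048, 34, 2014)
print(header_crc_ok(header), len(header), len(entries))
```

A complete rebuild would also need the protective MBR and the backup header and entry array at the end of the disk. The start and end sectors passed in here stand in for the values located with the tools from the first installment.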