Analyzing RAID parity
July 23, 2008
Last time I discussed how to find the RAID data offset for a SNAP OS 4.x RAID handler. To put it briefly, it was a simple matter of finding cylinder group zero on the first drive in the array and backtracking 48 sectors. Once the RAID data offset is established we can plug those numbers into our RAID Diagnostic Toolkit and begin analyzing the parity.
The main objective of the parity check is to make sure that:
1. We do not have a stale drive in the array
2. We do not have a drive in the array that does not belong
3. All RAID data offsets are correct.
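The check itself can be sketched in a few lines. This is not the RAID Diagnostic Toolkit, only a minimal illustration under a few assumptions of mine: the drive images are already loaded as byte strings, sectors are 512 bytes, and the names `xor_blocks` and `parity_mismatches` are hypothetical. In a healthy RAID 5, XORing the same relative sector from every drive gives all zeros, so any sector that does not is a sign of a stale drive, a drive that does not belong, or a wrong offset.

```python
SECTOR = 512  # bytes per sector (assumed)


def xor_blocks(blocks):
    """XOR a list of byte strings together, byte by byte."""
    width = len(blocks[0])
    acc = 0
    for block in blocks:
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(width, "big")


def parity_mismatches(images, offsets, sectors_to_check):
    """Return the relative sectors whose XOR across all drives is not zero.

    images  -- one bytes object per drive image
    offsets -- RAID data offset (in sectors) for each drive, same order
    """
    bad = []
    for rel in range(sectors_to_check):
        row = []
        for image, offset in zip(images, offsets):
            start = (offset + rel) * SECTOR
            row.append(image[start:start + SECTOR])
        if any(xor_blocks(row)):  # any non-zero byte means parity fails
            bad.append(rel)
    return bad
```

Passing one offset per drive, rather than a single shared offset, is deliberate: it lets the same check be rerun with different offsets on individual drives.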
Let’s take each item from one to three and explore its impact. Item one means that there is a drive in the array that has not been functioning for a certain period of time. Normally an alarm goes off, an email may be sent, or there is some other notification that a drive has dropped out of the array and the RAID is now running in a degraded state. When the technician who administers the array does not get a warning, it is usually because of some type of hardware malfunction: the drive is out of the array, but the RAID BIOS does not sound the alarm. A second reason is that the alarm itself stops working. The little speaker on the RAID card that sends this terrible shrill through the server room is malfunctioning and nobody hears it. Another reason might be that the original RAID administrator shut off all alarm notification flags during configuration and never turned them back on. There are a lot of other reasons, but the fact of the matter is that a RAID administrator may have a RAID that has been degraded for a year and not even be aware of it.
Item two is rare; however, it happens often enough that you need to be concerned if you are trying to recover your RAID. It is also less common in the SNAP line of servers than it is in DELL. There are times when a RAID is configured as ‘X’ drives and one hot spare. The RAID admin now working for the company whose data you are trying to recover sends you the RAID and tells you it has four drives, when it is really three drives and one hot spare. He may not know the original configuration. He may not know how to get into the RAID BIOS to see how it was configured. There could be a hundred and one reasons why you get a hot spare drive sent to you along with the rest of the array. The point is, be aware that it can happen.
As a side note, DELL has configured many of their RAID models to have two mirrored drives for the OS, and 3 to X drives as a RAID 5. I have received all the drives from a client, with them ‘swearing’ that all of these drives are in the array. Once I have analyzed the parity and looked at the drives through a hex editor, I come to the realization that I have two RAIDs on my hands, not one. Once again, be aware that the client may not know their exact configuration.
Finally, item three. Sometimes, though not often (this was actually the first time I had seen it with a SNAP server), the RAID data offsets are staggered. In my next installment I will explain what happened with this particular job, and why it happened. Until next time.
Click here to Download the RAID Diagnostic Toolkit. Be sure to read the instructions on the page as well as follow the links to the instructions with screenshots. You may also visit our page: RAID Configuration and Parity Check for more information.
Finding SNAP OS 4.x RAID Data Offset
July 21, 2008
If you are in this business long enough you will see everything, or will you? Two weeks ago I received a SNAP RAID OS 4.x for recovery. I have done a lot of these and I am pretty familiar with the data offsets, how the drives are set up, and where to begin the virtual RAID for my software. Having said that, these are the steps I normally take, and the results from those steps.
The first thing I do is make images of all four drives. These were four identical Seagate Barracuda ST380011A hard drives, so I made sure I had at least 320GB of space on one of the partitions on my server and, using WinHex, dumped the images. Once I had done this I put the client’s original drives back in their bin, hopefully not to be used again.
The next step is to use WinHex and eyeball the beginning of the RAID data. With SNAP OS this is a simple matter of looking for the first cylinder group on the first drive, then subtracting forty eight sectors from that. The assumption is that the block size is 8192 bytes, or sixteen sectors. If you look sixteen sectors before the first cylinder group you will see the file system super block. If you skip back another sixteen sectors you will see another super block. Finally, go back another sixteen sectors and there should be a null sector. Sometimes I see data in there, but that is usually because the drive has somehow been corrupted.
So, once again, to find the beginning of the RAID data segment you find the first cylinder group and subtract forty eight sectors from that. The sector offset derived from that formula is the beginning of the RAID data segment of each drive. It will be the same on all four drives, or at least I thought so until this particular recovery.
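The arithmetic above is small enough to write down as a sketch. The function names are mine, not part of any tool, and the layout simply restates what I described: a null sector, two super block copies, then the first cylinder group, each one block (sixteen sectors) apart.

```python
SECTORS_PER_BLOCK = 16  # 8192-byte block / 512-byte sector


def raid_data_offset(first_cg_sector):
    """Sector where the RAID data segment starts, given the sector of
    the first cylinder group on that drive (found by eye in a hex editor)."""
    return first_cg_sector - 3 * SECTORS_PER_BLOCK  # back up 48 sectors


def expected_layout(first_cg_sector):
    """Sectors where each structure should appear if the offset is right."""
    start = raid_data_offset(first_cg_sector)
    return {
        "null sector": start,
        "super block (second copy)": start + SECTORS_PER_BLOCK,
        "super block": start + 2 * SECTORS_PER_BLOCK,
        "first cylinder group": first_cg_sector,
    }
```

Running `expected_layout` for each drive and checking those sectors in the hex editor is a quick way to confirm that every drive really does share the same offset.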
Next step will be to check the drive parity which, in this case, was unusual. This step will be in the next blog titled “Analyzing RAID parity“.
For more info on RAID Data Recovery or SNAP Data Recovery
SNAP RAID Recovery Part II Drive Set Definition
February 19, 2008
One of the many attributes of a RAID 5 that make it popular is that if a drive goes down in the array the RAID will remain functional. In such a case the following events should occur. An alarm should sound. An alarm that would wake the dead. An alarm that would make raking your fingernails across a chalkboard sound pleasurable by comparison. An alarm that by all known standards would be considered inhumane in most modern cultures. This alarm will sound incessantly, unwavering in its pursuit to be heard until the technician hits it with a keyboard, kicks the server plug out, or kills a chicken and offers a sacrifice to the alarm gods. In other words, you can’t miss this alarm, and if you do, see the reference to ‘wake the dead’.
Secondly, an email will be sent, advising you that the four years’ worth of data that you thought was being backed up, but you discovered two days ago wasn’t, is now in peril of being lost into the never-never land of lost bits and socks that inexplicably disappear from the dryer. Yes, your job, your home, your marriage, all will be lost unless you heed the email warning and immediately shut down the server, to the chagrin of 427 end users who are reading about American Idol. HAH! Welcome to the party pal! (Quote circa 1988: Bruce Willis, Die Hard)
Keep in mind that although there are many RAID cards, as well as on-board RAID interfaces that perform these functions, your particular RAID firmware, as well as its current configuration may not.
The reason that a RAID 5 can have a drive go down and still run is the mathematics of XORing (eXclusive ORing) the data. This method for keeping the data relatively safe in a RAID 5 is called parity. It is a manipulation of the bits in each byte of data. For the unwashed, a byte is eight bits.
In order to understand XORing it is imperative that you understand the XOR truth table. Figure 1 is the truth table for eXclusive ORing.
Figure 1
Figure 2 is an example of XORing and how it relates to a four drive RAID 5 and the parity.
Figure 2
The data is arranged thusly:
‘R’ is the ASCII letter
52h is the ASCII hexadecimal value of ‘R’
0101 0010 is the ASCII binary representation of the letter.
As you can see, each letter is set up the same way.
For illustration purposes the following can be assumed. Each line is considered a single byte of a stripe, as conveyed by Figure 3. If we take the ‘R’, the ‘F’, and the ‘T’ and XOR them together, we get the value in D4, where each bit of each byte is individually XORed across the stripe.
Figure 3
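The same calculation can be sketched in a few lines of Python. This is only an illustration of the XOR arithmetic for one byte of one stripe; the function name `xor_parity` is mine.

```python
def xor_parity(*data_bytes):
    """XOR any number of byte values together to get the parity byte."""
    parity = 0
    for value in data_bytes:
        parity ^= value
    return parity


d1, d2, d3 = ord("R"), ord("F"), ord("T")  # 52h, 46h, 54h
d4 = xor_parity(d1, d2, d3)

print(f"{d4:02X}h  {d4:08b}")  # prints "40h  01000000"
```

XOR does not care about order, so the three data bytes can be combined in any sequence and the parity byte comes out the same.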
Using Figure 4 as a base, and the truth tables we can see the following:
| Line | D1 | | D2 | | D1 XOR D2 | | D3 | | D4 |
|---|---|---|---|---|---|---|---|---|---|
| L1: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
| L2: | 1 | XOR | 1 | = | 0 | XOR | 1 | = | 1 |
| L3: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
| L4: | 1 | XOR | 0 | = | 1 | XOR | 1 | = | 0 |
| L5: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
| L6: | 0 | XOR | 1 | = | 1 | XOR | 1 | = | 0 |
| L7: | 1 | XOR | 1 | = | 0 | XOR | 0 | = | 0 |
| L8: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
Figure 4
Now, let’s say we lose D2 (drive two) in the array. The following is how the RAID card firmware handles it.
| Line | D1 | | D3 | | D1 XOR D3 | | D4 | | D2 |
|---|---|---|---|---|---|---|---|---|---|
| L1: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
| L2: | 1 | XOR | 1 | = | 0 | XOR | 1 | = | 1 |
| L3: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
| L4: | 1 | XOR | 1 | = | 0 | XOR | 0 | = | 0 |
| L5: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
| L6: | 0 | XOR | 1 | = | 1 | XOR | 0 | = | 1 |
| L7: | 1 | XOR | 0 | = | 1 | XOR | 0 | = | 1 |
| L8: | 0 | XOR | 0 | = | 0 | XOR | 0 | = | 0 |
Figure 5
We have built drive 2 on the fly. We do not need to know the data, since we can use the XOR truth table and the remaining three drives’ data to calculate the value of drive two. In the above example the process was illustrated for one byte across one stripe on a four drive array. All of these calculations are done in an instant on a stripe by stripe basis. The full stripe is recalculated for every write and, if a drive is out of the array, for every read of the down drive. With all these calculations you would think it would slow down the processing. To a degree it does; however, bus I/O is far slower than any XOR math a CPU may have to perform. A way to emphasize this point is to imagine you are standing on a bridge. Below you is a river. Each byte of data is a boat that passes under the bridge. The boat travels from the hard drive, down the river, to memory, and to the CPU. As one boat passes, you wait for the next boat. The next boat will not pass for one hundred years. The CPU is in a perpetual wait state. It is always waiting for data to process. So, if you want to speed up your PC, buy high speed I/O smart boards that can RAID, on a high speed bus.
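Rebuilding the missing drive is the same XOR applied to whatever survives. The sketch below is not how any particular RAID firmware is written; it only shows the principle on one stripe’s worth of bytes, and `rebuild_missing` is a name I made up for it.

```python
def rebuild_missing(surviving_chunks):
    """Rebuild the missing chunk of a stripe by XORing the surviving
    chunks (data and parity alike), byte by byte."""
    rebuilt = bytearray(len(surviving_chunks[0]))
    for chunk in surviving_chunks:
        for i, value in enumerate(chunk):
            rebuilt[i] ^= value
    return bytes(rebuilt)


# D1 = 'R', D3 = 'T', D4 = parity byte 40h; D2 is the drive that is down.
print(rebuild_missing([b"R", b"T", b"\x40"]))  # prints b'F'
```

The same function computes the parity chunk in the first place: feed it the three data chunks and what comes back is D4.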
To be continued…
Learn more about RAID Data Recovery
SNAP Server Data Recovery 3 Spanned RAID 5 Arrays
February 8, 2008
Recently, it was my task to take sixteen drives, spanned across three RAID fives, and recover a set of hundreds of AVI files. These files were used for research and although not time sensitive, were critical to the conclusions of the research.
We have been asked to do many similar jobs where the archive of a set of data has been compromised. Many lawyers have databases of all of their scanned briefs as well as all documentation pertaining to a particular case. If that information is lost and the case reopened for appeal it could be devastating to not be able to review the documentation in a timely manner. I mention this because it took me over a month to complete this task, and although interesting, was very tedious.
What made this recovery interesting was that the drives were in two physical devices. The first device was a four drive SNAP array that was used as the head. The other device was a twelve drive SNAP server that was broken up into two RAID fives. The challenge for this recovery was that no one knew which drives were in which array, no one knew the drive order of any array, the configuration given to me by the SNAP server was in error, no one knew the stripe size of the array, and finally, the data recovery company who had the array before me, marked the drives out of order. In other words, I was handed 16 drives and told to figure out a triple spanned RAID five.
So here are the steps I took to solve this data recovery problem for my client.
Step one, I had to find out which drives went with each other. I would have hoped that the two RAIDs in the twelve drive server were equal in size. In other words, I hoped for four drives in the head array and six drives each in the other two RAIDs, but this was not the case. In order to find which drives went with which array I had to know several things.
First, I had to know the SNAP layout for arrays. Each drive in a SNAP array is basically broken up into two parts, the operating system, and the data area. In order to find the size of each you must look at the master boot record (MBR) of each of the drives. The MBR houses the partition table which is a listing of the active partitions.
SNAP partitions are divided into three basic areas, an operating system partition, a swap partition and a data partition. SNAP Appliance designed their device so that if one of the drives went down the firmware would roll to the next drive to load the operating system, network interface, and RAID handler. The important piece of information is what the standard offset to the data area is. The data area of each drive is used for the RAID 5. I have found the data area sector offset for the Guardian OS series to be LBA sector 2216970. This information may change from version to version, but all the Guardian operating systems I have worked with have been the same.
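Reading the partition table out of the MBR can be sketched as follows. This assumes the drives carry a standard PC-style MBR (four 16-byte entries at byte 446 and the 55h AAh signature at byte 510), which is what I have described above but is still an assumption for any given drive. The function name `parse_mbr` is mine.

```python
import struct


def parse_mbr(sector0):
    """Return (index, type, start LBA, sector count) for each used
    entry in a standard MBR partition table."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        entry = sector0[446 + 16 * index:446 + 16 * (index + 1)]
        part_type = entry[4]
        # Start LBA and sector count are little-endian 32-bit values.
        start_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if part_type != 0 and sector_count != 0:
            partitions.append((index, part_type, start_lba, sector_count))
    return partitions
```

On a drive that matches what I have seen, one of the entries returned would be expected to start at LBA 2216970; a drive whose table disagrees deserves a closer look before it goes into a drive set.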
Now that we know the data area offset we can take the next step, which is to determine which drive sets comprise the three RAID sets.
To be continued……..
SNAP Data Recovery Through The OS Inode
May 21, 2007
This week is the final offering of our topic, Recovering a single file from a SNAP Server Operating System. We have learned what a Super Block and a cylinder group are, and some of the important data elements in those data structures. We have learned how to find these data structures by using the data elements of other structures. Finally, we have learned that the file system is broken into blocks and that these blocks are the storage cornerstone of SNAP OS. Putting all of these facts together, we come down to the final data structure, the inode. At the bottom of this article there are links to my other posts so you can read or print them all in order.
Recovering a single file from a SNAP OS Part 3
The inode is the final link in the chain of data storage. It holds the map of the blocks where all of the data of each file and directory is stored. Let us dissect the inode and find its most important elements.
The SNAP OS Inode
Figure 1 is a raw hex representation of an inode. There are several data elements within the inode that define the date the file or directory was created, the last time it was updated, and the size of the file. For our purposes however, we are only concerned with one area of data elements and those are the direct and indirect block definitions.

Fig 1
The direct block definitions are in the shaded green area, and there are a maximum of twelve direct data blocks. The term direct means that each one of the four byte numbers in the shaded green area points to an actual data storage block. In other words, if we take the first value of 0x14A80E (1353742 decimal) and go view that data block, we will find the first values for our file 2003STEP.PDF. In figure 2 we can see the first few bytes of data from block 0x14A80E.

Fig 2
There are only twelve direct data blocks, so if your file exceeds 96 KB (twelve 8 KB blocks), the file system will use a method defined as indirect blocks. There are three data elements for these blocks:
- Indirect Block: Points to a block that has a list of data blocks.
- Double Indirect Block: Points to a block that has a list of Indirect blocks.
- Triple Indirect Block: Points to a block that has a list of Double Indirect blocks.
From the above explanation you can see how deciphering a very large file can be extremely complicated. Once understood, this method works well and is very fast. Along with those facts, it is also very easy to program using recursion and a set of flags to let the recursive function know what is being processed.
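A recursive walk of this kind can be sketched as below. It is not my production code, and several things in it are assumptions: block pointers are taken to be four byte little-endian values, a zero pointer is treated as unused and skipped, and `read_block` is a hypothetical callback that returns the 8192 raw bytes of a given block number. The `level` argument plays the role of the flag that tells the recursive function what it is processing.

```python
import struct

BLOCK_SIZE = 8192
PTRS_PER_BLOCK = BLOCK_SIZE // 4  # four-byte block pointers


def expand(read_block, block_no, level):
    """Yield data block numbers reachable from block_no.
    level 0 = data block, 1 = indirect, 2 = double, 3 = triple."""
    if block_no == 0:          # unused pointer (assumption)
        return
    if level == 0:
        yield block_no
        return
    raw = read_block(block_no)
    for (pointer,) in struct.iter_unpack("<I", raw):
        yield from expand(read_block, pointer, level - 1)


def file_blocks(read_block, direct, indirect, double, triple):
    """Yield every data block of a file, in file order."""
    for block_no in direct:
        yield from expand(read_block, block_no, 0)
    yield from expand(read_block, indirect, 1)
    yield from expand(read_block, double, 2)
    yield from expand(read_block, triple, 3)
```

Because the function is a generator, a very large file never needs its whole block list in memory at once; each data block can be copied out as soon as its number is produced.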
Figure three is a listing of the 2003STEP.PDF direct blocks from its only indirect block.

Fig 3
Well, that’s it! By using the formulas and techniques I have outlined in my last three articles you can easily retrieve any file. I hope this helps those of you that have lost data due to hardware and or software failures on your SNAP Server.
If you have any questions, or if I can be of any help, please feel free to call me, or drop me an email.
(727)345-9665 Ext 203
dickc AT dtidata.com
SNAP Server Data Recovery of a Single File
Here are all the articles about SNAP Server Recovery of a single file:
- SNAP Data Recovery - the first post about the SNAP OS.
- SNAP Server Data Recovery of a Single File - A detailed post about recovering a lost file on the SNAP OS.
- SNAP Server Data Recovery Using The Super Block - the next article about SNAP file recovery.
Our main page for SNAP Server Data Recovery.
SNAP OS Data Recovery Super Block
April 26, 2007
SNAP Operating System File Recovery Through The Super Block
Recovering a single file from a SNAP OS Part 2
Last week we discussed how to find the file name on a SNAP OS file system. Using a sector editor we searched the hard disk drive for the file name. Once we found the file name I broke down the on disk data structure format for a directory/file entry. Among the many elements of the structure, the most important in determining where the data for the file is stored is the inode number. In this week’s installment we will discover how to use the on disk data structure called the super block. This is the key to the entire file system and is essential if we are to find the data related to this file name.
In order to find the super block, we must first understand how the SNAP OS positions itself on a drive. In Windows the basic element of storage is a cluster. Standard cluster size for an NTFS drive is 4096 bytes, or eight (8) sectors. For SNAP OS the basic element of storage is a block. Standard block size for SNAP OS is 8192 bytes, or sixteen (16) sectors. Using the block as a basis for storage, we then have groupings of blocks. These groupings of blocks are called cylinder groups. The on disk layout of a cylinder group is as follows.
Cylinder Group On Disk Layout
All blocks are relative to the beginning of the cylinder group.
- Super Block: Block 0: This block houses a copy of the on-disk structure of the super block
- Cylinder Group: Block 1: On-disk structure of the cylinder group
- Inodes: Block 2 – (2 + n) Inode storage. Each inode is 128 bytes, therefore 4 inodes per sector, or 64 inodes per block.
- Data Blocks: Blocks (2 + n) - (end of cylinder group) All remaining blocks to the end of the cylinder group are data blocks.
Applying real world numbers
In order to help illustrate how all of this works together, let’s take the 2003STEP.PDF example and apply it to our on disk definitions.
First of all, we must find the super block. The best way to do that is to find the first cylinder group using the magic number I spoke of in my article “SNAP RAID Recovery Using SNAP OS”. The magic number for the cylinder group is 0x550209. So, using WinHex as my sector editor, I plug that value into the “Hex Search” field and run the search. In this case the cylinder group is stored at sector 48. Now, we know that the cylinder group data structure is stored in relative block 1, and we also know that a copy of the super block is stored in relative block 0. The size of the blocks is 8K, so we can count back 16 sectors, or 8K, and find a copy of the super block.
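For those who would rather script the search than eyeball it, here is a minimal sketch of the same idea. It assumes the three bytes 55 02 09 appear on disk in the order they are typed into the hex search, and that the image (or a slice of it) is available as a bytes-like object; the function names are hypothetical.

```python
SECTOR = 512
SECTORS_PER_BLOCK = 16
CG_MAGIC = bytes.fromhex("550209")  # byte order as typed in the hex search


def find_pattern_sectors(image, pattern, limit=None):
    """Return the sector numbers in which the byte pattern occurs."""
    hits = []
    position = image.find(pattern)
    while position != -1:
        hits.append(position // SECTOR)
        if limit is not None and len(hits) >= limit:
            break
        position = image.find(pattern, position + 1)
    return hits


def super_block_copy_sector(cg_sector):
    """The super block copy sits one block before the cylinder group."""
    return cg_sector - SECTORS_PER_BLOCK
```

A three byte pattern will also turn up inside ordinary file data, so every hit still has to be checked by looking at the sector before it is trusted as a cylinder group.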
So, a copy of the super block is stored at sector 32. There are some data elements within the super block that will help us identify the exact placement of the inode we are looking for. These elements are as follows: (Remember the numbers are for this real life situation only, your numbers may differ because of disk size, formatting flags etc.)
- Super Block Offset: 2
- Cylinder Block Offset: 3
- Inode Block Offset: 4
- Data Block Offset: 16
The above numbers are relative to the beginning of the volume. Therefore we can find the beginning of the volume by using the Super Block offset. The Super Block is stored on block two or, translated to sectors, sector 32. If we subtract two blocks (32 sectors) from that position we land on block zero, which is the beginning of the volume. This is important since many of the SNAP OS volumes I work on are RAID-ed. There is a great deal of extraneous data when dealing with RAIDs; however, using this formula, we can easily find the beginning of the volume on a destriped RAID set.
Secondly, and more importantly, we can determine the total inodes per cylinder group. As defined before, we know that there are 64 inodes per block. In our real world example we can see that the inode block starts at relative block 4, and the data block starts at relative block 16. If we subtract 4 from 16 we know that there are 12 blocks of inode storage per cylinder group. With 64 inodes per block times 12 blocks, that is 768 inodes per cylinder group. There is a data element in the super block that tells us the inodes per cylinder group. If our calculation matches the super block data element, then we know that our file system is aligned. In this case they both match.
Now, if we know that we have 768 inodes per cylinder group, and the inode we are looking for is 1015297, we can divide the inode number by the total inodes per cylinder group to find the cylinder group which houses our inode. That value is 1322. We then take the mod of the same values to tell us which inode within the cylinder group is the one we are looking for. That value is 1. So we can say that in cylinder group 1322, inode 1, we have the inode we are looking for.
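The two calculations above fit in a short sketch. The block and inode sizes are the ones given earlier in this article; the function names are mine.

```python
BLOCK_SIZE = 8192
INODE_SIZE = 128


def inodes_per_group(inode_block_offset, data_block_offset):
    """Inodes per cylinder group from the two super block offsets."""
    inodes_per_block = BLOCK_SIZE // INODE_SIZE  # 64
    return inodes_per_block * (data_block_offset - inode_block_offset)


def locate_inode(inode_number, per_group):
    """Return (cylinder group, inode index within that group)."""
    return divmod(inode_number, per_group)


per_group = inodes_per_group(4, 16)
print(per_group)                         # prints 768
print(locate_inode(0x000F7E01, per_group))  # prints (1322, 1)
```

If the value computed by `inodes_per_group` does not match the inodes per cylinder group element stored in the super block, the alignment is wrong and there is no point going further.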
Lastly, how do we find cylinder group 1322? The size of a cylinder group is the size of the data group plus 64 sectors. In my case, the data group was 1024 blocks, or 16,384 sectors. Add 64 sectors to that and each cylinder group is 16,448 sectors. One note: every 16 cylinder groups there is an adjustment of 1024 sectors, so the 16th cylinder group is only 15,424 sectors.
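As a rough sketch, the start of any cylinder group can be computed from those numbers. The interpretation here is an assumption on my part: groups are counted from zero, and the 16th, 32nd, 48th (and so on) group in sequence is the short one. The sizes are from this particular recovery, and the result is relative to the start of the first cylinder group, so your own numbers and base offset will differ.

```python
FULL_GROUP_SECTORS = 16448   # 16,384 data sectors + 64
ADJUSTMENT_SECTORS = 1024    # removed from every 16th group
GROUPS_PER_ADJUSTMENT = 16


def cylinder_group_start(group_number):
    """Sector offset of a cylinder group, relative to group zero."""
    short_groups_before = group_number // GROUPS_PER_ADJUSTMENT
    return (group_number * FULL_GROUP_SECTORS
            - short_groups_before * ADJUSTMENT_SECTORS)


print(cylinder_group_start(1322))
```

Whatever sector this produces should always be confirmed by looking for the cylinder group magic number there before any inode is read from it.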
That’s it! Now that we have a method for finding the inode, we can actually start pulling data off. I will cover direct disk blocks and the formula for pulling data off of the drive in my next installment.
SNAP Server Data Recovery Of A Single File
April 19, 2007
Recovering a single file from a SNAP OS
As we know, SNAP Appliance used a proprietary Unix File System (UFS) handler in order to run their Network Attached Storage (NAS) product. This particular OS ran a Berkeley Software Distribution (BSD) flavor of UFS. Although there are many similarities to the original file system, there are also enough changes to make file recovery extremely difficult. In the following paragraphs and articles we will explore the arithmetic and methodology of recovering a single file from a UFS volume.
In order to recover the file we must use the on-disk data structures that give the OS its road map to the file name, inode, and finally data block placement. Normally I would start with the coarsest data structure, but in order to facilitate an understanding of the file hierarchy I will start from the smallest granularity, that data structure being the directory entry.
To illustrate the data elements and their use I chose a PDF file for recovery. The file name is “2003STEP.PDF”. Using your favorite sector editor, do a scan and search for the file name that you want to recover. The tool I use is a wonderful product called “WinHex”. Figure A is a capture of my sector search.

Figure A
There are many elements in a directory entry structure; however there are actually five key elements.
- File Name: This is the actual name of the directory/file. In Figure A this is defined as “2003STEP.PDF”
- File Name length: This data element is self explanatory as it defines the length of the name. In Figure A the name length is defined as 0x0C, or 12 in decimal.
- File Type: In this case we are only concerned about two types: 0x04, which is a directory entry, and 0x08, which is a standard file name entry. In Figure A this entry is a regular file name.
- Record length: This defines the entire length of this particular directory/file name record. In variable length records there is always a record length. In Figure A you can see the record length is 0x00000028 in hex, or 40 in decimal.
- Inode number: The name and name length are important; however, the inode number holds the key to the data block placement. In Figure A this is defined as 0x000F7E01, or 1015297 in decimal.
Next installment I will describe the ‘cylinder group’ data structure and how we can use that to find our inode element.
SNAP RAID Recovery using SNAP OS
April 11, 2007
SNAP Server NAS RAID Data Recovery
SNAP Appliance, now owned by Adaptec was one of the pioneers of the Network Attached Storage (NAS) technologies. Through the use of the Berkeley Software Distribution (BSD) and the UNIX File System (UFS), SNAP developed a reliable and easy method for using a mass storage device through a shared network. In order to do this SNAP used an abbreviated version of the file system in conjunction with a set of hard coded variables that allowed for a fast boot up, easier recovery facilities within the spectrum of the operating system, and a ROM based web interface that was closely tied to several of the standard UNIX/Linux/BSD recovery tools. However, that being said, when it came to catastrophic recovery this particular OS/FS marriage made it virtually impossible for any third party standard file system handler, or tool, to recover lost, or deleted data. The following is an outline of one of the basic data structures, the Super Block, and how it differs from the standard UFS file system. These differences are the ‘fly in the ointment’ when it comes to using standard UFS data recovery tools. Read my article on SCO Unix RAID Data Recovery for more insight on the UFS.
On-disk file system data structures are the key to data recovery. The knowledge of how a file system resides on the disk is the only way to recover from catastrophic data loss. Using on-disk data structures and their relationship with each other will help a recovery expert piece together lost data on a file system that will not mount. In essence, the data recovery technician creates a virtual file system using key data elements from the on-disk structure. These data elements go through a mathematical and geometrical scrutiny. This evaluation of the data must be strict enough to allow for corrupt data parsing, but flexible enough to build the file system from a partial data structure. In other words, a sort of ‘artificial intelligence’ is used to compare, evaluate, and assign data values to key data elements through the use of file system structure placement. A basic element of the file system in this particular case is the Super Block.
The Super Block is a broad spectrum definition of the entire file system. Although not defining file placement, and block usage, the Super Block is the crux of on-disk data element placement that will lead the data recovery technician to file name, inode definition, and ultimately data block placement. Data fields that reveal such values as total inodes, total data blocks, total cylinder groups, can be used to define a cohesive file system and in many cases rebuild a corrupted data structure. The Super Block defines coarse data that can be used to calculate cylinder group definitions that inevitably lead to directory definitions, and a methodology to build a file tree.
The Super Block is defined across the disk in each cylinder group. This fact alone can aid the trained data recovery technician in the alignment of the file system. Once aligned, it is a simple matter of back tracing directory name, inode definition, and data block in order to build a file tree. As an example, the Super Block designates the primary inode block. When parsing the first cylinder group, inodes 0 and 1 are undefined and their 128 byte data elements are zeroed. However, inode 2 is defined, and can be traced to the root of the directory structure. Using recursion, one can easily define a full tree by using this single data element.
SNAP UFS File System Data Recovery
There are many more data elements that are an integral part of the SNAP UFS, however, the one basic element that is needed in order for third party UFS handlers to function is missing. Each on-disk data structure maintains an element that is unique to its particular type. This element is defined as a ‘MAGIC NUMBER’. This magic number, however derived, is a tell tale element that can be used by the technician to find certain data structures. For whatever reason, SNAP decided to ignore the magic number and it is not stored on-disk. This may be an indication that the SNAP file system designers did not want to carry extra data elements that were superfluous to the functionality and definition of the file system. It is a good strategy for saving precious space in a ROM perhaps, but is not a sound strategy if one is trying to piece together a file system and has no idea where to start. I am not trying to second guess the SNAP Appliance designers, it is merely a fact of the on disk structure and must be dealt with.
If a software engineer wishes to design, code and implement a SNAP Appliance UFS recovery handler then the magic number must be taken into consideration. There are several other data elements of the super block structure that must have certain values. These values can be boundary tested, and used to find other data elements that have a more traditional on-disk data structure. In other words, if the super block cylinder group element points to a particular sector on the disk, that sector can be loaded and masked with a cylinder group on-disk structure. The structure can then be boundary tested and if the testing proves positive then the original super block placement may be correct. Of course, several other elements must be tested, but if the tests return in a positive manner, it is very likely that you may have found your super block without the use of a magic number.
In the final analysis it is up to the data recovery technician to evaluate each SNAP Appliance, and the possibility of recovery. However, with calculator in hand and hex editor on screen, a well versed data recovery technician can find the super block and use that key to unlock his client’s lost data.
If you have any questions pertaining to this article, or others I have written, please contact me at:
Dick Correa
DTI Data
Senior Systems Analyst / Senior Software Designer
dickc AT dtidata.com
SAN SNAP RAID Recovery is our main page or you can find out more information about data recovery.






