Poking about with DBCC PAGE (Part 1 of ?)
(It's been a while since I last posted - summer fun's been in the way obviously and I've been busy picking up some new and exciting (for me) SQL skills, partially to help with four upcoming TechEds I'll be doing in China and Hong Kong. The posting frequency should increase towards the end of the week but in the meantime, this one's a special request.)
I've been getting lots of requests for more info about how to use DBCC PAGE since my Paul-tells-all post a while back and I had been planning to wait until I'd posted more about the internals of DBCC CHECKDB and moved on to various examples of corruptions before starting to use DBCC PAGE again in the blog. However, a friend made me an offer I couldn't refuse and so I've caved in and I'm starting the series now :-)
I'm going to start with cracking a record from scratch. Yes, I know that option 3 of DBCC PAGE will do it for you - but how does it do it? Sometimes you may not have table metadata and so DBCC PAGE won't be able to use option 3 - in that case you'll need to crack it yourself. Also, there are plenty of curious people who like to be immersed in the bits and bytes and besides, cracking records manually is fun (if you're twisted as well as curious like I am :-)
I've created an example database, dbccpagetest, on SQL Server 2005 which I've zipped up and attached to this post. This way you can all use the same data and pages as me without having to go through the tedium of working out where pages are using the PFS pages (as I explained in the initial DBCC PAGE post).
You won't be able to attach it to SQL Server 2000 but you can always download SQL Server 2005 Express Edition and play with it on that. Unzip the files and attached them using the CREATE DATABASE ... FOR ATTACH syntax:
(FILENAME='C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\dbccpagetest.mdf')
Note that if the filenames haven't changed since they were created, you only need to specify the primary file in the CREATE DATABASE statement. Very cool.
The database has a single table with a couple of rows, created using the following T-SQL:
TABLE example (
INTO example VALUES ('Banff', 'sightseeing', 5);
INTO example VALUES ('Chicago', 'sailing', 4);
The data page is page (1:152). Dumping that page using option 3 will give output including:
Slot 0 Offset 0x60 Length 33
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Memory Dump @0x44DFC060
00000000: 30000800 05000000 0300f802 00160021 †0..............!
00000010: 0042616e 66667369 67687473 6565696e †.Banffsightseein
Slot 0 Column 0 Offset 0x11 Length 5
destination = Banff
Slot 0 Column 1 Offset 0x16 Length 11
activity = sightseeing
Slot 0 Column 2 Offset 0x4 Length 4
duration = 5
Previously (that word always makes me think of Twin Peaks - 'Previously in Twin Peaks...', but I digress...) I'd posted about the structure of a record - here it is again:
- record header
- 4 bytes long
- two bytes of record metadata (record type)
- two bytes pointing forward in the record to the NULL bitmap
- fixed length portion of the record, containing the columns storing data types that have fixed lengths (e.g. bigint, char(10) , datetime)
- NULL bitmap
- two bytes for count of columns in the record
- variable number of bytes to store one bit per column in the record, regardless of whether the column is nullable or not
- this allows an optimization when reading columns that are NULL (see here for more info)
- variable-length column offset array
- two bytes for the count of variable-length columns
- two bytes per variable length column, giving the offset to the start of the column valu
- versioning tag
- this is in SQL Server 2005 only
- this is a 14-byte structure that contains a timestamp plus a pointer into the version store in tempdb
So, let's go through the record from the dump fragment above and figure out what everything means. I'm not going to give all the excruciating detail on every possible value of every byte, but will give you enough to be able to crack records yourself (if you're twisted like me...)
Byte 0 is the TagA byte of the record metadata. Its 0x30, which corresponds to 0x10 (bit 4) and 0x20 (bit 5). Bit 4 means the record has a NULL bitmap and bit 5 means the record has variable length columns. If 0x40 (bit 6) was also set, that would indicate that the record has a versioning tag. If 0x80 (bit 7) was also set, that would indicate that byte 1 has a value in it.
Bits 1-3 of byte 0 give the record type. The possible values are:
- 0 = primary record
- a data record in a heap that hasn't been forwarded or a data record at the leaf level of a clustered index.
- 1 = forwarded record
- a data record in a heap that's been updated and was too large to fit in-place on its original page and so has been moved to another page. A forwarding record is left in its place and points to the new location of the record. This is done to avoid having to update any non-clustered index records that point back directly to the original physical location of the record.
- 2 = forwarding record
- sometimes also called a forwarding stub.
- 3 = index record
- an index record in the tree of a clustered index, or in any level of a non-clustered index.
- 4 = blob fragment
- 5 = ghost index record
- 6 = ghost data record
- these two are self-explanatory
- 7 = ghost version record
- a special 15-byte record containing a single byte record header plus a 14-byte versioning tag that is used in some circumstances (like ghosting a versioned blob record)
In our example, none of these bits are set which means the record is a primary record. If the record was an index record, byte 0 would have the value 0x36. Remember that the record type starts on bit 1, not bit 0, and so the record type value from the enumeration above needs to be shifted left a bit (multiplied by two) to get its value in the byte.
Byte 1 is the TagB byte of the record metadata. It can either be 0x00 or 0x01. If it is 0x01, that means the record type is ghost forwarded record. In this case its 0x00, which is what we expect given the TagA byte value.
Bytes 2 and 3 are the offset of the NULL bitmap in the record. This is 0x0008 (DBCC PAGE presents multi-byte values in hex dumps as least-significant byte first). This means that there's a 4-byte fixed length portion of the record starting at byte 4. We expect this because we know the table schema.
Bytes 4 to 7 are the fixed length portion. Again, because we know the table schema, we know to interpret these bytes as a 4-byte integer. Without that knowledge, you'd have to guess. The value therefore is 0x00000005, which is what we'd expect to see as the value of the duration column.
Bytes 8 and 9 are the count of columns in the record. This is 0x0003 which is correct. Given that there are only 3 columns, the NULL bitmap of one bit per column will fit in a single byte.
Byte 10 is the NULL bitmap. Hey - it's value is 0xF8 - what's up with that? Convert it to binary and what do you get? 11111000 This makes perfect sense - bits 0-2 represent columns 1-3 and they're all 0, meaning the columns aren't NULL. Bits 3-7 represent non-existent columns and they're set to 1 for clarity. So the 0xF8 value makes sense - phew.
Bytes 11 and 12 are the count of variable length columns in the record. That value is 0x0002, which we again know to be correct. This means there will be two two-byte entries in the variable length column offset array. These will be bytes 13-14 and 15-16, having values of 0x0016 and 0x0021 respectively.
Now, the explanation that the variable length column offset array stores the offsets to the start of the column value is over-simplified. The entries actually point to the start of the following column value - this is done so that we know how long each column is (they are variable length after all). Notice that the offset of the first variable length column value isn't stored - it doesn't need to be because by definition it must begin right after the last offset in the variable length column offset array.
So, the final offset is bytes 15 and 16, which means the offset of the start of the first variable length column must be byte 17 (or 0x11 in hex), which agrees with the DBCC PAGE dump. The offset of the second variable length column is 0x0016, so the first value is from byte 17 to byte 21 inclusive. This value is 0x42616E6666. We know from the table metadata that this is the first varchar column, destination. Checking our handy ASCII conversion table we find that this translates to 'Banff'.
Using similar logic, the second value is from byte 22 to byte 32 inclusive and has the value 'sightseeing'. Both of these match the data we're expecting.
And that's it. We've just cracked a record from scratch. Now you try it with the second row in the table. In future posts I'll start making things more complicated (maybe trace into the version store or add in some off-row LOB values).
Hopefully this will satisfy all of those who've been waiting for more info on using DBCC PAGE and now I'm looking forward to getting my side of the bargain from my friend...