Under the covers: IAM chains in SQL Server 2000
I've reached a point in my various ramblings where I need to do some glossary work. First up is an explanation of IAM chains. This will be in two parts, detailing how they're used in SQL Server 2000 and then in 2005. (Probably tomorrow as the sun's shining and I want to play with my tractor once the time of day - 7.30am now - becomes a little more socially acceptable for a large diesel engine to be running.)
As with so many of the explanations of database structure internals, there's the chicken-and-egg problem unless I'm standing in front of you with a big whiteboard, so I'll refer to things that haven't been defined yet. We'll get there eventually, I promise.
An IAM (Index Allocation Map) page tracks approximately 4GB worth of space in a single file, aligned on a 4GB boundary. These 4GB chunks are called 'GAM intervals' (another post!). The IAM page tracks what space within that specific GAM interval belongs to a single entity (I'm chosing my words carefully here and not using any word that has SQL Server connotations like 'object'). I'll get to the structure of an IAM page below.
Now, as an IAM page can only track the space for a single GAM interval in a single file, if the database has multiple files, or some files are more then 4GB, and the entity has space allocated from multiple files or multiple GAM intervals within a file, then you can see how multiple IAM pages are needed for each entity to track all the space that its using.
This is where an IAM chain comes in. It's a linked-list of IAM pages that track the space allocated to a single entity. The linked-list is not sorted at all - IAM pages are appended to it in the order that they're needed. The IAM pages within the list are numbered, again, in the order that they were appended to the list.
Definition of 'entity' - what uses an IAM chain? This is vastly different in SQL Server 2000 and 2005, hence the need for two posts.
In SQL Server 2000, a single IAM chain is used for each:
- heap or clustered index
- a table can only have one or the other, not both
- these have index IDs of 0 and 1 respectively
- non-clustered index
- these have index IDs from 2 to 250 (i.e. you can only(!) have 249 of them)
- text storage
- for LOB columns in the heap or clustered index
- sometimes called the 'text index'
- these have a fixedindex ID of 255
This is very simple, right? I usually generalize and say that in SQL Server 2000, there's one IAM chain per index (which fits nicely if you remember that IAM stands for Index Allocation Map).
How exactly does each IAM page track space usage within the GAM interval that it maps to? Each IAM page has two records, an IAM page header and a bitmap.
The IAM page header tracks metadata about the IAM chain, including:
- the GAM interval that the IAM page maps
- the sequence number of the IAM page within the IAM chain
- this increases by one for each page added to the chain
- the pages are linked using their previous and next page pointers and the linked-list must be in strict sequence number order
- the object and index IDs of the index that the chain maps
- all the IAM pages in the IAM chain (and in fact all pages allocated to the index) have the same object and index ID stamped in their page headers.
It also contains the single-page array. This is only used in the first IAM page in the chain and tracks single pages that have been allocated to the index, instead of extents (an extent is a group of 8 contiguous pages - I'll cover pages and extents more fully next week in Fragmentation parts 2 and 3 - it's that chicken-and-egg thing again).
The bitmap occupies the rest of the IAM page and has a bit for each extent in the GAM interval The bit is set if the extent is allocated to the index, and clear if it is not. Obviously two IAM pages that map the same GAM interval for different indexes cannot both have the same bit set (sounds like a job for CHECKDB!)
You can look at the contents of an IAM page using DBCC PAGE. Use the instructions in this post to work out what pages are IAM pages and have a look.
Some random things to note about IAM chains and pages:
- they are not self-referential (they do not track the space taken up by themselves)
- if some data is deleted, leading to space deallocation, leading to a GAM interval no longer having any space allocated for an index, the IAM page is not deleted
- the only operations that delete pages from the IAM chain are TRUNCATE TABLE, DROP TABLE, and certain repairs in DBCC CHECKDB
- DBCC INDEXDEFRAG makes a copy of the IAM chain for an index and sorts it by file and GAM interval to allow it to work out what the next physical extent/page in each file is
- the IAM chain repairs are far-and-away the most complicated
In the next post, I'll explain how SQL Server 2005 has changed things.
(I notice that some previous posts get rated 4/5 and some 5/5 - I'd love to know what it is that's missing from posts that get rated 4/5 so I can make them more useful/enjoyable - drop me a line...)