Tracking Down Exchange 2007 Database Bloat
I recently dealt with an Exchange 2007 database that was physically larger than expected, so we took a few steps to find the cause of the bloat. This post outlines the work we did to isolate it.
We started by gathering more information about the database:
- Check the most recent 1221 event in the Application log to confirm that online maintenance has completed
- From the Exchange 2007 Management Shell, run Get-MailboxStatistics
- Run the PFDAVAdmin Item Content report
- Run ESEUTIL /MS against the database (ex: Eseutil /ms DBName.edb >C:\MSOutput.txt)
- Run ISINTEG –DUMP against the database
DETERMINE HOW MUCH BLOAT
Event 1221 showed us how much whitespace the DB reclaimed during online maintenance:
Event ID : 1221
Category : General
Source : MSExchangeIS Mailbox Store
Type : Information
Message : The database "MyStorageGroup\MBXDB" has 5178 megabytes of free space after online defragmentation has terminated.
We added up the Deleted Item Size and Item Size values from the Get-MailboxStatistics output. This gives a rough figure for how much "user data" the database holds.
We noted the physical size of the database (ex: 50GB). To estimate how much bloat may exist, add the event 1221 whitespace (ex: 5GB) to the user data (ex: 24GB) and subtract the sum from the physical size. In our example, 29GB is accounted for, which leaves 21GB unaccounted for.
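The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this post, and the inputs are the example figures from the text, not values read from a real server.

```python
def unaccounted_gb(physical_gb, whitespace_gb, user_data_gb):
    """Return the part of the physical file size that is not explained
    by event 1221 whitespace plus the user data total."""
    accounted = whitespace_gb + user_data_gb
    return physical_gb - accounted


# Example figures from the text: 50GB file, 5GB whitespace, 24GB user data.
bloat = unaccounted_gb(physical_gb=50, whitespace_gb=5, user_data_gb=24)
print(f"Unaccounted space: {bloat}GB")
```

Treat the result as a starting point only, since some of the gap is normal database overhead.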
NOTICE: Before you dig into the /MS output, you should read through the ESE Database Structure TechNet article. At a minimum, understand that Exchange 2007 databases are divided into 8KB pages, whereas Exchange 2000 and 2003 (ESE98) use 4KB pages.
Do not expect to have a DB that is physically the same size as your whitespace and user data. There are many reasons why the database may require additional space. These might include database structure such as indexes, tables, and search folders as well as fragmented pages and unclaimed whitespace (i.e. changes since expiry and online maintenance).
SAMPLE /MS OUTPUT
********************************* SPACE DUMP *********************************
Name                       Type  ObjidFDP  PgnoFDP  PriExt    Owned  Available
==============================================================================
Dbname.edb                 Db           1        1   256-m  3187862      64000
1-121                      Tbl        112      426     8-s        8          0
  ?B6708?T668f             Idx       1848      431     1-s        1          0
  MsgFolderIndex7          Idx        113      427     1-s        1          0
  MsgFolderIndexPtagDel    Idx        116      430     1-s        1          0
  MsgFolderIndexURLComp    Idx        115      429     1-s        1          0
  RuleMsgFolderIndex       Idx        114      428     1-s        1          0
1-24                       Tbl         61      142     2-m   695104          3
1-611BB71A86               Tbl        312      833     8-m     3014          5
  ?B6708?T668f+B67aa+S1    Idx       1850      476     1-s        1          0
  MsgFolderIndex7          Idx        313      834     1-s        1          0
  MsgFolderIndexPtagDel    Idx        316      837     1-s        1          0
  MsgFolderIndexURLComp    Idx        315      836     1-s        1          0
  RuleMsgFolderIndex       Idx        314      835     1-s        1          0
S-1-28B913B0D4F            Tbl       1862      705     8-s        8          3
  MsgFolderIndexURLComp    Idx       1863      706     1-s        1          0
  ptagFIDIndex             Idx       1865      708     1-s        1          0
  ptagSearchedFIDIndex     Idx       1864      707     1-s        1          0
- continued -
------------------------------------------------------------------------------
                                                                        647540
/MS OUTPUT FIELD INFORMATION
- ObjidFDP is the object ID of the FDP. The FDP is a special page in the database that indicates which B+tree a page belongs to
- PgnoFDP is the page number of the FDP
- PriExt is a number and a letter separated by a dash. The number before the dash is the initial number of pages allocated when the object was first created in the B+tree. The letter after the dash indicates whether the space for the B+tree is currently represented using multiple pages ("m") or a single page ("s")
- Owned is the number of pages that contain data and/or are in use
- Available is the number of free pages available
- Type may be Table (Tbl), Index (Idx), or Long-value (LV)
- LV may be required because a column or a record in ESE cannot span pages in the data B+tree. Values that exceed the 8KB boundary of a page are referred to as long-values (LV), and a table's long-value B+tree is used to store them
READING THE /MS OUTPUT
We decided to look at 4 things within the /MS output:
Calculate Actual Whitespace: The number at the end of the dump (647540 in the above example) is the total number of available pages across all the tables. Multiply that number by the page size (8KB for Exchange 2007). In our example, that comes to roughly 5,180MB of whitespace (647540 x 8KB, counting 1,000KB per MB), which lines up with the 5178 megabytes reported by event 1221.
Attachment Table: Table 1-24 holds all attachments in the database. In our example, the attachment table has 695104 owned pages. Multiplying that number by the page size (8KB) gives roughly 5.5GB of space used for attachments.
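The two page-count conversions can be checked with a short Python sketch. The helper name and the kb_per_mb parameter are my own. The default of 1,000KB per MB is an assumption chosen because it reproduces the 5,180MB figure, and passing 1024 gives the binary value instead.

```python
PAGE_SIZE_KB = 8  # Exchange 2007 page size; Exchange 2000/2003 use 4


def pages_to_mb(pages, page_size_kb=PAGE_SIZE_KB, kb_per_mb=1000):
    """Convert a page count from the /MS dump into megabytes."""
    return pages * page_size_kb / kb_per_mb


whitespace_mb = pages_to_mb(647540)   # total available pages at end of dump
attachment_mb = pages_to_mb(695104)   # owned pages of table 1-24

print(f"Whitespace:  {whitespace_mb:,.0f} MB")
print(f"Attachments: {attachment_mb / 1000:.2f} GB")
```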
Search Folders: Search folders are listed with an S- prefix. In the example above, S-1-28B913B0D4F is a search folder. Look for a large number of S- entries in the output, and follow DGoldman's blog post (linked below) to identify any users who have a large number of search folders.
Large Consumption Users: Look through the output for any object that has a large number of owned pages. In our example, 1-611BB71A86 has 3014 owned pages.
NOTE: All user mailbox folder tables are numbered, not named. In the example above, 1-611BB71A86 is a mailbox folder table. But also look at other tables, such as MailboxTombstone or Message Tombstone.
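For a large dump, scanning by eye gets tedious, so here is a minimal Python sketch of how the rows could be ranked. It assumes each row has the seven whitespace-separated columns shown in the sample (Name, Type, ObjidFDP, PgnoFDP, PriExt, Owned, Available). Real ESEUTIL output may differ in layout, and all function names here are hypothetical.

```python
import re

# One row of the space dump: name, type, two numeric IDs, PriExt, Owned, Available.
ROW = re.compile(
    r"^\s*(.+?)\s+(Db|Tbl|Idx|LV)\s+(\d+)\s+(\d+)\s+(\d+-[ms])\s+(\d+)\s+(\d+)\s*$"
)


def parse_space_dump(text):
    """Return one dict per row that matches the assumed seven-column layout."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, kind, _objid, _pgno, _priext, owned, available = match.groups()
            rows.append({"name": name, "type": kind,
                         "owned": int(owned), "available": int(available)})
    return rows


def largest_tables(rows, top=5):
    """Tables sorted by owned pages, largest first."""
    tables = [r for r in rows if r["type"] == "Tbl"]
    return sorted(tables, key=lambda r: r["owned"], reverse=True)[:top]


def search_folder_tables(rows):
    """Tables whose name starts with S-, i.e. search folders."""
    return [r for r in rows if r["type"] == "Tbl" and r["name"].startswith("S-")]


# A few rows taken from the sample output above.
SAMPLE = """\
Dbname.edb       Db    1    1  256-m  3187862  64000
1-121            Tbl   112  426  8-s  8       0
  MsgFolderIndex7 Idx  113  427  1-s  1       0
1-24             Tbl   61   142  2-m  695104  3
1-611BB71A86     Tbl   312  833  8-m  3014    5
S-1-28B913B0D4F  Tbl   1862 705  8-s  8       3
"""

rows = parse_space_dump(SAMPLE)
for table in largest_tables(rows, top=2):
    print(table["name"], table["owned"])
print("Search folder tables:", len(search_folder_tables(rows)))
```

Lines that do not match the pattern (headers, separators, the closing total) are skipped rather than reported, so check the row count against the file if a number looks off.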
If you find a numbered table that has a large number of owned pages, you can identify which mailbox that table belongs to by looking at the ISINTEG –DUMP output.
To do this, copy the numbered value after the dash (611BB71A86) and then search the ISINTEG output file from the bottom up for that value.
RootFID=0001-00611BB71A86
Owner DN=???
GUID=D12C30EC 4938E64D 89999899 906A78DA
Display Name=Mailbox - John Doe
Comment= Sentmail FID=0000-0000309F1E78
Subtree=0001-00611BB71A87
Inbox=0001-00611BB71A88
Outbox=0001-00611BB71A89
Sentmail=0001-00611BB71A8A
Finder=0001-00611BB71A8C
DAF=0001-00611BB71A8D
Spooler Q=0001-00611BB71A8E
Size=(ec:ecNotFound-MAPI_E_NOT_FOUND)
Localized=TRUE
Locale=0x409
In some cases, the search results yield something like this:
Folder FID=0001-00611BB71A86
Parent FID=0001-00611BB71A92
Root FID=0001-00611BB71A92
Folder Type=1
Msg Count=0
Msgs Unread=0
Msgs Submitted=0
Rcv Count=1
Subfolders=0
Name=Shortcuts
If your results do not show a mailbox name, then this folder may be a subfolder. You can then search the ISINTEG output for the Parent FID value (Ex: 00611BB71A92). You may have to do this several times until you locate the root mailbox name.
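The repeated Parent FID lookup is a simple walk up a tree, sketched below in Python. The sketch assumes you have already pulled the values out of the ISINTEG output into two dictionaries, since parsing the dump itself is not shown. The function name, the dictionary layout, and the mailbox name are all hypothetical.

```python
def find_mailbox(fid, parent_of, mailbox_of, max_hops=50):
    """Follow Parent FID links upward until a mailbox root is reached.

    parent_of:  folder FID -> Parent FID (from the Folder records)
    mailbox_of: RootFID -> Display Name (from the mailbox records)
    Returns the display name, or None if the chain cannot be resolved.
    """
    seen = set()
    current = fid
    for _ in range(max_hops):
        if current in mailbox_of:
            return mailbox_of[current]
        if current in seen or current not in parent_of:
            return None  # loop in the data, or no record for this FID
        seen.add(current)
        current = parent_of[current]
    return None


# Hypothetical data shaped like the subfolder case described above.
parent_of = {"00611BB71A86": "00611BB71A92"}
mailbox_of = {"00611BB71A92": "Mailbox - Example User"}

print(find_mailbox("00611BB71A86", parent_of, mailbox_of))
```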
BACK TO OUR ISSUE
In our case, we found a very large number of Search Folders in the /MS output. We decided to configure the RESET VIEWS registry key for that database and let online maintenance complete several more times until more whitespace became available. We then performed an offline defrag of the database. This freed up some of the DB bloat.
NOTE: If the database is continuing to grow in size, you may want to capture the data on a regular basis and see if there are any patterns for the growth (i.e. types of data or specific users). Then try to isolate why that bloat may be occurring.
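One way to look for growth patterns is to compare owned-page counts per table between two captures of the /MS output. The Python sketch below assumes each capture has already been reduced to a dictionary of table name to owned pages. The function name and the later snapshot's numbers are made up for illustration.

```python
def table_growth(before, after):
    """Return (table, added_pages) pairs for tables that grew, largest first.

    Tables that only appear in the later snapshot count as growing from zero.
    """
    deltas = {name: owned - before.get(name, 0) for name, owned in after.items()}
    grown = [(name, delta) for name, delta in deltas.items() if delta > 0]
    return sorted(grown, key=lambda pair: pair[1], reverse=True)


# First snapshot uses owned-page counts from the sample; the second is invented.
week1 = {"1-24": 695104, "1-611BB71A86": 3014, "S-1-28B913B0D4F": 8}
week2 = {"1-24": 695200, "1-611BB71A86": 9014, "S-1-28B913B0D4F": 8}

for name, added in table_growth(week1, week2):
    print(f"{name}: +{added} pages")
```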
- Determine the True amount of WhiteSpace in an Exchange Database - http://blogs.msdn.com/jeremyk/archive/2004/04/09/110553.aspx
- Microsoft Exchange and Search Folders - http://blogs.msdn.com/dgoldman/archive/2008/07/01/microsoft-exchange-and-search-folders.aspx
- Extensible Storage Engine Architecture - http://technet.microsoft.com/en-us/library/bb310772.aspx
- How to Run Eseutil /M - http://technet.microsoft.com/en-us/library/bb125171.aspx
- ISINTEG - http://technet.microsoft.com/en-us/library/bb125144.aspx
- KB 262196 XADM: How to Determine Which Mailbox Owns a Particular Page in a Database