A lossy failover causes duplicate mails to be delivered to clients from the hub transport dumpster when using Exchange 2007 SP1 Cluster Continuous Replication (CCR).

In recent weeks I have worked with several customers who experienced duplicate messages in the Outlook client after a CCR cluster went through a lossy failover.

A lossy failover occurs when the Exchange resources move between nodes and logs exist on the source that could not be copied to the target.  Depending on settings, the actual loss is incurred at one of three points: when the failover occurs and the databases are automatically mounted based on the availability setting, when the administrator forces the databases online, or when the forceDatabaseMountAfter time period expires.

The following events can be noted during a lossy failover:

Log Name:      Application
Source:        MSExchangeIS
Date:          2/23/2009 9:24:17 AM
Event ID:      9796
Task Category: General
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      2008-Node6.exchange.msft
Database "2008-MBX5-SG2\2008-MBX5-SG2-DB1" has been subject to a lossy failover. The database may be patched if the Information Store detects it is necessary.

Log Name:      Application
Source:        MSExchangeRepl
Date:          2/23/2009 9:24:24 AM
Event ID:      2099
Task Category: Service
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      2008-Node6.exchange.msft
The Microsoft Exchange Replication Service requested that Hub Transport server 2008-DC1 resubmit messages between time periods 2/22/2009 8:15:10 PM (UTC) and 2/23/2009 6:24:10 PM (UTC).

The hub transport dumpster is designed to attempt to re-deliver messages that may have been lost during a lossy failover.  A dumpster is maintained on the hub transport server for each CCR or LCR enabled storage group.  The dumpster has an administrator-configured maximum size and an administrator-configured maximum retention time (see references below).  In terms of how it operates, the dumpster is a FIFO queue of messages.  Let’s look at an example:

My dumpster size is 5 MB.  My dumpster retention time is 7 days.  I have a single storage group with a single mailbox.  I send a 2 MB message.  This message is committed to the hub transport dumpster queue.  I then send another 2 MB message, which is also committed to the queue, bringing it to 4 MB.  Finally I send a third 2 MB message.  Because it would push the queue past 5 MB, the oldest message is popped off so that the new message can be committed.  In this way messages recycle through the hub transport dumpster, with the first messages in becoming the first out in order to accommodate new messages.
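The size-based example can be sketched as a small simulation.  This is only a toy model of the behavior described above, not Exchange code; the class name SizeBoundDumpster and its methods are made up for illustration.

```python
from collections import deque


class SizeBoundDumpster:
    """Toy model of a size-limited FIFO dumpster (hypothetical, not Exchange code)."""

    def __init__(self, max_size_mb):
        self.max_size_mb = max_size_mb
        self.queue = deque()  # (message_id, size_mb) pairs, oldest on the left

    def used_mb(self):
        return sum(size for _, size in self.queue)

    def commit(self, message_id, size_mb):
        # Pop the oldest messages until the new message fits under the limit.
        # (A message larger than the limit is still appended in this sketch.)
        while self.queue and self.used_mb() + size_mb > self.max_size_mb:
            self.queue.popleft()
        self.queue.append((message_id, size_mb))


dumpster = SizeBoundDumpster(max_size_mb=5)
for message_id in ("msg1", "msg2", "msg3"):
    dumpster.commit(message_id, 2)

remaining = [message_id for message_id, _ in dumpster.queue]
print(remaining)  # ['msg2', 'msg3']
```

The first message is dropped only when the third one arrives, because 2 + 2 + 2 MB exceeds the 5 MB limit.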

Another example is where messages are removed because their retention time expires.  My dumpster size is 5 MB.  My dumpster retention time is 2 days.  I have a single storage group with a single mailbox.  I send a 2 MB message on Thursday at noon.  I then send another 2 MB message on Friday at noon.  No other messages are sent to this mailbox.  At this point both messages are held in the queue, since the dumpster has not reached its size limit.  On Saturday at noon the message I sent on Thursday is automatically removed from the queue, since it has reached the maximum retention time.
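The time-based example can be sketched the same way.  Again this is a toy model under stated assumptions: the function name expire is made up, the calendar dates are arbitrary (chosen only so that the first one falls on a Thursday), and a message is treated as expired once its age reaches the retention time.

```python
from datetime import datetime, timedelta


def expire(queue, now, retention):
    """Return the messages still inside the retention window (toy model)."""
    return [(mid, sent) for mid, sent in queue if now - sent < retention]


retention = timedelta(days=2)
thursday_noon = datetime(2009, 2, 19, 12, 0)  # arbitrary example date (a Thursday)
friday_noon = thursday_noon + timedelta(days=1)
queue = [("thursday-msg", thursday_noon), ("friday-msg", friday_noon)]

friday_evening = friday_noon + timedelta(hours=6)
saturday_noon = thursday_noon + timedelta(days=2)

held_friday = [mid for mid, _ in expire(queue, friday_evening, retention)]
held_saturday = [mid for mid, _ in expire(queue, saturday_noon, retention)]

print(held_friday)    # ['thursday-msg', 'friday-msg']
print(held_saturday)  # ['friday-msg']
```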

When a lossy failover occurs, the hub transport servers, when requested, flush the entire contents of their dumpster queues back into transport.  This causes the hub transport server to evaluate the messages as if they had just been received and to begin the delivery process.  In order to prevent duplicates, each Mailbox server maintains a list of message IDs per storage group.  Each message that is received is checked against this table.  If the message ID is found, the message is turfed (discarded by the store, as opposed to being discarded on the transport server) so that no duplicate occurs.  If the message ID is not found, the message is re-delivered.  Please note again that this table is kept per storage group.
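A minimal sketch of this duplicate check follows.  It is a toy model, not the store's actual implementation: the class MailboxServer, the storage group name SG1, and the message IDs are hypothetical, and the sketch assumes the tracking table reflects exactly the messages the database still holds after the failover.

```python
class MailboxServer:
    """Toy model of per-storage-group duplicate detection (not Exchange code)."""

    def __init__(self):
        self.tracking = {}   # storage group name -> set of message IDs seen
        self.delivered = []  # (storage group, message ID) actually delivered

    def receive(self, storage_group, message_id):
        seen = self.tracking.setdefault(storage_group, set())
        if message_id in seen:
            return False     # already delivered here: discarded by the store
        seen.add(message_id)
        self.delivered.append((storage_group, message_id))
        return True


server = MailboxServer()
# Assumption: after the failover the store still holds msg1 and msg2; msg3 was lost.
for message_id in ("msg1", "msg2"):
    server.receive("SG1", message_id)

# The hub transport server resubmits everything held in the dumpster.
resubmitted = ["msg1", "msg2", "msg3"]
outcome = {m: server.receive("SG1", m) for m in resubmitted}
print(outcome)  # {'msg1': False, 'msg2': False, 'msg3': True}
```

Only the message that was actually lost is delivered again; the other two are matched in the table and discarded.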

Where we run into potential issues is when a move mailbox operation occurs.  When a user is moved between stores or between servers, the duplicate tracking table at the destination is not updated with entries for that user's existing messages.  Because of this, when the messages arrive from the hub transport server, each is evaluated as a new message, resulting in duplicate message delivery.  Let's take a look at an example:

I have a user that is on ServerA\StorageGroupA\MailboxDatabaseA.  ServerA is an Exchange 2007 SP1 CCR cluster.  Due to a power outage on the primary node, the Exchange resources fail over from ServerANodeA to ServerANodeB.  In the process 3 logs are lost.  Availability is set to “best availability”, so the databases mount automatically.  The cluster informs the hub transport servers of a lossy failover and the dumpsters are flushed.  The user does not receive any duplicate messages.  This is by design: the duplicate tracking table exists for this storage group, is populated with entries for this mailbox, and successfully turfs any potential duplicates for this user.

Let's take a look at another example:

I have a user that is on ServerA\StorageGroupA\MailboxDatabaseA.  ServerA is an Exchange 2007 SP1 CCR cluster.  On Monday I move this user from ServerA to ServerB.  The user is now located in ServerB\StorageGroupB\MailboxDatabaseB.  On Wednesday, due to a power outage, ServerA experiences a lossy failover between nodes.  The hub transport dumpsters are flushed.  When this happens, messages destined for this user are re-evaluated and re-routed to ServerB (where the user now exists).  The duplicate tracking table on ServerB is consulted, but there are no matches for the messages originating from the dumpster (remember that a move mailbox does not update the duplicate tracking table).  Since there are no matches, all messages for this user that were submitted from the dumpster are re-delivered as new.  These appear to the user as duplicate messages.
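The move mailbox scenario can be sketched as follows.  This is a toy model only: the names tracking, location, and deliver are hypothetical, the mailbox list stands in for the messages that travel with the user during the move, and the message IDs are invented.

```python
from collections import defaultdict

tracking = defaultdict(set)  # storage group -> message IDs already delivered there
mailbox = []                 # what the user sees; moves along with the user
location = {"user": "ServerA\\StorageGroupA"}


def deliver(message_id):
    """Deliver to the user's current storage group unless its table has the ID."""
    storage_group = location["user"]
    if message_id in tracking[storage_group]:
        return "discarded"
    tracking[storage_group].add(message_id)
    mailbox.append(message_id)
    return "delivered"


dumpster = ["m1", "m2", "m3"]  # copies held on the hub transport server
for message_id in dumpster:
    deliver(message_id)        # original delivery to ServerA

# Monday: the move carries the mailbox contents, but not the tracking entries.
location["user"] = "ServerB\\StorageGroupB"

# Wednesday: ServerA has a lossy failover and the dumpster is resubmitted.
results = [deliver(message_id) for message_id in dumpster]
print(results)  # ['delivered', 'delivered', 'delivered']
print(mailbox)  # ['m1', 'm2', 'm3', 'm1', 'm2', 'm3']
```

Because the table for StorageGroupB starts out empty for this user, every resubmitted message looks new and lands in the mailbox a second time.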

As of today this behavior is by design.  From an administrator's standpoint, nothing can be done to mitigate this issue other than ensuring that the maximum size per storage group and the maximum retention time for the hub transport dumpster are configured appropriately for your organization.

For more information see the following: