Processing Large Interchanges
Recently I’ve had quite a few customers ask about processing large interchanges and the restrictions around them. First, let's make sure we're on the same page about what an interchange is: a single message that contains many individual messages, which are disassembled in the receive pipeline. An example would be a flat file that is disassembled (using the FlatFile disassembler) to produce many smaller messages, each of which is published independently.
As you’ll know, BizTalk Server 2004 has a strong large message story: the engine jumps through hoops to keep a flat memory model at run-time. The implications of this for large interchanges are quite interesting...
Interchanges are processed using one atomic transaction, meaning that either all of the messages disassembled from the interchange are published to the message box or none of them are. But clearly, if the disassembled messages were committed to the message box only after the interchange had been entirely disassembled, the engine would run out of memory on large interchanges. To keep the memory model flat, the engine breaks the interchange into many sub-batches under the scope of a single transaction, which is committed only once the whole interchange has been disassembled successfully. Because the interchange is sub-divided, the memory for each sub-batch can be released as soon as that sub-batch has been processed. This all sounds well and good; however, each one of these messages holds a SQL lock until the transaction commits, and SQL does not have a limitless number of locks!
For large interchanges, where large is, say, 100,000 messages, it’s pretty easy to reach this lock limit when several of them are processed concurrently. Because the engine is multi-threaded, many interchanges may be processed at once. While out of the box the engine restricts the number of active threads per CPU, a context switch will cause another thread to be released, potentially kicking off the processing of another large interchange, and hence more locks will be taken.
If you’ve not seen it, the performance white paper has a simple equation that can be used to estimate the maximum size of an interchange:
Maximum number of messages per interchange <= 200,000 / (Number of CPUs * BatchSize * MessagingThreadPoolSize)
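To make the arithmetic concrete, here is a minimal sketch of that estimate in Python. The 200,000 figure comes straight from the formula above; the function name and the values plugged in (2 CPUs, a batch size of 20, a messaging thread pool size of 10) are purely illustrative assumptions, not BizTalk defaults, so substitute the settings from your own host.

```python
def max_messages_per_interchange(num_cpus, batch_size, thread_pool_size,
                                 lock_budget=200_000):
    """Estimate the largest interchange (in messages) using the formula above.

    lock_budget is the 200,000 constant from the white paper formula;
    every other argument must be taken from your own environment.
    """
    if min(num_cpus, batch_size, thread_pool_size) < 1:
        raise ValueError("all settings must be positive integers")
    # Floor division, because the formula is an upper bound (<=).
    return lock_budget // (num_cpus * batch_size * thread_pool_size)


# Hypothetical settings, for illustration only.
limit = max_messages_per_interchange(num_cpus=2, batch_size=20,
                                     thread_pool_size=10)
print(limit)  # 200,000 / (2 * 20 * 10) = 500
```

Note how quickly the limit shrinks: doubling the CPU count or either setting halves the number of messages an interchange can safely contain.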
So, with this in mind, there are a few options that you can employ to avoid reaching the out-of-locks scenario for large interchanges:
- Pre-split the interchange before it hits BizTalk; the target size can be determined using the formula above
- Use 64-bit SQL, where I believe more locks are available
- Write a custom adapter that pre-splits the interchange
- Write an orchestration to split the interchange
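As a rough illustration of the pre-splitting idea in the options above, the sketch below cuts a sequence of records into chunks no larger than a chosen limit before anything is handed to BizTalk. It assumes the simplest possible flat file, with one record per line and no header or trailer; the helper name split_records and the file names in the usage comment are hypothetical, and a real flat file schema may need extra handling.

```python
from itertools import islice


def split_records(records, max_per_chunk):
    """Yield lists of at most max_per_chunk records, preserving order."""
    if max_per_chunk < 1:
        raise ValueError("max_per_chunk must be at least 1")
    iterator = iter(records)
    while True:
        # Pull the next chunk lazily so the whole input is never in memory.
        chunk = list(islice(iterator, max_per_chunk))
        if not chunk:
            return
        yield chunk


# Usage sketch (file names are assumptions): write each chunk to its own
# file, which can then be dropped into the receive location.
#
# with open("big_interchange.txt") as source:
#     for index, chunk in enumerate(split_records(source, 500)):
#         with open(f"interchange_part_{index:05d}.txt", "w") as target:
#             target.writelines(chunk)
```

Because the input is read lazily, the splitter's own memory use stays flat no matter how large the original interchange is. Bear in mind that each resulting file becomes its own interchange with its own transaction, so the all-or-nothing guarantee no longer spans the original file.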
The last point to consider is that smaller interchanges allow for more efficient processing, so raw throughput will be greatly increased by using smaller interchange sizes. Again, there is some good data around this in the performance white paper.