Things to Check When Troubleshooting Message Queuing
Updated: June 25, 2007
Applies To: Windows Server 2008
Follow these recommendations when troubleshooting Message Queuing problems in the Microsoft Windows 7 and Windows Server 2008 R2 operating systems.
Check the Windows event logs for errors and warnings
Check the Windows event logs of the Message Queuing computer(s) for errors. Search for error or warning events with a Source that contains the text MSMQ, for example "MSMQ(MSMQ)" or "MSMQ Cluster Resource". Sort events in the Event Viewer by Source to quickly locate events that contain the text "MSMQ". Then search http://support.microsoft.com for the corresponding Event ID, text in the error or warning description, or any error codes that are returned as EventData in the details of the event.
For more information about using Event Viewer see the Windows online help.
Use End-to-End tracing to troubleshoot problems with message delivery
Consider enabling End-to-End tracing to troubleshoot problems with message delivery.
For more information see the Windows 7 or Windows Server 2008 R2 help topic "Troubleshooting by Using End-to-End Tracing".
Verify computer connectivity
Regardless of the connectivity symptom or problem, testing the computer with the ping utility is a good first step. A slow ping response can indicate a problem, as can ping succeeding only intermittently. Intermittent success suggests issues such as network overload or a failure in name resolution, which forces the computer to broadcast for name resolution.
Verify that ping resolves the full DNS name of the problem computer. If it does not, check your DNS server.
Windows Firewall blocks ICMP echo messages (ping) by default. If you are using ping to verify name resolution, you may first need to temporarily allow ICMP requests and responses through Windows Firewall.
Note that when messaging through a firewall, ping, which is based on Internet Control Message Protocol (ICMP), is a poor choice for two reasons:
It is not session based; therefore, it does not validate the ability to establish a TCP session.
ICMP is rarely allowed through firewalls, because its primary usage is to control networking devices.
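Because ping cannot confirm that a TCP session can be established, a direct connection attempt can complement it. The following Python sketch resolves a computer name and then tries to open a TCP session. It assumes that the target Message Queuing service listens on TCP port 1801, the port normally used for Message Queuing message traffic; the function names are illustrative and are not part of any Message Queuing tool.

```python
import socket


def resolve_name(host):
    """Return the sorted IP addresses reported for host, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    # Each entry ends with a socket address whose first element is the IP address.
    return sorted({info[4][0] for info in infos})


def can_open_tcp_session(host, port=1801, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds.

    Port 1801 is an assumption about the target; adjust it for your deployment.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling resolve_name and then can_open_tcp_session with the full DNS name of the problem computer shows whether name resolution or session establishment is the failing step. An empty list from resolve_name points to DNS; a False result from can_open_tcp_session with a valid address points to a firewall, routing, or service problem.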
In domain mode, use the MSMQ MQPing diagnostic tool. For more information about using the MQPing diagnostic tool see Test Connectivity Using MQPing.
Additional connectivity problems can be isolated through a Network Monitor (NetMon) trace. A NetMon trace can help determine the target of an attempt by Message Queuing to establish a session and which part of the connectivity process is failing. NetMon can also help find situations where the connection between the two computers is succeeding, but validation to a domain controller is failing.
For more information about using NetMon to monitor and capture network traffic, see the Windows online help.
Determine resource usage
For slowness and resource depletion issues, the MSMQ performance counters in System Monitor are extremely useful. To access the counters, click Start, click Run, and type perfmon. The counters can show the following problems:
Messages accumulating in journal queues or acknowledgment queues. You might have forgotten that the journaling feature is turned on.
Pending outgoing messages. This is extremely useful, because pending messages are not detectable by any other means.
Memory utilization or depletion. This can be a common issue when sending COM objects.
Outgoing messages can be inspected by:
Performance counters, as described above.
Message Queuing MMC snap-in in Computer Management.
Local admin APIs, including:
List local private queues
Enumerate internal outgoing queues
List state of outgoing queue (connected, next hop, message count)
Pause/resume outgoing queue
Take entire queue manager offline/online
Read/delete messages in outgoing queue
Understand message size limits
Messages are stored in .mq files. The .mq file does not represent one particular queue. Messages from multiple queues can be stored in one .mq file, and messages from a single queue can be stored in multiple .mq files. However, a single message cannot span multiple files. This is the reason for the 4 MB limit in message size.
Any attempt to send a message larger than this through the system will raise an insufficient resources error. Be aware that Unicode data takes up twice as much space as non-Unicode data, because two bytes are needed for each character.
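The effect of Unicode on message size can be checked before sending. The following Python sketch compares the size of a text body stored as two-byte Unicode (UTF-16) and as single-byte text against the 4 MB limit. The function names and the overhead_bytes parameter are hypothetical; the parameter is only a placeholder for any space that message properties may need, which this sketch does not calculate.

```python
MAX_MESSAGE_BYTES = 4 * 1024 * 1024  # the 4 MB message size limit


def body_size_bytes(text, unicode_body=True):
    """Return the bytes needed to store text as a Unicode or single-byte body."""
    # UTF-16 uses two bytes for each common character; ASCII uses one.
    encoding = "utf-16-le" if unicode_body else "ascii"
    return len(text.encode(encoding))


def fits_in_one_message(text, unicode_body=True, overhead_bytes=0):
    """Return True if the body plus the assumed overhead stays within the limit."""
    return body_size_bytes(text, unicode_body) + overhead_bytes <= MAX_MESSAGE_BYTES
```

A 3 MB string of single-byte characters fits within the limit, but the same string sent as Unicode needs 6 MB and has to be split or compressed by the application.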
Understand threading limitations
A common technique used by programmers to have their application notified of events that happen locally or remotely is to use asynchronous callbacks. This technique works well for Message Queuing application developers, because they can subscribe to an event, go on with other work, and receive a notification that an event has transpired (message arrived) asynchronously.
However, there is a limitation in calling MQReceiveMessage() with callbacks: only 63 callbacks can be pending in any one process. This is due to how Message Queuing implements callbacks. Only one thread in an application process calls the WaitForMultipleObjects API, which can wait on at most 64 objects. This lone thread is responsible for waking up when any one of the 63 events is fired. Only one event is used internally by Message Queuing at any one time. This also means that callbacks in a process are serialized. If an application makes a 64th call to MQReceiveMessage() with a callback while the other 63 callbacks are still waiting to be signaled, the 64th call fails with an MQ_ERROR_INSUFFICIENT_RESOURCES error.
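One way for an application to stay under the 63-callback limit is to track pending callback receives itself and back off before issuing the 64th. The following Python sketch shows only that bookkeeping. It does not call Message Queuing, and the class and method names are hypothetical.

```python
import threading

MAX_PENDING_CALLBACKS = 63  # per-process limit for pending callback receives


class CallbackSlots:
    """Counts pending callback receives so that the 64th is refused locally."""

    def __init__(self, limit=MAX_PENDING_CALLBACKS):
        self._limit = limit
        self._pending = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Reserve a slot before issuing a receive; return False if none is free."""
        with self._lock:
            if self._pending >= self._limit:
                return False
            self._pending += 1
            return True

    def release(self):
        """Free a slot; intended to be called once the callback has fired."""
        with self._lock:
            if self._pending > 0:
                self._pending -= 1

    @property
    def pending(self):
        """Return the number of slots currently reserved."""
        with self._lock:
            return self._pending
```

An application would call try_acquire before each callback-based receive and release inside the callback. When try_acquire returns False, the application can wait for a callback to complete instead of receiving an insufficient resources error.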
Another common threading-based scenario is to get an MQ_ERROR_INSUFFICIENT_RESOURCES error when calling MQReceiveMessage() to read from a remote queue. When your application reads from a remote queue, a thread is created by the local Message Queuing service, and this thread waits for completion of the remote read on the remote computer. The default threshold of threads created to handle these requests is based mainly on the version of the operating system you are running. The limit for Windows 2000 Server is 96. There is no limit for Windows XP Professional, Windows Server 2003, Windows Vista, Windows Server 2008, Windows 7, or Windows Server 2008 R2.
To change these limits, create the DWORD registry entries MaxRRThreads and MinRRThreads under the HKEY_LOCAL_MACHINE\Software\Microsoft\MSMQ\Parameters registry key, and set them to the corresponding maximum and minimum values. After editing the registry, restart the Message Queuing service for the changes to take effect.
Incorrectly editing the registry may severely damage your system. It is recommended that you back up any valuable data on the computer before making changes to the registry.
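As an alternative to editing the entries by hand, the values can be prepared in a .reg file and reviewed before they are imported. The following Python sketch renders such a file for MinRRThreads and MaxRRThreads. The function name is hypothetical, the range check is an assumption of this sketch rather than a documented requirement, and the values 16 and 128 in the usage note are examples only, not recommendations.

```python
KEY_PATH = r"HKEY_LOCAL_MACHINE\Software\Microsoft\MSMQ\Parameters"


def render_reg_file(min_threads, max_threads):
    """Return .reg file text that sets MinRRThreads and MaxRRThreads as DWORDs."""
    # Sanity check assumed by this sketch: positive values, min not above max.
    if not 0 < min_threads <= max_threads <= 0xFFFFFFFF:
        raise ValueError("expected 0 < min_threads <= max_threads <= 0xFFFFFFFF")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        # DWORD values in .reg files are written as eight hexadecimal digits.
        f'"MinRRThreads"=dword:{min_threads:08x}',
        f'"MaxRRThreads"=dword:{max_threads:08x}',
        "",
    ]
    return "\r\n".join(lines)
```

For example, render_reg_file(16, 128) produces a file that sets MinRRThreads to dword:00000010 and MaxRRThreads to dword:00000080. Save the text to a file with the .reg extension, review it, import it, and then restart the Message Queuing service.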
Avoid message capacity thresholds
The best way to avoid reaching message capacity thresholds is to implement quotas in your Message Queuing deployment strategy. This is a two-step process:
Set computer or queue quotas. Note the difference between computer quotas and queue quotas. When a computer quota is reached, the destination computer does not accept any further incoming messages, and messages begin to accumulate in the outgoing queue of the sending computer or on an intermediate routing server. To troubleshoot this issue, acquire a network monitor capture of the Message Queuing traffic, and look at the MSMQ session establishment packets or MSMQ session acknowledgment packets. If the window size is 1, the computer quota has been reached. When a queue quota is reached, the destination computer discards the message. Therefore, always request the quota negative acknowledgment when using queue quotas on the destination computer. This negative acknowledgment is sent from the destination computer only when the quota has been reached. For more information, see Microsoft Knowledge Base article 899612, "How to set up computer quotas and queue quotas in Message Queuing", available at http://support.microsoft.com/kb/899612.
Request an acknowledgment. Quotas keep your applications from flooding the Message Queuing service, but they do not help your applications respond flexibly when these quotas are reached. To do this, you can request a negative acknowledgment (NACK) from the computer to which you are sending messages. If this acknowledgment is returned to your application and indicates that the quota for this queue or computer has been reached, your application can either stop sending messages or offload the messages to another destination. This is an excellent way to scale out Message Queuing. For more information about these acknowledgments, see the Message Queuing documentation on acknowledgment messages in MSDN.
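The choice between stopping and offloading can be kept in a small piece of application logic. The following Python sketch marks a destination as full when a quota NACK is reported for it and selects the next available destination. It does not call Message Queuing: the Ack values are simplified, hypothetical stand-ins for the acknowledgment classes that an application reads from its administration queue, and the class and destination names are illustrative.

```python
from enum import Enum


class Ack(Enum):
    """Simplified acknowledgment kinds; hypothetical names, not MSMQ constants."""
    REACHED_QUEUE = "reached_queue"
    QUOTA_EXCEEDED = "quota_exceeded"
    OTHER_NACK = "other_nack"


class QuotaAwareSender:
    """Selects a destination and fails over when a quota NACK is reported."""

    def __init__(self, destinations):
        if not destinations:
            raise ValueError("at least one destination is required")
        self._destinations = list(destinations)
        self._full = set()

    def current_destination(self):
        """Return the first destination not marked full, or None to stop sending."""
        for destination in self._destinations:
            if destination not in self._full:
                return destination
        return None

    def on_acknowledgment(self, destination, ack):
        """Update the full/available state from a returned acknowledgment."""
        if ack is Ack.QUOTA_EXCEEDED:
            self._full.add(destination)
        elif ack is Ack.REACHED_QUEUE:
            # A positive acknowledgment shows the destination accepts messages again.
            self._full.discard(destination)
```

With destinations "primary" and "backup", a quota NACK for "primary" makes current_destination return "backup". When every destination is marked full, it returns None, which signals the application to stop sending until a positive acknowledgment arrives.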