Is it real or Matrix? Some facts about network traces...
In this blog post, I would like to share some facts about network traces. A while ago we were discussing how much we can rely on network trace data when it comes to questions like these: “The outgoing packet that we see in a network trace, did it really hit the wire?” and “If we don’t see a packet that the system was supposed to receive in a network trace collected on that same system, does it really mean that the machine didn’t physically receive the packet?”
Before going to the answers, it is worth clarifying the network trace capturing architecture so that you can work out the answers for yourself:
NETWORK STACK VIEW IN GENERAL:
Upper layer stacks (Winsock etc)
NDIS Protocol drivers (Example: TCPIP)
NDIS Intermediate (NDIS 5.x) or NDIS Lightweight filter drivers (NDIS 6.x)
(Examples: NLB / QoS / Teaming / 802.1q tagging / packet capturing agent / NDIS level firewalls etc)
NDIS Miniport drivers (Example: NIC drivers)
Note: The Microsoft Network Monitor capturing agent runs as an NDIS lightweight filter driver from Windows Vista onwards.
Note: The Wireshark capturing agent (NPF/WinPcap) runs as an NDIS protocol driver. You can find more information at the link below:
Well, after taking a look at the above architecture, I think we can easily answer the questions I asked at the beginning of the post (and also some additional questions). Maybe we need to add one more thing to fully explain this: the order of the NDIS intermediate or NDIS lightweight filter drivers. This is not an easy one to answer, but we can say from experience that NLB, teaming, NDIS-level firewall drivers etc. generally run below the packet capturing agent in the Network Monitor scenario. For Wireshark, it looks like all of those run below the capturing engine, since WinPcap runs as an NDIS protocol driver.
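To make the layering above a bit more concrete, here is a minimal Python sketch of a packet travelling through the stack. It is only a toy model: the layer names ("tcpip", "capture", "firewall", "upper_filter", "miniport") and the functions trace_outgoing and trace_incoming are made up for illustration and do not correspond to any real driver or API. It simply shows that what the capturing agent sees depends on where it sits relative to the driver that drops the packet.

```python
# Toy model of the driver stack, ordered from top (index 0) to bottom.
# Layer names are illustrative labels, not real driver names.
STACK_LOWER_FIREWALL = ["tcpip", "capture", "firewall", "miniport"]
STACK_UPPER_FILTER = ["tcpip", "upper_filter", "capture", "miniport"]


def trace_outgoing(stack, dropped_by=None):
    """Walk a send from the top of the stack down to the wire.

    Returns (seen_by_capture, hit_wire).
    """
    seen = False
    for layer in stack:
        if layer == dropped_by:
            return seen, False      # packet stops here, never reaches the wire
        if layer == "capture":
            seen = True             # agent copies the packet and passes it on
    return seen, True


def trace_incoming(stack, dropped_by=None):
    """Walk a receive from the wire up to the top of the stack.

    Returns (seen_by_capture, delivered_to_top).
    """
    seen = False
    for layer in reversed(stack):
        if layer == dropped_by:
            return seen, False      # packet stops here, upper layers never see it
        if layer == "capture":
            seen = True
    return seen, True


# Sent packet is in the trace, yet it never hit the wire.
print(trace_outgoing(STACK_LOWER_FIREWALL, dropped_by="firewall"))
# Received packet physically arrived, yet it is missing from the trace.
print(trace_incoming(STACK_LOWER_FIREWALL, dropped_by="firewall"))
# Received packet is in the trace, yet TCPIP never saw it.
print(trace_incoming(STACK_UPPER_FILTER, dropped_by="upper_filter"))
```

The three calls at the end correspond to the three situations discussed in the first three questions of this post.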
QUESTIONS & ANSWERS
Q1. The outgoing packet that we see in a network trace, did it really hit the wire?
A1. No, we cannot say that for sure. After the capturing engine got its copy of the packet, the packet could still have been dropped by a lower layer driver: an NDIS intermediate or lightweight filter driver (teaming, NDIS level firewall etc.) or the NIC driver itself.
Q2. Not seeing a packet in a network trace that was supposed to be received by that system, does it really mean that the machine didn’t physically receive the packet?
A2. No, we cannot say for sure. The packet might have been dropped by a lower level NDIS miniport driver or NDIS intermediate/lightweight filter driver before it reached the capturing engine on the same box.
(Examples: You cannot see packets on an NLB node if that node isn’t supposed to handle that packet. You may also not see the packet because a lower level NDIS driver dropped it.)
Q3. Seeing a packet received in a network trace, does it really mean that the upper layers (like TCPIP/Winsock/Application/etc) on this machine have *seen* that packet?
A3. No, we cannot say for sure. The packet might have been dropped by an upper layer NDIS intermediate/lightweight filter driver, an upper layer firewall or similar security software (intrusion detection/URL filtering/etc).
(Examples: You see incoming TCP SYNs in a network trace but you don’t see the TCP SYN ACKs being sent in response. This generally stems from a firewall or similar security driver running somewhere above the capturing engine. Since the packet is dropped before the TCPIP driver has a chance to process it, no response is sent back to the connecting party. Another example is a filter even higher up that prevents the request from being seen by the application, like Winsock level filters.)
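The SYN without SYN ACK pattern is easy to look for once a trace has been exported into some structured form. The Python sketch below assumes a hypothetical record format (a dict with src, sport, dst, dport and a set of TCP flags per packet); real capture tools export different formats, so treat the field names and the function unanswered_syns as placeholders.

```python
def unanswered_syns(packets):
    """Return connection attempts (SYN) that never got a SYN ACK in the trace.

    Each packet is a dict with hypothetical keys:
    src, sport, dst, dport (endpoints) and flags (a set such as {"SYN"}).
    """
    pending = {}
    for pkt in packets:
        flags = pkt["flags"]
        if flags == {"SYN"}:
            key = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
            pending.setdefault(key, pkt)   # ignore SYN retransmissions
        elif flags == {"SYN", "ACK"}:
            # A SYN ACK travels in the reverse direction of the SYN it answers.
            key = (pkt["dst"], pkt["dport"], pkt["src"], pkt["sport"])
            pending.pop(key, None)
    return list(pending)


# Made-up trace: the first client gets an answer, the second one does not.
trace = [
    {"src": "10.0.0.2", "sport": 50000, "dst": "10.0.0.1", "dport": 80, "flags": {"SYN"}},
    {"src": "10.0.0.1", "sport": 80, "dst": "10.0.0.2", "dport": 50000, "flags": {"SYN", "ACK"}},
    {"src": "10.0.0.3", "sport": 50001, "dst": "10.0.0.1", "dport": 80, "flags": {"SYN"}},
    {"src": "10.0.0.3", "sport": 50001, "dst": "10.0.0.1", "dport": 80, "flags": {"SYN"}},
]
print(unanswered_syns(trace))
```

A non-empty result on a trace collected on the server only tells you that something above the capturing engine may be swallowing the SYNs; it does not tell you which driver is responsible.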
Q4. I see that there are TCP segments or IP datagrams with incorrect checksums. Does that really mean the packet is problematic and causing the performance/connectivity issue that I see?
A4. The answer is: it depends. If you see those checksum errors for packets sent by A to B, the trace is collected on A, and TCP/IP checksum offloading is enabled (it is by default for many NICs), then it doesn’t indicate a problem. Maybe we should also explain why we see it that way:
TCP/IP header checksum offloading is a performance feature supported by many NICs. With this feature, the TCPIP driver passes the packet down to the NIC driver without calculating the checksum for the IP or TCP headers. Since the capturing engine *sees* the packet before the NIC driver does, it concludes that the checksum related fields are incorrect (which is normal, because those checksums will be calculated by the NIC driver before the packet hits the wire). In other situations, for example checksum errors on received packets or on a sender with offloading disabled, this would indicate a problem.
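The effect described above can be reproduced with a few lines of Python. This is a sketch under assumptions: the addresses and header field values are invented, and the checksum field is assumed to be left at zero by the TCPIP driver (a real driver may leave a different placeholder value). The checksum routine itself is the standard Internet checksum (RFC 1071).

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: bytes, dst: bytes, payload_len: int, checksum: int = 0) -> bytes:
    """Build a 20-byte IPv4 header carrying TCP; field values are illustrative."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + payload_len,  # version/IHL, DSCP, total length
        0x1234, 0x4000,             # identification, flags (DF) + fragment offset
        128, 6, checksum,           # TTL, protocol (TCP), header checksum
        src, dst,
    )


def header_checksum_ok(header: bytes) -> bool:
    """A valid header sums to zero when the checksum field is included."""
    return internet_checksum(header) == 0


src, dst = bytes([192, 168, 1, 10]), bytes([192, 168, 1, 20])

# What the capturing engine sees on the sender: checksum not filled in yet.
as_captured = build_ipv4_header(src, dst, payload_len=100)

# What the NIC puts on the wire after computing the checksum itself.
on_wire = build_ipv4_header(src, dst, 100, checksum=internet_checksum(as_captured))

print(header_checksum_ok(as_captured), header_checksum_ok(on_wire))
```

The same packet fails verification in the sender-side trace and passes on the wire, which is why a trace taken on the receiving side (or with offloading temporarily disabled) is the better place to judge checksum errors.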
Q5. I sometimes see packets whose payloads are greater than standard Ethernet MTU in a network trace. How could that be possible?
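A5. Some NIC drivers support LSO (Large Send Offload), which is another optimization. Large send offloading allows the TCPIP driver to hand a single large block of TCP data to the NIC driver and offload the TCP segmentation to it, which accelerates the process. Since the capturing engine sees the packet before the NIC driver splits it up, the trace shows one packet that is larger than the MTU, while the packets on the wire are of normal size. Also please note that in some instances, generally due to NIC driver issues, LSO might cause performance problems. You can find more information about this feature at the following link: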
http://www.microsoft.com/whdc/device/network/taskoffload.mspx Windows Network Task Offload
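As a rough illustration of what the NIC does with a large send, the Python sketch below splits one oversized TCP payload into MSS sized pieces. The function segment_large_send is made up for this post, and the MSS of 1460 bytes is an assumption (1500 byte Ethernet MTU minus 20 bytes of IP header and 20 bytes of TCP header, with no options). Real hardware also rewrites the headers and checksums for each segment, which is left out here.

```python
def segment_large_send(payload: bytes, mss: int = 1460, start_seq: int = 0):
    """Split one oversized TCP payload into MSS-sized segments.

    Returns a list of (sequence_number, chunk) tuples, roughly what an
    LSO capable NIC would put on the wire for a single large send.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [
        ((start_seq + offset) % 2**32, payload[offset:offset + mss])
        for offset in range(0, len(payload), mss)
    ]


# The capturing engine would show this as ONE 4000-byte TCP payload...
big_send = bytes(4000)

# ...while the wire carries three normal-sized segments.
segments = segment_large_send(big_send, mss=1460, start_seq=1000)
sizes = [len(chunk) for _, chunk in segments]
seqs = [seq for seq, _ in segments]
print(sizes, seqs)
```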
Hope this helps.