Windows server upload speed problems

Aurimas Kurauskas 1 Reputation point
2022-01-26T07:55:53.09+00:00

Hello, we have a problem that I can't find any solution to.

We have a server running Windows Server 2019 (latest updates), with its 10G network card connected directly to a 10G router and then to the ISP router.

We get 9 Gb/s download and ~1.5 Gb/s upload.

[screenshot: 168608-image.png]
When downloading over the internet from that server, we get only 10 MB/s.

[screenshot: 168609-image.png]

When we change the network card speed to 1G, downloads from the same server to the same computer go up to ~50 MB/s.

[screenshot: 168610-image.png]

I tried changing the network adapter settings, but nothing helped.

This problem only occurs on Windows Server; on Linux everything works normally and we get even higher speeds.
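
A minimal way to reproduce the single-stream transfer without a file copy or web server in the path is a plain TCP send/receive pair; the port, chunk size and total transfer size below are just placeholders:

    import socket, sys, time

    # Minimal single-stream TCP throughput check, to take the file copy or
    # browser out of the picture. Port, chunk and total size are placeholders.
    PORT, CHUNK, TOTAL = 5001, 64 * 1024, 2 * 1024**3   # 2 GiB

    def server():                      # run on the Windows Server machine
        with socket.create_server(("", PORT)) as srv:
            conn, _ = srv.accept()
            with conn:
                buf, sent = b"\x00" * CHUNK, 0
                while sent < TOTAL:
                    conn.sendall(buf)
                    sent += CHUNK

    def client(host):                  # run on the downloading machine
        start, received = time.time(), 0
        with socket.create_connection((host, PORT)) as sock:
            while received < TOTAL:
                data = sock.recv(CHUNK)
                if not data:
                    break
                received += len(data)
        secs = time.time() - start
        print(f"{received / secs / 1e6:.1f} MB/s in {secs:.1f} s")

    if __name__ == "__main__":
        server() if sys.argv[1] == "server" else client(sys.argv[2])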


10 answers

  1. Aurimas Kurauskas 1 Reputation point
    2022-01-26T14:45:48.3+00:00

    Yes, I read that thread, but nothing helped and I didn't find out why downloading is fine at 1G but not at 10G.

    This server is for testing, so there are no confidentiality concerns.

    I ran two identical tests:

    why.etl: 10G connection
    1G.etl: 1G connection

    Link to files: https://1drv.ms/u/s!Asq_YVUBXCNBkxDpQw8yHJsDCILB?e=Ai6pya


  2. Gary Nebbett 5,721 Reputation points
    2022-01-27T09:46:17.423+00:00

    Hello Aurimas,

    Here is some quick feedback, but there is still more analysis and thinking to be done. This is a "visualisation" of the 1G trace:

    [screenshot: 168974-image.png]

    The blue line is the transmit sequence number and the green line (not to scale) is the size of the congestion window. As can be seen, the congestion window size goes up and down a lot, but it is mostly in the range from 150,000 to 250,000 bytes. The trace shows that LSO (Large Send Offload) is in use, and the trace does not contain any information about received (acknowledgement) packets; the trace data is too decoupled from the actual on-the-wire behaviour (due to LSO) to allow many conclusions to be drawn.

    This is a visualization of the 10G trace:

    [screenshot: 169012-image.png]

    The first half of this trace is similar to the 1G trace (LSO in use, no received/acknowledgement packets in the trace data) and then suddenly acknowledgements start to appear in the raw trace data and other TCPIP events appear in the trace. The congestion window ranges in size from 35,000 to 60,000 bytes. In both traces, the send window is stable at 4194304 bytes (this value is chosen by the receiver and is also the limit on the congestion window size).
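
    As a rough orientation, a single TCP stream's throughput is bounded by approximately the congestion window divided by the round-trip time. The RTT of this path is not visible in the traces, so the 4 ms figure in the sketch below is purely an assumption for illustration:

    # Rough single-stream TCP bound: throughput <= congestion window / RTT.
    # The 4 ms round-trip time is an assumed value for illustration only; the
    # actual RTT of this path is not visible in the trace data.
    RTT = 0.004  # seconds (assumed)

    for label, cwnd in [("10G trace, cwnd ~50,000 B", 50_000),
                        ("1G trace, cwnd ~200,000 B", 200_000),
                        ("send window limit, 4 MB", 4_194_304)]:
        print(f"{label}: about {cwnd / RTT / 1e6:.0f} MB/s")

    With that assumed RTT, the ~50,000-byte window of the 10G trace caps out near the observed 10 MB/s, the ~200,000-byte window of the 1G trace lines up with the ~50 MB/s figure, and the 4,194,304-byte send window on its own would allow far more, so the congestion window is the limiting factor.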

    There are some "spurious" retransmissions in the 10G trace (the receiver reports receiving the retransmitted data twice via DSACKs) and this might be an indication of the problem that I mention in https://gary-nebbett.blogspot.com/2022/01/windows-11-tcpip-congestion-control.html.

    I will spend more time looking at the section of the 10G trace that contains acknowledgements (the second half of the trace).

    One thing that you could try is to disable LSO in both adapters and repeat the traces. I think that the main purpose of LSO is to reduce the computational load on the server (rather than to increase transfer speed), and since this is a test system, obtaining a clearer view of what is actually happening on the wire, at the expense of disabling LSO, would probably be worthwhile.
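
    For the record, one way to script that change (rather than using the adapter's advanced property page) is via the NetAdapter PowerShell cmdlets; in this sketch the adapter name "Ethernet 10G" is just a placeholder and must be replaced with the name reported by Get-NetAdapter:

    import subprocess

    # Sketch only: disable Large Send Offload for IPv4 and IPv6 on the named
    # adapter via the NetAdapter module, then show the resulting state before
    # repeating the trace. Run from an elevated prompt.
    ADAPTER = "Ethernet 10G"  # placeholder adapter name

    def ps(command: str) -> str:
        result = subprocess.run(
            ["powershell", "-NoProfile", "-Command", command],
            capture_output=True, text=True, check=True)
        return result.stdout

    ps(f'Disable-NetAdapterLso -Name "{ADAPTER}" -IPv4 -IPv6')
    print(ps(f'Get-NetAdapterLso -Name "{ADAPTER}" | Format-List'))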

    Gary


  3. Aurimas Kurauskas 1 Reputation point
    2022-01-27T11:26:55.28+00:00

    Thank you for the feedback.

    I ran a test with LSO turned off; the speed is better as far as I can see.

    Link to the test results: https://1drv.ms/u/s!Asq_YVUBXCNBkxJvd_l-xx9WFmok?e=pZ62y3

    I also set these properties on the network card:

    Jumbo Packet: 9014 Bytes
    Receive Buffers: 4096
    Transmit Buffers: 16384
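
    For reference, the same properties can also be applied from a script instead of the adapter's property page; in the sketch below the adapter name and the exact display-name strings are placeholders that vary by driver, so list the real ones first with Get-NetAdapterAdvancedProperty:

    import subprocess

    # Sketch only: apply the advanced NIC properties listed above via
    # Set-NetAdapterAdvancedProperty. Adapter and display names are placeholders.
    ADAPTER = "Ethernet 10G"
    settings = {"Jumbo Packet": "9014", "Receive Buffers": "4096",
                "Transmit Buffers": "16384"}

    for name, value in settings.items():
        subprocess.run(
            ["powershell", "-NoProfile", "-Command",
             f'Set-NetAdapterAdvancedProperty -Name "{ADAPTER}" '
             f'-DisplayName "{name}" -DisplayValue "{value}"'],
            check=True)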

    Worth mentioning that I am using the same network card, just changing Speed & Duplex from 10G to 1G.

    I'll be waiting for your analysis.

    Thank you


  4. Gary Nebbett 5,721 Reputation points
    2022-01-27T16:51:03.043+00:00

    Hello Aurimas,

    I am still working on understanding how the tracing could possibly fail, but here is a little bit more information about what is happening when the tracing is working. This is just an extract from one of the images previously posted:

    [screenshot: 169029-image.png]

    The green line is the congestion window size; the first row of dots (nearest the top) are the moments at which a "fast retransmit" occurs (triggered by analysis of the received SACKs); the next row of (red) dots (almost looks like a line) are the moments when SACKs are received and the bottom row of dots are DSACKs (RFC 2883).

    For every (fast) retransmission there is a corresponding DSACK, which means that the retransmission was unnecessary: the original data eventually arrived, but took so long to do so that the SACKs sent in the interim triggered the retransmit. These "spurious" (unnecessary) retransmits cause the congestion window to close a bit. For every turn downwards in the green line, there is a fast retransmission and a DSACK. This seems to be happening 10 to 15 times per second in the 10G trace.

    In the 1G trace, only the congestion window size is available and this seems to take about 5 hits backwards per second. I would guess that the behaviour is the same as the 10G trace: fast retransmit followed by DSACK. Because it happens less often, the congestion window is able to stay wider open.
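
    To get a feel for why the frequency of these hits matters so much, here is a deliberately crude toy model; the growth per round trip, the cut factor and the round-trip time are placeholder assumptions, not the parameters of the actual Windows congestion-control algorithm:

    # Toy model of a congestion window under periodic spurious fast retransmits.
    # All constants are illustrative assumptions.
    RTT = 0.004   # seconds per round trip (assumed)
    MSS = 1460    # bytes of additive growth per round trip (assumed)
    CUT = 0.5     # fraction of the window kept after a spurious retransmit (assumed)

    def average_window(cuts_per_second, seconds=10.0):
        cwnd, total, samples, t = 50_000.0, 0.0, 0, 0.0
        next_cut = 1.0 / cuts_per_second
        while t < seconds:
            cwnd += MSS                # additive growth once per round trip
            if t >= next_cut:          # a spurious fast retransmit closes the window
                cwnd *= CUT
                next_cut += 1.0 / cuts_per_second
            total += cwnd
            samples += 1
            t += RTT
        return total / samples

    for rate in (5, 12):
        print(f"{rate} cuts/s -> average window ~{average_window(rate):,.0f} bytes")

    Even with these made-up parameters, cutting the window 5 times per second leaves it several times larger on average than cutting it 12 times per second, which is the same qualitative picture as the two traces.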

    Gary


  5. Gary Nebbett 5,721 Reputation points
    2022-01-28T09:44:04.49+00:00

    Hello Aurimas,

    Here is a concrete example of an "unnecessary" retransmission:

    15:28:51.097963 server.50076 > client.60345: . 62673456:62674916(1460) ack 81 win 8192 (DF)
    15:28:51.101547 client.60345 > server.50076: . ack 62673456 win 32768 (DF)
    15:28:51.101548 client.60345 > server.50076: . ack 62673456 win 32768 <nop,nop,sack 62676376:62677836> (DF)
    15:28:51.101549 client.60345 > server.50076: . ack 62673456 win 32768 <nop,nop,sack 62679296:62682216 62676376:62677836> (DF)
    15:28:51.101632 client.60345 > server.50076: . ack 62673456 win 32768 <nop,nop,sack 62685136:62686596 62679296:62682216 62676376:62677836> (DF)
    15:28:51.101633 client.60345 > server.50076: . ack 62674916 win 32768 <nop,nop,sack 62685136:62686596 62679296:62682216 62676376:62677836> (DF)
    15:28:51.101645 server.50076 > client.60345: . 62673456:62674916(1460) ack 81 win 8192 (DF)
    15:28:51.105159 client.60345 > server.50076: . ack 62709956 win 32768 <nop,nop,sack 62673456:62674916> (DF)
    

    These are just "selected" packets from the trace, all containing a reference to the sequence number 62673456. This is how I interpret them:

    1. The packet is first sent.
    2. All prior packets are acknowledged up to the packet.
    3. 1 packet, sent later, is selectively acknowledged.
    4. 2 packets, sent later, are selectively acknowledged.
    5. 3 packets, sent later, are selectively acknowledged.
    6. The packet is acknowledged, along with some selective acknowledgements.
    7. The packet is retransmitted.
    8. A duplicate selective acknowledgement (DSACK) for the packet is received.

    The ordering of steps 6 and 7 may seem a bit odd, but this is just due to the asynchrony of the processing and logging. After step 5 (3 reports that the packet is missing), a retransmission is initiated (queued); before the retransmission is actually sent, the "missing" acknowledgement arrives but it is now too late to stop the queued retransmission.

    If the acknowledgement for the packet had arrived 1 microsecond earlier (before the packet was reported "missing" for a third time) then there would have been no problem. These small intervals of time are probably an indication that the "reordering" of packets is happening as the packets pass through some intermediate network device that processes packets in parallel.
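
    As a side note on how that final acknowledgement is classified as a DSACK: the rule comes from RFC 2883, and a minimal sketch of it, applied to the values from the last trace line above, looks like this:

    # RFC 2883: the first SACK block in an ACK reports already-received data
    # (a DSACK) if it lies entirely below the cumulative ACK, or if it is a
    # subset of the second SACK block in the same segment.
    def first_block_is_dsack(cum_ack, sack_blocks):
        if not sack_blocks:
            return False
        start, end = sack_blocks[0]
        if end <= cum_ack:             # already covered by the cumulative ACK
            return True
        if len(sack_blocks) >= 2:      # subset of the second block
            start2, end2 = sack_blocks[1]
            return start2 <= start and end <= end2
        return False

    # The last ACK in the trace: ack 62709956, sack 62673456:62674916.
    print(first_block_is_dsack(62709956, [(62673456, 62674916)]))   # True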

    The "big" effect that these small glitches have is on the size of the sender's congestion window. It won't be practicable to eliminate the small amount of reordering, so the congestion control mechanism needs to be cleverer.

    Gary
