IanTurner-5104 asked EliasTaye-2762 commented

Extremely slow upload speed in Windows (all other OSes on the network are fine)

I've been having this issue since at least January 2021 on my entire fleet of Windows devices. I first noticed it when my offsite backups stopped completing in time.

Upload speed in Windows is being throttled by something. Download speed is unaffected.

WAN is 2 Gbps symmetrical. My ISP (Washington State K-20 Telecommunications Network) confirmed that their circuit is not the cause and is capable of 1800 Mbps symmetrical throughput. Distance is a factor: I can get 400 Mbps to my local telco (which isn't my ISP), but past a few hundred kilometers it drops to 30-70 Mbps. There is an initial burst, but the speed drops quickly.
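As a rough illustration of why distance matters: a single TCP connection can have at most one send window of data in flight per round trip, so the window needed to fill a link grows with latency. Below is a minimal Python sketch of that bandwidth-delay arithmetic. The 1800 Mbps figure is the ISP's number from above; the round-trip times and the 64 KiB window are illustrative assumptions, not measurements from this network.

```python
def window_bytes_needed(link_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_mbps * 1e6 / 8 * rtt_ms / 1e3


def max_throughput_mbps(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on throughput for a sender limited to window_bytes per RTT."""
    return window_bytes * 8 / (rtt_ms / 1e3) / 1e6


# Hypothetical round-trip times: nearby, a few hundred km, further away.
for rtt_ms in (4.0, 25.0, 70.0):
    need_mb = window_bytes_needed(1800, rtt_ms) / 1e6
    capped = max_throughput_mbps(64 * 1024, rtt_ms)
    print(f"RTT {rtt_ms:4.1f} ms: {need_mb:5.2f} MB in flight for 1800 Mbps; "
          f"a 64 KiB window caps one connection at {capped:6.1f} Mbps")
```

The point of the sketch is only that a window that stays small hurts more as the round-trip time grows, which would fit the pattern of a fast start followed by a drop on longer paths.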

I have an HPE/Aruba network and a Sophos XG 310 v2 running SFOS 18.0.4 MR-4. If I plug a client directly into my 10Gbit fiber before my firewall, I can get acceptable speeds on Windows. I haven't been able to find any setting in Sophos XG to tweak that would make any difference. Local iPerf3 tests ruled out my core router/switch/datacenter.

All tests run from Hyper-V guests on Server 2019 Datacenter running on HPE ProLiant DL360 Gen10 hardware.

Windows Server 2019 Datacenter speed tests:

[Image: 81206-image.png]
[Image: 81275-image.png]

Linux (CentOS 8) speed tests:

 Speedtest by Ookla
    
      Server: Comcast - Seattle, WA (id = 1782)
         ISP: Washington State K-20 Telecommunications Network
     Latency:     3.93 ms   (0.16 ms jitter)
    Download:   915.29 Mbps (data used: 1.2 GB)
      Upload:  1537.96 Mbps (data used: 1.8 GB)
 Packet Loss: Not available.
  Result URL: https://www.speedtest.net/result/c/c4bca417-e246-4f46-964a-c4291e4a3914

 Speedtest by Ookla
    
      Server: Comcast - Sacramento, CA (id = 9436)
         ISP: Washington State K-20 Telecommunications Network
     Latency:    24.65 ms   (0.13 ms jitter)
    Download:  1232.96 Mbps (data used: 1.7 GB)
      Upload:  1007.46 Mbps (data used: 1.3 GB)
 Packet Loss:     0.4%
  Result URL: https://www.speedtest.net/result/c/21032a9c-8285-44fe-aadf-ad4dc3d90428

OSes affected for me:

  • Windows 10 2004

  • Windows 10 20H2

  • Windows 2016

  • Windows 2019

All devices are fully updated, firmware included.

I've tried or tweaked:

  • Limit reservable bandwidth

  • AV

  • Safe mode boot

  • Domain and non-domain computers

  • autotuning

  • Interrupt Moderation

  • Receive Side Scaling

  • TCP Congestion Control

  • Large Send Offload

I've tried the following hardware:

  • Dell Optiplex 7040

  • HP Elitebook 840 G5

  • HPE ProLiant DL380 Gen10

  • HPE ProLiant DL360 Gen10

These OSes are fine:

  • ChromeOS

  • Android

  • macOS

  • iOS

  • Linux (CentOS, Hyper-V)

This is a continuation of https://docs.microsoft.com/en-us/answers/questions/89768/slow-wired-upload-speed-vs-linux-on-same-hardware.html


Tags: windows-server, windows-10-network

GaryNebbett answered

Hello @IanTurner-5104,

I guess that you read what I wrote regarding out-of-order delivery undermining the Windows congestion control mechanisms. Have you made any measurements to check to what extent this might explain the behaviour that you are observing?

Gary

IanTurner-5104 answered

Hi @GaryNebbett,

I did read through your responses, but I have not yet run any measurements to check for out-of-order delivery. I suspect my Sophos XG firewall, which would explain why I can get acceptable speeds when a client computer is connected directly to my WAN connection. I have an open ticket with Sophos and will bring up out-of-order delivery with them.

I think I tried all the suggested things, unless I missed something. I used to get normal speeds during upload. I do not have a record of what changed: a firewall update, a Windows update, or something else.

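Trace capture commands:

  # Create a trace session that writes to an .etl file in the temp directory
  New-NetEventSession -LocalFilePath $Env:TEMP\SlowUp.etl -Name SlowUp
  # Capture the first 100 bytes of each packet, plus TCP/IP stack events
  Add-NetEventPacketCaptureProvider -TruncationLength 100 -Level 255 -SessionName SlowUp
  Add-NetEventProvider -Name "Microsoft-Windows-TCPIP" -Level 255 -SessionName SlowUp
  Start-NetEventSession -Name SlowUp

  # [run performance test]

  Stop-NetEventSession -Name SlowUp
  Remove-NetEventSession -Name SlowUp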

Upload event:

 ThreadID="10,992" ProcessorNumber="16" Tcb="0xffffb88b35872b20" DataBytesOut="19,242,849" DataBytesIn="13,306" DataSegmentsOut="6,359" DataSegmentsIn="15" SegmentsOut="6,369" SegmentsIn="7,480" NonRecovDa="1,436" NonRecovDaEpisodes="1,325" DupAcksIn="1,989" BytesRetrans="124,100" Timeouts="0" SpuriousRtoDetections="0" FastRetran="66" MaxSsthresh="501,694" MaxSsCwnd="716,706" MaxCaCwnd="507,015" SndLimTransRwin="3" SndLimTimeRwin="21" SndLimBytesRwin="86,140" SndLimTransCwnd="5" SndLimTimeCwnd="16,434" SndLimBytesCwnd="13,408,300" SndLimTransSnd="5" SndLimTimeRSnd="165" SndLimBytesRSnd="497,253" ConnectionTimeMs="16,643" TimestampsEnabled="FALSE" RttUs="24,261" MinRttUs="482" MaxRttUs="45,838" SynRetrans="0" CongestionAlgorithm="CUBIC" State="ClosedState" LocalAddress="10.1.3.146:55306" RemoteAddress="69.241.21.18:8080" CWnd="24,404" SsThresh="14,600" RcvWnd="261,479" RcvBuf="262,800" SndWnd="2,474,880" FormattedMessage="TCP: Connection 0xffffb88b35872b20 Summary: DataBytesOut 19,242,849 DataBytesIn 13,306 DataSegmentsOut 6,359 DataSegmentsIn 15 SegmentsOut 6,369 SegmentsIn 7,480 NonRecovDa \   1,436 NonRecovDaEpisodes 1,325 DupAcksIn 1,989 BytesRetrans 124,100 Timeouts 0 SpuriousRtoDetections 0 FastRetran 66 MaxSsthresh 501,694 MaxSsCwnd 716,706 \   MaxCaCwnd 507,015 SndLimTransRwin 3 SndLimTimeRwin 21 SndLimBytesRwin 86,140 SndLimTransCwnd 5 SndLimTimeCwnd 16,434 SndLimBytesCwnd 13,408,300 \   SndLimTransSnd 5 SndLimTimeSnd 165 SndLimBytesSnd 497,253 ConnectionTimeMs 16,643 Timestamps FALSE RttUs 24,261 MinRtt 482 MaxRtt 45,838 SynRetrans 0 CongestionAlgorithm CUBIC \   State ClosedState Local 10.1.3.146:55306 Remote 69.241.21.18:8080 CWnd 24,404 SsThresh 14,600 RcvWnd 261,479 RcvBuf 262,800 SndWnd 2,474,880. " ActivityID="35872b20-b88b-ffff-0000-000000000000"  
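To put numbers on that summary event, here is a minimal Python sketch that pulls out a few of its fields and computes the average send rate of this one connection, the share of its lifetime spent limited by the congestion window, and the rate that the final congestion window allows at the reported RTT. The field names and values are copied from the event above. Treating SndLimTimeCwnd as milliseconds is an assumption (the three SndLimTime values add up to roughly ConnectionTimeMs).

```python
import re

# A few fields copied from the connection summary event above.
SUMMARY = (
    'DataBytesOut="19,242,849" BytesRetrans="124,100" FastRetran="66" '
    'SndLimTimeRwin="21" SndLimTimeCwnd="16,434" SndLimTimeRSnd="165" '
    'ConnectionTimeMs="16,643" RttUs="24,261" CWnd="24,404"'
)


def parse_fields(text: str) -> dict[str, int]:
    """Return every Name="1,234" attribute as an int (thousands separators removed)."""
    return {
        name: int(value.replace(",", ""))
        for name, value in re.findall(r'(\w+)="([\d,]+)"', text)
    }


fields = parse_fields(SUMMARY)

# Average send rate of this single connection over its lifetime.
avg_mbps = fields["DataBytesOut"] * 8 / (fields["ConnectionTimeMs"] / 1000) / 1e6

# Share of the connection's lifetime spent limited by the congestion window
# (assumes SndLimTimeCwnd is in milliseconds, like ConnectionTimeMs).
cwnd_limited = fields["SndLimTimeCwnd"] / fields["ConnectionTimeMs"]

# Rate that the final congestion window allows at the reported RTT.
cwnd_mbps = fields["CWnd"] * 8 / (fields["RttUs"] / 1e6) / 1e6

print(f"average rate: {avg_mbps:.2f} Mbps")
print(f"cwnd-limited: {cwnd_limited:.1%} of connection time")
print(f"final cwnd / RTT: {cwnd_mbps:.2f} Mbps")
```

On these numbers the connection averaged under 10 Mbps and was limited by the congestion window for nearly all of its lifetime, rather than by the receiver's window or by the sending application.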

SlowUp.etl: https://drive.google.com/file/d/1DUC9if8uDLiijNq_k0HrER-cUss115Ja/view

Thank you for helping,
Ian

SunnyQi-MSFT answered

Hi,

Thanks for posting on the Q&A platform.

In your case, we need to analyze the performance log to find any clues. Unfortunately, analysis of performance logs is beyond our forum support level, and due to forum security policy we have no channel for collecting user log information. If you want to find the cause, we recommend that you open a case with Microsoft Professional technical support. They will help you open a phone or email case with Microsoft, so that you get one-to-one technical support while your private information stays protected.

Here is the link:

https://support.microsoft.com/en-us/gp/customer-service-phone-numbers

Best Regards,
Sunny


GaryNebbett answered

Hello @IanTurner-5104,

There are certainly examples of very out-of-order delivery triggering unnecessary retransmissions and reduction of the congestion window in your trace data. The presence of D-SACK data (RFC 2883) in the trace leaves no doubt about this. I need to do some more work to quantify the frequency of such events and their impact on the throughput.

I doubt that the "out-of-order" behaviour is caused by your equipment; I think it is much more likely to be caused by network devices elsewhere in the Internet.

Gary

GaryNebbett answered

Hello @IanTurner-5104,

Speedtest uses several HTTP connections in parallel to test the upload speed. So far, I have just looked at one connection in detail. This connection retransmitted 85 segments, all of which were spurious (i.e. were ultimately received twice because the out-of-order delivery confounded the lost segment detection - the original segments were not lost but just arrived "late").

The "CUBIC for Fast Long-Distance Networks draft-eggert-tcpm-rfc8312bis-01" draft RFC (dated 2 February 2021), which might update "CUBIC for Fast Long-Distance Networks" (RFC 8312) says:

CUBIC MAY implement an algorithm to detect spurious retransmissions,
such as DSACK [RFC3708], Forward RTO-Recovery [RFC5682] or Eifel
[RFC3522]. Once a spurious congestion event is detected, CUBIC
SHOULD restore the original values of above mentioned variables as
follows if the current cwnd is lower than prior_cwnd. Restoring
to the original values ensures that CUBIC's performance is similar to
what it would be if there were no spurious losses.

The current Windows implementation of CUBIC is not undoing the reductions in the congestion window caused by spurious retransmissions - perhaps all of your other systems either do this or use a different congestion control mechanism.
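To make the "undo" idea concrete, here is a deliberately simplified Python sketch of what the draft describes: remember the congestion window before each reduction, and restore it when a DSACK shows that the retransmission was spurious. This is a toy model, not the Windows implementation or any other real TCP stack. The class and method names are hypothetical, window growth is a fixed per-round increment rather than the real CUBIC function, the segment size is assumed, and 0.7 is the multiplicative decrease factor given in RFC 8312.

```python
BETA_CUBIC = 0.7   # multiplicative decrease factor from RFC 8312
MSS = 1460         # assumed segment size in bytes


class ToySender:
    """Toy congestion window model; growth is linear, not the real CUBIC curve."""

    def __init__(self, cwnd: int, undo_on_dsack: bool):
        self.cwnd = cwnd
        self.undo_on_dsack = undo_on_dsack
        self.prior_cwnd = cwnd

    def on_rtt_without_loss(self) -> None:
        self.cwnd += MSS  # placeholder growth rule

    def on_fast_retransmit(self) -> None:
        # Loss is assumed: save the window, then reduce it.
        self.prior_cwnd = self.cwnd
        self.cwnd = max(2 * MSS, int(self.cwnd * BETA_CUBIC))

    def on_dsack(self) -> None:
        # DSACK: the receiver got the segment twice, so the loss was spurious.
        if self.undo_on_dsack and self.cwnd < self.prior_cwnd:
            self.cwnd = self.prior_cwnd


def simulate(undo_on_dsack: bool, rounds: int = 50) -> int:
    sender = ToySender(cwnd=100 * MSS, undo_on_dsack=undo_on_dsack)
    for i in range(rounds):
        sender.on_rtt_without_loss()
        if i % 5 == 4:                 # every 5th round: reordering looks like loss
            sender.on_fast_retransmit()
            sender.on_dsack()          # ...and a DSACK later proves it spurious
    return sender.cwnd


print("final cwnd without undo:", simulate(False))
print("final cwnd with undo:   ", simulate(True))
```

In this toy run the sender without the undo step ends with a much smaller window, even though no segment was actually lost.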

Unless the out-of-order delivery is caused by a device under your control (unlikely), then there is not much that you can do about this.

Gary

GaryNebbett answered

Hello All,

In the network traces that I have seen which record spurious retransmissions there is another characteristic that I mentioned as an aside, namely that "ack only" packets tend to appear in "bunches" with sub-microsecond spacing between the packets. I did not place too much emphasis on this because the process of capturing trace data can introduce "artefacts" (characteristics that would not be present if one could monitor the low-level signalling of the messages).

Whilst doing some background reading, I found this article by Christian Huitema: Implementing Cubic congestion control in Quic. In the section on "Spurious retransmissions", he writes:

If you look closely at the graph of RTT above, you will see a number of vertical bars with multiple measurements happening at almost the same time. This is a classic symptom of “ACK compression”. Some mechanism in the path causes ACK packets to be queued and then all released at the same time.

This gives a name ("ACK compression") to the bunching that can be researched on the Web. Christian was writing about applying Cubic to HTTP/3 (UDP), but it is equally relevant to its use with TCP.
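As a rough way to look for this pattern in a capture, the Python sketch below groups ACK arrival timestamps into bunches wherever consecutive packets arrive closer together than a threshold. It is only an illustration: the function name, the 10 microsecond threshold, the minimum bunch size, and the sample timestamps are all made up for the example, and, as noted above, capture timestamps can themselves be artefacts.

```python
def find_ack_bunches(arrivals_us: list[float], max_gap_us: float = 10.0,
                     min_size: int = 3) -> list[list[float]]:
    """Group arrival times (microseconds) into runs with small gaps.

    A run is reported only if it contains at least min_size packets.
    """
    bunches: list[list[float]] = []
    current: list[float] = []
    for t in sorted(arrivals_us):
        # A large gap closes the current run and starts a new one.
        if current and t - current[-1] > max_gap_us:
            if len(current) >= min_size:
                bunches.append(current)
            current = []
        current.append(t)
    if len(current) >= min_size:
        bunches.append(current)
    return bunches


# Hypothetical ACK arrival times: evenly paced ACKs, then a compressed burst.
sample = [0.0, 500.0, 1000.0, 1500.0, 24000.0, 24000.4, 24000.9, 24001.3, 24001.8]
for bunch in find_ack_bunches(sample):
    span = bunch[-1] - bunch[0]
    print(f"{len(bunch)} ACKs within {span:.1f} us starting at {bunch[0]:.1f} us")
```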

The use of the expression "Some mechanism in the path" is compatible with my belief that the source of the problem being discussed in this thread is beyond our control and that additional mechanisms are needed in the Windows implementation to compensate for it (e.g. undoing reductions in the congestion window caused by spurious retransmissions).

Gary

GaryNebbett answered GaryNebbett edited

Hello All,

I thought that a graphical representation of this problem might be helpful. Let’s start with two pictures of a “healthy” speed test (where the test results match the nominal or expected speed).

First, an overview of the 5 parallel (concurrent) connections used during the upload speed test:
[Image: 94395-image.png]

This is a larger view of one connection:
[Image: 94422-image.png]

The blue line is the TCP send sequence number; the green line is the size of the congestion window (not to scale on the Y-axis – it is its shape that is most interesting).

At 6 arbitrary Y-values, points are shown when “interesting” events occur; the events are:

• TcpEarlyRetransmit
• TcpLossRecoverySend Fast Retransmit
• TcpLossRecoverySend SACK Retransmit
• TcpDataTransferRetransmit
• TcpTcbExpireTimer RetransmitTimer
• DSACK [RFC3708] arrival

Some of these events do not always have a visible impact on the congestion window. For example, expiration of the retransmit timer (lime green coloured dots on the graph) does not affect the window if a “tail loss” is suspected. Potential “tail loss” situations occur in the speed test because, within each connection, the WebSocket protocol is used to send chunks of data and this requires WebSocket protocol level handshakes.

The overview shows that there were only a few retransmissions (six, I think), roughly equally divided between fast retransmits (based on duplicate acknowledgements, SACK, etc.) and “slow” retransmits (retransmit timer expiration).

Here is a less healthy speed test:
[Image: 94383-image.png]

And again, a larger view of one connection:
[Image: 94384-image.png]

In these traces there are a large number of fast retransmits, almost all of which are accompanied (a few milliseconds later) by a DSACK notification (indigo coloured dots, the highest row of dots) – indicating that the retransmission was not needed. The impact of these retransmissions on the congestion window is clear to see – it is kept very small for almost the entire duration of the speed test.

The first half a second of the above trace hints at what might have been – the steep slope of the blue line suggests that a much higher upload speed is probably possible.

As mentioned in previous postings, there is not much that an end-user can do about this problem. The device responsible for the “ACK compression” (impacting timing and ordering of ACK delivery) is probably not under their control and the Microsoft congestion control provider does not use DSACK information to “undo” its erroneous reductions of the congestion window.
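The pairing described above (a fast retransmit followed a few milliseconds later by a DSACK) can be counted mechanically once event timestamps have been exported from a trace. Here is a minimal Python sketch, assuming the timestamps are already available as lists of milliseconds. The function name, the 50 ms matching window, and the sample data are hypothetical, and matching on time alone is a simplification: a real analysis would compare the sequence ranges reported in the DSACK blocks.

```python
from bisect import bisect_left


def count_spurious(retransmit_ms: list[float], dsack_ms: list[float],
                   window_ms: float = 50.0) -> int:
    """Count retransmits followed by a DSACK within window_ms.

    Each DSACK is matched to at most one retransmit. Matching on time alone
    is a simplification; a real analysis would compare sequence ranges.
    """
    dsacks = sorted(dsack_ms)
    used = [False] * len(dsacks)
    spurious = 0
    for t in sorted(retransmit_ms):
        i = bisect_left(dsacks, t)          # first DSACK at or after the retransmit
        while i < len(dsacks) and dsacks[i] - t <= window_ms:
            if not used[i]:
                used[i] = True
                spurious += 1
                break
            i += 1
    return spurious


# Hypothetical event times in milliseconds.
retransmits = [100.0, 400.0, 900.0, 1500.0]
dsacks = [125.0, 426.0, 1524.0]
n = count_spurious(retransmits, dsacks)
print(f"{n} of {len(retransmits)} retransmits look spurious")
```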

Gary


Hello All,

These articles suggest that improvements in Windows' handling of out-of-order packets are coming: https://techcommunity.microsoft.com/t5/networking-blog/algorithmic-improvements-boost-tcp-performance-on-the-internet/ba-p/2347061 and https://dropbox.tech/infrastructure/boosting-dropbox-upload-speed.

Gary

EliasTaye-2762 answered GaryNebbett commented

@GaryNebbett Hey Gary, I'm currently facing the same issue. Did you find a fix for it? I've been struggling for months now. Linux is working fine, but I don't want to migrate. It's really hurting my upload speed.

Hello @EliasTaye-2762,

As my last comment mentioned, the following two articles indicate strongly that improvements are on the way (currently available via Windows Insider builds): https://techcommunity.microsoft.com/t5/networking-blog/algorithmic-improvements-boost-tcp-performance-on-the-internet/ba-p/2347061 and https://dropbox.tech/infrastructure/boosting-dropbox-upload-speed.

Gary

EliasTaye-2762 answered EliasTaye-2762 commented

@GaryNebbett I've seen them, but they don't say when the update will ship. Have you made peace with the problem and decided to live with it, or have you found a way to overcome it? And would downgrading help?

Hello @EliasTaye-2762,

I don't have the problem myself. Nothing other than the modifications described in those articles will improve the situation for Windows.

Gary

@GaryNebbett Surprisingly, I have been facing this on all Windows devices, including Windows 7, and it started recently, after February. I checked with my ISP and they are not throttling. Other devices are just fine, and I am only having a problem with upload speeds!

GaryNebbett answered EliasTaye-2762 commented

Hello @EliasTaye-2762,

That is all perfectly consistent with the foregoing messages. Reverting to an older version of Windows will not help - one needs to wait for the coming improvements. The slowness is not caused by intentional throttling - indeed probably quite the contrary; somewhere along the path to the systems that you are using, an update, probably intended as an improvement, is interacting badly with the current Windows TCP congestion mechanism.

Gary

@GaryNebbett Let's hope Microsoft fixes this problem quickly, because we the users are really struggling! Switching to Ubuntu is on the table because our work environment will force us to; we have no other choice. We're helpless. We actually wasted the ISP's time over nonsense, then: it wasn't their fault, it was Microsoft all along. My company is pushing for a quick fix, and we're planning to run operations on a web app from Linux machines this coming week if this is not fixed by then. Thank you, Gary, for the clarification. Microsoft needs to step up and fix the problem. Have a good day. - Elias
