question

0 Votes
jaltmann-3698 asked ScottRamirez-9041 commented

Windows 10 build 2004 issues with dnsapi.dll

Hello,

We have an issue (possibly specific to a GPO in our environment) with the dnsapi.dll library in build 2004. After a domain-joined computer is freshly imaged with 2004, or updated to it from a previous version, any network connectivity causes lsass.exe to spike all cores to 100% CPU while it calls into dnsapi.dll across multiple threads. I determined this using Process Explorer from Sysinternals. The problem is platform independent and happens on both our Dells and Lenovos. If any network device is connected, the result is a login screen that spins forever. If the network devices are disabled, the user logs in, and a network device (Wi-Fi or Ethernet) is then connected, services that require privilege escalation fail because of the high CPU usage. If the network devices are then disconnected, the cores free up after a few minutes.

As a partial fix, I replaced both the 32-bit and 64-bit copies of dnsapi.dll with the versions from Windows 10 build 1903. The lsass issue goes away, and I'm able to log in with no high CPU usage or privilege escalation failures. The side effect of the older dnsapi.dll is that I'm unable to browse network shares, and Event Viewer shows the following error: "The DNS Client service terminated with the following error: The specified procedure could not be found".

Unfortunately, I'm unable to get a Microsoft resource because our org is under 500 people, and our licensing partner cannot get a resource assigned to investigate the issue either. If we are having this issue in our environment, I'm sure others are running into it as well.

The cumulative update KB4565503 (July 14, 2020) does not fix the issue.

windows-10-general windows-dhcp-dns

If possible, could you please share the detailed GPO configuration with us? I would like to test in my lab and check whether I encounter the same issue.

0 Votes 0 ·
0 Votes
CandyLuo-MSFT answered

Hi,

Based on my understanding, you used Process Explorer and found that lsass.exe spikes all cores to 100% CPU usage while calling dnsapi.dll in Windows 10 2004. Is that right? Please feel free to let me know if my understanding is wrong.

I have not received such feedback yet. There may be some early-adopter issues with Windows 10 2004 at this time. If possible, I would recommend waiting until Windows 10 2004 matures with future cumulative updates.

In addition, the Feedback Hub app lets you tell Microsoft about any problems you run into while using Windows 10. You can report this issue to Microsoft directly with the Feedback Hub app.

For more information on using the app, click here:

https://support.microsoft.com/en-us/help/4021566/windows-10-send-feedback-to-microsoft-with-feedback-hub-app

If this issue is urgent, I would also suggest you contact Microsoft Customer Support and Services where more in-depth investigation can be done so that you would get a more satisfying explanation and solution to this issue.

You can find the phone number for your region at the link below:

https://support.microsoft.com/en-us/help/4051701/global-customer-service-phone-numbers

---Please Accept as answer if the reply is helpful---

Best Regards,

Candy






0 Votes
jaltmann-3698 answered ScottRamirez-9041 commented

You are correct. It also appears to be happening with the DNS Client (dnscache) service. The service seems to be calling into dnsapi.dll to register the device in DNS as part of the DHCP process; it spins up multiple threads and maxes out all cores. Disabling the DNS Client service does not resolve the issue as long as some other service or process is calling dnsapi.dll. With the 2004 build of dnsapi.dll, processes get "stuck" calling functions in the DLL, and as a result services like VPN (in our case Palo Alto's GlobalProtect) fail to establish a tunnel.




I have deployed a Windows 10 2004 machine and joined it to the domain, but I did not encounter this problem. Could you please share the detailed GPO configuration with me (I am not sure whether it is GPO related)? Did you configure an NRPT rule in your GPO?


0 Votes 0 ·

I just want to check on the current situation.

Please feel free to let us know if you need further assistance.

0 Votes 0 ·

I have redacted the namespace value, but it's set to our internal domain. From what I've experienced so far, we only have the issue on physical laptops and desktops (brand independent, but both using Intel hardware); we do not have the same issue on virtual machines.

[Attached screenshot: nrpt.png]


1 Vote 1 ·

I want to confirm that I am seeing this same issue on brand-new laptops.

After much investigation I found the following:
- The client would boot, connect to our internal network, and function normally.
- If I tried to connect to ANY external network, the DHCP service would spin the CPU to 100% and the device became unusable.

I found that removing the Group Policy setting that requires DNSSEC validation for our namespace resolved the issue. We need a better solution that allows DNSSEC on our internal network for all of our clients (including laptops). The DHCP process should not go to 100% CPU utilization even if every lookup fails to meet the policy. Microsoft needs to resolve this.

-Scott

0 Votes 0 ·
0 Votes
dwillia-BNB answered

I still have the identical issue with a batch of Lenovo laptops. They were all installed and domain joined with Windows 10 1909. As soon as they took the 2004 feature update, we had the issue you describe above: 100% CPU from the dnscache service waiting for lsass.

We still have no solution and have had varying success with assigning a static DNS server (8.8.8.8) to the wireless adapter or disabling networking altogether (dnscache does not load without networking).

Were you offered a valid solution for this? We are keeping 2004 on hold until we can get this resolved.


0 Votes
dwillia-BNB answered CandyLuo-MSFT commented

Replacing dnsapi.dll with the 1909 version has remediated the issue on my Lenovo laptop. This is not a fix, but it's a start.

Time to get moving on this, Microsoft!


Did you configure an NRPT rule in your GPO? Could you please share the detailed GPO configuration with us? I have joined a Windows 10 2004 machine (with internet access) to the domain and could not reproduce the behavior. I suspect this issue is related to GPO settings. It would help us identify the problem if you could share more details by uploading screenshots of the GPO directly.




0 Votes 0 ·
0 Votes
CandyLuo-MSFT answered CandyLuo-MSFT commented

Hi jaltmann,

It seems this issue is related to the NRPT rule. Please remove the NRPT rule, and then everything will work fine.

>>we do not have the same issue for virtual machines

We can also reproduce this issue in a Windows 10 2004 VM, as long as the VM has internet access. The steps are as follows:

  1. Set up a Windows 10 2004 virtual machine on an internal virtual switch and configure an NRPT rule via GPO.

  2. Join the Windows 10 2004 machine to the domain and apply the GPO (NRPT rule).

  3. After the GPO is applied, disable the internal switch and enable the external switch so that the machine has network connectivity but cannot reach the DC.

  4. The DNS Client service then shows 100% CPU usage in Task Manager.

---Please Accept as answer if the reply is helpful---

Best Regards,

Candy






Hi Candy,

I have removed the NRPT rule via the registry at this location: HKLM:\SOFTWARE\Policies\Microsoft\Windows NT\DNSClient\DNSPolicyConfig\
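
For anyone who wants to see what is actually under that key before deleting anything, here is a minimal read-only sketch in Python. It assumes the policy rules are stored as subkeys of the path quoted above and simply dumps whatever values each subkey holds. `NRPT_POLICY_KEY`, `read_nrpt_rules`, and `format_rule` are names made up for this example.

```python
import sys

# Key path taken from the post above (relative to HKEY_LOCAL_MACHINE).
NRPT_POLICY_KEY = r"SOFTWARE\Policies\Microsoft\Windows NT\DNSClient\DnsPolicyConfig"


def format_rule(rule_id, values):
    """Render one rule (a dict of registry value name -> data) as readable text."""
    lines = [f"Rule {rule_id}"]
    for name in sorted(values):
        data = values[name]
        # Multi-string registry values come back as lists; join them for display.
        if isinstance(data, list):
            data = ", ".join(str(item) for item in data)
        lines.append(f"  {name} = {data}")
    return "\n".join(lines)


def read_nrpt_rules():
    """Return {subkey_name: {value_name: data}} for every subkey of the policy key.

    Assumption: each policy-delivered NRPT rule is one subkey. Read-only.
    """
    import winreg  # Windows-only standard library module

    rules = {}
    try:
        root = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, NRPT_POLICY_KEY)
    except FileNotFoundError:
        return rules  # key absent: no policy-delivered rules found
    with root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for i in range(subkey_count):
            rule_id = winreg.EnumKey(root, i)
            with winreg.OpenKey(root, rule_id) as rule_key:
                value_count = winreg.QueryInfoKey(rule_key)[1]
                rules[rule_id] = {}
                for j in range(value_count):
                    name, data, _type = winreg.EnumValue(rule_key, j)
                    rules[rule_id][name] = data
    return rules


if __name__ == "__main__" and sys.platform == "win32":
    for rule_id, values in read_nrpt_rules().items():
        print(format_rule(rule_id, values))
```

Because the rules come from Group Policy, anything deleted by hand under this key will presumably be written back at the next policy refresh unless the GPO itself is changed.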

I am now able to connect with our VPN client, and CPU usage is down. One thing to note is that the NRPT rule is necessary for DNSSEC client communications within our network. I'm wondering whether our VPN client is a factor in the DNSSEC/NRPT issue. We use Palo Alto's GlobalProtect with SSL forced traffic, so I'm not sure whether negotiating the tunnel while performing DNS lookups (with DNSSEC in mind) before the client has access to internal DNS servers is causing the issue. Are you able to test this situation with this client or a similar VPN client?

0 Votes 0 ·

I'd also like to note that upon turning NRPT on, even though we can connect to the VPN, I'm unable to access any internal resource that depends on DNS (notably SMB and administrative tools). This is similar to what we see when we swap in the 1903 dnsapi.dll.

0 Votes 0 ·

Hi,
Thanks for the update.

>>We use Palo Alto's Globalprotect w/SSL forced traffic, so I'm not sure if the negotiation of the tunnel along with performing DNS lookups (with DNSSEC in mind) prior to having access to internal DNS servers is causing the issue. Are you able to test this situation with this client or a similar VPN client?

I have no similar VPN client to test with in my lab. If this issue is urgent, I would suggest opening a case with Microsoft. That way, they can get a clear picture of your issue and your environment through phone calls and live sharing sessions.

You can find the phone number for your region at the link below:

https://support.microsoft.com/en-us/help/4051701/global-customer-service-phone-numbers

Best Regards,

Candy


0 Votes 0 ·
0 Votes
Thibault-0289 answered

I second this problem with a Lenovo laptop since build 2004.
I reinstalled Windows as a clean copy and experienced the same issue.
It seems to occur on boot and sometimes when connecting the VPN. It's very frustrating, as it causes the computer to be very slow, sometimes for a few minutes, and drains the battery. Then it gets back to normal.
I have no NRPT rule and a close-to-original hosts file (I read somewhere that this could be the issue).


0 Votes
Gates-8167 answered DavidHerselman-6588 commented

In our case, it was as simple as going back to the Name Resolution Policy and adding "Generic DNS Servers" to the list.
Our GPO was set up pretty much like the above screenshot by jaltmann-3698, except the namespace was defined as our domain name.

My theory is that if you don't specify the DNS servers to use for the policy, the DNS client will attempt DNSSEC validation with whatever DNS servers you get through DHCP.
In our case, we noticed it when users took laptops home: they would get the high CPU usage there, but in the office things were fine.

So User A would go home with DNSSEC enabled, get his ISP's DNS servers, and validation would keep trying forever (in some cases, we had reports that it would eventually time out after about an hour).

Changing the policy to only use DNSSEC with the DNS servers specified on that "Generic DNS Servers" tab solved the problem.
User A would go home and get his ISP's DNS servers, which the policy recognized as not valid DNS servers for DNSSEC, so it wouldn't try to validate records.
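
To make the theory concrete, here is a small Python model of the decision it describes. It is only a sketch of the behavior discussed in this thread, not the Windows DNS client's actual logic: `NrptRule`, `matches`, and `servers_for_query` are hypothetical names, the suffix matching is simplified, and 203.0.113.53 stands in for an ISP resolver handed out by DHCP.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class NrptRule:
    namespace: str                  # e.g. ".contoso.local" (leading dot = suffix rule)
    dnssec_validation: bool         # whether validation is required for this namespace
    generic_dns_servers: List[str] = field(default_factory=list)


def matches(rule: NrptRule, name: str) -> bool:
    """Simplified matching: suffix rules match by ending, other rules match exactly."""
    name = name.lower().rstrip(".")
    namespace = rule.namespace.lower().rstrip(".")
    if namespace.startswith("."):
        return name.endswith(namespace)
    return name == namespace


def servers_for_query(name: str, rules: List[NrptRule],
                      adapter_dns: List[str]) -> Tuple[List[str], bool]:
    """Return (servers to ask, whether validation is demanded) under this model."""
    for rule in rules:
        if matches(rule, name):
            # With no servers on the rule, fall back to whatever the adapter got.
            servers = rule.generic_dns_servers or adapter_dns
            return servers, rule.dnssec_validation
    return adapter_dns, False


home_dns = ["203.0.113.53"]  # placeholder for an ISP resolver
without_servers = [NrptRule(".contoso.local", True)]
with_servers = [NrptRule(".contoso.local", True, ["192.168.252.11", "192.168.252.12"])]

print(servers_for_query("fileserver.contoso.local", without_servers, home_dns))
print(servers_for_query("fileserver.contoso.local", with_servers, home_dns))
print(servers_for_query("www.example.com", with_servers, home_dns))
```

In this model, the first call is the take-the-laptop-home case: the rule matches, validation is demanded, and the only resolver available is the ISP's. The second call shows the effect of filling in the "Generic DNS Servers" tab, where the rule pins lookups for the namespace to the listed servers.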


[Attached screenshot: nrpt-dns.png]




This 'solution' simply avoids the bug by always sending queries for that domain to servers that support DNSSEC.
It is not feasible when your AD DNS zones and public zones are both signed, and you want to use whatever DNS resolvers people are assigned and rely on NRPT to guarantee that the resolvers validate answers cryptographically.

1 Vote 1 ·
0 Votes
DavidHerselman-6588 answered DavidHerselman-6588 commented

Triggering this bug requires the workstation's DNS server(s), whether assigned statically, via DHCP, or by the CheckPoint VPN client, to somehow fail DNSSEC validation. The high CPU usage then continues until the system is restarted. This bug has forced us to stop validating DNSSEC via NRPT policies.

A home broadband router or a phone hotspot may set itself as the DNS server, and in most cases it does not support DNSSEC.

We reproduced this on an unjoined Windows 10 Enterprise system, where we ran gpedit.msc and created a new policy to enforce DNSSEC validation for the suffix 'company.com'. Make sure the zone is signed and working (https://dnssec-analyzer.verisignlabs.com) before creating an alias (CNAME) to an unsigned destination, for example: test.company.com CNAME www.apple.com.

Trying to 'ping test.company.com' immediately results in high CPU usage making the system unusable...


Just to avoid ambiguity:
NRPT should prevent resolving test.company.com, which it does, but in the process breaks the system due to the high CPU load that this bug subsequently causes.

0 Votes 0 ·
0 Votes
dwillia-BNB answered

CONFIRMED: DNSSEC is the culprit! Only on Windows 10 2004 and 20H2. 1909 and prior were all fine.

Once I added the NRPT config to enforce DNSSEC only for the DNS servers listed under Generic DNS, the problem doesn't come up until I start the GlobalProtect VPN. My workaround had been to locate the DNS service and set its CPU affinity to one core instead of all. This made my laptop usable again.

Going further, I took my laptop out of scope for the DNSSEC GPO altogether, and the DNS high CPU usage (100%) disappeared completely. My laptop is back to normal. Of course, disabling DNSSEC is not a solution, as we need it enabled and I have another 200 laptops with the same issue.

Here is the working section of the GPO with the DNSSEC and NRPT settings (domain name and IPs changed). Does this look right?

<Computer>
  <VersionDirectory>2</VersionDirectory>
  <VersionSysvol>2</VersionSysvol>
  <Enabled>true</Enabled>
  <ExtensionData>
    <Extension xmlns:q1="http://www.microsoft.com/GroupPolicy/Settings/nrpt" xsi:type="q1:NrptSettings">
      <q1:Global>
        <q1:Nla xsi:nil="true" />
        <q1:Fallback xsi:nil="true" />
        <q1:Query xsi:nil="true" />
      </q1:Global>
      <q1:Rules>
        <q1:Config>10</q1:Config>
        <q1:DnsDnssec>1</q1:DnsDnssec>
        <q1:DnsIpsec>0</q1:DnsIpsec>
        <q1:DnsEncryption>0</q1:DnsEncryption>
        <q1:DirectIpsec xsi:nil="true" />
        <q1:DirectEncryption xsi:nil="true" />
        <q1:DirectProxyType xsi:nil="true" />
        <q1:Name>
          <q1:string>.contoso.local</q1:string>
        </q1:Name>
        <q1:Version>1</q1:Version>
        <q1:Ca />
        <q1:GenericDnsServers>192.168.252.11;192.168.252.12;192.168.252.13</q1:GenericDnsServers>
        <q1:Encoding xsi:nil="true" />
      </q1:Rules>
    </Extension>
    <Name>Name Resolution Policy</Name>
  </ExtensionData>
  <ExtensionData>
    <Extension xmlns:q2="http://www.microsoft.com/GroupPolicy/Settings/Registry" xsi:type="q2:RegistrySettings">
      <q2:Blocked>false</q2:Blocked>
    </Extension>
    <Name>Registry</Name>
  </ExtensionData>
</Computer>
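
If it helps with comparing GPOs across machines, a fragment like the one above can be summarized with a few lines of Python. This is a sketch under assumptions: the pasted XML uses the `xsi:` prefix without declaring it, so the code wraps the fragment in a made-up `Wrapper` element that declares it (presumably the full report does so on its root). `summarize_nrpt` is a hypothetical helper, `SAMPLE` is a trimmed copy of the fragment above, and the `DnsDnssec` value is returned as-is rather than interpreted.

```python
import xml.etree.ElementTree as ET

NRPT_NS = {"q1": "http://www.microsoft.com/GroupPolicy/Settings/nrpt"}
XSI_URI = "http://www.w3.org/2001/XMLSchema-instance"

# Trimmed copy of the fragment posted above.
SAMPLE = """
<Extension xmlns:q1="http://www.microsoft.com/GroupPolicy/Settings/nrpt" xsi:type="q1:NrptSettings">
  <q1:Rules>
    <q1:DnsDnssec>1</q1:DnsDnssec>
    <q1:Name><q1:string>.contoso.local</q1:string></q1:Name>
    <q1:GenericDnsServers>192.168.252.11;192.168.252.12;192.168.252.13</q1:GenericDnsServers>
  </q1:Rules>
</Extension>
"""


def _text(parent, tag):
    """Text of a direct q1: child element, or '' if it is missing or empty."""
    value = parent.findtext(f"q1:{tag}", default="", namespaces=NRPT_NS)
    return (value or "").strip()


def summarize_nrpt(fragment):
    """Return one summary dict per <q1:Rules> element in the fragment."""
    # Assumption: declare xsi on a wrapper so the undeclared prefix parses.
    wrapped = f'<Wrapper xmlns:xsi="{XSI_URI}">{fragment}</Wrapper>'
    root = ET.fromstring(wrapped)
    summaries = []
    for rules in root.iterfind(".//q1:Rules", NRPT_NS):
        servers = [s for s in _text(rules, "GenericDnsServers").split(";") if s]
        summaries.append({
            "namespaces": [e.text for e in rules.iterfind("q1:Name/q1:string", NRPT_NS)],
            "dnssec": _text(rules, "DnsDnssec"),  # raw value, not interpreted
            "generic_dns_servers": servers,
        })
    return summaries


for rule in summarize_nrpt(SAMPLE):
    print(rule)
```

An empty `generic_dns_servers` list in the output would flag a rule that has no servers filled in on the "Generic DNS Servers" tab.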


0 Votes
SteveParankewich answered SteveParankewich published

We have a client with this same issue. Are there any updates from the Product Team or has a Premier Ticket been opened?
