Blaise-1597 asked Blaise-1597 commented

win 10 machine can't access network shares on pdc, intermittent

I manage a small Windows domain for my home/home office. It consists of a PDC running Windows Server 2003 SP1 (I know! It's old! etc.) and a handful of computers with fixed private IP addresses (10.x.x.x):

3 Win 10-64 Pro

1 Win 8.1-64 Pro

2 Win 7 Pro (one 64-bit, one 32-bit)

1 Win XP-32

My intermittent problem for the past year or so is that one of the Win 10 machines occasionally (maybe 3 times/month) suddenly cannot access any network shares on the PDC. This causes a number of automated backup processes to fail. When this happens:

(machine names and IP addresses changed to protect the guilty! :-)

- it can ping the PDC by name, so DNS is not the issue

 C:\Users\XXX>ping MyServer

 Pinging MyServer [10.x.x.x] with 32 bytes of data:
 Reply from 10.x.x.x: bytes=32 time=1ms TTL=128
 Reply from 10.x.x.x: bytes=32 time<1ms TTL=128
 Reply from 10.x.x.x: bytes=32 time=1ms TTL=128
 Reply from 10.x.x.x: bytes=32 time=1ms TTL=128

 Ping statistics for 10.x.x.x:
     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
 Approximate round trip times in milli-seconds:
     Minimum = 0ms, Maximum = 1ms, Average = 0ms

- this fails, with a fairly long delay before the error message appears (perhaps 2 minutes):

 C:\Users\XXX>dir \\MyServer\backupfiles
    
 The semaphore timeout period has expired.

- and yet this works:

 C:\Users\XXX>dir \\10.x.x.x\backupfiles
  Volume in drive \\10.x.x.x\backupfiles is zzzz
  Volume Serial Number is 10C0-79BC

  Directory of \\10.x.x.x\backupfiles

 05/09/2021  08:28 AM    <DIR>          .
 05/09/2021  08:28 AM    <DIR>          ..
 05/13/2019  09:53 AM                23 BackupCacheRoot.txt
 03/21/2019  12:14 PM                23 BackupRoot-ZZZZ.txt
 03/21/2019  12:14 PM                23 BackupRoot.txt
 05/25/2019  11:53 AM                23 BackupStagingRoot.txt
 07/27/2020  10:43 AM                22 BatchFolder.txt
 03/24/2019  09:36 AM                24 BatchJobFolder.txt
 05/14/2019  08:13 PM             1,510 GetBackupLoc.bat
                7 File(s)          1,648 bytes
                2 Dir(s)   4,306,067,456 bytes free

- it can connect to network shares on other machines

The only way I can clear this is by rebooting the win 10 machine, after which all is well ... for perhaps a week.

I'd be grateful for any insights anyone may have.













Tags: windows-server, windows-10-network

GaryNebbett answered Blaise-1597 commented

Hello @Blaise-1597,

I think that your problem may be related to other reported problems in this forum and we may be getting closer to an understanding of the root cause. The problem seems to affect older Network Attached Storage devices and Windows 2003 and earlier. I have included part of another post of mine here, adapted to your scenario.

The authoritative source of information about SMB is the Microsoft specification "[MS-SMB2]: Server Message Block (SMB) Protocol Versions 2 and 3".

Section "3.2.4.2.2 Negotiating the Protocol" of this document says:

When a new connection is established, the client MUST negotiate capabilities with the server. The
client MAY<111> use either of two possible methods for negotiation.

The first is a multi-protocol negotiation that involves sending an SMB message to negotiate the use of
SMB2. If the server does not implement the SMB 2 Protocol, this method allows the negotiation to fall
back to older SMB dialects, as specified in [MS-SMB].

The second method is to send an SMB2-Only negotiate message. This method will result in successful
negotiation only for servers that implement the SMB 2 Protocol.

The footnote <111> says:

The Windows-based client will initiate a multi-protocol negotiation unless it
has previously negotiated with this server and the negotiated server's DialectRevision is equal to
0x0202, 0x0210, 0x0300, 0x0302, or 0x0311. In the latter case, it will initiate an SMB2-Only
negotiate.

What might be happening is that your clients are attempting to reconnect to the Windows 2003 PDC (after an interruption in the TCP/IP connection) using an SMB2-Only negotiate message. Windows 2003 seems to just silently ignore this negotiate message, perhaps expecting the first SMB message on a new connection to always be a multi-protocol negotiation.
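The decision described in footnote <111> can be written out as a small Python sketch. This only restates the quoted rule; the function name and the return strings are made up for illustration and are not part of any Windows API.

```python
# Dialect revisions listed in footnote <111> of [MS-SMB2], section 3.2.4.2.2.
SMB2_ONLY_DIALECTS = {0x0202, 0x0210, 0x0300, 0x0302, 0x0311}


def choose_negotiate(remembered_dialect):
    """Return which negotiate message the client would send first.

    remembered_dialect is the DialectRevision the client recorded from an
    earlier negotiation with this server, or None if it has no record.
    (Hypothetical helper: it models the footnote, not real client code.)
    """
    if remembered_dialect in SMB2_ONLY_DIALECTS:
        # The client skips the multi-protocol negotiate entirely.
        return "smb2-only"
    # No usable record: start with the negotiate that can fall back to SMB1.
    return "multi-protocol"


print(choose_negotiate(None))    # first contact with a server
print(choose_negotiate(0x0302))  # server remembered as SMB 3.0.2
```

If the hypothesis above is right, the failing state corresponds to the second call: the client believes the server speaks SMB2/SMB3 and never sends the negotiate that a Windows 2003 server would answer.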

If you issue a "net use" command on a client that can't connect, you might see something like:

 New connections will not be remembered.
    
    
 Status       Local     Remote                    Network
    
 -------------------------------------------------------------------------------
 Disconnected           \\ray\c$                  Microsoft Windows Network
 The command completed successfully.

If you do see that, try issuing the command "net use \\ray\c$ /d" for all of the connections to that server (e.g. "ray"). The reason for doing this is that the client remembers that it negotiated an SMB2/SMB3 connection with the server and skips the multi-protocol negotiation; removing all connections to the server also causes this "remembered" information about protocol capabilities of the server to be deleted.

After this, it should be possible to re-establish connections to the Windows 2003 system.
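If there are several connections to the same server, the list of "net use ... /d" commands can be built from the "net use" output. The following Python sketch is one way to do that, assuming the column layout shown above and share names without spaces; the function names are hypothetical. It only prints the commands and does not run them. For a connection that has a drive letter, it uses the drive letter, which is the usual form for "net use /d".

```python
import re


def connections_for_server(net_use_output, server):
    """Return (local_device_or_None, remote_path) pairs for one server."""
    prefix = "\\\\" + server.lower() + "\\"
    found = []
    for line in net_use_output.splitlines():
        tokens = line.split()
        for i, token in enumerate(tokens):
            if token.lower().startswith(prefix):
                # A drive letter such as "P:" directly before the remote
                # path is the local device; otherwise there is none.
                local = None
                if i > 0 and re.fullmatch(r"[A-Za-z]:", tokens[i - 1]):
                    local = tokens[i - 1]
                found.append((local, token))
                break
    return found


def delete_commands(connections):
    """Build one 'net use ... /d' command per connection."""
    return ["net use {} /d".format(local or remote)
            for local, remote in connections]


sample = r"Disconnected           \\ray\c$                  Microsoft Windows Network"
for command in delete_commands(connections_for_server(sample, "ray")):
    print(command)
```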

Please let us know what experiences you have with this approach - it would help to build a body of evidence for the problem.

Gary


@GaryNebbett was right. I simply removed the mapped drive and the problem has not recurred. Thanks!!

DSPatrick answered

Please put up a new set of files to look at.


Blaise-1597 answered GaryNebbett commented

UPDATE

After following the suggestions from @DSPatrick, the Win 10 machine was fine for a couple of weeks. The issue then started happening again. It seems so random: sometimes it happens once every couple of days, other times a few times in one day. So, although the suggestions from @DSPatrick were all solid, they don't seem to address the cause of my issue.

Looking at the reply from @GaryNebbett I tried the following on the win 10 machine when it was experiencing the problem:

 C:\Users\Pat>net use
 New connections will be remembered.
    
    
 Status       Local     Remote                    Network
    
 -------------------------------------------------------------------------------
 Reconnecting P:        \\BSSMSV02\Pat            Microsoft Windows Network
 The command completed successfully.
    
    
 C:\Users\Pat>dir p:\
 The semaphore timeout period has expired.
    
 C:\Users\Pat>

On this machine, P: is a persistent mapped drive to a share on the Win 2003 PDC. I had totally forgotten it was there. Hmm. I rebooted the machine and got this:

 C:\Users\Pat>net use
 New connections will be remembered.
    
    
 Status       Local     Remote                    Network
    
 -------------------------------------------------------------------------------
 OK           P:        \\BSSMSV02\Pat            Microsoft Windows Network
 The command completed successfully.
    
    
 C:\Users\Pat>

So either the problem also causes the P: drive to disconnect, or maybe the problem is caused by the P: drive mapping getting disconnected. I deleted the mapped drive, and we'll see if the problem continues to occur.

Gary, you may have hit the nail on the head. In retrospect, I should have tried what you suggested (net use \\ray\c$ /d). But I figured that perhaps the scenario you described is occurring because the connection to the mapped drive dropped.

I'll post an update when I have more information.

Thanks.


Blaise-1597 answered FanFan-MSFT commented

@GaryNebbett Thanks for your detailed response - very interesting!

I've just tried suggestions from @DSPatrick, and because the problem is intermittent, I want to wait a while to see what effect his suggestions may have. But I will cycle back to your post to explore and try your suggestions too.


Hi,
Thanks, DSPatrick, for the advice. We will wait for your good news.
Best Regards,

Blaise-1597 answered DSPatrick commented

@DSPatrick Thanks very much for the detailed response. I have carried out your instructions, including getting rid of the old DC metadata remnants.

I upvoted your response and gave it 5 stars for Satisfaction.

Since the problem is intermittent, I won't know if it's resolved for a while. But if it does not recur in a couple of weeks, I'll come back and mark your reply as Answer Accepted.

Thanks again for your time and efforts!



Sounds good, glad to help and you're welcome. If something comes up again in a few weeks I'd suggest starting a new thread. These forums aren't really tailored for ongoing issues.

--please don't forget to upvote and Accept as answer if the reply is helpful--


DSPatrick answered DSPatrick edited

On bssmsv02, remove the public DNS (8.8.8.8).
On Guinan, remove the public DNS (8.8.8.8).
Then run ipconfig /flushdns and ipconfig /registerdns, and restart the Netlogon service.


The domain controller and all domain members must use domain DNS only, so that members can find and log on to the domain. Internet queries are passed to the 13 default root hint servers in a top-down fashion, or optionally to any configured forwarders.

104975-image.png

104964-image.png


There are remnants of a failed domain controller that should be cleaned up.
https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/deploy/ad-ds-metadata-cleanup
https://techcommunity.microsoft.com/t5/itops-talk-blog/step-by-step-manually-removing-a-domain-controller-server/ba-p/280564


--please don't forget to upvote and Accept as answer if the reply is helpful--







Blaise-1597 answered

Miguel,

Note that the server is Windows Server 2003 SP2 (not SP1) - my mistake.

I ran the following on the server

 Microsoft Windows [Version 5.2.3790]
 (C) Copyright 1985-2003 Microsoft Corp.

 nbtstat -c
    
 Local Area Connection:
 Node IpAddress: [10.0.0.35] Scope Id: []
    
                   NetBIOS Remote Cache Name Table
    
         Name              Type       Host Address    Life [sec]
     ------------------------------------------------------------
     ODIN           <20>  UNIQUE          10.0.0.30           590


Sorry, but I don't understand what you mean by
"Try \\server.domain.local to force DNS resolution"

Please pardon my ignorance ... :-)




Blaise-1597 answered

@DSPatrick I have done the following:

On the DC:
Dcdiag /v /c /d /e /s:%computername% >dcdiag.log
repadmin /showrepl >repl.txt
ipconfig /all > dc1.txt

On the problem machine:
ipconfig /all > problemworkstation.txt

We have only one DC, so I did not run the "ipconfig /all > dc2.txt" command.

I hope my understanding of what you want is accurate. Let me know if not.

The files can be viewed here: s!AlMtbw6XuCJanw10jOEFyF32UV6D



Thank you for your help.

Blaise-1597 answered DSPatrick commented

Thank you Miguel and DSPatrick for your responses. I will look at your suggestions, but note that since the problem is intermittent, I won't be able to get useful data until another incidence occurs. I will run the commands DSPatrick suggests now anyway to get a baseline. I'll post as soon as I have collected data when the problem next occurs.


No need to wait, put them up as soon as you get them.

falconitservices answered

Hello,

The UNC path that is using the IP address works so this means it's not a permission or connection problem. It's possibly a resolution problem.

\\Name uses NetBIOS broadcast to resolve, so that may be why the IP address works. Try \\server.domain.local to force DNS resolution, and check the NetBIOS cache using nbtstat -c.
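To make the suggestion concrete: the same share can be addressed by short name, by fully qualified DNS name, or by IP address, and comparing the three shows which form of name resolution is failing. Here is a minimal Python sketch that builds the three paths. The function name is hypothetical, "domain.local" is only a placeholder for the real DNS domain name, and the host, address, and share are taken from the examples earlier in this thread.

```python
def unc_candidates(host, dns_suffix, ip_address, share):
    """Build three UNC forms of the same share for side-by-side testing."""
    return {
        # Short name: may be resolved through NetBIOS.
        "short name": "\\\\{}\\{}".format(host, share),
        # Fully qualified name: contains dots, so it goes through DNS.
        "fqdn": "\\\\{}.{}\\{}".format(host, dns_suffix, share),
        # IP address: no name resolution involved at all.
        "ip": "\\\\{}\\{}".format(ip_address, share),
    }


# Placeholder values; substitute the real server, domain, and share.
paths = unc_candidates("MyServer", "domain.local", "10.0.0.35", "backupfiles")
for label, path in paths.items():
    print("{:<10} dir {}".format(label, path))
```

Running each printed "dir" command on the affected machine while the problem is occurring shows which of the three forms time out.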

Lastly, since the server is so darn old, check whether WINS is enabled and causing problems with stale/tombstoned records.

-Miguel Fra
Falcon IT Services
https://www.falconitservices.com
