win 10 machine can't access network shares on pdc, intermittent

Blaise 41 Reputation points
2021-06-11T21:19:28.46+00:00

I manage a small Windows domain for my home/home office. It consists of a PDC running Windows Server 2003 SP1 (I know! it's old! etc.) and a handful of computers with fixed private IP addresses (10.x.x.x):

3 Win 10-64 Pro

1 Win 8.1-64 Pro

2 Win 7 Pro (one 64-bit, one 32-bit)

1 Win XP-32

My intermittent problem for the past year or so is that one of the Win 10 machines occasionally (maybe 3 times/month) suddenly cannot access any network shares on the PDC. This causes a number of automated backup processes to fail. When this happens:

(machine names and IP addresses changed to protect the guilty! :-)

- it can ping the PDC by name, so DNS is not the issue

C:\Users\XXX>ping MyServer

Pinging MyServer [10.x.x.x] with 32 bytes of data:
Reply from 10.x.x.x: bytes=32 time=1ms TTL=128
Reply from 10.x.x.x: bytes=32 time<1ms TTL=128
Reply from 10.x.x.x: bytes=32 time=1ms TTL=128
Reply from 10.x.x.x: bytes=32 time=1ms TTL=128

Ping statistics for 10.x.x.x:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 1ms, Average = 0ms

- this fails, with a fairly long delay before the error message appears (perhaps 2 minutes):

C:\Users\XXX>dir \\MyServer\backupfiles

The semaphore timeout period has expired.

- and yet this works:

C:\Users\XXX>dir \\10.x.x.x\backupfiles
 Volume in drive \\10.x.x.x\backupfiles is zzzz
 Volume Serial Number is 10C0-79BC

 Directory of \\10.x.x.x\backupfiles

05/09/2021 08:28 AM <DIR> .
05/09/2021 08:28 AM <DIR> ..
05/13/2019 09:53 AM 23 BackupCacheRoot.txt
03/21/2019 12:14 PM 23 BackupRoot-ZZZZ.txt
03/21/2019 12:14 PM 23 BackupRoot.txt
05/25/2019 11:53 AM 23 BackupStagingRoot.txt
07/27/2020 10:43 AM 22 BatchFolder.txt
03/24/2019 09:36 AM 24 BatchJobFolder.txt
05/14/2019 08:13 PM 1,510 GetBackupLoc.bat
7 File(s) 1,648 bytes
2 Dir(s) 4,306,067,456 bytes free

- it can connect to network shares on other machines

The only way I can clear this is by rebooting the Win 10 machine, after which all is well ... for perhaps a week.

I'd be grateful for any insights anyone may have.


Accepted answer
  1. Gary Nebbett 5,631 Reputation points
    2021-06-12T08:47:18.447+00:00

    Hello @Blaise ,

    I think that your problem may be related to other reported problems in this forum and we may be getting closer to an understanding of the root cause. The problem seems to affect older Network Attached Storage devices and Windows 2003 and earlier. I have included part of another post of mine here, adapted to your scenario.

    The authoritative source of information about SMB is the Microsoft specification "[MS-SMB2]: Server Message Block (SMB) Protocol Versions 2 and 3".

    Section "3.2.4.2.2 Negotiating the Protocol" of this document says:

    When a new connection is established, the client MUST negotiate capabilities with the server. The
    client MAY<111> use either of two possible methods for negotiation.

    The first is a multi-protocol negotiation that involves sending an SMB message to negotiate the use of
    SMB2. If the server does not implement the SMB 2 Protocol, this method allows the negotiation to fall
    back to older SMB dialects, as specified in [MS-SMB].

    The second method is to send an SMB2-Only negotiate message. This method will result in successful
    negotiation only for servers that implement the SMB 2 Protocol.

    The footnote <111> says:

    The Windows-based client will initiate a multi-protocol negotiation unless it
    has previously negotiated with this server and the negotiated server's DialectRevision is equal to
    0x0202, 0x0210, 0x0300, 0x0302, or 0x0311. In the latter case, it will initiate an SMB2-Only
    negotiate.

    What might be happening is that your clients are attempting to reconnect to the Windows 2003 PDC (after an interruption in the TCP/IP connection) using an SMB2-Only negotiate message. Windows 2003 seems to just silently ignore this negotiate message, perhaps expecting the first SMB message on a new connection to always be a multi-protocol negotiation.
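The rule in footnote <111> can be sketched as a small decision function. This is only an illustrative Python model of the behaviour the specification describes, not Windows' actual implementation; the function name `choose_negotiate` and the `dialect_cache` dictionary are hypothetical, and keying the cache by the name used in the UNC path is an assumption.

```python
# Dialect revisions listed in footnote <111> of [MS-SMB2].
SMB2_DIALECTS = {0x0202, 0x0210, 0x0300, 0x0302, 0x0311}


def choose_negotiate(server, dialect_cache):
    """Return which negotiate a client following footnote <111> would send.

    dialect_cache is a hypothetical stand-in for whatever the client
    remembers: it maps a server name to the DialectRevision recorded
    from an earlier negotiation with that server.
    """
    remembered = dialect_cache.get(server)
    if remembered in SMB2_DIALECTS:
        return "smb2-only"
    return "multi-protocol"


cache = {}
first_attempt = choose_negotiate("MyServer", cache)

# Suppose the client has somehow recorded an SMB2 dialect for this name.
cache["MyServer"] = 0x0210
reconnect = choose_negotiate("MyServer", cache)

# Forgetting the remembered entry brings back the multi-protocol negotiate.
del cache["MyServer"]
after_cleanup = choose_negotiate("MyServer", cache)

print(first_attempt, reconnect, after_cleanup)
```

If the remembered state really is kept per server name, that would be consistent with the share still being reachable through its IP address, but that is a guess rather than something the specification states.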

    If you issue a "net use" command on a client that can't connect, you might see something like:

    New connections will not be remembered.

    Status       Local     Remote                    Network
    -------------------------------------------------------------------------------
    Disconnected           \\ray\c$                  Microsoft Windows Network
    The command completed successfully.
    

    If you do see that, try issuing the command "net use \\ray\c$ /d" for each of the connections to that server (e.g. "ray"). The reason for doing this is that the client remembers that it negotiated an SMB2/SMB3 connection with the server and skips the multi-protocol negotiation; removing all connections to the server also causes this "remembered" information about the protocol capabilities of the server to be deleted.

    After this, it should be possible to re-establish connections to the Windows 2003 system.
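For a backup job that needs to recover without a reboot, the cleanup could be scripted. The following is a minimal Python sketch, assuming English-language `net use` output in the column layout shown above; the helper names, the `sample` text, and the second (mapped drive) row in it are hypothetical. It only parses the listing and builds the `net use ... /d` commands; `remove_connections` runs them and is deliberately not called here.

```python
import re
import subprocess

# Optional local device (e.g. "Z:") followed by a \\server\share remote.
# Share names containing spaces or wrapped onto a second line are not handled.
_ROW = re.compile(r"(?:\b([A-Za-z]:)\s+)?(\\\\[^\\\s]+\\\S+)")


def connections_to(server, net_use_output):
    """Return (local_device_or_None, remote) pairs that point at server."""
    found = []
    for line in net_use_output.splitlines():
        match = _ROW.search(line)
        if not match:
            continue
        device, remote = match.group(1), match.group(2)
        host = remote[2:].split("\\", 1)[0]
        if host.lower() == server.lower():
            found.append((device, remote))
    return found


def delete_commands(server, net_use_output):
    """Build one 'net use <target> /d' command per connection to server."""
    return [["net", "use", device or remote, "/d"]
            for device, remote in connections_to(server, net_use_output)]


def remove_connections(server):
    """Run 'net use', then delete every listed connection to server."""
    listing = subprocess.run(["net", "use"], capture_output=True, text=True).stdout
    for command in delete_commands(server, listing):
        subprocess.run(command, check=False)


sample = r"""New connections will not be remembered.

Status       Local     Remote                    Network
-------------------------------------------------------------------------------
Disconnected           \\ray\c$                  Microsoft Windows Network
OK           Z:        \\other\share             Microsoft Windows Network
The command completed successfully.
"""

print(delete_commands("ray", sample))
```

Calling `remove_connections("ray")` before the backup starts would drop every listed connection to that server, so the next access has to set up a fresh connection.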

    Please let us know what experiences you have with this approach - it would help to build a body of evidence for the problem.

    Gary


10 additional answers

  1. Dave Patrick 425.8K Reputation points MVP
    2021-06-11T21:33:06.75+00:00

    Please run;

    Dcdiag /v /c /d /e /s:%computername% >C:\dcdiag.log
    repadmin /showrepl >C:\repl.txt
    ipconfig /all > C:\dc1.txt
    ipconfig /all > C:\dc2.txt
    ipconfig /all > C:\problemworkstation.txt

    then put unzipped text files up on OneDrive and share a link.


  2. Dave Patrick 425.8K Reputation points MVP
    2021-06-12T01:47:22.497+00:00

    On bssmsv02 remove the public DNS (8.8.8.8)
    On Guinan remove the public DNS (8.8.8.8)
    Then run ipconfig /flushdns and ipconfig /registerdns, and restart the netlogon service.

    The domain controller and all domain members must use only the domain DNS so that members can find and log on to the domain. Internet queries are passed on to the 13 default root hint servers in a top-down fashion, or optionally to any configured forwarders.
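One way to check the saved `ipconfig /all` files for this is sketched below in Python. It assumes English-language output where the first address follows the "DNS Servers" label and further addresses appear alone on the following lines; the function names and the sample addresses other than 8.8.8.8 are hypothetical. Any configured DNS server that is not in a private range is reported.

```python
import ipaddress


def dns_servers(ipconfig_output):
    """Collect the addresses listed under every 'DNS Servers' entry."""
    servers = []
    in_dns_block = False
    for line in ipconfig_output.splitlines():
        if "DNS Servers" in line and ":" in line:
            candidate = line.split(":", 1)[1].strip()
            in_dns_block = True
        elif in_dns_block:
            candidate = line.strip()
        else:
            continue
        try:
            # Drop any IPv6 zone suffix such as "%12" before parsing.
            servers.append(ipaddress.ip_address(candidate.split("%")[0]))
        except ValueError:
            # Not an address: the DNS server list has ended.
            in_dns_block = False
    return servers


def public_dns_servers(ipconfig_output):
    """Return configured DNS servers that lie outside private address space."""
    return [str(ip) for ip in dns_servers(ipconfig_output) if not ip.is_private]


# Hypothetical fragment of 'ipconfig /all' output for illustration.
sample = """\
   IPv4 Address. . . . . . . . . . . : 10.0.0.15(Preferred)
   DNS Servers . . . . . . . . . . . : 10.0.0.2
                                       8.8.8.8
   NetBIOS over Tcpip. . . . . . . . : Enabled
"""

print(public_dns_servers(sample))
```

An empty result for the domain controller and for each member would match the advice above; a private address is not proof that the server is the domain's DNS, so this is only a first filter.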

    104975-image.png

    104964-image.png

    There are remnants of a failed domain controller that should be cleaned up.
    https://learn.microsoft.com/en-us/windows-server/identity/ad-ds/deploy/ad-ds-metadata-cleanup
    https://techcommunity.microsoft.com/t5/itops-talk-blog/step-by-step-manually-removing-a-domain-controller-server/ba-p/280564


  3. Falcon IT Services 286 Reputation points
    2021-06-11T21:44:38.007+00:00

    Hello,

    The UNC path that uses the IP address works, so this is not a permission or connection problem. It is possibly a name resolution problem.

    A \\Name path may be resolved by NetBIOS broadcast, so that may be why the IP address works. Try \\server.domain.local to force DNS resolution, and check the NetBIOS cache using nbtstat -c.

    Lastly, since the server is so darn old, check whether WINS is enabled and causing problems with stale/tombstoned records.

    -Miguel Fra
    Falcon IT Services
    https://www.falconitservices.com


  4. Blaise 41 Reputation points
    2021-06-12T00:41:02.097+00:00

    Thank you Miguel and DSPatrick for your responses. I will look at your suggestions, but note that since the problem is intermittent, I won't be able to get useful data until another incident occurs. I will run the commands DSPatrick suggests now anyway to get a baseline. I'll post as soon as I have collected data when the problem next occurs.