Troubleshoot the Windows Server Software Defined Networking Stack

Applies To: Windows Server (Semi-Annual Channel), Windows Server 2016

This guide examines the common Software Defined Networking (SDN) errors and failure scenarios and outlines a troubleshooting workflow that leverages the available diagnostic tools.

For more information about Microsoft's Software Defined Networking, see Software Defined Networking.

Error types

The following list represents the class of problems most often seen with Hyper-V Network Virtualization (HNVv1) in Windows Server 2012 R2 in-market production deployments, and coincides in many ways with the same types of problems seen with the new Software Defined Networking (SDN) stack (HNVv2) in Windows Server 2016.

Most errors can be classified into a small set of categories:

  • Invalid or unsupported configuration
    A user invokes the NorthBound API incorrectly or with invalid policy.

  • Error in policy application
    Policy from the Network Controller was not delivered to a Hyper-V host, was significantly delayed, and/or is not up to date on all Hyper-V hosts (for example, after a Live Migration).

  • Configuration drift or software bug
    Data-path issues resulting in dropped packets.

  • External error related to NIC hardware / drivers or the underlay network fabric
    Misbehaving task offloads (such as VMQ) or a misconfigured underlay network fabric (such as MTU)

This troubleshooting guide examines each of these error categories and recommends best practices and diagnostic tools available to identify and fix the error.

Diagnostic tools

Before discussing the troubleshooting workflows for each type of error, let's examine the available diagnostic tools.

To use the Network Controller (control-path) diagnostic tools, you must first install the RSAT-NetworkController feature and import the NetworkControllerDiagnostics module:

Add-WindowsFeature RSAT-NetworkController -IncludeManagementTools  
Import-Module NetworkControllerDiagnostics  

To use the HNV Diagnostics (data-path) diagnostic tools, you must import the HNVDiagnostics module:

# Assumes RSAT-NetworkController feature has already been installed
Import-Module hnvdiagnostics   

Network controller diagnostics

These cmdlets are documented on TechNet in the Network Controller Diagnostics Cmdlet topic. They help identify problems with network policy consistency in the control path between Network Controller nodes and between the Network Controller and the NC Host Agents running on the Hyper-V hosts.

The Debug-ServiceFabricNodeStatus and Get-NetworkControllerReplica cmdlets must be run from one of the Network Controller node virtual machines. All other NC diagnostic cmdlets can be run from any host which has connectivity to the Network Controller and is either in the Network Controller Management security group (Kerberos) or has access to the X.509 certificate for managing the Network Controller.

Hyper-V host diagnostics

These cmdlets are documented on TechNet in the Hyper-V Network Virtualization (HNV) Diagnostics Cmdlet topic. They help identify problems in the data path between tenant virtual machines (East/West) and ingress traffic through an SLB VIP (North/South).

The Debug-VirtualMachineQueueOperation, Get-CustomerRoute, Get-PACAMapping, Get-ProviderAddress, Get-VMNetworkAdapterPortId, Get-VMSwitchExternalPortId, and Test-EncapOverheadSettings cmdlets are all local tests which can be run from any Hyper-V host. The other cmdlets invoke data-path tests through the Network Controller and therefore need access to the Network Controller as described above.

GitHub

The Microsoft/SDN GitHub repo has a number of sample scripts and workflows which build on top of these in-box cmdlets. In particular, diagnostic scripts can be found in the Diagnostics folder. Please help us contribute to these scripts by submitting Pull Requests.

Troubleshooting Workflows and Guides

[Hoster] Validate System Health

There is an embedded resource named Configuration State in several of the Network Controller resources. Configuration state provides information about system health, including the consistency between the Network Controller's configuration and the actual (running) state on the Hyper-V hosts.

To check configuration state, run the following from any Hyper-V host with connectivity to the Network Controller.

Note

The value for the NetworkController parameter should be either the FQDN or IP address, based on the subject name of the X.509 certificate created for the Network Controller.

The Credential parameter only needs to be specified if the Network Controller is using Kerberos authentication (typical in VMM deployments). The credential must be for a user who is in the Network Controller Management security group.

Debug-NetworkControllerConfigurationState -NetworkController <FQDN or NC IP> [-Credential <PS Credential>]

# Healthy State Example - no status reported
$cred = Get-Credential
Debug-NetworkControllerConfigurationState -NetworkController 10.127.132.211 -Credential $cred

Fetching ResourceType:     accessControlLists
Fetching ResourceType:     servers
Fetching ResourceType:     virtualNetworks
Fetching ResourceType:     networkInterfaces
Fetching ResourceType:     virtualGateways
Fetching ResourceType:     loadbalancerMuxes
Fetching ResourceType:     Gateways

A sample Configuration State message is shown below:

Fetching ResourceType:     servers
---------------------------------------------------------------------------------------------------------
ResourcePath:     https://10.127.132.211/Networking/v1/servers/4c4c4544-0056-4b10-8058-b8c04f395931
Status:           Warning

Source:           SoftwareLoadBalancerManager
Code:             HostNotConnectedToController
Message:          Host is not Connected.
----------------------------------------------------------------------------------------------------------

Note

There is a bug in the system where the Network Interface resources for the SLB Mux Transit VM NIC are in a Failure state with error "Virtual Switch - Host Not Connected To Controller". This error can be safely ignored if the IP configuration in the VM NIC resource is set to an IP address from the Transit Logical Network's IP pool. There is a second bug in the system where the Network Interface resources for the Gateway HNV Provider VM NICs are in a Failure state with error "Virtual Switch - PortBlocked". This error can also be safely ignored if the IP configuration in the VM NIC resource is set to null (by design).

The table below shows the list of error codes, messages, and follow-up actions to take based on the configuration state observed.

Code | Message | Action
---- | ------- | ------
Unknown | Unknown error |
HostUnreachable | The host machine is not reachable | Check the Management network connectivity between the Network Controller and the host
PAIpAddressExhausted | The PA IP addresses are exhausted | Increase the HNV Provider logical subnet's IP pool size
PAMacAddressExhausted | The PA MAC addresses are exhausted | Increase the MAC pool range
PAAddressConfigurationFailure | Failed to plumb PA addresses to the host | Check the Management network connectivity between the Network Controller and the host.
CertificateNotTrusted | Certificate is not trusted | Fix the certificates used for communication with the host.
CertificateNotAuthorized | Certificate not authorized | Fix the certificates used for communication with the host.
PolicyConfigurationFailureOnVfp | Failure in configuring VFP policies | This is a runtime failure. No definite workarounds. Collect logs.
PolicyConfigurationFailure | Failure in pushing policies to the hosts, due to communication failures or other errors in the Network Controller. | No definite actions. This is due to a failure in goal state processing in the Network Controller modules. Collect logs.
HostNotConnectedToController | The host is not yet connected to the Network Controller | The Port Profile was not applied on the host, or the host is not reachable from the Network Controller. Validate that the HostID registry key matches the Instance ID of the server resource
MultipleVfpEnabledSwitches | There are multiple VFP-enabled switches on the host | Delete one of the switches, since the Network Controller Host Agent only supports one vSwitch with the VFP extension enabled
PolicyConfigurationFailure | Failed to push VNet policies for a VM NIC due to certificate errors or connectivity errors | Check if proper certificates have been deployed (the certificate subject name must match the FQDN of the host). Also verify the host's connectivity with the Network Controller
PolicyConfigurationFailure | Failed to push vSwitch policies for a VM NIC due to certificate errors or connectivity errors | Check if proper certificates have been deployed (the certificate subject name must match the FQDN of the host). Also verify the host's connectivity with the Network Controller
PolicyConfigurationFailure | Failed to push firewall policies for a VM NIC due to certificate errors or connectivity errors | Check if proper certificates have been deployed (the certificate subject name must match the FQDN of the host). Also verify the host's connectivity with the Network Controller
DistributedRouterConfigurationFailure | Failed to configure the Distributed Router settings on the host vNIC | TCPIP stack error. May require cleaning up the PA and DR host vNICs on the server on which this error was reported
DhcpAddressAllocationFailure | DHCP address allocation failed for a VM NIC | Check if the static IP address attribute is configured on the NIC resource
CertificateNotTrusted / CertificateNotAuthorized | Failed to connect to the Mux due to network or cert errors | Check the numeric code provided in the error message code: this corresponds to the winsock error code. Certificate errors are granular (for example, cert cannot be verified, cert not authorized, etc.)
HostUnreachable | MUX is unhealthy (the common case is BGPRouter disconnected) | The BGP peer on the RRAS (BGP virtual machine) or Top-of-Rack (ToR) switch is unreachable or not peering successfully. Check the BGP settings on both the Software Load Balancer Multiplexer resource and the BGP peer (ToR or RRAS virtual machine)
HostNotConnectedToController | SLB host agent is not connected | Check that the SLB Host Agent service is running; refer to the SLB host agent logs (auto running) for reasons why; in case SLBM (NC) rejected the cert presented by the host agent, the running state will show nuanced information
PortBlocked | The VFP port is blocked, due to lack of VNET / ACL policies | Check if there are any other errors which might cause the policies to not be configured.
Overloaded | Loadbalancer MUX is overloaded | Performance issue with the MUX
RoutePublicationFailure | Loadbalancer MUX is not connected to a BGP router | Check if the MUX has connectivity with the BGP routers and that BGP peering is set up correctly
VirtualServerUnreachable | Loadbalancer MUX is not connected to the SLB manager | Check connectivity between SLBM and the MUX
QosConfigurationFailure | Failed to configure QOS policies | See if sufficient bandwidth is available for all VMs if QOS reservation is used

Check network connectivity between the Network Controller and the Hyper-V host (NC Host Agent service)

Run the netstat command below to validate that there are three ESTABLISHED connections between the NC Host Agent and the Network Controller node(s) and one LISTENING socket on the Hyper-V host.

  • LISTENING on port TCP:6640 on the Hyper-V host (NC Host Agent service)
  • Two established connections from the Hyper-V host IP on port 6640 to the NC node IPs on ephemeral ports (> 32000)
  • One established connection from the Hyper-V host IP on an ephemeral port to the Network Controller REST IP on port 6640

Note

There may only be two established connections on a Hyper-V host if there are no tenant virtual machines deployed on that particular host.

netstat -anp tcp |findstr 6640

# Successful output
  TCP    0.0.0.0:6640           0.0.0.0:0              LISTENING
  TCP    10.127.132.153:6640    10.127.132.213:50095   ESTABLISHED
  TCP    10.127.132.153:6640    10.127.132.214:62514   ESTABLISHED
  TCP    10.127.132.153:50023   10.127.132.211:6640    ESTABLISHED

Check Host Agent services

The Network Controller communicates with two host agent services on the Hyper-V hosts: SLB Host Agent and NC Host Agent. It is possible that one or both of these services is not running. Check their state and restart them if they're not running.

Get-Service SlbHostAgent
Get-Service NcHostAgent

# (Re)start requires -Force flag
Start-Service NcHostAgent -Force
Start-Service SlbHostAgent -Force

Check health of network controller

If there are not three ESTABLISHED connections or if the Network Controller appears unresponsive, check to see that all nodes and service modules are up and running by using the following cmdlets.

# Prints a DIFF state (status is automatically updated if state is changed) of a particular service module replica 
Debug-ServiceFabricNodeStatus [-ServiceTypeName] <Service Module>

The network controller service modules are:

  • ControllerService
  • ApiService
  • SlbManagerService
  • ServiceInsertion
  • FirewallService
  • VSwitchService
  • GatewayManager
  • FnmService
  • HelperService
  • UpdateService

Check that ReplicaStatus is Ready and HealthState is Ok.
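For example, a minimal invocation using the ApiService module from the list above (run on a Network Controller node VM):

# Print the replica status of the API service module
Debug-ServiceFabricNodeStatus -ServiceTypeName ApiService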

In a production deployment with a multi-node Network Controller, you can also check which node each service is primary on and its individual replica status.

Get-NetworkControllerReplica

# Sample Output for the API service module
Replicas for service: ApiService

ReplicaRole   : Primary
NodeName      : SA18N30NC3.sa18.nttest.microsoft.com
ReplicaStatus : Ready

Check that the Replica Status is Ready for each service.

Check for corresponding HostIDs and certificates between the Network Controller and each Hyper-V host

On a Hyper-V host, run the following commands to check that the HostID corresponds to the Instance Id of a server resource on the Network Controller.

Get-ItemProperty "hklm:\system\currentcontrolset\services\nchostagent\parameters" -Name HostId |fl HostId

HostId : 162cd2c8-08d4-4298-8cb4-10c2977e3cfe

Get-NetworkControllerServer -ConnectionUri $uri |where { $_.InstanceId -eq "162cd2c8-08d4-4298-8cb4-10c2977e3cfe"}

Tags             :
ResourceRef      : /servers/4c4c4544-0056-4a10-8059-b8c04f395931
InstanceId       : 162cd2c8-08d4-4298-8cb4-10c2977e3cfe
Etag             : W/"50f89b08-215c-495d-8505-0776baab9cb3"
ResourceMetadata : Microsoft.Windows.NetworkController.ResourceMetadata
ResourceId       : 4c4c4544-0056-4a10-8059-b8c04f395931
Properties       : Microsoft.Windows.NetworkController.ServerProperties

Remediation: If using SDNExpress scripts or manual deployment, update the HostId key in the registry to match the Instance Id of the server resource, then restart the Network Controller Host Agent on the Hyper-V host (physical server). If using VMM, delete the Hyper-V server from VMM and remove the HostId registry key. Then, re-add the server through VMM.

Check that the thumbprints of the X.509 certificates used by the Hyper-V host (the hostname will be the cert's subject name) for (SouthBound) communication between the Hyper-V host (NC Host Agent service) and the Network Controller nodes are the same. Also check that the Network Controller's REST certificate has a subject name of CN=<FQDN or IP>.

# On Hyper-V Host
dir cert:\\localmachine\my  

Thumbprint                                Subject
----------                                -------
2A3A674D07D8D7AE11EBDAC25B86441D68D774F9  CN=SA18n30-4.sa18.nttest.microsoft.com
...

dir cert:\\localmachine\root

Thumbprint                                Subject
----------                                -------
30674C020268AA4E40FD6817BA6966531FB9ADA4  CN=10.127.132.211   # NC REST IP ADDRESS

# On Network Controller Node VM
dir cert:\\localmachine\root  

Thumbprint                                Subject
----------                                -------
2A3A674D07D8D7AE11EBDAC25B86441D68D774F9  CN=SA18n30-4.sa18.nttest.microsoft.com
30674C020268AA4E40FD6817BA6966531FB9ADA4  CN=10.127.132.211   # NC REST IP ADDRESS
...

You can also check the following parameters of each cert to make sure the subject name is what is expected (hostname, or NC REST FQDN or IP), that the certificate has not yet expired, and that all certificate authorities in the certificate chain are included in the trusted root authorities (see the sketch after this list).

  • Subject Name
  • Expiration Date
  • Trusted by Root Authority
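A minimal sketch of these checks in PowerShell (the thumbprint below reuses the host certificate from the sample output above; substitute your own):

# Inspect subject name and expiration for the host certificate (thumbprint from the sample above)
$cert = Get-Item Cert:\LocalMachine\My\2A3A674D07D8D7AE11EBDAC25B86441D68D774F9
$cert | Format-List Subject, NotBefore, NotAfter

# Verify that the certificate chains to a trusted root authority
Test-Certificate -Cert $cert -Policy SSL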

Remediation: If multiple certificates have the same subject name on the Hyper-V host, the Network Controller Host Agent will randomly choose one to present to the Network Controller. This may not match the thumbprint of the server resource known to the Network Controller. In this case, delete one of the certificates with the same subject name on the Hyper-V host and then restart the Network Controller Host Agent service. If a connection still cannot be made, delete the other certificate with the same subject name on the Hyper-V host and delete the corresponding server resource in VMM. Then, re-create the server resource in VMM, which will generate a new X.509 certificate and install it on the Hyper-V host.

Check the SLB Configuration State

The SLB Configuration State can be determined as part of the output of the Debug-NetworkController cmdlet. This cmdlet will also output the current set of Network Controller resources in JSON files, all IP configurations from each Hyper-V host (server), and the local network policy from the Host Agent database tables.

Additional traces will be collected by default. To not collect traces, add the -IncludeTraces:$false parameter.

Debug-NetworkController -NetworkController <FQDN or IP> [-Credential <PS Credential>] [-IncludeTraces:$false]

# Don't collect traces
$cred = Get-Credential 
Debug-NetworkController -NetworkController 10.127.132.211 -Credential $cred -IncludeTraces:$false

Transcript started, output file is C:\\NCDiagnostics.log
Collecting Diagnostics data from NC Nodes

Note

The default output location will be the <working_directory>\NCDiagnostics\ directory. The default output directory can be changed by using the -OutputDirectory parameter.

The SLB Configuration State information can be found in the diagnostics-slbstateResults.Json file in this directory.

This JSON file can be broken down into the following sections:

  • Fabric
    • SlbmVips - This section lists the IP address of the SLB Manager VIP address, which is used by the Network Controller to coordinate configuration and health between the SLB Muxes and SLB Host Agents.
    • MuxState - This section lists one value for each SLB Mux deployed, giving the state of the mux.
    • Router Configuration - This section lists the upstream router's (BGP peer) Autonomous System Number (ASN), transit IP address, and ID. It also lists the SLB Muxes' ASN and transit IP.
    • Connected Host Info - This section lists the management IP addresses of all the Hyper-V hosts available to run load-balanced workloads.
    • Vip Ranges - This section lists the public and private VIP IP pool ranges. The SLBM VIP will be included as an allocated IP from one of these ranges.
    • Mux Routes - This section lists one value for each SLB Mux deployed, containing all of the route advertisements for that particular mux.
  • Tenant
    • VipConsolidatedState - This section lists the connectivity state for each tenant VIP, including the advertised route prefix, Hyper-V host, and DIP endpoints.

Note

SLB state can be ascertained directly by using the DumpSlbRestState script available on the Microsoft SDN GitHub repository.

Gateway Validation

From Network Controller:

Get-NetworkControllerLogicalNetwork
Get-NetworkControllerPublicIPAddress
Get-NetworkControllerGatewayPool
Get-NetworkControllerGateway
Get-NetworkControllerVirtualGateway
Get-NetworkControllerNetworkInterface

From Gateway VM:

Ipconfig /allcompartments /all
Get-NetRoute -IncludeAllCompartments -AddressFamily
Get-NetBgpRouter
Get-NetBgpRouter | Get-BgpPeer
Get-NetBgpRouter | Get-BgpRouteInformation

From Top of Rack (ToR) Switch:

sh ip bgp summary (for 3rd party BGP Routers)

Windows BGP Router

Get-BgpRouter
Get-BgpPeer
Get-BgpRouteInformation

In addition to these, from the issues we have seen so far (especially on SDNExpress-based deployments), the most common reason for the Tenant Compartment not getting configured on GW VMs seems to be the fact that the GW capacity in FabricConfig.psd1 is less than what folks try to assign to the network connections (S2S tunnels) in TenantConfig.psd1. This can be checked easily by comparing the outputs of the following commands:

PS > (Get-NetworkControllerGatewayPool -ConnectionUri $uri).properties.Capacity
PS > (Get-NetworkControllerVirtualgatewayNetworkConnection -ConnectionUri $uri -VirtualGatewayId "TenantName").properties.OutboundKiloBitsPerSecond
PS > (Get-NetworkControllerVirtualgatewayNetworkConnection -ConnectionUri $uri -VirtualGatewayId "TenantName").property

[Hoster] Validate Data-Plane

After the Network Controller has been deployed, tenant virtual networks and subnets have been created, and VMs have been attached to the virtual subnets, additional fabric-level tests can be performed by the hoster to check tenant connectivity.

Check HNV Provider Logical Network Connectivity

After the first guest VM running on a Hyper-V host has been connected to a tenant virtual network, the Network Controller will assign two HNV Provider IP addresses (PA IP addresses) to the Hyper-V host. These IPs will come from the HNV Provider logical network's IP pool and be managed by the Network Controller. To find out what these two HNV IP addresses are:

PS > Get-ProviderAddress

# Sample Output
ProviderAddress : 10.10.182.66
MAC Address     : 40-1D-D8-B7-1C-04
Subnet Mask     : 255.255.255.128
Default Gateway : 10.10.182.1
VLAN            : VLAN11

ProviderAddress : 10.10.182.67
MAC Address     : 40-1D-D8-B7-1C-05
Subnet Mask     : 255.255.255.128
Default Gateway : 10.10.182.1
VLAN            : VLAN11

These HNV Provider IP addresses (PA IPs) are assigned to Ethernet adapters created in a separate TCPIP network compartment and have an adapter name of VLANX, where X is the VLAN assigned to the HNV Provider (transport) logical network.

Connectivity between two Hyper-V hosts using the HNV Provider logical network can be checked with a ping that carries an additional compartment (-c Y) parameter, where Y is the TCPIP network compartment in which the PAhostVNICs are created. This compartment can be determined by executing:

C:\> ipconfig /allcompartments /all

<snip> ...
==============================================================================
Network Information for Compartment 3
==============================================================================
   Host Name . . . . . . . . . . . . : SA18n30-2
<snip> ...

Ethernet adapter VLAN11:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Microsoft Hyper-V Network Adapter
   Physical Address. . . . . . . . . : 40-1D-D8-B7-1C-04
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::5937:a365:d135:2899%39(Preferred)
   IPv4 Address. . . . . . . . . . . : 10.10.182.66(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.128
   Default Gateway . . . . . . . . . : 10.10.182.1
   NetBIOS over Tcpip. . . . . . . . : Disabled

Ethernet adapter VLAN11:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Microsoft Hyper-V Network Adapter
   Physical Address. . . . . . . . . : 40-1D-D8-B7-1C-05
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::28b3:1ab1:d9d9:19ec%44(Preferred)
   IPv4 Address. . . . . . . . . . . : 10.10.182.67(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.128
   Default Gateway . . . . . . . . . : 10.10.182.1
   NetBIOS over Tcpip. . . . . . . . : Disabled

Ethernet adapter vEthernet (PAhostVNic):
<snip> ...

Note

The PA host vNIC adapters are not used in the data path and so do not have an IP assigned to the "vEthernet (PAhostVNic)" adapter.

For instance, assume that Hyper-V hosts 1 and 2 have HNV Provider (PA) IP addresses of:

Hyper-V Host | PA IP Address 1 | PA IP Address 2
------------ | --------------- | ---------------
Host 1       | 10.10.182.64    | 10.10.182.65
Host 2       | 10.10.182.66    | 10.10.182.67

We can ping between the two using the following commands to check HNV Provider logical network connectivity.

# Ping the first PA IP Address on Hyper-V Host 2 from the first PA IP address on Hyper-V Host 1 in compartment (-c) 3
C:\> ping -c 3 10.10.182.66 -S 10.10.182.64

# Ping the second PA IP Address on Hyper-V Host 2 from the first PA IP address on Hyper-V Host 1 in compartment (-c) 3
C:\> ping -c 3 10.10.182.67 -S 10.10.182.64

# Ping the first PA IP Address on Hyper-V Host 2 from the second PA IP address on Hyper-V Host 1 in compartment (-c) 3
C:\> ping -c 3 10.10.182.66 -S 10.10.182.65

# Ping the second PA IP Address on Hyper-V Host 2 from the second PA IP address on Hyper-V Host 1 in compartment (-c) 3
C:\> ping -c 3 10.10.182.67 -S 10.10.182.65

Remediation: If the HNV Provider ping does not work, check your physical network connectivity, including the VLAN configuration. The physical NICs on each Hyper-V host should be in trunk mode with no specific VLAN assigned. The Management Host vNIC should be isolated to the Management logical network's VLAN.

PS C:\> Get-NetAdapter "Ethernet 4" |fl

Name                       : Ethernet 4
InterfaceDescription       : <NIC> Ethernet Adapter
InterfaceIndex             : 2
MacAddress                 : F4-52-14-55-BC-21
MediaType                  : 802.3
PhysicalMediaType          : 802.3
InterfaceOperationalStatus : Up
AdminStatus                : Up
LinkSpeed(Gbps)            : 10
MediaConnectionState       : Connected
ConnectorPresent           : True
VlanID                     : 0
DriverInformation          : Driver Date 2016-08-28 Version 5.25.12665.0 NDIS 6.60

# VMM uses the older PowerShell cmdlet <Verb>-VMNetworkAdapterVlan to set VLAN isolation
PS C:\> Get-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName <Mgmt>

VMName VMNetworkAdapterName Mode     VlanList
------ -------------------- ----     --------
<snip> ...        
       Mgmt                 Access   7
<snip> ...

# SDNExpress deployments use the newer PowerShell cmdlet <Verb>-VMNetworkAdapterIsolation to set VLAN isolation
PS C:\> Get-VMNetworkAdapterIsolation -ManagementOS

<snip> ...

IsolationMode        : Vlan
AllowUntaggedTraffic : False
DefaultIsolationID   : 7
MultiTenantStack     : Off
ParentAdapter        : VMInternalNetworkAdapter, Name = 'Mgmt'
IsTemplate           : True
CimSession           : CimSession: .
ComputerName         : SA18N30-2
IsDeleted            : False

<snip> ...

Check MTU and Jumbo Frame support on the HNV Provider Logical Network

Another common problem in the HNV Provider logical network is that the physical network ports and/or Ethernet card do not have a large enough MTU configured to handle the overhead from VXLAN (or NVGRE) encapsulation.

Note

Some Ethernet cards and drivers support the new *EncapOverhead keyword, which will automatically be set by the Network Controller Host Agent to a value of 160. This value will then be added to the value of the *JumboPacket keyword, whose summation is used as the advertised MTU (e.g. *EncapOverhead = 160 and *JumboPacket = 1514 => MTU = 1674B).

# Check whether or not your Ethernet card and driver support *EncapOverhead
PS C:\ > Test-EncapOverheadSettings

Verifying Physical Nic : <NIC> Ethernet Adapter #2
Physical Nic  <NIC> Ethernet Adapter #2 can support SDN traffic. Encapoverhead value set on the nic is  160
Verifying Physical Nic : <NIC> Ethernet Adapter
Physical Nic  <NIC> Ethernet Adapter can support SDN traffic. Encapoverhead value set on the nic is  160

To test whether or not the HNV Provider logical network supports the larger MTU size end-to-end, use the Test-LogicalNetworkSupportsJumboPacket cmdlet:

# Get credentials for both source host and destination host (or use the same credential if in the same domain)
$sourcehostcred = Get-Credential
$desthostcred = Get-Credential

# Use the Management IP Address or FQDN of the Source and Destination Hyper-V hosts
Test-LogicalNetworkSupportsJumboPacket -SourceHost sa18n30-2 -DestinationHost sa18n30-3 -SourceHostCreds $sourcehostcred -DestinationHostCreds $desthostcred

# Failure Results
SourceCompartment : 3
pinging Source PA: 10.10.182.66 to Destination PA: 10.10.182.64 with Payload: 1632
pinging Source PA: 10.10.182.66 to Destination PA: 10.10.182.64 with Payload: 1472
Checking if physical nics support jumbo packets on host
Physical Nic  <NIC> Ethernet Adapter #2 can support SDN traffic. Encapoverhead value set on the nic is  160
Cannot send jumbo packets to the destination. Physical switch ports may not be configured to support jumbo packets.
Checking if physical nics support jumbo packets on host
Physical Nic  <NIC> Ethernet Adapter #2 can support SDN traffic. Encapoverhead value set on the nic is  160
Cannot send jumbo packets to the destination. Physical switch ports may not be configured to support jumbo packets.

# TODO: Success results after updating MTU on physical switch ports

Remediation

  • Adjust the MTU size on the physical switch ports to be at least 1674B (including the 14B Ethernet header and trailer)
  • If your NIC card does not support the EncapOverhead keyword, adjust the JumboPacket keyword to be at least 1674B (see the sketch after this list)
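A minimal sketch for the second bullet, assuming an adapter named "Ethernet 4" as in the earlier example; note that valid *JumboPacket values are driver-specific:

# Check the current *JumboPacket value on the physical NIC (adapter name is an example)
Get-NetAdapterAdvancedProperty -Name "Ethernet 4" -RegistryKeyword "*JumboPacket"

# Raise it to at least 1674B when *EncapOverhead is not supported
Set-NetAdapterAdvancedProperty -Name "Ethernet 4" -RegistryKeyword "*JumboPacket" -RegistryValue 1674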

Check Tenant VM NIC connectivity

Each VM NIC assigned to a guest VM has a CA-PA mapping between the private Customer Address (CA) and the HNV Provider Address (PA) space. These mappings are kept in the OVSDB server tables on each Hyper-V host and can be found by executing the following cmdlet.

# Get all known PA-CA Mappings from this particular Hyper-V Host
PS > Get-PACAMapping

CA IP Address CA MAC Address    Virtual Subnet ID PA IP Address
------------- --------------    ----------------- -------------
10.254.254.2  00-1D-D8-B7-1C-43              4115 10.10.182.67
10.254.254.3  00-1D-D8-B7-1C-43              4115 10.10.182.67
192.168.1.5   00-1D-D8-B7-1C-07              4114 10.10.182.65
10.254.254.1  40-1D-D8-B7-1C-06              4115 10.10.182.66
192.168.1.1   40-1D-D8-B7-1C-06              4114 10.10.182.66
192.168.1.4   00-1D-D8-B7-1C-05              4114 10.10.182.66

Note

If the CA-PA mappings you expect are not output for a given tenant VM, please check the VM NIC and IP Configuration resources on the Network Controller using the Get-NetworkControllerNetworkInterface cmdlet. Also, check the established connections between the NC Host Agent and the Network Controller nodes.

With this information, a tenant VM ping can now be initiated by the hoster from the Network Controller using the Test-VirtualNetworkConnection cmdlet.

Specific Troubleshooting Scenarios

The following sections provide guidance for troubleshooting specific scenarios.

No network connectivity between two tenant virtual machines

  1. [Tenant] Ensure Windows Firewall in the tenant virtual machines is not blocking traffic.
  2. [Tenant] Check that IP addresses have been assigned to the tenant virtual machine by running ipconfig.
  3. [Hoster] Run Test-VirtualNetworkConnection from the Hyper-V host to validate connectivity between the two tenant virtual machines in question.

Note

The VSID refers to the Virtual Subnet ID. In the case of VXLAN, this is the VXLAN Network Identifier (VNI). You can find this value by running the Get-PACAMapping cmdlet.

Example

$password = ConvertTo-SecureString -String "password" -AsPlainText -Force
$cred = New-Object pscredential -ArgumentList (".\administrator", $password) 

Create a CA-ping between "Green Web VM 1" with SenderCA IP of 192.168.1.4 on host "sa18n30-2.sa18.nttest.microsoft.com" (Mgmt IP of 10.127.132.153) to ListenerCA IP of 192.168.1.5, both attached to Virtual Subnet (VSID) 4114.

Test-VirtualNetworkConnection -OperationId 27 -HostName sa18n30-2.sa18.nttest.microsoft.com -MgmtIp 10.127.132.153 -Creds $cred -VMName "Green Web VM 1" -VMNetworkAdapterName "Green Web VM 1" -SenderCAIP 192.168.1.4 -SenderVSID 4114 -ListenerCAIP 192.168.1.5 -ListenerVSID 4114

Test-VirtualNetworkConnection at command pipeline position 1

Starting CA-space ping test
Starting trace session
Ping to 192.168.1.5 succeeded from address 192.168.1.4
Rtt = 0 ms

CA Routing Information:

Local IP: 192.168.1.4
Local VSID: 4114
Remote IP: 192.168.1.5
Remote VSID: 4114
Distributed Router Local IP: 192.168.1.1
Distributed Router Local MAC: 40-1D-D8-B7-1C-06
Local CA MAC: 00-1D-D8-B7-1C-05
Remote CA MAC: 00-1D-D8-B7-1C-07
Next Hop CA MAC Address: 00-1D-D8-B7-1C-07

PA Routing Information:

Local PA IP: 10.10.182.66
Remote PA IP: 10.10.182.65

......

  4. [Tenant] Check that there are no distributed firewall policies specified on the virtual subnet or VM network interfaces which would block traffic.

Query the Network Controller REST API, found in the demo environment at sa18n30nc in the sa18.nttest.microsoft.com domain.

$uri = "https://sa18n30nc.sa18.nttest.microsoft.com"
Get-NetworkControllerAccessControlList -ConnectionUri $uri 

Look at the IP configurations and virtual subnets which are referencing this ACL.

  5. [Hoster] Run Get-ProviderAddress on both Hyper-V hosts hosting the two tenant virtual machines in question and then run Test-LogicalNetworkConnection or ping -c <compartment> from the Hyper-V host to validate connectivity on the HNV Provider logical network.
  6. [Hoster] Ensure that the MTU settings are correct on the Hyper-V hosts and any Layer-2 switching devices in between the Hyper-V hosts. Run Test-EncapOverheadValue on all Hyper-V hosts in question. Also check that all Layer-2 switches in between have the MTU set to at least 1674 bytes to account for the maximum overhead of 160 bytes.
  7. [Hoster] If PA IP addresses are not present and/or CA connectivity is broken, check to ensure network policy has been received. Run Get-PACAMapping to see if the encapsulation rules and CA-PA mappings required for creating overlay virtual networks are correctly established.
  8. [Hoster] Check that the Network Controller Host Agent is connected to the Network Controller. Run netstat -anp tcp |findstr 6640 to see if the NC Host Agent has established connections to the Network Controller node(s).
  9. [Hoster] Check that the Host ID in HKLM/ matches the Instance ID of the server resources hosting the tenant virtual machines.
  10. [Hoster] Check that the Port Profile ID matches the Instance ID of the VM network interfaces of the tenant virtual machines (see the sketch after this list).
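For the last check, a hedged sketch for reading the port profile ID applied to a VM's switch port; the feature GUID below is the well-known SDN port profile feature ID used by the SDNExpress scripts, and the VM name is an example:

# Read the port profile applied to the VM's vSwitch port;
# ProfileId should match the VM NIC resource's Instance Id on the Network Controller
$portProfileFeatureId = "9940cd46-8b06-43bb-b9d5-93d50381fd56"   # well-known SDN port profile feature ID
$feature = Get-VMSwitchExtensionPortFeature -VMName "Green Web VM 1" -FeatureId $portProfileFeatureId
$feature.SettingData.ProfileId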

Logging, Tracing and Advanced Diagnostics

The following sections provide information on advanced diagnostics, logging, and tracing.

Network controller centralized logging

The Network Controller can automatically collect debugger logs and store them in a centralized location. Log collection can be enabled when you deploy the Network Controller for the first time, or any time later. The logs are collected from the Network Controller and the network elements managed by the Network Controller: host machines, software load balancers (SLB), and gateway machines.

These logs include debug logs for the Network Controller cluster, the Network Controller application, gateway logs, SLB, virtual networking, and the distributed firewall. Whenever a new host/SLB/gateway is added to the Network Controller, logging is started on those machines. Similarly, when a host/SLB/gateway is removed from the Network Controller, logging is stopped on those machines.

Enable logging

Logging is automatically enabled when you install the Network Controller cluster using the Install-NetworkControllerCluster cmdlet. By default, the logs are collected locally on the Network Controller nodes at %systemdrive%\SDNDiagnostics. It is STRONGLY RECOMMENDED that you change this location to be a remote file share (not local).

The Network Controller cluster logs are stored at %programData%\Windows Fabric\log\Traces. You can specify a centralized location for log collection with the DiagnosticLogLocation parameter, with the recommendation that this also be a remote file share.

If you want to restrict access to this location, you can provide the access credentials with the LogLocationCredential parameter. If you provide the credentials to access the log location, you should also provide the CredentialEncryptionCertificate parameter, which is used to encrypt the credentials stored locally on the Network Controller nodes.
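As a sketch, the logging-related parameters at install time might look like the following ($nodes, the share path, and $cert are placeholders for your deployment; all other Install-NetworkControllerCluster parameters follow your normal installation):

# Collect logs on a remote share, with the share credentials encrypted by a certificate
$logCred = Get-Credential
Install-NetworkControllerCluster -Node $nodes -ClusterAuthentication Kerberos `
    -DiagnosticLogLocation "\\<FileShare>\SDNDiagnostics" `
    -LogLocationCredential $logCred `
    -CredentialEncryptionCertificate $cert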

With the default settings, it is recommended that you have at least 75 GB of free space in the central location, and 25 GB on the local nodes (if not using a central location), for a 3-node Network Controller cluster.

Change logging settings

You can change logging settings at any time using the Set-NetworkControllerDiagnostic cmdlet. The following settings can be changed (a sketch follows the list):

  • Centralized log location. You can change the location to store all the logs with the DiagnosticLogLocation parameter.
  • Credentials to access log location. You can change the credentials to access the log location with the LogLocationCredential parameter.
  • Move to local logging. If you have provided a centralized location to store logs, you can move back to logging locally on the Network Controller nodes with the UseLocalLogLocation parameter (not recommended due to large disk space requirements).
  • Logging scope. By default, all logs are collected. You can change the scope to collect only Network Controller cluster logs.
  • Logging level. The default logging level is Informational. You can change it to Error, Warning, or Verbose.
  • Log aging time. The logs are stored in a circular fashion. You will have 3 days of logging data by default, whether you use local logging or centralized logging. You can change this time limit with the LogTimeLimitInDays parameter.
  • Log aging size. By default, you will have a maximum of 75 GB of logging data if using centralized logging and 25 GB if using local logging. You can change this limit with the LogSizeLimitInMBs parameter.
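For example, a sketch that moves log collection to a remote share and changes retention (the share path is a placeholder; the parameter names are those documented above):

# Point log collection at a remote share and keep 5 days / 40 GB of data
$logCred = Get-Credential
Set-NetworkControllerDiagnostic -DiagnosticLogLocation "\\<FileShare>\SDNDiagnostics" -LogLocationCredential $logCred -LogTimeLimitInDays 5 -LogSizeLimitInMBs 40960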

Collecting Logs and Traces

VMM deployments use centralized logging for the Network Controller by default. The file share location for these logs is specified when deploying the Network Controller service template.

If a file location has not been specified, local logging will be used on each Network Controller node, with logs saved under C:\Windows\tracing\SDNDiagnostics. These logs are saved using the following hierarchy:

  • CrashDumps
  • NCApplicationCrashDumps
  • NCApplicationLogs
  • PerfCounters
  • SDNDiagnostics
  • Traces

The Network Controller uses (Azure) Service Fabric. Service Fabric logs may be required when troubleshooting certain issues. These logs can be found on each Network Controller node at C:\ProgramData\Microsoft\Service Fabric.

If a user has run the Debug-NetworkController cmdlet, additional logs will be available on each Hyper-V host which has been specified with a server resource in the Network Controller. These logs (and traces, if enabled) are kept under C:\NCDiagnostics.

SLB Diagnostics

SLBM Fabric errors (Hosting service provider actions)

  1. Check that Software Load Balancer Manager (SLBM) is functioning and that the orchestration layers can talk to each other: SLBM -> SLB Mux and SLBM -> SLB Host Agents. Run DumpSlbRestState from any node with access to the Network Controller REST endpoint.
  2. Validate the SDNSLBMPerfCounters in PerfMon on one of the Network Controller node VMs (preferably the primary Network Controller node - Get-NetworkControllerReplica); a PowerShell alternative is sketched after this list:
    1. Is the Load Balancer (LB) engine connected to SLBM? (SLBM LBEngine Configurations Total > 0)
    2. Does SLBM at least know about its own endpoints? (VIP Endpoints Total >= 2)
    3. Are the Hyper-V (DIP) hosts connected to SLBM? (HP clients connected == num servers)
    4. Is SLBM connected to the Muxes? (Muxes Connected == Muxes Healthy on SLBM == Muxes reporting healthy == # SLB Muxes VMs).
  3. Ensure the BGP router configured is successfully peering with the SLB MUX
    1. If using RRAS with Remote Access (i.e. BGP virtual machine):
      1. Get-BgpPeer should show connected
      2. Get-BgpRouteInformation should show at least a route for the SLBM self VIP
    2. If using a physical Top-of-Rack (ToR) switch as the BGP peer, consult your documentation
      1. For example: # show bgp instance
  4. Validate the SlbMuxPerfCounters and SLBMUX counters in PerfMon on the SLB Mux VM
  5. Check the configuration state and VIP ranges in the Software Load Balancer Manager resource
    1. Get-NetworkControllerLoadBalancerConfiguration -ConnectionUri <https://| convertto-json -depth 8 (check the VIP ranges in the IP pools and ensure the SLBM self-VIP (LoadBalancerManagerIPAddress) and any tenant-facing VIPs are within these ranges)
      1. Get-NetworkControllerIpPool -NetworkId "<Public/Private VIP Logical Network Resource ID>" -SubnetId "<Public/Private VIP Logical Subnet Resource ID>" -ResourceId "" -ConnectionUri $uri |convertto-json -depth 8
    2. Debug-NetworkControllerConfigurationState -
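For step 2, a hedged alternative to PerfMon is to read the counter set from PowerShell (the counter set name is taken from the step above; run it where the counters exist, preferably the primary Network Controller node):

# Enumerate and sample the SDNSLBMPerfCounters set on an NC node VM
$set = Get-Counter -ListSet "SDNSLBMPerfCounters"
Get-Counter -Counter $set.Counter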

If any of the checks above fail, the tenant SLB state will also be in a failure mode.

Remediation
Based on the diagnostic information presented, fix the following:

  • Ensure SLB Multiplexers are connected
    • Fix certificate issues
    • Fix network connectivity issues
  • Ensure BGP peering information is successfully configured
  • Ensure the Host ID in the registry matches the Server Instance ID in the server resource (reference the Appendix for the HostNotConnected error code)
  • Collect logs

SLBM Tenant errors (Hosting service provider and tenant actions)

  1. [Hoster] Check Debug-NetworkControllerConfigurationState to see if any LoadBalancer resources are in an error state. Try to mitigate by following the Action items table in the Appendix.
    1. Check that a VIP endpoint is present and advertising routes
    2. Check how many DIP endpoints have been discovered for the VIP endpoint
  2. [Tenant] Validate that the Load Balancer resources are correctly specified
    1. Validate that the DIP endpoints which are registered in SLBM are hosting tenant virtual machines which correspond to the LoadBalancer back-end address pool IP configurations
  3. [Hoster] If DIP endpoints are not discovered or connected:
    1. Check Debug-NetworkControllerConfigurationState
      1. Validate that the NC and SLB Host Agent are successfully connected to the Network Controller Event Coordinator using netstat -anp tcp |findstr 6640
    2. Check that the HostId in the nchostagent service regkey (reference the HostNotConnected error code in the Appendix) matches the corresponding server resource's Instance Id (Get-NCServer |convertto-json -depth 8)
    3. Check that the port profile ID for the virtual machine port matches the corresponding virtual machine NIC resource's Instance Id
  4. [Hosting provider] Collect logs

SLB Mux Tracing

Information from the Software Load Balancer Muxes can also be determined through Event Viewer.

  1. Click on "Show Analytic and Debug Logs" under the Event Viewer View menu
  2. Navigate to "Applications and Services Logs" > Microsoft > Windows > SlbMuxDriver > Trace in Event Viewer
  3. Right click on it and select "Enable Log" (a command-line alternative is sketched after this list)
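The same channel can also be toggled from an elevated prompt; a sketch, assuming the channel name follows the Event Viewer path shown above:

# Enable the analytic channel, reproduce the problem, then disable it (channel name assumed from the path above)
wevtutil set-log "Microsoft-Windows-SlbMuxDriver/Trace" /e:true
# ... reproduce the problem ...
wevtutil set-log "Microsoft-Windows-SlbMuxDriver/Trace" /e:false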

Note

It is recommended that you only have this logging enabled for a short time while you are trying to reproduce a problem.

VFP and vSwitch Tracing

From any Hyper-V host which is hosting a guest VM attached to a tenant virtual network, you can collect a VFP trace to determine where problems might lie.

netsh trace start provider=Microsoft-Windows-Hyper-V-VfpExt overwrite=yes tracefile=vfp.etl report=disable provider=Microsoft-Windows-Hyper-V-VmSwitch 
netsh trace stop
netsh trace convert .\vfp.etl ov=yes