30 Apr 2020 by Simon Greaves
NSX-T Troubleshooting
Check L2 before L3.
CorfuDB
3 nodes
Quorum must be up, at least 2 corfu servers required for quorum
Group Member Leader Election Server (GMLE) helps in detecting the fault with an NSX Manager node failure. It also helps elect a new leader per group.
Day 2 Operations
Use st en to enter engineering mode (root privileged mode)
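The CorfuDB quorum rule above (at least 2 of 3 servers) can be sketched as a simple majority check. This is only an illustration that assumes quorum means a strict majority, which matches the 2-of-3 figure. The function names are made up for the example and are not part of NSX-T.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def has_quorum(cluster_size: int, members_up: int) -> bool:
    """True when enough members are up to form a majority (assumed rule)."""
    return members_up >= quorum_size(cluster_size)


if __name__ == "__main__":
    # Walk through every possible number of healthy servers in a 3-node cluster.
    for up in range(4):
        print(f"{up} of 3 up -> quorum: {has_quorum(3, up)}")
```

With three nodes the cluster tolerates one failure. Losing a second node drops it below quorum.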
Component | Log Files and Locations |
NSX Policy Manager | /var/log/policy/policy.log |
NSX Manager, NSX API logs, CorfuDB logs, Cluster Bootstrap Manager (CBM) | /var/log/syslog, /var/log/manger.log, /var/log/proton/nsxapi.log, /var/log/nsx-audit.log, /var/log/corfu, /var/log/cbm |
NSX Controller | /var/log/cloudnet/nsx-ccp.log |
ESXi host DFW | /var/log/cfgAgent.log, esxupdate.log, nsxa-opsagent.log, nsx-syslog, /var/log/dfwpktlogs.log (only fills if logging enabled on rule) |
KVM host DFW | /var/log/vmware/nsx-syslog, /var/log/syslog, /var/log/openvswitch/ovs-vswitchd.log, /var/log/dpkg.log, /var/log/dfwpktlogs.log (only fills if logging enabled on rule) |
Edge Nodes, Load Balancer errors | Syslog (get log-file syslog), Access-log [follow], Error-log [follow] |
set service manager logging-level debug
View with
get log-file policy.log
get log-file syslog
Controller Log
CFG Agent Log (ESXi)
KVM
Configure Syslog Exporter
vRealize Log Insight (vRLI) is included with NSX.
set logging-server <hostname-or-ip-address[:port]> proto
/etc/rsyslog.d/40-vmware-remote-logging.conf
The forwarding rule in this file starts with *.* @
systemctl restart rsyslog
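Once the exporter is configured, it can help to confirm that the log server itself accepts syslog traffic before blaming NSX. The sketch below sends one test message over UDP using Python's standard `logging.handlers.SysLogHandler`. It is meant to run from any machine with Python that can reach the collector, not from the NSX CLI. The function name, the default message and the collector address in the usage comment are placeholders, not anything from NSX-T.

```python
import logging
import logging.handlers
import socket


def send_test_syslog(host, port=514, message="nsx-t syslog exporter test"):
    """Send one UDP syslog message to host:port and return the text sent."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_DGRAM
    )
    logger = logging.getLogger("syslog-exporter-check")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the test message out of other handlers
    logger.addHandler(handler)
    try:
        logger.info(message)
    finally:
        logger.removeHandler(handler)
        handler.close()
    return message


# Example (placeholder address): send_test_syslog("192.0.2.50", 514)
```

UDP is fire and forget, so a clean return only means the packet left the sender. Check the collector to confirm the message arrived.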
If you need detailed traffic info, use port mirroring.
Can use CLI to set up packet capture on:
start capture interface
set capture session
pktcap-uw
tcpdump-uw
tcpdump
If file corrupt check OVA or QCOW2 install files
12 characters minimum on password
Check logs
NSX CLI
get services
get service
Can see that ESXi is connected to 46, and KVM is on 47, showing the Shards are working correctly.
N-VDS is incorrectly configured on a host
Overlay tunnel (GENEVE) is misconfigured
TEPs unable to reach each other
esxcfg-vswitch -l
esxcli network ip interface ipv4 get
vmk10 is the TEP for NSX.
vmk50 is for intra-tier networking/routing and containers.
vmkping ++netstack=vxlan
If a VM is not able to communicate on a specific host, check that the segment is present on that host. If it isn’t showing on the host, go into the GUI and check that the N-VDS segment is present. If it is, check the advanced settings of the virtual switches and look for any errors such as Partial Success.
If this happens, check that the agents are running on the host.
/etc/init.d/nsx-mpa status
esxcli network ip connection list | grep 5671
/etc/init.d/nsx-proxy status
esxcli network ip connection list | grep 1235
/etc/init.d/nsx-opsagent status
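The two grep checks above look for connections on ports 5671 and 1235. A grep match alone does not tell you whether the connection is actually up, so the sketch below also looks at the state column. The sample text is illustrative only: the addresses and world names are invented, and the column layout is assumed to resemble `esxcli network ip connection list` output.

```python
# Illustrative sample; addresses and world names are made up.
SAMPLE = """\
Proto  Recv Q  Send Q  Local Address      Foreign Address    State        World ID  CC Algo  World Name
tcp         0       0  192.0.2.10:43210   192.0.2.46:1235    ESTABLISHED   2100001  newreno  nsx-proxy
tcp         0       0  192.0.2.10:51544   192.0.2.45:5671    SYN_SENT      2100002  newreno  mpa
"""


def has_established_connection(connection_list: str, port: int) -> bool:
    """True if any line is ESTABLISHED and has an address ending in :<port>."""
    suffix = f":{port}"
    for line in connection_list.splitlines():
        fields = line.split()
        if "ESTABLISHED" not in fields:
            continue
        if any(field.endswith(suffix) for field in fields):
            return True
    return False


if __name__ == "__main__":
    for port in (5671, 1235):
        print(port, has_established_connection(SAMPLE, port))
```

In the sample, port 1235 has an established connection while port 5671 is stuck in SYN_SENT. That pattern would point at the path on port 5671 instead of the agent on the host.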
Especially those check boxes!
Check Routing Table
get logical-router
Check the SR for routing
Validate the routing table for the Tier-0 SR VRF
vrf
get route
b = BGP
For DR check the forwarder for similar information
get forwarding
get bgp neighbor summary
Check the status is Established. Active means the session is still being set up!
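The reason Active is not a healthy state is that it is one of the intermediate states in the standard BGP state machine (RFC 4271). Only Established means the peers are exchanging routes. The sketch below classifies a state string on that basis. The helper name is made up, and the state text in the actual CLI output may be spelled or abbreviated differently, so treat the matching as an assumption.

```python
# The six BGP finite state machine states from RFC 4271, in order.
BGP_STATES = ("Idle", "Connect", "Active", "OpenSent", "OpenConfirm", "Established")


def describe_bgp_state(state: str) -> str:
    """Classify a BGP neighbor state string as up, not up, or unknown."""
    normalized = state.strip().lower()
    for known in BGP_STATES:
        if known.lower() == normalized:
            if known == "Established":
                return "up: session established, routes can be exchanged"
            return f"not up: session is in {known}, not yet established"
    return f"unknown state: {state!r}"


if __name__ == "__main__":
    for s in ("Established", "Active", "Idle"):
        print(s, "->", describe_bgp_state(s))
```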
T0 SR can show BGP route info
get bgp ipv4
Most common firewall issues are
get firewall status
summary
KVM
ovs-appctl is used for configuration of the firewall.
Validate with
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/vif
Get the VIF then type
ovs-appctl -t /var/run/openvswitch/nsxa-ctl dfw/rules
ESXi
Use vsipioctl and summarize-dvfilter
summarize-dvfilter | grep
The example adds the -A16 option, which tells grep to print the 16 lines that follow each matching line.
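To make the effect of -A concrete, here is a small Python imitation of grep's trailing-context behaviour. It is a sketch, not a replacement for grep: it leaves out the `--` separator grep prints between groups. The sample lines are invented and are not real summarize-dvfilter output.

```python
import re


def grep_after(lines, pattern, after=0):
    """Yield each matching line plus up to `after` following lines, like grep -A."""
    regex = re.compile(pattern)
    remaining = 0  # trailing context lines still owed after the last match
    for line in lines:
        if regex.search(line):
            remaining = after
            yield line
        elif remaining > 0:
            remaining -= 1
            yield line


# Invented sample text standing in for a longer command output.
SAMPLE = [
    "world 1001 vm-a",
    " port 1",
    "  filter-a",
    "world 1002 vm-b",
    " port 2",
    "  filter-b",
]

if __name__ == "__main__":
    print(list(grep_after(SAMPLE, "vm-b")))           # match only
    print(list(grep_after(SAMPLE, "vm-b", after=2)))  # match plus 2 lines
```

Without context only the matching line comes back. With `after=2` the two lines under it are included, which is what -A16 does on a larger scale for the filter details of a VM.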
Can also use the addrsets in the filter instead of name. Same commands again but with -f addrset number.
The edges give definition of what’s in the rule sets using
get firewall
Again, this command only shows 1 line as the -A1 is used in the egrep.
get configuration
get node-uuid
get interfaces
get managers
get host-switches
get tunnel-ports
get vteps