As Nutanix describes in their Nutanix Networking guide, Nutanix recommends “Active-Backup” as the bond mode for ease of use. For more available bandwidth and increased availability, I always recommend the bond mode “balance-tcp” in combination with LACP.
To use LACP, you need to configure it both on the network switches and on the AHV hosts.
Don’t forget: when changing network settings on a host in a running Nutanix cluster, put the host in maintenance mode first. This makes sure there is no impact on running virtual machines.
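As a sketch (the host IP below is a placeholder, and the exact acli syntax can differ per AOS version), entering and leaving maintenance mode from a Controller VM looks roughly like this:

```shell
# On any Controller VM (CVM); 10.10.0.21 is a hypothetical AHV host IP.
# Entering maintenance mode live-migrates the VMs off the host first.
acli host.enter_maintenance_mode 10.10.0.21

# ... reconfigure the network on switch and host ...

# Bring the host back into service when done.
acli host.exit_maintenance_mode 10.10.0.21
```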
Prereq: Check the network connections of the host
You probably know which network ports are being used by your host, but my advice is to verify this before changing any settings on the network switch.
You can follow the network cables in the datacenter from NIC to switch port, or you can use lldpctl on the host itself.
[root@AHV02 ~]# lldpctl
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface: eth1, via: LLDP, RID: 8, Time: 3 days, 05:11:10
Chassis:
ChassisID: mac b8:59:9f:7d:7b:40
SysName: MLNX01
SysDescr: MSN2010,Onyx,SWv3.7.1134
TTL: 120
MgmtIP: 10.10.0.254
Capability: Bridge, on
Capability: Router, off
Port:
PortID: ifname Eth1/2
PortDescr: link to Nutanix node AHV02
-------------------------------------------------------------------------------
Interface: eth2, via: LLDP, RID: 7, Time: 3 days, 05:11:30
Chassis:
ChassisID: mac b8:59:9f:75:31:40
SysName: MLNX02
SysDescr: MSN2010,Onyx,SWv3.7.1134
TTL: 120
MgmtIP: 10.10.0.252
Capability: Bridge, on
Capability: Router, off
Port:
PortID: ifname Eth1/2
PortDescr: link to Nutanix node AHV02
-------------------------------------------------------------------------------
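The lldpctl output is verbose. As a small sketch, the mapping of host NIC to switch and switch port can be condensed with awk; the sample neighbor lines below are embedded so the snippet runs without live LLDP neighbors:

```shell
# Sketch: condense "lldpctl" output into one line per interface:
# interface name, switch name, switch port.
lldp_summary() {
  awk -F': *' '
    /^Interface:/ { split($2, a, ","); iface = a[1] }
    /SysName:/    { sw = $2 }
    /PortID:/     { sub(/^ifname /, "", $2); print iface, sw, $2 }
  '
}

# Embedded sample lines; on a host you would pipe "lldpctl" in directly.
lldp_summary <<'EOF'
Interface:    eth1, via: LLDP, RID: 8, Time: 3 days, 05:11:10
    SysName:      MLNX01
    PortID:       ifname Eth1/2
Interface:    eth2, via: LLDP, RID: 7, Time: 3 days, 05:11:30
    SysName:      MLNX02
    PortID:       ifname Eth1/2
EOF
# → eth1 MLNX01 Eth1/2
#   eth2 MLNX02 Eth1/2
```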
A last comment about the LACP configuration concerns the use of LACP fast. Nutanix recommends setting lacp-time to fast to decrease the link failure detection time from 90 seconds to 3 seconds.
Configuring LACP on switch side
When the host is unconfigured or in maintenance mode ( 😉 ), you can reconfigure the switch. Each vendor or network OS has its own syntax to achieve this configuration.
When implementing a network for a customer’s hyperconverged infrastructure, I prefer to use Mellanox switches. This is because of the quality of the switches, the pricing, and the integration of Mellanox Onyx with Prism through the Mellanox tool Neo.
interface ethernet 1/1-1/8 speed 10G no-autoneg force
interface ethernet 1/1 mlag-channel-group 1 mode active
interface mlag-port-channel 1 switchport mode hybrid
lacp
interface ethernet 1/1 lacp rate fast
interface mlag-port-channel 1 switchport access vlan 9
interface mlag-port-channel 1 switchport hybrid allowed-vlan add 10-12
Configuring the AHV host
The AHV host is easy to configure, using either “manage_ovs” in the Controller VM or “ovs-vsctl” on the AHV host itself. I prefer configuring on the host itself. The following four commands are issued on the host; the syntax is based on the default configuration of AHV.
AHV host>ovs-vsctl set port br0-up other_config:lacp-fallback-ab=true
AHV host>ovs-vsctl set port br0-up other_config:lacp-time=fast
AHV host>ovs-vsctl set port br0-up lacp=active
AHV host>ovs-vsctl set port br0-up bond_mode=balance-tcp
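For comparison, the equivalent change via “manage_ovs” from the Controller VM would look roughly like the sketch below. The flag names follow the Nutanix AHV networking documentation, but verify them against your AOS version before use:

```shell
# Run on a Controller VM, not on the AHV host.
# Sets balance-tcp with active LACP, fast rate and active-backup fallback
# on the br0-up bond in one step; "10g" selects the 10 Gb uplink NICs.
manage_ovs --bridge_name br0 --bond_name br0-up --interfaces 10g \
           --bond_mode balance-tcp --lacp_mode fast --lacp_fallback true \
           update_uplinks
```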
After this configuration, everything should be OK. However, we have to verify that.
Check your configuration!
You can check on both the switch and the host whether your LACP configuration is correct.
On the switch side, issue the command “show mlag” in configuration mode.
# show mlag
Admin status: Enabled
Operational status: Up
Reload-delay: 30 sec
Keepalive-interval: 1 sec
Upgrade-timeout: 60 min
System-mac: 00:00:5E:00:01:21
MLAG Ports Configuration Summary:
Configured: 9
Disabled: 0
Enabled: 9
MLAG Ports Status Summary:
Inactive: 0
Active-partial: 0
Active-full: 9
On the AHV host, the configuration can be checked with the commands “ovs-appctl bond/show” and “ovs-appctl lacp/show”.
[root@AHV02 ~]# ovs-appctl bond/show
---- br0-up ----
bond_mode: balance-tcp
bond may use recirculation: yes, Recirc-ID : 1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
next rebalance: -229 ms
lacp_status: negotiated
lacp_fallback_ab: true
active slave mac: 38:68:dd:19:fe:60(eth1)
[root@AHV02 ~]# ovs-appctl lacp/show
---- br0-up ----
status: active negotiated
sys_id: 38:68:dd:19:fe:60
sys_priority: 65534
aggregation key: 1
lacp_time: slow
slave: eth1: current attached
port_id: 1
port_priority: 65535
may_enable: true
actor sys_id: 38:68:dd:19:fe:60
actor sys_priority: 65534
actor port_id: 1
actor port_priority: 65535
actor key: 1
actor state: activity aggregation synchronized collecting distributing
partner sys_id: 00:00:5e:00:01:21
partner sys_priority: 32768
partner port_id: 262
partner port_priority: 32768
partner key: 29002
partner state: activity timeout aggregation synchronized collecting distributing
slave: eth2: current attached
port_id: 2
port_priority: 65535
may_enable: true
actor sys_id: 38:68:dd:19:fe:60
actor sys_priority: 65534
actor port_id: 2
actor port_priority: 65535
actor key: 1
actor state: activity aggregation synchronized collecting distributing
partner sys_id: 00:00:5e:00:01:21
partner sys_priority: 32768
partner port_id: 5
partner port_priority: 32768
partner key: 29002
partner state: activity timeout aggregation synchronized collecting distributing
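The fields worth watching in this output are the negotiation status and lacp_time; note that the capture above still reports “lacp_time: slow”. As a small sketch (the sample lines are embedded, so the snippet runs without a live Open vSwitch), a quick check could look like this:

```shell
# Sketch: a quick health check over "ovs-appctl lacp/show" output.
# Fails if the bond did not negotiate LACP, and warns when the
# LACP rate is still "slow" instead of the recommended "fast".
check_lacp_output() {
  local out="$1"
  printf '%s\n' "$out" | grep -q 'status: active negotiated' || {
    echo "FAIL: LACP not negotiated"; return 1; }
  if printf '%s\n' "$out" | grep -q 'lacp_time: slow'; then
    echo "WARN: lacp_time is slow (expected fast)"
  else
    echo "OK: LACP negotiated, fast rate"
  fi
}

# On a live host this would be:
#   check_lacp_output "$(ovs-appctl lacp/show br0-up)"
check_lacp_output 'status: active negotiated
lacp_time: slow'
# → WARN: lacp_time is slow (expected fast)
```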