UDLD and EtherChannel Configuration
A data centre network for a client I work with had an interesting issue this week. For no apparent reason, some users within the data centre environment reported connection problems to hosts in the network. They could connect to some hosts but not others. Then, all of a sudden, connectivity would be restored, only to be lost again shortly afterwards.
One of my colleagues was able to access two of the core data centre switches but I could only get to one. A very quick trip to the data centre floor and a console cable connection into the core data centre switches revealed the issue.
The two switches in question (Catalyst 4507s) have a 2 x 1Gb EtherChannel configured between them over fibre connections. One side reported that both links were active in the EtherChannel. The other side showed one link as down, and its logs showed that the port had left the EtherChannel.
The root cause is still unknown, but this type of issue, where one side sees the link as up while the other sees it as down, is called a unidirectional failure. Presumably the side that still saw the link as up kept sending a share of the traffic onto it, where it was lost, which would explain why only some hosts were unreachable. To solve the matter at hand, we first shut down the faulty link at the end that still saw it as up. As soon as this was done everything sprang into life and everyone was able to connect to the data centre hosts. While the link was down we quickly swapped out the GBICs and brought the connection back up. The link rejoined the EtherChannel and everything was back as it should be.
This highlighted an issue with the EtherChannel configuration on these particular switches, however. Here is the configuration of one of the EtherChannel member interfaces as it stood at that time.
interface GigabitEthernet1/1
 description ** link to xxxxxx **
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan x,xx,xx,xx-xx,xx-xx,xx
 switchport mode trunk
 qos trust dscp
 channel-group 1 mode on
I have always advised network engineers to use desirable mode on both sides of an EtherChannel rather than forcing the channel up. The on mode forces a port to join an EtherChannel without any protocol negotiation taking place. Using the desirable keyword instead of the on keyword means that the switch uses the Port Aggregation Protocol (PAgP). With PAgP, the switch learns of partner interfaces on other switches that support PAgP and dynamically groups its interfaces into an EtherChannel. Let's look at an example. I've set up two Cisco Catalyst 3550s back to back, connecting ports 13 and 14 on each switch together.
Here is the configuration of the ports on either end of the connection.
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode desirable
!
interface FastEthernet0/14
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode desirable
Once the channel group is configured on the first interface, IOS automatically creates the port-channel interface.
interface Port-channel1
 switchport trunk encapsulation dot1q
 switchport mode trunk
If the same configuration is applied at both ends, PAgP will dynamically place each relevant interface into the EtherChannel.
Here is the output of the show etherchannel summary command from SW1:
SW1#show etherchannel summary
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         PAgP      Fa0/13(P)   Fa0/14(P)
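As an aside on the PAgP keywords: desirable has a passive counterpart, auto, which answers PAgP packets but never starts negotiation. The fragment below is a sketch of that variation only. It assumes SW2 is changed to auto, which is not what the lab in this post uses (both ends stay desirable).

```
! SW1 (initiating side): actively sends PAgP packets
interface range FastEthernet0/13 - 14
 channel-group 1 mode desirable
!
! SW2 (assumed passive side): answers PAgP but does not start negotiation
interface range FastEthernet0/13 - 14
 channel-group 1 mode auto
```

A channel forms with desirable on both ends or with desirable facing auto. It does not form with auto on both ends, since neither side initiates, and a port set to on will not negotiate with a desirable or auto partner at all.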
Let's see what happens if I change port Fa0/14 on SW2, removing it from the channel and thus stopping PAgP on that port.
SW2(config-if)#no channel-group 1
SW2(config-if)#do show etherchannel summ
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         PAgP      Fa0/13(P)
On SW1 both Fa0/13 and Fa0/14 are still configured as part of port channel 1. But because PAgP is in use, SW1 drops Fa0/14 from the bundle when it stops receiving PAgP packets on that port.
SW1#sh etherchannel summ
Flags:  D - down        P - in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port

Number of channel-groups in use: 1
Number of aggregators:           1

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         PAgP      Fa0/13(P)   Fa0/14(I)
Port Fa0/14 is now operating as a stand-alone port and forms a separate trunk between the switches.
SW1#sh int trunk
Port        Mode         Encapsulation  Status        Native vlan
Fa0/14      on           802.1q         trunking      1
Po1         on           802.1q         trunking      1

SW2#sh int trunk
Port        Mode         Encapsulation  Status        Native vlan
Fa0/14      on           802.1q         trunking      1
Po1         on           802.1q         trunking      1
In the data centre situation I described above, PAgP would have dropped the offending interface from the EtherChannel, as one side would have stopped seeing PAgP packets. However, it may still have been possible for one switch to move the interface into stand-alone mode and pass traffic across a broken link, as it was still seeing the link as up. To help in situations like these, Cisco developed the Unidirectional Link Detection (UDLD) protocol.
From IOS Release 12.1(3a)E, UDLD can be configured in aggressive mode. Cisco describes aggressive mode as follows:
“Configure UDLD aggressive mode only on point-to-point links between network devices that support UDLD aggressive mode. With UDLD aggressive mode enabled, when a port on a bidirectional link that has a UDLD neighbor relationship established stops receiving UDLD packets, UDLD tries to reestablish the connection with the neighbor. After eight failed retries, the port is disabled.
To prevent spanning tree loops, nonaggressive UDLD with the default interval of 15 seconds is fast enough to shut down a unidirectional link before a blocking port transitions to the forwarding state (with default spanning tree parameters).
When you enable UDLD aggressive mode, you receive additional benefits in the following situations:
• One side of a link has a port stuck (both Tx and Rx)
• One side of a link remains up while the other side of the link has gone down
In these cases, UDLD aggressive mode disables one of the ports on the link, which prevents traffic from being discarding.”
I use aggressive mode when available. Configuration is simple. To configure non-aggressive UDLD on a port, enter the following command:
SW2(config)#int fas0/13
SW2(config-if)#udld port
To configure aggressive mode, only one more keyword is required:
SW2(config-if)#int fas0/14
SW2(config-if)#udld port aggressive
Using the show udld command you can check that UDLD is running as desired. The output below shows both ports in aggressive mode, matching the final configuration.
SW2#sh udld
Interface Fa0/13
---
Port enable administrative configuration setting: Enabled / in aggressive mode
Port enable operational state: Enabled / in aggressive mode
Current bidirectional state: Bidirectional
Current operational state: Advertisement - Single neighbor detected
Message interval: 7
Time out interval: 5

    Entry 1
    ---
    Expiration time: 44
    Cache Device index: 1
    Current neighbor state: Bidirectional
    Device ID: CAT0640X09Y
    Port ID: Fa0/13
    Neighbor echo 1 device: CAT0825X28N
    Neighbor echo 1 port: Fa0/13

    Message interval: 15
    Time out interval: 5
    CDP Device name: SW1

Interface Fa0/14
---
Port enable administrative configuration setting: Enabled / in aggressive mode
Port enable operational state: Enabled / in aggressive mode
Current bidirectional state: Bidirectional
Current operational state: Advertisement - Single neighbor detected
Message interval: 15
Time out interval: 5

    Entry 1
    ---
    Expiration time: 33
    Cache Device index: 1
    Current neighbor state: Bidirectional
    Device ID: CAT0640X09Y
    Port ID: Fa0/14
    Neighbor echo 1 device: CAT0825X28N
    Neighbor echo 1 port: Fa0/14

    Message interval: 15
    No timeout interval
    No CDP device name
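The data centre links in this story were fibre. On fibre-optic ports UDLD can also be switched on from global configuration mode rather than port by port. The fragment below is a minimal sketch, assuming a platform and IOS release that support the global udld command. The exact keywords vary between platforms and releases, so check the command reference for yours.

```
! Global configuration mode. Affects fibre-optic ports only;
! copper ports such as Fa0/13 in the lab still need "udld port aggressive".
udld aggressive
```

Whichever form is used, UDLD has to be enabled at both ends of a link before a neighbour relationship can form.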
The final configuration of the ports on either end looks like this:
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
 udld port aggressive
 channel-group 1 mode desirable
!
interface FastEthernet0/14
 switchport trunk encapsulation dot1q
 switchport mode trunk
 udld port aggressive
 channel-group 1 mode desirable
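Since aggressive mode disables the port when the retries fail, it is worth deciding how the port comes back. The fragment below is a sketch rather than part of the lab configuration above. It assumes automatic err-disable recovery is wanted, which may not suit every network.

```
! Global configuration: automatically re-enable ports err-disabled by UDLD
errdisable recovery cause udld
! Seconds to wait before each recovery attempt (300 is the usual default)
errdisable recovery interval 300
```

If the link is still unidirectional when the port is re-enabled, UDLD will simply disable it again. For manual recovery, udld reset in privileged EXEC mode re-enables ports that UDLD has shut down, and show interfaces status err-disabled lists the affected ports.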
EtherChannels are wonderful things and for the most part run without a hitch. However, I think that running a negotiation protocol to dynamically manage the participating interfaces, and using UDLD to monitor for unidirectional failures, is a good safeguard against situations such as the one described above.