Sunday, November 4, 2012

CCIE DC: Understanding Failover and network config in UCS

Hi Guys

In this blog post I am going to try and explain the network configuration in UCS. Many people have difficulty understanding the failover and how it should be configured. I recently explained the concepts around it to a customer and I think it made a lot of sense to them, so I wanted to share it here using the same methodology.

I want to cover both north-bound (i.e. the uplinks back towards the network) and south-bound (i.e. towards the servers) in this article.


Since I am from Australia, let's start with south-bound.

First, picture in your mind a switch chassis with a single line card, and a server attached to this line card with a single NIC.

Pretty simple to picture so far, hey? OK, it's obvious that this setup has quite a few single points of failure, so let's expand the picture: introduce another entire switch and another NIC, each acting independently of the other (so no VSS or vPC or anything complicated like that :))

Got it? Great!

Now we have redundancy and no single point of failure: if either switch or either NIC fails, the other switch and NIC keep the server connected.

Congratulations! You have now accurately pictured the way Cisco UCS does south-bound connectivity.

In UCS, each Fabric Interconnect is an entirely independent switch, and the FEX is just a remote line card. (This is literally what happens in UCS: if you look at the config on the FI, it is just a Nexus 5k with a 2k remote line card that happens to sit in the chassis.)
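If it helps to see the mental picture written down, here is a minimal Python sketch of it. The component names (vnic-a, fex-a, fi-a and so on) are made up for illustration, and the model only captures the idea that each south-bound path is NIC, then FEX, then Fabric Interconnect. A component counts as a single point of failure only if every path depends on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One independent south-bound path: server NIC -> FEX -> Fabric Interconnect."""
    nic: str
    fex: str
    fi: str

    def components(self):
        return {self.nic, self.fex, self.fi}


def single_points_of_failure(paths):
    """Return the components whose failure alone would break every path."""
    if not paths:
        return set()
    # A component is a single point of failure only if all paths use it.
    return set.intersection(*(p.components() for p in paths))


# Hypothetical names: one NIC on fabric A only, then a second NIC on fabric B.
one_path = [Path("vnic-a", "fex-a", "fi-a")]
two_paths = one_path + [Path("vnic-b", "fex-b", "fi-b")]

print(sorted(single_points_of_failure(one_path)))   # ['fex-a', 'fi-a', 'vnic-a']
print(sorted(single_points_of_failure(two_paths)))  # []
```

With one path, everything on it is a single point of failure. Add the second, fully independent path and the list is empty, which is exactly the picture above.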

With me so far? Easy, right?

OK, so let's take this a step further. If your server only has one NIC, what could you do? Well, in real life not much, but with Cisco UCS each vNIC is actually just a virtual instance. The server itself has a mezzanine card, but how many NICs the server has is entirely up to you. It can have up to 64 if you really want! But they are virtual.

So when would you ever define just one NIC? Maybe your operating system is not intelligent and can't provide redundancy across two NICs. In this case you would use UCS's built-in failover! The built-in failover means that if something happens to the FI or FEX on one fabric (FI A or FEX A, or FI B or FEX B), Cisco UCS handles the failover to the other fabric for you.
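The single-vNIC case can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `VNic` class and its `fabric_failover` flag are hypothetical names standing in for the failover option on a UCS vNIC, and a "failed fabric" here means the FI or FEX on that side is down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VNic:
    name: str
    primary: str           # "A" or "B"
    fabric_failover: bool  # hypothetical flag mirroring the UCS failover option


def active_fabric(vnic: VNic, failed: set) -> Optional[str]:
    """Return the fabric carrying this vNIC's traffic, or None if it is down."""
    if vnic.primary not in failed:
        return vnic.primary
    secondary = "B" if vnic.primary == "A" else "A"
    # Only a vNIC with failover enabled is moved to the surviving fabric.
    if vnic.fabric_failover and secondary not in failed:
        return secondary
    return None


protected = VNic("eth0", primary="A", fabric_failover=True)
unprotected = VNic("eth0", primary="A", fabric_failover=False)

print(active_fabric(protected, failed={"A"}))    # B
print(active_fabric(unprotected, failed={"A"}))  # None
```

The operating system sees one NIC in both cases. The difference is whether UCS quietly moves it to the other fabric when fabric A has a problem.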

If your OS supports it, you could rely on the OS to provide the failover instead. In ESXi you would configure active and standby adapters on the vSwitch and be all good.

Now let's say, however, that you had a vSwitch and configured two vNICs. In the back end, on Cisco UCS, one of these vNICs goes to Fabric A and the other to Fabric B. You go to your vSwitch and you configure etherchannel across these NICs.

You're going to have a bad time :( If you think about it, you're trying to etherchannel across two totally separate switches. You will have very strange problems, such as pings dropping in and out, and it will be quite hard to track down! So remember: you can't etherchannel across two NICs if those NICs go back towards Cisco UCS on two separate fabrics, because each fabric is a separate switch!
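One plausible way to picture why this goes wrong: a channel hash spreads a single host's flows across both links, so the same source MAC keeps showing up on two different switches. The sketch below is a deliberately simplified stand-in, not the real hashing algorithm. It picks a link from the last octet of the destination IP (an assumption made purely so the example is easy to follow) and counts how often the host's MAC appears to change fabric.

```python
def mac_moves(uplink_fabrics, dst_last_octets):
    """Count how often the network sees the host's MAC change fabric."""
    moves = 0
    last_seen = None
    for octet in dst_last_octets:
        # Simplified stand-in for a channel hash: choose a link per destination.
        link = octet % len(uplink_fabrics)
        fabric = uplink_fabrics[link]
        if last_seen is not None and fabric != last_seen:
            moves += 1
        last_seen = fabric
    return moves


flows = [1, 2, 3, 4, 5, 6]  # last octets of six hypothetical destination IPs

print(mac_moves(["A", "B"], flows))  # 5: the MAC bounces between fabrics
print(mac_moves(["A", "A"], flows))  # 0: both links land on the same switch
```

When both links land on one switch, the hash is harmless. When they land on fabric A and fabric B, every second flow makes the MAC appear to jump, which fits the "pings dropping in and out" symptom.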

My question has always been: why even have multiple NICs at all? Just give each of your servers one and let UCS provide the failover. After all, UCS is going to notice a failure sooner!

But the problem with this approach is that VMware will complain if it does not have two NICs attached: you will receive a message in the console complaining of single NIC connectivity. Therefore many people like to ensure there are two NICs.

For your information, when a fault occurs with either a FEX or the FI on a fabric, the NICs on that fabric will show as physically disconnected in ESXi, so you can always use physical link status as your failover detection in ESXi :).

Now that you understand what is actually happening in the background, the possibilities for your ESXi switch configuration are probably starting to make sense. You could have one giant vSwitch for everything with two adapters, one active and one standby. Or you could have one giant vSwitch with both adapters active and use "Route based on originating virtual port ID", which pins each virtual port to one uplink, so some VMs' traffic goes down one link and other VMs' traffic goes down the other.

(Incidentally, I wouldn't get too caught up on "using all the available bandwidth", because in the back end the traffic all goes to one mezzanine card anyway, which has 10, 20 or 40 gig of throughput. If you want to control bandwidth you should be using the QoS settings on the NIC in UCS, but that is a conversation for another blog.)
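To make the port ID option concrete, here is a rough sketch of the idea. The modulo rule is an assumption chosen for illustration and is not claimed to be the exact selection logic ESXi uses. The point it shows is that each virtual port sticks to one uplink, so any given VM's MAC is only ever seen on one fabric.

```python
def pin_uplink(virtual_port_id, active_uplinks):
    """Pin a virtual port to exactly one active uplink (illustrative rule only)."""
    return active_uplinks[virtual_port_id % len(active_uplinks)]


# Hypothetical uplink names: vmnic0 goes to fabric A, vmnic1 to fabric B.
uplinks = ["vmnic0", "vmnic1"]

for port_id in range(4):
    print(port_id, pin_uplink(port_id, uplinks))

# If vmnic1 fails, the surviving uplink carries every port.
print(pin_uplink(3, ["vmnic0"]))  # vmnic0
```

Both uplinks carry traffic, but no single VM is split across the two fabrics, which is why this works where etherchannel across the fabrics does not.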

You could even split traffic up into two separate vSwitches, one for management and one for data, and have active/standby adapters in each of these. But the important thing, especially when you start running into weird problems with pinging etc., is to remind yourself of what the topology actually looks like. All this virtualization can make the topology more difficult to picture, but if you understand the topology you will quickly work out what is going wrong.


Now that you understand south-bound, take yourself back to the picture in your mind of two separate switches with line cards and a server dual-attached to them. Ask yourself this: would you dual-attach those imaginary switches to your upstream switches? Of course you would. And if you could use vPC or VSS to etherchannel them to two separate upstream switches, would you? Of course you would!

It is no different in the UCS world: your Fabric Interconnects A and B should each be dual-attached to your upstream switches wherever possible, be that by using vPC or etherchannel.

A big no-no is having each Fabric Interconnect uplinked to just a single Nexus 7k, with those Nexus 7ks in a vPC. This means that whenever a host on fabric A needs to talk to a host on fabric B, the traffic goes over the internal link between your Nexus 7ks (in this case the vPC peer link; in another example like VSS it would be your VSS link).
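You can convince yourself of this with a small path search. The sketch below is only a toy model: the node names are made up, it finds the shortest path by hop count rather than modelling real vPC forwarding, and it assumes there is no direct data link between the two FIs, so fabric-to-fabric traffic has to go upstream.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a list of nodes."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def uses_peer_link(path, peer=("n7k-1", "n7k-2")):
    """True if the path crosses the link between the two upstream switches."""
    hops = set(zip(path, path[1:]))
    return peer in hops or peer[::-1] in hops


common = [("host-a", "fi-a"), ("host-b", "fi-b"), ("n7k-1", "n7k-2")]
single = common + [("fi-a", "n7k-1"), ("fi-b", "n7k-2")]  # the "no-no" design
dual = single + [("fi-a", "n7k-2"), ("fi-b", "n7k-1")]    # each FI dual-attached

print(uses_peer_link(shortest_path(single, "host-a", "host-b")))  # True
print(uses_peer_link(shortest_path(dual, "host-a", "host-b")))    # False
```

In the single-attached design the only way from fabric A to fabric B is across the link between the Nexus 7ks. Dual-attach each FI and a path exists that never touches it.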

The Take Aways

If you're the type who looks for the "too long; didn't read" version, this is your section :)

  • The best possible way to think of each Fabric Interconnect is as a separate switch with a remote line card (the FEX), because in reality that is what they are: the Fabric Interconnect is just a Nexus 5k and the FEX is just a Nexus 2k.
  • The Fabric Interconnects are NOT vPC'd together; they are two totally independent switches.
  • Thus, when you create vNICs in Cisco UCS and assign them to either Fabric A or Fabric B, you need to picture in your mind "I am setting up uplinks to TWO different switches", and all of a sudden your failover and etherchannel configuration will make sense.
  • Speaking of failover, your choices are to rely on the operating system or on Cisco UCS to do the failover for you, but never do both.
  • Always uplink your Fabric Interconnects to BOTH upstream switches and use etherchannel whenever possible. (So FI A should attach to Core Switch A and B, and its links should be in an etherchannel!)
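As a quick sanity check, the host-side takeaways above can be written as a couple of rules. This is a hedged sketch, not a real tool: the dictionary keys ("fabric", "fabric_failover") and the teaming labels are hypothetical names invented for the example.

```python
def lint_host(vnics, teaming):
    """Return warnings for a host's vNIC and teaming setup.

    vnics:   list of dicts with hypothetical keys "fabric" and "fabric_failover"
    teaming: "active_standby", "port_id" or "etherchannel" (labels made up here)
    """
    warnings = []
    fabrics = {v["fabric"] for v in vnics}
    # Rule 1: the two fabrics are independent switches, so no channel across them.
    if teaming == "etherchannel" and len(fabrics) > 1:
        warnings.append("etherchannel spans two independent fabrics")
    # Rule 2: let the OS or UCS handle failover, never both.
    os_failover = len(vnics) > 1 and teaming in ("active_standby", "port_id")
    if os_failover and any(v["fabric_failover"] for v in vnics):
        warnings.append("OS teaming and UCS fabric failover are both enabled")
    return warnings


good = [{"fabric": "A", "fabric_failover": False},
        {"fabric": "B", "fabric_failover": False}]
both = [{"fabric": "A", "fabric_failover": True},
        {"fabric": "B", "fabric_failover": True}]

print(lint_host(good, "active_standby"))  # []
print(lint_host(good, "etherchannel"))
print(lint_host(both, "active_standby"))
```

Two vNICs on separate fabrics with OS-level active/standby pass cleanly. The other two calls each trip one of the rules.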

I hope this helps someone out there.

1 comment:

  1. Thanks.

    I have a question regarding the VMware vSphere Distributed Switch: which load balancing policy would you use if I was using Network I/O Control and wanted to use all 4 NICs created on the ESXi host (uplinks to Nexus 7k with vPC and LACP enabled)? On the FI side, there are 2 NICs each on FI A and FI B, with failover for VLANs.