Let's talk about a topic that, at least for me, was intimidating at first, but now that I know how it all works is much less scary :).
Buffer to Buffer Credits
So in FC, drops are obviously a huge issue, and one of the causes of drops is buffers running out on receive ports. To prevent this, FC created a concept called Buffer to Buffer Credits. The two ends of a link tell each other how much buffer space they have for receiving traffic, and the sender keeps a running count. When all the available credits have been used, the sender pauses until the receiver empties a buffer and returns an R_RDY, which says hey, let's keep transmitting :).
From Wikipedia:
"Each time a port transmits a frame that port's BB Credit is decremented by one; for each R RDY received, that port's BB Credit is incremented by one "
Buffer to buffer credits are affected by distance. The problem is that while traffic is on the wire, it takes a while for the other end to receive it. If we had only 1 buffer credit, we would only be able to send one frame at a time and would have to wait for an R_RDY back before we could transmit the next one! This would obviously negatively impact our SAN link performance.
So, Buffer to Buffer credits allow us to keep a certain amount of traffic in flight that we know the other end will be able to cope with, because it has told us how many buffers it has available. We keep track of how much we have sent, so we always know how much of its buffer space is in use.
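To make that concrete, here is a minimal Python sketch of the credit accounting (purely my own toy model for illustration, not anything that runs on a switch):

# Toy model of buffer-to-buffer credit accounting (illustration only).
# The sender may only transmit while it holds credits; every R_RDY
# returned by the receiver restores exactly one credit.
class SenderPort:
    def __init__(self, bb_credit):
        self.bb_credit = bb_credit   # credits granted by the peer at login

    def can_transmit(self):
        return self.bb_credit > 0

    def transmit_frame(self):
        assert self.can_transmit(), "must wait for an R_RDY"
        self.bb_credit -= 1          # one receive buffer consumed at the far end

    def receive_r_rdy(self):
        self.bb_credit += 1          # the far end freed a buffer

port = SenderPort(bb_credit=3)
for _ in range(3):
    port.transmit_frame()            # uses up all 3 credits
print(port.can_transmit())           # False - we must pause
port.receive_r_rdy()                 # an R_RDY arrives
print(port.can_transmit())           # True - let's keep transmitting :)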
So, back to distance: the longer the fibre link, the more buffer to buffer credits we need.
The rough formula is:
BB_Credit = [port speed] x [round trip time] / [frame size]
Another formula, more practical at least as far as I'm concerned, is 2 buffer to buffer credits per km.
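If you want to sanity-check that, here is a quick back-of-the-envelope version in Python (a sketch under my own assumptions: roughly 5 microseconds of propagation per km of fibre and a full-size FC frame of about 2148 bytes; size real links according to your vendor's guidance):

# Rough BB_Credit estimate for a long-distance FC link (illustration only).
def bb_credits_needed(speed_gbps, distance_km, frame_bytes=2148):
    # Light in fibre covers roughly 200,000 km/s, i.e. ~5 us per km,
    # so the round trip time is about 10 us per km of link length.
    rtt_seconds = distance_km * 2 * 5e-6
    bits_in_flight = speed_gbps * 1e9 * rtt_seconds
    return bits_in_flight / (frame_bytes * 8)   # frames we must keep in flight

# A 4 Gbps link over 10 km needs roughly this many credits to stay full:
print(round(bb_credits_needed(4, 10)))          # 23 - close to the 2-per-km rule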
OK now let's look at how they fit in with our MDS
So you may know that the internal structure of the MDS basically allocates each set of ports to a port-group. This is very similar to most modern-day switches (but modern-day switches hide a lot of the buffer allocation magic).
Let's look at the port resource command and we can get a bit of information and explain along the way:
MDS1# show port-resources module 2
Module 2
Available dedicated buffers are 4656
Port-Group 1
Total bandwidth is 12.8 Gbps
Total shared bandwidth is 12.8 Gbps
Allocated dedicated bandwidth is 0.0 Gbps
--------------------------------------------------------------------
Interfaces in the Port-Group B2B Credit Bandwidth Rate Mode
Buffers (Gbps)
--------------------------------------------------------------------
fc2/1 16 4.0 shared
fc2/2 16 4.0 shared
fc2/3 16 4.0 shared
fc2/4 16 4.0 shared
fc2/5 16 4.0 shared
fc2/6 16 4.0 shared
fc2/7 16 4.0 shared
fc2/8 16 4.0 shared
OK, let's look at the basic output first. Notice that each interface has the default allocation of 16 buffers. The exact number of buffers allocated varies by interface and by what mode you set the interface to; E ports tend to get a lot more buffers because they normally have a lot more traffic going over them.
So, when we have a port with, let's say, 16 buffers, that means this interface is GUARANTEED at least 16 buffers, no matter what situation the rest of the switch is in. Even if it's smashing away doing a thousand gigabits per second, we guarantee those buffers to our interface.
This value is adjustable for each port, as we will see later, and as mentioned it's actually changed by the switch automatically depending on what kind of port you configure and what speed that port is. It will also change depending on the hardware of the module.
From the Cisco Documentation:
http://www.cisco.com/en/US/docs/switches/datacenter/mds9000/sw/5_0/configuration/guides/int/nxos/buffers.html
"
The receive BB_credit values depend on the module type and the port mode, as follows:
•For
16-port switching modules and full rate ports, the default value is 16
for Fx mode and 255 for E or TE modes. The maximum value is 255 in all
modes. This value can be changed as required.
•For
32-port switching modules and host-optimized ports, the default value
is 12 for Fx, E, and TE modes. These values cannot be changed.
•For Generation 2 and Generation 3 switching modules, see the "Buffer Pools" section."
If we take a look at interface fc2/2 for example, which is configured as an F Port, we see the following:
MDS1# show int fc2/2 bbcredit
fc2/2 is up
Transmit B2B Credit is 3
Receive B2B Credit is 16
16 receive B2B credit remaining
3 transmit B2B credit remaining
3 low priority transmit B2B credit remaining
The credits allocated to this interface are shown. As you can see, there is a term here, transmit B2B credit; why is that so low? The reason is that this port is connected to an HBA, and the HBA on this particular server does not have all that much buffer space, so it communicates a message saying hey, I only have 3 B2B credits spare.
So the FC ports communicate this with each other as you can see above :)
Let's look at how we can manually change the buffer to buffer credits on an interface:
MDS1(config-if)# switchport fcrxbbcredit 24 ?
mode Configure receive BB_credit for specific mode
Here you can see that we could just specify the amount of BB credits, but we have more interesting options available to us too!
MDS1(config-if)# switchport fcrxbbcredit 500 mode ?
E Configure receive BB_credit for E or TE mode
Fx Configure receive BB_credit for F or FL mode
We can pre-configure a port to say: OK, if it's in E mode, use these buffer credits; if it's in F mode, use these buffer credits.
MDS1(config-if)# switchport fcrxbbcredit ?
<1-500> Enter receive BB_credit
default Default receive BB_credit
extended Configure extended BB_credit for the port
performance-buffers Configure performance buffers for receive BB_credit
We can specify both performance and extended buffer credits too. Performance buffer credits are additional buffer credits on top of the already allocated buffer credits, but are only available on some line cards and modules:
MDS1(config-if)# switchport fcrxbbcredit performance-buffers ?
<1-145> Enter performance buffers for receive BB_credit
default Default performance buffers for receive BB_credit
MDS1(config-if)# switchport fcrxbbcredit performance-buffers 145 ?
MDS1(config-if)# switchport fcrxbbcredit performance-buffers 145
fc2/1: (error) requested config change not allowed
MDS1(config-if)# switchport fcrxbbcredit performance-buffers 1
fc2/1: (error) requested config change not allowed
As you can see from the above, I don't have a single performance buffer available on this module :( No performance buffers for me!
However, by enabling the feature:
feature fcrxbbcredit extended
I now have additional buffer to buffer credits I can allocate:
MDS1(config-if)# switchport fcrxbbcredit extended ?
<256-4095> Enter extended credit receive BB_credit
MDS1(config-if)# switchport fcrxbbcredit extended 256
So! This shows the basics of buffer to buffer credits, but a few questions remain.
First of all, what is this line of output at the top of our show port-resources?
MDS1# show port-resources module 2
Module 2
Available dedicated buffers are 4656
So we have a pool of buffers that is available to all the ports. This is the "common unallocated buffer pool for BB_Credits", as per a very useful diagram in the Cisco documentation.
That diagram shows how the buffers stack up: the allocated buffers per port, then the common unallocated buffer pool for BB_credits, then the performance buffers, which we have IF we have a hardware module that supports them, and finally the reserved internal buffers, which we as the users cannot modify.
Now, if we look at a port that we configure the buffer manually on:
MDS1(config-if)# show run int fc2/2
interface fc2/2
switchport speed 4000
switchport rate-mode dedicated
switchport fcrxbbcredit 128
Notice our "Shared buffer pool" has shrunk:
MDS1(config-if)# show port-resources module 2
Module 2
Available dedicated buffers are 4556
On the other hand, if we take a port out of service (or in this case, a whole bunch of ports)
MDS1(config)# int fc2/10 - 20
MDS1(config-if)# out-of-service
Putting an interface out-of-service will cause its shared resource configuration to revert to default
Do you wish to continue(y/n)? [n] y
MDS1(config-if)#
Suddenly our available buffers increase:
MDS1(config-if)# show port-resources module 2
Module 2
Available dedicated buffers are 4875
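You can think of the module as doing simple pool accounting along these lines (a toy sketch of the behaviour we just observed; the real numbers on a module won't match exactly, because the switch adjusts some allocations internally):

# Toy accounting for a module's common buffer pool (illustration only).
class ModuleBufferPool:
    def __init__(self, available, default_per_port=16):
        self.available = available     # the "Available dedicated buffers" figure
        self.default = default_per_port
        self.allocated = {}            # port -> credits currently held

    def set_credits(self, port, credits):
        # Raising a port's credits draws the difference from the pool.
        current = self.allocated.get(port, self.default)
        self.available -= credits - current
        self.allocated[port] = credits

    def out_of_service(self, port):
        # An out-of-service port returns its allocation to the pool.
        self.available += self.allocated.pop(port, self.default)

pool = ModuleBufferPool(available=4656)
pool.set_credits("fc2/2", 128)
print(pool.available)                  # 4544 - the pool shrank
pool.out_of_service("fc2/10")
print(pool.available)                  # 4560 - buffers came back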
The last thing to worry about, in my opinion, for buffer to buffer credits is described succinctly in the Cisco documentation:
Enabling Buffer-to-Buffer Credit Recovery
Although the Fibre Channel standards require low bit error rates, bit
errors do occur. Over time, the corruption of receiver-ready messages,
known as R_RDY primitives, can lead to a loss of credits, which can
eventually cause a link to stop transmitting in one direction. The Fibre
Channel standards provide a feature for two attached ports to detect
and correct this situation. This feature is called buffer-to-buffer
credit recovery.
Buffer-to-buffer credit recovery functions as follows: the sender and
the receiver agree to send checkpoint primitives to each other, starting
from the time that the link comes up. The sender sends a checkpoint
every time it has sent the specified number of frames, and the receiver
sends a checkpoint every time it has sent the specified number of R_RDY
primitives. If the receiver detects lost credits, it can retransmit them
and restore the credit count on the sender.
The buffer-to-buffer credit recovery feature can be used on any
nonarbitrated loop link. This feature is most useful on unreliable
links, such as MANs or WANs, but can also help on shorter, high-loss
links, such as a link with a faulty fiber connection.
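As a rough mental model of what recovery buys the sender, consider this sketch (heavily simplified and of my own making; the real feature uses BB_SC checkpoint primitives, and the checkpoint interval is negotiated when the link comes up):

# Simplified model of buffer-to-buffer credit recovery (illustration only).
BB_SC_INTERVAL = 8   # assume a checkpoint after every 8 R_RDYs

class CreditRecoveringSender:
    def __init__(self, bb_credit):
        self.bb_credit = bb_credit
        self.r_rdy_since_checkpoint = 0

    def receive_r_rdy(self):
        self.bb_credit += 1
        self.r_rdy_since_checkpoint += 1

    def receive_checkpoint(self):
        # The peer sent BB_SC_INTERVAL R_RDYs since the last checkpoint.
        # Any shortfall on our side means R_RDYs were corrupted in transit,
        # so we restore the lost credits instead of slowly starving the link.
        lost = BB_SC_INTERVAL - self.r_rdy_since_checkpoint
        self.bb_credit += lost
        self.r_rdy_since_checkpoint = 0

# For simplicity we skip the frame transmissions that spent these credits.
s = CreditRecoveringSender(bb_credit=16)
for _ in range(6):           # two of the peer's eight R_RDYs never arrived
    s.receive_r_rdy()
s.receive_checkpoint()
print(s.bb_credit)           # 24 - the same as if all eight had arrived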
Configuring:
MDS1(config-if)# int fc2/2
MDS1(config-if)# switchport fcbbscn
That should just about cover buffer to buffer credits; next, let's look at interfaces and shared bandwidth.
Back to our favorite command, show port-resources:
MDS1(config-role)# show port-resource module 2
Module 2
Available dedicated buffers are 4875
Port-Group 1
Total bandwidth is 12.8 Gbps
Total shared bandwidth is 8.8 Gbps
Allocated dedicated bandwidth is 4.0 Gbps
--------------------------------------------------------------------
Interfaces in the Port-Group B2B Credit Bandwidth Rate Mode
Buffers (Gbps)
--------------------------------------------------------------------
fc2/1 16 4.0 shared
fc2/2 128 4.0 dedicated
fc2/3 16 4.0 shared
fc2/4 16 4.0 shared
fc2/5 16 4.0 shared
fc2/6 16 4.0 shared
fc2/7 16 4.0 shared
fc2/8 16 4.0 shared
fc2/9 16 4.0 shared
fc2/10 (out-of-service)
fc2/11 (out-of-service)
fc2/12 (out-of-service)
The sections we care about are the bandwidth totals at the top of the output.
In this module I have 12.8 gig available to share amongst 12 ports. The total bandwidth that is shareable across all the ports is 8.8 gig, BECAUSE I have specifically allocated 4 gig of dedicated bandwidth to one port. In order to get more dedicated ports, I would need to take some ports out of service; let's explore that a bit more...
If I were to leave all ports at the default, the 12.8 gig would simply be divided up amongst the ports as a shared pool of bandwidth:
MDS1(config-if)# show port-resource module 2
Module 2
Available dedicated buffers are 4888
Port-Group 1
Total bandwidth is 12.8 Gbps
Total shared bandwidth is 12.8 Gbps
Allocated dedicated bandwidth is 0.0 Gbps
--------------------------------------------------------------------
Interfaces in the Port-Group B2B Credit Bandwidth Rate Mode
Buffers (Gbps)
--------------------------------------------------------------------
fc2/1 16 4.0 shared
fc2/2 16 4.0 shared
fc2/3 16 4.0 shared
fc2/4 16 4.0 shared
fc2/5 16 4.0 shared
fc2/6 16 4.0 shared
fc2/7 16 4.0 shared
fc2/8 16 4.0 shared
fc2/9 16 4.0 shared
fc2/10 16 4.0 shared
fc2/11 16 4.0 shared
fc2/12 16 4.0 shared
So here you can see we have 12.8 gig and we have 12 ports. Now, it doesn't take a maths genius (maths was my worst subject in school, ha ha) to work out that 12.8 gig divided by 12 ports does NOT give us 4.0 gig per port; every port can burst to 4 gig, but together they share, and oversubscribe, the 12.8 gig pool.
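In plain numbers (just quick arithmetic, nothing switch-specific):

# 12 ports contending for a 12.8 Gbps shared pool (illustration only).
ports, port_speed, pool_gbps = 12, 4.0, 12.8
print(round(pool_gbps / ports, 2))       # 1.07 Gbps each if all talk at once
print(ports * port_speed / pool_gbps)    # 3.75 - the oversubscription ratio

We could take one of those interfaces and make it dedicated like so: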
MDS1(config)# int fc2/1
MDS1(config-if)# switchport rate-mode dedicated
MDS1(config-if)# show port-resources module 2
Module 2
Available dedicated buffers are 4885
Port-Group 1
Total bandwidth is 12.8 Gbps
Total shared bandwidth is 8.8 Gbps
Allocated dedicated bandwidth is 4.0 Gbps
--------------------------------------------------------------------
Interfaces in the Port-Group B2B Credit Bandwidth Rate Mode
Buffers (Gbps)
--------------------------------------------------------------------
fc2/1 16 4.0 dedicated
fc2/2 16 4.0 shared
fc2/3 16 4.0 shared
fc2/4 16 4.0 shared
- Output Omitted -
So now we have 8.8 gig of shared bandwidth free and 4 gig of dedicated bandwidth. But let's say I need another dedicated interface in this port group. Well, I have 8.8 gig of shared bandwidth free, right, so can't I just carve out another 4 gig of that for my dedicated interface?
MDS1(config-if)# int fc2/2
MDS1(config-if)# switchport rate-mode dedicated
fc2/2: (error) Bandwidth not available
Huh? What's going on?
Check out the reserved-bandwidth table in the Cisco documentation:
Every port has a particular amount of reserved bandwidth; in our case it's 0.8 gig per port. So let's do some math: 0.8 x 11 = 8.8 (remember, we have 12 ports, one of which is dedicated, so the 8.8 gig of shared bandwidth is split between 11 ports). The per-port reservations alone consume the entire shared pool, so there is nothing left over to dedicate to another port.
So, if we want another dedicated interface, we need to increase that shared pool, how do we do it?
By taking ports out of service:
MDS1(config-if)# int fc2/12
MDS1(config-if)# out-of-service
MDS1(config-if)# show port-resources module 2
Module 2
Available dedicated buffers are 4943
Port-Group 1
Total bandwidth is 12.8 Gbps
Total shared bandwidth is 8.8 Gbps
Allocated dedicated bandwidth is 4.0 Gbps
--------------------------------------------------------------------
Interfaces in the Port-Group B2B Credit Bandwidth Rate Mode
Buffers (Gbps)
--------------------------------------------------------------------
fc2/1 16 4.0 dedicated
fc2/2 16 4.0 shared
fc2/3 16 4.0 shared
fc2/4 16 4.0 shared
fc2/5 16 4.0 shared
fc2/6 16 4.0 shared
fc2/7 16 4.0 shared
fc2/8 16 4.0 shared
fc2/9 16 4.0 shared
fc2/10 16 4.0 shared
fc2/11 (out-of-service)
fc2/12 (out-of-service)
Now that we have taken the ports out of service, we have 8.8 gig to divide between not 11 ports but 9 ports, giving us 0.97 each. But if we were to put fc2/2 into dedicated mode, that would be 8.8 - 4, leaving us with 4.8, which divided by 8 ports (since we would not be counting fc2/2 anymore) gives us 0.6, which still does not meet our minimum reserved bandwidth. So we need to put more ports out of service.
4.8 / 6 ports gives us the number we are looking for, 0.8, so two more ports have to go out of service.
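The check the switch is effectively doing can be sketched like this (my own reconstruction of the arithmetic in this example, not actual NX-OS logic; the 0.8 Gbps reservation is this module's figure):

# Can one more shared port become dedicated? (illustration only)
RESERVED_PER_PORT = 0.8   # minimum reserved bandwidth per shared port (Gbps)

def can_dedicate(shared_pool_gbps, shared_ports, port_speed=4.0):
    # Dedicating a port removes its full speed from the shared pool
    # and removes the port itself from the shared count.
    remaining_pool = shared_pool_gbps - port_speed
    remaining_ports = shared_ports - 1
    return remaining_pool / remaining_ports >= RESERVED_PER_PORT

print(can_dedicate(8.8, 9))   # False: 4.8 / 8 = 0.6, below the 0.8 minimum
print(can_dedicate(8.8, 7))   # True:  4.8 / 6 = 0.8, exactly the minimum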
So we take a few more ports out of service:
MDS1(config-if)# int fc2/2
MDS1(config-if)# switchport rate-mode dedicated
fc2/2: (error) Bandwidth not available
MDS1(config-if)# int fc2/9 - 10
MDS1(config-if)# out-of-service
Putting an interface out-of-service will cause its shared resource configuration to revert to default
Do you wish to continue(y/n)? [n] y
MDS1(config-if)# int fc2/2
MDS1(config-if)# switchport rate-mode dedicated
Now we have the bandwidth to do it.
Obviously, all of these calculations change if, when you set up the dedicated rate-mode, you change the amount of bandwidth the dedicated interface is allowed, with something like:
MDS1(config-if)# switchport speed auto max 2000
Now, you CAN change this behaviour
MDS1(config)# no rate-mode oversubscription-limit module 2
This significantly reduces the amount of reserved bandwidth per port:
MDS1(config-if)# show port-resource module 2
Module 2
Available dedicated buffers are 4879
Port-Group 1
Total bandwidth is 12.8 Gbps
Total shared bandwidth is 0.8 Gbps
Allocated dedicated bandwidth is 12.0 Gbps
--------------------------------------------------------------------
Interfaces in the Port-Group B2B Credit Bandwidth Rate Mode
Buffers (Gbps)
--------------------------------------------------------------------
fc2/1 16 4.0 dedicated
fc2/2 16 4.0 dedicated
fc2/3 16 4.0 dedicated
fc2/4 16 4.0 shared
fc2/5 16 4.0 shared
fc2/6 16 4.0 shared
fc2/7 16 4.0 shared
fc2/8 16 4.0 shared
fc2/9 16 4.0 shared
fc2/10 16 4.0 shared
fc2/11 16 4.0 shared
fc2/12 16 4.0 shared
As you can see, here we have now allocated three interfaces as dedicated, which is the maximum we can do because 4 x 3 = 12, leaving us with 0.8 of a gig shared between the other ports.
Alright! Last thing, is to do with FCoE
So, a quick review. When FCoE came out, they said oh dear, we don't have a lossless mechanism in FCoE; we don't have anything like R_RDY, and we certainly don't have anything like buffer to buffer credits. Ethernet did have something called pause frames, which had been around for quite a while, so they said to themselves: let's build on pause frames and create per-priority flow control, so that you can send pause frames for particular CoS values only! Champagne for everyone!
But they still had a problem, the same one faced by fibre channel: as the distance of a link increases, you can send so much traffic that before the other end can tell you "Hey, my buffers are full, don't send me any more please!" (i.e. a pause frame), you have already sent too much, and it will have to drop some. The way FC combats this we have already covered: buffer to buffer credits keep track of how much traffic the other end can handle, while we wait for the R_RDY. Unfortunately no such mechanism exists in FCoE, so they had to come up with a slightly different solution. Check out the Cisco documentation below:
http://www.cisco.com/en/US/docs/switches/datacenter/nexus5000/sw/qos/513_n1_1/b_cisco_nexus_5000_qos_config_gd_513_n1_1_chapter_011.html
Configuring No-Drop Buffer Thresholds
Beginning with Cisco NX-OS Release 5.0(2)N1(1), you can configure the no-drop buffer threshold settings for 3000m lossless Ethernet.
Note: To achieve lossless Ethernet for both directions, the devices connected to the Cisco Nexus 5548 switch must have the similar capability. The default buffer and threshold value for the no-drop can ensure lossless Ethernet for up to 300 meters.
They very helpfully also show you the values to support the maximum distance, which is 3000m (3 km):
switch(config)# policy-map type network-qos nqos_policy
switch(config-pmap-nq)# class type network-qos nqos_class
switch(config-pmap-nq-c)# pause no-drop buffer-size 152000 pause-threshold 103360 resume-threshold 83520
switch(config-pmap-nq-c)# exit
switch(config-pmap-nq)# exit
switch(config)# exit
switch#
So if we take the values they show there (152000, 103360, and 83520), we can, for example, divide them by 3 to get 1 km values.
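For example, scaling the documented 3000 m values down linearly (a sketch of that suggestion only; double-check the QoS configuration guide for your platform before configuring real values):

# Scale the documented 3000 m no-drop thresholds to a shorter link
# (simple linear scaling, illustration only).
DOC_3000M = {"buffer-size": 152000,
             "pause-threshold": 103360,
             "resume-threshold": 83520}

def scale_for_distance(metres):
    factor = metres / 3000.0
    return {name: round(value * factor) for name, value in DOC_3000M.items()}

print(scale_for_distance(1000))
# {'buffer-size': 50667, 'pause-threshold': 34453, 'resume-threshold': 27840}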
This is all found in the NX-OS QoS Configuration Guide.
I hope this was interesting or helped someone out there.