FCoE: Uncovering the CNA – Deep Dive

By Nigel Poulton | October 2, 2009

Continuing with my current theme of Fibre Channel over Ethernet, and at the request of several people, this post will take a close look at Converged Network Adapters (CNA).

Now then, there is no easy way to put this, but doing justice to a topic like this requires a lot of words.  I’ll do my best to keep it succinct, but if you are looking for a high-level overview in under 1,000 words then this probably isn’t for you – thanks for stopping by………  However, if you want a deep dive and the opportunity for technical discussion, then this might be exactly what you’re looking for.

Still here? Magic, let’s go…..

First things first, it’s best to have a quick look at what things look like without CNAs, in order to more fully appreciate some of the problems CNAs are resolving.

 
Before CNAs

In a traditional server with no CNA cards, and not connected to a unified fabric, it is not uncommon to see the following “network”-related sprawl (sorry about the terrible diagram, but I wrote most of this post on an aeroplane) –
 

Diagram 1

 
The FC HBAs tend to run at either 2, 4 or 8Gbps with the NICs usually running at either 100Mbps or 1Gbps.

The multiplicity of NICs is common in order to meet the demands of multiple traffic types such as production, backup, management etc.

It is also worth noting that most servers are built with two physical HBA cards installed to provide both redundancy and increased aggregate bandwidth/performance – commonly referred to as I/O multi-pathing, which is a must in 99.99% of FC SANs.

But that was then and this is now….. or nearly now

Say hello to the Converged Network Adapter, or CNA for short.

The Converged Network Adapter is exactly what it says it is – an HBA and a NIC converged on to a single PCIe adapter –
 

Diagram 2 – Picture courtesy of Emulex

Digging into the technical detail, some CNAs have a single ASIC that performs both the HBA and the NIC functionality, whereas others have a separate ASIC for each of the distinct functions.  This really depends on the vendor and model.  The important point being – CNAs provide HBA and NIC functionality in hardware, making them fast and reducing CPU overhead.  Implementing functionality without thieving CPU cycles is a big plus-point in server virtualisation environments.


Offload EVERYTHING!!

Considering the fact that stealing CPU cycles is undesirable at the best of times, and a federal offence punishable by death in virtual server environments, providing hardware offloads is vital.  To help out in this area, the latest raft of Generation 2 Emulex OneConnect Universal Converged Network Adapters, such as the LP2100x, provide full CPU offload for all protocols in a single-chip design and are FIP compliant!  That’s not just FCoE offload – we’re talking about TOE and iSCSI as well.  The iSCSI one is interesting and may be a chat for another day!
 

NOTE: I will point out that I had a chat recently with Emulex and was very impressed.  They are laser focussed (pun intended) on FCoE and really understand the market and value position of FCoE.  A real pleasure to chat with passionate, like-minded people – oh, and they didn’t hang up on me when I mentioned "Broadcom" 😉  VERY unprofessional of me, but I was unbelievably jetlagged when we spoke, so I thank them for overlooking my lack of tact and poor humour.

Given that CNAs provide both HBA and NIC functionality, as well as connecting to 10Gbps Enhanced Ethernet fabrics, it shouldn’t take a rocket scientist to figure out that we could swap out our 6 PCI adapters from Diagram 1 and replace them with just 2 CNAs (2 for redundancy).  If you think about it, this has the potential to reduce PCI adapter count and cable sprawl by a crazy percentage – as the quick sketch below shows.
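To put a rough number on that, here is a quick back-of-the-envelope sketch (Python, purely illustrative – the adapter counts are the ones from Diagram 1):

```python
# Back-of-the-envelope consolidation maths (illustrative counts from Diagram 1).
before = {"FC HBAs": 2, "NICs": 4}      # 6 adapters, 6 cables before convergence
after = {"CNAs": 2}                      # 2 CNAs for redundancy / I/O multi-pathing

adapters_before = sum(before.values())
adapters_after = sum(after.values())
reduction_pct = (adapters_before - adapters_after) / adapters_before * 100

print(f"{adapters_before} adapters/cables -> {adapters_after} "
      f"(a {reduction_pct:.0f}% reduction)")   # 6 -> 2 (a 67% reduction)
```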

Saving Space

Early generation CNAs are installed as PCIe adapters in servers and as PCIe mezzanine cards in blade servers.  However, the way forward is to eventually have them embedded on the motherboard – à la LOM (LAN on Motherboard) – and vendors currently have this on their roadmaps.  But this is no biggy, right?  Well actually this has the potential to hugely reduce the amount of space consumed by PCI adapters inside servers, paving the way for higher density in blade servers where real-estate is at a premium!

Now this brings up another interesting benefit.  Not only are FCoE and its associated technologies resolving today’s problems, they are actually enabling a better future.  How good would it be to have 6 or more CNAs per blade server!  That’s 60Gbps+ of flexible bandwidth based on today’s 10Gbps Enhanced Ethernet!

Now this becomes a real winner in light of the current raft of virtualisation-optimised CPUs on the market – think Intel Nehalem and associated virtualisation technologies like Intel VT-x, VT-c, VT-d and VMDq – expect the ratio of VMs per core to rocket skyward!  In order to keep pace with the advancements in chip technology, the I/O subsystem needs to move in step.  CNAs and Lossless Enhanced Ethernet are vital to this.

This point aside, I recently had a hand in a design involving HP BladeSystem c7000 technology.  I remember during the design wishing that the system had CNAs instead of the Flex10 and Virtual Connect technologies.  It could have saved space and cabling, as well as potentially given more flexibility.  However, this was prior to FCoE even being ratified by INCITS.

Obvious benefits summarised

So some obvious benefits that CNAs bring include a reduced number of PCI adapters and cables, plus savings in space, power and cooling.  Nice!

Secondly, by supporting 10Gbps Enhanced Ethernet they provide greater bandwidth than existing adapters, allowing more traffic types and greater volumes to travel down the same stretch of cable.  Suddenly the wire-once nirvana is now a reality (actually wire-twice if you want true I/O multi-pathing).
 

NOTE: A quick heads-up on throughput.  While CNAs provide access to 10Gbps CEE networks, not all services, including FCoE, are supported at full bandwidth.  For example, CNAs normally support Ethernet at 10Gbps but FCoE only runs at a max of either 4Gbps or 8Gbps.  This is in line with 4Gbps and 8Gbps FC – there is currently no standard for 10Gbps FC N_Port to F_Port connections.
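For a feel of how a single 10Gbps converged link might be carved up between FCoE and the other traffic types, here is a minimal sketch of the sort of priority-group bandwidth budgeting that Enhanced Ethernet allows.  The class names and percentages are made-up examples, not vendor defaults.

```python
# Illustrative ETS-style bandwidth budgeting for one 10Gbps converged link.
# Class names and percentages are made-up examples, not vendor defaults.
LINK_GBPS = 10

priority_groups = {
    "FCoE (lossless, PFC protected)": 40,   # roughly in line with a 4Gbps FC HBA
    "Production LAN": 30,
    "Backup": 20,
    "Management": 10,
}

assert sum(priority_groups.values()) == 100, "allocations should total 100%"

for name, pct in priority_groups.items():
    print(f"{name:<32} {pct:>3}%  ~{LINK_GBPS * pct / 100:.0f} Gbps minimum")
```

Broadly speaking, these allocations behave as guaranteed minimums rather than hard caps – a class that isn’t using its share leaves that bandwidth free for the others.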

<OK so this is about the half way point, so you might want to come up for some air here before discussing the interesting stuff>

Clever things with CNAs

Now to the deeper technical stuff………..

With 10Gbps Enhanced Ethernet, FCoE, CNAs and other associated technologies being relatively new, the standards bodies (such as INCITS T11, whose FC-BB-5 project defines FCoE) as well as the vendors are able to design with virtualisation in mind.  This is great!

Let’s mention a couple of technologies that bring a lot to the table in Hypervisor environments.

NPIV

The first technology worth mentioning is N_Port ID Virtualisation, otherwise known as NPIV.  NPIV is a T11 INCITS and ANSI standard initially developed by IBM and Emulex, and yes I know it’s not exactly new.

Because NPIV is not a new technology I won’t spend much time on it here, other than to say NPIV makes it possible for a single HBA port to have several N_Port IDs and WWPNs, allowing virtual machines to be uniquely addressable on the SAN and therefore able to be managed on the SAN just like normal physical servers.
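As a rough illustration of the concept (not any particular vendor’s API – the WWPN values below are made up), think of one physical N_Port performing extra virtual fabric logins, one per VM:

```python
# Minimal model of NPIV: one physical HBA port, several virtual WWPNs,
# each one giving a VM its own identity (and zoning/masking handle) on the fabric.
# All WWPN values below are made-up examples.
from dataclasses import dataclass, field

@dataclass
class PhysicalNPort:
    wwpn: str                                          # the HBA port's own WWPN
    virtual_ports: dict = field(default_factory=dict)  # vm_name -> virtual WWPN

    def npiv_login(self, vm_name: str, virtual_wwpn: str) -> None:
        """Perform an additional (virtual) fabric login on behalf of a VM."""
        self.virtual_ports[vm_name] = virtual_wwpn

hba = PhysicalNPort(wwpn="10:00:00:00:c9:aa:bb:01")
hba.npiv_login("vm-oracle-01", "c0:50:76:00:00:00:00:01")
hba.npiv_login("vm-exchange-01", "c0:50:76:00:00:00:00:02")

# Each VM can now be zoned and LUN-masked on the SAN just like a physical server.
for vm, wwpn in hba.virtual_ports.items():
    print(f"{vm}: addressable on the fabric as WWPN {wwpn}")
```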
 

If it would be useful I can put something up on NPIV.  Just leave a comment at the bottom, and if enough people ask I’ll post on it.

I/O Virtualisation (SRIOV)

So while NPIV allows multiple VMs to be uniquely addressable on the SAN side of an HBA, on the other side – the side of the server’s PCI tree – there is still a 1:1 mapping between physical ports and addressable PCI devices.


The Middle-man

Early virtual server implementations see the hypervisor act as a middleman between the I/O adapter and the Virtual Machine (VM).  The VM never actually talks directly with the I/O adapter, always through a middle-man – the hypervisor.  But like any middleman scenario, it has its advantages and disadvantages.  The middleman would argue that he adds “value”, but is rarely so keen to point out that he also always takes a generous percentage of the spoils (in our case, I/O performance and CPU cycles).

Some of the advantages and disadvantages of having the hypervisor as the middleman might include –

Middle-man Advantages

  • Hypervisors often provide snapshot capabilities
  • Hypervisors often provide thin provisioning capabilities
  • Many existing Hypervisor technologies are currently designed around this model

Middle-man Disadvantages

  • Lower I/O performance.  Because the hypervisor handles all I/O to and from a VM, as well as the associated interrupt handling etc, this injects server-side latency into the I/O path, as well as stealing CPU cycles from the Hypervisor’s main role.
  • Security concerns.  The fact that all I/O, to and from VMs, is seen and touched by the hypervisor may be a concern to some people :-S
  • Limited feature set.  You are restricted to the features and functions provided by the Hypervisor and are not exposed to the full feature set available from the manufacturer’s native driver….

A common example where having the Hypervisor act as the middle-man is not seen as desirable is a high-throughput OLTP-type system.  These are rarely virtualised due to, among other things, the I/O performance impact associated with having the Hypervisor as the middle-man.

So what does SRIOV do to help?

NOTE: Single Root I/O Virtualisation (SRIOV) and Multi Root I/O Virtualisation (MRIOV) are extensions to PCIe brought to us courtesy of the electronics industry consortium known as the PCI Special Interest Group (PCI-SIG).

SRIOV implementations allow an I/O adapter, in co-operation with the Hypervisor, to be sliced up into multiple virtual adapters, each of them uniquely addressable on the server’s PCI tree.  Each virtual adapter can then be mapped directly to a virtual machine and in turn is directly addressable by that virtual machine – no more middle-man.  SRIOV-based adapters may have dedicated I/O paths in silicon and work alongside other related technologies such as Intel VT-c, VT-d and VMDq to provide a huge variety of offloads and assists aimed at reducing the load on main CPU and memory systems and thus increasing system performance.  This mode of operation is sometimes referred to as “hypervisor bypass” mode (we should be thinking VMDirectPath by now) and offers close to full line rate, making virtualisation a more realistic option for transaction-based and other I/O-intensive systems like our example OLTP system.
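Here is a minimal sketch of the idea – one physical function (PF) carving out virtual functions (VFs) that are handed straight to VMs.  The class names and the VF limit are illustrative only, not any real driver model:

```python
# Minimal model of SRIOV: a physical function (PF) exposes virtual functions
# (VFs); each VF appears as its own device on the PCI tree and can be assigned
# directly to a VM - no hypervisor middle-man in the data path.
# Class names and the VF limit are illustrative only.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VirtualFunction:
    vf_index: int
    assigned_vm: Optional[str] = None

@dataclass
class PhysicalFunction:
    pci_address: str
    max_vfs: int = 128                    # some adapters advertise 128 virtual interfaces
    vfs: List[VirtualFunction] = field(default_factory=list)

    def enable_vfs(self, count: int) -> None:
        if count > self.max_vfs:
            raise ValueError("adapter cannot expose that many VFs")
        self.vfs = [VirtualFunction(i) for i in range(count)]

    def assign(self, vf_index: int, vm_name: str) -> None:
        self.vfs[vf_index].assigned_vm = vm_name   # direct device assignment to the VM

cna = PhysicalFunction(pci_address="0000:41:00.0")
cna.enable_vfs(4)
cna.assign(0, "vm-oltp-01")               # the I/O-hungry OLTP VM gets its own VF
print([(vf.vf_index, vf.assigned_vm) for vf in cna.vfs])
```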

SRIOV also opens the door to things like VMs running drivers provided by the manufacturer, making new functionality immediately available without waiting for the Hypervisor middle-man to support it.  Obviously it all needs buy-in from your Hypervisor…..

Of course SRIOV is a semi-open standard (I say semi-open because several months ago, when I researched it, you had to pay to get access to the standards docs) and, like most standards, each vendor is free to implement the specifics in their own unique and value-add way, as well as to add more features and requirements around it.

So in a nutshell, NPIV enables a single port to have multiple discrete addresses on the SAN side, whereas SRIOV allows a single I/O adapter to have multiple discrete addresses internally on the server’s PCI bus/tree.  The diagram below, from a while back on the Cisco website, shows an I/O card that can be partitioned into 128 (0-127) virtual interfaces.
 

Diagram from the Cisco website

Say goodbye to hardware rip and replace

By implementing the above-mentioned technologies it becomes possible to run a single cable to a server and then use management software to configure the virtual interfaces according to current requirements.  For example, removing a NIC function and adding an HBA function becomes just a software change – no requirement to physically swap out a PCI card or lift the floor and run new cables.  Simply make the change in software and the new device will be presented to the server’s PCI tree – job done!  Sound good to anyone else?
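To give a feel for what “just a software change” might look like, here is a hypothetical sketch of such a management workflow.  The class and method names are invented purely for illustration – real tools expose their own APIs and CLIs:

```python
# Hypothetical sketch of configuring virtual interface "personalities" in software.
# The class and method names are invented for illustration; real management tools
# expose their own APIs/CLIs.
from dataclasses import dataclass, field

@dataclass
class ConvergedAdapter:
    """One physical CNA, one cable; its virtual interfaces are defined in software."""
    name: str
    interfaces: list = field(default_factory=list)

    def add_vnic(self, label: str, bandwidth_gbps: float) -> None:
        self.interfaces.append(("vNIC", label, bandwidth_gbps))

    def add_vhba(self, label: str, bandwidth_gbps: float) -> None:
        self.interfaces.append(("vHBA", label, bandwidth_gbps))

    def remove(self, label: str) -> None:
        self.interfaces = [i for i in self.interfaces if i[1] != label]

cna = ConvergedAdapter("cna0")
cna.add_vnic("prod-lan", 4.0)
cna.add_vnic("backup", 2.0)
cna.add_vhba("san-a", 4.0)

# Requirements change: drop a NIC function, add another HBA function.
# No new card, no new cable - the new device simply appears on the PCI tree.
cna.remove("backup")
cna.add_vhba("san-b", 2.0)
print(cna.interfaces)
```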

There is more, but I doubt anyone would read more than this in one post.  Other topics can always be discussed in the comments section below….

So if you made it all the way to here, thanks – and I hope it was useful.  Please feel free to join the discussion either here on the blog site via the comments section below, or by following me on Twitter @nigelpoulton.  I only talk about storage and related technologies.

Nigel

PS.  I am an independent consultant and available for hire via the Contact Me page

18 thoughts on “FCoE: Uncovering the CNA – Deep Dive”

  1. Jay Livens

    Nigel,

    Great post. One of the things that I find curious is that you mention the concept of using a CNA to collapse multiple HBAs into one. Your diagram (and the one from Emulex) showed FC HBAs. Does the CNA support traditional FC? I did not think so. If this is the case then it brings up an interesting point, which is that you cannot really collapse all your HBAs into one until you get rid of traditional FC. It also brings up the point of FCoE adoption, which I posted about in my blog here: http://www.aboutrestore.com/2009/09/25/pondering-fibre-channel-over-ethernet/

  2. Nigel Poulton

    Hi Jay,

    That’s a good question and I think I’ll write a post up about it in more detail some time.  However, for now hopefully this will do –

    FCoE is basically the encapsulation of FC frames inside of Ethernet frames.  So your normal 2,148 byte FC frame (SOF, header, payload, CRC, EOF) gets wrapped up inside an Ethernet frame – only not a standard ~1,500 byte Ethernet frame, it requires baby jumbo frames so that the FC frame is never fragmented.  So your normal FC frame still exists, just wrapped in an Ethernet frame with a specific EtherType.
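    To put some rough numbers on that, here is a quick sketch of the framing arithmetic (the 2,500-byte “baby jumbo” figure is a commonly used ballpark setting rather than a quote from the spec):

```python
# Rough framing arithmetic behind the "baby jumbo frame" requirement.
# The 2,148-byte figure is the maximum FC frame mentioned above; the 2,500-byte
# mini-jumbo MTU is a commonly quoted ballpark, not a spec value.

FCOE_ETHERTYPE = 0x8906          # EtherType that identifies FCoE frames
STANDARD_MTU = 1500              # standard Ethernet payload size
MAX_FC_FRAME = 2148              # SOF + header + payload + CRC + EOF
BABY_JUMBO_MTU = 2500            # typical "baby jumbo" setting on CEE ports

print(f"Largest FC frame: {MAX_FC_FRAME} bytes")
print(f"Fits in a standard {STANDARD_MTU}-byte Ethernet payload? "
      f"{MAX_FC_FRAME <= STANDARD_MTU}")    # False -> would need fragmenting
print(f"Fits once the MTU is raised to {BABY_JUMBO_MTU}? "
      f"{MAX_FC_FRAME <= BABY_JUMBO_MTU}")  # True -> never fragmented
print(f"FCoE EtherType: {FCOE_ETHERTYPE:#06x}")
```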

    The cable out of the CNA plugs into a switch that supports CEE.

    Hope that I understood your question and that the above helps.

    Nigel

  3. Julie Herd Goodman

    Nigel, again, this is awesome stuff.  Even if it does make the marketing side of my brain hurt a little. ;-)

    Forgive me if I’m being dense, but relative to CNAs, how is the IP traffic and the FCoE traffic kept from consuming one another’s bandwidth?  Or is it still the case that, in order to improve overall bandwidth, multiple CNAs can be added for performance?

    A little more on NPIV, with an emphasis on SAN management / zoning, would be hugely helpful.  Thanks again for finding the perfect balance on the level of detail in your overviews.

  4. Jay Livens

    Nigel,

    Thank you for the response. I think that the model of collapsing HBAs/NICs does not work well for existing FC users. They have HBAs in their servers and dedicated FC switch ports. If they really want to collapse their HBAs into one, they will have to rip and replace FC with FCoE. This can be costly and disruptive and might not make sense given their existing investment in FC. I think that the opportunity for collapsing HBAs/NICs really only applies to new equipment or a situation where a complete re-architecture is occurring. Do you agree?

    JL

  5. Nigel Poulton

    Jay,

    I agree.  The approach is not one of rip and replace.  A sweet spot is definitely new deployments, but not necessarily just new Data Centre builds.  My experience deploying the HP c7000 BladeSystems was a small deployment of 2 x HP BladeSystem c7000 into an existing FC environment, and it would have benefited.

    Of course if you are building a new Data Centre then it makes sense at a lot of levels IMHO

    Nigel

  6. John Dias

    Thanks for putting this together – SRIOV looks like the name of a Russian space project or a DARPA supercomputer thingy. 🙂
    Anyway, question for you.  So is multipathing improved somehow?  You talk about how so much is offloaded from the host, but typically MPIO is handled by a driver or shim or whatever loaded on the host OS.  Seems to me this would be an even more magical world if CNAs "knew" about each other and automatically added new CNAs to a clustered IO group without the host being any wiser – other than having more bandwidth available…  Am I making sense?

  7. Steven Ruby

    Nigel,

    With NPIV and VMs (which I haven’t had the chance to implement or even dig into yet), are you saying that each VM within an ESX cluster is assigned its own WWPN and the ESX server, fabric and actual VM know all about this WWPN end-to-end?

    good write up!

  8. Louis Gray (Emulex)

    Nigel,

    thank you for the continued detailed discussion around FCoE and CNAs specifically. For more detailed reading on Emulex’s plans for convergence, I thought I would let your readers know they can grab our second edition of the Convergenomics Guide from our Web site:

    http://www.emulex.com/solutions/convergence/convergence-solution-guide.html.

    It contains a great deal of input not just from Emulex but from many of our partners.

    Great talking with you earlier this week.

  9. Nigel Poulton

    @SEPATONJay –

    If you have an existing SAN, you keep it.  New hosts could come in and have CNAs installed and be connected up to an Edge/Access layer FCoE-capable switch like the Brocade 8000 (a top-of-rack switch).  These switches accept the connections from the CNAs on the "front" and on the "back" have native FC ports that connect into your existing FC SAN.  They also have GigE ports on the "back" to connect into your existing LAN.  So your new hosts connect into the Edge switch via CEE/FCoE, and the switch then performs the decapsulation of the FCoE frame, leaving a bare FC frame, and switches it on to the native FC SAN via the FC ports on the "back".

    With this approach you keep your existing FC SAN and LAN backbone and don’t really touch your existing production systems.

  10. Nigel Poulton

    @Louis Gray – The pleasure was mine. I tweeted after the call that I felt the folks on the call were my kind of people – passionate and know their stuff.  No probs leaving your link up – good article.

    @Steven Ruby – Hi Steven long time no talk.  Yes that is what I am saying.

    @John Dias – Hmmm, I’ll have to do a little reading around the multi-pathing.  Towards the end though, I think you’re dreaming 😉

  11. Charles Hood

    Great information Nigel, I’ve been following your FCoE series and I love the discussion it is generating, nice job!

    I’d encourage you to take a look at the Brocade 1010 and 1020 FCoE CNAs at:
    http://www.brocade.com/products-solutions/products/server-connectivity/product-details/1010-1020-cna/index.page

    Brocade is the only company, to my knowledge, that provides both FCoE CNAs as well as FCoE switches and Directors. One advantage of this approach is that it allows their DCFM software tool to provide end-to-end management. Take a look and let me know what you think!

  12. Jay Livens

    Nigel,

    A converged FCoE/FC switch makes sense and is a good solution to enable new systems to access existing FC networks using FCoE. That said, I believe that we are also in agreement that we are unlikely to see users rip and replace existing FC installations for FCoE. FCoE will augment FC and over time may replace it as companies replace/upgrade hardware.

    JL (@SEPATONJay)

  13. John Dias

    @John Dias – Hmmm, I’ll have to do a little reading around the multi-pathing.  Towards the end though, I think you’re dreaming
    Well, I did say it would be magical!

  14. Randy Bias

    Maybe it’s just me, but the need to rip and replace to add CNAs seems like a major downside. Your article seems to imply that new silicon is highly desirable, but that seems counter-intuitive to me given that so much of storage is being rapidly commoditized. Why save CPU cycles? In most environments there are more than enough to spare. Why not do FCoE in software? Is there a technological hurdle?

    This is something I am trying to understand better and I haven’t seen a satisfactory answer one way or another.

    Be great to get your thoughts.

    Thanks,

    –Randy

  15. stephen2615

    I assume that most large IT organisations are into blades and are ditching chassis systems where possible. It just seems to be the way things are going. I was trying to get HP to say something about CNAs, which are still only available as HBAs. HP seem rather unfussed by FCoE and (I get the feeling that they) believe FC is what really serious companies are using and will be for some time. They did say that they might be offering some sort of FCoE mezz card in the future. I also think that HP have invested so much effort in Virtual Connect (VC) that they hope people will stay with that for some time.
    VC (not to be confused with Brocade’s lame excuse to separate traffic in ISLs to prevent head-of-line blocking) is still something that I don’t think offers enough to make me want to use it. The only really interesting point is that it can allow (in a C Class enclosure) connection to two fabrics from a VC card. The four ports can generally be split into two fabrics, e.g. one to Brocade and one to Cisco, and there is load balancing available for NPIV connections.  I have two completely separate fabrics due to a merger and I have to eventually make them one.  Why did Brocade drop interop mode with Cisco?
    What really bothers me is that in an HP C Class enclosure, you can have 16 of the new 490c blades and those little beasts have some fabulous grunt. Why would you restrict a super system like that to such limited FC bandwidth? The VC is still only 4 x 4Gbps ports.
    I heard that IBM can offer FCoE to their blades. Anyway, for us to use CNAs, we can use an expansion slot in the C Class enclosure. One of our "people" really wants FCoE to happen, so when I asked what the real value of it was, it made for good umms and errs and other time wasters. We don’t have any Nexus equipment yet.  As I work for a government organisation, and we tend to be like banks where we take a time-will-tell view, FCoE probably won’t be honestly looked at for a couple of years. That’s about the time our aging 48000s will be up for replacement.
    The one thing that always sits in the back of my mind is that we (as SAN people) who have made our systems as robust as possible will somehow have to hand over some control of the FC part of the network to cowboy network admins.
    I look forward to HP bringing out a mezz card with FCoE abilities. I just wonder how it will be connecting to the network. Sure, we will be able to connect the blade to a Nexus 5000 that connects to the Director (Cisco or Brocade), but the ports on the 5000 are very limited. I believe that the FC director can be any brand. I am sure that by the time we get FCoE into our data centres, all these interesting things will be sorted out instead of appearing to be there for technology’s sake.

  16. Nigel Poulton

    @ Julie Herd Goodman

    It seems that I forgot about your question amid the influx of chat, both here on the blog site and also via Twitter….

    I will try and find time to dig into this deeper in a post in the future, but for now I hope it’s enough to say that it is accomplished via Ethernet Priorities.  I talked about this a little in my post FCoE: Lesson #1, and in a little more detail in the comments to that post.

    @ Randy Bias

    Re why save CPU cycles.  I will probably try and write a full-blown post on this in the future, but for now…….  I suppose it’s not necessary on every server, and not even necessary on every server running a hypervisor.  However, this section was aimed primarily at hypervisor installations, and with the ratio of VMs per core on the rise, anything you can do to release the main CPU for core hypervisor tasks must surely be a benefit.

    It also adds a string to your virtualisation bow, making virtualisation more practical for high-performance apps that might currently be limited by the performance of the I/O subsystem.  With modern CPUs offering so many VM-related assists, if we don’t see similar features on I/O adapters they will fall behind and become the bottleneck/weak point.

    As for software FCoE, there are software FCoE implementations available, but obviously they don’t bring any of the above-mentioned performance benefits….

  17. Al

    More NPIV Please 🙂
     
    I have seen so many manuals and tutorials and they were all like hieroglyphics to me…
     
    but just reading what you’ve written so far has improved my understanding a lot, all in a few words!!
