Programming The Network With Intel NEX Chief Nick McKeown

It would be very difficult indeed to find a better general manager for Intel’s newly constituted Network and Edge Group networking business than Nick McKeown, and Pat Gelsinger, the chief executive officer charged with turning around Intel’s foundries and its chip design business, is lucky that Intel was on an acquisitive bent in the wake of its rumored failed attempt to buy Mellanox and Nvidia’s successful purchase of Mellanox a few months later.

In the longest of runs, the kind of approach to networking that McKeown has championed all of his career is better suited to the compute engines that Intel is accustomed to building and the network engines that it wants to build. So all is well that ends better.

McKeown is a professor at Stanford University who helped create the P4 network programming language; he was also the co-founder of virtual networking company Nicira (which was acquired by VMware a decade ago for $1.26 billion and is the basis of its NSX product line) and the co-founder of programmable switch maker Barefoot Networks, which came out of stealth six years ago and was acquired by Intel three years ago for an undisclosed sum. (Almost certainly at least an order of magnitude smaller than what Nvidia paid for Mellanox, which was raking in money at the time.)

The Network and Edge Group at Intel is one of the few bright spots at the company these days, and McKeown sat down with The Next Platform to talk about what Intel is trying to accomplish with the NEX business and why the time is ripe for programmability in all aspects of the network.

Timothy Prickett Morgan: I don’t know whether to call it “N-E-X” or “Nex,” but what I do know is that it is the Network and Edge Group and that you are in charge of it.

Nick McKeown: I actually say both interchangeably, which doesn’t help. It’s all about network and edge, which is the important thing.

TPM: I am obviously interested in how the different parts of the NEX business are doing, and I am also keen on getting an update on datacenter networking in the wake of the Barefoot Networks acquisition. There is a lot going on with NIC and SmartNIC shortages at Nvidia right now, with 52-week supply chain delays, which has helped Intel with its own Ethernet NIC sales, particularly the “Columbiaville” Ethernet Controller 810 series.

But to start out, let’s just level set about what NEX is and what you are trying to do at Intel in the datacenter and at the edge with regard to networking and a healthy dose of Xeon compute.

Nick McKeown: The idea was originally to bring together three businesses. First, our cloud networking, which is our Tofino switches and our IPUs, our foundational NICs, and our silicon photonics, which were really targeted at big datacenters and a relatively small number of big customers. Then we brought in the stuff we sell to telco equipment manufacturers, which was originally network function virtualization and is now vRAN and OpenRAN, moving all of the base station business to software and away from fixed function devices.

TPM: The unifying theme for these is moving from fixed function appliances to a collection of software running on industry standard components.

Nick McKeown: We want to do this for a couple of reasons. It gives IT organizations more agility, and then they can combine all of that compute capacity to run the base stations with cloud-native controls.

TPM: There’s lots of Xeon D in there. . . .

Nick McKeown: Lots of Xeon D indeed, and that’s mostly about moving away from fixed function onto compute. We are now at a point where it’s very cost competitive, in terms of TCO and performance per watt, to take on DSP workloads.

And then the edge part is, you know, what we refer to as IoT – not quite the right name, but the kinds of things that are on customer premises – factory automation, retail outlets, digital signage, smart cities, that type of thing. And that is a broad, very fragmented base of 1,500 customers that are all over the world in all sorts of different applications.

So these are three very different businesses across the cloud, the telco, and the edge, but as you say, it’s all about moving folks off fixed function devices and programmable logic controllers and up into software so that they have more control over their destiny and more agility. In the industries that we’re looking at, it is just happening at different timescales, affecting different people at different times. But it’s the same motion time and again.

Now, if we go to our cloud networking business, we are talking about our Tofino switches, our IPUs, and our NICs.

Intel had a very strong position in the NIC business, and we still do have a very strong position in 1 Gb/sec and 10 Gb/sec NICs. Over the years, the assumption was that the NIC would be more and more absorbed into the server. And that turned out to not be the case, because the pressure for speeds and feeds was great, and companies needed new generations of NICs running at 25 Gb/sec, 100 Gb/sec, 200 Gb/sec, and 400 Gb/sec. That was before this notion of SmartNICs and IPUs, or whatever we want to call them. And, you know, Intel underinvested – it’s easy for me to say because I wasn’t here at the time – but Intel fell behind and has been in rapid catch-up mode. Our market share is increasing in the foundational and fixed function NICs. But as far as the customer is concerned, it’s something you plug into a server, and it has pretty fixed drivers. If you’re programming, you’re probably programming at the application level using DPDK, which is widely used in datacenter, cloud, and telco environments where they want that DPDK performance cutting right through to the wire.
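To make the DPDK point concrete, here is a minimal sketch – not Intel code, with most error handling omitted and port 0 plus the buffer pool sizing assumed for illustration – of the kernel-bypass pattern McKeown is describing: the application takes ownership of a NIC port, polls its receive queue directly in userspace, and echoes bursts of packets back out with no system calls in the data path.

```c
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define RX_RING 1024
#define TX_RING 1024
#define BURST   32

int main(int argc, char *argv[])
{
    /* Initialize the DPDK environment abstraction layer. */
    if (rte_eal_init(argc, argv) < 0)
        rte_exit(EXIT_FAILURE, "EAL init failed\n");

    /* Packet buffer pool shared by the RX and TX queues. */
    struct rte_mempool *pool = rte_pktmbuf_pool_create("POOL",
        8191, 250, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());

    /* Assume port 0 is a NIC bound to a DPDK poll-mode driver. */
    uint16_t port = 0;
    struct rte_eth_conf conf = {0};
    rte_eth_dev_configure(port, 1, 1, &conf);
    rte_eth_rx_queue_setup(port, 0, RX_RING,
            rte_eth_dev_socket_id(port), NULL, pool);
    rte_eth_tx_queue_setup(port, 0, TX_RING,
            rte_eth_dev_socket_id(port), NULL);
    rte_eth_dev_start(port);

    /* Busy-poll loop: pull a burst of packets, send it straight back. */
    for (;;) {
        struct rte_mbuf *bufs[BURST];
        uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST);
        uint16_t nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_rx);
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(bufs[i]); /* drop whatever TX could not take */
    }
    return 0;
}
```

This is essentially the skeleton of DPDK’s basic forwarding example: the point is that the forwarding behavior lives entirely in application code that the operator controls rather than in NIC firmware.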

You know well the story of Barefoot Networks, which set out to transform the high performance networking industry. The general assumption at the time was that you could only do these high performance switches in fixed function. So it started out as a bit of a hero experiment back in about 2010, but we showed the world you can actually make it programmable for the same power and performance. The Tofino 1 chip came out in almost exactly the same month as the Tomahawk 2 from Broadcom – both 6.5 Tb/sec, both the same area, both the same power. And because they were the same area, they had the same cost basis. So we demonstrated in one fell swoop that you could actually have programmability in the forwarding plane of a switch without compromise.

Over time, that programmability through P4 compilers and applications has proved very, very useful for those who want to combine and merge things like top of rack switches and spine switches with novel routing algorithms, or even to build appliances like gateways, firewalls, and load balancers into the switch just by programming it.

It takes a while for these changes to transform an industry because everyone, of course, remains skeptical going in. Can you really make something programmable that has the same power, performance, and area as a fixed function device? After you demonstrate that, it becomes a horse race because we know, ultimately, this is measured on speeds and feeds.

TPM: Are you still keeping pace with Broadcom’s “Tomahawk” switch ASICs? You are working on Tofino 3 and they are working on Tomahawk 5, I think. . . .

Nick McKeown: We are a bit behind. The acquisition of Barefoot by Intel caused a bit of a hiccup. We’re very committed to catching up and being at the speeds and feeds that will match the CPU roadmaps that the cloud customers have because, as you know, these things go together in the systems when they do their fleet upgrades.

So we’re very committed to that trajectory and we’ll just keep investing. Switches are about speeds and feeds at a reasonable power, and then about giving sufficient flexibility and programmability to the customers, who are mostly the clouds and hyperscalers and who are the most aggressive in this regard. They want to change it, they want to have their own special features like congestion control, different types of load balancing, and things like this. That’s their magic and their differentiating capability within their networks. To the extent that I know – and of course they’re not totally public about this – the different cloud service providers today have networks that all operate in slightly different ways. And they’re all slightly non-standard. And that’s great. I think it’s actually a good thing. They’ve differentiated and used the network as a competitive advantage in their environments.

And in some ways, the Infrastructure Processing Unit, or IPU, is really just a continuation of that story, which is line rate packet processing with programmatic control using P4 in the forwarding plane. So you can decide what extra congestion control algorithms or extra headers you want to put in there, and be able to do that in a way that doesn’t burden the CPU, as well as doing line rate encryption and line rate compression in a way that’s programmable and configurable. So you can do that as packets are going in and out of the CPU. And then you have this wonderful complex of CPUs in order to police the infrastructure code for the cloud. And as you know, this is what we developed with Google in our “Mount Evans” ASIC, our first IPU.

TPM: Were you working on the IPU idea at Barefoot Networks? It certainly looks like Barefoot could have designed it.

Nick McKeown: No, it was all Intel, with co-development between Intel and Google.

We have talked quite a lot about that 200 Gb/sec Mount Evans ASIC, and more recently we have talked about our 400 Gb/sec follow-on, which we call “Mount Morgan,” which is on the roadmap but not out yet. We’re working on our 800 Gb/sec IPU, which will follow on from that in a couple of years. We’re on a roughly two-year cadence, which seems to be the way that the industry is heading now with IPUs.

One more thing: I’ve seen things you’ve written before referring to DPUs instead of IPUs. And I realize this is kind of confusing. . . .

TPM: I’m not confused – I just don’t think we have the right word for it. Although someone did say to me recently that it was me who coined the term “data processing unit,” and I have no recollection of that and I suspect someone was just trying to pin the blame.

Nick McKeown: Actually, I want to put it to you this way, and it is not just marketing speak. The IPU is actually designed with a different goal in mind than what is called a DPU.

There is this progression that started from NICs and went up to NICs with more functions, like, I don’t know, TCP offload and things like this, that were originally called SmartNICs. Then they had more inline processing that you could add on through cores that were sitting there, which the packets would flow through as they were passing through.

We actually approached this problem in our co-development with Google in a similar way, I would imagine, to how Amazon Web Services approached it with their own “Nitro” equivalent, which is that the point is not about putting more compute in the path. It’s about having a safe, secure place to run the infrastructure code so that the tenant code doesn’t bring down the infrastructure.

One aspect of this IPU approach is using the PCI-Express bus as a DMZ to protect the infrastructure against the tenant. No one knows today how to do that within a bunch of cores on a CPU – to have a really secure separation between the tenants and the infrastructure. And the second thing is that the IPU sits closer to the wire, so the infrastructure can now have its own superfast communication. And that means that the infrastructure itself can exploit microservices and be super lightweight and super fast without having to cross over into tenant land. That separation has proved to be a very significant part of it. And so some people refer to the IPU as a microservices engine, which is not far off from the truth.

TPM: Back in 1978 with the System/38 and with the AS/400 in 1988, and with the System/360 before that, IBM was calling them intelligent I/O processors. . . . Something you do to offload I/O processing from a very expensive CPU with one core and a limited clock speed.

Nick McKeown: [Laughter] But seriously, the reason for picking the term wasn’t just that it started with “I” like Intel, but to demonstrate that this is actually about the infrastructure in the cloud service providers, who were giving us a different set of requirements from that bump-in-the-wire approach – bigger scale, bigger state, with programmable line rate processing. And I believe that our IPU is the only one that will do programmable line rate processing at 200 Gb/sec, because that’s not negotiable for a cloud service provider. You can’t put all of that infrastructure in place and then run it at half speed.

TPM: There is no question that Mount Evans looks like a good design, and it has the right features and the right first customer in Google. And it wasn’t obvious when it was revealed that Intel would be able to sell it to others, but we cleared that up.

Nick McKeown: It is sampling with a number of big datacenter customers now because that’s what it’s really designed for.

TPM: What’s the attach rate going to be for DPUs and IPUs, something with more compute than a SmartNIC? It seems to me that any multitenant set of infrastructure needs one for this separation of client application and infrastructure management, but there are obvious offload and virtualization benefits even if the infrastructure is only being used by one organization. There will be different scales of offload capabilities and compute capacity, of course, but I would say every server should have one.

Nick McKeown: We are going in a direction where the forwarding behavior will for sure be programmable, because you need the agility for that to evolve. And it comes with no performance or cost penalty for that programmability, just like in the switches. So that, to me, means that it is just inevitable.

The question is: How much compute do you need, whether it’s for running the infrastructure in the cloud, for microservices acceleration in an enterprise datacenter, for storage offload, or for acceleration of other things, say, in a telco environment or out at the edge, where you’re doing things like building a network mesh across a number of edge servers in different locations? All of those examples require compute and benefit from compute tightly coupled with a programmable forwarding plane. The difference will be in the forwarding data rates and in how much compute. Some of them may have 32 cores, some of them may have four cores, some of them might have zero cores, depending on the environment, alongside their programmable forwarding plane. But it will all be on this continuum of the same class of compute, the same architecture.

And if you squint and look at the switch, the switch is just lots of the same programmable forwarding planes. It’s the same thing. And so you can imagine the switch on its own as lots of that programmable forwarding. Now, you may have heard me say this before, but it’s something that really excites me, so I am going to geek out a little bit.

You’ve got a Xeon server, you’ve got the IPU connected to it, and you’ve got a series of switches. And then you’ve got another IPU at the other end and another Xeon server. Just think about that whole pipeline, which now becomes programmable in its behavior. So if I want to turn that pipeline into a congestion control algorithm that I came up with, which suits me and my customers better than anything that has been deployed in fixed function hardware, I can now program it to do that. If I want to do something – installing firewalls, gateways, encapsulations, network virtualization, whatever – in that pipeline, I can program it, whether I’m doing it in the kernel with eBPF or in userspace with DPDK. I can write a description at a high level of what that entire pipeline is, and I don’t really care which parts go into the kernel or into user space, or into the IPU or into the switch, because it can actually be partitioned and compiled down into that entire pipeline now that we’ve got it all to be programmable in the same way.
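To ground the kernel end of that pipeline, here is a tiny sketch of the eBPF piece McKeown mentions: an XDP program, compiled with clang and attached at the NIC driver level, that implements a one-rule firewall. The blocked port and the policy are made-up examples, and the same rule could in principle be expressed instead as a DPDK application, an IPU program, or a P4 switch table, which is exactly the partitioning he is imagining.

```c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int drop_blocked_port(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    /* Every pointer walk is bounds-checked so the kernel verifier
     * can prove the program safe before it touches live traffic. */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
        return XDP_PASS;

    struct tcphdr *tcp = (void *)ip + ip->ihl * 4;
    if ((void *)(tcp + 1) > data_end)
        return XDP_PASS;

    /* Hypothetical policy: silently drop traffic to TCP port 9999. */
    if (tcp->dest == bpf_htons(9999))
        return XDP_DROP;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```

Attaching it is a one-liner with iproute2 – for example, `ip link set dev eth0 xdp obj firewall.o sec xdp`, with the device and object names assumed – and the rule then runs before the kernel stack ever sees the packet.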

TPM: Wait a second, do you actually have a compiler that can do this, or has it yet to be invented?

Nick McKeown: So I’m dreaming here. I think that this is an inevitability; it will happen with or without my help. In fact, I think at this point, we’re already down that path. But this is a path that we’re very committed to enabling through an open ecosystem. Of course, I want those elements to all be Intel. But we still want an ecosystem that allows you to do this in a vendor-agnostic way.

This is the way that I think IPDK will evolve over time: first of all, that programmable path, and then, in terms of the ability to have cores and CPUs right there to do microservices, infrastructure, and other types of things like that, IPDK will enable you to put those together as well. I think building an open ecosystem that is vendor agnostic is essential for our industry. Otherwise, we’re just shooting ourselves in the foot. My passion is to try to make that open ecosystem. Intel is obviously very committed to the open source part of it; that’s largely why I’m here – to help drive it in that direction, to open it up. All of these open ecosystems do create competition, and that’s fine. Bring it on. If we lose out in that race, it’s our own fault. I will make it clear what our role is in that ecosystem: Keep it open, keep it big, and then run like crazy to make sure that we’re providing the best compiler targets, essentially, for those different devices. And I think that’s how the computer industry is and that’s how the networking industry needs to be, instead of hiding behind protocols and closed doors. You just unleash the power of 25 million developers around the world to make magic out of it that we could never think of.

TPM: This may seem like an obvious question, but I am going to ask it anyway. The first IP routers were programmed on DEC PDP-11s at Bolt Beranek and Newman in the mid-1970s. How did networking get closed off in the first place? Was it just too arcane and difficult, or were people just too busy making operating systems, systems software, databases, and applications?

Nick McKeown: I think there’s actually a simpler reason for that. If you’re building a CPU, all the way back to the 8088 or earlier, you have to tell people how to program it. So you have to open it up. And you have to have visibility into what the instruction set is, which instructions you can do and which ones you can’t, and things like that. You have to do that for people to produce a compiler or to be able to program it. So the CPU industry has always taken great pride in that race between CPUs with different architectures and compilers.

Another part of the explanation is that because we needed interoperability for networking, we allowed standards to get in front of the design process. And then everybody could say, here are the standards that I support – check off IPv4, check off IPv6, blah, blah, blah – all the way down the list. You don’t have to tell them how it is programmed, and everybody thought that they had something magical. But actually they were all kind of doing variants of the same thing, and they kept it secret. And the standardization of the external interface, ironically, created secrecy and privacy inside that can breed innovation through differentiation, because you say – Aha! I can do it a better way than someone else. And it did work for a while. And then we had strong, dominant players that didn’t need to innovate anymore because they just checked off items, and switching became fixed function and vendors just carried the features forward.

We really wanted to disrupt that with Barefoot, allowing the user to decide what features a switch has and allowing them to do things that the chip designer never thought of. We are moving away from designing protocols by committee and spending $100 million to put them into silicon five years later. By contrast, we create a compiler target across the programmable network devices and you decide what features they run.
