In the networking world, NFV is one of the biggest disruptive technologies on the horizon. Vendors are racing their products to market and sharpening their ecosystems, and carriers are testing it out and sculpting new services or re-sculpting old ones. But there are issues left to solve before NFV can fulfill its promises. With us today to talk about a few of them is Andy Huckridge, Director of Service Provider Solutions & SME at Gigamon.
TR: First, what role does Gigamon play in the industry? What problem do you solve?
AH: Gigamon is a security & monitoring company; we provide visibility into traffic. We allow tool vendors to connect many and various types of analytic tools (for example sniffers, APM, and NPM tools) to the network to get an idea of what is going on and to understand how the network is working. The problem that we solve is the one-to-one problem of connecting various tools to a network. Every time a tool is connected to a network, more hands touch the network, and there can be problems. So instead we have a visibility fabric that connects to the network, and the various tools connect to us. We replicate traffic, filter traffic, and solve a lot of the big data headache.
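As a rough illustration of the fabric idea described above, the following Python sketch shows one tap point replicating traffic to several tools, each with its own filter. It is not Gigamon's implementation; the names `Packet`, `Tool`, `VisibilityFabric`, `connect_tool`, and `ingest` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified packet record."""
    src: str
    dst: str
    protocol: str


@dataclass
class Tool:
    """An analytic tool plus the filter that selects the traffic it wants."""
    name: str
    wants: Callable[[Packet], bool]
    received: List[Packet] = field(default_factory=list)


class VisibilityFabric:
    """Single connection to the network; the tools connect to the fabric."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def connect_tool(self, name: str, wants: Callable[[Packet], bool]) -> Tool:
        tool = Tool(name=name, wants=wants)
        self._tools[name] = tool
        return tool

    def ingest(self, packet: Packet) -> int:
        """Replicate the packet to every tool whose filter matches.

        Returns the number of tools that received a copy.
        """
        delivered = 0
        for tool in self._tools.values():
            if tool.wants(packet):
                tool.received.append(packet)
                delivered += 1
        return delivered


fabric = VisibilityFabric()
fabric.connect_tool("sniffer", lambda p: True)  # sees everything
fabric.connect_tool("sip-monitor", lambda p: p.protocol == "SIP")  # filtered
fabric.ingest(Packet("10.0.0.1", "10.0.0.2", "SIP"))
```

The point of the design is that adding or removing a tool changes only the fabric's tool list; nothing touches the network connection itself.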
TR: Have you been watching the industry as it prepares NFV for prime time? What have you been seeing?
AH: We've been involved in monitoring NFV, including some lab trials of NFV. What we are seeing is that, very much like the early days of SIP, MGCP, and many other technologies, a lot of this stuff is still shaking out. Gigamon believes in NFV; we're fully behind it and helping carriers to roll it out. But there are some unforeseen issues that the industry is not talking about: how do you get interoperability, how do you work out what is going wrong if something does go wrong, and how do you get visibility of traffic, of processes, of the VNFs themselves, in environments which don't have monitoring built in?
TR: Isn’t that being built in already?
AH: It's well known in our industry that network equipment vendors will obfuscate or hide certain interfaces so that the carrier has to buy all the pieces of equipment from the same vendor. How do you get access to these interfaces to work out where the interoperability issues are? And if you can't, how do you even know the VNFs are working if services are chained together, especially if the VNFs are from different vendors? And how do you study the impacts of running them in the different virtualization environments?
TR: How important is it to answer these questions now? Won’t it all work out in the end?
AH: As I said, we're strongly behind NFV and SDN, but you can't roll out this type of technology unless it can be de-risked. And this is a brand new technology in the same way that SIP was once upon a time. The best part of twenty years later, there is still work being done on SIP from the interoperability perspective, which is why the session border controller industry exists to this day. Do we want to go down the road of fostering another new industry along the same lines to guarantee interoperability between VNFs? That's of course a ridiculous statement, right? But we believe that unless you have some basis for monitoring, some ability to look into these obfuscated interfaces, some ability to look at the performance of different VNFs across different virtualization environments, then the operator is going to be at a bit of a loss.
TR: What might go wrong?
AH: When you actually go and deploy this technology at 6pm on a Friday evening and there's a big game on the weekend, and there's been a new feature or service upgrade rolled out... Kick-off takes place, and all of a sudden everyone interacts with this app, there's a bunch of traffic, the delay or jitter goes out of tolerance, the app falls over, and there's something wrong with your service. Customer service centers light up. How do you debug that? It's not the end of the world; there are ways and means by which we can do this, but you've got to have that visibility. Unless you have the tools and the network visibility, this technology is not deployable for prime time.
Invariably, everyone rushes to roll out these new technologies to reduce capex, opex, etc., but what ends up happening is that unless all of the visibility is built in from scratch, we are all like lemmings rushing towards the edge of the cliff. We can look back to 2008 and BT's 21st Century Network deployment in the UK to see what happened there with the first softswitch rollout that fell over on a live deployment: no dial tone for hundreds of thousands of subscribers. From a selfishly driven agenda, what we want is for the carriers to be successful. If they are, then all the technology has been tested properly and rolled out correctly. But how do they do that? They need to have the tools, the capabilities to find out what went wrong and why, well before the big game. We need to be able to characterize how these VNFs work if they are service chained together in the same virtual environment.
TR: Why do we need to characterize VNFs?
AH: We need to know whether one VNF can be mixed with another. For instance, can you have an MME, a Diameter routing agent, and a virtual IMS all on the same server, or not? The answer that is coming back from the industry so far is "no, you can't"; they have to be held in separate clusters because they interfere with one another. One tends to be bursty at this particular moment, another tends to be bursty at another moment in time, depending on what's going on in the network. You can have perfect interoperability, but unless the VNFs are characterized you can't deploy, because you don't know how one interacts with or affects another.
TR: Why is it that such environments are vulnerable in ways the traditional solutions were not?
AH: In the old days we had dedicated hardware that in general could do 100% of what the firmware would allow it to do. But now we've got general-purpose off-the-shelf servers for everything, and the software can often do more than what the hardware can handle. If you take three devices, a p-gateway, an s-gateway, and an MME, what you have is three devices that do different things at different times. So one requires lots of memory, one requires lots of processing, one requires a lot of NIC processing. If you try to put all of that onto one processor, will it work? Yes and no. In the case where a brand new service comes online and you have an out-of-character swarm that places a greater burden than can be foreseen on any of the three, the software is not constrained by the hardware and may not actually know what the capabilities of the hardware are.
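The co-location question raised here can be made concrete with a small Python sketch: given a per-time-step demand profile for each VNF, sum the demands on a shared host and flag the moments where the combined load exceeds what the hardware offers. This is only an illustration under stated assumptions; the function name `find_overloads`, the resource keys, and all of the numbers are invented for the example and do not come from any measured deployment.

```python
from typing import Dict, List, Tuple

# resource name -> demand at each time step (all values are made up)
Profile = Dict[str, List[int]]


def find_overloads(
    profiles: Dict[str, Profile], capacity: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (resource, time_step, combined_demand) wherever the summed
    demand of all co-located VNFs exceeds the host's capacity."""
    overloads = []
    for resource, limit in capacity.items():
        series = [p.get(resource, []) for p in profiles.values()]
        steps = max((len(s) for s in series), default=0)
        for t in range(steps):
            total = sum(s[t] for s in series if t < len(s))
            if total > limit:
                overloads.append((resource, t, total))
    return overloads


# Hypothetical host and VNF profiles; step 2 models an unforeseen swarm.
host = {"cpu_cores": 16, "mem_gb": 64, "nic_gbps": 10}
vnfs = {
    "mme": {"cpu_cores": [4, 4, 10], "mem_gb": [8, 8, 8], "nic_gbps": [1, 1, 2]},
    "s-gateway": {"cpu_cores": [2, 2, 4], "mem_gb": [16, 16, 40], "nic_gbps": [3, 3, 4]},
    "p-gateway": {"cpu_cores": [4, 4, 4], "mem_gb": [8, 8, 24], "nic_gbps": [4, 4, 6]},
}

for resource, step, demand in find_overloads(vnfs, host):
    print(f"step {step}: {resource} demand {demand} exceeds {host[resource]}")
```

With these invented figures the three VNFs fit comfortably at steps 0 and 1 and overrun every resource at step 2, which is the kind of result that only shows up once each VNF has been characterized individually.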
TR: How can different virtualization environments affect VNFs?
AH: Suppose you have a network that is expanding, e.g. it's the busy hour for compute on demand and your cloud has taken over another set of virtual machines to spawn another p-gateway to do the processing. What's the impact of that server or system or memory increase actually spooling up? Is there an extra delay or latency with the load balancing across a new server being added in? On the other side, if we're coming out of the busy hour and we need fewer resources, then what issues surround spinning down servers and putting that traffic back on the main servers? None of this is being looked at yet. When you put it all together from a telco perspective there are multiple questions here, and nobody has the answers.
TR: So what actual action are you calling for?
AH: We want to see the carriers push the vendors to put the necessary hooks in. All monitoring and test equipment vendors struggle to push the network equipment vendors to do things like this. We try to be advocates for the industry in general, so what we are trying to do is make everyone aware that there are issues and how to solve those issues. As soon as the carriers understand where the problems are, then it will happen naturally. So what we're trying to do is get that ball rolling now such that the deployments will go more smoothly. No one wants to see a failed technology rollout!
TR: What makes this time different?
AH: This is the first time that technology has been deployed where there is a lack of visibility de facto built into the actual technology itself. When the operator deploys these VNFs and virtual environments, they are actually giving up a lot of the capabilities they previously had to work out why things went wrong and where on the network issues happened.
TR: Do you think it will actually happen?
AH: It's a multi-year process, but one way or another it will have to. All it takes is one good outage, right?
TR: Thank you for talking with Telecom Ramblings!