As we edge our way into the 5G era, the entire industry is trying to find ways to better scale and automate virtually every component. One company that has taken a big swing at its own piece of the puzzle is Montreal-based Kaloom. With us today to talk about its technology and its approach to the problems ahead is Kaloom CEO Laurent Marchand.
TR: How did Kaloom get started? What problem were you trying to solve?
LM: I’ve been in the networking industry my entire career: through the second, third, and fourth generations of mobile networks, and now the incoming fifth. It became clear to me that most incumbent vendors are still aiming to use the same technology platforms for 5G that they are currently using for 4G. From my perspective, that was the wrong thing to do. A 5G-enabled HD camera for a smart city application will generate an order of magnitude more data than a traditional smartphone but will only generate $1 or $2 of revenue per month. Without a new approach, it would be simply impossible to run emerging applications like this in a commercially viable fashion. I was convinced that a new approach was needed, one that introduced radical OpEx and CapEx reductions.
TR: How did you translate that insight into a specific direction for Kaloom?
LM: The first thing that was obvious to us is that many of the existing networking solutions lacked the proper level of automation. They were far too labor-intensive and error-prone. Additionally, people had to make a choice between flexibility and performance. For flexibility, they could pick an x86-based environment where they could program whatever network function they were looking for. For performance, they had to use fixed-function Ethernet ASICs like Broadcom's Trident or Tomahawk chipsets. The inability to have a solution which could simultaneously provide the full flexibility of a programmable environment while carrying terabits of traffic was a serious deficiency. Both are absolutely necessary. We were fortunate that just as we started Kaloom, a new generation of chipsets emerged that could provide throughput from 3.2 to 12.8 terabits per second while being fully programmable in an industry language called P4, and we were able to embrace that technology from the start.
The next problem we knew we had to address was the need to support network slicing for 5G. From a data center standpoint, network slicing means taking your physical infrastructure and slicing it into multiple independent virtual data centers. Each one can then have its own virtual fabric, a completely isolated domain down to layer 2. That allows different tenants to share a common pool of hardware resources while ensuring that there will be no security breaches between those users. Physical resources can be added or removed dynamically.
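As an illustration only (this is a hypothetical sketch, not Kaloom's code), layer-2 slice isolation can be expressed in P4_16, the language discussed later in this interview. Here the 12-bit VLAN tag stands in for a slice identifier, and any frame that does not match a (slice, destination) entry provisioned by the controller is dropped, so traffic can never cross from one tenant's virtual fabric into another's. The sketch targets the open-source v1model architecture:

```p4
// Hypothetical sketch of layer-2 slice isolation (not Kaloom's
// implementation), written for the open-source v1model architecture.
#include <core.p4>
#include <v1model.p4>

const bit<16> ETHERTYPE_VLAN = 0x8100;

header ethernet_t {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}

header vlan_t {
    bit<3>  pcp;
    bit<1>  dei;
    bit<12> vid;       // repurposed here as the slice identifier
    bit<16> etherType;
}

struct headers_t {
    ethernet_t ethernet;
    vlan_t     vlan;
}

struct metadata_t {}

parser SliceParser(packet_in pkt, out headers_t hdr,
                   inout metadata_t meta,
                   inout standard_metadata_t std_meta) {
    state start {
        pkt.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            ETHERTYPE_VLAN: parse_vlan;
            default:        accept;
        }
    }
    state parse_vlan {
        pkt.extract(hdr.vlan);
        transition accept;
    }
}

control SliceIngress(inout headers_t hdr, inout metadata_t meta,
                     inout standard_metadata_t std_meta) {
    action drop() { mark_to_drop(std_meta); }
    action forward(bit<9> port) { std_meta.egress_spec = port; }

    // Only (slice, destination MAC) pairs provisioned by the
    // controller are forwarded; everything else is dropped, so one
    // tenant's frames can never leak into another tenant's slice.
    table slice_l2_fwd {
        key = {
            hdr.vlan.vid         : exact;   // slice ID
            hdr.ethernet.dstAddr : exact;
        }
        actions = { forward; drop; }
        default_action = drop();
    }

    apply {
        if (hdr.vlan.isValid()) {
            slice_l2_fwd.apply();
        } else {
            drop();   // untagged traffic belongs to no slice
        }
    }
}

control SliceEgress(inout headers_t hdr, inout metadata_t meta,
                    inout standard_metadata_t std_meta) { apply {} }
control NoChecksum(inout headers_t hdr, inout metadata_t meta) { apply {} }
control SliceDeparser(packet_out pkt, in headers_t hdr) {
    apply {
        pkt.emit(hdr.ethernet);
        pkt.emit(hdr.vlan);
    }
}

V1Switch(SliceParser(), NoChecksum(), SliceIngress(), SliceEgress(),
         NoChecksum(), SliceDeparser()) main;
```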
Beyond that, most people believe that low latency is a key objective for 5G. Many emerging applications cannot run in a standard public cloud due to speed-of-light and other restrictions, which has contributed to the enthusiasm for edge computing: we have to bring the compute and storage closer to the actual device and user. But we noticed that at the time everybody was focusing on port-to-port latency, shaving off nanoseconds between these components. Yet once a packet reaches the server, going from the NIC to kernel memory to the hypervisor to user-space memory and so on can take a hundred microseconds before it reaches the application. We needed a solution that took an end-to-end perspective on latency, because that is what matters in the end.
And finally, without radically more energy-efficient solutions, an explosion in 5G energy usage will have a major negative impact on the environment. We needed to do far more with each watt of electricity. In a 5G network, all devices are connected through an entity called the UPF, the User Plane Function. All the existing vendors deliver the 5G UPF on x86 servers. Their top-of-the-line, very expensive dual-socket servers can probably reach 150Gbps of throughput while consuming 850W each. For us, on a P4 switch using the Intel Tofino, which costs half the price of a server and consumes 750W, we can do about 1.5Tbps with 25 times lower latency.
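A back-of-the-envelope check of those figures, taking the quoted numbers at face value, puts the energy-efficiency gap at roughly an order of magnitude:

\[
\frac{150\ \text{Gbps}}{850\ \text{W}} \approx 0.18\ \text{Gbps/W} \quad \text{(x86 server)} \qquad
\frac{1500\ \text{Gbps}}{750\ \text{W}} = 2.0\ \text{Gbps/W} \quad \text{(Tofino switch)}
\]

so the P4 switch delivers about \(2.0 / 0.18 \approx 11\times\) more throughput per watt, on top of the lower unit cost and latency.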
TR: How did you put those concepts into practice at the software level?
LM: We were fortunate to be able to start with a clean slate. So, we thought about what the nature of the bulk of the applications deployed over 5G infrastructure might be. To us it was obvious that cloud-native container applications would be the way forward. It is far more cost-effective to use containers than virtual machines because you can have more containers supporting more devices simultaneously. But the existing container frameworks were not designed for mission-critical or high-availability applications, so we had to evolve the environment to provide those characteristics. We selected the next generation of white boxes based on Xeon processors, Intel Tofino chips, FPGAs, and HBM2 memory. We built a fully distributed, scalable control plane in Go, the language from Google. It is deployed as containers over a Kubernetes clustering framework; we partner with Red Hat to use their OpenShift container orchestration platform. The data plane is written in the P4 programming language.
TR: What makes P4 ideal for this application?
LM: P4 is an industry-standard, domain-specific language for packet processing. It is an abstraction of the underlying chipset. A P4 environment has a frontend compiler and a backend compiler, with each chipset having its own backend compiler. So, you can write one application and compile it into an executable that runs on any of these chipsets, although the underlying chipset determines the characteristics of the application, such as throughput. If a new chipset appears, it is straightforward to bring your P4 application over to it. For us, P4 is how we want to build the virtual network functions for our control and data planes.
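As a sketch of what "one source, many backends" looks like in practice, here is a minimal, hypothetical P4_16 program written against the open-source v1model architecture (again, not Kaloom's code). Nothing in it names a particular ASIC; the frontend compiler checks the same source regardless of target, and the backend you choose, such as the bmv2 software switch for development or a Tofino-class backend for production hardware, determines the throughput:

```p4
// Minimal, hypothetical P4_16 program: the same source can be handed
// to different backend compilers; only the resulting performance differs.
#include <core.p4>
#include <v1model.p4>

struct headers_t  {}   // nothing parsed: pure pass-through
struct metadata_t {}

parser PassParser(packet_in pkt, out headers_t hdr,
                  inout metadata_t meta,
                  inout standard_metadata_t std_meta) {
    state start { transition accept; }
}

control PassVerify(inout headers_t hdr, inout metadata_t meta) { apply {} }

control PassIngress(inout headers_t hdr, inout metadata_t meta,
                    inout standard_metadata_t std_meta) {
    apply {
        // Reflect each packet back out the port it arrived on --
        // just enough logic to make the pipeline observable.
        std_meta.egress_spec = std_meta.ingress_port;
    }
}

control PassEgress(inout headers_t hdr, inout metadata_t meta,
                   inout standard_metadata_t std_meta) { apply {} }
control PassCompute(inout headers_t hdr, inout metadata_t meta) { apply {} }
control PassDeparser(packet_out pkt, in headers_t hdr) { apply {} }

V1Switch(PassParser(), PassVerify(), PassIngress(), PassEgress(),
         PassCompute(), PassDeparser()) main;
```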
TR: At what stage of development is all of this, and what is next in the pipeline?
LM: Our Kaloom Software Defined Fabric and Cloud Edge Fabric are generally available for commercial production and ready to use. Our Unified Edge solution uses the Cloud Edge Fabric to provide an additional level of optimization for edge data centers. Suppose a telco wants to introduce an edge data center in a legacy central office, which might be packed with legacy equipment like Frame Relay, ATM, and so on. There is very little space and very little power available. Suppose you have room for one rack with 18 edge servers. To interconnect the edge servers, you will need at least two switches. If you are using Cisco Nexus switches, for example, you will have to deploy a Cisco fabric controller to configure and manage the fabric, which means more dedicated servers. Then you may need 500Gbps of throughput; if one top-of-the-line x86 server can manage only 150Gbps, you will need three or more just for the UPF functionality. Then if you want to deploy OpenShift applications, you will need three more for the Kubernetes masters, and so on. So, nine of your 18 servers would be consumed by infrastructure rather than revenue-generating customer applications. When you have a limited number of servers, you cannot afford to use them in this fashion.

The Unified Edge solution takes the fabric controller, the UPF control and data plane functions, and the Kubernetes master part of OpenShift, and consolidates all of that functionality onto the switches themselves. This allows customers to devote all 18 servers to revenue-generating applications. This is really something that has raised the level of interest from many of our customers and partners.
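Tallying that rack budget makes the overhead concrete. The three-server figure for the fabric controller is an assumption consistent with the nine-server total; the interview only says it needs "more dedicated servers", while the UPF and Kubernetes counts are quoted:

\[
\underbrace{3}_{\text{fabric controller}}
\;+\; \underbrace{3}_{\text{UPF}\ (\approx 500/150\ \text{Gbps per server})}
\;+\; \underbrace{3}_{\text{Kubernetes masters}}
\;=\; 9 \ \text{of}\ 18\ \text{servers}
\]

In other words, half the rack is consumed before a single customer workload is deployed, and it is exactly this half that Unified Edge moves onto the switches.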
TR: What types of deployments are you seeing in the market right now for your technology, and how are you taking them to market?
LM: The use cases drawing the most interest right now are related to private 5G networks and access edge clouds. These are really the two that are getting the largest amount of traction. Typically, we go to market through our partners. That is not to say that we will not in certain cases have customers that decide to buy directly from Kaloom, but the bulk of our sales will go through partners such as IBM, Lenovo, Altran, Wipro, Red Fig Networks, NS Solutions, and others.
TR: You recently announced a partnership with IBM. Can you tell us a bit about that relationship?
LM: Kaloom is quite thrilled to be part of the IBM Telco Cloud ecosystem. We have tried to send a strong message that it takes a village to build the edge, because there are many, many components; it is not possible for a single company to bring all the pieces to market. As part of the overall IBM solution, Kaloom brings what we believe to be a critical aspect of edge networking. We believe our Unified Edge solution offers best-in-class technology for that kind of environment, and we have been working closely with Red Hat for many years, and more recently with IBM since, as everybody knows, it acquired Red Hat.
TR: What hurdles still await this industry as it moves toward widespread 5G adoption?
LM: As with any newer technology, you can't say whether you are at the leading edge of the industry or the bleeding edge. It requires significantly more testing and debugging than an older, more mature environment, and you need to ensure that your systems are designed so that those tests are fully automated. It takes a lot of effort to create the proper environment for that in 5G because the technology is not 100% mature and is evolving all the time. Look at 3GPP Release 15: people have barely started to deploy that version of the specification, and we are about to move to Release 16 and then Release 17. When everything is changing all the time, it is a more challenging environment. Even as I say that, in certain areas we see some simplification happening. Today, telcos have to deal with different environments for fixed and mobile access. We can expect to see a single infrastructure for fixed access, Wi-Fi 6, and 5G, whether over licensed or unlicensed spectrum. This trend will be an important one because it will simplify the life of network operators. Of course, in the near term the pandemic has introduced many additional challenges for all of us as well.
TR: How have you weathered the pandemic?
LM: We anticipated it would come very rapidly, so we were well prepared as a company to work remotely. Very early on we added multiple redundant internet connections to our infrastructure. Our entire business runs on our technology, and everyone is connected back to our own data centers. Everything has been working seamlessly, and in many cases we have improved our productivity. Of course, on the customer side when you’re in the middle of a trial and the government shuts down the place, everything gets delayed. I hope that in the near future this will be behind us.
TR: Thank you for talking with Telecom Ramblings!