How SDN is Redefining the Concept of the Self-Healing Network

March 24th, 2016 · 6 Comments

This Industry Viewpoint was authored by Paul Savill, SVP Global Core Product Management at Level 3 Communications

Here’s a story of our times: online ticket sales for a certain movie release crash, multiple ticketing sites are brought down for hours, and angry customers have to resort to going to the box office to reserve their seats for opening night.

What if it all could have been prevented? What if a network could anticipate and address issues like a massive influx of traffic as it happened, adapt and expand in real time to handle the traffic and then, when the surge is over, return to routine operations?

Among the benefits of SDN that are often touted are operational efficiency, improved network performance and greater control over the network. However, SDN offers a much greater power – it can enable the network to self-correct or self-heal.

In telecom, “self-healing” used to refer to protected network configurations for SONET and DWDM that allowed for traffic to re-route in the event of a fiber cut. But with SDN, the concept of a self-healing network takes on an entirely new meaning, one that includes adapting to real-time conditions and network demands.

SDN gives a granular level of visibility into network performance and utilization. When coupled with the ability to adjust capacity in real-time and without service disruption, this insight creates the right environment for the network to respond to unforeseen events that keep network admins up all night, events like a massive influx of traffic due to a movie premiere, a marketing promotion, or a post-holiday online clearance sale.

Historically, network performance reports from telcos were limited to end-to-end averages that, while marginally useful for future planning, weren’t very informative in the moment. We all know that when a network is experiencing degradation, by the time someone has decided how to address it, it’s already too late. SDN-based reporting, on the other hand, reveals detail by network segment and by class of service, providing actionable insights into latency, packet delivery, jitter and utilization.

With this holistic view, enterprises can not only identify where a network performance issue is originating and address it in real time; they can also set triggers to remedy future issues without any additional intervention. Back to our example: suppose you’re planning for that new release and you want to prevent network utilization from going above 70 percent. No problem. Set the threshold and the network can be programmed to dial up two or even three times the standard capacity. When the movie premiere is over, the network detects the change in utilization and decreases the bandwidth back to the base level.
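As a rough illustration of that kind of trigger, here is a minimal Python sketch. It is not Level 3's implementation: the function name `target_multiplier`, the 100 Mbps base circuit and the traffic samples are all hypothetical, and it simply assumes capacity can be dialed up in whole multiples of the base bandwidth, to a maximum of three times.

```python
def target_multiplier(traffic_mbps: float, base_mbps: float,
                      threshold: float = 0.70, max_multiplier: int = 3) -> int:
    """Pick the smallest capacity multiplier that keeps utilization at or
    below the threshold; fall back to the maximum if none is enough."""
    for multiplier in range(1, max_multiplier + 1):
        utilization = traffic_mbps / (base_mbps * multiplier)
        if utilization <= threshold:
            return multiplier
    return max_multiplier


# Hypothetical traffic samples (Mbps) on an assumed 100 Mbps base circuit:
# a surge builds, peaks, and then subsides.
samples = [40, 65, 90, 150, 260, 120, 50]
plan = [target_multiplier(t, base_mbps=100) for t in samples]
print(plan)  # [1, 1, 2, 3, 3, 2, 1]
```

Because the multiplier is recomputed from each measurement, the same rule handles both the dial-up during the surge and the return to the base level afterward. A production trigger would likely add some damping so that capacity does not flap when traffic hovers near the threshold.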

Forget the five nines of the past – in today’s on-demand economy, enterprises face the seemingly impossible task of meeting accelerated demands for IT infrastructure that is always up. Thanks to SDN and the self-healing network, it can be done.

Join the Discussion!

6 Comments So Far


  • Anonymous says:

    Agree that SDN holds promise for handling bandwidth on demand situations but a fundamental question that service providers must address is their willingness to spend capex dollars to overbuild their networks with unused 10G/100G channels. To date, most service providers have adopted a success-based capex model for adding capacity when orders are submitted.

    • psavill says:

      I can’t speak for other carriers, but the way we architect our network involves diverse 10G backbone trunks from our Provider Edge (PE) switches, using traffic engineering models that consider committed bandwidth and actual usage levels. Capacity is augmented in 10G increments. So a scenario like the one I described with a temporary need for a burst in capacity from a single customer port is almost always supportable at the scale of our backbone and metro networks.

  • pete says:

    Yes, SDN does provide deeper insight, but it is not a measurement and management methodology. For a truly holistic view of network performance you need a measure that tells you how application performance is impacted by network quality attenuation, and whether this is due to a lack of bandwidth or to congestion, which are not the same thing. Quality attenuation is the single metric that SDN needs in order to see inside a network and understand how it impacts application performance; see here for more about it: https://goo.gl/jxIPVi .

    • psavill says:

      I agree. Ultimately the application performance is what matters. In the situation I described, we use our own performance management tools to collect the performance data, and then use an in-house built system to trigger the SDN to increase the bandwidth so that the network is not the bottleneck for application performance. We are currently able to initiate bandwidth allocations on customer virtual circuits via three methodologies: 1) A real-time change initiated directly by the customer through our portal, 2) A “pre-programmed” change based on time of day, week, month, year, and 3) A “pre-programmed” change triggered by real-time performance measures on the network such as peak utilization or packet loss. This last methodology is the one characterized in this article. The ability to initiate network reconfigurations via SDN on pre-programmed event triggers like these really gets us close to custom-configured networks that can be programmed to adapt, grow and change automatically without initiating an order with the carrier and waiting on a provisioning interval.

  • Ruprecht says:

    I doubt the network is going to be the bottleneck in his hypothetical example, but if it were, I guess the assumption is that there is already enough capacity on both the network side and to the end user site to expand to double or triple what is normal? If that’s the case, it seems the end user (and Level3) already have that capacity available anyway, so what’s the point? Is SDN essentially moving a rate-limit with the end user or on the Level3 network? If the SDN can provision and connect fibers between the end user and Level3 at Equinix in real-time to gain additional capacity, that would be amazing… but what is being described here simply isn’t. SDN does not solve everything. Everyone still has to do some amount of old fashioned capacity planning to be prepared for events like the one being described, if for nothing other than having the ports provisioned and fibers connected.

  • psavill says:

    The hypothetical example used was really just to communicate the idea of customer programmable networking using SDN. For this to work, you are right that the physical layer cross-connects from the customer switch to the carrier PE must already be in place. But SDN still adds tremendous value because most customers provision a physical port which is much bigger than the actual bandwidth that they commit to and buy from the carrier. For instance, they will connect to the carrier PE with a 1GE port, but only provision 100 Mbps of bandwidth on their EVC. They do this to make future scaling easier while only paying for what they need at the time. But when unexpected events cause surges in bandwidth, SDN can be used to accommodate these surges in real time with these pre-programmed, pre-authorized network changes. This ultimately creates a much more reliable network without paying for a lot of unused overhead bandwidth.
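To put numbers on the port-versus-EVC point in the reply above, here is a small illustrative Python sketch. The function name `burst_bandwidth_mbps` is hypothetical, and it assumes only what the reply states: a burst can raise the provisioned bandwidth on the EVC, but never past the physical port that is already connected.

```python
def burst_bandwidth_mbps(committed_mbps: float, multiplier: float,
                         port_mbps: float) -> float:
    """Bandwidth a burst can actually reach: the requested multiple of the
    committed rate, capped by the capacity of the physical port."""
    return min(committed_mbps * multiplier, port_mbps)


# 100 Mbps committed on a 1GE (1000 Mbps) port, as in the example above.
print(burst_bandwidth_mbps(100, 3, 1000))   # 300: triple capacity fits easily
print(burst_bandwidth_mbps(100, 15, 1000))  # 1000: capped by the physical port
```

In this example the customer has up to ten times the committed rate available before the port itself becomes the limit, which is why a two or three times burst needs no new physical provisioning.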
