Network automation is arguably the hardest problem for communications service providers (CSPs) to solve, and it comes down to two questions: Is the network okay? How will you know? Indeed, network automation was a key topic in multiple sessions at the recent North American Network Operators Group (NANOG) conference in Seattle, June 12–14, 2023.
The State of Network Automation in 2023
At NANOG 88, Chris Grundemann (of Grundemann Technology Solutions) gave a presentation focused on his organization’s “2023 State of Network Automation Survey,” with results based on 78 responses from NANOG members from 20 countries. Respondents included network operators from internet service provider (ISP), network service provider (NSP), and managed service provider (MSP) organizations and from CloudOps, public sector IT networks, and enterprise IT network teams. Grundemann noted how networks are continually getting larger and more complex with multicloud, software-defined networking (SDN), 5G, and ever-expanding connections between networks, but companies and organizations are not increasing the number of people tasked with operating the networks!
As his remarks recognized, we are still in the early days of network automation. CSPs and enterprises are prioritizing what gets automated and are automating with a range of solutions that run from “home brew” applications to commercial applications and operational support system (OSS) paid support. To make matters more challenging, the survey highlights a skills gap: only 27 percent of the responding companies say they have staff with automation expertise.
What Is Getting Automated?
It is no surprise that service provisioning and network/service troubleshooting were at the top of the list of things to automate. Device deployment, service provision firmware upgrades, capacity planning, traffic engineering, and network design were also listed among the top automation targets. Network traffic continues to grow dramatically and to change (as it did with the pandemic and now with post-pandemic traffic patterns), and that growth, along with the move to the cloud, makes manual control of the network increasingly difficult. Simply said, there are not enough hands to turn all the dials.
Why Is Network Automation So Hard?
The challenge of network automation may be best exemplified by 5G Standalone networks. The movement to the cloud with 5G promises to usher in a new wave of specialized enterprise services that require ultra-low latency and ultra-high bandwidth and reliability for applications such as robotic manufacturing, augmented reality/virtual reality (AR/VR), mining, remote medicine, gaming, smart cities, and more. But moving to virtualized, disaggregated, and containerized infrastructure brings never-before-seen challenges for the operations and network engineering teams responsible for running these networks. The control plane traffic can be encrypted with Transport Layer Security (TLS) 1.2/1.3, packets no longer identify the network functions, and network functions can be disaggregated and variably instantiated on different compute resources. How do you maintain visibility in this cloudy 5G network?
Making Sure the Network Is Okay
Although it sounds like a straightforward task to make sure the network is operating as it should, the increasing complexity of the network makes that increasingly challenging. As Major League Baseball Director of Solution Engineering Jeremy Schulman noted in his NANOG presentation on network assurance, the challenge for network operations and engineering teams is “ensuring that the network is operating as expected and reporting any anomalies with as much context as possible to maximize situational awareness.”
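As a rough illustration of that idea, the minimal Python sketch below compares an expected network state with an observed one and reports each mismatch along with its context. It is not taken from Schulman's presentation; the device names, check names, and the `find_anomalies` helper are all hypothetical and exist only to make the concept concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    """One mismatch, carrying the context an operator would need."""
    device: str
    check: str
    expected: str
    observed: str


def find_anomalies(expected, observed):
    """Compare expected state against observed state, device by device."""
    anomalies = []
    for device, checks in expected.items():
        seen = observed.get(device, {})
        for check, want in checks.items():
            # A check with no observation at all is reported as "missing".
            got = seen.get(check, "missing")
            if got != want:
                anomalies.append(Anomaly(device, check, want, got))
    return anomalies


# Hypothetical data for illustration only.
expected_state = {
    "edge-router-1": {"uplink": "up", "bgp-peer-a": "established"},
    "edge-router-2": {"uplink": "up"},
}
observed_state = {
    "edge-router-1": {"uplink": "up", "bgp-peer-a": "idle"},
}

for a in find_anomalies(expected_state, observed_state):
    print(f"{a.device}: {a.check} expected {a.expected}, observed {a.observed}")
```

In practice the expected state would come from a design or intent record and the observed state from live telemetry, which is where the source-of-truth question becomes important.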
What Do CSPs Use as the Source of Truth to View and Report on the Network?
Many consider IP packets, rather than NetFlow, error logs, or other methods of network visibility, to be the best source of truth in the network because they most closely represent the actual signaling and user plane traffic. But gaining access to IP packets must be cost-effective and efficient so that network operators have end-through-end visibility in near real time. Constantly updating the network construct or service dependency mapping is also critical to providing visibility and actionable intelligence. Because networks are not static, the network design of yesterday does not always reflect the actual network configuration today. Equally important, performance monitoring and service assurance applications must not only provide reactive and proactive service triage but also become predictive and prescriptive with the aid of artificial intelligence (AI) and machine learning (ML).