High-Performance Distributed SNMP Monitoring
The Simple Network Management Protocol (SNMP) has long been a cornerstone of performance management for data networks. Most network devices support SNMP, which lets network operators collect statistics for the links between devices as well as health counters for the devices themselves, such as temperature, memory consumption and CPU utilization.
The main use case for SNMP-based monitoring has been tracking link utilization. Utilization gives a coarse estimate of the quality of service that customer connections traversing a link are getting. If utilization is high, packets are most probably being delayed in the router's queue while waiting to be sent out on the link. Thus, if your average link utilization is approaching 60-70% of total capacity, you are probably starting to see performance degradation on customer connections. Furthermore, if you start to see packet loss on your links, you know that customers are already affected and traffic is being lost.
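To make the utilization figure concrete, here is a minimal sketch of how it can be derived from two samples of an interface octet counter (for example ifHCInOctets from the standard IF-MIB). The function name and its parameters are illustrative assumptions, not part of any particular product, and the counter values are assumed to have been fetched already.

```python
def link_utilization(octets_start, octets_end, interval_s, speed_bps, counter_bits=64):
    """Average utilization (0.0 to 1.0) between two octet-counter samples.

    Hypothetical helper: octets_start/octets_end are raw counter readings,
    interval_s is the time between them, speed_bps is the link capacity.
    """
    if interval_s <= 0 or speed_bps <= 0:
        raise ValueError("interval_s and speed_bps must be positive")
    # SNMP counters wrap at 2**32 or 2**64; the modulo recovers the true
    # delta as long as the counter wrapped at most once between samples.
    delta_octets = (octets_end - octets_start) % (1 << counter_bits)
    return (delta_octets * 8) / (interval_s * speed_bps)


# A 100 Mbit/s link that carried 2,625,000,000 octets in five minutes:
utilization = link_utilization(0, 2_625_000_000, 300, 100_000_000)
print(f"{utilization:.0%}")
```

With these numbers the link sits at 70%, the upper end of the range where degradation typically starts to show. On fast links a 32-bit counter can wrap more than once between polls, which is one reason to prefer the 64-bit counters and shorter polling intervals.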
As networks grow and become more complex, the SNMP monitoring approach runs into some challenges. The data is typically collected centrally from a server in the operator's network operations center (NOC). As the network becomes congested, SNMP packets can be lost while travelling through the congested pipes, and there can be a significant delay between the server requesting performance counters from a device and the response coming back. And as the number of monitoring points in the network grows, the server eventually starts to struggle with the load.
Also, with the more complex networks of today, you want to poll the devices more frequently, perhaps targeting one-minute reporting intervals instead of the typical five-minute intervals used today. The frequent polling further increases the load on the server.
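A quick back-of-the-envelope calculation shows how the polling interval drives server load. The sketch below uses made-up network dimensions (2,000 devices, 48 interfaces each, 4 counters per interface) purely for illustration, and counts counter values rather than SNMP packets, since real pollers usually batch several values into one request.

```python
def values_per_second(devices, interfaces_per_device, counters_per_interface, interval_s):
    """Counter values the poller must collect per second (hypothetical model)."""
    total_values = devices * interfaces_per_device * counters_per_interface
    return total_values / interval_s


# Illustrative network size; these figures are assumptions, not measurements.
five_minute = values_per_second(2000, 48, 4, 300)
one_minute = values_per_second(2000, 48, 4, 60)
print(five_minute, one_minute, one_minute / five_minute)
```

Moving from five-minute to one-minute intervals multiplies the sustained collection rate by five, here from 1,280 to 6,400 values per second, before any retries caused by lost packets.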
The solution is to deploy distributed probes in the network that perform the SNMP collection closer to the devices being monitored. This spreads the load across multiple probes and, because each probe sits closer to its devices, reduces the delay between requests and responses. Deploying distributed probes can be costly if done only to increase the scalability of the SNMP solution. However, as we will discuss in upcoming blogs, modern data networks need active monitoring combined with the passive data for a comprehensive understanding of your network's state and health. The same probes can perform the active monitoring while they are performing the high-performance SNMP data collection.
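One simple way to picture the distribution is to hand each device to the probe with the lowest measured round-trip time. The sketch below is only an illustration of that idea under assumed inputs: the device and probe names are invented, the RTT values are made up, and a real deployment would also weigh probe capacity and topology.

```python
from collections import defaultdict


def assign_devices(rtt_ms):
    """Map each device to its nearest probe.

    rtt_ms: {device: {probe: round-trip time in ms}} (hypothetical input).
    Returns {probe: [devices]} with devices in sorted order.
    """
    assignment = defaultdict(list)
    for device, probe_rtts in sorted(rtt_ms.items()):
        # Pick the lowest RTT; break ties by probe name for determinism.
        nearest = min(probe_rtts, key=lambda probe: (probe_rtts[probe], probe))
        assignment[nearest].append(device)
    return dict(assignment)


measured = {
    "router-1": {"probe-a": 2.0, "probe-b": 40.0},
    "router-2": {"probe-a": 35.0, "probe-b": 3.0},
    "router-3": {"probe-a": 1.5, "probe-b": 50.0},
}
print(assign_devices(measured))
```

Each probe then polls only its own devices, so both the per-poller load and the request-to-response delay drop compared with polling everything from the NOC.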
The solution offers you a combined active and passive view of network quality: you can correlate service QoS (jitter, delay and packet loss) with link utilization and device health counters for a deeper understanding of your network quality and trends. You can avoid costly outages by quickly isolating issues and finding the root cause. And by understanding the underlying trends in the network, you can proactively avoid issues by planning capacity upgrades and configuration changes ahead of time.
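As a rough illustration of what correlating the two data sources can look like, the sketch below computes a plain Pearson correlation between link utilization samples and delay measurements taken over the same intervals. This is a generic statistical technique chosen for the example, not a description of how any specific product does it, and the sample values are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Invented samples: utilization (fraction) and one-way delay (ms) per interval.
utilization = [0.35, 0.48, 0.61, 0.72, 0.80]
delay_ms = [4.1, 4.3, 5.0, 7.9, 12.4]
print(round(pearson(utilization, delay_ms), 2))
```

A coefficient close to 1 suggests that delay rises together with utilization on that link, which points towards congestion rather than, say, a routing change as the likely cause.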
Summary
As networks evolve, more scale and granularity are needed for efficient bandwidth management, and for the critical traffic on your networks you also need to augment the data with an end-to-end view of the bandwidth. Creanord can offer an advanced toolset for efficient bandwidth management, whether you want to optimize your existing link-based solution or move to a more granular and efficient centralized solution complemented by tools for end-to-end bandwidth management. The picture below provides an overview of the Creanord solution.
Creanord is a specialist in service assurance with more than 20 years of experience in developing solutions for network service providers and cloud providers. Creanord’s service assurance solutions enable accurate tracking of network and application quality and performance, and the technology has been implemented in over 30 countries and more than 60 networks globally.
Contact us to find out more about how we can help you build outperforming networks.