
Tools 12. Using Prometheus with SNMP Exporter to Monitor Cisco IOS XR, Nokia SR OS and Arista EOS Network Devices

Dear friend,

Awareness of what is happening in your IT infrastructure (in our case, in the network) is a key success or failure factor for any modern business, as the vast majority of businesses now run online. This awareness is built on visibility of network events and activities, which in turn are reflected in data points that can be collected. In this blogpost we’ll cover how these data points can be collected in a multi-vendor network running Cisco IOS XR, Nokia SR OS and Arista EOS devices using Prometheus, which is one of the most popular monitoring platforms these days.


No part of this blogpost may be reproduced, stored in a
retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording,
or otherwise, for commercial purposes without the
prior permission of the author.

Is Monitoring Needed for Network Automation?

The ultimate state of any system, including IT/network infrastructure, is self-managed (self-healed, self-controlled, etc.). It is simply impossible to build a self-controlled system without monitoring and data collection: once we remove people from the decision making, the collected data is the only way for the system to know what its health is and where it should be moving. As such, proper monitoring is a must for automation systems.

By the way, we have training programs, which teach you how to build next-generation monitoring for network automation and, of course, the network automation systems and solutions themselves:

We offer the following training programs for you:

During these trainings you will learn the following topics:

We constantly update the materials in our trainings so that they stay relevant not only for the networks of today, but also for the future. Therefore, we put a lot of emphasis on the model-driven automation framework, all its building blocks and components, so that you can get away from legacy CLI-based scraping and automation. Join us and unleash your potential.

Start your automation training today.

Brief Description

We already discussed Prometheus in a few previous blogposts:

Therefore, we will skip the part explaining why you would like to use Prometheus for Network Monitoring. Read the aforementioned blogposts to get that info.

SNMP has been the primary method to monitor network devices, together with syslog, for decades. Although it recently got a massive rival in the form of streaming telemetry, it is still the leader and will remain so until the streaming telemetry ecosystem, as well as the level of its support across network devices, reaches the critical mass needed for skyrocketing growth. Until that happens, and probably even after that, SNMP will still be used to collect numerical data.

Let’s quickly brush up on SNMP. It stands for Simple Network Management Protocol and is used to communicate basic metrics of network devices, such as the state of an interface, the number of sent/received packets/bytes, etc. It can work in two modes:

Mode | Description
POLL | This is a mechanism for the network management system (NMS) to collect data from network devices at regular intervals (e.g., every 30 seconds). This method is primarily used to collect data for various graphs as well as to build up alerting.
TRAP | This approach is used when a data point shall be sent immediately by the network device to the NMS upon certain network events (e.g., a network interface went down) and, therefore, is primarily used for alerting.

Prometheus can implement the first mode for us, as it operates in pull mode. Take a look at the following picture:

SNMP polling with and without Prometheus

In the case of SNMP polling, the NMS collects data using GET, GET-NEXT, or GET-BULK requests and receives a response to each request. Prometheus uses the same model with its exporters, which is called pull mode: it performs a scrape request and receives some data back. As such, these two approaches follow the same request/response pattern and, therefore, can nicely complement each other in the following way:

The key role here is played by one specific exporter. As you may already know, an exporter is an agent, which is polled by the Prometheus central backend. Previously we have shown how to install the Node Exporter to collect data from disaggregated data center switches. What if, however, you are not that lucky and you run a traditional network operating system, such as Cisco IOS XR or Nokia SR OS, which are very popular in the Service Provider world? In such a scenario, you cannot install the Node Exporter on the network devices. That’s where the Prometheus SNMP exporter comes onto the stage.

The Prometheus SNMP exporter is an agent, which receives requests from Prometheus and responds to them in the format supported by Prometheus. Locally on the exporter there is a configuration file, which contains the OIDs from the MIBs to be polled, structured in so-called “modules”, as well as the credentials (i.e., SNMP community strings for SNMPv1 and SNMPv2c and usernames/passwords for SNMPv3). Upon a request from Prometheus, which includes the FQDN or IP address of the target device to be polled as well as the name of the module to use, the SNMP exporter looks in its config to find the necessary OIDs and credentials and attempts to poll the device. If that operation is successful, the SNMP exporter returns the collected data.
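To make this request flow more tangible, here is a minimal Python sketch of the URL that Prometheus effectively requests from the SNMP exporter for one device. It assumes the exporter serves the /snmp path with the target and module query parameters (as configured later in this blogpost); the helper name build_probe_url and the exporter address used in the example are illustrative only.

```python
from urllib.parse import urlencode


def build_probe_url(exporter: str, target: str, module: str = "if_mib") -> str:
    """Return the URL that is scraped on the SNMP exporter for one device.

    exporter: host:port of the SNMP exporter (assumed value, adjust to your lab)
    target:   FQDN or IP address of the network device to be polled
    module:   name of the module from the exporter configuration file
    """
    query = urlencode({"target": target, "module": module})
    return f"http://{exporter}/snmp?{query}"


if __name__ == "__main__":
    # Exporter address taken from the lab topology, device is one of the lab routers
    print(build_probe_url("192.168.51.72:9116", "192.168.101.11"))
```

Opening the resulting URL in a browser or with curl is roughly what the exporter test page does for you later in this blogpost.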

Lab Setup

As we promised in the title of this blogpost, we are going to deploy Prometheus and the SNMP exporter to poll data from a multi-vendor network running Cisco IOS XR, Nokia SR OS and Arista EOS network devices. Here is our topology:

For the Prometheus part, we are running:

We could run Prometheus on top of Kubernetes as well, and in general this would be the preferred way; however, that is out of scope for this lab.

As network devices, we run:

Enroll to our Zero-to-Hero Network Automation Trainings to become an expert in automation of Cisco, Nokia and Arista as well as to master Linux and Docker skills.

Solution Implementation

Step #1. Configure SNMPv3 in Cisco IOS XR, Nokia SR OS and Arista EOS

The very first step, before we jump into the deployment of Prometheus with the SNMP exporter, is to get SNMP configured and validated on the network devices.

We are going to configure SNMP version 3, which is considered to be the most secure version. Actually, we did that before in our blog, so we are going to re-use the same configuration examples.

It is important, though, to validate our SNMP operation before moving to Prometheus. This ensures that, if we have to troubleshoot the setup, we focus on a single unknown (i.e., only the Prometheus part) rather than trying to solve multiple problems at once.

In Ubuntu Linux, you need to install the following packages:


$ sudo apt-get update -y
$ sudo apt-get install snmp snmp-mibs-downloader -y

Once this is done, you can check SNMP operation:


$ snmpwalk -v 3 -l authPriv -u Collector -a SHA -A SUPER_AUTH -x AES -X SUPER_PASS 192.168.101.11 IF-MIB::ifXTable
IF-MIB::ifName.1 = STRING: system
IF-MIB::ifName.2 = STRING: oc_1/1/c1/1_0
IF-MIB::ifName.3 = STRING: oc_1/1/c2/1_0
IF-MIB::ifName.1610899520 = STRING: 1/1/c1
!
! FURTHER OUTPUT IS TRUNCATED FOR BREVITY


$ snmpwalk -v 3 -l authPriv -u Collector -a SHA -A SUPER_AUTH -x AES -X SUPER_PASS 192.168.101.12 IF-MIB::ifXTable
IF-MIB::ifName.2 = STRING: Null0
IF-MIB::ifName.3 = STRING: GigabitEthernet0/0/0/0
IF-MIB::ifName.4 = STRING: GigabitEthernet0/0/0/1
IF-MIB::ifName.5 = STRING: GigabitEthernet0/0/0/2
!
! FURTHER OUTPUT IS TRUNCATED FOR BREVITY


$ snmpwalk -v 3 -l authPriv -u Collector -a SHA -A SUPER_AUTH -x AES -X SUPER_PASS 192.168.101.13 IF-MIB::ifXTable
IF-MIB::ifName.1 = STRING: Ethernet1
IF-MIB::ifName.2 = STRING: Ethernet2
IF-MIB::ifName.999001 = STRING: Management1
!
! FURTHER OUTPUT IS TRUNCATED FOR BREVITY

Once SNMPv3 operation is validated, we can confidently say that this step is successfully completed. The next step is to set up the SNMP exporter.
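If you want to compare the interface inventory across all three vendors, the snmpwalk output above is easy to post-process. The following Python sketch is only an illustration: it assumes the textual output format shown above (IF-MIB::ifName.<index> = STRING: <name>), and the function name parse_ifnames is our own.

```python
import re

# Matches lines such as "IF-MIB::ifName.3 = STRING: GigabitEthernet0/0/0/0"
IFNAME_RE = re.compile(r"^IF-MIB::ifName\.(\d+) = STRING: (.+)$")


def parse_ifnames(snmpwalk_output: str) -> dict[int, str]:
    """Map ifIndex to interface name, skipping lines that are not ifName rows."""
    names: dict[int, str] = {}
    for line in snmpwalk_output.splitlines():
        match = IFNAME_RE.match(line.strip())
        if match:
            names[int(match.group(1))] = match.group(2)
    return names


if __name__ == "__main__":
    # Sample lines copied from the Arista EOS output above
    sample = """\
IF-MIB::ifName.1 = STRING: Ethernet1
IF-MIB::ifName.2 = STRING: Ethernet2
IF-MIB::ifName.999001 = STRING: Management1
"""
    print(parse_ifnames(sample))
```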

Step #2. Setup SNMP Exporter

It is expected that you know how to install Docker. If not, read this blogpost and enroll in the Zero-to-Hero Network Automation Training.

Before we launch the SNMP exporter, though, it is worth spending a few minutes on its configuration file. The official documentation suggests that you create your own configuration file. You definitely should do that if you rely on MIBs that are not part of the standard collection. If they are part of it, though, you don’t have to: you can simply download the default file and use it as the configuration for your SNMP exporter. Let’s do it:


$ mkdir -p snmp_exporter/config
$ cd snmp_exporter
$ wget https://raw.githubusercontent.com/prometheus/snmp_exporter/main/snmp.yml -O config/snmp.yaml

Spend some time looking through it to identify what is in there and what is not. For the purpose of this blogpost we will use the ifTable and ifXTable from IF-MIB, which are part of this file. As such, we don’t need to generate a new SNMP exporter configuration file. However, we need to add credentials. To do so, we add the information from our previous configuration to the corresponding module, if_mib:


$ vim config/snmp.yaml
! SOME OUTPUT IS TRUNCATED FOR BREVITY
!
if_mib:
  version: 3
  auth:
    username: Collector
    security_level: authPriv
    password: SUPER_AUTH
    auth_protocol: SHA
    priv_protocol: AES
    priv_password: SUPER_PASS
  walk:
  - 1.3.6.1.2.1.2
  - 1.3.6.1.2.1.31.1.1
!
! FURTHER OUTPUT IS TRUNCATED FOR BREVITY

Now the configuration of the SNMP exporter is ready and we can bring it up. The good thing is that you can, and generally should, do it before bringing up Prometheus itself, as the SNMP exporter has a built-in mechanism to test SNMP polling, which helps to:

We stated that in this lab we are going to use Docker Compose to orchestrate the deployment; therefore, we need to create a Docker Compose file for that:


$ tee docker-compose.yaml << __EOF__
---
version: "3.9"
services:
  snmp:
    restart: always
    image: "prom/snmp-exporter:latest"
    ports:
    - "9116:9116"
    volumes:
    - "./config/snmp.yaml:/etc/snmp_exporter/snmp.yml"
__EOF__

Enroll to Zero-to-Hero Network Automation Training to master Docker and Docker-Compose skills

Bring the application up:


$ sudo docker compose up -d
[+] Running 5/5
 ⠿ snmp                                                 Pulled         3.2s
   ⠿ 22b70bddd3ac                                       Pull complete  0.6s
   ⠿ 5c12815fee55                                       Pull complete  1.1s
   ⠿ a80d1d2a0e12                                       Pull complete  1.5s
   ⠿ b6c49ac14299                                       Pull complete  1.7s
[+] Running 2/2
 ⠿ Network snmp_exporter_default                        Created        0.1s
 ⠿ Container snmp_exporter-snmp-1                       Started        0.5s

Validate that the container is indeed up and running:


$ sudo docker container ls | grep snmp
10f48a33ba86   prom/snmp-exporter:latest       "/bin/snmp_exporter …"   10 hours ago   Up 10 hours   0.0.0.0:9116->9116/tcp, :::9116->9116/tcp   snmp_exporter-snmp-1

As it looks to be properly up and running, you shall be able to connect to it using the IP address of the host (per our lab topology it is 192.168.51.72) and the port 9116/TCP:

Starting page of SNMP Exporter

If you see the same picture, it means that you are properly connected to the SNMP exporter. The next step is to validate the operation of SNMP polling. Provide the IP address or FQDN of the device you will be polling in the target field and press Submit:

Sample output of the successful SNMP polling

If credentials were accurate and the network device is reachable on the SNMP port, the polling shall be successful and you shall be able to see some metrics with or without labels.
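The page you get back is plain text in the Prometheus exposition format, so it can also be checked programmatically. Below is a simplified Python sketch of a parser for such output. It is not a full implementation of the format (for instance, samples with an explicit timestamp are skipped), the function name parse_metrics is our own, and the sample values are made up for illustration.

```python
import re

# name{label="value",...} value   (the label part is optional)
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Return (metric name, labels, value) for every sample line in the text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip empty lines and "# HELP" / "# TYPE" comments
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue  # unsupported line (e.g. sample with a timestamp)
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    # Made-up sample in the same shape as the exporter output
    sample = '''# TYPE ifHCInOctets counter
ifHCInOctets{ifIndex="1",ifName="Ethernet1"} 1.5e+03
'''
    for name, labels, value in parse_metrics(sample):
        print(name, labels.get("ifName"), value)
```

A quick check such as "is there at least one ifHCInOctets sample per device" is a handy smoke test before you move on to Prometheus.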

Step #3. Setup Prometheus

Finally, you should set up Prometheus itself. As we have previously discussed the setup in general, we will now focus only on the SNMP job. Among others, there are some important considerations to take into account when you set up scraping with Prometheus:

In our case we want to be able to detect even relatively short spikes; therefore, we will set scrape_interval for Prometheus to 10 seconds. Be mindful when setting such an aggressive interval in a production network, as it may generate additional CPU/memory load on network devices and require a significant amount of storage.
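To get a feeling for the storage side of this trade-off, a back-of-the-envelope calculation helps. The Python sketch below is only an estimate under stated assumptions: the number of time series (3000) is a hypothetical figure, and the default of 2 bytes per sample is an assumed average that you should replace with what you observe in your own deployment.

```python
def samples_per_day(series: int, scrape_interval_s: int) -> int:
    """Number of samples stored per day for a given number of time series."""
    return series * (86_400 // scrape_interval_s)


def storage_bytes_per_day(series: int, scrape_interval_s: int,
                          bytes_per_sample: float = 2.0) -> float:
    """Rough daily storage need; bytes_per_sample is an assumed average."""
    return samples_per_day(series, scrape_interval_s) * bytes_per_sample


if __name__ == "__main__":
    series = 3000  # hypothetical: 3 devices x 50 interfaces x 20 metrics
    for interval in (10, 60):
        print(interval, samples_per_day(series, interval),
              storage_bytes_per_day(series, interval))
```

With these assumptions, moving from a 60-second to a 10-second interval multiplies the number of stored samples (and SNMP requests towards the devices) by six.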

Let’s create a config file for Prometheus:


$ mkdir ../prometheus
$ cd ../prometheus
$ tee prometheus.yaml << __EOF__
---
global:
  scrape_interval: 10s
  scrape_timeout: 5s
  evaluation_interval: 5s

scrape_configs:
  - job_name: 'office1-snmp'
    metrics_path: /snmp
    params:
      module: [if_mib]
    static_configs:
      - targets:
        - 192.168.101.11
        - 192.168.101.12
        - 192.168.101.13
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 172.18.0.2:9116
__EOF__

172.18.0.2 is an IP address of the SNMP Exporter, which was allocated by Docker automatically. You can get it using “docker container inspect snmp_exporter-snmp-1” command.
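The relabel_configs part of the job above is what redirects the scrape from the network device to the SNMP exporter. As a minimal illustration, the Python sketch below mimics the effect of the three rules on the labels of a single target; it is a simplification of what Prometheus does internally, and the function name apply_relabeling is our own.

```python
def apply_relabeling(address: str, exporter: str) -> dict[str, str]:
    """Mimic the three relabel rules for one target from static_configs."""
    # Before relabeling, __address__ holds the entry from the targets list
    labels = {"__address__": address}
    # Rule 1: copy the device address into the "target" URL parameter
    labels["__param_target"] = labels["__address__"]
    # Rule 2: expose the device address as the "instance" label
    labels["instance"] = labels["__param_target"]
    # Rule 3: send the actual scrape to the SNMP exporter instead of the device
    labels["__address__"] = exporter
    return labels


if __name__ == "__main__":
    print(apply_relabeling("192.168.101.11", "172.18.0.2:9116"))
```

As a result, Prometheus talks to 172.18.0.2:9116, passes the device address in the target parameter, and still labels the collected metrics with the address of the network device.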

Now you can bring the Prometheus up, also using Docker compose:


$ tee docker-compose.yaml << __EOF__
---
version: "3.9"
services:

  prometheus:
    restart: always
    image: "prom/prometheus"
    ports:
      - "9090:9090/tcp"
    command:
      - "--config.file=/etc/prometheus/prometheus.yaml"
      - "--storage.tsdb.path=/prometheus"
      - "--web.console.libraries=/usr/share/prometheus/console_libraries"
      - "--web.console.templates=/usr/share/prometheus/consoles"
    volumes:
      - "./prometheus.yaml:/etc/prometheus/prometheus.yaml:ro"
      - "prometheus_db:/prometheus"

volumes:
  prometheus_db:

__EOF__

Start the Prometheus:


$ sudo docker compose up -d
[+] Running 3/3
 ⠿ Network prometheus_default                    Created        0.1s
 ⠿ Volume "prometheus_clangm_prometheus_db"      Created        0.0s
 ⠿ Container prometheus-prometheus-1             Started        0.8s

Check that it is up and running:


$ sudo docker container ls | grep prome
fbe8cce6bfc0   prom/prometheus                 "/bin/prometheus --c…"   25 hours ago   Up 9 hours    0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   prometheus-prometheus-1

As Prometheus and the SNMP exporter both appear to be up and running, it is time to verify that monitoring actually works.

Validation

Connect to Prometheus UI using the IP address of the host (192.168.51.72) and the 9090/TCP port:

Prometheus start up page
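Besides the UI, the state of the scrape targets is available via the Prometheus HTTP API at /api/v1/targets. The Python sketch below shows one possible way to summarize such a response; the function name summarize_targets is our own, and the JSON in the example is a shortened, made-up payload containing only the fields the function reads.

```python
import json


def summarize_targets(payload: str) -> dict[str, str]:
    """Map the instance label of each active target to its health."""
    data = json.loads(payload)
    summary = {}
    for target in data["data"]["activeTargets"]:
        # Fall back to the scrape URL if the instance label is missing
        name = target["labels"].get("instance", target.get("scrapeUrl", "unknown"))
        summary[name] = target["health"]
    return summary


if __name__ == "__main__":
    # Shortened, made-up response body for illustration
    sample = json.dumps({
        "status": "success",
        "data": {"activeTargets": [
            {"labels": {"instance": "192.168.101.11", "job": "office1-snmp"},
             "health": "up"},
            {"labels": {"instance": "192.168.101.12", "job": "office1-snmp"},
             "health": "down"},
        ]},
    })
    print(summarize_targets(sample))
```

In the lab, the real payload could be fetched, for example, with curl from http://192.168.51.72:9090/api/v1/targets.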

Explore the collected data. For example, type just “if” in the “Expression” field. You shall see a number of suggestions:

Suggestions of collected metrics

One of the useful graphs you can build is interface utilization in bps. In the “Expression” field, type “rate(ifHCInOctets[1m])*8” and press “Execute”. Depending on the traffic levels in your network, you shall see a somewhat similar picture:

Woohoo, your monitoring of network with Prometheus leveraging SNMP is working.
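If you wonder what the expression rate(ifHCInOctets[1m])*8 does with the collected counters, the following Python sketch shows the idea. It is deliberately simplified: it takes the first and the last sample in the window and ignores counter resets and the extrapolation to the window boundaries, both of which Prometheus handles. The function name and the sample values are made up for illustration.

```python
def simple_rate_bps(samples: list[tuple[float, float]]) -> float:
    """Approximate rate(counter[window]) * 8 for (timestamp_s, octets) samples.

    Simplified on purpose: no counter-reset handling and no extrapolation
    to the window boundaries.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    # Octets per second over the window, converted to bits per second
    return (v_last - v_first) / (t_last - t_first) * 8


if __name__ == "__main__":
    # Made-up ifHCInOctets samples scraped every 10 seconds within one minute
    window = [(0, 1_000_000), (10, 2_250_000), (20, 3_500_000),
              (30, 4_750_000), (40, 6_000_000), (50, 7_250_000)]
    print(simple_rate_bps(window))
```

This also explains why the window (1m) should span several scrape intervals: with a 10-second scrape_interval there are about six samples to work with.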

By the way…

As you can see, the amount of information, tools and patterns you may need in modern networking is incredible, and it just keeps growing. To help you be successful in your job, whether you are an IT/network engineer caring about your career or a manager/director/CTO caring about your network/IT infrastructure, we offer you the best training programs, which are based on years of real-world experience of designing, building and operating network automation systems and solutions. Leverage our expertise to let you focus on your core tasks. Enroll yourself or your teams in our training programs and start training now.

Lessons Learned

The biggest lesson learned for us was that it is important and useful to look into your past notes (i.e., blogposts). The main reason why we write blogposts is that, despite the huge amount of information available worldwide, 99.99% of it is noise. When we were configuring SNMPv3 on Cisco IOS XR and went through the official documentation, there were some pieces missing. Same story for Nokia SR OS and Arista EOS. However, when we started looking for further examples, we encountered our own blogpost from back in 2019, which has a simple and clear explanation of how to configure SNMPv3 on the vendor of our choice. As a result, we saved a tremendous amount of time and focused on the core topic of the blog, which is the SNMP Exporter for Prometheus.

Summary

If you look at it from a 1000-foot view, it may seem that Prometheus is yet another monitoring system: we already have Nagios, Zabbix, LibreNMS, InfluxData/InfluxDB, and now Prometheus. The statement is absolutely fair. At the same time, the variety of tools, good Open Source tools, on the market gives you the possibility to choose what fits your use case on the one hand, and what can benefit the wider enterprise context on the other. Ultimately, if you are able to reach the point where you have a single (though, for sure, highly redundant) monitoring system for your entire infrastructure, that would significantly simplify data correlation between different events and holistic analysis. Take care and good bye!

Need Help? Contract Us

If you need a trusted and experienced partner to automate your network and IT infrastructure, get in touch with us.

P.S.

If you have further questions or you need help with your networks, we are happy to assist you, just send us a message. Also don’t forget to share the article on your social media, if you like it.

BR,

Anton Karneliuk
