
Tools 8. Monitoring Network Performance with Dockerised Prometheus, Iperf3 and Speedtest

Hello my friend,

In a time when business is conducted online, it is vital to have clear visibility into the health of your services and their performance, especially if they rely on transmission media or other components outside of your immediate control. Earlier in our blogposts we covered how and why to use iperf3 to measure performance between your hosts and Speedtest to measure the performance of your Internet connectivity. Today we’ll show how to automate this process with the help of Prometheus.


No part of this blogpost could be reproduced, stored in a
retrieval system, or transmitted in any form or by any
means, electronic, mechanical or photocopying, recording,
or otherwise, for commercial purposes without the
prior permission of the author.

How Can We Automate Monitoring?

Automation is not only about Ansible and Python. Knowing how to properly use various applications, especially the great open source tools available on the market, is key to your success. At the same time, Ansible plays a key role in rolling out applications these days, as it helps to ensure that deployment is done in a consistent way. Ansible is like an extra pair of hands (or multiple extra pairs of hands) for you.

And we are delighted to teach you how to use it in an efficient way at our trainings!

We offer the following training programs:

During these trainings you will learn the following topics:

Moreover, we put all the mentioned technologies in the context of real use cases, which our team has solved and is solving in various projects in service provider, enterprise and data centre networks and systems across Europe and the USA. That gives you the opportunity to ask questions, to understand the solutions in depth, and to discuss your own projects. On top of that, each technology is provided with online demos and labs so that you can master your skills thoroughly. Such a mixture creates a unique learning environment, which all our students value so much. Join us and unleash your potential.

Start your automation training today.

Brief Description

If we simplify the monitoring of any application or IT/network infrastructure and describe it at a high level, we can summarise it with the following steps:

Earlier we shared how to implement monitoring using Telegraf, InfluxDB and Grafana, which together form one of the two most popular open source frameworks for data collection and monitoring:

We have already talked about InfluxDB and Telegraf: part 1 covers SNMP and part 2 covers Syslog.

Today we’ll take a look at another key open source framework for metrics collection (monitoring), which is called Prometheus. In fact, InfluxDB and Prometheus are the most widely used time series databases these days.

They have a few major differences (the comparison is by no means complete; it serves to outline the differences relevant to our use case):

Metric collection:
- Prometheus: pull mode – Prometheus connects to a probe and actively pulls the data from it on a regular basis.
- InfluxDB: push mode – InfluxDB listens for requests and writes data pushed by probes (e.g., Telegraf clients).

Metric types:
- Prometheus: numbers only – Prometheus stores only numerical data (e.g., counters).
- InfluxDB: numbers and strings – InfluxDB can store numbers (e.g., counters) as well as text strings (e.g., syslog messages).

Therefore, the choice of tool is ultimately based on your network topology (can you expose your InfluxDB, or can you make your Prometheus probes accessible?) as well as on the type of data you want to collect.

As Prometheus is the main focus of this blogpost, we’ll cover it solely. It is easiest to explain how it works with the example of a particular use case.

Use Case

Consider the following scenario.

You deploy a geo-redundant application in two 3rd-party data centres. Your application requires a stable communication channel between them, and you want to know for sure that the performance of that channel is consistent. Also, your application serves customers on the Internet, and you are interested in checking the uplink capacity in each data centre. Ultimately, you would like to run these tests on a regular basis, to see the results over time and, in addition, to establish automated alerting in case the performance doesn’t match the defined limits.

Solution Description

In the described scenario, we assume that there is private connectivity between the two data centres, built in one of the overlay VPN forms:

If we have private connectivity between our hosts, it generally means that you can deploy a pull model, where a host running Prometheus connects to a probe, which collects metrics. The probe in the Prometheus world is called an exporter and typically consists of two components:

To address the aforementioned scenario, we will deploy the following setup:

There are 3 hosts in this solution, which in our case are VMs running Ubuntu Linux 20.04 LTS, but they can run any other operating system as long as you can have Docker installed and running.

Join our Zero-to-Hero Network Automation Training to master Linux and Docker administration skills.

The rest of the setup is as follows:

Looking at the image above, you can also see the communication flows:

We hope that both the setup and the use cases are clear to you, and that you are now interested to see how it is deployed.

Implementation Details

It is assumed that you have freshly installed Ubuntu 20.04 LTS on your hosts.

Step #0. Install Docker

These days, containers are the standard for packaging applications in Linux, as they give you flexibility and simplicity in managing the necessary dependencies. As such, the first step is to have Docker installed:


$ sudo apt-get update -y && sudo apt-get upgrade -y
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
$ echo   "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update -y
$ sudo apt-get install docker-ce docker-ce-cli containerd.io -y

Once installed, check that Docker is working by launching a test container:


$ sudo docker run hello-world

For further details (e.g., other Linux distributions), consult the official Docker documentation.

The final preparatory step is to have the docker-compose tool installed, as it is an easy and convenient way of managing multiple Docker containers with a single command:


$ sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
$ sudo chmod +x /usr/local/bin/docker-compose
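Once downloaded, a quick sanity check confirms the binary is in place (a minimal sketch; it assumes docker-compose was installed to /usr/local/bin as in the commands above):

```shell
#!/bin/sh
# Check the docker-compose binary installed above (assumes /usr/local/bin path).
if [ -x /usr/local/bin/docker-compose ]; then
    /usr/local/bin/docker-compose --version
else
    echo "docker-compose not found - repeat the installation step"
fi
```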

Enrol to Zero-to-Hero Network Automation Training to master containers and Docker skills

Step #1. Hosts with Exporters

The good news is that there is already a number of open source tools available for us: all three pieces of our probing setup (iperf3_exporter, iperf3_server, and speedtest_exporter) already exist. So, you need to create a directory on each host and create the following docker-compose.yaml file:


$ mkdir performance-monitoring
$ cd performance-monitoring
$ cat docker-compose.yaml
---
version: "3.9"
services:
  iperf3_querier_exporter:
    image: "edgard/iperf3-exporter:latest"
    ports:
      - "9579:9579"
  iperf3_responder:
    image: "networkstatic/iperf3"
    ports:
      - "5201:5201"
    command:
      - "-s"
  speedtest:
    image: "jraviles/prometheus_speedtest:latest"
    ports:
      - "9516:9516"
...

If for various reasons TCP ports 9579, 5201 or 9516 are already in use, you can modify them in this file and then amend them in the Prometheus config accordingly.
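Before pointing Prometheus at the exporters, you can smoke-test them by hand. The sketch below prints the curl commands for our lab addressing (192.168.51.72 and 192.168.52.72 are taken from the setup above; adjust to your hosts). Requesting /probe on the iperf3 exporter triggers an iperf3 test towards the target passed in the URL, while requesting /probe on the speedtest exporter triggers a speedtest.

```shell
#!/bin/sh
# Lab addresses from the setup above (assumptions - adjust to your hosts).
HOST_A="192.168.51.72"
HOST_B="192.168.52.72"

# iperf3_exporter on host A: the iperf3 server to test against is passed
# via the 'target' and 'port' URL parameters.
echo "curl -s 'http://${HOST_A}:9579/probe?target=${HOST_B}&port=5201'"

# speedtest_exporter on host A: no parameters needed, /probe runs the test.
echo "curl -s 'http://${HOST_A}:9516/probe'"
```

Running the printed commands on a host with connectivity to the exporters should return metrics in the Prometheus plain-text exposition format.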

Bring all the containers up using the docker-compose tool:


$ sudo docker-compose up -d
[sudo] password for aaa:
Starting performance-monitoring_speedtest_1              ... done
Starting performance-monitoring_iperf3_responder_1       ... done
Starting performance-monitoring_iperf3_querier_exporter_1 ... done

Check that the containers with the Prometheus exporters for iperf3 and speedtest, as well as the iperf3 server, are up and running:


$ sudo docker container ls
CONTAINER ID   IMAGE                                  COMMAND                  CREATED       STATUS       PORTS                                       NAMES
7fa699d69c54   edgard/iperf3-exporter:latest          "/bin/iperf3_exporter"   6 hours ago   Up 3 hours   0.0.0.0:9579->9579/tcp, :::9579->9579/tcp   performance-monitoring_iperf3_querier_exporter_1
8b68851bba03   jraviles/prometheus_speedtest:latest   "python -m prometheu…"   6 hours ago   Up 3 hours   0.0.0.0:9516->9516/tcp, :::9516->9516/tcp   performance-monitoring_speedtest_1
96869a6c87a8   networkstatic/iperf3                   "iperf3 -s"              6 hours ago   Up 3 hours   0.0.0.0:5201->5201/tcp, :::5201->5201/tcp   performance-monitoring_iperf3_responder_1

If all of them are up and running, then the probes’ side of the setup is ready and you can move on to the setup of Prometheus itself.

Though it is not shown explicitly, the same config is applied on both monitoring hosts.

Step #2. Host with Prometheus

The setup of the host with Prometheus consists of two pieces:

Both of these are to be located in the same directory, so create it:


$ mkdir prometheus
$ cd prometheus

Let’s start with the first one.

Step #2.1. Prometheus Configuration File

First of all, create a directory for the config file and the config file itself:


$ mkdir config
$ cat config/prometheus.yaml
---
global:
  scrape_interval: 5m
  scrape_timeout: 2m

scrape_configs:
# iperf3 tests
  - job_name: 'iperf3-probe-1'
    metrics_path: /probe
    static_configs:
      - targets:
        - 192.168.52.72
    params:
      port: ['5201']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.51.72:9579
  - job_name: 'iperf3-probe-2'
    metrics_path: /probe
    static_configs:
      - targets:
        - 192.168.51.72
    params:
      port: ['5201']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.52.72:9579
# speedtest tests
  - job_name: 'speedtest-probe-1'
    metrics_path: /probe
    static_configs:
    - targets:
      - 192.168.51.72:9516
  - job_name: 'speedtest-probe-2'
    metrics_path: /probe
    static_configs:
    - targets:
      - 192.168.52.72:9516
...

Here we have a few parts to digest:

Pay attention that static_configs/targets for the iperf3_exporter and the speedtest_exporter perform different tasks! For the iperf3 jobs, the target is the remote iperf3 server, and the relabel_configs move it into the target URL parameter while pointing the actual scrape at the exporter’s address; for the speedtest jobs, the target is the exporter itself.

Step #2.2. Docker-compose File for Prometheus

Once the configuration file is ready, you can create the Docker container with Prometheus. However, before you do that, consider the following point.

By default, Docker containers don’t store any data persistently; therefore, all data existing in them (e.g., collected metrics) is gone once you tear down or restart the containers. To avoid this data loss, you need to create persistent storage with volumes, which allows you to retain the important data. You may wonder: why would you want to reload the containers at all? The answer is straightforward: the config file created in the previous step is read and processed upon container launch. Therefore, in order to add new probes or measurements, you need to restart the container with Prometheus. That’s why you need to take care of persistent storage for your data:


$ cat docker-compose.yaml
---
version: "3.9"
services:
  prom:
    image: "prom/prometheus"
    ports:
      - "9090:9090"
    volumes:
      - "./config:/etc/prometheus"
      - "tsdb_persistent:/prometheus"
volumes:
  tsdb_persistent:
...

Like in the previous case, the docker-compose file is used to create the service structure, relying on the official Prometheus Docker image, the persistent volume you create, and the configuration file. You also need a port to connect to the Prometheus instance; therefore, we expose its default port 9090/TCP.

Once done, boot that up:


$ sudo docker-compose up -d
Creating network "prometheus_default" with the default driver
Creating volume "prometheus_tsdb_persistent" with default driver

And check it is working:


$ sudo docker container ls
CONTAINER ID   IMAGE             COMMAND                  CREATED         STATUS         PORTS                                       NAMES
cca15ce2189f   prom/prometheus   "/bin/prometheus --c…"   5 minutes ago   Up 5 minutes   0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   prometheus_prom_1

Earlier, in the configuration file, we set the metrics collection interval to 5 minutes, so you can go for a brew until some metrics are collected.

Validation

Connect to your Prometheus IP address and port (in our case, the IP address of the host running the Docker container with Prometheus is 192.168.51.71). Navigate to the “Status -> Targets” tab:

Once connected, provided that you have allowed at least 5-10 minutes since the start of the container with Prometheus, that all other steps were done correctly, and that there are no connectivity issues towards the exporters, you should see the following picture:

If the status of all probes is UP, you can navigate to the Graph tab and choose the Graph data representation. In the query field, type the name of a metric (for example, download_speed_bps or upload_speed_bps for speedtest) and press the Execute button:

In the window that opens, you can see your collected metrics.

All metrics are shown, but you can filter some out if needed. For example, you can show metrics collected only from one of the instances, e.g., download_speed_bps{instance="192.168.51.72:9516"}.

Besides pure visualisation of metrics, it is possible to apply math operations directly in Prometheus as well. For example, in order to show the bandwidth measured by iperf3, you need to take two metrics and combine them arithmetically:
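A minimal sketch of that arithmetic (the PromQL metric names in the comment are assumptions based on what the iperf3 exporter exposed in our lab; verify them in your own Graph tab): multiply the transferred bytes by 8 and divide by the test duration to get bits per second.

```shell
#!/bin/sh
# PromQL sketch (metric names assumed - check them in your Prometheus UI):
#   iperf3_sent_bytes * 8 / iperf3_sent_seconds
# The same arithmetic with sample values: 1250000000 bytes sent in 10 seconds.
awk 'BEGIN { printf "%d\n", 1250000000 * 8 / 10 }'
# Prints 1000000000, i.e. roughly 1 Gbit/s.
```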

Happy monitoring!

Integration of Grafana and Prometheus would be a topic for a separate blogpost.

Examples in GitHub

You can find this and other examples in our GitHub repository.

Lessons Learned

We have built a lot of monitoring using InfluxDB and Telegraf before, but we hadn’t used Prometheus until now. At a glance, one of the interesting benefits of Prometheus we have identified for ourselves is the centralisation of the control plane on the Prometheus side. It makes the operational expense of managing exporters really low, which is a very important factor in a high-scale environment.

Conclusion

As Prometheus is one of the two most popular open source frameworks for monitoring these days, network engineers should know how to use it. We hope the provided examples help you figure out how you can use it in your environment to improve the observability of your IT and network infrastructure and, therefore, improve customer experience as well. Take care and goodbye.

Need Help? Contact Us

If you need a trusted and experienced partner to automate your network and IT infrastructure, get in touch with us.

P.S.

If you have further questions or you need help with your networks, we are happy to assist you; just send us a message. Also, don’t forget to share the article on your social media if you like it.

BR,

Anton Karneliuk
