Hello my friend,
In the previous blogpost we started the discussion of an open-source load balancing solution that leverages HAProxy and FRR, connects to a data centre fabric running EVPN/VXLAN on Arista EOS switches, and serves content from NGINX-based origin servers. In that blogpost we covered the architectural guidelines and design principles. Today we cover the configuration and validation of the solution.
No part of this blogpost could be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical or photocopying, recording, or otherwise, for commercial purposes without the prior permission of the author.
What Is More Important: Network Technologies or Network Automation?
The truth is that both disciplines are equally important. Knowledge and skills in network technologies allow you to build great connectivity solutions that empower businesses across the globe, as nowadays perhaps 99% of all business operations rely on IT and network technologies either entirely or at least partially. In its turn, network automation makes operational activities (e.g., configuration, troubleshooting, analysis) much more precise and predictable, decreasing the probability of outages or other malfunctions as much as possible while reducing time-to-market for new services and enabling self-service capabilities for basic and complicated networking requests.
So, it is not network technologies or network automation. It is network technologies AND network automation. And that’s where we can help you.
We offer the following training programs for you:
- Zero-to-Hero Network Automation Training
- High-scale automation with Nornir
- Ansible Automation Orchestration with Ansible Tower / AWX
- Expert-level training: Closed-loop Automation and Next-generation Monitoring
During these trainings you will learn the following topics:
- Success and failure strategies to build the automation tools.
- Principles of software development and the most useful and convenient tools.
- Data encoding (free-text, XML, JSON, YAML, Protobuf).
- Model-driven network automation with YANG, NETCONF, RESTCONF, GNMI.
- Full configuration templating with Jinja2 based on the source of truth (NetBox).
- The best programming languages (Python, Bash) for developing automation.
- The most rock-solid and functional tools for configuration management (Ansible) and Python-based automation frameworks (Nornir).
- Network automation infrastructure (Linux, Linux networking, KVM, Docker).
- Orchestration of automation workflows with AWX and its integration with NetBox, GitHub, as well as custom execution environments for better scalability.
- Collecting network data via SNMP and streaming telemetry with Prometheus.
- Building API gateways with Python leveraging FastAPI.
- Integration of alerting with Slack and your own APIs
- … and many more
Moreover, we put all the mentioned technologies in the context of real use cases, which our team has solved and is solving in various projects in service provider, enterprise, and data centre networks and systems across Europe and the USA. That gives you the opportunity to ask questions, understand the solutions in depth, and discuss your own projects. On top of that, each technology is accompanied by online demos and labs so you can master your skills thoroughly. Such a mixture creates a unique learning environment, which all students value so much. Join us and unleash your potential.
Solution Implementation
The solution deployment consists of several steps, which you need to perform:
- Build of the data center fabric with EVPN/VXLAN on Arista EOS.
- Setup of origin servers with NGINX.
- Implementation of load balancing with HAProxy and FRR.
Step #1. Data Center Fabric
The end-to-end design was covered in the previous blogpost; therefore, in this one we focus on its implementation.
Step #1.1. eBGP Underlay on Arista EOS
First things first, you need to create the underlay network, which will be used as a basis for the future EVPN/VXLAN overlay services. We decided to go with an eBGP-based design following RFC 7938, which covers the design principles for large-scale data centres. For the purpose of this lab we have created a very simple topology, which is maybe 1/100 of the real topology we deployed using the same design:
Following the aforementioned RFC, we have:
- All border leafs share the same BGP AS number
- All spines per pod (4 spines) share the same BGP AS number
- Each leaf has its own AS number, as each of them hosts individual servers; the interconnects use /31 prefixes.
The eBGP peering is built using the interfaces’ IPv4 addresses, with BFD used to improve convergence. From the advertisement perspective, only the Loopback0 IP address with a /32 prefix is announced from each data center fabric switch. Here is the configuration snippet from c-3-b1 (the configuration on c-3-l1 and c-3-l2 is pretty much the same):
c-3-b1#show run interfaces loopback 0
interface Loopback0
ip address 10.0.255.100/32
!
c-3-b1#show run interfaces ethernet 2
interface Ethernet2
no switchport
ip address 10.0.0.1/31
!
c-3-b1#show run section router bgp
router bgp 65001
router-id 10.0.255.100
neighbor PG_SPINE peer group
neighbor PG_SPINE bfd
neighbor PG_SPINE password ***
neighbor PG_SPINE send-community extended
neighbor 10.0.0.0 peer group PG_SPINE
neighbor 10.0.0.0 remote-as 65002
!
address-family evpn
neighbor PG_SPINE activate
!
address-family ipv4
neighbor PG_SPINE activate
network 10.0.255.100/32
!
!
The full configuration is provided at the end of the blogpost.
We don’t specify the parameters of the BFD session, as they are platform specific (this snippet is taken from Arista vEOS switches, which have certain limitations in terms of how low the timers can be).
You may also notice that we enable EVPN from the beginning, as we will use it later. However, for EVPN to work correctly, a number of extended communities need to be exchanged. Therefore, we enable the exchange of extended communities on the fabric BGP sessions.
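If you want to double-check both building blocks before moving on, the following show commands on any fabric switch are a reasonable starting point (outputs are omitted here, as they vary per platform and software version):
c-3-b1#show bfd peers
c-3-b1#show bgp evpn summary
The first confirms that the BFD sessions towards the peers are up; the second confirms that the EVPN address family has been negotiated on the sessions.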
On the spines, the configuration is very similar to the leafs/borders with a small amendment: you need to disable the next-hop change for the EVPN address family to maintain the original VTEP IP as the endpoint for EVPN routes:
c-3-s1#show running-config section router bgp
router bgp 65002
router-id 10.0.255.101
neighbor PG_LEAF peer group
neighbor PG_LEAF bfd
neighbor PG_LEAF password ***
neighbor PG_LEAF send-community extended
neighbor 10.0.0.1 peer group PG_LEAF
neighbor 10.0.0.1 remote-as 65001
neighbor 10.0.0.3 peer group PG_LEAF
neighbor 10.0.0.3 remote-as 65003
neighbor 10.0.0.5 peer group PG_LEAF
neighbor 10.0.0.5 remote-as 65004
!
address-family evpn
neighbor PG_LEAF activate
neighbor PG_LEAF next-hop-unchanged
!
address-family ipv4
neighbor PG_LEAF activate
network 10.0.255.101/32
The underlay configuration is now complete. You can check the state of the BGP neighbors and the content of the routing table to ensure it works correctly:
c-3-b1#show bgp ipv4 unicast summary
BGP summary information for VRF default
Router identifier 10.0.255.100, local AS number 65001
Neighbor Status Codes: m - Under maintenance
Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
10.0.0.0 4 65002 428 423 0 0 05:53:10 Estab 3 3
c-3-b1#show ip route
VRF: default
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked,
RC - Route Cache Route
Gateway of last resort is not set
C 10.0.0.0/31 is directly connected, Ethernet2
C 10.0.255.100/32 is directly connected, Loopback0
B E 10.0.255.101/32 [200/0] via 10.0.0.0, Ethernet2
B E 10.0.255.102/32 [200/0] via 10.0.0.0, Ethernet2
B E 10.0.255.103/32 [200/0] via 10.0.0.0, Ethernet2
Step #1.2. EVPN/VXLAN Overlay on Arista EOS
The next step is to get the overlay up and running. For our use case we have designed a dedicated VRF, which hosts the application. This VRF is configured on the leafs and borders with the same Route Distinguisher (RD) and Route Targets (RT). It is also possible to use different RDs (one per leaf/border), but this is not needed in our scenario.
In our scenario we redistribute the interconnect /31 interfaces into BGP so that the data center fabric devices know at any point in time where the destination server is connected. The redistribution is controlled by a route-map, which relies on IP prefix-lists for prefix selection. There is an L3 VNI associated with this VRF, which is then mapped to the VTEP.
The following snippet is configured on c-3-b1 and covers all the mentioned configuration elements (c-3-l1/2 have the very same configuration):
c-3-b1#show run interfaces ethernet 1
interface Ethernet1
no switchport
vrf CUST_1
ip address 10.2.0.0/31
!
c-3-b1#show run interfaces vxlan 1
interface Vxlan1
vxlan source-interface Loopback0
vxlan udp-port 4789
vxlan vrf CUST_1 vni 10001
!
c-3-b1#show run section ip prefix-list
ip prefix-list PL_IPV4_CUSTOMER_PREFIXES seq 10 deny 10.0.0.0/16 le 32
ip prefix-list PL_IPV4_CUSTOMER_PREFIXES seq 20 permit 10.0.0.0/8 le 32
!
c-3-b1#show run section route-map
route-map RP_IPV4_CUSTOMER_PREFIXES permit 10
match ip address prefix-list PL_IPV4_CUSTOMER_PREFIXES
route-map RP_IPV4_CUSTOMER_PREFIXES deny 9999
!
c-3-b1#show run section router bgp
router bgp 65001
vrf CUST_1
rd 65000:1
route-target import evpn 65000:1
route-target export 65000:1
route-target export evpn 65000:1
redistribute connected route-map RP_IPV4_CUSTOMER_PREFIXES
!
Spine c-3-s1 doesn’t have a VTEP, as it neither terminates nor originates the customer’s traffic. Hence, there is no extra BGP configuration needed on it either.
If everything was configured correctly, you should see the interconnect subnets correctly distributed across the leafs and border and installed in the routing table inside the VRF:
c-3-b1#show ip route vrf CUST_1
VRF: CUST_1
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked,
RC - Route Cache Route
Gateway of last resort is not set
B E 10.1.0.0/31 [200/0] via VTEP 10.0.255.102 VNI 10001 router-mac 4a:be:91:df:4f:91
B E 10.1.0.2/31 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
B E 10.1.1.0/31 [200/0] via VTEP 10.0.255.102 VNI 10001 router-mac 4a:be:91:df:4f:91
B E 10.1.1.2/31 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
C 10.2.0.0/31 is directly connected, Ethernet1
The only piece not yet covered is the BGP peering between the load balancer and the leafs. It will be covered in the load balancer part later in this blogpost. Besides that, the multi-tenant EVPN/VXLAN data center fabric based on Arista EOS is ready to serve the traffic.
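Before moving on to the servers, you can optionally double-check the overlay state on a leaf or border switch. A couple of Arista EOS commands we find handy for this (outputs omitted for brevity):
c-3-b1#show vxlan vtep
c-3-b1#show bgp evpn route-type ip-prefix ipv4
The first lists the remote VTEPs learned via EVPN; the second lists the Type-5 (IP prefix) routes carrying the redistributed interconnect subnets.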
Step #2. Origin Web Servers
The next step is to bring up the origin web servers, which will actually serve the clients’ requests.
Step #2.1. Containerization Engine with Docker
The web server, as well as the load balancer, are in our opinion great services to containerize. There are a number of operational benefits to doing so, including the following:
- Removing dependencies between the application’s runtime and infrastructure host.
- Simplifying application packaging and versioning.
- Improving user experience in the application management.
For the containerization engine we chose Docker, as it is one of the most stable and mature products for such a purpose on the market.
If you are not familiar with Docker but strive to learn more, enroll in our zero-to-hero network automation training, where you will learn Docker in detail and even create your own containerized applications.
All our servers, which we use both for load balancers and for origin web servers, run Ubuntu Linux 22.04. As such, we follow the official guide on how to install Docker Engine on Ubuntu Linux:
$ sudo apt-get update -y
$ sudo apt-get install \
ca-certificates \
curl \
gnupg \
lsb-release -y
$ sudo mkdir -p /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update -y
$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin -y
If everything is done correctly, you should be able to see the Docker service up and running:
$ sudo docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
compose: Docker Compose (Docker Inc., v2.6.0)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 20.10.17
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
runc version: v1.1.2-0-ga916309
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: default
cgroupns
Kernel Version: 5.15.0-46-generic
Operating System: Ubuntu 22.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.832GiB
Name: c-3-o1
ID: FTU2:ZPIC:ZATK:I5VE:CYIU:HZUW:X4VV:EDPU:3A7N:OF2T:SR65:3JLA
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
In the latest Docker version we use, Docker Compose is installed automatically as a plugin to the Docker CLI and is executed as “sudo docker compose” instead of the old syntax “sudo docker-compose“.
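If you want to confirm which syntax your installation supports, a quick check (assuming the plugin-based install described above) is:
$ sudo docker compose version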
Step #2.2. Origin Web Server with NGINX
For the application that acts as an origin web server, we are using NGINX, per the details shared in the previous blogpost. To get it up and running, we need just three files:
- A static HTML file, which will be served as the user content.
- A configuration file for NGINX itself.
- A docker-compose file to describe the containerized service.
The directory structure looks as follows; it contains the aforementioned three files distributed across two sub-directories:
+--nginx
+--config
| +--karneliuk_service.conf
+--content
| +--index.html
+--docker-compose.yaml
The NGINX service configuration file is a really simple one:
$ cat config/karneliuk_service.conf
server {
listen 80;
listen [::]:80;
server_name localhost;
location / {
root /tmp/content;
index index.html index.htm;
}
}
As you can see, we instruct the NGINX service to listen on port 80 and to serve the file index.html or index.htm (the former is attempted first, then the latter) from the directory /tmp/content. Don’t worry that we serve unencrypted content here: our load balancer terminates the SSL session from the customers. This is a common deployment design for online applications, as it spares compute resources on the origin web servers, since encryption/decryption is done outside of them.
You may think, “How does NGINX know that it shall pick up the config from the karneliuk_service.conf file, as its name looks very specific?” We’ll get back to this question shortly. Let’s take a look at the two other files we use on our origin server:
$ cat nginx/content/index.html
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
As you can see, this is a static HTML file, which is not altered while being served to a customer.
Finally, the docker-compose.yaml:
$ cat nginx/docker-compose.yaml
---
version: "3.9"
services:
web:
image: nginx
restart: always
ports:
- 8080:80
volumes:
- ./config:/etc/nginx/conf.d
- ./content:/tmp/content
...
If you are familiar with Docker, you can see that the container is supposed to behave like this:
- It runs the official nginx image.
- It maps external port 8080 to internal port 80. This means that traffic arrives at the origin server with destination port 8080/tcp and is translated to port 80/tcp inside the container.
- The container shall be up all the time unless we manually bring it down.
- Two folders from the local server are mapped to the corresponding directories inside the container.
Join our Zero-to-Hero Network Automation Training to learn more about Docker and master the skills of building your own containerized applications.
Coming back to the question we asked above, “How does NGINX know about the config file?” As you could digest from the docker-compose.yaml, we map the directory with our configuration file from the Ubuntu host to the directory /etc/nginx/conf.d inside the container. By default NGINX looks for config files there, and as long as they follow the correct syntax, it will try to use them. That’s the answer.
Once the aforementioned files are created and put in the corresponding folders’ structure, we can launch the container with the origin server:
$ cd nginx
(nginx)$ sudo docker compose up -d
Wait until the download of the image is completed and the container is started. Once that is done, you shall see it up and running:
$ sudo docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
39dde6252478 nginx "/docker-entrypoint.…" 13 days ago Up 5 days 0.0.0.0:8080->80/tcp, :::8080->80/tcp nginx-web-1
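Before querying the website, you can optionally confirm the port mapping and ask NGINX to validate the configuration it picked up. Both commands are run on the origin server; the container name nginx-web-1 is taken from the output above (outputs omitted):
$ sudo docker container port nginx-web-1
$ sudo docker container exec nginx-web-1 nginx -t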
The final validation step is to check whether we can actually query the website. For this purpose we use curl, a tool available on almost all Linux distributions by default (or you can install it). We run this request directly from the server, which has NGINX up and running:
$ curl -X GET http://127.0.0.1:8080
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
The origin web service is up and running correctly, so we can move on to the load balancer part.
Step #3. Open Source Load Balancer
The load balancer is a key component in this solution and, in general, in delivering applications to customers in a reliable fashion. It consists of two main parts:
- Routing, which in this case is performed by FRR.
- Load balancing, which in this case is performed by HAProxy.
Step #3.1. Containerization Engine with Docker
Repeat step 2.1 on the Linux servers which will host HAProxy and FRR, to have Docker up and running.
Step #3.2. Routing Engine with FRR
There are two ways to implement FRR on the server, depending on how you want it to act:
- Install it on the underlying Linux host as a daemon. In this case FRR is able to alter the routing table of the Linux host. In multiple cases this is the desired outcome, especially when your Linux host is connected to multiple networks and indeed performs routing.
- Install it as a container. In this case FRR does not alter the routing table of the Linux host; however, it is still able to announce prefixes via routing protocols (e.g., BGP) so that they are available to the rest of the network.
If you analyze our topology carefully, you will see that the load balancer is in fact one-armed: it is connected to the network with just one interface, over a single interconnect subnet, which is used to reach the default gateway. In such a case we don’t really need to modify the routing table of the underlying Linux host, and therefore we decided to run FRR containerized.
To do that, we need to create the corresponding directory and place the docker-compose.yaml file there:
+--frr
+--docker-compose.yaml
The docker-compose.yaml looks as follows:
$ cat frr/docker-compose.yaml
---
version: "3.9"
services:
frr:
image: frrouting/frr
network_mode: host
restart: always
volumes:
- frr_config:/etc/frr
privileged: true
volumes:
frr_config:
driver: local
...
In order to store the configuration persistently, we create a local named volume, which stores the configuration files of the FRR application and its routing daemons.
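If you want to see where Docker keeps this volume on the host (we will need that path shortly to enable the daemons), you can inspect it. The volume name frr_frr_config is derived from the project directory name and the volume name in the compose file:
$ sudo docker volume ls
$ sudo docker volume inspect frr_frr_config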
Start the container:
$ cd frr
$ sudo docker compose up -d
Check that it is successfully up and running:
$ sudo docker container ls
[sudo] password for aaa:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
913377c8d65e frrouting/frr "/sbin/tini -- /usr/…" 13 days ago Up 5 days frr-frr-1
The configuration of FRR is split into two major parts:
- The daemons, which need to be enabled for certain functionality to work.
- The configuration applied to the corresponding daemons.
In our case we need two daemons: BGP and BFD. BGP is needed to announce the VIP address to the network, and BFD is needed to quickly bring down the BGP session and, as a result, remove the VIP from the data center fabric.
We enable the daemons by modifying the configuration files inside the persistent docker storage:
$ sudo sed -i 's/bgpd=no/bgpd=yes/' /var/lib/docker/volumes/frr_frr_config/_data/daemons
$ sudo sed -i 's/bfdd=no/bfdd=yes/' /var/lib/docker/volumes/frr_frr_config/_data/daemons
One could say that this is an anti-pattern for containerized applications; however, it does the job perfectly in our case.
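If you prefer not to touch files under /var/lib/docker directly, a cleaner (though functionally equivalent) alternative is to bind-mount a local directory that you pre-populate with the daemons and frr.conf files. A minimal sketch of such a compose file, assuming a ./config directory exists next to it, could look like this:
---
version: "3.9"
services:
  frr:
    image: frrouting/frr
    network_mode: host
    restart: always
    privileged: true
    volumes:
      # bind-mount a pre-populated local directory instead of a named volume
      - ./config:/etc/frr
...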
Once the daemons are enabled, you need to restart the Docker container:
$ sudo docker compose restart
Finally, you can configure the FRR application:
$ sudo docker container exec -it frr-frr-1 vtysh
!
! Config is done inside FRR application inside the container
!
configure terminal
ip prefix-list PL_IPV4_VIP seq 10 permit 10.1.0.254/32
!
route-map RP_IPV4_ADVERTISE_VIP permit 10
match ip address prefix-list PL_IPV4_VIP
!
route-map RP_IPV4_ADVERTISE_VIP deny 9999
!
ip prefix-list PL_IPV4_DEF_ROUTE seq 10 permit 0.0.0.0/0
!
route-map RP_IPV4_RECIEVE_DEF permit 10
match ip address prefix-list PL_IPV4_DEF_ROUTE
!
route-map RP_IPV4_RECIEVE_DEF deny 9999
!
router bgp 65100
bgp router-id 10.1.0.1
bgp log-neighbor-changes
!
neighbor 10.1.0.0 remote-as 65003
neighbor 10.1.0.0 bfd
!
address-family ipv4 unicast
network 10.1.0.254/32
!
neighbor 10.1.0.0 activate
neighbor 10.1.0.0 route-map RP_IPV4_ADVERTISE_VIP out
neighbor 10.1.0.0 route-map RP_IPV4_RECIEVE_DEF in
!
!
Once the configuration is done, don’t forget to save the config:
$ sudo docker container exec -it frr-frr-1 vtysh -c 'copy run start'
As you can see, the configuration of FRR is very similar to what you would type on a traditional network device, following what is these days called an “industry CLI“.
On the second load balancer, we would add AS_PATH prepending to the route-map which controls the announcements towards the fabric, so that we control which load balancer the traffic is sent to. Such an approach gives us per-VIP granularity, so that we can use one LB as the primary for one VIP and the other LB for another VIP:
! CONFIGURATION IS TRUNCATED FOR BREVITY
!
route-map RP_IPV4_ADVERTISE_VIP permit 10
match ip address prefix-list PL_IPV4_VIP
set as-path prepend 65100 65100 65100 65100 65100
!
!
! CONFIGURATION IS TRUNCATED FOR BREVITY
The rest of the configuration is identical.
Once you have created the routing configuration on the load balancer side, you should create the corresponding configuration on the data center leafs as well. Here is the example for c-3-l1 (for c-3-l2 it is the same, just with a different IP address):
enable
!
configure terminal
router bgp 65003
vrf CUST_1
neighbor 10.1.0.1 remote-as 65100
!
address-family ipv4
neighbor 10.1.0.1 activate
neighbor 10.1.0.1 default-originate always
!
!
!
!
You may need to disable checksum offload (“sudo ethtool -K ens19 tx off rx off”) on the load balancers, where ens19 is the name of the interface towards the DC fabric.
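Note that ethtool settings do not survive a reboot. One possible way to persist the change is a small systemd oneshot unit (a sketch; the interface name ens19 and the ethtool path may differ on your system):
# /etc/systemd/system/disable-offload.service
[Unit]
Description=Disable checksum offload on ens19
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/ethtool -K ens19 tx off rx off

[Install]
WantedBy=multi-user.target
It can then be activated with “sudo systemctl daemon-reload && sudo systemctl enable --now disable-offload.service”.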
If everything has been done correctly, you should be able to see the BGP session up:
c-3-l1#show bgp ipv4 unicast summary vrf CUST_1
BGP summary information for VRF CUST_1
Router identifier 10.1.1.0, local AS number 65003
Neighbor Status Codes: m - Under maintenance
Neighbor V AS MsgRcvd MsgSent InQ OutQ Up/Down State PfxRcd PfxAcc
10.1.0.1 4 65100 174257 204606 0 0 5d23h Estab 1 1
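The same session can also be verified from the load balancer side by querying FRR through vtysh (outputs omitted; you should see the leaf as an established neighbor and the default route received):
$ sudo docker container exec -it frr-frr-1 vtysh -c 'show bgp ipv4 unicast summary'
$ sudo docker container exec -it frr-frr-1 vtysh -c 'show ip bgp'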
Step #3.3. Load Balancing with HAProxy
The final piece is the configuration of HAProxy. As with FRR, we will run it containerized. To do that, we create the following hierarchy of directories:
+--haproxy
+--config
| +--haproxy.cfg
+--ssl
| +--key.pem
| +--cert.pem
| +--joined.pem
+--docker-compose.yaml
First of all, let’s generate the self-signed certificate, which we will use to terminate the customers’ connections:
$ mkdir -p haproxy/config haproxy/ssl
$ openssl req -x509 -newkey rsa:4096 -keyout haproxy/ssl/key.pem -out haproxy/ssl/cert.pem -sha256 -days 365 -nodes -subj '/CN=test-website.lab.karneliuk.com'
HAProxy requires the key and the certificate to be in the same file; as such, we simply merge them:
1 $ cat haproxy/ssl/cert.pem haproxy/ssl/key.pem > haproxy/ssl/joined.pem
The joined file is the one you will attach to the frontend configuration of the load balancer.
The next part is the configuration of the load balancer itself:
$ cat haproxy/config/haproxy.cfg
global
maxconn 4096
log /dev/log local0
defaults
timeout connect 5s
timeout client 30s
timeout server 30s
retries 3
log global
mode http
option httplog
maxconn 3000
frontend webservers_vip
bind 10.1.0.254:8080
bind 10.1.0.254:8443 ssl crt /etc/ssl/joined.pem
http-request redirect location https://10.1.0.254:8443 unless { ssl_fc }
default_backend webservers_origin
peers sticktables
bind :10000
server c-3-lb1
server c-3-lb2 10.1.0.3:10000
table sticky-sessions type ip size 1m
backend webservers_origin
balance roundrobin
stick match src table sticktables/sticky-sessions
stick store-request src table sticktables/sticky-sessions
server c-3-o1 10.1.1.1:8080 check
server c-3-o2 10.1.1.3:8080 check
The configuration of the load balancer consists of several parts, but at a high level the following three are of particular interest (a quick way to sanity-check the file is sketched after this list):
- frontend contains the VIP, the ports it listens on, the mapping to the SSL certificate, and the redirection rules from HTTP to HTTPS. You also specify here which backend the traffic is sent to.
- backend contains the definitions of the servers that receive the traffic, as well as the load balancing mechanism and, in our case, the stickiness configuration.
- peers specifies how to reach the other load balancers in the multi-active cluster to synchronise the state of the sessions. This is needed for the failover scenario, to ensure that the standby load balancer sends a customer’s session to exactly the same backend node (origin web server) as the primary one.
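Before starting the service, it can be handy to let HAProxy itself validate the configuration file. A sketch of such a check, run from inside the haproxy directory and using the same official image and the same mounts as in the compose file below (the -c flag only parses the configuration, it does not start the proxy):
$ sudo docker run --rm \
    -v "$(pwd)/config/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro" \
    -v "$(pwd)/ssl:/etc/ssl:ro" \
    haproxy haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg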
Finally, the docker-compose file:
$ cat haproxy/docker-compose.yaml
---
version: "3.9"
services:
haproxy:
image: haproxy
restart: always
network_mode: host
volumes:
- ./config/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
- ./ssl:/etc/ssl
...
Join our Zero-to-Hero Network Automation Training to master Docker skills
Launch the docker container:
$ sudo docker compose up -d
And check that it is up and running:
aaa@c-3-lb1:~$ sudo docker container ls
[sudo] password for aaa:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
141815a9029a haproxy "docker-entrypoint.s…" 3 hours ago Up 2 hours haproxy-haproxy-1
Congratulations! You have made it to the end of the configuration and can now check how the whole solution works.
Validation
The ultimate success criterion for the whole solution is to check whether you can reach the origin server from the host c-3-e1 via the load balancer. In other words, we are checking that:
- the load balancer is reachable over the network
- the origin web server is reachable via the network
- the load balancer terminates SSL sessions
- the load balancer sends the customer request to the origin web server
Step #1. Normal operation
The validation can be very simple: just hit curl towards the VIP and port (use the -k flag to skip validation of the SSL certificate, which would fail for a self-signed certificate anyway):
$ curl -X GET https://10.1.0.254:8443 -k
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
You can also check if the redirect is working:
$ curl -X GET http://10.1.0.254:8080 -k -L
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
As you can see, you also need to add the “-L” flag, as by default curl doesn’t follow redirects.
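If you want to see the redirect itself rather than silently follow it, you can ask curl for the response headers only; the Location header should point to the HTTPS VIP (output omitted):
$ curl -I http://10.1.0.254:8080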
To see that the load balancer is actually working in sticky mode, we will make the following amendment: in the static HTML file we created before, we add some information about the origin web server, so that in the response we can see where it comes from:
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p><strong>origin server 1</strong></p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
We make the same amendment on the second server. We don’t need to restart the Docker container with NGINX, as the files are stored on the Linux host and only mounted into the container.
So if we make multiple requests now, you will see that all of them land on the same server:
c-3-e1$ curl -X GET https://10.1.0.254:8443 -k -L
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>origin server 2</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
c-3-e1$ curl -X GET https://10.1.0.254:8443 -k -L
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>origin server 2</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
c-3-e1$ curl -X GET https://10.1.0.254:8443 -k -L
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>origin server 2</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
This confirms that the stickiness is working. Otherwise, as the load balancing mechanism is round-robin, the requests would be spread across both origin servers.
Step #2. Failover
Before testing failover, let’s check the state of the routing table inside the VRF at our border leaf:
c-3-b1#show ip route vrf CUST_1
VRF: CUST_1
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked,
RC - Route Cache Route
Gateway of last resort is not set
B E 10.1.0.0/31 [200/0] via VTEP 10.0.255.102 VNI 10001 router-mac 4a:be:91:df:4f:91
B E 10.1.0.2/31 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
B E 10.1.0.254/32 [200/0] via VTEP 10.0.255.102 VNI 10001 router-mac 4a:be:91:df:4f:91
B E 10.1.1.0/31 [200/0] via VTEP 10.0.255.102 VNI 10001 router-mac 4a:be:91:df:4f:91
B E 10.1.1.2/31 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
C 10.2.0.0/31 is directly connected, Ethernet1
You can see that the VIP is reachable via next-hop 10.0.255.102, which is the loopback IP used as the VTEP at c-3-l1. Now, let’s disable the port from c-3-l1 towards the load balancer, which will bring down the BGP session and cause re-routing. Thanks to BFD, it will be quick:
c-3-l1#conf t
c-3-l1(config)#int eth2
c-3-l1(config-if-Et2)#shut
c-3-l1(config-if-Et2)#Sep 4 12:07:07 c-3-l1 Bgp: %BGP-3-NOTIFICATION: sent to neighbor 10.1.0.1 (VRF CUST_1 AS 65100) 6/6 (Cease/other configuration change) 0 bytes
As soon as this happens, you should see the backup route to the VIP:
c-3-b1#show ip route vrf CUST_1
VRF: CUST_1
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked,
RC - Route Cache Route
Gateway of last resort is not set
B E 10.1.0.2/31 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
B E 10.1.0.254/32 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
B E 10.1.1.0/31 [200/0] via VTEP 10.0.255.102 VNI 10001 router-mac 4a:be:91:df:4f:91
B E 10.1.1.2/31 [200/0] via VTEP 10.0.255.103 VNI 10001 router-mac 12:c5:a5:ed:70:ab
C 10.2.0.0/31 is directly connected, Ethernet1
The route now points towards the VTEP at c-3-l2.
Now repeat the request from c-3-e1:
$ curl -X GET https://10.1.0.254:8443 -k -L
<html>
<head>
<title>Best Automation Trainings at Karneliuk.com</title>
</head>
<body>
<h1>Test worker</h1>
<p>Enroll to <a href="http://training.karneliuk.com/">Network Automation Trainings</a>.</p>
<p>origin server 2</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
You can see that the request still lands on the same origin server.
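Once the failover behaviour is confirmed, don’t forget to bring the interface on c-3-l1 back up, so that the fabric returns to its normal state (the VIP will move back to the primary load balancer once the BGP session re-establishes):
c-3-l1#configure terminal
c-3-l1(config)#interface ethernet 2
c-3-l1(config-if-Et2)#no shutdown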
GitHub repository
You can find all the configuration files in our GitHub repo.
Lessons Learned
The biggest learning curve for us was HAProxy itself. Despite being a very popular and powerful tool, its documentation is in many cases relatively complicated. As such, we had to go through many different sources to piece together all the descriptions and build a working solution. However, as usual, the harder you study, the easier it is in production.
Summary
The blogpost turned out to be somewhat lengthy and contains a lot of configuration from different areas. At the same time, it shows the real complexity of application delivery networks, without even touching TCP optimisation or caching. The important point here is that you can achieve a really good, stable, and feature-rich network solution leveraging solely open-source components. We truly believe that the HAProxy + FRR bundle is a very strong one, as it allows you to achieve load balancing without the need to stretch nasty L2 segments even within the data centre. Reach out to us if you have any questions about the solution. Take care and goodbye.
Need Help? Contact Us
If you need a trusted and experienced partner to automate your network and IT infrastructure, get in touch with us.
P.S.
If you have further questions or need help with your networks, we are happy to assist you: just send us a message. Also, don’t forget to share the article on your social media if you like it.
BR,
Anton Karneliuk