Hello my friend,

So far we have discussed EVPN with BGP signaling and a VXLAN data plane in different scenarios: pure switching, inter-VXLAN routing and even BGP peering from a VM to a leaf. But all the leafs were Nokia (Alcatel-Lucent) SR OS, as Cisco IOS XRv doesn’t support VXLAN. In this article we’ll replace one leaf with Cumulus Linux VX, so let’s see how we achieve interoperability now.

Disclaimer

I’m not going to (and I can’t) thoroughly explain the Cumulus Linux structure and operation: being Linux, it has lots of different options to configure. So I just provide my way of doing things.

Brief description

My friend @Nicola Arnoldi has shown me Cumulus Linux, which is a network operating system for white box switches. In a nutshell, white box means that the hardware can be managed by different operating systems, not only the one coming from the switch manufacturer itself. For instance, Dell, Mellanox and several other vendors produce such white box devices.

But Cumulus Linux, and that’s its great advantage, can run not only on a white box switch, but also as a KVM or VMWare VM, which is good for automation development or for learning. As its name suggests, it’s Linux with additional modules installed for routing, switching etc. Cumulus Linux can be configured either using the CLI (called Network Command Line Utility (NCLU)) or by managing configuration files and daemons. The former is easier if you configure devices manually, whereas the latter is the most suitable for automation.

I personally find this very promising, as I think it’s the only way the industry can and should develop further: software and hardware coming from different vendors. Such an approach is called disaggregation, and even the giants are slowly stepping into this world. Why do I think it’s good? Very simple: both HW and SW vendors must develop their products according to standards (RFC, IEEE, OCP, SAI, etc). This will assure a high degree of interoperability even across different SW vendors (much higher than now), which makes network deployment much easier and is key for automation and programmability. The competition between SW vendors will be tougher, so they will have to develop useful features quicker, as they understand that they can be easily replaced by another SW vendor in case of problems. The same is relevant for HW vendors, as you will need to change only the switches, without touching the logical topology. We, as customers, are winners in any case.

What are we going to test?

We’ll redo the configuration activities we have done so far with pure Nokia (Alcatel-Lucent) SR OS switches, in terms of pure switching and inter-VXLAN routing. Due to the different implementations of BGP EVPN (RFC 7432) by Nokia and Cumulus Linux, not all things interoperate smoothly. But we’ll talk about it later, together with the workaround.

Software version

A newer version of Nokia VSR is used compared to the previous labs!

The following infrastructure is used in my lab:

  • CentOS 7 with python 2.7.
  • Ansible 2.4.2
  • Nokia (Alcatel-Lucent) SR OS 15.0.R7
  • Cumulus Linux VX 3.5.2

See the previous article for details on how to build the lab.

Topology

Generally, the topology looks the same as previously:

The main difference is that instead of Nokia (Alcatel-Lucent) VSR (SR2) we have Cumulus Linux VX (VX2). Cisco IOS XRv routers XR3 and XR4 are the same, as they participate only as spine switches to form the BGP fabric and to emulate client VMs.

The logical topology is absolutely the same:

To give you a sense of how the configurations compare, both Nokia (Alcatel-Lucent) SR OS and Cumulus Linux configuration will be shown. The spine function of Cisco IOS XR is preconfigured (you will see it in the initial configuration files). The client function of Cisco IOS XR will be shown in the respective chapters:

It isn’t a mistake: no initial configuration file for Cumulus Linux is provided, as we haven’t deployed it yet. We’ll do it together in the next chapter.

Deployment of VM with Cumulus Linux VX and its initial configuration

The first point on our agenda is to get Cumulus Linux VX up and running. To do so we need to download it from the official website, which requires registration. After you log in to the site, go to the respective link:

Then you need to choose the product (there is only one option though):

After that you get the page where you need to choose the particular hypervisor. It’s up to you, but in this article I’ll describe the usage of a VMWare-based VM:

So far we have downloaded the corresponding OVA image. Now it’s time to deploy it, which is quite easy in VMWare Player. You need to open the downloaded OVA file and choose the name for the new VM and its folder:

After the import is done, you need to modify some basic parameters, like the assignment of interfaces to virtual networks. For this lab only the first two are needed (one for OOB management and one for the data plane), so pay attention to their assignment. The first interface is the management one. We assign it to VMnet8, which is a built-in network (called NAT) and has a DHCP server. So we will be able to reach the CLI of the newly deployed Cumulus Linux VX using a terminal client (I use Putty). All the remaining interfaces are data plane interfaces. As I said, we’ll use only the one attached to VMnet4, where we have the data plane interfaces of the Cisco IOS XRv routers and CentOS with Nokia (Alcatel-Lucent) VSR.

The rest of the interfaces are just assigned according to my further needs and aren’t covered in this article.

You might notice that Cumulus Linux VX requires quite few resources, just 1 GB RAM. In my other tests, I was able to run 4x Cumulus Linux VX and 2x Cisco IOS XRv routers simultaneously as VMWare VMs on my laptop with Windows 10 and 8 GB RAM.

Having done that, we launch our newly created VM and get into the CLI:

The default login is “cumulus” and the default password is “CumulusLinux!”. Though we can already connect using SSH from our terminal client, we don’t know the IP address assigned to the management interface, so we need to find it out:

.cumulus@cumulus:~$ ifconfig eth0
.eth0      Link encap:Ethernet  HWaddr 00:50:56:35:96:e3
.          inet addr:192.168.44.142  Bcast:192.168.44.255  Mask:255.255.255.0
.          inet6 addr: fe80::250:56ff:fe35:96e3/64 Scope:Link
.          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
.          RX packets:147 errors:0 dropped:0 overruns:0 frame:0
.          TX packets:150 errors:0 dropped:0 overruns:0 carrier:0
.          collisions:0 txqueuelen:1000
.          RX bytes:14957 (14.6 KiB)  TX bytes:20506 (20.0 KiB)

Now we can connect using SSH from our terminal client and proceed with initial configuration:

.cumulus@cumulus:~$ net add vrf mgmt
.
.*********************************************************************************
.NOTE: Enabling or disabling Management VRF will cause all SSH sessions to
.disconnect on the next 'net commit'. This only happens the first time you do
.a 'net commit' after enabling or disabling Management VRF.
.
.Enabling or disabling Management VRF may interfere with other previously
.configured services which may previously have been using the management interface
.for communication including: NTP, DNS, API, CLAG Backup IP, SNMP. See
.'net example management-vrf' for more practical examples on using Management VRF
.with NCLU.
.*********************************************************************************
.
.cumulus@cumulus:~$ net add hostname VX2
.
.cumulus@cumulus:~$ net add time zone Europe/Berlin

The logic behind the commands is quite easy:

  • We always start with the keyword “net”, so Linux knows that we refer to NCLU
  • If we want to add something to the configuration, we put “add”
  • If we want to remove something that was configured, we put “del”

Then just use the inline help by pressing Tab.

You can review some examples directly in the CLI by issuing “net example xxx”, where xxx is the name of the example. Use Tab for inline help there as well.
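
The add/del symmetry can also be used from scripts. Below is a minimal Python sketch, which is not part of NCLU itself: the helper name rollback_commands is hypothetical, and it assumes that every “net add X” can be undone by the mirrored “net del X”, which you should verify for each command on a real switch.

```python
def rollback_commands(commands):
    """Return NCLU commands that undo the given 'net add ...' commands.

    Assumption: each 'net add X' has a mirrored 'net del X'. This is an
    illustration only; check every command on a real device.
    """
    undo = []
    # Undo in reverse order, so that dependencies are removed last.
    for command in reversed(commands):
        words = command.split()
        if words[:2] != ["net", "add"]:
            raise ValueError("not a 'net add' command: {!r}".format(command))
        undo.append(" ".join(["net", "del"] + words[2:]))
    return undo


staged = [
    "net add hostname VX2",
    "net add time zone Europe/Berlin",
]
for line in rollback_commands(staged):
    print(line)
```

The sketch only builds strings; nothing is sent to a device.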

The good thing about Cumulus Linux VX configuration using the CLI (NCLU) is that it has a 2-stage commit process, much like Cisco IOS XR or Nokia (Alcatel-Lucent) SR OS in candidate mode. So we have a chance to review our changes before final approval, which we do as follows:

.cumulus@cumulus:~$ net pending
.--- /etc/network/interfaces     2018-02-25 19:39:50.000000000 +0000
.+++ /var/run/nclu/iface/interfaces.tmp  2018-04-15 13:09:07.852482769 +0000
.@@ -3,10 +3,16 @@
.
. source /etc/network/interfaces.d/*.intf
.
. # The loopback network interface
. auto lo
. iface lo inet loopback
.
. # The primary network interface
. auto eth0
. iface eth0 inet dhcp
.+    vrf mgmt
.+
.+auto mgmt
.+iface mgmt
.+    address 127.0.0.1/8
.+    vrf-table auto
.--- /etc/hosts  2018-02-25 19:39:50.262724416 +0000
.+++ /var/run/nclu/netmisc/etc_hosts     2018-04-15 13:05:33.561313372 +0000
.@@ -1,6 +1,6 @@
. 127.0.0.1      localhost
. ::1            localhost ip6-localhost ip6-loopback
. ff02::1                ip6-allnodes
. ff02::2                ip6-allrouters
.
.-127.0.1.1      cumulus
.+127.0.1.1      VX2
.--- /etc/dhcp/dhclient-exit-hooks.d/dhcp-sethostname    2018-02-23 03:03:41.000000000 +0000
.+++ /var/run/nclu/netmisc/etc_dhcp_sethostname  2018-04-15 13:05:33.561313372 +0000
.@@ -1,11 +1,11 @@
. # This script sets the machine hostname to the hostname sent from the DHCP server.
. # If you want to enable this script, change SETHOSTNAME to "yes"
. # Copyright 2013, 2015, 2017, Cumulus Networks, Inc.  All rights reserved.
.
.-SETHOSTNAME="yes"
.+SETHOSTNAME="no"
.
. if [ $SETHOSTNAME = "yes" ] && [ ! -z $new_host_name ]
. then
.     hostname $new_host_name
.     sed --in-place -e "/127\.0\.1\.1/s/^.*$/127.0.1.1  $new_host_name/" /etc/hosts
. fi
.--- /etc/hostname       2018-02-25 19:39:50.262724416 +0000
.+++ /var/run/nclu/netmisc/etc_hostname  2018-04-15 13:05:33.556313372 +0000
.@@ -1 +1 @@
.-cumulus
.+VX2
.--- /etc/timezone       2018-02-25 19:25:00.089196021 +0000
.+++ /var/run/nclu/time/etc_timezone_scratchpad  2018-04-15 13:06:21.888312713 +0000
.@@ -1 +1 @@
.-Etc/UTC
.+Europe/Berlin
.
.
.
.net add/del commands since the last 'net commit'
.================================================
.
.
.User     Timestamp                   Command
.-------  --------------------------  -------------------------------
.cumulus  2018-04-15 13:05:18.800539  net add vrf mgmt
.cumulus  2018-04-15 13:05:33.563453  net add hostname VX2
.cumulus  2018-04-15 13:06:21.888925  net add time zone Europe/Berlin

The output is quite long, but informative. In its first part we see what exactly is being changed and in which file (remember, it’s Linux), and in the end our commands are shown again.
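
If you review such output often, it might be handy to extract just the list of files to be changed. Here is a minimal Python sketch, assuming the output keeps the unified diff format shown above; the function name changed_files is my own and the sample is shortened from the real output.

```python
import re


def changed_files(pending_output):
    """List the files that a pending NCLU change would modify."""
    files = []
    for line in pending_output.splitlines():
        # In a unified diff the original file is announced by '--- <path>'.
        match = re.match(r"^--- (/\S+)", line)
        if match:
            files.append(match.group(1))
    return files


sample = """\
--- /etc/network/interfaces     2018-02-25 19:39:50.000000000 +0000
+++ /var/run/nclu/iface/interfaces.tmp  2018-04-15 13:09:07.852482769 +0000
@@ -3,10 +3,16 @@
 auto eth0
 iface eth0 inet dhcp
+    vrf mgmt
--- /etc/hostname       2018-02-25 19:39:50.262724416 +0000
+++ /var/run/nclu/netmisc/etc_hostname  2018-04-15 13:05:33.556313372 +0000
@@ -1 +1 @@
-cumulus
+VX2
"""

print(changed_files(sample))
```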

After the review, we commit the changes:

.cumulus@cumulus:~$ net commit

As you might have noticed previously, we are disconnected from SSH, as we have created the management VRF and the management port (eth0) is bound to this VRF automatically when we create it.

So we reconnect via SSH and now see that the hostname as well as the context has changed:

.cumulus@VX2:mgmt-vrf:~$

The last thing we need to do in this chapter is to create interfaces and check the reachability of other nodes in our lab. As we have used VLANs extensively, we’ll continue to do so in the current lab as well, so we need to create the corresponding configuration. We also create a loopback interface that will be used as the VXLAN VTEP and as the source of the BGP session for EVPN:

.cumulus@VX2:mgmt-vrf:~$ net add interface swp1 bridge vids 23,24
.cumulus@VX2:mgmt-vrf:~$ net add vlan 23 ip address 10.22.33.22/24
.cumulus@VX2:mgmt-vrf:~$ net add vlan 24 ip address 10.22.44.22/24
.cumulus@VX2:mgmt-vrf:~$ net add loopback lo ip address 10.0.0.22/32
.cumulus@VX2:mgmt-vrf:~$ net add loopback lo ipv6 add fc00::10:0:0:22/128

Just to make you familiar with the commands, we check the changes before committing the config:

.cumulus@VX2:mgmt-vrf:~$ net pending
.--- /etc/network/interfaces     2018-04-15 15:16:13.220690728 +0200
.+++ /var/run/nclu/iface/interfaces.tmp  2018-04-15 15:23:06.211358405 +0200
.@@ -1,18 +1,42 @@
. # This file describes the network interfaces available on your system
. # and how to activate them. For more information, see interfaces(5).
.
. source /etc/network/interfaces.d/*.intf
.
. # The loopback network interface
. auto lo
. iface lo inet loopback
.+    address 10.0.0.22/32
.+    address fc00::10:0:0:22/128
.
. # The primary network interface
. auto eth0
. iface eth0 inet dhcp
.     vrf mgmt
.
.+auto swp1
.+iface swp1
.+    bridge-vids 23 24
.+
.+auto bridge
.+iface bridge
.+    bridge-ports swp1
.+    bridge-vids 23-24
.+    bridge-vlan-aware yes
.+
. auto mgmt
. iface mgmt
.     address 127.0.0.1/8
.     vrf-table auto
.+
.+auto vlan23
.+iface vlan23
.+    address 10.22.33.22/24
.+    vlan-id 23
.+    vlan-raw-device bridge
.+
.+auto vlan24
.+iface vlan24
.+    address 10.22.44.22/24
.+    vlan-id 24
.+    vlan-raw-device bridge
.
.
.
.net add/del commands since the last 'net commit'
.================================================
.
.
.User     Timestamp                   Command
.-------  --------------------------  ------------------------------------------------
.cumulus  2018-04-15 15:20:52.078123  net add interface swp1 bridge vids 23,24
.cumulus  2018-04-15 15:21:15.982722  net add vlan 23 ip address 10.22.33.22/24
.cumulus  2018-04-15 15:21:25.950790  net add vlan 24 ip address 10.22.44.22/24
.cumulus  2018-04-15 15:22:54.462519  net add loopback lo ip address 10.0.0.22/32
.cumulus  2018-04-15 15:23:04.800568  net add loopback lo ipv6 add fc00::10:0:0:22/128

As you see, many more actions are performed and lines added than we have entered, as all the necessary dependencies are resolved. Now we commit the config:

.cumulus@VX2:mgmt-vrf:~$ net commit
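
For automation you might prefer to generate such stanzas yourself rather than type NCLU commands. The following Python sketch renders a VLAN interface stanza in the same form NCLU produced in the pending output above. The helper svi_stanza is hypothetical, and the default bridge name is taken from that output.

```python
def svi_stanza(vlan_id, address, bridge="bridge"):
    """Render an /etc/network/interfaces stanza for a VLAN interface."""
    lines = [
        "auto vlan{}".format(vlan_id),
        "iface vlan{}".format(vlan_id),
        "    address {}".format(address),
        "    vlan-id {}".format(vlan_id),
        "    vlan-raw-device {}".format(bridge),
    ]
    return "\n".join(lines)


# The two VLAN interfaces used in this lab.
print(svi_stanza(23, "10.22.33.22/24"))
print("")
print(svi_stanza(24, "10.22.44.22/24"))
```

The sketch only produces text; writing it to the file and reloading the interfaces is left out on purpose.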

Now the Cumulus Linux VX based leaf switch VX2 should reach other devices in the data centre fabric, namely the Cisco IOS XRv spine switches XR3 and XR4:

.cumulus@VX2:mgmt-vrf:~$ ping 10.22.33.33
.PING 10.22.33.33 (10.22.33.33) 56(84) bytes of data.
.64 bytes from 10.22.33.33: icmp_seq=1 ttl=255 time=2.00 ms
.^C
.--- 10.22.33.33 ping statistics ---
.1 packets transmitted, 1 received, 0% packet loss, time 0ms
.rtt min/avg/max/mdev = 2.008/2.008/2.008/0.000 ms
.
.
.cumulus@VX2:mgmt-vrf:~$ ping 10.22.44.44
.PING 10.22.44.44 (10.22.44.44) 56(84) bytes of data.
.64 bytes from 10.22.44.44: icmp_seq=1 ttl=255 time=2.70 ms
.^C
.--- 10.22.44.44 ping statistics ---
.1 packets transmitted, 1 received, 0% packet loss, time 0ms
.rtt min/avg/max/mdev = 2.705/2.705/2.705/0.000 ms

Nokia (Alcatel-Lucent) VSR isn’t reachable for the moment, as we haven’t implemented routing yet; this will be done in the next section.

The official documentation provides much more detail, so please refer to it for further information.

BGP configuration of leaf switches (Nokia (Alcatel-Lucent) SR OS and Cumulus Linux)

This part consists of two sub-parts:

  • Configuration of BGP for underlay IP fabric
  • Configuration of BGP for overlay EVPN

BGP-based underlay IP fabric

Let’s recap the topology we are going to deploy in this chapter (Nokia (Alcatel-Lucent) SR2 = Cumulus Linux VX2):

The first point on our BGP agenda is to deploy the underlay IP fabric, so that both leaf switches can reach each other’s loopbacks:

SR1 – Nokia (Alcatel-Lucent) VSR

A:SR1>edit-cfg# candidate view
=========================
configure
    router
        policy-option
            begin
            prefix-list PL_BGP_LO
                prefix 10.0.0.0/24 prefix-length-range 32-32
            exit
            policy-statement RP_BGP_IPV4_UNICAST
                default-action drop
                exit
                entry 10
                    from
                        prefix-list PL_BGP_LO
                    exit
                    action accept
                    exit
                exit
            exit
            commit
        exit
        autonomous-system 65011
        router-id 10.0.0.11
        ecmp 2
        bgp
            multipath 2
            next-hop-resolution use-bgp-routes
            group FABRIC
                family ipv4
                export RP_BGP_IPV4_UNICAST
                authentication-key FABRIC
                neighbor 10.11.33.33
                    peer-as 65001
                exit
                neighbor 10.11.44.44
                    peer-as 65002
                exit
            exit
        exit
    exit
exit
=========================

VX2 – Cumulus Linux VX

cumulus@VX2:mgmt-vrf:~$
net add bgp autonomous-system 65012
net add bgp router-id 10.0.0.22
net add bgp bestpath as-path multipath-relax
net add bgp neighbor 10.22.33.33 remote-as 65001
net add bgp neighbor 10.22.33.33 password FABRIC
net add bgp neighbor 10.22.44.44 remote-as 65002
net add bgp neighbor 10.22.44.44 password FABRIC
net add bgp ipv4 unicast network 10.0.0.22/32
net commit

The spine switches running Cisco IOS XR are preconfigured. Refer to the first article about BGP/EVPN over VXLAN.
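
As the Cumulus Linux side is just a list of commands, it is easy to generate from variables. Here is a minimal Python sketch, assuming the same command set as above; the function name and its parameters are hypothetical, and nothing here talks to a real switch.

```python
def underlay_bgp_commands(asn, router_id, loopback, neighbors, password):
    """Build NCLU commands for the eBGP underlay of one leaf.

    neighbors is a list of (address, remote_as) tuples.
    """
    commands = [
        "net add bgp autonomous-system {}".format(asn),
        "net add bgp router-id {}".format(router_id),
        "net add bgp bestpath as-path multipath-relax",
    ]
    for address, remote_as in neighbors:
        commands.append(
            "net add bgp neighbor {} remote-as {}".format(address, remote_as))
        commands.append(
            "net add bgp neighbor {} password {}".format(address, password))
    # Advertise the loopback, which later serves as VTEP and BGP source.
    commands.append("net add bgp ipv4 unicast network {}".format(loopback))
    return commands


vx2 = underlay_bgp_commands(
    asn=65012,
    router_id="10.0.0.22",
    loopback="10.0.0.22/32",
    neighbors=[("10.22.33.33", 65001), ("10.22.44.44", 65002)],
    password="FABRIC",
)
print("\n".join(vx2))
```

With the values of VX2 the sketch prints the same “net add” lines as entered manually above; “net commit” still has to follow.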

When the configuration is applied and committed, we need to check the BGP states and the routes learned so far. To recap, on Nokia (Alcatel-Lucent) SR OS we do it as follows:

.A:SR1# show router bgp summary
.===============================================================================
. BGP Router ID:10.0.0.11        AS:65011       Local AS:65011
.===============================================================================
! output omitted
.BGP Summary
.===============================================================================
.Legend : D - Dynamic Neighbor
.===============================================================================
.Neighbor
.Description
.                   AS PktRcvd InQ  Up/Down   State|Rcv/Act/Sent (Addr Family)
.                      PktSent OutQ
.-------------------------------------------------------------------------------
.10.11.33.33
.               65001       58    0 00h26m03s 2/2/4 (IPv4)
.                           62    0
.10.11.44.44
.               65002       58    0 00h24m07s 2/2/4 (IPv4)
.                           68    0
.-------------------------------------------------------------------------------
.
.
.A:SR1# show router route-table
.===============================================================================
.Route Table (Router: Base)
.===============================================================================
.Dest Prefix[Flags]                            Type    Proto     Age        Pref
.      Next Hop[Interface Name]                                    Metric
.-------------------------------------------------------------------------------
.10.0.0.11/32                                  Local   Local     01h11m38s  0
.       system                                                       0
.10.0.0.22/32                                  Remote  BGP       00h06m17s  170
.       10.11.33.33                                                  0
.10.0.0.22/32                                  Remote  BGP       00h06m17s  170
.       10.11.44.44                                                  0
.10.0.0.33/32                                  Remote  BGP       00h27m41s  170
.       10.11.33.33                                                  0
.10.0.0.44/32                                  Remote  BGP       00h25m45s  170
.       10.11.44.44                                                  0
.10.11.33.0/24                                 Local   Local     01h11m33s  0
.       toXR3                                                        0
.10.11.44.0/24                                 Local   Local     01h11m33s  0
.       toXR4                                                        0
.-------------------------------------------------------------------------------
.No. of Routes: 7
.Flags: n = Number of times nexthop is repeated
.       B = BGP backup route available
.       L = LFA nexthop available
.       S = Sticky ECMP requested
.===============================================================================

So we see all the routes present at Nokia (Alcatel-Lucent) VSR. Now let’s make similar checks at Cumulus Linux:

.cumulus@VX2:mgmt-vrf:~$ net show bgp ipv4 uni summary
.BGP router identifier 10.0.0.22, local AS number 65012 vrf-id 0
.BGP table version 4
.RIB entries 7, using 1064 bytes of memory
.Peers 2, using 39 KiB of memory
.
.Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
.10.22.33.33     4      65001     193     194        0    0    0 00:09:26            3
.10.22.44.44     4      65002     193     194        0    0    0 00:09:26            3
.
.Total number of neighbors 2
!
!
.cumulus@VX2:mgmt-vrf:~$ net show route
.
.show ip route
.=============
.Codes: K - kernel route, C - connected, S - static, R - RIP,
.       O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
.       T - Table, v - VNC, V - VNC-Direct, A - Babel,
.       > - selected route, * - FIB route
.
.B>* 10.0.0.11/32 [20/0] via 10.22.33.33, vlan23, 00:09:52
.C>* 10.0.0.22/32 is directly connected, lo, 00:09:59
.B>* 10.0.0.33/32 [20/0] via 10.22.33.33, vlan23, 00:09:52
.B>* 10.0.0.44/32 [20/0] via 10.22.44.44, vlan24, 00:09:52
.C>* 10.22.33.0/24 is directly connected, vlan23, 00:09:59
.C>* 10.22.44.0/24 is directly connected, vlan24, 00:09:59
.
.
.show ipv6 route
.===============
.Codes: K - kernel route, C - connected, S - static, R - RIPng,
.       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
.       v - VNC, V - VNC-Direct, A - Babel,
.       > - selected route, * - FIB route
.
.C>* fc00::10:0:0:22/128 is directly connected, lo, 00:09:59
.C * fe80::/64 is directly connected, vlan24, 00:09:59
.C * fe80::/64 is directly connected, vlan23, 00:09:59
.C>* fe80::/64 is directly connected, bridge, 00:09:59

Actually, I haven’t used the BGP routes output so far: I haven’t issued it at Nokia (Alcatel-Lucent) VSR, though it’s implemented very well there. So to check the BGP RIB at Cumulus Linux, use the following:

.cumulus@VX2:mgmt-vrf:~$ net show bgp ipv4 unicast
.BGP table version is 4, local router ID is 10.0.0.22
.Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
.              i internal, r RIB-failure, S Stale, R Removed
.Origin codes: i - IGP, e - EGP, ? - incomplete
.
.   Network          Next Hop            Metric LocPrf Weight Path
.*  10.0.0.11/32     10.22.44.44                            0 65002 65011 i
.*>                  10.22.33.33                            0 65001 65011 i
.*> 10.0.0.22/32     0.0.0.0                  0         32768 i
.*  10.0.0.33/32     10.22.44.44                            0 65002 65011 65001 i
.*>                  10.22.33.33              0             0 65001 i
.*> 10.0.0.44/32     10.22.44.44              0             0 65002 i
.*                   10.22.33.33                            0 65001 65011 65002 i
.
.Displayed  4 routes and 7 total paths

If you compare the output with Cisco IOS XR, IOS or NX-OS, you will see that the structure is essentially the same.
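
The BGP summary output shown earlier is easy to check from a script. Here is a hedged Python sketch that parses its neighbor lines. It assumes the column layout stays as in this lab, where the last column holds the number of received prefixes for an established session and a state name (e.g. Active) otherwise. The function names are my own.

```python
import re

IPV4 = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def parse_bgp_summary(output):
    """Map each neighbor address to its State/PfxRcd column."""
    peers = {}
    for line in output.splitlines():
        fields = line.split()
        # A neighbor line starts with an IPv4 address and has 10 columns
        # (a state such as 'Idle (Admin)' may add more words at the end).
        if len(fields) >= 10 and IPV4.match(fields[0]):
            peers[fields[0]] = " ".join(fields[9:])
    return peers


def down_peers(peers):
    """Return neighbors whose last column is not a prefix counter."""
    return sorted(n for n, state in peers.items() if not state.isdigit())


summary = """\
Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
10.22.33.33     4      65001     193     194        0    0    0 00:09:26            3
10.22.44.44     4      65002     193     194        0    0    0 00:09:26            3

Total number of neighbors 2
"""

peers = parse_bgp_summary(summary)
print(peers)
print("down:", down_peers(peers))
```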

Before going further, let’s check if Nokia (Alcatel-Lucent) SR OS router SR1 can reach Cumulus Linux switch VX2:

.A:SR1# ping 10.0.0.22 source 10.0.0.11 count 1
.PING 10.0.0.22 56 data bytes
.64 bytes from 10.0.0.22: icmp_seq=1 ttl=63 time=3.78ms.
.---- 10.0.0.22 PING Statistics ----
.1 packet transmitted, 1 packet received, 0.00% packet loss
.round-trip min = 3.78ms, avg = 3.78ms, max = 3.78ms, stddev = 0.000ms

Reachability is achieved, so let’s move on.

BGP-based overlay EVPN

The next step in building our data centre is to configure BGP signalling for the EVPN overlay between our leaf switches. The following topology is implemented (as previously, SR2 = VX2, and both 65011:123 and 65012:123 = 65000:123):

So, we need to put the following config on the devices:

SR1 – Nokia (Alcatel-Lucent) VSR

A:SR1>edit-cfg# candidate view
=========================
configure
    router
        bgp
            rapid-update
            rapid-withdrawal
            group OVERLAY
                family evpn
                authentication-key OVERLAY
                neighbor 10.0.0.22
                    local-address 10.0.0.11
                    multihop 5
                    peer-as 65012
                exit
            exit
        exit
    exit
exit
=========================

VX2 – Cumulus Linux VX

cumulus@VX2:mgmt-vrf:~$
net add bgp neighbor 10.0.0.11 remote-as 65011
net add bgp neighbor 10.0.0.11 update-source lo
net add bgp neighbor 10.0.0.11 password OVERLAY
net add bgp neighbor 10.0.0.11 ebgp-multihop 5
net add bgp l2vpn evpn neighbor 10.0.0.11 activate
net del bgp ipv4 unicast neighbor 10.0.0.11 activate
net add bgp l2vpn evpn advertise-all-vni
net commit

A few words about what’s going on here. We configure an eBGP multihop session between the loopbacks for the L2VPN EVPN address family.

At Nokia (Alcatel-Lucent) VSR there is nothing new compared to what we did before. At Cumulus Linux the logic (and the commands, as you see) is somewhat similar to Cisco IOS; that’s why we need to disable the IPv4 unicast address family for this BGP session, as it’s enabled by default. Additionally, we instruct Cumulus Linux to advertise all configured VNIs.

Let’s briefly check whether the peering is up:

.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn summary
.BGP router identifier 10.0.0.22, local AS number 65012 vrf-id 0
.BGP table version 0
.RIB entries 0, using 0 bytes of memory
.Peers 3, using 59 KiB of memory
.
.Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
.10.0.0.11       4      65011     157     162        0    0    0 00:07:43            0
.
.Total number of neighbors 1

The infrastructure is prepared for the rollout of the services.

Case #1. Switching within L2 domain (L2 option of EVPN with VXLAN)

In the same sequence as before, we start the deployment of EVPN-based services with pure switching. The following service topology is to be implemented:

The following configuration is to be implemented on our leaf switches:

SR1 – Nokia (Alcatel-Lucent) VSR

A:SR1>edit-cfg# candidate view
=========================
configure
    service
        customer 2 create
            description TEST_EVPN_TENNANT
        exit
        vpls 1000789 customer 2 create
            vxlan vni 789 create
            exit
            bgp
                route-distinguisher 10.0.0.11:789
                route-target export target:65000:789 import target:65000:789
            exit
            bgp-evpn
                vxlan
                    no shutdown
                exit
                mpls
                    shutdown
                exit
            exit
            stp
                shutdown
            exit
            sap 1/1/2:555 create
                no shutdown
            exit
            proxy-arp
                no shutdown
            exit
            proxy-nd
                evpn-nd-advertise router
                no shutdown
            exit
            no shutdown
        exit
    exit
exit
=========================

VX2 – Cumulus Linux VX

cumulus@VX2:mgmt-vrf:~$
net add bgp l2vpn evpn vni 789 rd 10.0.0.22:789
net add bgp l2vpn evpn vni 789 route-target both 65000:789
net add vxlan vni789 vxlan id 789
net add vxlan vni789 vxlan local-tunnelip 10.0.0.22
net add vxlan vni789 bridge access 666
net add vxlan vni789 bridge learning off
net add vxlan vni789 bridge arp-nd-suppress on
net add interface swp1 bridge vids 23,24,666
net add vlan 666 ip forward off
net add vlan 666 ipv6 forward off
net commit

The main difference compared to the previous configurations is that we need to enable proxy-ARP to make interoperability work. We haven’t done it previously, as both Nokia (Alcatel-Lucent) VSRs were working in the same manner. In general, it’s good practice to enable proxy-ARP in order to reduce the load of administrative traffic in the data center IP fabric.
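
To illustrate the idea behind proxy-ARP (ARP suppression) in an EVPN fabric, here is a deliberately simplified Python model. It is not how either vendor implements it; the class name and the IP-to-MAC binding are made up for the illustration. The leaf answers an ARP request itself if it already knows the binding from an EVPN MAC/IP route, and floods it otherwise.

```python
class ArpSuppressingLeaf(object):
    """Toy model of a leaf that answers ARP on behalf of remote hosts."""

    def __init__(self):
        # IP -> MAC bindings, as they would be learned from EVPN routes.
        self.bindings = {}

    def learn_mac_ip_route(self, ip, mac):
        self.bindings[ip] = mac

    def handle_arp_request(self, target_ip):
        mac = self.bindings.get(target_ip)
        if mac is None:
            # Unknown target: the request has to be flooded to the fabric.
            return ("flood", None)
        # Known target: reply locally, nothing is sent to remote VTEPs.
        return ("reply", mac)


leaf = ArpSuppressingLeaf()
print(leaf.handle_arp_request("192.0.2.10"))

# Illustrative binding only (documentation address ranges, not lab values).
leaf.learn_mac_ip_route("192.0.2.10", "00:00:5e:00:53:01")
print(leaf.handle_arp_request("192.0.2.10"))
```

The first request is flooded, whereas the second one is answered locally, which is exactly the traffic reduction mentioned above.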

More details of EVPN/VXLAN configuration can be found in the official Cumulus Linux guide.

After the VPN is configured we can briefly check its operational status and main BGP-related parameters. For Nokia (Alcatel-Lucent) VSR we do it as follows:

.A:SR1# show service id 1000789 base
.===============================================================================
.Service Basic Information
.===============================================================================
.Service Id        : 1000789             Vpn Id            : 0
.Service Type      : VPLS
.Name              : (Not Specified)
.Description       : (Not Specified)
.Customer Id       : 2                   Creation Origin   : manual
.Last Status Change: 04/15/2018 15:52:59
.Last Mgmt Change  : 04/15/2018 15:52:59
.Etree Mode        : Disabled
.Admin State       : Up                  Oper State        : Up
.MTU               : 1514                Def. Mesh VC Id   : 1000789
.SAP Count         : 1                   SDP Bind Count    : 0
.Snd Flush on Fail : Disabled            Host Conn Verify  : Disabled
.SHCV pol IPv4     : None
.Propagate MacFlush: Disabled            Per Svc Hashing   : Disabled
.Allow IP Intf Bind: Disabled
.Fwd-IPv4-Mcast-To*: Disabled            Fwd-IPv6-Mcast-To*: Disabled
.Def. Gateway IP   : None
.Def. Gateway MAC  : None
.Temp Flood Time   : Disabled            Temp Flood        : Inactive
.Temp Flood Chg Cnt: 0
.SPI load-balance  : Disabled
.TEID load-balance : Disabled
.Src Tep IP        : N/A
.VSD Domain        :
.
.-------------------------------------------------------------------------------
.Service Access & Destination Points
.-------------------------------------------------------------------------------
.Identifier                               Type         AdmMTU  OprMTU  Adm  Opr
.-------------------------------------------------------------------------------
.sap:1/1/2:555                            q-tag        1518    1518    Up   Up
.===============================================================================
.* indicates that the corresponding row element may have been truncated.
!
!
.A:SR1# show service id 1000789 bgp
.===============================================================================
.BGP Information
.===============================================================================
.Vsi-Import           : None
.Vsi-Export           : None
.Route Dist           : 10.0.0.11:789
.Oper Route Dist      : 10.0.0.11:789
.Oper RD Type         : configured
.Rte-Target Import    : 65000:789            Rte-Target Export: 65000:789
.Oper RT Imp Origin   : configured           Oper RT Import   : 65000:789
.Oper RT Exp Origin   : configured           Oper RT Export   : 65000:789
.PW-Template Id       : None
.-------------------------------------------------------------------------------
.===============================================================================

For Cumulus Linux VX:

.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn vni 789
.VNI: 789 (known to the kernel)
.  Type: L2
.  Tenant VRF: Default-IP-Routing-Table
.  RD: 10.0.0.22:789
.  Originator IP: 10.0.0.22
.  Advertise-gw-macip : No
.  Import Route Target:
.    65000:789
.  Export Route Target:
.    65000:789
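
Since both leafs must agree on the route targets, this check can be scripted. Here is a minimal Python sketch that parses the Cumulus Linux output above. It assumes the field layout shown here, and the function name is hypothetical.

```python
def parse_vni_details(output):
    """Extract the RD and the route targets from the VNI details."""
    details = {"rd": None, "import_rt": [], "export_rt": []}
    section = None
    for raw in output.splitlines():
        line = raw.strip()
        if line == "Import Route Target:":
            section = "import_rt"
        elif line == "Export Route Target:":
            section = "export_rt"
        elif line.startswith("RD:"):
            details["rd"] = line.split(":", 1)[1].strip()
        elif section and line:
            # Lines following a route target header are the targets.
            details[section].append(line)
    return details


vni_output = """\
VNI: 789 (known to the kernel)
  Type: L2
  Tenant VRF: Default-IP-Routing-Table
  RD: 10.0.0.22:789
  Originator IP: 10.0.0.22
  Advertise-gw-macip : No
  Import Route Target:
    65000:789
  Export Route Target:
    65000:789
"""

vx2 = parse_vni_details(vni_output)
sr1_route_target = "65000:789"  # as configured on SR1 above
print(vx2)
print(sr1_route_target in vx2["import_rt"] and sr1_route_target in vx2["export_rt"])
```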

Let’s check how this service is working.

Case #1. Verification

To check our VPN we need to configure customer VMs, which are emulated by VRFs at the Cisco IOS XRv routers. We do it according to the topology provided at the beginning of the previous chapter. Here is the necessary configuration:

XR3 – Cisco IOS XRv

RP/0/0/CPU0:XR3(config)#show conf
!
vrf VM5
 address-family ipv4 unicast
 !
 address-family ipv6 unicast
 !
!
interface GigabitEthernet0/0/0/0.555
 vrf VM5
 ipv4 address 192.168.2.3 255.255.255.0
 ipv6 address fc00::192:168:2:1/112
 encapsulation dot1q 555
!
end

XR4 – Cisco IOS XRv

RP/0/0/CPU0:XR4(config)#show conf
!
vrf VM6
 address-family ipv4 unicast
 !
 address-family ipv6 unicast
 !
!
interface GigabitEthernet0/0/0/0.666
 vrf VM6
 ipv4 address 192.168.2.4 255.255.255.0
 ipv6 address fc00::192:168:2:2/112
 encapsulation dot1q 666
!
end

Let’s issue a ping between these two VMs to check if the connectivity is working:

.RP/0/0/CPU0:XR3#ping vrf VM5 192.168.2.2
.Sun Apr 15 17:32:05.518 UTC
.Type escape sequence to abort.
.Sending 5, 100-byte ICMP Echos to 192.168.2.2, timeout is 2 seconds:
.!!!!!
.Success rate is 100 percent (5/5), round-trip min/avg/max = 1/5/9 ms

It’s always interesting to see what’s going on on the wire, isn’t it? Wireshark can help:

We have explained the packet structure previously.

So we see that it works, which is definitely good. Let’s check the control plane and see what BGP contains. On Nokia (Alcatel-Lucent) VSR we have the following output:

.*A:SR1# show router bgp routes evpn mac
.===============================================================================
. BGP Router ID:10.0.0.11        AS:65011       Local AS:65011
.===============================================================================
. Legend -
. Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
.                 l - leaked, x - stale, > - best, b - backup, p - purge
. Origin codes  : i - IGP, e - EGP, ? - incomplete
.===============================================================================
.BGP EVPN MAC Routes
.===============================================================================
.Flag  Route Dist.         MacAddr           ESI
.      Tag                 Mac Mobility      Label1
.                          Ip Address
.                          NextHop
.-------------------------------------------------------------------------------
.i     10.0.0.11:789       00:50:56:23:b3:34 ESI-0
.      0                   Seq:0             VNI 789
.                          N/A
.                          10.0.0.11
.
.i     10.0.0.11:789       02:65:ff:00:03:3a ESI-0
.      0                   Static            VNI 789
.                          N/A
.                          10.0.0.11
.
.u*>i  10.0.0.22:789       00:50:56:34:71:46 ESI-0
.      0                   Seq:0             VNI 789
.                          fe80::250:56ff:fe34:7146
.                          10.0.0.22
.
.u*>i  10.0.0.22:789       00:50:56:34:71:46 ESI-0
.      0                   Seq:0             VNI 789
.                          fc00::192:168:2:2
.                          10.0.0.22
.
.u*>i  10.0.0.22:789       00:50:56:34:71:46 ESI-0
.      0                   Seq:0             VNI 789
.                          192.168.2.2
.                          10.0.0.22
.
.u*>i  10.0.0.22:789       00:50:56:34:71:46 ESI-0
.      0                   Seq:0             VNI 789
.                          N/A
.                          10.0.0.22
.
.-------------------------------------------------------------------------------
.Routes : 6
.===============================================================================

On Cumulus Linux VX we see the following information in the BGP RIB:

.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn route vni 789
.BGP table version is 29, local router ID is 10.0.0.22
.Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
.Origin codes: i - IGP, e - EGP, ? - incomplete
.EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
.EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
.EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
.
.   Network          Next Hop            Metric LocPrf Weight Path
.*> [2]:[0]:[0]:[48]:[00:50:56:23:b3:34]
.                    10.0.0.11                              0 65011 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]:[32]:[192.168.2.2]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]:[128]:[fc00::192:168:2:2]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]:[128]:[fe80::250:56ff:fe34:7146]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[02:65:ff:00:03:3a]
.                    10.0.0.11                              0 65011 i
.*> [3]:[0]:[32]:[10.0.0.11]
.                    10.0.0.11                              0 65011 i
.*> [3]:[0]:[32]:[10.0.0.22]
.                    10.0.0.22                          32768 i
.
.Displayed 8 prefixes (8 paths)
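
The header of the output above documents the prefix layout, so it is easy to tell MAC-only type-2 routes from MAC/IP ones in a script. The following Python sketch is only an illustration: the names `Type2Route` and `parse_type2` are my own, not part of Cumulus Linux, and the sample prefixes are copied from the VNI 789 output.

```python
import re
from typing import NamedTuple, Optional


class Type2Route(NamedTuple):
    mac: str
    ip: Optional[str]  # None for a MAC-only route


def parse_type2(prefix: str) -> Optional[Type2Route]:
    """Parse '[2]:[ESI]:[EthTag]:[MAClen]:[MAC]' with optional ':[IPlen]:[IP]'."""
    # Every field is wrapped in square brackets; colons inside MAC/IPv6 are kept.
    fields = re.findall(r"\[([^\]]*)\]", prefix)
    if len(fields) < 5 or fields[0] != "2":
        return None  # not a type-2 route (for example type-3 or type-5)
    ip = fields[6] if len(fields) >= 7 else None
    return Type2Route(mac=fields[4], ip=ip)


# Prefixes copied from the VNI 789 output (hypothetical selection).
prefixes = [
    "[2]:[0]:[0]:[48]:[00:50:56:23:b3:34]",
    "[2]:[0]:[0]:[48]:[00:50:56:34:71:46]:[32]:[192.168.2.2]",
    "[3]:[0]:[32]:[10.0.0.11]",
]
for p in prefixes:
    print(p, "->", parse_type2(p))
```

A route whose `ip` is `None` carries only the MAC address, so the receiving leaf can program its FDB from it but learns nothing about the host IP.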

We see a major difference between Nokia (Alcatel-Lucent) SR OS and Cumulus Linux: the latter creates type-2 MAC/IP routes for its attached hosts, whereas the former advertises only their MAC addresses.

This fact will make our life (and interoperability) very complicated in the next chapter. But for now the ping is working, so communication is established between the two VMs. The last check is to look at the data plane information. On Cumulus Linux we see it as follows:

.cumulus@VX2:mgmt-vrf:~$ net show bridge macs vlan 666
.
.  VLAN  Master    Interface    MAC                  TunnelDest  State      Flags    LastSeen
.------  --------  -----------  -----------------  ------------  ---------  -------  ----------
.   666  bridge    bridge       00:50:56:24:37:46                permanent           00:46:39
.   666  bridge    swp1         00:50:56:34:71:46                                    00:00:17
.   666  bridge    vni789       00:50:56:23:b3:34                           offload  00:20:39
.   666  bridge    vni789       02:65:ff:00:03:3a                static              00:21:04

And Nokia (Alcatel-Lucent) SR OS is programmed correctly as well:

.A:SR1# show service id 1000789 fdb detail
.===============================================================================
.Forwarding Database, Service 1000789
.===============================================================================
.ServId    MAC               Source-Identifier        Type     Last Change
.                                                     Age
.-------------------------------------------------------------------------------
.1000789   00:50:56:23:b3:34 sap:1/1/2:555            L/59     04/15/18 16:25:05
.1000789   00:50:56:34:71:46 vxlan:                   Evpn     04/15/18 15:59:45
.                           10.0.0.22:789
.-------------------------------------------------------------------------------
.No. of MAC Entries: 2
.-------------------------------------------------------------------------------
.Legend:  L=Learned O=Oam P=Protected-MAC C=Conditional S=Static
.===============================================================================

Now we can go to the last chapter, which is inter-VXLAN routing in EVPN.

Case #2. Inter-VXLAN routing (L3 option of EVPN with VXLAN – distributed gateway)

In this scenario we’ll deploy a distributed gateway, which is one of the most useful options of EVPN. As we know, it’s possible to deploy an anycast GW in Nokia (Alcatel-Lucent) SR OS using the passive VRRP feature. In Cumulus Linux it’s possible to do it by configuring a static virtual IP/MAC pair on the interface, much like passive VRRP. But this solution doesn’t interoperate right now, because Nokia (Alcatel-Lucent) SR OS and Cumulus Linux attach different BGP attributes to such routes. I’ll configure it anyway, but the default gateway on the VMs will point to the “physical” MAC/IP, not to the virtual one. At the end of this chapter I’ll show you the problem in the trace.

Here is the first service topology:

And here is the topology for the second one:

Let’s fulfil the requirements:

SR1 – Nokia (Alcatel-Lucent) VSR VX2 – Cumulus Linux VX

A:SR1>edit-cfg# candidate view
=========================
configure
service
vpls 1000123 customer 2 create
allow-ip-int-bind
exit
vxlan vni 123 create
exit
bgp
route-distinguisher 10.0.0.11:123
route-target export target:65000:123 import target:65000:123
exit
bgp-evpn
vxlan
no shutdown
exit
mpls
shutdown
exit
exit
stp
shutdown
exit
sap 1/1/2:111 create
no shutdown
exit
proxy-arp
shutdown
exit
proxy-nd
shutdown
exit
service-name "L2_DOMAIN_1"
no shutdown
exit
vpls 1000456 customer 2 create
allow-ip-int-bind
exit
vxlan vni 456 create
exit
bgp
route-distinguisher 10.0.0.11:456
route-target export target:65000:456 import target:65000:456
exit
bgp-evpn
vxlan
no shutdown
exit
mpls
shutdown
exit
exit
stp
shutdown
exit
sap 1/1/2:333 create
no shutdown
exit
proxy-arp
shutdown
exit
proxy-nd
shutdown
exit
service-name "L2_DOMAIN_2"
no shutdown
exit
vprn 20000000 customer 2 create
route-distinguisher 10.0.0.11:2000
vrf-target export target:65011:456 import target:65012:456
interface "IRB_VXLAN1" create
address 192.168.0.253/24
local-proxy-arp
mac 00:20:00:01:23:01
vpls "L2_DOMAIN_1"
exit
exit
interface "IRB_VXLAN2" create
address 192.168.1.253/24
local-proxy-arp
mac 00:20:00:04:56:01
vpls "L2_DOMAIN_2"
exit
exit
no shutdown
exit
exit
exit
=========================

cumulus@VX2:mgmt-vrf:~$
net add vrf CUST
net add interface swp1 bridge vids 23,24,222,444,666
net add bgp l2vpn evpn advertise-default-gw
net add bgp l2vpn evpn vni 123 rd 10.0.0.22:123
net add bgp l2vpn evpn vni 123 route-target both 65000:123
net add bgp l2vpn evpn vni 456 rd 10.0.0.22:456
net add bgp l2vpn evpn vni 456 route-target both 65000:456
net add vxlan vni123 vxlan id 123
net add vxlan vni123 vxlan local-tunnelip 10.0.0.22
net add vxlan vni123 bridge access 222
net add vxlan vni123 bridge learning off
net add vxlan vni123 bridge arp-nd-suppress on
net add vxlan vni456 vxlan id 456
net add vxlan vni456 vxlan local-tunnelip 10.0.0.22
net add vxlan vni456 bridge access 444
net add vxlan vni456 bridge learning off
net add vxlan vni456 bridge arp-nd-suppress on
net add vlan 222 vrf CUST
net add vlan 222 ip address 192.168.0.254/24
net add vlan 222 ip address-virtual 00:00:5e:00:01:23 192.168.0.250/24
net add vlan 222 ipv6 address fc00::192:168:0:254/112
net add vlan 222 hwaddress 00:20:00:01:23:02
net add vlan 444 vrf CUST
net add vlan 444 ip address 192.168.1.254/24
net add vlan 444 ip address-virtual 00:00:5e:00:04:56 192.168.1.250/24
net add vlan 444 ipv6 address fc00::192:168:1:254/112
net add vlan 444 hwaddress 00:20:00:04:56:02

Much in the same way as in the previous part, we have enabled proxy-ARP on both sides. As the configuration of Nokia (Alcatel-Lucent) SR OS was explained previously, we won’t focus on it. So, what are we doing on Cumulus Linux? Here is the list of actions:

  • Create VRF CUST
  • Allow VLANs for customer on port
  • Advertise VXLAN GW MAC/IP for all VNIs in EVPN
  • Create VTEP for VNI 123, associate it with customer VLAN 222 and associate IP/MAC
  • Create VTEP for VNI 456, associate it with customer VLAN 444 and associate IP/MAC
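
Since the per-VNI commands differ only in the VNI number and the VLAN, they lend themselves to templating. Below is a minimal Python sketch, assuming you want to print the same NCLU lines as in the listing above; the helper name `vni_commands` is hypothetical, and the command strings are copied from this article rather than from an official reference.

```python
from typing import List


def vni_commands(vni: int, vlan: int, local_ip: str) -> List[str]:
    """Return the NCLU commands used in this article to bind a VNI to a VLAN."""
    name = f"vni{vni}"  # interface naming as used in this lab
    return [
        f"net add vxlan {name} vxlan id {vni}",
        f"net add vxlan {name} vxlan local-tunnelip {local_ip}",
        f"net add vxlan {name} bridge access {vlan}",
        f"net add vxlan {name} bridge learning off",
        f"net add vxlan {name} bridge arp-nd-suppress on",
    ]


# VNI-to-VLAN pairs and the VTEP address of VX2 from this lab.
for vni, vlan in [(123, 222), (456, 444)]:
    print("\n".join(vni_commands(vni, vlan, "10.0.0.22")))
```

The printed lines can be pasted into the switch or pushed by an automation tool of your choice.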

Generally speaking, we should be ready, so we can proceed with verification.

Case #2. Verification

As in the verification of the previous case, we need first of all to create corresponding configuration on Cisco IOS XR routers XR3 and XR4:

XR3 – Cisco IOS XRv XR4 – Cisco IOS XRv

RP/0/0/CPU0:XR3(config)#show conf
!
vrf VM1
address-family ipv4 unicast
!
address-family ipv6 unicast
!
!
vrf VM3
address-family ipv4 unicast
!
address-family ipv6 unicast
!
!
interface GigabitEthernet0/0/0/0.111
vrf VM1
ipv4 address 192.168.0.1 255.255.255.0
ipv6 address fc00::192:168:0:1/112
encapsulation dot1q 111
!
interface GigabitEthernet0/0/0/0.333
vrf VM3
ipv4 address 192.168.1.1 255.255.255.0
ipv6 address fc00::192:168:1:1/112
encapsulation dot1q 333
!
router static
vrf VM1
address-family ipv4 unicast
0.0.0.0/0 192.168.0.253
!
address-family ipv6 unicast
::/0 fc00::192:168:0:253
!
!
vrf VM3
address-family ipv4 unicast
0.0.0.0/0 192.168.1.253
!
address-family ipv6 unicast
::/0 fc00::192:168:1:253
!
!
!
end

RP/0/0/CPU0:XR4(config)#show conf
!
vrf VM2
address-family ipv4 unicast
!
address-family ipv6 unicast
!
!
vrf VM4
address-family ipv4 unicast
!
address-family ipv6 unicast
!
!
interface GigabitEthernet0/0/0/0.222
vrf VM2
ipv4 address 192.168.0.2 255.255.255.0
ipv6 address fc00::192:168:0:2/112
encapsulation dot1q 222
!
interface GigabitEthernet0/0/0/0.444
vrf VM4
ipv4 address 192.168.1.2 255.255.255.0
ipv6 address fc00::192:168:1:2/112
encapsulation dot1q 444
!
router static
vrf VM2
address-family ipv4 unicast
0.0.0.0/0 192.168.0.254
!
address-family ipv6 unicast
::/0 fc00::192:168:0:254
!
!
vrf VM4
address-family ipv4 unicast
0.0.0.0/0 192.168.1.254
!
address-family ipv6 unicast
::/0 fc00::192:168:1:254
!
!
!
end

As the configuration is done, we should be able to reach all other VMs in this tenant (VM2, VM3 and VM4) from VM1:

.RP/0/0/CPU0:XR3#ping vrf VM1 192.168.0.1
.Sun Apr 15 20:18:02.699 UTC
.Type escape sequence to abort.
.Sending 5, 100-byte ICMP Echos to 192.168.0.1, timeout is 2 seconds:
.!!!!!
.Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
.RP/0/0/CPU0:XR3#ping vrf VM1 192.168.0.2
.Sun Apr 15 20:18:03.809 UTC
.Type escape sequence to abort.
.Sending 5, 100-byte ICMP Echos to 192.168.0.2, timeout is 2 seconds:
.!!!!!
.Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/9 ms
.RP/0/0/CPU0:XR3#ping vrf VM1 192.168.1.1
.Sun Apr 15 20:18:05.869 UTC
.Type escape sequence to abort.
.Sending 5, 100-byte ICMP Echos to 192.168.1.1, timeout is 2 seconds:
.!!!!
.Success rate is 80 percent (4/5), round-trip min/avg/max = 1/5/9 ms
.RP/0/0/CPU0:XR3#ping vrf VM1 192.168.1.2
.Sun Apr 15 20:18:09.949 UTC
.Type escape sequence to abort.
.Sending 5, 100-byte ICMP Echos to 192.168.1.2, timeout is 2 seconds:
......
.Success rate is 0 percent (0/5)

Surprisingly, communication works only within an L2 domain across different leaf switches and between different L2 domains connected to the same switch; routing between L2 domains across leaf switches (VM1 to VM4) fails. Recall that in the pure Nokia (Alcatel-Lucent) VSR case we didn’t have such problems. When I tested with two Cumulus Linux VX leaf switches, I didn’t have any problems either. So interoperability isn’t achieved yet. What can we do?

Long story short: because Nokia (Alcatel-Lucent) SR OS doesn’t create type-2 MAC/IP routes for its attached hosts, Cumulus Linux can’t populate its ARP table, which is necessary for routing:

.cumulus@VX2:mgmt-vrf:~$ arp -a
.localhost (192.168.2.2) at 00:50:56:34:71:46 [ether] on vlan666
.localhost (10.22.44.44) at 00:50:56:34:71:46 [ether] on vlan24
.localhost (10.22.33.33) at 00:50:56:23:b3:34 [ether] on vlan23
.localhost (192.168.0.253) at 00:20:00:01:23:01 [ether] on vlan222
.localhost (192.168.1.253) at 00:20:00:04:56:01 [ether] on vlan444
.localhost (192.168.0.2) at 00:50:56:34:71:46 [ether] on vlan222
.localhost (192.168.0.1) at <incomplete> on vlan222
.localhost (192.168.1.2) at 00:50:56:34:71:46 [ether] on vlan444
.localhost (192.168.1.1) at <incomplete> on vlan444
.localhost (192.168.44.2) at 00:50:56:ec:a9:dd [ether] on eth0
.localhost (192.168.44.1) at 00:50:56:c0:00:08 [ether] on eth0
.localhost (192.168.44.254) at 00:50:56:f5:52:db [ether] on eth0
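
The two entries without a MAC address (the `arp` utility prints them as `<incomplete>`) are exactly the hosts behind SR1. If you want to spot such entries automatically, a short Python sketch like the one below could do it. The function name `unresolved` and the sample text are assumptions made for illustration; the regular expression is tuned only to the `arp -a` layout shown above.

```python
import re
from typing import List, Tuple

# "(IP) at MAC [ether] ... on DEV"; the MAC may be empty or "<incomplete>".
ARP_LINE = re.compile(r"\((?P<ip>[^)]+)\) at (?P<mac>\S*) .*?on (?P<dev>\S+)$")


def unresolved(arp_output: str) -> List[Tuple[str, str]]:
    """Return (ip, interface) pairs for which no MAC address is known."""
    missing = []
    for line in arp_output.splitlines():
        match = ARP_LINE.search(line.strip())
        if match and match.group("mac") in ("", "<incomplete>"):
            missing.append((match.group("ip"), match.group("dev")))
    return missing


# Shortened sample based on the output above.
sample = """\
localhost (192.168.0.2) at 00:50:56:34:71:46 [ether] on vlan222
localhost (192.168.0.1) at <incomplete> on vlan222
localhost (192.168.1.1) at <incomplete> on vlan444
"""
print(unresolved(sample))
```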

VX2 tries to resolve these addresses on the local subnet and even over VXLAN, but without success (messages 64 and 65 in the following Wireshark output):

If we look into related BGP RIB, we’ll see the MAC/IP route created by VX2 for its attached VM2 as well as for itself (gateway), whereas SR1 creates MAC/IP route only for itself (gateway):

.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn route vni 123
.BGP table version is 33, local router ID is 10.0.0.22
.Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
.Origin codes: i - IGP, e - EGP, ? - incomplete
.EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
.EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
.EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
.
.   Network          Next Hop            Metric LocPrf Weight Path
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:01]:[32]:[192.168.0.253]
.                    10.0.0.11                              0 65011 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[32]:[192.168.0.254]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[128]:[fc00::192:168:0:254]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[128]:[fe80::220:ff:fe01:2302]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:23:b3:34]
.                    10.0.0.11                              0 65011 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]:[32]:[192.168.0.2]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:34:71:46]:[128]:[fe80::250:56ff:fe34:7146]
.                    10.0.0.22                          32768 i
.*> [3]:[0]:[32]:[10.0.0.11]
.                    10.0.0.11                              0 65011 i
.*> [3]:[0]:[32]:[10.0.0.22]
.                    10.0.0.22                          32768 i
.
.Displayed 10 prefixes (10 paths)
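
The `fe80::` entries in this output are IPv6 link-local addresses, and their values match what you get when you derive the interface identifier from the MAC address using the modified EUI-64 rule. The Python sketch below reproduces that derivation as a sanity check; the function name `mac_to_link_local` is my own, and whether a given host really uses EUI-64 depends on its IPv6 settings.

```python
import ipaddress


def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Build fe80::/64 + modified EUI-64 interface ID from a MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac}")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    prefix = [0xFE, 0x80, 0, 0, 0, 0, 0, 0]  # fe80::/64
    return ipaddress.IPv6Address(bytes(prefix + interface_id))


# MAC addresses taken from the BGP output above.
for mac in ("00:50:56:34:71:46", "00:20:00:01:23:02"):
    print(mac, "->", mac_to_link_local(mac))
```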

As Nokia (Alcatel-Lucent) VSR receives the corresponding type-2 MAC/IP routes, it can populate its ARP table properly:

.A:SR1# show router 20000000 arp
.===============================================================================
.ARP Table (Service: 20000000)
.===============================================================================
.IP Address      MAC Address       Expiry    Type   Interface
.-------------------------------------------------------------------------------
.192.168.0.1     00:50:56:23:b3:34 02h53m39s Dyn[I] IRB_VXLAN1
.192.168.0.2     00:50:56:34:71:46 00h00m00s Evp[I] IRB_VXLAN1
.192.168.0.253   00:20:00:01:23:01 00h00m00s Oth[I] IRB_VXLAN1
.192.168.0.254   00:20:00:01:23:02 00h00m00s Evp[I] IRB_VXLAN1
.192.168.1.1     00:50:56:23:b3:34 02h53m40s Dyn[I] IRB_VXLAN2
.192.168.1.2     00:50:56:34:71:46 00h00m00s Evp[I] IRB_VXLAN2
.192.168.1.253   00:20:00:04:56:01 00h00m00s Oth[I] IRB_VXLAN2
.192.168.1.254   00:20:00:04:56:02 00h00m00s Evp[I] IRB_VXLAN2
.-------------------------------------------------------------------------------
.No. of ARP Entries: 8
.===============================================================================

There are two possible workarounds for that:

I’d say neither of them is suitable for production, so I don’t recommend deploying both Cumulus Linux and Nokia (Alcatel-Lucent) SR OS as leaf switches inside the same DC today, given the current level of SW development. Choose what you like based on your requirements and knowledge.

Just to show you how the first workaround, static ARP entries on Cumulus Linux, is implemented:

.cumulus@VX2:mgmt-vrf:~$ sudo arp -s 192.168.0.1 00:50:56:23:b3:34 -i vlan222
.cumulus@VX2:mgmt-vrf:~$ sudo arp -s 192.168.1.1 00:50:56:23:b3:34 -i vlan444
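
The hosts that need a static entry on VX2 are the ones SR1 has learned dynamically (type `Dyn` in its ARP table), because SR1 doesn’t advertise MAC/IP routes for them. As a rough sketch, the `arp -s` lines could be generated from the SR1 ARP table with a few lines of Python. The function name `static_arp_commands` and the `IRB_TO_VLAN` mapping are assumptions of this example; the mapping simply mirrors the lab (IRB_VXLAN1 is VNI 123 and VLAN 222 on VX2, IRB_VXLAN2 is VNI 456 and VLAN 444).

```python
from typing import Dict, List

# Lab-specific mapping between SR1 IRB interfaces and VX2 VLAN interfaces.
IRB_TO_VLAN: Dict[str, str] = {"IRB_VXLAN1": "vlan222", "IRB_VXLAN2": "vlan444"}


def static_arp_commands(sr_os_arp: str) -> List[str]:
    """Turn dynamically learned SR OS ARP entries into 'arp -s' commands for VX2."""
    commands = []
    for line in sr_os_arp.splitlines():
        cols = line.split()
        # Data rows have five columns: IP, MAC, expiry, type, interface.
        if len(cols) == 5 and cols[3].startswith("Dyn"):
            ip, mac, _expiry, _kind, irb = cols
            commands.append(f"sudo arp -s {ip} {mac} -i {IRB_TO_VLAN[irb]}")
    return commands


# Two rows copied from the SR1 ARP table shown earlier.
sample = """\
192.168.0.1     00:50:56:23:b3:34 02h53m39s Dyn[I] IRB_VXLAN1
192.168.0.2     00:50:56:34:71:46 00h00m00s Evp[I] IRB_VXLAN1
192.168.1.1     00:50:56:23:b3:34 02h53m40s Dyn[I] IRB_VXLAN2
"""
print("\n".join(static_arp_commands(sample)))
```

Keep in mind that static entries have to be maintained by hand whenever a VM moves or changes its MAC address, which is one reason this workaround doesn’t scale.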

Now we check the ARP table on Cumulus Linux again:

.cumulus@VX2:mgmt-vrf:~$ arp -a
.localhost (192.168.2.2) at 00:50:56:34:71:46 [ether] on vlan666
.localhost (10.22.44.44) at 00:50:56:34:71:46 [ether] on vlan24
.localhost (10.22.33.33) at 00:50:56:23:b3:34 [ether] on vlan23
.localhost (192.168.0.253) at 00:20:00:01:23:01 [ether] on vlan222
.localhost (192.168.1.253) at 00:20:00:04:56:01 [ether] on vlan444
.localhost (192.168.0.2) at 00:50:56:34:71:46 [ether] on vlan222
.localhost (192.168.0.1) at 00:50:56:23:b3:34 [ether] PERM on vlan222
.localhost (192.168.1.2) at 00:50:56:34:71:46 [ether] on vlan444
.localhost (192.168.1.1) at 00:50:56:23:b3:34 [ether] PERM on vlan444
.localhost (192.168.44.2) at 00:50:56:ec:a9:dd [ether] on eth0
.localhost (192.168.44.1) at 00:50:56:c0:00:08 [ether] on eth0
.localhost (192.168.44.254) at 00:50:56:f5:52:db [ether] on eth0

The MAC address of VM1 and VM3 is taken from the output of “show interface gig 0/0/0/0” on XR3.

We can try ping again:

.RP/0/0/CPU0:XR3#ping vrf VM1 192.168.1.2
.Sun Apr 15 21:39:03.056 UTC
.Type escape sequence to abort.
.Sending 5, 100-byte ICMP Echos to 192.168.1.2, timeout is 2 seconds:
.!!!!!
.Success rate is 100 percent (5/5), round-trip min/avg/max = 1/4/9 ms

Now it works properly.

Considerations on anycast GW

Just to recap what we have already configured in the previous chapter (only the relevant configuration is provided):

SR1 – Nokia (Alcatel-Lucent) VSR VX2 – Cumulus Linux VX

A:SR1>edit-cfg# candidate view
=========================
configure
service
vprn 20000000 customer 2 create
interface "IRB_VXLAN1" create
vrrp 1 passive
backup 192.168.0.250
ping-reply
traceroute-reply
mac 00:00:5e:00:01:23
exit
exit
interface "IRB_VXLAN2" create
vrrp 1 passive
backup 192.168.1.250
ping-reply
traceroute-reply
mac 00:00:5e:00:04:56
exit
exit
no shutdown
exit
exit
exit
=========================

cumulus@VX2:mgmt-vrf:~$
net add vlan 222 ip address-virtual 00:00:5e:00:01:23 192.168.0.250/24
net add vlan 444 ip address-virtual 00:00:5e:00:04:56 192.168.1.250/24

For a moment I shut down the BGP EVPN peering between the Nokia (Alcatel-Lucent) SR OS router SR1 and the Cumulus Linux switch VX2. The following output is taken from the latter (pay attention to the route for the anycast GW):

.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn route vni 123
.BGP table version is 7, local router ID is 10.0.0.22
.Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
.Origin codes: i - IGP, e - EGP, ? - incomplete
.EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
.EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
.EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
.
.   Network          Next Hop            Metric LocPrf Weight Path
.*> [2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[32]:[192.168.0.250]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[128]:[fe80::200:5eff:fe00:123]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[32]:[192.168.0.254]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[128]:[fc00::192:168:0:254]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[128]:[fe80::220:ff:fe01:2302]
.                    10.0.0.22                          32768 i
.*> [3]:[0]:[32]:[10.0.0.22]
.                    10.0.0.22                          32768 i
.
.Displayed 6 prefixes (6 paths)
!
!
.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn route rd 10.0.0.22:123
.EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
.EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
.EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
.
.BGP routing table entry for 10.0.0.22:123:[2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[32]:[192.168.0.250]
.Paths: (1 available, best #1)
.  Not advertised to any peer
.  Route [2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[32]:[192.168.0.250] VNI 123
.  Local
.    10.0.0.22 from 0.0.0.0 (10.0.0.22)
.      Origin IGP, localpref 100, weight 32768, valid, sourced, local, bestpath-from-AS Local, best
.      Extended Community: ET:8 RT:65000:123 Default Gateway
.      AddPath ID: RX 0, TX 18
.      Last update: Tue Apr 24 21:33:28 2018

Now I bring the BGP peering back and let it converge. After a while I see a new route for this MAC/IP. It would be OK if we had both of them, the old one and the new one. But Cumulus Linux stops advertising its own route. Let’s have a closer look:

.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn route vni 123
.BGP table version is 16, local router ID is 10.0.0.22
.Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
.Origin codes: i - IGP, e - EGP, ? - incomplete
.EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
.EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
.EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
.
.   Network          Next Hop            Metric LocPrf Weight Path
.*> [2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[32]:[192.168.0.250]
.                    10.0.0.11                              0 65011 i
.*> [2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[128]:[fe80::200:5eff:fe00:123]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:0c:29:6b:7c:ec]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:0c:29:6b:7c:ec]:[128]:[fe80::20c:29ff:fe6b:7cec]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:01]:[32]:[192.168.0.253]
.                    10.0.0.11                              0 65011 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[32]:[192.168.0.254]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[128]:[fc00::192:168:0:254]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:20:00:01:23:02]:[128]:[fe80::220:ff:fe01:2302]
.                    10.0.0.22                          32768 i
.*> [2]:[0]:[0]:[48]:[00:50:56:23:b3:34]
.                    10.0.0.11                              0 65011 i
.*> [3]:[0]:[32]:[10.0.0.11]
.                    10.0.0.11                              0 65011 i
.*> [3]:[0]:[32]:[10.0.0.22]
.                    10.0.0.22                          32768 i
.*> [5]:[0]:[0]:[24]:[192.168.1.0]
.                    10.0.0.11                              0 65011 i
.*> [5]:[0]:[0]:[32]:[192.168.1.253]
.                    10.0.0.11                              0 65011 i
.*> [5]:[0]:[0]:[32]:[192.168.1.254]
.                    10.0.0.11                              0 65011 i
.
.Displayed 14 prefixes (14 paths)
!
!
.cumulus@VX2:mgmt-vrf:~$ net show bgp l2vpn evpn route rd 10.0.0.11:123
.EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
.EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
.EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]
.
.BGP routing table entry for 10.0.0.11:123:[2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[32]:[192.168.0.250]
.Paths: (1 available, best #1)
.  Advertised to non peer-group peers:
.  10.0.0.11
.  Route [2]:[0]:[0]:[48]:[00:00:5e:00:01:23]:[32]:[192.168.0.250] VNI 123
.  65011
.    10.0.0.11 from 10.0.0.11 (10.0.0.11)
.      Origin IGP, localpref 100, valid, external, bestpath-from-AS 65011, best
.      Extended Community: RT:65000:123 ET:8 MM:0, sticky MAC
.      AddPath ID: RX 0, TX 45
.      Last update: Tue Apr 24 22:04:40 2018

So we see that the route from Nokia (Alcatel-Lucent) SR OS carries a different set of attributes, which somehow instructs Cumulus Linux to step back. If I ran a ping right now, only Nokia SR1 would respond.
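
To make the difference explicit, we can compare the two “Extended Community” lines from the detailed outputs above. The Python sketch below is an illustration only: the regular expression covers just the community strings that appear in this lab, and the helper name `communities` is hypothetical.

```python
import re
from typing import Set

# Matches only the community notations seen in the outputs of this article.
COMMUNITY = re.compile(r"RT:\d+:\d+|ET:\d+|MM:\d+(?:, sticky MAC)?|Default Gateway")


def communities(line: str) -> Set[str]:
    """Extract extended communities from an 'Extended Community:' line."""
    return set(COMMUNITY.findall(line))


# Lines copied from the two 'route rd' outputs (VX2 local route, SR1 route).
local = communities("Extended Community: ET:8 RT:65000:123 Default Gateway")
remote = communities("Extended Community: RT:65000:123 ET:8 MM:0, sticky MAC")

print("only in the VX2 route:", sorted(local - remote))
print("only in the SR1 route:", sorted(remote - local))
```

Both routes share the route target and the VXLAN encapsulation type (ET:8). The VX2 route is marked with the Default Gateway community, whereas the SR1 route carries a MAC Mobility community flagged as sticky. My reading, which I haven’t verified against the Cumulus Linux source code, is that this mismatch is what makes VX2 give up its own route.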

Final configuration files: 118_config_final_SR1 118_config_final_VX2 118_config_final_XR3 118_config_final_XR4

Lessons learned

I have learned two major lessons during this lab.

The first one is about interoperability. Unfortunately, not all vendors implement RFCs fully, so when it comes to real practice, things stop working. It could be that Nokia (Alcatel-Lucent) SR OS wasn’t developed with data centres in mind, but the recent product line, the Nokia 7750 SR-s, could be quite good in this role.

The second one is the approach Cumulus Linux takes to building a data center IP fabric using BGP unnumbered. In the BGP configuration you can just say “neighbor swpX remote-as external”, where “swpX” is a port towards another switch and “external” means any ASN of the neighboring switch other than the local one. What would be unsuitable for the Internet for security reasons is perfectly fine for easily scaling a data center IP fabric.

Conclusion

Though the article turned out to be quite long, I hope you enjoyed reading it. Putting things in different conditions where they have to interoperate makes the mind work intensively. I spent a couple of hours trying to make the last case work before I deeply understood the problem. For the data centre field Cumulus Linux seems to be quite promising, as it supports the core functions of modern data centre technologies, namely BGP EVPN and VXLAN. And we have even made it work when the leafs come from different vendors, which might happen in reality when we speak about DCI and different DCs historically coming from different companies. Take care and good bye!

P.S.

If you have further questions or you need help with your networks, I’m happy to assist you, just send me a message. Also don’t forget to share the article on your social media if you like it.

BR,

Anton Karneliuk

Useful links

For Cisco folks, the comparison of Cumulus Linux commands to Cisco NX-OS