Site icon Karneliuk

SP. Part 3. Automated Service Provider Fabric with Arista, Cisco, Nokia and Ansible

Hello my friend,

I hope you have wonderful Christmas break and now you are coming back fully energized to rock in 2019. Today we are going to automate service provider network (or how I called it Service Provider Fabric, read below why) in terms of underlay configuration in multi-vendor environment with Cisco IOS XR, Nokia SR OS and Arista EOS network functions.


1
2
3
4
5
No part of this blogpost could be reproduced, stored in a
retrieval system, or transmitted in any form or by any
means, electronic, mechanical or photocopying, recording,
or otherwise, for commercial purposes without the
prior permission of the author.

Brief description

In 2018 we are talked a lot about modern data center architecture running EVPN/VXLAN. As you know, the traditional network business in general is to transmit the traffic from one application to another. In data center context the major focus is on the different components of applications, like front-end/back-end services, various data bases and so on. Regardless of their nature meaning whether they are virtualized or not, there are typically 2 types of the traffic between components of applications:

Data centers have a long development history to current state, starting with VLAN segmentation, STP and routing functions in core to present state, where gateway function is distributed across all Leaf switches (or even hypervisors). The actual service is called “overlay” and is realized using BGP-EVPN signaling with VXLAN encapsulation. The service itself is decoupled from network infrastructure called “underlay”, which main function is to provide connectivity between service termination endpoints (VTEP for VXLAN). Typically, the architecture is 2-Tier with “leaf” switches, which terminate overlay services, and “spine” switches, which performs pure packet forwarding. One “leaf” switch is connected to multiple “spine” switches for load-sharing and resiliency. Such architecture is called “Clos” or “IP fabric”. For more complex scenarios, like hyper-scale data centers or data center interconnect (DCI), the “fabric” could contain 3,4 and more tiers.

Read more details about data center architecture and configuration for Cisco, Nokia, Cumulus and Arista.

If we think for a moment, we can spot that today there is almost no difference in the architecture of the data center and service provider networks (we discussed it once previously), as the key components are similar:

Having said that, it’s also to fair to highlight the differences themselves:

These two facts create additional requirements to service provider network, which aren’t presented in data center world: traffic engineering and QoS.

Summarizing all the information above, the architecture of the data center and service provider network is very similar, that’s why I called it “Service Provider Fabric”. Following the same terminology used for data centers, what makes absolutely sense for me, we have the following building blocks:

That is our Service Provider Fabric. We already shared some examples of IP-VPN and EVPN (E-LAN). There were some minor changes in the topology, that’s why you should always check the most resent state on my GitHub page.

The Service Provider Fabric is multivendor consisting of Arista EOS, Cisco IOS XR and Nokia (Alcatel-Lucent) EOS router with Cumulus Linux acting as CE.

What are we going to test?

The configuration of ISIS with Segment Routing for Nokia, Arista and Cisco as well as BGP configuration was already shown. That’s why it isn’t focus.

In the focus there are two things:

The latter is important, because for Nokia SR OS the YANG modules aren’t published anywhere officially, therefore I don’t use them as I am not allowed to put them to GitHub. On the other hand, Cisco IOS XR YANG modules  and Arista EOS YANG modules are officially published, I just copied corresponding modules to my GitHub, because, as you will see later, original YANG modules are used for automation of the network functions configuration.

Nevertheless, it doesn’t mean that we can’t use NETCONF/YANG on Nokia (Alcatel-Lucent) SR OS network functions, as we have already used it many times. It means that we would need to perform the configuration via CLI, then extract the configuration using NETCONF in XML format and then create Jinja2 template of XML file, which in future will be used for NETCONF configuration. We’ll do that in future, as it requires some time, but for now we stay CLI-based automation via Ansible sros_config module.

Moreover, Nokia doesn’t support Segment Routing in OpenConfig YANG modules currently, so we will have to use Nokia native YANG modules.

Software version

The following software components are used in this lab:

See the previous article to get details how to build the lab

Topology

Starting from now I’m shifting all the schemes in ASCII format. The reason for that is quite simple. I’m manly developing all the automation from CLI in Linux, and having the topology in ASCII format provides me possibility to view it directly in CLI without necessity to switch to any other application, what is very convenient and saves the time. So the management topology looks like the following:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
+---------------------------------------------------------------------------+
|                                                                           |
|    +-------------+          +-----------------+                           |
|    | Internet GW |          | management host |                           |
|    +------+------+          +--------+--------+                           |
|           | eth1                     | ens33                              |
|           | .2                       | .131       OOB: 192.168.141.0/24   |
|           |                          |                                    |
|       +---+-----------+--------------+--------------+-------------+       |
|       |               |              |              |             |       |
|       | BOF (Eth)     | MgmtEth0     | Management1  | Management1 | eth0  |
|       | .101          | .51          | .71          | .72         | .84   |
|   +---+---+       +---+---+      +---+---+      +---+---+     +---+---+   |
|   |  SR1  |       |  XR1  |      |  EOS1 |      |  EOS2 |     |  VX4  |   |
|   +-------+       +-------+      +-------+      +-------+     +-------+   |
|                                                                           |
+---------------------------------------------------------------------------+

Previously this topology was called “physical topology”, but after creating cheat sheet for KVM, it doesn’t matter anymore which hypervisor are you using: KVM, VMware Workstation or Oracle VirtualBox.

The logical topology is the same, as it was used in the previous lab, but more informative this time including IP subnets for interconnect links and SIDs for Segment Routing:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
+-------------------------------------------------------------------------------------------------------------+
|                                                                                                             |
|     L0:                                                                                                     |
|     10.0.0.22/32                                                                                            |
|     fc00::10:0:0:22/128        10.1.22.0/24                                                                 |
|        (Metric: 10)            fc00::10:1:22:0/112                                                          |
|    +---------------+              (Metric: 10)                                                              |
|    | EOS1 (SID:22) |(Eth2)+----------------------+                                                          |
|    +---------------+ .22/:22                     |                                                          |
|          (Eth1)                                  |                                                          |
|            + .22/:22                             + .1/:1       10.1.44.0/24                                 |
|            |                                   (Eth1)          fc00::10:1:44:0/112                          |
|            |  10.22.33.0/24               +--------------+        (Metric: 10)        +----------------+    |
|            |  fc00::10:22:33:0/112        | EOS2 (SID:1) |(Eth3)+--------------+(Eth3)|  SR1 (SID: 44) |    |
|            |     (Metric: 100)            +--------------+ .1:/1              .44/:44 +----------------+    |
|            |                                   (Eth2)                                                       |
|            + .33/:33                             + .1/:1  L0:                         system:               |
|        (G0/0/0/0)                                |        10.0.0.1/32                 10.0.0.44/32          |
|    +---------------+                             |        fc00::10:0:0:1/128          fc00::10:0:0:44/128   |
|    |  XR1 (SID:33) |(G0/0/0/1)+------------------+           (Metric: 10)                (Metric: 10)       |
|    +---------------+ .33/:33      (Metric: 10)                                                              |
|                                10.1.33.0/24                                                                 |
|     L0:                        fc00::10:1:33:0/112                                                          |
|     10.0.0.33/32                                                                                            |
|     fc00::10:0:0:33/128                                                                                     |
|        (Metric: 10)              ISIS 0 (CORE) / level 2 / Segment Routing                                  |
|                                                                                                             |
|                                                ASN: 65000                                                   |
|                                                                                                             |                                              
|                               Full iBGP mesh for VPNV4 UNICAST, VPNV6 UNICAST                               |
|                                 and L2VPN EVPN between loopbacks of all PEs                                 |
|                                                                                                             |
+-------------------------------------------------------------------------------------------------------------+

As you can find in the description to the topology, all 3 PE routers in this topology are coming from different vendors:

The relevant software versions were shown in the beginning of this article.

All the routers are running ISIS routing protocol with Segment Routing. The values are of interfaces metrics as well as Segment Routing SIDs are shown on the diagram above. ISIS is running in multi-topology fashion both for IPv4 and IPv6 address families. Full-mesh iBGP for VPNv4/VPNv6 unicast and EVPN is running between the loopback0 and system interfaces of the PE network functions.

The initial configuration as well as the topologies and all the automations you can find on my GitHub page.

General structure of Service Provider Fabric automation with Ansible

I strongly recommend you to visit my GitHub page to learn the hierarchy of folders and files.

As we speak about automation, I will focus only on respective folder, leaving the rest for your discovery. First of all, let’s take a high level look on the content of the folder with Ansible playbooks:


1
2
3
4
5
6
7
8
9
10
11
12
+--service_provider_fabric
   +--ansible
      +--ansible.cfg
      +--files (folder)
      +--group_vars (folder)
      +--hosts
      +--output (folder)
      +--README.md
      +--roles (folder)
      +--sp_sr.yml
      +--vars (folder)
      +--yang_extractor_config.yml

The content of the folders will be explained later in this part.

As my intention is to make Git a product compete by itself, there are a lot of “README.md” files. For example, README.md in “ansible” folder provides hints how to use playbooks and other valuable information in terms of what is supported, how the configuration is done and so on. File “ansible.cfg” has just one configuration line to remove the problem with SSH caused by constant VNF network starts, so the host key check is disabled:


1
2
3
$ cat ansible.cfg
[defaults]
host_key_checking = False

File “hosts” is Ansible hosts as it lists all the groups and assignment of the network functions to the groups. For the Service Provider Fabric nodes, the grouping is done by network OS version:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
$ cat hosts
[management_host]
localhost

[iosxr]
XR1

[eos]
EOS1
EOS2

[sros]
SR1

[cumulus]
VX4

[fabric:children]
iosxr
eos
sros

The next important part is the platform defaults from “group_vars”, like username/passwords, network OS type as well some platform defaults (right now the label block for Segment Routing only, but could be easily extended in future in case of necessity). To reduce the length of the article, only the tree and one example for Nokia SR OS based PE SR1 is shown:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
+--service_provider_fabric
   +--ansible
      +--group_vars
         +--all
         |  +--main.yml
         +--cumulus
         |  +--main.yml
         +--eos
         |  +--main.yml
         +--iosxr
         |  +--main.yml
         +--README.md
         +--sros
            +--main.yml


$ cat group_vars/sros/main.yml
---
ansible_user: admin
ansible_pass: admin
ansible_ssh_pass: admin
ansible_network_os: sros
platform_defaults:
    mpls_label_ranges:
        - id: sr
          start: 500000
          end: 524287
...

The variables, which are used for configuration of the network functions in the Service Provider Fabric are stored in “vars” folder. I’ve created the structure, what corresponds my needs. Again, it can be easily extended, if needed, and then the corresponding updates must be done in the Jinja2 templates (see later in the “roles”). Below you can see the tree for “vars” and example of the variables for Cisco IOS XR based PE XR1:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
+--service_provider_fabric
   +--ansible
      +--vars
         +--EOS1_underlay.yml
         +--EOS2_underlay.yml
         +--README.md
         +--SR1_underlay.yml
         +--XR1_underlay.yml


$ cat vars/XR1_underlay.yml
---
interfaces:
    - name: GigabitEthernet0/0/0/0
      enabled: true
      ipv4:
          address: 10.22.33.33
          prefix: 24
      ipv6:
          address: fc00::10:22:33:33
          prefix: 112
    - name: GigabitEthernet0/0/0/1
      enabled: true
      ipv4:
          address: 10.1.33.33
          prefix: 24
      ipv6:
          address: fc00::10:1:33:33
          prefix: 112
    - name: Loopback0
      enabled: true
      ipv4:
          address: 10.0.0.33
          prefix: 32
      ipv6:
          address: fc00::10:0:0:33
          prefix: 128
routing:
    - id: ISIS
      tag: CORE
      enabled: true
      af:
          - afi: IPV4
            safi: UNICAST
          - afi: IPV6
            safi: UNICAST
      area_id: "49.0000"
      system_id: "0100.0000.0033"
      is_type: LEVEL_2
      advertise: passive
      interfaces:
          - name: Loopback0
            passive: true
            segment_routing:
                id_type: index
                id_value: 33
          - name: GigabitEthernet0/0/0/0
            passive: false
            type: POINT_TO_POINT
            metric: 100
          - name: GigabitEthernet0/0/0/1
            passive: false
            type: POINT_TO_POINT
    - id: BGP
      tag: BGP
      enabled: true
      asn: 65000
      af:
          - afi: L3VPN_IPV4
            safi: UNICAST
          - afi: L3VPN_IPV6
            safi: UNICAST
          - afi: L2VPN
            safi: EVPN
      multipath: 4
      peer_groups:
          - id: IBGP_PEERS
            peer_asn: 65000
      neighbors:
          - id: 10.0.0.22
            peer_group: IBGP_PEERS
          - id: 10.0.0.44
            peer_group: IBGP_PEERS
mpls:
    - protocol: isis-sr
      enabled: true
...

Looking on the variables, you can imagine, what exactly is templated and automated.

The last piece of information I want to highlight, before talking about the core Ansible playbooks itself, is the OpenConfig YANG modules. I tried to use them wherever possible, but there are certain limitations yet (unfortunately), and as explained earlier not all the vendors not fully support them. OpenConfig YANG modules are used first of all to construct properly Jinja2 templates for JSON files and then to turn JSON files with config into proper XML structure used by NETCONF. These particular OpenConfig YANG modules were extracted via NETCONF directly from the network functions:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
+--service_provider_fabric
   +--ansible
      +--files
         +--openconfig
         |  +--eos
         |  |  +--*.yang (a lot of YANG modules)
         |  |  +--README.md
         |  +--iosxr
         |  |  +--*.yang (a lot of YANG modules)
         |  |  +--README.md
         |  +--README.md
         |  +--sros
         |     +--README.md
         +--README.md

Now we reached the core part, which is the automation itself. It’s proper time, as all the pieces need for it already shown previously. The main playbook is called “service_provider_fabric.yml” and heavily relies on the “roles”:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
+--service_provider_fabric
   +--ansible
      +--roles
      |  +--ovelray_services
      |  |  +--(empty yet)
      |  +--README.md
      |  +--underlay_bgp
      |  |  +--tasks
      |  |  |  +--configuration_loop.yml
      |  |  |  +--main_final.yml
      |  |  |  +--main_temp.yml
      |  |  |  +--main.yml
      |  |  |  +--netconf_xml_modification.yml
      |  |  +--templates
      |  |     +--iosxr-temp-yang.j2
      |  |     +--nexus-temp.j2
      |  |     +--openconfig-network-instance.j2
      |  |     +--sros-temp.j2
      |  +--underlay_bgp
      |     +--tasks
      |     |  +--configuration_loop.yml
      |     |  +--main_final.yml
      |     |  +--main_temp.yml
      |     |  +--main.yml
      |     |  +--netconf_xml_modification.yml
      |     +--templates
      |        +--openconfig-interfaces.j2
      |        +--openconfig-lldp.j2
      |        +--openconfig-network-instance.j2
      |        +--sros-temp.j2
      +--service_provider_fabric.yml

Currently only “underlay_mpls” and “underlay_bgp” roles are ready (though even they can be modified in future to utilize NETCONF for Nokia native YANG modules. Actually, “underlay_mpls” includes not only MPLS (Segment Routing), but also configuration ISIS and LLDP. I won’t go through the explanation of each and every file in these roles, as it will be very time-consuming and boring. Some details were already explained earlier for Ansible and NETCONF, Ansible and OpenConfig. So I’m further developing ideas shared earlier. In a nutshell here happening the following, when we launch “service_provider_fabric.yml” Ansible playbook:

  1. For each and every network function the variables are imported.
  2. Based on the presence or absence of the certain variables, proper templates are used to create JSON configuration files in OpenConfig format for Cisco IOS XR and Arista EOS and CLI for Nokia SR OS.
  3. Using pyang and json2xml JSON configuration files are automatically converted into XML, which can be consumed by ncclient plugin used in Ansible netconf_config module (only for Arista and Cisco).
  4. Some clean up in XML files is performed to remove unnecessary XMLNS and rearrange elements, where the sequence is important, as automatic translation from JSON to XML doesn’t take it into account (only for Arista and Cisco).
  5. Configuration is pushed to the network functions.

This set of activates is valid both for “underlay_mpls” and “underlay_bgp” roles. In a couple of minutes, you will see the details of the execution of the playbook with some verifications.

But before, there is one more helper tool called “yang_extractor_config.yml”. It used to retrieve the configuration in YANG format from the particular network function, which super useful upon the creation of the automation, as even OpenConfig implementation by different vendors varies into details, where typically the devil is hidden as well. The structure is the following:


1
2
3
4
5
6
7
8
9
10
+--service_provider_fabric
   +--ansible
      +--output
      |  +--*.json (could be some files, but not necessary)
      |  +--README.md
      +--roles
      |  +--yang_extractor_config
      |     +--tasks
      |        +--main.yml
      +--yang_extractor_config.yml

Upon execution of Ansible playbook “yang_extractor_config.yml” the following is happening:

  1. Depending on provided tag, configuration in certain OpenConfig YANG module is extracted from the network function
  2. Then it is stored in the “output” folder in JSON format for further evaluation to modify the Jinja2 templates used in the configuration part

The high level explanation is done and we can go to testing and verification of our Service Provider Fabric automation using Ansible.

Automated configuration of underlay infrastructure for Arista, Cisco, Nokia: ISIS and Segment Routing

The initial state of our Service Provider Fabric is just restarted devices with the initial configuration applied from “sp_XXX_initial.conf” files, where XXX is hostname. Let’s briefly check from the management host that all the network functions are reachable:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
$ ping EOS1 -c 1
PING EOS1 (192.168.141.71) 56(84) bytes of data.
64 bytes from EOS1 (192.168.141.71): icmp_seq=1 ttl=64 time=0.265 ms
--- EOS1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.265/0.265/0.265/0.000 ms


$ ping EOS2 -c 1
PING EOS2 (192.168.141.72) 56(84) bytes of data.
64 bytes from EOS2 (192.168.141.72): icmp_seq=1 ttl=64 time=0.490 ms
--- EOS2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.490/0.490/0.490/0.000 ms


$ ping SR1 -c 1
PING SR1 (192.168.1.101) 56(84) bytes of data.
64 bytes from SR1 (192.168.1.101): icmp_seq=1 ttl=64 time=2.21 ms
--- SR1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.212/2.212/2.212/0.000 ms


$ ping XR1 -c 1
PING XR1 (192.168.141.51) 56(84) bytes of data.
64 bytes from XR1 (192.168.141.51): icmp_seq=1 ttl=255 time=0.596 ms

--- XR1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.596/0.596/0.596/0.000 ms

Before we launch the automation, let’s verify the state of the interfaces and MPLS data plane at Arista EOS:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
EOS1#show ip interface brief
Interface              IP Address         Status     Protocol         MTU
Management1            192.168.141.71/24  up         up              1500


EOS1#show mpls lfib route
! Mpls routing is not enabled
! Neither IP nor IPv6 routing is enabled
MPLS forwarding table (Label [metric] Vias) - 0 routes
MPLS next-hop resolution allow default route: False
Via Type Codes:
          M - Mpls Via, P - Pseudowire Via,
          I - IP Lookup Via, V - Vlan Via,
          VA - EVPN Vlan Aware Via, ES - EVPN Ethernet Segment Via,
          VF - EVPN Vlan Flood Via, AF - EVPN Vlan Aware Flood Via,
          NG - Nexthop Group Via
Source Codes:
          S - Static MPLS Route, B2 - BGP L2 EVPN,
          B3 - BGP L3 VPN, R - RSVP,
          P - Pseudowire, L - LDP,
          IP - IS-IS SR Prefix Segment, IA - IS-IS SR Adjacency Segment,
          IL - IS-IS SR Segment to LDP, LI - LDP to IS-IS SR Segment,
          BL - BGP LU, ST - SR TE Policy,
          DE - Debug LFIB

Same commands for Cisco IOS XR and Nokia SR OS:


1
2
3
4
5
Cisco IOS XR# show ipv4 int brief
Nokia SR OS# show router interface

Cisco IOS XR# show mpls forwarding
Nokia SR OS# show router tunnel-table

More information on multi-vendor verification was shown earlier.

Let’s execute Ansible playbook “service_provider_fabric.yml” with tag “underlay_mpls” to build the ISIS/LLDP/Segment Routing infrastructure (don’t forget to define Ansible hosts):


1
2
3
4
5
6
7
8
9
10
11
12
13
$ ansible-playbook service_provider_fabric.yml -i hosts --tags=underlay_mpls

PLAY [fabric] **********************************************************************************************************************************************************

!
! THE OUTPUT IS OMITTED
!

PLAY RECAP *************************************************************************************************************************************************************
EOS1                       : ok=31   changed=20   unreachable=0    failed=0  
EOS2                       : ok=31   changed=19   unreachable=0    failed=0  
SR1                        : ok=7    changed=2    unreachable=0    failed=0  
XR1                        : ok=37   changed=23   unreachable=0    failed=0

All the tasks were performed successfully, meaning our automation and templates contains no error. Let’s check the impact on the service provider fabric using the same verification on Cisco IOS XR running PE XR1:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
RP/0/0/CPU0:XR1#show ipv4 int brief
Fri Jan  4 23:41:45.532 UTC
Interface                      IP-Address      Status          Protocol Vrf-Name
Loopback0                      10.0.0.33       Up              Up       default
MgmtEth0/0/CPU0/0              192.168.141.51  Up              Up       mgmt
GigabitEthernet0/0/0/0         10.22.33.33     Up              Up       default
GigabitEthernet0/0/0/1         10.1.33.33      Up              Up       default
GigabitEthernet0/0/0/2         unassigned      Shutdown        Down     default


RP/0/0/CPU0:XR1#show ipv6 int brief
Fri Jan  4 23:41:53.091 UTC
Loopback0              [Up/Up]
    fe80::51c:d2ff:fefb:9d98
    fc00::10:0:0:33
GigabitEthernet0/0/0/0 [Up/Up]
    fe80::250:56ff:fe26:edcc
    fc00::10:22:33:33
GigabitEthernet0/0/0/1 [Up/Up]
    fe80::250:56ff:fe39:a937
    fc00::10:1:33:33
GigabitEthernet0/0/0/2 [Shutdown/Down]
    Unassigned


RP/0/0/CPU0:XR1#show mpls forwarding
Fri Jan  4 23:41:59.771 UTC
Local  Outgoing    Prefix             Outgoing     Next Hop        Bytes
Label  Label       or ID              Interface                    Switched
------ ----------- ------------------ ------------ --------------- ------------
16001  Pop         SR Pfx (idx 1)     Gi0/0/0/1    10.1.33.1       0
16022  900022      SR Pfx (idx 22)    Gi0/0/0/1    10.1.33.1       0
16044  900044      SR Pfx (idx 44)    Gi0/0/0/1    10.1.33.1       0
24000  Pop         SR Adj (idx 1)     Gi0/0/0/0    10.22.33.22     0
24001  Pop         SR Adj (idx 3)     Gi0/0/0/0    10.22.33.22     0
24002  Pop         SR Adj (idx 1)     Gi0/0/0/0    fe80::20c:29ff:fe48:e05d   \
                                                                   0
24003  Pop         SR Adj (idx 3)     Gi0/0/0/0    fe80::20c:29ff:fe48:e05d   \
                                                                   0
24004  Pop         SR Adj (idx 1)     Gi0/0/0/1    10.1.33.1       0
24005  Pop         SR Adj (idx 3)     Gi0/0/0/1    10.1.33.1       0
24006  Pop         SR Adj (idx 1)     Gi0/0/0/1    fe80::250:56ff:fed8:fdf3   \
                                                                   0
24007  Pop         SR Adj (idx 3)     Gi0/0/0/1    fe80::250:56ff:fed8:fdf3   \
                                                                   0

You see that interfaces are up and running, the proper IPv4 and IPv6 addresses are assigned and in MPLS LFIB there labels for all the SIDs we have in our network, which ultimately means ISIS is OK.

As the routing and MPLS part is working properly, it means that automation is correct. Generally, the logic is so, that you can add new interfaces in the list or changes the provided parameters, and the network functions will be properly configured, as Jinja2 templates in this case are quite flexible. LLDP isn’t checked, but you can verify it on your own, that it’s working.

Automated configuration of underlay infrastructure for Arista, Cisco, Nokia: MP-BGP

The next step in our automation journey is automation of BGP configuration. To get it working we just need to change the tag we use to launch “service_provider_fabric.yml” Ansible playbook to “underlay_bgp” value:


1
2
3
4
5
6
7
8
9
10
11
12
13
[aaa@sand7 ansible]$ ansible-playbook service_provider_fabric.yml -i hosts --tags=underlay_mpls --limit=SR1

PLAY [fabric] **********************************************************************************************************************************************************

!
! THE OUTPUT IS OMITTED
!

PLAY RECAP *************************************************************************************************************************************************************
EOS1                       : ok=7    changed=2    unreachable=0    failed=0  
EOS2                       : ok=7    changed=2    unreachable=0    failed=0  
SR1                        : ok=7    changed=2    unreachable=0    failed=0  
XR1                        : ok=18   changed=12   unreachable=0    failed=0

After the playbook is executed, let’s verify the BGP status at Nokia SR OS based PE SR1:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
A:admin@SR1# show router bgp summary
===============================================================================
 BGP Router ID:10.0.0.44        AS:65000       Local AS:65000      
===============================================================================
BGP Admin State         : Up          BGP Oper State              : Up
Total Peer Groups       : 1           Total Peers                 : 2        
Total VPN Peer Groups   : 0           Total VPN Peers             : 0        
Total BGP Paths         : 19          Total Path Memory           : 6384      
 
Total IPv4 Remote Rts   : 0           Total IPv4 Rem. Active Rts  : 0        
Total IPv6 Remote Rts   : 0           Total IPv6 Rem. Active Rts  : 0        
Total IPv4 Backup Rts   : 0           Total IPv6 Backup Rts       : 0        
Total LblIpv4 Rem Rts   : 0           Total LblIpv4 Rem. Act Rts  : 0        
Total LblIpv6 Rem Rts   : 0           Total LblIpv6 Rem. Act Rts  : 0        
Total LblIpv4 Bkp Rts   : 0           Total LblIpv6 Bkp Rts       : 0          
Total Supressed Rts     : 0           Total Hist. Rts             : 0        
Total Decay Rts         : 0        
 
Total VPN-IPv4 Rem. Rts : 0           Total VPN-IPv4 Rem. Act. Rts: 0        
Total VPN-IPv6 Rem. Rts : 0           Total VPN-IPv6 Rem. Act. Rts: 0        
Total VPN-IPv4 Bkup Rts : 0           Total VPN-IPv6 Bkup Rts     : 0        
Total VPN Local Rts     : 0           Total VPN Supp. Rts         : 0        
Total VPN Hist. Rts     : 0           Total VPN Decay Rts         : 0        
 
Total MVPN-IPv4 Rem Rts : 0           Total MVPN-IPv4 Rem Act Rts : 0        
Total MVPN-IPv6 Rem Rts : 0           Total MVPN-IPv6 Rem Act Rts : 0        
Total MDT-SAFI Rem Rts  : 0           Total MDT-SAFI Rem Act Rts  : 0        
Total McIPv4 Remote Rts : 0           Total McIPv4 Rem. Active Rts: 0        
Total McIPv6 Remote Rts : 0           Total McIPv6 Rem. Active Rts: 0        
Total McVpnIPv4 Rem Rts : 0           Total McVpnIPv4 Rem Act Rts : 0        
Total McVpnIPv6 Rem Rts : 0           Total McVpnIPv6 Rem Act Rts : 0        
 
Total EVPN Rem Rts      : 0           Total EVPN Rem Act Rts      : 0        
Total L2-VPN Rem. Rts   : 0           Total L2VPN Rem. Act. Rts   : 0        
Total MSPW Rem Rts      : 0           Total MSPW Rem Act Rts      : 0        
Total RouteTgt Rem Rts  : 0           Total RouteTgt Rem Act Rts  : 0        
Total FlowIpv4 Rem Rts  : 0           Total FlowIpv4 Rem Act Rts  : 0        
Total FlowIpv6 Rem Rts  : 0           Total FlowIpv6 Rem Act Rts  : 0        
Total Link State Rem Rts: 0           Total Link State Rem Act Rts: 0        
Total SrPlcyIpv4 Rem Rts: 0           Total SrPlcyIpv4 Rem Act Rts: 0        

===============================================================================
BGP Summary
===============================================================================
Legend : D - Dynamic Neighbor
===============================================================================
Neighbor
Description
                   AS PktRcvd InQ  Up/Down   State|Rcv/Act/Sent (Addr Family)
                      PktSent OutQ
-------------------------------------------------------------------------------
10.0.0.22
               65000        7    0 00h01m54s 0/0/0 (VpnIPv4)
                            6    0           0/0/0 (VpnIPv6)
                                             0/0/0 (Evpn)
10.0.0.33
               65000        8    0 00h02m02s 0/0/0 (VpnIPv4)
                            7    0           0/0/0 (VpnIPv6)
                                             0/0/0 (Evpn)
-------------------------------------------------------------------------------

To verify the same status of the peering BGP for different address families on Arista EOS or Cisco IOS XR you need a bunch of commands


1
2
3
4
5
6
7
8
9
10
Arista EOS# show bgp vpn-ipv4 summary
Cisco IOS XR# show bgp vpnv4 unicast summary


Arista EOS# show bgp vpn-ipv6 summary
Cisco IOS XR# show bgp vpnv6 unicast summary


Arista EOS# show bgp evpn summary
Cisco IOS XR# show bgp l2vpn evpn summary

As we yet haven’t created any services yet, we don’t advertise any routes, so we can’t verify anything more. But at this point we have fully created underlay configuration of our Service Provider Fabric. We can scale it further in terms of number of network functions, interfaces and BGP peering.

Using helper tool to retrieve the configuration in YANG module

As different vendors have different implementations even of the same OpenConfig YANG modules, it’s important to see those deviations not only using pyang, but also directly in the output from the device (reverse engineering, you know). To do that I added to the pack the corresponding tool, which shows such principle and is working with those models, which are used now in the creation of service provider function. Here how the tool “yang_extractor_config.yml” works:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
$ ansible-playbook yang_extractor_config.yml -i hosts --limit=EOS2,XR1 --tags=oc-if

PLAY [fabric] **********************************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************************
ok: [EOS2]
ok: [XR1]

TASK [yang_extractor_config : COLLECT OC INTERFACES CONFIG AND STATE] **************************************************************************************************
ok: [EOS2]
ok: [XR1]

TASK [yang_extractor_config : SAVE OC NETWORK INSTANCES CONFIG AND STATE] **********************************************************************************************
changed: [EOS2]
changed: [XR1]

PLAY RECAP *************************************************************************************************************************************************************
EOS2                       : ok=3    changed=1    unreachable=0    failed=0  
XR1                        : ok=3    changed=1    unreachable=0    failed=0  


$ more output/oc-if-EOS2.json
{
    "interfaces": {
        "interface": [
            {
                "aggregation": {
                    "config": {
                        "fallback-timeout": "90",
                        "min-links": "0"
                    }
                },
                "config": {
                    "description": "",
                    "enabled": "true",
                    "load-interval": "300",
                    "loopback-mode": "false",
                    "mtu": "0",
                    "name": "Ethernet4",
                    "tpid": "TPID_0X8100",
                    "type": "ethernetCsmacd"
                },
                "ethernet": {
                    "config": {
                        "fec-encoding": {
                            "disabled": "false",
                            "fire-code": "false",
                            "reed-solomon": "false",
                            "reed-solomon544": "false"
                        },
                        "mac-address": "00:00:00:00:00:00",
                        "port-speed": "SPEED_UNKNOWN",
                        "sfp-1000base-t": "false"
                    },
                    "state": {
                        "auto-negotiate": "false",
! FURTHER OUTPUT IS OMITTED

$ more output/oc-if-XR1.json
{
    "interfaces": {
        "interface": [
            {
                "config": {
                    "enabled": "true",
                    "name": "Loopback0",
                    "type": "idx:softwareLoopback"
                },
                "name": "Loopback0",
                "state": {
                    "admin-status": "UP",
                    "enabled": "true",
                    "ifindex": "7",
                    "last-change": "3737",
                    "mtu": "1500",
                    "name": "Loopback0",
                    "oper-status": "UP",
                    "type": "idx:softwareLoopback"
                },
                "subinterfaces": {
                    "subinterface": {
                        "index": "0",
                        "ipv4": {
                            "addresses": {
                                "address": {
                                    "config": {
                                        "ip": "10.0.0.33",
                                        "prefix-length": "32"
                                    },
                                    "ip": "10.0.0.33",
                                    "state": {
                                        "ip": "10.0.0.33",
                                        "origin": "STATIC",
! FURTHER OUTPUT IS OMITTED

Frankly speaking, it’s impossible to overestimate the help of this tool during the creation of all the automations as there were so many mismatches, which must be taken into account into template. If you take a look on any template for OpenConfig, you can see that it’s quite complicated.

The final configurations as well as topology files you can find on my GitHub.

Lessons learned

The main lesson I have taken that there is must be balance between flexibility and hard-coding. On the one hand, all the templates must be flexible enough to anticipate all possible parameters. On the other hand, not all the parameters are really necessary to be configured, so we need to define what we really will changes and let the rest parameters be hard-coded to make them the same across different vendors.

There is no answer to this question, or at least I haven’t found any, as it will always depend on use case. The good thing that we can easily extend the automation to cover missing parameters.

Conclusion

Service provider networks are turning into fabrics now. And fabrics by nature is highly scalable and highly automated platform. In this article today I have shown how to automate Service Provider Fabric using Ansible and all available configuration ways, like OpenConfig YANG, native YANG and CLI. It’s important to understand, which capability which option provide, so that we find the proper mix. It’s also important to find proper balance between hard-coding of certain parameters and variables and to reflect those balance in templates. What is also important is that you have seen the real example of the network abstraction, where the network function representation (file with vars) is fully vendor independent, and it’s only the templates and playbooks, which works to render that abstraction in particular vendor configuration.

Take care and good bye!

Support us






P.S.

If you have further questions or you need help with your networks, I’m happy to assist you, just send me message. Also don’t forget to share the article on your social media, if you like it.

BR,

Anton Karneliuk

Exit mobile version