Hello my friend,
In the previous blogpost we shared some thoughts on how you can parse a CSV file and, more generally, how to work with external files. But the beauty of programming languages, Python included, is that there is always more than one way of doing things. The more you learn, the more ways open up.
Automate all the things
The rise of 5G in the Service Provider world, microservices in Data Centres and mobility in Enterprise networks significantly changes expectations about the way networks operate and the pace at which changes are implemented. It is impossible to meet those expectations without automation.
At our network automation training, either self-paced or instructor-led, you will learn the leading technologies, protocols, and tools used to manage the busiest networks worldwide, such as Google data centres. Once you master these skills, you will be able to automate a network of any scale. You will see the opportunities and you will exploit them.
Secret words: NETCONF, REST API, gRPC, JSON, XML, Protocol buffers, SSH, OpenConfig, Python, Ansible, Linux, Docker; and many other wonderful tools and techniques are waiting for you in our training!
What are we going to do today?
Today we speak about parsing the CSV again. It is not that we like the CSV that much; however, it is better to compare the technologies using the same basis. As such, today you will learn how to:
- Import standard Python modules and utilise their built-in variables, functions and classes.
- Parse the CSV files using the module csv.
- Provide the variables as the arguments to the Python script using the module sys.
The lab setup hasn’t changed since the very beginning, so refer to it if you need guidance on how to get Python 3.8 on Linux.
You can get the code of the previous labs in our GitHub repo.
Why does it matter?
Although it is possible to create all the code and algorithms yourself, you can save a huge amount of time by reusing libraries that already exist. If you don’t like how they work, or you don’t trust certain methods implemented there, it is OK to write your own. However, for infrastructure libraries, such as a module that works with SSH to configure a network element or one that manages the network via gRPC, you are better off using the existing ones, because they are created by a professional community of programmers and come with the relevant tests. Another reason to prefer existing libraries is performance. By default Python does not have the best performance. However, a lot of the libraries Python relies on are written in C or built with Cython, which makes them faster. Writing such libraries yourself would require you to understand C, which is far more complicated than ordinary Python.
All in all, you create Python programs to solve certain tasks. So why not focus on your tasks and offload the low-level work to modules created for exactly that?
How are we doing that?
One way to categorise modules is by whether they are built-in or external:
- Built-in means that the module comes as a part of the standard Python distribution and does not need to be installed. There are numerous built-in modules. In today’s blogpost you will learn about some of them.
- External means that the module has to be installed via the package installer for Python (pip). You will learn about such modules in the next blogpost.
You can find the list of the standard Python modules on the official webpage.
The algorithm of working with any module is quite simple:
- Read the documentation and understand which artefacts (variables, functions, classes, etc) you need to import
- Import them in your code
- Use them
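The three steps above can be tried directly in the interpreter. The snippet below is only a small sketch: it uses the built-in csv module as the example, and the two-line table it parses is made up for illustration.

```python
import csv

# Step 1: find out what the module offers. dir() lists its names and
# __doc__ holds the module's own description (help(csv) shows much more).
public_names = [name for name in dir(csv) if not name.startswith('_')]
print(public_names)
print(csv.__doc__)

# Step 2: import only the artefact you decided to use.
from csv import DictReader

# Step 3: use it. DictReader accepts any iterable of text lines,
# so a plain list is enough for a quick experiment.
rows = list(DictReader(['Id,Name', '1,DE-DB-1']))
print(rows)
```

Reading the official documentation remains the main source of truth; dir() and help() are just a quick way to look around.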
In today’s exercise we will implement the following scenario:
- The name of the CSV file to parse shall be provided as an argument when the Python script is launched. The sys module helps here.
- If the file is not provided, the application should stop the execution with an error stating that no file path was provided. The same sys module helps here as well.
- If the file has a ‘.csv’ extension, it shall be opened and parsed using the specific csv module.
- If the file has an extension other than ‘.csv’, which is verified using the re module, the script shall stop the execution with an error as well.
#1. Importing modules
First things first: you need to import the Python modules (sometimes they are also called libraries) in your script. There are multiple methods to do that, and it is difficult to show all of them; however, these three are the most widely used.
The first one is just to use the keyword import followed by the name of the module:
import module_name
This approach imports all the artefacts from the module and retains the name prefix of the module. It means that to call any function from the imported module you should call it as module_name.function(). On the one hand, the commands will be longer; on the other hand, you will be able to use any artefact available in the module, and they won’t overlap in terms of names with your own variables or functions.
The second approach is to import only a specific artefact using from module_name import artefact syntax:
from module_name import function
This approach is more selective, as it gives you explicitly what you need. You also don’t retain the module name here, so you call the function simply as function(). However, if you need any other artefact from the imported module, you will need to add it explicitly again.
The third approach is to import all the artefacts from the imported module without retaining the module’s namespace using from module_name import * syntax:
from module_name import *
This lets you use all of the artefacts from the module without the need to prefix them with the module name; so, simply function(). However, you need to check in advance that the names of the variables or functions you have defined in your script don’t overlap with the artefacts in the module you import.
The most widely used and the safest are the first and the second approaches.
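To make the difference between the three approaches more tangible, here is a small sketch. It uses the standard modules math and os.path purely as stand-ins (they are not part of today's script), and the local variable pi is a made-up example of a name clash.

```python
# Style 1: import the whole module and keep its namespace.
import math

# Style 2: import one specific artefact; it is called without a prefix.
from os.path import basename

# A local name that happens to match a name inside the math module.
pi = 3

# With style 1 there is no clash: the module's value stays behind math.pi.
area_exact = math.pi * 2 ** 2
area_rough = pi * 2 ** 2

# Style 3: the star import copies the module's public names into this
# script and silently replaces the local pi defined above.
from math import *

pi_after_star_import = pi

print(basename('/tmp/some_data.csv'))    # some_data.csv
print(area_rough)                         # 12
print(pi_after_star_import == math.pi)    # True
```

The last line is the reason the star import needs care: nothing warns you that your own pi is gone.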
To show you these different approaches, we will use all three in this exercise:
$ cat csv_reader.py
#!/usr/local/bin/python3.8
# Modules
import sys
from re import *
from csv import DictReader
Here are some explanations:
- The module sys gives access to some variables and functions tightly bundled with the interpreter.
- The module re contains all the functions and variables to work with regular expressions in Python.
- The module csv is a purpose-built module to ease the work with CSV tables in Python.
Let’s take a closer look at how those modules can help us to implement the described algorithm.
#2. Providing arguments to the Python script (sys module)
The first external artefact we will use is the variable argv from the sys module. This variable is a list, which contains all the arguments provided at the launch of the Python script, including the script’s name itself as the first element with index 0. Take a look at this basic example:
$ cat csv_reader.py
#!/usr/local/bin/python3.8
# Modules
import sys
from re import *
from csv import DictReader
# Body
print(sys.argv)
And the result of its execution:
$ ./csv_reader.py
['./csv_reader.py']
The next argument you provide (the separator between the arguments is space) will be added as a next element in this list:
$ ./csv_reader.py some_data.csv
['./csv_reader.py', 'some_data.csv']
Using the len() function, you can check the number of arguments provided to the script and, using an if-conditional, decide how to proceed further:
$ cat csv_reader.py
#!/usr/local/bin/python3.8
# Modules
import sys
from re import *
from csv import DictReader
# Body
if len(sys.argv) < 2:
    pass
elif len(sys.argv) > 2:
    pass
else:
    pass
In this logical tree we define three possible outcomes:
- If you have provided fewer than 1 argument (fewer than 2 elements in sys.argv, counting the script’s name itself), the script terminates its execution with a custom error.
- The same applies if you have provided more than 1 argument. However, in this case the custom error should be different; hence, another branch.
- In any other case, meaning you have provided exactly 1 argument, the script shall continue its execution.
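If you want to play with this decision tree without launching the script from the shell each time, one option is to wrap it into a function and feed it hand-made lists. This is only a sketch: check_arguments is a hypothetical helper, and the messages are placeholders, not the ones used later in the script.

```python
def check_arguments(argv):
    """Return an error text for a wrong argument count, or None if it is fine.

    Hypothetical helper for experimenting; argv has the same layout as
    sys.argv, with the script's name at index 0.
    """
    if len(argv) < 2:
        return 'missing file path'
    elif len(argv) > 2:
        return 'too many arguments'
    else:
        return None


# Simulated argument lists instead of the real sys.argv.
print(check_arguments(['./csv_reader.py']))                          # missing file path
print(check_arguments(['./csv_reader.py', 'some_data.csv', 'abc']))  # too many arguments
print(check_arguments(['./csv_reader.py', 'some_data.csv']))         # None
```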
In our network automation training we share the real-life workflows deployed in Python, which you can start using in your networks.
#3. Stopping the script execution with a customised error message (sys module)
The module sys has a specific function called exit(), which allows you to terminate the execution of the script with a custom text string describing the error or with a numeric code. If text is provided, it is printed to stderr and the exit code is 1. So, you now need to replace the pass in the first two branches of the if-conditional:
$ cat CEX/12/csv_reader.py
#!/usr/local/bin/python3.8
# Modules
import sys
from re import *
from csv import DictReader
# Body
if len(sys.argv) < 2:
    sys.exit('You haven\'t provided the path to the file.')
elif len(sys.argv) > 2:
    sys.exit('You have provied too many arguments.')
else:
    pass
As in the previous example, since we have imported sys using the keyword import, we need to call its elements using the prefix: sys.argv, sys.exit(). Let’s verify how the script works:
$ ./csv_reader.py some_data.csv abc
You have provied too many arguments.
$ ./csv_reader.py
You haven't provided the path to the file.
$ ./csv_reader.py some_data.csv
$
As we haven’t implemented any workflow for the correct scenario, you don’t see any output there. However, in the two scenarios where you provide the wrong number of arguments, you will see the corresponding message notifying the user about the error.
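If you are curious what the shell actually receives from sys.exit(), the following sketch may help. It starts a tiny child Python process with the standard subprocess module (not used anywhere else in this lab) and inspects the exit code and the error stream; the message text is just an example.

```python
import subprocess
import sys

# Run a one-line child script that exits with a text message.
child = subprocess.run(
    [sys.executable, '-c', "import sys; sys.exit('No file path provided.')"],
    capture_output=True,
    text=True,
)
print(child.returncode)        # 1
print(child.stderr.strip())    # No file path provided.

# Inside a running script sys.exit() raises the SystemExit exception,
# which carries the value that was passed to it.
try:
    sys.exit(3)
except SystemExit as error:
    caught_code = error.code
print(caught_code)             # 3
```

The text goes to the error stream rather than to the normal output, which is handy when the output of your script is piped into another tool.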
#4. Verifying the string’s content (re module)
The next step is to verify the extension of the file. Per the scenario, the file should have a ‘.csv’ extension. The module re has a function match(), which requires two arguments:
- The pattern you are looking for following the regular expression syntax.
- The string, where you are looking for the pattern.
If the pattern matches at the beginning of the string, the function returns a match object, which is treated as True in an if-conditional; if not, it returns None, which is treated as False.
Applying that to our scenario, we need to use sys.exit() in case the extension is different from ‘.csv’. Otherwise, the Python script continues its execution.
$ cat csv_reader.py
#!/usr/local/bin/python3.8
# Modules
import sys
from re import *
from csv import DictReader
# Body
if len(sys.argv) < 2:
    sys.exit('You haven\'t provided the path to the file.')
elif len(sys.argv) > 2:
    sys.exit('You have provied too many arguments.')
else:
    # Raw string, so the backslash reaches the regular expression engine as is
    if match(r'.*\.csv$', sys.argv[1]):
        pass
    else:
        sys.exit('The provided file doesn\'t have \'.csv\' extension.')
As we have imported the re module using from … import * syntax, we don’t need to retain the prefix.
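To see what match() really hands back, you can try it on two file names. In the sketch below the pattern is written as a raw string (r'...'), which keeps the backslash intact; the last line shows pathlib as an alternative way to read the extension, which is my addition and not something the script above uses.

```python
import re
from pathlib import Path

# The same pattern as in the script, written as a raw string.
pattern = r'.*\.csv$'

found = re.match(pattern, 'some_data.csv')
missed = re.match(pattern, 'some_data.txt')

print(found)                       # a match object
print(missed)                      # None
print(bool(found), bool(missed))   # True False

# Alternative without regular expressions: pathlib exposes the extension.
print(Path('some_data.csv').suffix == '.csv')   # True
```

For a check as simple as a file extension both ways work; regular expressions start to pay off when the pattern gets more complex.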
#5. Parsing the CSV file (csv module)
The last step in this scenario is to use the purpose-built module csv, specifically its artefact called DictReader, which is a class. This class allows you to create an object that converts the rows of a CSV file into dictionaries, one per row, where each dictionary is a number of key/value pairs and the key names come from the header of the CSV table. Collecting them gives you a list of dictionaries.
To show how it works, you can use the same table you have used in the previous lab:
$ cat some_data.csv
Id,Name,Interface,Speed,Encapsulation,VLAN,IPv4,IPv6
1,DE-DB-1,eth0,1000,none,none,192.168.100.11/24,fc00:192:168:100::B/64
2,DE-CP-1,eth0,10000,dot1q,10,192.168.100.12/24,fc00:192:168:100::C/64
The final code of this exercise will be the following:
$ cat csv_reader.py
#!/usr/local/bin/python3.8
# Modules
import sys
from re import *
from csv import DictReader
# Body
if len(sys.argv) < 2:
    sys.exit('You haven\'t provided the path to the file.')
elif len(sys.argv) > 2:
    sys.exit('You have provied too many arguments.')
else:
    # Raw string, so the backslash reaches the regular expression engine as is
    if match(r'.*\.csv$', sys.argv[1]):
        ld = []
        # newline='' is what the csv documentation recommends for files given to its readers
        with open(sys.argv[1], 'r', newline='') as f:
            reader = DictReader(f)
            for row in reader:
                ld.append(row)
        print(ld)
    else:
        sys.exit('The provided file doesn\'t have \'.csv\' extension.')
In the final new segment you:
- Create an empty Python list.
- Open the file using the context manager with … as ….
- Create an object using the class DictReader from the csv module.
- Using the for-loop, build the list by going over the rows in the CSV table.
- Finally print() the resulting list.
The Python script is ready, and you can verify its operation:
$ ./csv_reader.py some_data.csv
[{'Id': '1', 'Name': 'DE-DB-1', 'Interface': 'eth0', 'Speed': '1000', 'Encapsulation': 'none', 'VLAN': 'none', 'IPv4': '192.168.100.11/24', 'IPv6': 'fc00:192:168:100::B/64'}, {'Id': '2', 'Name': 'DE-CP-1', 'Interface': 'eth0', 'Speed': '10000', 'Encapsulation': 'dot1q', 'VLAN': '10', 'IPv4': '192.168.100.12/24', 'IPv6': 'fc00:192:168:100::C/64'}]
The output is pretty much the same as in the previous blogpost. However, this time we have used a different approach.
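A couple of DictReader details are easy to explore without touching any file. The sketch below keeps a shortened version of the lab table in memory using io.StringIO (my assumption for the sake of a self-contained example) and shows the header names, a shorter way to build the list, and the fact that all values arrive as strings.

```python
import io
from csv import DictReader

# A shortened copy of the lab table, held in memory instead of a file.
csv_text = (
    'Id,Name,Interface,Speed\n'
    '1,DE-DB-1,eth0,1000\n'
    '2,DE-CP-1,eth0,10000\n'
)

reader = DictReader(io.StringIO(csv_text))
print(reader.fieldnames)    # ['Id', 'Name', 'Interface', 'Speed']

# list() consumes the reader in one step, replacing the for-loop with append().
rows = list(reader)
print(rows[1]['Name'])      # DE-CP-1

# Every value is read as a string, so convert explicitly when you need a number.
speeds = [int(row['Speed']) for row in rows]
print(speeds)               # [1000, 10000]
```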
If you prefer video
If you prefer watching videos to reading articles, that is all good. Subscribe to our YouTube channel, where you will find all our latest videos, including previous Code EXpress (CEX) episodes.
What else shall you try?
Learning programming is all about trying and testing. To fully understand what we have covered so far, you can try the following additional scenarios:
- Try different methods to import the modules and find out what works best for you.
- Modify the provided Python script so that it creates a CSV file using some functions from the csv module and writes its content into a file provided in the arguments.
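As a possible starting point for the second scenario, the csv module offers the class DictWriter, the counterpart of DictReader. The sketch below is deliberately incomplete: it writes into an in-memory buffer rather than a file taken from the arguments, the rows are made-up sample data, and lineterminator is set to '\n' only to keep the printed result simple.

```python
import io
from csv import DictWriter

# Sample data in the same shape DictReader produces.
rows = [
    {'Id': '1', 'Name': 'DE-DB-1', 'Interface': 'eth0'},
    {'Id': '2', 'Name': 'DE-CP-1', 'Interface': 'eth0'},
]

# The buffer stands in for a file opened with open(path, 'w', newline='').
buffer = io.StringIO()
writer = DictWriter(buffer, fieldnames=['Id', 'Name', 'Interface'], lineterminator='\n')
writer.writeheader()      # first line: the column names
writer.writerows(rows)    # one line per dictionary

print(buffer.getvalue())
```

Wiring it to sys.argv and a real file is left for you to try.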
Lessons at GitHub
Don’t pass by, come to our GitHub repository and follow us.
Conclusion
All the production applications we create ourselves typically include modules, either built-in or external ones. The reason for that is very simple: reusing modules allows us to focus on solving our objectives, rather than on programming low-level activities.
P.S.
If you have further questions or you need help with your networks, I’m happy to assist you; just send me a message. Also, don’t forget to share the article on your social media if you like it.
BR,
Anton Karneliuk