Automation as a network engineer
I decided to start writing as I work my way through learning Ansible, Python, Paramiko, and Nornir. I'm a Network Engineer looking to start automating network functions and changes. Not so much Software Defined Networking, but more like scripting for mundane or repetitive tasks throughout networks. Sit back and follow along through my pitfalls and triumphs as I attempt to learn Ansible and other network automation tools. Hopefully you can learn from some of my trials and save some time yourself. I will be doing follow-up posts on things I've learned in my automation journey and some of the troubles I run into.
Why learn automation, you ask? Well, it's the future, for one. DevOps is one of the next things coming down the pipe in the world of IT. If you don't get on the bandwagon, you'll be left behind. Companies, now more than ever, are looking for ways to save money and do more with less. COVID-19 has really amplified the need for automation in many companies. With layoffs in the hundreds or thousands, more and more work is left to fewer employees. Ask me how I know :)
Automation is the ticket to doing more with less. If there's a task you have to do frequently, and it's always repeatable with the same steps, that's the first target for automation. Automation doesn't even have to be as complex as a full-blown Ansible playbook or Python script. It can be as simple as a cron job or scheduled task to reboot a device once a week so it doesn't crash. That's where most IT folks start with automation. Always start small; don't try to boil the ocean on your first go. Again, ask me how I know!
Where to start?
Where should I start with automation, you ask? The answer, in my opinion, is that it depends. If you have any sort of programming background, something like Python might be more familiar to you. Overall, Python is much more powerful and flexible than tools like Ansible. Ansible is built using Python, but you are somewhat confined to the framework developed thus far, although with Python experience you can write your own modules to interact with more devices. As a Network Engineer, Python is where I started. With a slight programming background in C++, Java, and PowerShell scripting, I had a bit of a head start learning it. Libraries like Netmiko make direct interaction with network gear possible, and libraries like TextFSM give you the ability to parse the output returned from the device when you run commands. Raw Python gives you a lot of power to alter configurations on the fly and respond to real-time changes in the environment. If/else type conditions are much more difficult in Ansible, and your code gets nasty quite quickly.
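To make the parse-then-decide idea concrete, here is a minimal sketch. It is an illustration only, not one of the GitHub examples. The sample `show interfaces status` text, the regular expression (standing in for a TextFSM template), the `plan_commands` and `push` helper names, and the "shut down unused access ports" policy are all assumptions made up for this example. Only `ConnectHandler`, `send_config_set`, and `disconnect` are real Netmiko calls, and they are imported lazily so the planning logic can be tried without a switch.

```python
import re

# Hypothetical sample of `show interfaces status` output (Cisco IOS style).
SAMPLE_OUTPUT = """\
Port      Name               Status       Vlan       Duplex  Speed Type
Gi1/0/1   user-desk          connected    10         a-full  a-1000 10/100/1000BaseTX
Gi1/0/2                      notconnect   10         auto    auto  10/100/1000BaseTX
Gi1/0/3   uplink             connected    trunk      a-full  a-1000 10/100/1000BaseTX
"""

# A simple regex stands in for a TextFSM template here (assumption for brevity).
PORT_LINE = re.compile(
    r"^(?P<port>\S+)[ \t]+(?P<name>.*?)[ \t]+"
    r"(?P<status>connected|notconnect|disabled|err-disabled)[ \t]+(?P<vlan>\S+)",
    re.MULTILINE,
)


def plan_commands(show_output):
    """Return config lines based on what the switch reports right now."""
    commands = []
    for match in PORT_LINE.finditer(show_output):
        port, status, vlan = match["port"], match["status"], match["vlan"]
        if vlan == "trunk":
            continue  # never touch uplinks
        elif status == "notconnect":
            # Example policy only: label and shut down unused access ports.
            commands += [f"interface {port}", "description UNUSED", "shutdown"]
    return commands


def push(device, commands):
    """Send the planned commands with Netmiko (needs `pip install netmiko`)."""
    from netmiko import ConnectHandler  # lazy import: only needed on a real run

    conn = ConnectHandler(**device)  # device: dict with device_type, host, username, password
    try:
        return conn.send_config_set(commands)
    finally:
        conn.disconnect()


if __name__ == "__main__":
    print(plan_commands(SAMPLE_OUTPUT))
```

Keeping the decision logic in a plain function that takes text and returns commands means it can be checked against saved output before anything is pushed to a live device.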
The other main factor in which tool you should learn first is what you are trying to accomplish with automation. If you have a fairly straightforward, templated network design, Ansible might be better suited to your needs. Ansible is all about defining the state a device should be in when it's done, and the Python in the back end handles the rest for you. You don't have to worry as much about which commands to run and in which order. So one could argue that Ansible is simpler, but it's really just a different way of approaching things. Most networks I work in are not "cookie-cutter". I have to be able to adapt to port configurations that change from switch to switch by looking at things like CDP in real time. This is much more difficult to do in Ansible. Check out a couple of examples on my GitHub:
As you can see, the Netmiko example is much more straightforward. This job had to look at each switch and check spanning-tree for a specific VLAN, the one used as the native VLAN on trunk links to other switches, so that the VLAN could be added to the allowed VLAN list on those links. Along with that, it created the Layer 2 VLAN object on the switch and moved all ports in a given VLAN to the new VLAN. The job was run against approximately 50 switches in a batch and took about 20 minutes, mostly due to resource limitations of my Ansible server. This work would have taken hours, if not a couple of days, to get done by hand, not to mention all the downtime for users. Using automation, I was able to cut clients over to a new network/subnet over a lunch break, since all the clients on that VLAN were using DHCP.
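The per-switch logic described above can be sketched roughly as follows. This is not the script from GitHub: it assumes the access-port and trunk information has already been gathered and parsed into plain dictionaries, and the function name, the data shapes, and the sample VLAN numbers are all hypothetical. It also assumes each trunk has an explicit allowed list rather than allowing every VLAN.

```python
def build_cutover_commands(access_ports, trunk_ports, old_vlan, new_vlan, vlan_name):
    """Build IOS-style config lines for moving one switch to a new VLAN.

    access_ports: dict of port name -> current access VLAN (int)
    trunk_ports:  dict of port name -> set of VLANs currently allowed (ints)
    """
    # 1. Create the Layer 2 VLAN object.
    commands = [f"vlan {new_vlan}", f"name {vlan_name}"]

    # 2. Make sure every trunk to another switch carries the new VLAN.
    for port, allowed in sorted(trunk_ports.items()):
        if new_vlan not in allowed:
            commands += [
                f"interface {port}",
                f"switchport trunk allowed vlan add {new_vlan}",
            ]

    # 3. Move every access port that sits in the old VLAN.
    for port, vlan in sorted(access_ports.items()):
        if vlan == old_vlan:
            commands += [f"interface {port}", f"switchport access vlan {new_vlan}"]

    return commands


if __name__ == "__main__":
    # Made-up sample data for a single switch.
    access = {"Gi1/0/1": 10, "Gi1/0/2": 20, "Gi1/0/3": 10}
    trunks = {"Gi1/0/48": {1, 10, 99}, "Gi1/0/47": {1, 10, 30, 99}}
    for line in build_cutover_commands(access, trunks, 10, 30, "USERS-NEW"):
        print(line)
```

The returned list is in the shape Netmiko's `send_config_set` accepts, so a batch run could loop over the switches, gather the two dictionaries for each one, and push the result.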
In closing, I hope you start your own automation journey, and with any luck I can help you out of some tight spots with the tips and information I uncover along mine. For some reason I seem to come across many interesting problems to solve with automation, and I run into some odd issues as a result. My plan is to share these findings here so that someone else coming across the same issue can hopefully get some answers.