In a previous article on Site Reliability Engineering (SRE), we discussed the SRE role and why it’s in high demand. At the end of that article, we gave some advice on how to become an SRE. It is not easy, but with enough determination it can be done. This article highlights the skills you need to become an SRE.
What Is SRE?
First, we have to keep in mind what SRE is:
Site Reliability Engineering is the application of software engineering to operational problems. The word ‘reliability’ points to the particular role a Site Reliability Engineer has in an organization and in the software development lifecycle. In organizations that practice SRE, application developers learn how to build reliable services so that the organization's computer systems run correctly, 24/7. Security, stability, and scalability are very important here, because the business wants reliable services.
Site reliability engineers create a bridge between development and operations by applying a software engineering mindset to system administration topics.
This implies an SRE should have both development and operations skills. The Site Reliability Engineer should be proficient in both skill sets, not in just one, as is the case for T-shaped professionals. This demands a broad education. How do we develop these skills?
Become a Pi-Shaped SRE
Pi-shaped (versus T-shaped) professionals are now in high demand. Their depth of knowledge is not limited to a single discipline. They have a range of specialized skill sets that can be applied across the full spectrum of business issues impacted by site reliability.
To ensure the IT systems of an organization work correctly, an SRE needs to know not only about operations, but also about software development to educate their colleagues about software quality and monitoring. So the SRE has to know about operations, software development, and education/coaching.
Education is one thing. Persuading people to change the way they work in order to improve software quality is a very different skill. So an SRE also needs to have powers of persuasion.
Education and persuasion are great soft skills, but what about technical skills?
What technical skills should an aspiring SRE acquire?
Technical Skills of an SRE
An SRE needs to have broad technical skills to ensure an IT system works correctly in production.
- Learn scripting.
Working with a command-line interface (CLI) is a necessary skill for an SRE, because it is where you automate routine tasks. Get skilled with UNIX shell scripting and PowerShell.
- Learn scripting languages like Python to automate processes.
Python is a great scripting language for SRE. It’s not hard to learn, and with Python, you can automate a lot of your processes. Python is also readable, and scripts can easily be shared with your co-workers for education.
- Learn to install Linux distributions.
Install a Linux distribution like Ubuntu or Fedora on your computer or in a virtual machine. Really test yourself here and make mistakes. You will learn from these mistakes.
- Learn to code in different code editors.
As an SRE, you will work with various teams. These teams will use different code editors. Examples are Eclipse, Atom, and Vim. Get used to these and you will blend in well with the teams you work with.
- Learn how to do version control with Git.
Git is an open-source version control system. Learning it will boost your SRE skills and make you a great asset to any DevOps team. Also, show your coding skills on GitHub.
- Learn how to create a website.
- Get experience with a CI/CD pipeline.
Software development is done through a Continuous Integration / Continuous Delivery (CI/CD) pipeline. Know how a CI/CD pipeline works. Know what tooling (e.g., Jenkins) is necessary.
- Learn about software architectures.
Service-Oriented Architecture (SOA) and microservices are modern software architectures every SRE should know. An SRE has to know about the details and the bigger picture.
- Get familiar with different databases.
This applies not only to relational databases but also to NoSQL databases, where the data storage is no longer tabular.
- Gain practical knowledge about containerization.
Containerization has become an important part of software development and configuration. Skills working with Docker and Kubernetes are in high demand. If you learn the basics, you are on your way to becoming an awesome SRE.
- Make testing a daily habit.
Testing has become a critical skill in software development. Know how to use the different testing methodologies and get used to unit testing, system testing, integration testing, and user acceptance testing.
- Know how to monitor a computer system.
Monitoring a computer system is important for detecting issues in production. Gain practical knowledge of different tools such as Nagios and the ELK stack. A great way to learn this is by creating a dashboard.
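To make a few of the skills above concrete (Python scripting, testing, and monitoring), here is a minimal sketch of a script that reads web server log lines and raises an alert when too many requests fail. The log format, the sample data, the function names, and the 10% threshold are all assumptions made for illustration. A real setup would rely on a monitoring tool such as the ones mentioned above rather than on a hand-written script.

```python
from collections import Counter

# Hypothetical sample log, one request per line: "<timestamp> <HTTP status> <path>"
SAMPLE_LOG = """\
2024-01-01T10:00:00 200 /index.html
2024-01-01T10:00:01 200 /api/users
2024-01-01T10:00:02 500 /api/orders
2024-01-01T10:00:03 404 /missing
2024-01-01T10:00:04 503 /api/orders
"""


def count_statuses(log_text):
    """Count how often each HTTP status code appears in the log."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        counts[int(parts[1])] += 1
    return counts


def error_rate(counts):
    """Return the share of requests that ended in a server error (5xx)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    errors = sum(n for status, n in counts.items() if 500 <= status < 600)
    return errors / total


def check(log_text, threshold=0.1):
    """Return ("ALERT" or "OK", error rate); the threshold is an assumed value."""
    rate = error_rate(count_statuses(log_text))
    return ("ALERT" if rate > threshold else "OK", rate)


if __name__ == "__main__":
    status, rate = check(SAMPLE_LOG)
    print(f"{status}: {rate:.0%} of requests failed")
```

Because each function does one small thing and takes plain text or counts as input, it is easy to write unit tests for them, which is a good way to practice the daily testing habit described above.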
As you can see, an SRE needs to have a broad set of technical skills. The technical skills above are not exhaustive. It takes years to become a professional SRE. Don’t get intimidated by this. Just start. Get your hands dirty and learn along the way. Document everything. Learn constantly.
Stuff will go wrong. When it does, ask yourself: Why did it go wrong? Can I fix it myself, or should I ask for help?
Keep your certificates up to date and stay current with market demand. But more importantly, stay curious. IT changes rapidly and what you learn today can be obsolete tomorrow.
Becoming an SRE is not easy. You need to have both development and operations skills. In addition, you need educational and coaching skills to teach your colleagues about software quality. An SRE should be curious, practical, humble, and tenacious. In production, things can go wrong very quickly. An SRE should maintain oversight and, at the same time, fix issues together with their team.
The road to becoming an SRE is long. But if you are curious to know your computer system – all its functions, its variables, its capabilities, its vulnerabilities – you are already making the first step.