Traditionally, the first key question that coders had to answer as they charted their career paths was whether they wanted to be a software engineer (SWE) or a site reliability engineer (SRE).
That question remains relevant today – but not as relevant as it once was. Increasingly, the lines separating SWEs and SREs are blurring. The two roles are still distinct, but the choice between them carries less weight now that it has become easier for SWEs to adopt the tools and workflows that once belonged only to SREs.
With that reality in mind, here’s a primer on the similarities and differences between SWE and SRE roles today.
What is an SWE?
An SWE is someone who specializes in software development. The SWE’s main task is to design and implement code.
SWEs tend to focus on problems that are finite in scale and have a scope that is clear from the outset. For example, an SWE might be tasked with implementing a specific new feature within an application. That’s a relatively straightforward task. Although there is plenty of creativity at play in designing and implementing the feature, it’s not as if an SWE is expected to reinvent the entire application development and deployment process.
What is an SRE?
An SRE, on the other hand, is someone whose role focuses on transformative change – at least ideally. SREs are responsible for finding ways to add efficiency, scalability, and reliability across software environments.
In some ways, the SRE role overlaps with that of a traditional IT Ops engineer, whose job is to deploy and manage applications. But SREs go further than merely maintaining the status quo. They develop creative solutions that lead to continuous improvement in software deployment and management. This means that the SRE’s role is more open-ended and diverse than that of an SWE.
The SRE role is also different from that of IT operations because SREs use a developer’s mindset to guide operations tasks. They look for ways to define and implement operational routines using code. They aim to make processes iterative and scalable, just like application development.
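As a rough illustration of what "operational routines as code" can look like, here is a minimal Python sketch of a health check retried with exponential backoff. The function name, parameters, and defaults are hypothetical and chosen only for illustration; they do not come from any particular SRE tool.

```python
import time
from typing import Callable


def run_with_retries(
    check: Callable[[], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run a health check, retrying with exponential backoff on failure.

    `check` is any zero-argument callable that returns True when the
    service is healthy. All names here are illustrative assumptions.
    """
    for attempt in range(attempts):
        if check():
            return True
        # Wait before the next try, unless this was the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

Because the routine is ordinary code, it can be version-controlled, reviewed, and tested like any application feature. Passing `sleep` as a parameter is a small design choice that lets a test substitute a fake and avoid real waiting.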
SWEs vs. SREs
The main difference between an SWE and an SRE, then, lies in the types of problems they solve. An SWE’s chief responsibility is to write code, while an SRE assumes a broader set of tasks that aim to improve software performance and reliability across the board.
From a tooling and operational perspective, however, SWEs and SREs are not that different. Both roles focus on using code and code-driven processes to achieve their tasks.
The Changing Role of the SWE
Based on the above, you might think that the job of the SWE is relatively narrow compared to that of the SRE. Traditionally, SREs have had room to focus on radical innovation, while SWEs simply write code.
But that’s not necessarily the case anymore. Today’s SWEs are in a stronger position than ever to branch out into operational roles in ways that parallel the work performed by SREs. When debugging, for example, SWEs can make use of reliability management tools – like Stackpulse – that provide rich monitoring data to help contextualize the information they get from debugging tools. They can also use reliability information to gain a clearer understanding of the problems that arise in production environments, and then use those insights to shape the next round of feature enhancements or bug fixes.
To put this another way, the reliability tools available to SWEs today mean that they are no longer isolated at the development end of the software delivery pipeline. Their visibility now extends into the operational processes at the opposite end of the pipeline, and they can participate in those processes in a way that was not possible previously.
The ability of SWEs to perform tasks like this is important not just because it builds a stronger link between software engineering and IT operations, but also because SWEs are in a position to resolve problems that SREs can’t. Because SREs are charged with a broad array of tasks and have to maintain many different services at once, they are less likely to be able to use reliability data to home in on specific bugs or performance problems. But given their focus on code and their ownership of the development process, SWEs can more readily identify direct relationships between operations problems and code problems.
This means that, with the help of modern reliability engineering tools, SWEs can solve operations issues that SREs may struggle to handle. SREs are great at dealing with high-level, end-to-end operational challenges caused by systemic problems, but not with smaller-scale issues that arise from specific snippets of code and may be isolated in nature. Solving the latter sorts of problems is how SWEs can contribute to modern operations.
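To make the idea of tying an operations problem back to a specific code change more concrete, here is a small Python sketch. It assumes a hypothetical list of deploy records (timestamp plus commit ID) and the time at which an error spike was observed, and it returns the most recent deploy that preceded the spike within a chosen window. The record shape, the function name, and the one-hour default are illustrative assumptions, not the behavior of any specific reliability tool.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Deploy(NamedTuple):
    timestamp: float  # seconds since epoch (assumed unit)
    commit: str       # hypothetical commit identifier


def suspect_deploy(
    deploys: List[Deploy],
    error_time: float,
    window: float = 3600.0,
) -> Optional[Deploy]:
    """Return the latest deploy at or before error_time, if within window."""
    ordered = sorted(deploys)  # NamedTuples sort by timestamp first
    times = [d.timestamp for d in ordered]
    idx = bisect_right(times, error_time) - 1
    if idx < 0:
        return None  # the error predates every known deploy
    candidate = ordered[idx]
    if error_time - candidate.timestamp > window:
        return None  # too long after the deploy to be a likely cause
    return candidate


if __name__ == "__main__":
    history = [Deploy(100.0, "a1"), Deploy(500.0, "b2"), Deploy(900.0, "c3")]
    print(suspect_deploy(history, error_time=650.0, window=300.0))
```

A match only narrows the search. The developer who owns the change still has to confirm whether that commit actually caused the problem.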
Is Software Engineering Dead?
To be clear, this doesn’t mean that the SWE role as we’ve traditionally known it is dead, or that SWEs are simply becoming SREs. The responsibilities of SWEs and SREs remain distinct enough that these are clearly differentiated roles. SWEs are still primarily responsible for development, and SREs are primarily responsible for reliability engineering. Each role also makes distinct contributions to operations, as explained above.
What has changed, however, is that SREs are no longer the sole bridge between development and operations. Software engineers can now participate in that work, too, as long as they have the visibility they need into reliability operations.