The Role of Error Monitoring in CI/CD



In application development these days, you can monitor anything. You can monitor user data, system data, performance data, and even details on every exception that happens. And this exception monitoring can be the key to not only improving the quality of individual releases, but also the quality of your overall release process.

The power of monitoring has baffled some teams to the point of paralysis and pushed others to adopt it too eagerly. Both are problems, and when it comes to broader CI/CD processes, deciding how you will use your analytics is critical.

Development and operations teams are busy, often too busy to think beyond their current releases and production applications. While they understand the benefits of CI/CD, the typical implementation is a hodgepodge of tools and processes that, most of the time, ends up being manual.

What they are forgetting is the process of processes: the holistic delivery pipeline that affects every release that passes through it. And this is where application quality plays a critical role.

Most often, unit testing and exception handling are in the hands of individual developers. After all, they are the ones who know the code inside and out. However, there is no reason not to take their data and share it across the entire team to keep tabs on the environment as a whole.

When developers add error monitoring to their applications with tools like Rollbar and Raygun, plus a simple line of code in their try-catch statements, they are collecting information they’ve always had. The difference is that it is now aggregated across all their exceptions and displayed in an interface that makes addressing errors one by one a lot easier. It also helps developers focus on the most critical errors and set aside those that are not really errors, but instead the result of other business logic or of releases that are not fully wired up.
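
To make the "simple line of code" idea concrete, here is a minimal sketch in Python. The `ErrorReporter` class is a hypothetical in-memory stand-in, not the API of Rollbar, Raygun, or any other product; a real client would send each report to a hosted service instead of counting it locally. The `parse_quantity` function is likewise invented for illustration.

```python
import traceback
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ErrorReporter:
    """Hypothetical in-memory stand-in for a hosted error monitoring client."""

    environment: str
    occurrences: Counter = field(default_factory=Counter)
    last_traceback: dict = field(default_factory=dict)

    def report(self, exc: BaseException) -> str:
        # Group occurrences by exception type and message so that repeats
        # collapse into a single entry with a count.
        fingerprint = f"{type(exc).__name__}: {exc}"
        self.occurrences[fingerprint] += 1
        # Keep the most recent traceback for each group.
        self.last_traceback[fingerprint] = "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
        return fingerprint


reporter = ErrorReporter(environment="staging")


def parse_quantity(raw: str) -> int:
    """Example application code with one reporting line in its handler."""
    try:
        return int(raw)
    except ValueError as exc:
        reporter.report(exc)  # the one added line in the except block
        return 0


for raw in ["3", "abc", "abc", "7.5"]:
    parse_quantity(raw)

# List the aggregated errors, most frequent first.
for fingerprint, count in reporter.occurrences.most_common():
    print(count, fingerprint)
```

The existing error handling is unchanged. The extra call only records what the `except` block already knew, and the grouping is what makes the most frequent problems stand out.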

Taking this individual developer data and expanding it across the entire team makes the organization aware, for example, of how the number of features in a particular release affects the number and complexity of errors in that release. Such data lets the team optimize the environment to reduce the chance of serious exceptions and to speed up how quickly they are addressed.

What does this have to do with CI/CD? It means less time spent in these environments, and less time trying to revert or release an application that is laden with errors. Error monitoring applies at the beginning of a release during code creation, at the point of integration in continuous integration environments, and again right before a production release.
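
As a rough illustration of the team-level view of features versus errors, the sketch below compares releases by errors per feature. The `ReleaseStats` type, the `risky_releases` helper, and all of the numbers are invented for this example; a real team would pull the counts from its monitoring tool and release tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseStats:
    version: str
    features_shipped: int
    errors_reported: int

    @property
    def errors_per_feature(self) -> float:
        # Guard against division by zero for releases with no new features.
        return self.errors_reported / max(self.features_shipped, 1)


def risky_releases(history: list[ReleaseStats]) -> list[str]:
    """Return versions whose errors-per-feature rate exceeds the average."""
    if not history:
        return []
    average = sum(r.errors_per_feature for r in history) / len(history)
    return [r.version for r in history if r.errors_per_feature > average]


# Invented numbers, for illustration only.
history = [
    ReleaseStats("1.0", features_shipped=2, errors_reported=4),
    ReleaseStats("1.1", features_shipped=5, errors_reported=10),
    ReleaseStats("1.2", features_shipped=8, errors_reported=40),
]

print(risky_releases(history))
```

Comparing each release to the average is only one possible rule. The point is that once the data is shared, the question "did this release bite off too much?" can be asked of the whole pipeline rather than of one developer.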

A good error monitoring tool gets that data to the development and operations teams quickly in all these phases, in a more convenient way than the IDE. Some tools even send push notifications to your desktop and your mobile device, so that on any particular build, you’ll see the critical errors right away. If you run a CI build on every commit, you are truly practicing DevOps by keeping the flow of releases going and responding to issues immediately.
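
The notification step can be sketched as a simple severity filter. Everything here is an assumption made for illustration: the `Severity` levels, the `BuildError` record, and the `send` callback, which stands in for whatever push, chat, or email channel a real tool provides.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Iterable


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


@dataclass(frozen=True)
class BuildError:
    build_id: str
    message: str
    severity: Severity


def notify_critical(
    errors: Iterable[BuildError],
    send: Callable[[str], None],
    threshold: Severity = Severity.CRITICAL,
) -> int:
    """Send one notification per error at or above the threshold."""
    sent = 0
    for error in errors:
        if error.severity >= threshold:
            send(f"[build {error.build_id}] {error.message}")
            sent += 1
    return sent


# A list stands in for the real notification channel.
outbox: list[str] = []
errors = [
    BuildError("42", "Deprecated config key", Severity.WARNING),
    BuildError("42", "Payment service unreachable", Severity.CRITICAL),
    BuildError("42", "Cache warmed", Severity.INFO),
]
count = notify_critical(errors, send=outbox.append)
print(count, outbox)
```

Filtering before notifying matters: if every warning reaches a phone, people mute the channel, and the critical errors are missed along with the noise.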

The aggregation of data over time allows both individual developers and the whole team to improve their processes. That is the goal of DevOps: to oversee the meta-delivery pipeline and optimize all future releases by identifying weak spots and points of risk across the entire release process.

Beyond the improvements in application quality and collaboration, there are situations where exception monitoring is required:

  1. Each microservice is a small part of a whole, which means what happens in the service has to be monitored both separately and as part of the entire application. How do you reconcile that without visibility?

  2. With serverless code like AWS Lambda and Azure Functions, the code runs in a black box, and it’s very hard to know whether something went wrong and what caused it.
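
For the serverless case, one common approach is to wrap the handler so that any failure leaves a structured record before the platform discards the execution context. The sketch below is an assumption-laden example, not any vendor's SDK: the `monitored` decorator and the `resize_image` handler are hypothetical, and the `(event, context)` signature follows the AWS Lambda convention for Python handlers, so other platforms will differ.

```python
import functools
import json
import sys
import traceback


def monitored(handler):
    """Wrap a serverless handler so failures emit a structured error record."""

    @functools.wraps(handler)
    def wrapper(event, context):
        try:
            return handler(event, context)
        except Exception as exc:
            record = {
                "function": handler.__name__,
                "error_type": type(exc).__name__,
                "message": str(exc),
                # Log only the key names, to avoid leaking payload contents.
                "event_keys": sorted(event) if isinstance(event, dict) else [],
                "traceback": traceback.format_exc(),
            }
            # Written to stderr on the assumption that the platform
            # captures it in its log stream.
            print(json.dumps(record), file=sys.stderr)
            raise  # re-raise so the invocation is still marked as failed

    return wrapper


@monitored
def resize_image(event, context):
    """Hypothetical handler: halves the requested width."""
    return {"width": int(event["width"]) // 2}


print(resize_image({"width": "800"}, None))
```

Re-raising after reporting keeps the platform's own retry and failure handling intact, while the record answers the two questions the black box hides: whether something went wrong, and what caused it.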

We have a tendency now to over-measure and over-analyze. This is an ongoing concern. But adding exception monitoring to every application is, I believe, a responsible coder’s obligation.

Chris Riley is a technologist and DevOps advocate for @Splunk who has spent 12 years helping organizations transition from traditional development practices to a modern set of culture, processes and tooling.

