DevOps like It’s 1999 with PagerDuty

This is the next article in an ongoing series about integrating modern DevOps tools into OASIS, my derivative of a 20-year-old computer game. Previously, we integrated Keen.io into OASIS, which allowed us to generate detailed analytics reports about everything from player behavior to in-game statistics. Alongside analytics, another incredibly important DevOps category is monitoring. To provide more transparency into the issues that arise within the game, I’ve decided to integrate PagerDuty alerting into it.

The OASIS codebase, which was built on top of a derivative of the original DikuMUD codebase, already has a method called bug() that writes critical issues to the server logs. This is useful when someone is watching the log directly, but otherwise it makes it difficult to identify and resolve issues as they happen.

PagerDuty has an impressive list of integrations with third-party tools, which would come in handy if we were building a more modern application, but because some of this code is almost as old as I am, my best bet for integration is to build a custom method using the PagerDuty Trigger API so we can write incidents to PagerDuty from directly within the game.

Thanks to PagerDuty’s documentation, and the work we did last week to add cURL and JSON support to the game, accomplishing this is incredibly simple. Because we are now using two different APIs that require us to POST raw JSON data, the first change we need to make is to pull the cURL code out into a separate method.

This is important, as it will allow us to keep the amount of duplicate code we write to a minimum. Next, we need to actually write the method to trigger a PagerDuty incident. Utilizing our new curl_json_post() method, it’s as simple as building the JSON data and submitting the data.
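As a rough sketch of what such a method can look like (not the exact OASIS code), the block below builds the trigger payload and hands it to a transport. It assumes the Events API v1 endpoint and its service_key / event_type / description fields. The names pd_post, pd_service_key, set_incident_transport(), json_escape() and build_trigger_payload() are hypothetical, and the transport is a function pointer only so the sketch compiles without libcurl; in the game that role would be played by curl_json_post(), assuming it takes a URL and a JSON string.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the Events API v1 endpoint behind the Trigger API. */
#define PD_EVENTS_URL \
    "https://events.pagerduty.com/generic/2010-04-15/create_event.json"

/* Hypothetical transport type; curl_json_post() is assumed to match it. */
typedef int (*json_post_fn)(const char *url, const char *json);

static json_post_fn pd_post = NULL;                      /* installed at boot */
static const char *pd_service_key = "YOUR_SERVICE_KEY";  /* placeholder */

void set_incident_transport(json_post_fn post) { pd_post = post; }

/* Escape src for use inside a JSON string. Returns 0, or -1 if dst is too small. */
static int json_escape(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789abcdef";
    size_t used = 0;

    if (dst_size == 0)
        return -1;
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        char buf[6];
        size_t len;

        if (c == '"' || c == '\\') {          /* quote and backslash */
            buf[0] = '\\'; buf[1] = (char)c; len = 2;
        } else if (c < 0x20) {                /* control characters */
            buf[0] = '\\'; buf[1] = 'u'; buf[2] = '0'; buf[3] = '0';
            buf[4] = hex[c >> 4]; buf[5] = hex[c & 0x0f]; len = 6;
        } else {
            buf[0] = (char)c; len = 1;
        }
        if (used + len + 1 > dst_size)        /* keep room for the terminator */
            return -1;
        memcpy(dst + used, buf, len);
        used += len;
    }
    dst[used] = '\0';
    return 0;
}

/* Build the JSON body for a "trigger" event. Returns 0, or -1 on overflow. */
int build_trigger_payload(const char *service_key, const char *description,
                          char *out, size_t out_size)
{
    char key[128];
    char desc[512];
    int n;

    if (json_escape(service_key, key, sizeof key) != 0 ||
        json_escape(description, desc, sizeof desc) != 0)
        return -1;
    n = snprintf(out, out_size,
                 "{\"service_key\":\"%s\",\"event_type\":\"trigger\","
                 "\"description\":\"%s\"}", key, desc);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Build the payload and POST it. Returns -1 if no transport is installed. */
int trigger_incident(const char *description)
{
    char payload[1024];

    if (pd_post == NULL)
        return -1;
    if (build_trigger_payload(pd_service_key, description,
                              payload, sizeof payload) != 0)
        return -1;
    return pd_post(PD_EVENTS_URL, payload);
}
```

Escaping the description matters here because bug messages can contain quotes or newlines, which would otherwise produce invalid JSON. At startup, the game would call set_incident_transport() once with its real POST helper.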

Finally, now that we have a way to trigger PagerDuty alerts, we need to put it to use. This is where the previously mentioned bug() method comes in. The most straightforward implementation is to trigger a PagerDuty incident every time bug() is called, which only requires calling trigger_incident() just before the end of the method.
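To illustrate the idea, here is a minimal sketch of such a bug() method. The signature, the log prefix, and the buffer size are assumptions rather than the actual OASIS code, and trigger_incident() is replaced by a stub that records its argument so the example runs without network access.

```c
#include <stdarg.h>
#include <stdio.h>

#define BUG_BUF_SIZE 512  /* assumed maximum length of a bug message */

/* Stub standing in for the real trigger_incident(): it only records
 * what it was asked to send. */
static char last_incident[BUG_BUF_SIZE];
static int incident_count = 0;

static int trigger_incident(const char *description)
{
    snprintf(last_incident, sizeof last_incident, "%s", description);
    incident_count++;
    return 0;
}

/* Report a critical issue: log it, then raise a PagerDuty incident. */
void bug(const char *fmt, ...)
{
    char buf[BUG_BUF_SIZE];
    va_list args;

    /* Format the message once so the log and the alert share it. */
    va_start(args, fmt);
    vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);

    /* Existing behavior: write the issue to the server log. */
    fprintf(stderr, "BUG: %s\n", buf);

    /* New behavior: trigger an incident just before returning. */
    if (trigger_incident(buf) != 0)
        fprintf(stderr, "BUG: could not trigger PagerDuty incident\n");
}
```

A call such as bug("Bad vnum %d in room %s", 3001, "temple") would then log the message and pass the same formatted string on as the incident description. A failed alert is only logged, so a PagerDuty outage never interferes with the game itself.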

So, what exactly happens when this method is called? Well, that ultimately depends on the PagerDuty escalation policies we set up, but generally you (or someone on your team) will get notified of the incident via email, text message, or phone call. The incident will also appear on the PagerDuty dashboard with any relevant information.

Interestingly, I noticed that if an incident isn’t resolved after 6 hours or so, it is automatically marked as Resolved. This is pretty cool: it treats incidents that nobody addresses as not actually being important, which keeps the incident report clean and current.

For now, I am only interested in consuming PagerDuty’s incident triggering API, but they have an impressively thorough API for working with all other facets of PagerDuty. If I wanted to, I could pipe external server alerts back into the game, which would provide a pretty cool two-way level of transparency.

 


Zachary Flower is a freelance web developer, writer, and polymath. He has an eye for simplicity and usability, and strives to build products with both the end user and business goals in mind.

