Maybe you work in a traditional enterprise shop and want a little more team efficiency, even though your meetings still include arguments over ITIL questions like, “When does a change become a release?” Or perhaps you’re a small startup that just doesn’t know where to start. This article walks through some of the basic tools and integration points that will make your life better, get the results you want, and get you started down the path to leveraging ChatOps.
First, let’s define ChatOps
A solid, centralized, programmable communications channel is key to a modern operations team. Leveraging such a platform (for example, Slack) is the cornerstone of an up-and-coming area of operational management called ChatOps (or, less commonly, BotOps).
ChatOps is where your communications platform becomes an interactive control panel for your operational universe. You have “conversations” with purpose-built interactive processes registered as users on the communications service. Each autonomous user (a.k.a. a “bot”) either listens for keywords in a channel before doing anything, or can be addressed through a direct chat. During the conversation, the bot asks questions in natural language until it has the information it needs to submit the request for processing. It then responds with success, with any errors that were returned, or with whatever information you requested (like current CPU usage on server XYZ).
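To make the “listen for keywords” idea concrete, here is a minimal sketch of the dispatch logic a bot might run on each incoming message. It is not tied to any bot framework; the bot name opsbot, the two keywords, and the handler functions are all hypothetical, and a real bot would receive messages from the chat platform rather than from plain strings.

```python
from typing import Callable, Dict, Optional


# Hypothetical handlers; a real bot would call out to your infrastructure here.
def cpu_usage(args: str) -> str:
    return f"CPU usage for {args or 'all servers'}: (lookup not implemented in this sketch)"


def restart(args: str) -> str:
    # The bot asks a follow-up question when it lacks the information it needs.
    if not args:
        return "Which server should I restart?"
    return f"Restart requested for {args}"


# Keyword -> handler table (assumed keywords, for illustration only).
HANDLERS: Dict[str, Callable[[str], str]] = {"cpu": cpu_usage, "restart": restart}


def handle_message(text: str, bot_name: str = "opsbot") -> Optional[str]:
    """Return a reply if the message is addressed to the bot, otherwise None."""
    words = text.strip().split(maxsplit=2)
    if len(words) < 2 or words[0].lower().lstrip("@") != bot_name:
        return None  # not addressed to the bot, so stay silent
    keyword = words[1].lower()
    args = words[2] if len(words) > 2 else ""
    handler = HANDLERS.get(keyword)
    if handler is None:
        return f"Sorry, I don't know '{keyword}'. Try: {', '.join(sorted(HANDLERS))}"
    return handler(args)
```

With this sketch, ordinary chatter is ignored, “@opsbot restart” prompts the bot to ask which server, and “opsbot restart web-01” goes straight through to the handler.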
Communications platform
The core of any successful ops team, or any team for that matter, is the ability to get in contact and have a group conversation. The current platform of choice for teams of all sizes is Slack. For many, it stays true to what made IRC great, while also being a truly modern and mobile platform. Slack also has lots of existing third-party integrations available to talk to most popular services. Its ever-expanding list of features includes the recent introduction of voice and video calling. (And you can’t forget the starting price of $0.)
Let’s set up our own Slack for use in this article.
Step 1 – Go to http://www.slack.com/ and enter your email address on the main page.
Step 2 – Enter the 6-digit code sent to your email address.
Step 3 – Create your main account and set the password.
Step 4 – Explain what it is for. Options include home, work, shared interest, or other. (“Other” has an extra field to fill out.)
Step 5 – Name your Slack community.
Of course, Slack will check whether the name is still available.
Step 7 – You have a community.
Note: I must apologize at this point if you are part of a company that already has an in-house solution like Skype, Lync, Sametime, or Jabber. (Whichever you use, it won’t feel adequate anymore.)
Now you are ready to invite colleagues, friends, customers, or whomever you want. (And don’t forget the mobile app.)
Now, let’s make Slack useful for more than pasting memes and arranging when to go for coffee.
Source code and builds
While not necessarily operational in nature, it is always a good idea to know what’s going on.
Tracking commits to the code base is always useful. Here are steps to have GitHub push notifications to a channel in Slack. Similar steps are also available for both on-premises and cloud versions of GitLab, Bitbucket, and even Microsoft Visual Studio Team Foundation.
For builds, there are multiple tools out there besides GitLab-CI and Microsoft’s offerings (which are integrated the same way as previously listed). Jenkins, CircleCI and Travis CI can also integrate with Slack and publish to a specific channel.
I typically have everything related to a single product in a #product channel across builds and checkins. Then, the developers and operators both have a single place to look (or ignore).
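If a tool you use has no ready-made Slack integration, one common fallback is a Slack incoming webhook, which accepts a JSON body with a "text" field. The sketch below shows roughly what that looks like; the message format, function names, and example values are my own assumptions, and the webhook URL is whatever Slack generates for your channel.

```python
import json
import urllib.request


def build_commit_message(repo: str, author: str, summary: str) -> dict:
    """Build the JSON body for a Slack incoming webhook (a dict with a "text" field)."""
    return {"text": f"[{repo}] {author} pushed: {summary}"}


def post_to_slack(webhook_url: str, message: dict) -> int:
    """POST the message to the webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Example call (hypothetical values; substitute your own webhook URL):
# post_to_slack(webhook_url, build_commit_message("product", "alice", "Fix login redirect"))
```

Pointing the webhook at the #product channel keeps these messages alongside the build and commit notifications described above.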
Incident management
PagerDuty has, in my opinion, the best integration with Slack. Other more direct competitors are OpsGenie and VictorOps for pure incident management. Notification platforms also have Slack integrations that will meet the needs of a lot of people.
Note: if you are an existing team and do not have one of these tools in your arsenal, you really should investigate them. They are more than worth the money, especially for larger teams with on-call rotations, and any team that is distributed.
Even larger, fully ITIL-compliant platforms like ServiceNow and Cherwell can easily integrate with Slack to publish to a channel. A seamless integration that passes information back may take a little programming and some tweaking using a generic bot platform like Hubot. (Hubot comes from GitHub, which is credited with coining the term “ChatOps,” and it is GitHub’s own implementation of the idea.)
The best instructions to enable the Slack integration for PagerDuty are on PagerDuty’s site.
PagerDuty has some additional background on the development of their Slack integration, including a two-minute marketing video if you want more details on what exactly it can do.
Additionally, if you are just setting up an incident management platform, then you should route all your notifications and alerts through it so they can be tracked and not lost to the ages in a console you forgot to check, or a Slack message you missed. (An integration example: CloudWatch integration with PagerDuty.)
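For alert sources that have no native PagerDuty integration, routing through the incident platform usually means posting an event to its API. As a rough sketch, assuming PagerDuty’s Events API v2 and an integration (routing) key you have created for your service, the event body could be built like this. The function name and example values are hypothetical.

```python
# Events API v2 endpoint; events are sent as a JSON POST to this URL.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Severity levels the Events API v2 accepts.
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key: str, summary: str, source: str,
                        severity: str = "critical") -> dict:
    """Build the JSON body that opens (triggers) a new PagerDuty incident."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    return {
        "routing_key": routing_key,   # integration key for your service (assumed to exist)
        "event_action": "trigger",    # open a new alert
        "payload": {
            "summary": summary,       # short human-readable description
            "source": source,         # the host or system that raised the alert
            "severity": severity,
        },
    }
```

The resulting dictionary would be serialized to JSON and posted to PAGERDUTY_EVENTS_URL; from there the alert is tracked, escalated, and pushed to Slack by the integration.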
Cloud infrastructure management
Cloud infrastructure management is the other side of ChatOps, and the one that really shows the value of the concept. It is one thing to update or close a ticket. It is another to restart an instance or redeploy an entire application from a messaging window instead of whatever combination of web browser, SSH, and VPNs most teams need.
We’ll go with Opsidian for this section. There is a more platform-agnostic product called DeployBot available, and it handles deployments across different clouds well, but I have a bit of an AWS obsession at the moment, so I’m going with an AWS-native bot here. Plus, the AWS-native bot also accepts CloudWatch notifications (so you can ignore the last section of this blog if you’d like).
1) As the first step, go to Opsidian’s installation page.
2) Next, click on the big Add to Slack button and authorize the access it is requesting.
3) Now you should see a success message.
4) Into Slack we go to run the command.
5) Click Access Keys. All access to AWS through APIs is driven by access keys. It also wants the default region to work with. I already have a user set up.
6) It’s asking if we want to enable CloudWatch notifications to come into Slack. Select yes.
7) Follow along with the new instructions, and you should get the same success message at the bottom.
8) You are ready to roll. Let’s list existing S3 buckets just to prove it.
9) For more details on the commands you can run, go to Opsidian’s commands page.
Excellent! Your AWS is now chat-enabled.
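For comparison, this is roughly the API call that a “list S3 buckets” request boils down to when made with the same kind of access keys. It is a sketch using the boto3 library, not a description of how Opsidian works internally, and the function name is my own.

```python
def list_bucket_names(s3_client=None):
    """Return the sorted names of all S3 buckets visible to the given client."""
    if s3_client is None:
        # boto3 is a third-party package (pip install boto3). It picks up the
        # access keys and default region from your environment or AWS config.
        import boto3
        s3_client = boto3.client("s3")
    response = s3_client.list_buckets()
    return sorted(bucket["Name"] for bucket in response["Buckets"])
```

Calling list_bucket_names() with no arguments uses your configured credentials; passing a client in makes the function easy to exercise without touching a real account.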
Infrastructure notifications into Slack
If you have incident management with notifications routing through it, then this section isn’t really required. Likewise, if you followed the Cloud infrastructure management section, you already have CloudWatch notifications flowing through to Slack.
This section will give you the basics to get information flowing into Slack if you need it. It is possible and really cheap (but not free) to do.
Start with a free service like IFTTT, which can receive events and publish them to Slack. In the case of AWS CloudWatch, you would configure a “simple service” to route requests from AWS to IFTTT so they can be published on Slack. SNS2IFTTT is the current tool of choice to do this. It runs entirely inside Lambda, so it will only cost a few pennies every time there is an alert (which shouldn’t be very often if your app is well behaved). Detailed instructions are available here.
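To show the shape of that routing, here is a sketch of the translation step such a Lambda function might perform. It is not the SNS2IFTTT code. It assumes the usual SNS-to-Lambda event layout, a CloudWatch alarm message in JSON, and the IFTTT Webhooks service, which accepts up to three values named value1 to value3. The event name and key are placeholders you would supply.

```python
import json


def alarm_to_ifttt_values(sns_event: dict) -> dict:
    """Map the first SNS record of a Lambda event to IFTTT's value1..value3 fields."""
    record = sns_event["Records"][0]["Sns"]
    try:
        alarm = json.loads(record["Message"])  # CloudWatch alarms arrive as a JSON string
    except ValueError:
        alarm = {}  # plain-text SNS message, not an alarm
    if not isinstance(alarm, dict):
        alarm = {}
    return {
        "value1": alarm.get("AlarmName") or record.get("Subject") or "Unknown alarm",
        "value2": alarm.get("NewStateValue", "UNKNOWN"),
        "value3": alarm.get("NewStateReason", record["Message"]),
    }


def ifttt_url(event_name: str, key: str) -> str:
    """Build the IFTTT Webhooks trigger URL for an event name and account key."""
    return f"https://maker.ifttt.com/trigger/{event_name}/with/key/{key}"
```

The Lambda handler would post the returned values as JSON to the trigger URL, and an IFTTT applet would then publish them to the Slack channel of your choice.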
Zapier is another option, but the AWS CloudWatch connector will cost a little more than the compute cost of SNS2IFTTT. It just depends how much you feel your time maintaining SNS2IFTTT is worth to you.
A third option would be to get a larger, more robust monitoring suite like Netuitive, and leverage its native Slack integration.
Conclusion
At this point, your team is fully enabled to do basically anything they need from Slack.
This article is a happy-day scenario to get an operational team functioning effectively as fast as possible to manage a cloud platform using modern solutions, with the communications platform as the core. If you can’t apply this article completely, it should at least have enough information to get you started down your own modernization path.
Happy operating!