Kubernetes is gaining adoption at every type of organization, and has become the leading container orchestration tool. And although Kubernetes makes container orchestration simple, Kubernetes itself isn’t always easy. Kubernetes helps you manage containers, but it doesn’t come with a built-in, cluster-wide solution for monitoring the cluster and storing its logs.
To manage a Kubernetes cluster effectively, you need deep visibility into every one of its components, and this is only possible by monitoring and analyzing Kubernetes logs. This article explains how to view and analyze the logging data produced by Kubernetes.
kubectl logs
Kubectl is the CLI tool for running commands on a Kubernetes cluster. It has commands for various operations, including retrieving logs.
Using the kubectl logs command, you can retrieve the logs for a pod or for a single container inside it, or tail these logs in real time. You can limit the output to the previous hour or to the last 20 lines. You can also include a timestamp on every line, or fetch the logs of the previous instance of a container that has restarted. These are the most basic logs available in Kubernetes, and they aren’t stored persistently. They are deleted along with the pod.
Kubectl logs are useful when you’re looking for a specific log and know where to find it. Using them takes some familiarity with the components of Kubernetes and with the commands and flags of kubectl. Kubectl is not meant to be your primary logging tool. For that, you need a more robust solution.
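As a rough illustration of the options described above, the sketch below assembles kubectl logs command lines without running them. The helper name build_logs_command and the pod and container names ("web-0", "nginx") are hypothetical; the flags themselves (--since, --tail, --follow, --timestamps, --previous, --container, --namespace) are standard kubectl logs options.

```python
import shlex
from typing import List, Optional


def build_logs_command(
    pod: str,
    container: Optional[str] = None,
    namespace: Optional[str] = None,
    since: Optional[str] = None,
    tail: Optional[int] = None,
    follow: bool = False,
    timestamps: bool = False,
    previous: bool = False,
) -> List[str]:
    """Assemble a `kubectl logs` argument list (hypothetical helper; nothing is executed)."""
    cmd = ["kubectl", "logs", pod]
    if namespace:
        cmd += ["--namespace", namespace]
    if container:
        cmd += ["--container", container]
    if since:
        cmd.append(f"--since={since}")      # e.g. "1h" for the previous hour
    if tail is not None:
        cmd.append(f"--tail={tail}")        # e.g. 20 for the last 20 lines
    if follow:
        cmd.append("--follow")              # stream new lines in real time
    if timestamps:
        cmd.append("--timestamps")          # prefix each line with its timestamp
    if previous:
        cmd.append("--previous")            # logs of the previous container instance
    return cmd


# "web-0" and "nginx" are placeholder names for illustration only.
print(shlex.join(build_logs_command("web-0", since="1h")))
print(shlex.join(build_logs_command("web-0", container="nginx", tail=20, timestamps=True)))
print(shlex.join(build_logs_command("web-0", follow=True)))
```

To actually run one of these, the returned list could be passed to subprocess.run on a machine where kubectl is configured for your cluster.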
Logging with a Fluentd agent
Pod-level logs are written by the container runtime to files on the node where the pod runs, and they live only as long as the containers that produced them. As pods are dynamically rescheduled across nodes, their containers are deleted and so are the logs associated with these containers. You can collect and store logs persistently by routing them to a logging tool using an agent like Fluentd, which is typically deployed as a DaemonSet so that one agent runs on every node.
Using this model of a logging agent, you can set up cluster-level logging for Kubernetes. The Fluentd agent collects logs from each node and passes everything on to an external logging solution. This could be Stackdriver if you’re on Google Cloud Platform, or Elasticsearch if you prefer an open source solution, or a dedicated log analysis platform like Sumo Logic. For the rest of this post we’ll focus on Elasticsearch, as it is open source and is a good standard by which to measure other Kubernetes logging solutions.
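To make the agent model more concrete, here is a minimal sketch of the two steps a node-level agent performs conceptually: parsing a raw container log line and packaging records for Elasticsearch. This is not how Fluentd itself is implemented or configured. It assumes the node uses Docker's json-file log format (one JSON object per line with log, stream, and time keys) and it targets the newline-delimited body expected by the Elasticsearch _bulk API. The function names, the sample line, and the index name are made up for illustration, and older Elasticsearch versions may also expect a document type alongside the index.

```python
import json
from typing import Iterable, List


def parse_container_log_line(line: str) -> dict:
    """Parse one line in Docker's json-file format into a flat log record."""
    entry = json.loads(line)
    return {
        "@timestamp": entry["time"],
        "stream": entry["stream"],          # "stdout" or "stderr"
        "log": entry["log"].rstrip("\n"),   # drop the trailing newline Docker keeps
    }


def to_bulk_payload(records: Iterable[dict], index: str) -> str:
    """Build a newline-delimited body: one action line, then one document line, per record."""
    lines: List[str] = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(record))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"          # the bulk body must end with a newline


# Sample line and index name are invented for illustration.
raw_line = '{"log":"GET /healthz 200\\n","stream":"stdout","time":"2017-12-01T10:15:30.000000000Z"}'
record = parse_container_log_line(raw_line)
print(to_bulk_payload([record], index="logstash-2017.12.01"), end="")
```

A real agent also handles file rotation, buffering, retries, and enrichment with pod metadata, which is why a dedicated tool like Fluentd is used rather than a script like this.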
Elasticsearch and Kibana
Elasticsearch is a distributed full-text search and analytics engine. It is a powerful querying tool that is used for log analysis by many development teams. It is distributed in that it uses multiple nodes, which are in turn grouped into clusters to store all the data that it needs to analyze. It can store and analyze very large quantities of data in the petabyte range, which is what makes it especially well suited for a modern cloud-native application running on Kubernetes.
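As a sketch of what a log analysis query can look like, the snippet below builds a request body in the Elasticsearch query DSL: a phrase match on the log text, filtered to one Kubernetes namespace and a recent time window. The field names (log, @timestamp, kubernetes.namespace_name) are assumptions about how the logging agent shapes its records, and the term filter assumes the namespace field is mapped as a keyword; adjust both to your own index mapping. The function name build_log_query is hypothetical.

```python
import json


def build_log_query(namespace: str, phrase: str, window: str = "1h", size: int = 50) -> dict:
    """Build a search body for recent log lines containing `phrase` in one namespace."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],   # newest lines first
        "query": {
            "bool": {
                # Scored full-text clause on the log message (assumed field name).
                "must": [{"match_phrase": {"log": phrase}}],
                # Non-scoring filters: exact namespace and a relative time range.
                "filter": [
                    {"term": {"kubernetes.namespace_name": namespace}},
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ],
            }
        },
    }


# "production" and the search phrase are placeholder values.
print(json.dumps(build_log_query("production", "connection refused"), indent=2))
```

The resulting JSON can be sent as the body of a search request against the indices that hold your Kubernetes logs.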
One of the key strengths of Elasticsearch is the speed of querying. Its distributed architecture lets it spread a query across many shards and nodes, and even across multiple clusters, over a large volume of data. Version 6.0 added new features that reduce disk usage, and Elasticsearch never places a primary shard and its replica on the same node.
Previously, Elasticsearch used a feature called tribe nodes for querying multiple clusters. In this method, Elasticsearch makes a single node responsible for running and routing the query and returning the results. The node that receives the request identifies which other nodes contain the data required to process the query. It then runs the query and merges the responses into a single global result. The problem with tribe nodes is that they tax a single node and don’t spread the load around to other nodes, which goes against a core principle of Elasticsearch. More recently, Elasticsearch introduced cross-cluster search in version 5.3. In this method, the query is run as a federated search on a limited number of nodes, and no single node is taxed with processing the query.
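With cross-cluster search, a remote index is addressed by prefixing it with the alias of the cluster that holds it, in the form alias:index. The sketch below only builds the request path for such a search. The helper name and the cluster aliases ("us_east", "eu_west") are hypothetical, and it assumes those aliases have already been registered as remote clusters on the cluster that receives the request.

```python
from typing import Iterable


def cross_cluster_search_path(
    clusters: Iterable[str], index_pattern: str, include_local: bool = False
) -> str:
    """Build a _search path that targets the same index pattern on several remote clusters."""
    targets = [f"{alias}:{index_pattern}" for alias in clusters]
    if include_local:
        # An unprefixed pattern refers to indices on the local cluster.
        targets.append(index_pattern)
    return "/" + ",".join(targets) + "/_search"


# Cluster aliases and the index pattern are placeholders.
print(cross_cluster_search_path(["us_east", "eu_west"], "logstash-*"))
print(cross_cluster_search_path(["us_east"], "logstash-*", include_local=True))
```

The query body sent to this path is the same as for a single-cluster search; only the index expression changes.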
Elasticsearch is complemented by Kibana, an open source data visualization tool. Its biggest strength is its strong native integration with Elasticsearch. Unlike the minimalist open source charting tools of before, Kibana comes with a range of powerful and great-looking visualization options, including various chart types, tools to filter data and get to exactly what you’re looking for, and a very refined user interface. To top it all off, there are a number of ready-made dashboards you can browse through and adapt for your data, including several Kubernetes dashboards.
Considering all these drool-worthy features, Elasticsearch is a powerful tool for querying large quantities of log data, which is exactly what Kubernetes produces. However, managing a large-scale Elasticsearch cluster comes with its pains. It requires active management to keep performance at its peak when storage is running low or compute capacity is not enough to support workloads. If you’d rather leave the management of your log analysis engine to a vendor, a managed Elasticsearch service or a third-party log analysis tool like Sumo Logic is well worth the investment. It gives you the peace of mind that your logs are always available no matter the scale, and lets you get on with what you really want to do: build out cutting-edge features for your applications, and not tinker with the tools used to build those features.
Conclusion
As you can tell, logging for Kubernetes is completely different from logging for traditional applications. It requires a deep understanding of how distributed systems function, and how to optimize them for peak performance.
You need an intimate knowledge of the various components of Kubernetes to use kubectl logs commands effectively. As you adopt cluster-level logging, you’ll need to rely on Fluentd, which is a dependable and robust log aggregation tool.
But the most important link in this toolchain is the way you store, analyze, and visualize your log data. Elasticsearch along with Kibana is the best open source solution available today. It has a lot to offer, and can be used alongside another primary logging tool you may end up using eventually.
There are challenges with managing a large-scale Elasticsearch cluster, and for many organizations, offloading that work to a vendor makes the most sense. Whichever option you choose, it’s a great time to be a Kubernetes user, and the available logging tools are proof of how the Kubernetes ecosystem has matured over the past year. We have much to look forward to as we move into 2018.