In any enterprise with real customers, there is rarely one of anything, and with multiple instances comes the risk of configuration drift.
That’s a challenge because drift makes it hard to achieve parity (software environments that are configured in the same way) across development, QA, production, and even disaster recovery environments. Without parity, you cannot be sure that applications are tested in real-world scenarios or that they will run as expected every time. This is especially true in the container space, where workloads are transient by nature.
With that challenge in mind, here’s a primer on how to achieve parity between disparate Kubernetes environments.
The first (and foremost) thing to think about is Kubernetes versioning. Obviously, you can’t have parity if you have different Kubernetes versions in different environments.
Within any given Kubernetes deployment, there is a set of binaries for the control plane (including the scheduler, the API server, and etcd), plus the kubelet and kube-proxy running on each compute node. If an enterprise can stay within the same minor release, like 1.11 or 1.14, in every environment, then that is a solid start.
In a perfect world, each environment will be the exact same version, right down to the security errata that are applied. Keeping versions in sync manually can be extremely time-consuming, however, so a commercially supported Kubernetes distribution that packages these updates for easier rollout is ideal.
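As a minimal sketch of how version parity could be checked, the snippet below classifies a set of environments as exactly matched, drifting only at the patch level, or drifting at the minor level. The environment names and version strings are made up for illustration; in practice they might be collected by running `kubectl version` against each cluster.

```python
import re
from typing import Dict, Tuple

# Matches versions such as "v1.14.3" or "1.14.3-gke.1" (leading "v" optional).
VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a version string into a (major, minor, patch) tuple."""
    match = VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def version_parity(versions: Dict[str, str]) -> str:
    """Classify drift across environments.

    Returns "exact" if every environment runs the same version,
    "patch-drift" if only the patch number differs, and
    "minor-drift" if the major or minor number differs.
    """
    parsed = {parse_version(v) for v in versions.values()}
    if len(parsed) <= 1:
        return "exact"
    if len({(major, minor) for major, minor, _ in parsed}) == 1:
        return "patch-drift"
    return "minor-drift"


# Hypothetical environments and versions, for illustration only.
environments = {"dev": "v1.14.3", "qa": "v1.14.3", "prod": "v1.14.1"}
print(version_parity(environments))  # patch-drift
```

A check like this could run on a schedule so that drift is flagged before it shows up as a failed deployment.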
Kubernetes orchestrates containers, but it needs a container runtime to actually run them. As with most technology, there are many different runtime options in the container space. Some deployments have multiple runtimes deployed for the widest possible compatibility, but if you can use a single one everywhere to simplify management, that would be ideal.
Just as a side note: The common container runtimes are containerd and runc, both of which originated at Docker, are open source, and follow the Open Container Initiative (OCI) specifications. Other options are becoming more commonplace, including KubeVirt for running whole virtual machines within a Kubernetes cluster.
How each application is deployed is arguably not as important as creating parity between Kubernetes environments, but the more automation that can be included in the deployment steps, the less chance there is for human error. Manual, one-off changes are among the most common sources of configuration drift.
Deployment automation covers three areas in the Kubernetes world. They can be tied together, but they all should be addressed.
1 – Infrastructure deployment to create the nodes that Kubernetes runs on can be handled through vendor-provided offerings like CloudFormation on AWS, JumpStart from Dell, or vendor-neutral offerings like Terraform from HashiCorp.
2 – Kubernetes deployment is done through a lot of manual scripting, through an installer provided by a vendor, or through an automation tool like Ansible.
3 – Application and service deployments within the cluster can be done one at a time with the kubectl command line tool (or a vendor-specific option like gcloud), through templates, which are more consistent, or through something more abstract with better update handling, like Helm charts.
Having all of these different areas of automation triggered through a CI/CD pipeline (like Jenkins) or other mechanism to fully build and deploy an environment with no manual intervention makes life so much easier.
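The three areas above could be chained together roughly as follows. This is only a sketch: the directory layout, inventory paths, playbook name, chart, and release name are all hypothetical, and it assumes Terraform, Ansible, and Helm are the tools chosen for each stage. By default it only prints the commands rather than running them.

```python
import shlex
import subprocess
from typing import List


def build_pipeline(env: str) -> List[List[str]]:
    """Return the ordered commands for one environment.

    Every path, playbook, chart, and release name here is hypothetical.
    """
    return [
        # 1. Infrastructure: create the nodes.
        ["terraform", f"-chdir=infra/{env}", "apply", "-auto-approve"],
        # 2. Kubernetes: install and configure the cluster on those nodes.
        ["ansible-playbook", "-i", f"inventories/{env}", "cluster.yml"],
        # 3. Applications: install or upgrade a release in the cluster.
        ["helm", "upgrade", "--install", "myapp", "charts/myapp",
         "--namespace", "myapp", "-f", f"values/{env}.yaml"],
    ]


def run_pipeline(env: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in build_pipeline(env):
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(command, check=True)


run_pipeline("qa")
```

The point of the design is that the environment name is the only input. Every environment goes through the same commands in the same order, so any difference between them has to live in version-controlled files rather than in someone's shell history.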
Are all of the required services deployed, with the proper interface versions exposed? This includes any Open Service Brokers that will be used to route to external services.
There are testing scenarios where the full version of every service is not required to meet the goals of an environment, but the service being tested still needs multiple dependencies met. In these cases, a service virtualization tool, like Parasoft or Mountebank, will allow services to submit requests and receive preset responses without the overhead of fully deployed dependencies. While this is not ideal, it can be a useful tool for isolating the root of performance or security incidents.
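To make the idea of preset responses concrete, here is a small stand-in written with only the Python standard library. It is not the API of Parasoft or Mountebank; the paths and payloads are invented, and a real tool would add features such as request matching and recording.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical preset responses for a dependency the service under test calls.
PRESET = {
    "/inventory/42": {"sku": 42, "in_stock": True},
}


class StubHandler(BaseHTTPRequestHandler):
    """Answer GET requests from the PRESET table, or 404 if no preset exists."""

    def do_GET(self):
        body = PRESET.get(self.path)
        status = 200 if body is not None else 404
        if body is None:
            body = {"error": "no preset response"}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def start_stub() -> HTTPServer:
    """Start the stub on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server: HTTPServer, path: str):
    """Send a GET request to the stub and return (status, decoded JSON body)."""
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()


if __name__ == "__main__":
    stub = start_stub()
    print(fetch(stub, "/inventory/42"))
    stub.shutdown()
    stub.server_close()
```

The service under test would simply be pointed at the stub's address instead of the real dependency.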
While not commonly used, hardware like GPUs and FPGA cards is leveraged in containers, and the same cards should be available for testing. It doesn’t matter if every environment has the exact same mix and number of each node type. It is more important that each node type exists in every environment where a service requiring it is deployed. Even if it is only a single labeled host, the hardware components need to be there every time to truly have consistency.
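One way to verify this is to compare node labels across environments. The sketch below assumes specialized hardware is advertised through node labels; the label keys and node data are hypothetical, and real label data could be exported with something like `kubectl get nodes --show-labels`.

```python
from typing import Dict, List, Set


def missing_hardware(
    required: Set[str],
    nodes_by_env: Dict[str, List[Dict[str, str]]],
) -> Dict[str, Set[str]]:
    """Map each environment to the required label keys that no node carries.

    Environments that satisfy every requirement are left out of the result.
    """
    gaps: Dict[str, Set[str]] = {}
    for env, nodes in nodes_by_env.items():
        present: Set[str] = set()
        for labels in nodes:
            present.update(labels.keys())
        missing = required - present
        if missing:
            gaps[env] = missing
    return gaps


# Hypothetical label keys and node inventories, for illustration only.
required_labels = {"example.com/gpu", "example.com/fpga"}
inventory = {
    "prod": [
        {"example.com/gpu": "true"},
        {"example.com/fpga": "true"},
        {"kubernetes.io/os": "linux"},
    ],
    "qa": [
        {"example.com/gpu": "true"},
        {"kubernetes.io/os": "linux"},
    ],
}
print(missing_hardware(required_labels, inventory))
# {'qa': {'example.com/fpga'}}
```

Note that the check only asks whether at least one node of each type exists, not how many, which matches the goal described above.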
Ensure that every environment has the same network policies available to the applications being deployed, and the same firewall rules and virtual LAN configuration at the host level. The ultimate goal is that every rule required in production can be audited and tested in any other environment that has that application deployed. This needs to be true regardless of the scale of deployment, whether it is a single group of nodes in a single cluster, multiple node groups within a cluster, or even multiple clusters that are federated.
And you are using the same technology to federate everywhere, right? Different types of clustering use the network in different ways. It’s best to stick to standards like Kubernetes Federation v2 (KubeFed) when possible.
While it may seem obvious that namespaces will be used in every environment, it is important to note not just how many exist, but how each application and service is deployed into them. Which services coexist in namespaces? Are all the namespace-specific network policies and RBAC roles available? If network and RBAC policies are in place, having consistent names for namespaces will save your sanity more often than not.
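A simple audit for these questions is to diff the objects in each namespace against a reference environment. The sketch below assumes each environment has already been summarized as a mapping of namespace to object kind to object names; the namespaces and object names are invented for illustration.

```python
from typing import Dict, List, Set

# namespace -> kind (for example "NetworkPolicy" or "Role") -> object names
Snapshot = Dict[str, Dict[str, Set[str]]]


def parity_gaps(reference: Snapshot, candidate: Snapshot) -> List[str]:
    """List everything present in the reference that the candidate lacks."""
    gaps: List[str] = []
    for namespace, kinds in sorted(reference.items()):
        if namespace not in candidate:
            gaps.append(f"namespace {namespace} is missing")
            continue
        for kind, names in sorted(kinds.items()):
            have = candidate[namespace].get(kind, set())
            for name in sorted(names - have):
                gaps.append(f"{kind} {namespace}/{name} is missing")
    return gaps


# Hypothetical snapshots, for illustration only.
prod: Snapshot = {
    "payments": {
        "NetworkPolicy": {"default-deny", "allow-api"},
        "Role": {"deployer"},
    },
    "search": {"NetworkPolicy": {"default-deny"}},
}
qa: Snapshot = {
    "payments": {
        "NetworkPolicy": {"default-deny"},
        "Role": {"deployer"},
    },
}

for gap in parity_gaps(prod, qa):
    print(gap)
# NetworkPolicy payments/allow-api is missing
# namespace search is missing
```

Because the comparison is keyed by namespace name, it only works when namespaces are named consistently across environments, which is one more reason to standardize them.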
There is a lot to think about to ensure that every Kubernetes environment deployed within your enterprise is consistent. But the first time that consistency stops a defect from reaching a customer-facing instance, it will protect your reputation and save considerable time and money.