Kubernetes follows certain rules and policies when it comes to networking, and it’s not uncommon to encounter issues when trying to connect applications running in Kubernetes. Even the most trivial deployment needs to have the correct configuration so that K8s can assign the right IP address or ingress controller to the service. Furthermore, if you are operating the cluster on a public cloud provider like Google Cloud or AWS, you may have to follow their recommended configurations when deploying custom Kubernetes networking tools and certificate managers.
From an operator’s point of view, your job is to choose the right CNI (Flannel, Calico, or Weave), install a certificate manager (cert-manager), and route a domain to the cluster so that everything will work efficiently.
If you are a developer, on the other hand, you are probably more worried about ingress, paths, and certificates. How would you figure out why your application is not being routed successfully in a specific cluster?
In this extended tutorial, we will introduce you to six of the most common Kubernetes networking issues and show you how to troubleshoot them.
How to Debug DNS Resolution in K8s
Cannot Access Application from the Outside Using Ingress NGINX Controller
Flannel vs. Calico vs. Weave – Which One Is Better?
How Can I Redirect HTTP to HTTPS Using K8s Ingress?
Cert-Manager: How to See if the Client TLS Certificate Was Renewed
HELP! My Worker Node Is Not Ready and Returns the “CNI Plugin Not Initialized” Error
1. How to Debug DNS Resolution in Kubernetes networking
If you have trouble resolving DNS in K8s (when issuing certificates, for example), you might want to start with debugging the DNS resolution flow within the cluster. Here is what you can do:
Make sure that the DNS server is up and running by checking its logs:
$ kubectl logs --namespace=kube-system -l k8s-app=kube-dns
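Query the kube-dns endpoints to confirm that they are exposed:
$ kubectl get endpoints kube-dns --namespace=kube-system
Start a throwaway pod that ships with DNS utilities:
$ kubectl run -i -t dnsbox --image=tutum/dnsutils --restart=Never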
Start an interactive terminal within the dnsbox container and check the /etc/resolv.conf to ensure that it can resolve the kube-dns service:
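As a rough sketch of that check, the helper functions below run one-off commands in a temporary pod instead of an interactive session. The function names and the pod name dnsprobe are hypothetical; the tutum/dnsutils image is the one used above, and kubernetes.default is the service that fronts the API server in a standard cluster.

```shell
# Run a one-off command in a temporary pod built from the tutum/dnsutils image.
# The pod name "dnsprobe" is a hypothetical placeholder; --rm deletes the pod
# once the command exits.
dns_probe() {
  kubectl run dnsprobe --rm -i --restart=Never --image=tutum/dnsutils -- "$@"
}

# Show the resolver config that pods receive. The nameserver line should
# normally be the ClusterIP of the kube-dns service.
check_resolv_conf() {
  dns_probe cat /etc/resolv.conf
}

# Resolve the API server's service name through the cluster DNS.
check_service_lookup() {
  dns_probe nslookup kubernetes.default
}
```

Call check_resolv_conf first, then check_service_lookup. If the lookup fails while the nameserver entry looks right, the problem is more likely in the DNS pods themselves than in the pod's resolver config.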
2. Cannot Access Application from the Outside Using Ingress NGINX Controller
Incorrect ingress configuration is often the main reason that routes or connectivity to an application fail.
For example, you need to make sure that you assign the correct ingress class in the YAML config. With cluster versions earlier than 1.19, the class is set in the annotations section (the kubernetes.io/ingress.class annotation).
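A minimal sketch of both variants follows, written out as files so that you can review them before applying. The names example-ingress, example-service, and example.com are hypothetical placeholders, and the second manifest (for 1.19 and newer, using spec.ingressClassName) is an addition for comparison rather than something taken from the text above.

```shell
# Clusters older than 1.19: the class is set through an annotation.
cat > ingress-legacy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress            # hypothetical name
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: example.com              # hypothetical host
    http:
      paths:
      - path: /
        backend:
          serviceName: example-service   # hypothetical service
          servicePort: 80
EOF

# Clusters 1.19 and newer: the class is set through spec.ingressClassName.
cat > ingress-v1.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service
            port:
              number: 80
EOF
```

Apply whichever file matches your cluster, for example with kubectl apply -f ingress-v1.yaml. The class value has to match the class that your controller was installed with; "nginx" is only the usual default.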
Another source of problems comes from overriding the default backend annotation:
nginx.ingress.kubernetes.io/default-backend: example.com
This is supposed to be a backend that handles requests that the ingress controller does not understand, so it shouldn’t match a valid backend. Instead, it should match a service that handles 404 requests. For example, the following ingress might create issues:
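The original manifest is not reproduced here, so the following is only a sketch of the pattern described above, under the assumption that the annotation value names a service. All resource names are hypothetical. The default backend points at the same service that already serves real traffic, which is the situation to avoid.

```shell
# Write a hypothetical manifest that shows the problematic pattern.
cat > ingress-default-backend.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    # Problem: the fallback is the same service that handles valid routes.
    # It should point at a dedicated service that answers with 404 pages.
    nginx.ingress.kubernetes.io/default-backend: example-service
spec:
  ingressClassName: nginx
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service
            port:
              number: 80
EOF
```

One way to correct it would be to change the annotation value to a separate service that only returns error pages, and to leave the rules section as it is.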
3. Flannel vs. Calico vs. Weave – Which One Is Better at Kubernetes networking?
CNI plugins like Flannel, Calico, and Weave are designed to provide an unambiguous and painless way to configure container networking using a common interface. Using these plugins, you can rest assured that the minimum Kubernetes networking requirements are satisfied so that K8s can run efficiently. We will explain each provider below.
Flannel
Flannel is focused on networking at Layer 3 in the OSI networking model. It is considered to be a simple configuration tool for basic requirements. It runs a simple overlay network across all nodes of the Kubernetes cluster. For more advanced requirements (like the ability to configure Kubernetes network policies and firewalls), we recommend that you use a more feature-complete plugin like Calico.
To get started with Flannel, you can install the required services and daemonsets by applying the following manifest in a test cluster:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
For more details about how Flannel works behind the scenes, you can read this guide.
One common issue with Flannel is that pods sometimes fail to communicate with other pods in the same cluster, especially after restarting nodes or when upgrading the cluster. If you have this issue, you may need to upgrade Flannel to the latest version as follows:
- Delete the Flannel daemonset:
$ kubectl delete daemonset kube-flannel-ds --namespace=kube-system
- Upgrade Flannel to the latest version:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Calico
Calico is a full-featured CNI plugin that is maintained by Tigera. It’s currently very well maintained and has wide community support. If you choose to migrate to Calico from Flannel, you may find that the integration process is smooth. Calico utilizes the BGP protocol to move network packets between nodes.
To get started with Calico, you can install the required services and daemonsets by applying the following manifest in a self-managed Kubernetes cluster:
$ kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
This will deploy several resources in the cluster. You can use the calicoctl CLI to inspect the node status:
$ calicoctl node status
And you can inspect the status of the calico-kube-controllers pod like this:
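One way to do this, assuming the manifest above placed the controller in the kube-system namespace with the label k8s-app=calico-kube-controllers (the function name is hypothetical):

```shell
# List the calico-kube-controllers pod together with its node and IP.
# Assumes the kube-system namespace and the k8s-app label from calico.yaml.
calico_controllers_status() {
  kubectl get pods --namespace=kube-system -l k8s-app=calico-kube-controllers -o wide
}
```

Run calico_controllers_status and check that the pod is in the Running state. If your installation uses a different namespace, adjust the flag accordingly.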
Weave
Weave is a full-featured CNI plugin maintained by Weaveworks, which means that it allows you to create network policies (unlike Flannel). It uses a mesh overlay model between all nodes of a K8s cluster and employs a combination of strategies for routing packets between containers on different hosts.
To get started with Weave, you can install the required services and daemonsets by applying the following manifest in a test cluster:
$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
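Next, check the logs of one of the Weave pods to see if it’s running. If it is not, you may find an error like the one shown after this command: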
$ kubectl logs -n kube-system weave-net-7pnz9 weave
Network 10.32.0.0/12 overlaps with existing route 10.32.1.10/32 on host
In this case, it looks like the default network range that Weave defined in IPALLOC_RANGE overlaps with a route that already exists on the host. You can try again with a different address range, and then check the Weave status from inside one of its pods:
$ kubectl exec -n kube-system weave-net-fhcgx -c weave -- /home/weave/weave --local status
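To choose a different range when reinstalling, Weave Net's documentation describes passing env.IPALLOC_RANGE as a query parameter on the install URL. The helper below is a sketch that builds such a URL; it assumes the same endpoint as the install command above still accepts that parameter, and the function name and the example range are hypothetical.

```shell
# Build the Weave install URL with a custom allocation range.
# $1 is the CIDR to use for IPALLOC_RANGE, for example 10.0.0.0/16.
weave_manifest_url() {
  range="$1"
  # Same version string encoding as in the install command above.
  version="$(kubectl version | base64 | tr -d '\n')"
  printf 'https://cloud.weave.works/k8s/net?k8s-version=%s&env.IPALLOC_RANGE=%s\n' "$version" "$range"
}
```

You could then apply it with kubectl apply -f "$(weave_manifest_url 10.0.0.0/16)", picking a range that does not collide with any route already present on your hosts.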
If you’ve used Calico or Flannel and aren’t satisfied with their features or the experience they provide, Weave is a good alternative.
For an even more detailed comparison, you can read this article.
4. How Can I Redirect HTTP to HTTPS Using K8s Ingress?
If your K8s ingress controller does not support HTTP to HTTPS redirects out-of-the-box, you might have to configure it to do so within the appropriate metadata.
For example, you might need to set up a redirect middleware if you are using Traefik. With the Ingress NGINX controller, you can force the redirect with the following annotation:
nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
You might want to check to see if you have the following config on your NGINX sites config just to be sure:
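The expected config is not reproduced here. As a hedged alternative, the sketch below dumps the nginx.conf that the controller rendered and filters it for redirect-related lines. The function name is hypothetical, the namespace and pod name depend on your installation, and the exact directives you will see vary between controller versions.

```shell
# Print redirect-related lines from the rendered NGINX config.
# $1 is the controller namespace, $2 is the controller pod name.
ingress_nginx_redirect_config() {
  kubectl exec --namespace="$1" "$2" -- cat /etc/nginx/nginx.conf | grep -n -i 'redirect'
}
```

For example: ingress_nginx_redirect_config ingress-nginx my-controller-pod, where both arguments are placeholders. An empty result suggests that the annotation was not picked up by the controller.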
5. Cert-Manager: How to See if the Client TLS Certificate Was Renewed
When a certificate is renewed, some fields (like the revision and the renewal dates) will be updated. You can check the revision count of a certificate using the following command:
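A minimal sketch, assuming a cert-manager version whose Certificate resources expose status.revision and status.renewalTime (the function names are hypothetical):

```shell
# Print how many times cert-manager has issued this certificate.
# $1 is the Certificate name, $2 is its namespace.
certificate_revision() {
  kubectl get certificate "$1" --namespace="$2" -o jsonpath='{.status.revision}'
}

# Print the time at which cert-manager plans to renew the certificate.
certificate_renewal_time() {
  kubectl get certificate "$1" --namespace="$2" -o jsonpath='{.status.renewalTime}'
}
```

For example, certificate_revision example-tls default prints the revision of a Certificate named example-tls, which is a placeholder. A revision that increases between two checks indicates that the certificate was reissued in the meantime.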
6. HELP! My Worker Node Is Not Ready and Returns the “CNI Plugin Not Initialized” Error
The most common configuration issues with CNI plugins come from setting the wrong pod-network-cidr parameter or from a value that does not match the CNI plugin configuration (IPALLOC_RANGE in Weave or this config in Flannel). For reference, you can use the following command when setting up the cluster:
$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16
This CIDR block must be available to use within your network, and it should fall within the --cluster-cidr block if defined.
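To make the "falls within" condition concrete, here is a small sketch that compares two IPv4 CIDR blocks using shell arithmetic only. The function names are hypothetical, and the helper assumes well-formed input without leading zeros in the octets.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  # Split on dots in a subshell so IFS and the positional
  # parameters of the caller are left untouched.
  (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
  )
}

# Succeed if CIDR $1 lies entirely inside CIDR $2.
cidr_within() {
  inner_ip=${1%/*}; inner_len=${1#*/}
  outer_ip=${2%/*}; outer_len=${2#*/}
  # A shorter prefix means a larger block, which cannot fit inside.
  [ "$inner_len" -ge "$outer_len" ] || return 1
  # Network mask of the outer block, truncated to 32 bits.
  mask=$(( (4294967295 << (32 - outer_len)) & 4294967295 ))
  # Both addresses must share the outer block's network bits.
  [ $(( $(ip_to_int "$inner_ip") & mask )) -eq $(( $(ip_to_int "$outer_ip") & mask )) ]
}
```

For example, cidr_within 192.168.1.0/24 192.168.0.0/16 succeeds, while cidr_within 10.244.0.0/16 192.168.0.0/16 fails because the two blocks share no network bits.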
You can inspect the current value by using cluster-info and filtering the specified config:
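One common way to do this is sketched below. It assumes that the control plane components were started with a --cluster-cidr flag, so that the value appears in the dump; the function name is hypothetical.

```shell
# Print the first line of the cluster dump that mentions cluster-cidr.
show_cluster_cidr() {
  kubectl cluster-info dump | grep -m 1 cluster-cidr
}
```

If show_cluster_cidr prints nothing, the flag may not be set on your cluster, or your provider may manage the pod network range elsewhere.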
You can also search for solutions if you are dealing with a known issue with your cloud provider. If you are using AKS, for example, you can search here for known issues and filter by your CNI plugin name.
Expand Your Knowledge of Kubernetes Networking
Learning about K8s networking is a continuous process. You will need to spend considerable time evaluating the available options and troubleshooting issues in order to be productive.
To start, you should carefully read the official docs and keep them handy for reference.
You can expand your knowledge by reading Platform9’s Kubernetes networking blog series and exploring the “Further Readings” sections. By the end of this series, you will have a strong foundation for tackling challenges related to managing Kubernetes networking.
I also recommend that you read this three-part series on K8s networking fundamentals written by Mark Betz, since it’s considered to be one of the best introductions to the topic.
The best way to understand K8s networking, of course, is to practice it in a real cluster. You can try Platform9’s free tier, which bootstraps the necessary services and storage plugins using the cloud provider of your choice.