Debug container networking
A common problem is that an application running in a pod cannot reach another service.
Firewalls and configuration parameters usually get the blame, but not everyone knows how to debug them. This chapter provides a few tools and a methodology for tracking down network issues in Kubernetes pods.
Let's rule out any application-related issue
To be certain that the issue is truly a network issue, you first have to rule out your application.
Start a shell in the pod in order to run all the following tests.
kubectl exec -it pod-xxx -- bash
tip
If you get an error running the kubectl exec command, try starting a shell other than bash:
kubectl exec -it pod-xxx -- sh
or, for Alpine images:
kubectl exec -it pod-xxx -- ash
If none of these shells exist in the container image, you cannot get a shell inside that pod. You can still start a new pod with a shell in the same cluster:
kubectl run -i --tty laszlo-debug1 --image=debian -- bash
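The trial and error described in the tip can be scripted. The following is a minimal sketch, not a definitive tool: find_pod_shell is a hypothetical helper name, and pod-xxx is the placeholder pod name used above. It assumes that kubectl is already configured for your cluster and that a shell counts as available if it can run a no-op command.

```shell
#!/bin/sh
# find_pod_shell (hypothetical helper): print the first shell that can be
# started in the given pod, trying bash, sh and ash in that order.
find_pod_shell() {
  pod="$1"
  for candidate in bash sh ash; do
    # Run a no-op command with the candidate shell; success means it exists.
    if kubectl exec "$pod" -- "$candidate" -c 'exit 0' >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no shell found in $pod" >&2
  return 1
}

# Usage (not executed here):
#   kubectl exec -it pod-xxx -- "$(find_pod_shell pod-xxx)"
```

If the function prints nothing and returns a non-zero status, fall back to starting a separate debug pod as described above.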
It's (not) always DNS
If you are using a host name to connect to the remote service, first verify that a name server resolves it.
You can use dig for this task, but I personally prefer nslookup.
Run nslookup with the target domain and see if it resolves:
nslookup subdomain.myremotehost.xyc
If it resolves, you will see an IP address or a CNAME:
$ nslookup google.com
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
Name: google.com
Address: 142.250.201.206
Name: google.com
Address: 2a00:1450:400d:806::200e
But there is a chance that the address cannot be resolved:
$ nslookup does-not-exist.com
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
*** Can't find does-not-exist.com: No answer
tip
If a host name does not resolve, you can query other DNS servers to double-check that it exists. It may just be that your own name resolution is misconfigured or slow.
nslookup does-not-exist.com 1.1.1.1
queries Cloudflare's DNS, while
nslookup does-not-exist.com 8.8.8.8
queries Google's DNS. If these servers do not know the host either, you can be sure it is a DNS problem: the name is not published in public DNS. Keep in mind that public resolvers cannot resolve cluster-internal or private names.
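To compare several resolvers in one go, the check can be wrapped in a small script. This is only a sketch under a few assumptions: it uses dig rather than nslookup because dig +short prints nothing when there is no answer, which is easy to test for; resolve_with and compare_resolvers are hypothetical helper names; and the two public resolver addresses are the ones mentioned above.

```shell
#!/bin/sh
# resolve_with (hypothetical helper): succeed if the given DNS server
# returns at least one record for the name.
resolve_with() {
  name="$1"
  server="$2"
  # dig exits non-zero when it gets no reply from the server at all.
  answer=$(dig +short "$name" "@$server") || return 1
  # An empty answer means the server replied but has no record.
  [ -n "$answer" ]
}

# compare_resolvers (hypothetical helper): report, per public resolver,
# whether the name resolves.
compare_resolvers() {
  name="$1"
  for server in 1.1.1.1 8.8.8.8; do
    if resolve_with "$name" "$server"; then
      echo "$server: resolved"
    else
      echo "$server: no answer"
    fi
  done
}

# Usage (not executed here):
#   compare_resolvers subdomain.myremotehost.xyc
```

If the public resolvers answer but the resolver configured in the pod does not, the problem is more likely in the pod's or cluster's DNS setup than in the name itself.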
tip
If nslookup is not available in the pod, you can try installing it, but you have to be root to do so. On a Debian-based system, run:
apt update && apt install -y dnsutils
If you are not root, you can start a separate debug pod in the same cluster as your application, in which you are root:
kubectl run -i --tty laszlo-debug1 --image=debian -- bash
Try accessing the service
If DNS works, it is time to access the service from the pod.
If the service is an HTTP-based API, use curl to access it:
curl -X GET https://subdomain.myremotehost.xyc/api/myendpoint
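A plain curl call can hang for a long time or fail with an error that is hard to interpret. The sketch below adds a connect timeout and translates a few well-known curl exit codes (6, 7 and 28) into a short diagnosis. probe_http is a hypothetical helper name, and the 5 second timeout is an arbitrary assumption that you may want to adjust.

```shell
#!/bin/sh
# probe_http (hypothetical helper): request a URL and print a one-line
# diagnosis based on curl's exit code.
probe_http() {
  url="$1"
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  if status=$(curl -sS -o /dev/null -w '%{http_code}' \
      --connect-timeout 5 "$url" 2>/dev/null); then
    rc=0
  else
    rc=$?
  fi
  case "$rc" in
    0)  echo "reachable, HTTP status $status" ;;
    6)  echo "DNS resolution failed" ;;
    7)  echo "connection failed (refused or unreachable)" ;;
    28) echo "timed out" ;;
    *)  echo "curl failed with exit code $rc" ;;
  esac
  return "$rc"
}

# Usage (not executed here):
#   probe_http https://subdomain.myremotehost.xyc/api/myendpoint
```

Note that any HTTP status, even a 404 or 500, proves that the network path works; at that point the problem is in the application layer.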
If it is a database or some other kind of binary protocol, you can use the telnet command to open a plain socket connection to it:
$ telnet subdomain.myremotehost.xyc 5432
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
tip
If telnet is not available in the pod, you can try installing it, but you have to be root to do so. On a Debian-based system, run:
apt update && apt install -y telnet
If you are not root, you can start a separate debug pod in the same cluster as your application, in which you are root:
kubectl run -i --tty laszlo-debug1 --image=debian -- bash
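If telnet or curl is unable to connect to the service, either the service is not running or communication to the host is blocked somehow. A "Connection refused" error, as in the following output, usually means that nothing is listening on that port, while a connection that hangs until it times out usually points to a firewall dropping the traffic.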
$ telnet subdomain.myremotehost.xyc 5432
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
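The same check can be made non-interactive, which is handy when testing several hosts or ports. The following sketch uses netcat instead of telnet and assumes an nc build that supports the -z (scan only, send no data) and -w (timeout in seconds) options; not every netcat variant does. port_open is a hypothetical helper name, and the 5 second timeout is an arbitrary choice.

```shell
#!/bin/sh
# port_open (hypothetical helper): report whether a TCP connection to
# host:port can be established within 5 seconds.
port_open() {
  host="$1"
  port="$2"
  # -z: only test the connection, send no data; -w 5: give up after 5 seconds.
  if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
    echo "$host:$port is reachable"
  else
    echo "$host:$port is not reachable"
    return 1
  fi
}

# Usage (not executed here):
#   port_open subdomain.myremotehost.xyc 5432
```

Unlike telnet, this does not tell you whether the connection was refused or timed out, so rerun the telnet command when you need that distinction.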