Debug container networking

A common problem is that an application running in a pod cannot reach another service.

Firewalls and configuration parameters usually get the blame, but not everyone knows how to debug them. This chapter provides a few tools and a methodology for narrowing down network issues in Kubernetes pods.

To be certain that the issue really is a network issue, first rule out your application.

Start a shell in the pod and run all of the following tests from it.

kubectl exec -it pod-xxx -- bash
tip

If you get an error running the kubectl exec command, try starting a shell other than bash.

Try kubectl exec -it pod-xxx -- sh, or kubectl exec -it pod-xxx -- ash for Alpine images. If none of these shells exists in the container image, you cannot open a shell in that container. You can still start a new pod with a shell in the same cluster: kubectl run -i --tty laszlo-debug1 --image=debian -- bash
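The trial and error described above can be scripted. The following is a minimal sketch, not an official kubectl feature: find_shell is a hypothetical helper name, and pod-xxx is the placeholder pod name used in this chapter. It assumes kubectl is already configured for your cluster.

```shell
#!/bin/sh
# find_shell (hypothetical helper): print the first shell that starts in a pod.
find_shell() {
  pod="$1"
  for candidate in bash sh ash; do
    # Run a no-op in the candidate shell; success means the binary exists.
    if kubectl exec "$pod" -- "$candidate" -c 'exit 0' >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no shell found in $pod" >&2
  return 1
}
```

You could then open an interactive session with kubectl exec -it pod-xxx -- "$(find_shell pod-xxx)".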

It's (not) always DNS

If you are using a host name to connect to the remote service, it is always a good idea to verify that a name server can resolve it.

You can use dig for this task, but I personally prefer nslookup.

Run nslookup against the target domain, for example nslookup subdomain.myremotehost.xyc, and see if it resolves.

If it resolves, you will see an IP address or a CNAME:

$ nslookup google.com
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
Name: google.com
Address: 142.250.201.206
Name: google.com
Address: 2a00:1450:400d:806::200e

But there is a chance that the address cannot be resolved:

$ nslookup does-not-exist.com
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
*** Can't find does-not-exist.com: No answer
tip

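If a host name does not resolve, you can query other DNS servers to double-check that it exists. It may just be that your own name resolution is broken or slow. nslookup does-not-exist.com 1.1.1.1 queries Cloudflare's DNS, while nslookup does-not-exist.com 8.8.8.8 queries Google's DNS. If these public servers do not know the host either, the name most likely does not exist in public DNS, so the problem lies with the DNS record rather than with your pod. Keep in mind that cluster-internal and private names are never known to public resolvers.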
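The cross-check against several resolvers can be run in one go. This is a rough sketch: compare_resolvers and query are hypothetical helper names, and it assumes that your nslookup exits with a non-zero status when a query fails, which may differ between implementations.

```shell
#!/bin/sh
# query (hypothetical helper): $1 = host name, $2 = optional DNS server.
query() {
  if [ -n "$2" ]; then
    nslookup "$1" "$2"
  else
    nslookup "$1"
  fi >/dev/null 2>&1
}

# Ask the pod's default resolver, then Cloudflare and Google, about one name.
compare_resolvers() {
  name="$1"
  for server in "" 1.1.1.1 8.8.8.8; do
    label="${server:-pod default}"
    if query "$name" "$server"; then
      echo "$label: resolved"
    else
      echo "$label: failed"
    fi
  done
}
```

If only the pod default line fails, the name exists and the pod's own resolver is the likely culprit. If all three fail, the name itself is probably wrong or private.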

tip

If nslookup is not available in the pod, you can try installing it, but you have to be root to do so. On a Debian-based system, run apt update && apt install -y dnsutils

If you are not root, you can start a debug pod in the same cluster as your application, in which you are root: kubectl run -i --tty laszlo-debug1 --image=debian -- bash

Try accessing the service

If DNS works, it is time to access the service from the pod.

If the service is an HTTP-based API, use curl to access it:

curl -X GET https://subdomain.myremotehost.xyc/api/myendpoint
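When curl fails, its exit status hints at which layer broke. The sketch below maps a few of the exit codes documented in the curl man page to a short explanation. explain_curl_exit and probe_url are hypothetical helper names, and the timeout values are arbitrary choices rather than recommendations.

```shell
#!/bin/sh
# explain_curl_exit (hypothetical helper): name the layer that likely failed.
explain_curl_exit() {
  case "$1" in
    0)     echo "http exchange completed" ;;
    6)     echo "dns: could not resolve host" ;;
    7)     echo "tcp: failed to connect" ;;
    28)    echo "timeout" ;;
    35|60) echo "tls: handshake or certificate problem" ;;
    *)     echo "other: see EXIT CODES in man curl" ;;
  esac
}

# probe_url (hypothetical helper): request a URL with bounded waiting time.
probe_url() {
  # -sS hides the progress meter but still prints errors to stderr.
  curl -sS -o /dev/null --connect-timeout 5 --max-time 10 "$1"
  explain_curl_exit "$?"
}
```

For example, probe_url https://subdomain.myremotehost.xyc/api/myendpoint prints one line that tells you whether to look at DNS, the TCP connection, or TLS. Note that an exit status of 0 only means the HTTP exchange completed; the response may still be an error such as 404 or 500.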

If it is a database or some other kind of binary protocol, you can use the telnet command to open a plain socket connection to it:

$ telnet subdomain.myremotehost.xyc 5432
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
tip

If telnet is not available in the pod, you can try installing it, but you have to be root to do so. On a Debian-based system, run apt update && apt install -y telnet

If you are not root, you can start a debug pod in the same cluster as your application, in which you are root: kubectl run -i --tty laszlo-debug1 --image=debian -- bash

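If telnet or curl cannot connect to the service, either the service is not running or communication with the host is blocked somewhere along the way. A Connection refused error, as in the example below, usually means the host was reached but nothing is listening on that port. A connection attempt that hangs until it times out usually means the packets are being dropped, for example by a firewall.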

$ telnet subdomain.myremotehost.xyc 5432
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
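If telnet cannot be installed, a similar check is possible with bash alone. This is a minimal sketch: probe_port is a hypothetical helper name, and it assumes that bash was built with /dev/tcp support and that the timeout command from coreutils is present in the image. The 3 second limit is an arbitrary choice.

```shell
#!/bin/sh
# probe_port (hypothetical helper): test a TCP port without telnet.
# Prints "open", "timeout", or "closed" (refused, unreachable, or unresolved).
probe_port() {
  host="$1"
  port="$2"
  # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
  case "$?" in
    0)   echo "open" ;;
    124) echo "timeout" ;;  # exit status used by timeout when time runs out
    *)   echo "closed" ;;
  esac
}
```

For example, probe_port subdomain.myremotehost.xyc 5432 distinguishes a port that answers from one that refuses the connection or never responds.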