How to Troubleshoot your App in Kubernetes

Troubleshooting apps in Kubernetes can be a bit of a nightmare. When you deploy your app onto Kubernetes and it doesn’t work, where do you start to figure out what’s gone wrong?

In this article I’ll talk about how to debug an app that’s deployed, and seems to be running, but for some reason you can’t access it. I’ll tell you the three main areas I look at when I’m investigating app issues in Kubernetes.

But first, coffee:

[Image: Feed me. Just hook it to my vein pls]

That’s better. Caffeine acquired. Now we can begin.

Common causes of problems

So what usually causes errors with apps on Kubernetes?

After debugging more than enough of my own Kubernetes problems, I’ve found that a deployed-but-not-working app usually comes down to one of a few things:

  • App is listening on the wrong ports

  • App is listening on the wrong interface (e.g. it should listen on 0.0.0.0, not localhost)

  • Wrong configuration of the app

  • App can’t read or write a file on disk, maybe the directory doesn’t exist or it doesn’t have permission

That doesn’t cover every possible error, but it certainly covers a lot of them.
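The wrong-interface cause is easy to check without changing the app. Here’s a rough sketch that lists the listening TCP sockets inside a Pod. show_listeners is just a name I’ve made up, and it assumes the image includes either ss or netstat, which slim images often don’t:

```shell
# show_listeners is a hypothetical helper name; pass it the name of your Pod.
show_listeners() {
  pod="$1"
  # -t TCP sockets, -l listening only, -n numeric addresses and ports.
  # Falls back to netstat if ss isn't installed in the image.
  kubectl exec "$pod" -- sh -c 'ss -tln 2>/dev/null || netstat -tln'
}
```

An address like 127.0.0.1:8080 in the output means the app only accepts connections from inside its own Pod. 0.0.0.0:8080 (or *:8080) means it is listening on all interfaces, which is what you want.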

How to troubleshoot

Here’s a typical application architecture on Kubernetes. I’ve numbered the three main areas that I would check, if I was looking to figure out why an application deployment isn’t working in Kubernetes:

[Figure: Troubleshooting applications in Kubernetes: where to begin?!]

Because Kubernetes is such a complex platform, there are a lot of places you could look to find the root cause of an issue.

But you can often find the root cause by checking these three things:

  1. Check your Pods

  2. Check the Service

  3. Check the Route or Ingress

Let’s look at each of these in turn.

Checking the Pod

Firstly, let’s look at the application itself. This means inspecting the Pod that is running your container.

As I’m just your humble guide to Kubernetes troubleshooting, I can’t possibly list every scenario which might cause an issue in your application. But let’s look at some common ways to troubleshoot.

Troubleshooting tips

You can start by checking the state of the Pod:

kubectl get pods <pod name>
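If the status is anything other than Running (for example CrashLoopBackOff, ImagePullBackOff or Pending), the Pod’s events will usually tell you why. Here’s a minimal sketch, wrapped in a helper I’ve called inspect_pod (the name is made up; the kubectl subcommands are standard):

```shell
# inspect_pod is a hypothetical helper name; pass it the name of your Pod.
inspect_pod() {
  pod="$1"
  # Status, restart count, Pod IP and the node the Pod was scheduled on
  kubectl get pod "$pod" -o wide
  # Full detail, including the Events list at the bottom of the output
  kubectl describe pod "$pod"
}
```

Call it with the Pod name from kubectl get pods, e.g. inspect_pod my-app-pod (a placeholder name).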

The next useful point for troubleshooting is to look at the logs of your Pod:

kubectl logs <pod name>

Look for any warning or error level logs. These are logs from the application running inside the Pod. If the Pod is continuously restarting or crashing, the logs can be really useful in finding out why:

  • Is the app missing a config file? Perhaps you need to supply some custom configuration to the app, in a ConfigMap or Secret

  • Is the app trying to connect to a service which doesn’t exist? e.g. an incorrect database URL, or incorrect URL for an API?

  • Is the app trying to connect to another service, but it’s using the wrong username and password?
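When a Pod keeps restarting, the logs you get by default belong to the newest container, which may not have failed yet. A small sketch for also pulling the logs of the container that crashed; crash_logs is a hypothetical name, and the --tail value of 50 is an arbitrary choice:

```shell
# crash_logs is a hypothetical helper name; pass it the name of your Pod.
crash_logs() {
  pod="$1"
  # Logs from the previous container instance, i.e. the one that crashed.
  # This prints an error if the container has never restarted.
  kubectl logs "$pod" --previous
  # The last 50 lines from the current container instance
  kubectl logs "$pod" --tail=50
}
```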

Test the app from inside the Pod

Assuming that your app is some kind of web site or web app, try accessing the app from inside the Pod itself.

So try opening a shell in the container, and then access the app using a command-line HTTP client. The two common HTTP clients are curl and wget.
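Opening the shell might look something like this. It’s only a sketch: open_shell is a name I’ve invented, and it assumes the image ships /bin/sh, which minimal or distroless images may not:

```shell
# open_shell is a hypothetical helper name; pass it the name of your Pod.
open_shell() {
  pod="$1"
  # -i keeps stdin open and -t allocates a terminal, giving an interactive shell
  kubectl exec -it "$pod" -- /bin/sh
}
```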

For example, if your app runs on port 8080, then try this from inside the Pod:

curl localhost:8080

  • Do you get a good result?

  • If not, is there anything obvious in the error message?

  • Do you see anything in the logs?

  • If curl doesn’t exist in the Pod, can you try something else, like wget?
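The checks above can also be run in one go without an interactive shell. This is a sketch under a few assumptions: probe_in_pod is a made-up helper, the port defaults to the 8080 used in the example, and the image has sh plus at least one of curl or wget:

```shell
# probe_in_pod is a hypothetical helper; arguments: Pod name, then port (default 8080).
probe_in_pod() {
  pod="$1"
  port="${2:-8080}"
  # Use curl if the image has it, otherwise fall back to wget.
  # -sS hides curl's progress meter but keeps errors; -qO- makes wget print the body quietly.
  kubectl exec "$pod" -- sh -c \
    "if command -v curl >/dev/null 2>&1; then curl -sS localhost:$port; else wget -qO- localhost:$port; fi"
}
```

Bear in mind that a successful request to localhost only proves the app works inside the Pod. If the app is bound to localhost alone, this test passes but the Service still won’t be able to reach it.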

If your app seems to be running OK, the next step is to look at the Service.

Testing the Service

The next step after testing your application Pods is to check the Service.

The Service is the load-balancing object in Kubernetes. It’s important because it gives your Pods a stable name and address, which makes your app accessible within the cluster.

But a Service can easily be misconfigured. From a simple typo, to using the wrong ports, I’ve been there, done that.

Troubleshooting tips

Start by getting the service name for your app:

kubectl get svc

Once you’ve figured out which service you need to troubleshoot, try a couple of things:

  • Can you access the Service from another Pod? Open a terminal inside another Pod, and try something like:

    curl http://myappservice:8080

    Can you reach the app?

  • Check the configuration of the Service (kubectl describe svc ...). Does everything look OK? Is it pointing to the correct ports on your Pod?
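If you don’t have another Pod handy to test from, you can start a throwaway one. The following is a sketch, not the only way to do it: check_service and the Pod name curl-test are made up, and curlimages/curl is just one public image that happens to include curl:

```shell
# check_service is a hypothetical helper; arguments: Service name, then port (default 8080).
check_service() {
  svc="$1"
  port="${2:-8080}"
  # Ports, selector and the endpoints the Service currently resolves to
  kubectl describe svc "$svc"
  # Call the Service from a temporary Pod, which --rm deletes when the command exits.
  # Assumption: the curlimages/curl image can be pulled in your cluster.
  kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl --command -- \
    curl -sS "http://$svc:$port"
}
```

For example, check_service myappservice 8080 repeats the curl test above from a fresh Pod in the current namespace.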

If you can access the Service within the cluster, but you can’t access the app externally, then it’s probably an issue with the Route or Ingress.

What usually causes errors with Services on Kubernetes?

Here are some reasons why a Service might not be working as you expect:

  • The Service is pointing to the wrong ports. The Service needs to know which port to forward requests to. If the Service’s targetPort doesn’t match the port your application listens on, requests will fail.

  • The Service selector is pointing to the wrong Pods. The Service looks for Pods which match the labels it has in its selector field. If the selector doesn’t match the Pod labels you’ve given in your Deployment or DeploymentConfig, it won’t find your Pods!
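Both of these can be checked by comparing what the Service asks for with what your Pods actually have. A minimal sketch, with check_selector as a hypothetical helper name:

```shell
# check_selector is a hypothetical helper name; pass it the name of your Service.
check_selector() {
  svc="$1"
  # The label selector the Service uses to find Pods
  kubectl get svc "$svc" -o jsonpath='{.spec.selector}'
  # The port mapping: port is what clients call, targetPort is where the Pod listens
  kubectl get svc "$svc" -o jsonpath='{.spec.ports}'
  # The labels actually present on your Pods
  kubectl get pods --show-labels
  # An empty ENDPOINTS column means the selector matched no ready Pods
  kubectl get endpoints "$svc"
}
```

If the endpoints list is empty, the selector and the Pod labels are the first thing to compare.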

Did you pass this step?

Then the final step in this little troubleshooting guide is to check the traffic getting into the cluster.

Check the Route/Ingress

If your app is facing the outside world, you’ll probably be accessing it via a Route or an Ingress.

So if your app seems to be working inside the cluster, the final step is to check if you can access it from your desktop.

A web browser and curl are the essential tools for this step.

First, try visiting the app in a web browser if it’s a web app. Or, if it’s an API, then try to curl your app’s endpoint.

If your request times out, or you just can’t get to the app at all, then you might have a problem in your Route or Ingress.

You can get some information about the Route/Ingress object, either using kubectl describe ... or using your Kubernetes/OpenShift dashboard:

kubectl describe ingress ...

Or in OpenShift:

oc describe route ...
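If the describe output is a lot to wade through, you can pull out just the Service that the object forwards to. This is a sketch that assumes a networking.k8s.io/v1 Ingress or an OpenShift Route; show_backend is a name I’ve made up:

```shell
# show_backend is a hypothetical helper; arguments: "ingress" or "route", then the object name.
show_backend() {
  kind="$1"
  name="$2"
  if [ "$kind" = "route" ]; then
    # OpenShift Route: the target Service is in spec.to.name
    oc get route "$name" -o jsonpath='{.spec.to.name}'
  else
    # Ingress (networking.k8s.io/v1): one backend Service name per path
    kubectl get ingress "$name" \
      -o jsonpath='{.spec.rules[*].http.paths[*].backend.service.name}'
  fi
}
```

The name it prints should match the Service you tested in the previous step.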

What usually causes errors with Routes and Ingresses on Kubernetes?

Networking varies massively from cluster to cluster, so there’s no single reason for networking problems. But some of the potential reasons for not being able to access an app from outside the cluster are:

  • The Route is HTTPS-only, and you’re accessing it via http://. Try changing the URL to start with https://.

  • The Ingress or Route is pointing to the wrong Service

  • Something’s up with the Router (e.g. HAProxy, Traefik, etc.) in your Kubernetes cluster! Call your cluster admin…
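To tell these cases apart from your desktop, it helps to look at the response headers over both HTTP and HTTPS. A rough sketch, where probe_external is a hypothetical helper and the hostname is whatever your Route or Ingress exposes:

```shell
# probe_external is a hypothetical helper; pass it the hostname from your Route or Ingress.
probe_external() {
  host="$1"
  # -I fetches headers only, -s hides the progress meter, -S still shows errors
  curl -sSI "http://$host"
  # -k skips certificate verification, which helps with self-signed test certificates
  curl -sSIk "https://$host"
}
```

A redirect or an error on the http:// request alongside a normal response on https:// points to the HTTPS-only case. A timeout on both is more likely a router or DNS problem.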

TL;DR

In summary then: troubleshooting your app on Kubernetes can seem a bit overwhelming. But I’ve found that the best way is to break it down into parts, and check each part in turn.

  • Check that your app is healthy and serving requests: open a shell inside the Pod, run a test request, and read its logs

  • Check that your Service is accessible from other Pods

  • Check your network Ingress or Route

Good luck, you will figure it out. And when you do, you will have learned a ton about Kubernetes in the process!
