# Fix Kubernetes Apps That Are Running but Not Reachable

## Overview

In this tutorial, we’ll use Stakpak to investigate and fix a Kubernetes incident where the application is unreachable even though the cluster looks healthy.

* The Pods are running.
* The Deployments are healthy.
* The Ingress exists.

But the application is still down.

Instead of manually jumping between `kubectl get`, `kubectl describe`, Ingress configs, Services, logs, events, and DNS to figure out where traffic is failing, we’ll ask Stakpak to investigate the cluster for us.

Stakpak will inspect the environment, connect the signals across Kubernetes networking layers, identify the root cause, apply the fix, and validate that the application is reachable again.

Then, we’ll set up Stakpak Autopilot to continuously monitor the cluster and handle similar issues automatically next time.

{% hint style="info" %}
Stakpak is open source, vendor neutral, and works with any model you choose.
{% endhint %}

## Problem

You deploy your application to Kubernetes, and everything seems fine.

The workloads are up. The Pods are running. Kubernetes does not show an obvious application crash.

But users cannot access the application.

The issue looks simple at first.

You try to access the application through the Ingress:

`curl -i -H 'Host: demo.local' http://127.0.0.1/`

Instead of a successful response, the request fails.
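To make the failure easier to read at a glance, the same request can be reduced to just its HTTP status code. The sketch below is a small helper and not part of Stakpak; the function name `ingress_status` is hypothetical, and it assumes the ingress controller is reachable at `127.0.0.1` on port 80, as in the command above.

```shell
# Hypothetical helper: print only the HTTP status code for a request
# sent through the Ingress with the given Host header.
# Usage: ingress_status <host> <url>
ingress_status() {
  # -s hides progress output, -o /dev/null discards the response body,
  # -w '%{http_code}' prints the status code of the response.
  curl -s -o /dev/null -w '%{http_code}' -H "Host: $1" "$2"
}

# Example (uncomment to run against the demo cluster):
# ingress_status demo.local http://127.0.0.1/
```

A `5xx` code usually points at the Ingress or something behind it, while a connection error suggests the request never reached the ingress controller at all.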

So you start the usual Kubernetes debugging loop:

```
kubectl get pods -n apps
kubectl get svc -n apps
kubectl get ingress -n apps
kubectl describe ingress -n apps storefront
kubectl get events -n apps
kubectl logs -n apps deploy/frontend
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller
```

And now you have to figure out what actually matters.

The frustrating part is that many Kubernetes failures look similar from the outside. A failed request could be caused by a problem in many different places:

* The Ingress
* The Ingress controller
* A Service
* Pod readiness
* Application configuration
* Internal DNS
* Networking
* A recent rollout
* A stale local or external route

Kubernetes gives you pieces of the picture, but not the full story.

You still have to trace the request path yourself:

* Did the request even reach the cluster?
* Did the ingress route traffic correctly?
* Did the Service point to the right Pods?
* Were healthy endpoints attached?
* Was the application actually listening on the expected port?
* Was the app failing while talking to another internal service?
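The questions above map roughly onto one `kubectl` query per hop. The following is a minimal sketch of that manual trace, not what Stakpak runs internally; the function name `trace_request_path` is hypothetical, and the `ingress-nginx` namespace and controller Deployment name are taken from the debugging commands earlier on this page.

```shell
# Hypothetical helper: print the Kubernetes objects along the request path
# so each hop can be checked in order.
# Usage: trace_request_path <namespace> <ingress> <service>
trace_request_path() {
  ns=$1 ing=$2 svc=$3

  echo "== Did the request reach the ingress controller? =="
  kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --tail=20

  echo "== Which Service does the Ingress route to? =="
  kubectl get ingress -n "$ns" "$ing" \
    -o jsonpath='{.spec.rules[*].http.paths[*].backend.service.name}{"\n"}'

  echo "== Service selector and ports =="
  kubectl get service -n "$ns" "$svc" \
    -o jsonpath='{.spec.selector}{"\n"}{.spec.ports}{"\n"}'

  echo "== Endpoints behind the Service (none means no ready Pod matches) =="
  kubectl get endpoints -n "$ns" "$svc"

  echo "== Pod labels and readiness =="
  kubectl get pods -n "$ns" --show-labels
}

# Example (uncomment to run against the demo cluster):
# trace_request_path apps storefront storefront-public
```

Reading the output top to bottom follows the same direction as the traffic, so the first hop that looks wrong is usually the one to investigate.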

## How Stakpak Helps

Instead of manually tracing traffic across ingress rules, Services, endpoints, logs, events, and application configs, we can ask Stakpak to investigate the cluster for us.

Stakpak inspects the Kubernetes networking path, connects the signals across the cluster, identifies where traffic is failing, applies the fix, and validates that the application becomes reachable again.

Then, we’ll configure Stakpak [Autopilot](/docs/how-it-works/autopilot.md) to continuously monitor the cluster and automatically detect and fix similar networking issues in the future.

## Application

The application is a storefront running on Kubernetes. Users access the storefront through an Ingress route, which sends traffic to the frontend Service. The frontend is responsible for rendering the web page and retrieving product data from an internal Catalog API. The Catalog API is not exposed directly to users; it runs inside the cluster and provides the product information that appears on the storefront page.

The main components are:

* Frontend: Handles user-facing requests and serves the storefront page.
* Catalog API: Provides product data to the frontend through an internal Kubernetes Service.
* Ingress: Exposes the frontend to users through the `demo.local` hostname.
* Kubernetes Services: Route traffic between the Ingress, frontend, and internal API.
* Namespaces: Separate the application and supporting platform resources inside the cluster.

The normal request flow is: a user accesses the storefront, the request enters through the Ingress, traffic is routed to the frontend, the frontend calls the Catalog API, and the page is returned with product data.

Now that we understand the app, we can start troubleshooting.

## Step-by-Step Guide

### Prerequisites

1. [Install Stakpak](/docs/get-started/install-stakpak.md)
2. Configure your cloud provider credentials

### Troubleshooting

1. Open Stakpak and give it this prompt: `Investigate why the Kubernetes app is not reachable, even though the Pods are running`

Now let's let it do its magic.

<figure><img src="/files/l1FShhEjigbpbAf0IFCi" alt=""><figcaption></figcaption></figure>

Stakpak traced the request path across the cluster and identified multiple networking and configuration issues causing the outage.

It found that the `storefront-public` Service had no healthy endpoints because its selector did not match the labels on the running Pods.
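A Service only sends traffic to Pods whose labels include every key/value pair in its selector, which is how a mismatched selector can leave a Service with no endpoints while the Pods stay healthy. The sketch below models that rule in plain shell. The function name `selector_matches`, the `app` label key, and the `pod-template-hash` value are assumptions for illustration, since this page does not show the actual labels.

```shell
# Hypothetical helper: return 0 if every key=value pair in the selector also
# appears in the Pod's labels (both given as comma-separated lists),
# otherwise return 1.
# Usage: selector_matches <selector> <pod-labels>
selector_matches() {
  selector=$1
  labels=",$2,"          # wrap in commas so only whole pairs can match
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case $labels in
      *",$pair,"*) ;;                # pair is present, keep checking
      *) IFS=$old_ifs; return 1 ;;   # pair is missing, the Pod is not selected
    esac
  done
  IFS=$old_ifs
  return 0
}

# Assumed labels on a running frontend Pod (illustrative only).
pod_labels="app=storefront,pod-template-hash=abc123"

if selector_matches "app=storefront-v2" "$pod_labels"; then
  echo "storefront-v2 selector: Pod selected"
else
  echo "storefront-v2 selector: no match, so no endpoints"
fi
```

With the assumed labels, the same check passes once the selector value is `storefront`, which is the change Stakpak made.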

Then it:

* Fixed the Service selector from `storefront-v2` to `storefront`
* Corrected the Service `targetPort` from `8081` to `web`
* Fixed the frontend API configuration from `catalog-api.platform.svc.cluster.local` to `catalog-api.apps.svc.cluster.local`
* Applied the updated Kubernetes manifests
* Restarted the frontend deployment so it picked up the updated ConfigMap
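The Catalog API address fix is easier to follow with the in-cluster DNS naming scheme in mind: a Service is addressed as `<service>.<namespace>.svc.<cluster-domain>`, so a wrong namespace segment sends the frontend looking in a different namespace. The helper below is a minimal sketch; the name `svc_fqdn` is hypothetical, and it assumes the cluster domain is `cluster.local`, as in the names above.

```shell
# Hypothetical helper: build the in-cluster DNS name of a Service.
# Usage: svc_fqdn <service> <namespace> [cluster-domain]
svc_fqdn() {
  # The cluster domain defaults to cluster.local when not given (assumption).
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

svc_fqdn catalog-api platform   # the address the frontend was configured with
svc_fqdn catalog-api apps       # the address after the fix
```

Comparing the generated name with the value in the frontend's ConfigMap is a quick way to spot a namespace typo before restarting the Deployment.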

After the changes were applied, Stakpak verified that:

* The `storefront-public` Service endpoints were created successfully
* The Ingress routed traffic to the correct backend Pods
* The application returned HTTP 200 OK
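The same verification can be repeated by hand after any fix. The sketch below polls the Ingress until it returns `200`, since a restarted Deployment may need a few seconds before its Pods are ready. The function name `wait_for_200`, the retry count, and the two-second delay are assumptions, and the host and URL are the ones used earlier on this page.

```shell
# Hypothetical helper: poll a URL through the Ingress until it returns 200.
# Usage: wait_for_200 <host> <url> [attempts]
wait_for_200() {
  host=$1 url=$2 attempts=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Print only the status code; the response body is discarded.
    code=$(curl -s -o /dev/null -w '%{http_code}' -H "Host: $host" "$url")
    if [ "$code" = "200" ]; then
      echo "reachable after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "still failing, last status: $code" >&2
  return 1
}

# Example (uncomment to run against the demo cluster):
# kubectl rollout status -n apps deployment/frontend
# kubectl get endpoints -n apps storefront-public
# wait_for_200 demo.local http://127.0.0.1/
```

Because the function returns a non-zero exit code on failure, it can also gate later steps in a script.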

Now everything is working 🥳

Let's ask it to set up Stakpak [Autopilot](/docs/how-it-works/autopilot.md) so we avoid waking up at 3 AM because of an incident 🤡

{% hint style="info" %}
Stakpak Autopilot monitors your apps 24/7, detects unexpected changes, fixes what’s safe, and only alerts you when it actually matters.
{% endhint %}

### Monitoring

<figure><img src="/files/lGMkaieGY0MLQPLi7oG6" alt=""><figcaption></figcaption></figure>

That's it. Now incidents won't haunt our nightmares at 3 AM.

## Extra Resources

### Related Use Cases

* [Containerize a Python App](/docs/tutorial/containerize-a-python-app.md)
* [Fix Kubernetes CrashLoopBackOff in Minutes](/docs/tutorial/fix-kubernetes-crashloopbackoff-in-minutes.md)

and more...

### References

* [Install Stakpak](/docs/get-started/install-stakpak.md)
* [Configure Stakpak](/docs/get-started/configure-stakpak.md)
* [Configuration and credential file settings in the AWS CLI](https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-files.html)
* [Autopilot](/docs/how-it-works/autopilot.md)
* [Handling Secrets](/docs/how-it-works/handling-secrets.md)
* [Warden Guardrails](/docs/how-it-works/warden-guardrails.md)