# Investigate Why an EC2 Application is Not Reachable

## Overview

In this tutorial, we'll use Stakpak to investigate and fix an AWS networking incident where an application running on EC2 is healthy, but unreachable from the internet.

Rather than manually inspecting EC2, VPC, subnet, route table, security group, network ACL, systemd, nginx, and application logs one by one, we'll use Stakpak to:

* Investigate the incident
* Identify the root cause
* Apply the fix
* Validate that the EC2 application becomes reachable again

By the end of this tutorial, you'll know how to use Stakpak to troubleshoot EC2 application reachability issues across both the instance and AWS networking layers. We'll also set up Stakpak Autopilot so it monitors our infrastructure 24/7, fixes issues automatically when it's safe, and pings us when human judgment is needed.

{% hint style="info" %}
Stakpak is open source, vendor neutral, and works with any model you choose.
{% endhint %}

## Problem

You deploy a simple web application to an EC2 instance, and everything seems fine at first.

The Terraform deployment succeeds.

The EC2 instance is running.

The instance has a public IP address.

The security group appears to allow HTTP traffic.

The application process is healthy.

nginx is running.

But when you try to access the application from the internet, the request times out.

```
curl -v --connect-timeout 5 --max-time 10 http://ec2-3-236-155-58.compute-1.amazonaws.com/health
```

<figure><img src="/files/iYIQHDPHcy6EnuKO6XVx" alt=""><figcaption></figcaption></figure>

So you start the usual EC2 reachability debugging loop:

```
aws ec2 describe-instances \
  --instance-ids i-0a2bf3df8a5769989 \
  --region us-east-1

aws ec2 describe-instance-status \
  --instance-ids i-0a2bf3df8a5769989 \
  --region us-east-1

aws ec2 describe-security-groups \
  --group-ids sg-0d133f86e2d08a392 \
  --region us-east-1

aws ec2 describe-route-tables \
  --filters Name=vpc-id,Values=vpc-001f8813b0d78f5e3 \
  --region us-east-1

aws ec2 describe-subnets \
  --subnet-ids subnet-07083683f7e1d2f09 \
  --region us-east-1

aws ec2 describe-network-acls \
  --filters Name=association.subnet-id,Values=subnet-07083683f7e1d2f09 \
  --region us-east-1
```

Then you start checking the instance:

```
aws ssm start-session \
  --target i-0a2bf3df8a5769989 \
  --region us-east-1
```

Now you have to figure out what actually matters.

* Is the instance unhealthy?
* Is nginx down?
* Is the app listening on the wrong interface?
* Is the security group blocking traffic?
* Is the subnet public?
* Is the route table missing an internet route?
* Is the public IP missing?
* Is another VPC networking control blocking the request?

AWS gives you the clues, but you still have to connect them.
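One clue is available before opening any AWS console: how the request fails. As a rough sketch, the curl exit code can be mapped to the layer worth checking first. The helper name `classify_curl_exit` and its labels are ours, not part of curl or Stakpak. The exit codes themselves (0, 6, 7, 28) are documented by curl, while the interpretations in the comments are rules of thumb.

```shell
# Hypothetical helper: map a curl exit code to a short label.
#   0  -> request completed
#   6  -> hostname did not resolve (check DNS / public DNS name)
#   7  -> connection failed outright (often nothing listening on the port)
#   28 -> timed out (packets usually dropped silently on the network path:
#         security group, network ACL, routing, or a missing public IP)
classify_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "dns" ;;
    7)  echo "connect-failed" ;;
    28) echo "timeout" ;;
    *)  echo "other" ;;
  esac
}

# Usage (commented out so the block does not touch the network when sourced):
# curl -s -o /dev/null --connect-timeout 5 --max-time 10 "http://<public-dns>/health"
# classify_curl_exit "$?"
```

In this incident the request times out, which points toward the network path rather than toward nginx or the app.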

## How Stakpak Helps

Instead of manually tracing the request path across EC2, systemd, nginx, security groups, route tables, subnets, internet gateways, public IPs, and network ACLs, we can ask Stakpak to investigate the AWS environment for us.

Stakpak inspects both sides of the problem:

* The application and host layer
* The AWS networking and infrastructure layer

It checks whether the EC2 instance is healthy, whether the application is running, whether nginx is listening, whether local health checks pass, and whether the VPC path from the internet to the instance is correctly configured.

Stakpak then connects the signals across AWS and the instance, identifies why the public request is failing, applies the minimum safe fix, and validates that the application becomes reachable from the internet.

This is especially useful for EC2 incidents because public reachability depends on several layers being correct at the same time. A healthy application does not guarantee a reachable application.

## Application

The application is a simple web service running on an EC2 instance.

It represents a small catalog preview service for the Northstar Commerce platform.

The app exposes a health endpoint and a basic HTML page. It runs locally on the instance and is served to external clients through nginx.

The main components are:

* EC2 Instance: Runs the application and nginx.
* Python Web Application: Provides the demo web service and health endpoint.
* systemd Service: Keeps the application process running.
* nginx: Listens on HTTP port 80 and proxies requests to the local app.
* Security Group: Controls instance-level inbound and outbound traffic.
* Subnet: Places the instance inside the VPC network.
* Route Table: Defines how traffic leaves the subnet.
* Internet Gateway: Provides internet connectivity for the VPC.
* Network ACL: Applies subnet-level traffic rules.
* IAM Instance Profile: Allows access through AWS Systems Manager Session Manager.

The normal request flow is:

1. A user sends an HTTP request to the EC2 public DNS name.
2. Traffic enters the VPC through the internet gateway and reaches the public subnet.
3. The request passes the subnet and instance network controls (network ACL and security group).
4. nginx receives the request on port 80 and proxies it to the local Python app on 127.0.0.1:8080.
5. The app returns a health response.

The expected health endpoint is:

`GET /health`

When the application is working correctly, it returns:

`{`\
`"status": "ok",`\
`"service": "northstar-catalog-preview"`\
`}`

In this incident, the application is healthy from inside the instance, but unreachable from the internet.
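That inside-versus-outside difference can be confirmed by running the same health request at each hop. Below is a minimal sketch, assuming a hypothetical helper `is_health_ok` (our name) that takes an HTTP status code and a response body and succeeds only for a 200 whose body contains `"status": "ok"`. The temporary file path is also just an example.

```shell
# Hypothetical helper: succeed only if the status code is 200 and the JSON body
# reports "status": "ok". A simple pattern match, not a full JSON parser.
is_health_ok() {
  [ "$1" = "200" ] &&
    printf '%s' "$2" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Usage (commented out so the block does not touch the network when sourced).
# From an SSM session on the instance, check the app directly, then through nginx:
# code=$(curl -s -o /tmp/health.json -w '%{http_code}' http://127.0.0.1:8080/health)
# is_health_ok "$code" "$(cat /tmp/health.json)" && echo "app OK on 127.0.0.1:8080"
# code=$(curl -s -o /tmp/health.json -w '%{http_code}' http://127.0.0.1/health)
# is_health_ok "$code" "$(cat /tmp/health.json)" && echo "nginx OK on port 80"
```

If both local checks pass while the same request from your laptop times out, the problem sits between the internet and the instance, not in the application.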

Now that we understand the app, we can start troubleshooting.

## Step-by-Step Guide

### Prerequisites

1. [Install Stakpak](/docs/get-started/install-stakpak.md)
2. Cloud provider credentials configured

### Troubleshooting

1. Open Stakpak and ask it to `investigate the EC2 issue`

Now let's let it do its magic.

<figure><img src="/files/l1FShhEjigbpbAf0IFCi" alt=""><figcaption></figcaption></figure>

Stakpak started by investigating why the EC2 `/health` endpoint was timing out. It checked DNS, EC2 status, SSM access, security groups, route tables, NACLs, and local app health.

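It found that the EC2 instance and app were healthy, but the subnet network ACL was blocking outbound response traffic to clients' ephemeral ports. The instance could receive traffic on port 80, but couldn’t send responses back to clients. Because network ACLs are stateless, return traffic needs its own outbound rule even when the inbound request is allowed.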
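To see why a missing outbound rule produces exactly this symptom, here is a small model of how a network ACL decides on an outbound packet. A NACL evaluates its rules in ascending rule-number order, the first match decides, and anything unmatched is denied. The sketch is a simplification: it handles TCP only and ignores CIDR matching. The function name, the input format, and the sample rules are illustrative and not taken from this environment.

```shell
# Hypothetical model of NACL egress evaluation for TCP.
# stdin: one rule per line as "RULE_NUMBER ACTION PROTOCOL FROM_PORT TO_PORT"
#        (protocol 6 = TCP, -1 = all protocols; ports are ignored for -1).
# $1:    destination port of the outbound packet, i.e. the client's ephemeral port.
nacl_egress_decision() {
  sort -n | awk -v port="$1" '
    $3 == "-1" || ($3 == "6" && (port + 0) >= ($4 + 0) && (port + 0) <= ($5 + 0)) {
      print $2; decided = 1; exit
    }
    END { if (!decided) print "deny" }
  '
}

# Illustrative rules only: outbound port 80 allowed, then the default deny.
broken_rules='100 allow 6 80 80
32767 deny -1 None None'

# The same rules plus an outbound allow for the ephemeral range.
fixed_rules='100 allow 6 80 80
110 allow 6 1024 65535
32767 deny -1 None None'

printf '%s\n' "$broken_rules" | nacl_egress_decision 50000   # prints: deny
printf '%s\n' "$fixed_rules"  | nacl_egress_decision 50000   # prints: allow

# One way to list real egress rules in a similar shape (commented out; needs AWS credentials):
# aws ec2 describe-network-acls \
#   --filters Name=association.subnet-id,Values=<subnet-id> \
#   --query 'NetworkAcls[].Entries[?Egress==`true`][].[RuleNumber,RuleAction,Protocol,PortRange.From,PortRange.To]' \
#   --output text
```

The response to a request on port 80 leaves the instance addressed to a high client port such as 50000, so an outbound rule set that only covers port 80 drops it and the client sees a timeout.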

<figure><img src="/files/MWtW5AjnyxOBOIT146eo" alt=""><figcaption></figcaption></figure>

Then it:

* Verified EC2 status checks were passing
* Confirmed SSM access was online
* Confirmed nginx and the app were running locally
* Verified local /health returned 200 OK
* Confirmed the security group and route table were correct
* Added an outbound NACL rule for TCP 1024-65535
* Ran Terraform validation
* Applied the Terraform fix
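In this tutorial the rule was added through Terraform. For comparison, the equivalent one-off AWS CLI call looks roughly like the sketch below. The wrapper `allow_ephemeral_egress`, the `DRY_RUN` switch, the rule number 110, and the `acl-EXAMPLE` ID are placeholders of ours. By default the function only prints the command instead of running it.

```shell
# Hypothetical wrapper around `aws ec2 create-network-acl-entry`.
# $1 = network ACL ID, $2 = rule number (default 110), $3 = region (default us-east-1).
# With DRY_RUN=1 (the default) the command is printed, not executed.
allow_ephemeral_egress() {
  acl_id="$1"
  rule_number="${2:-110}"
  region="${3:-us-east-1}"
  # Protocol 6 is TCP; 1024-65535 is the ephemeral range named in the fix.
  set -- aws ec2 create-network-acl-entry \
    --network-acl-id "$acl_id" \
    --egress \
    --rule-number "$rule_number" \
    --protocol 6 \
    --port-range From=1024,To=65535 \
    --cidr-block 0.0.0.0/0 \
    --rule-action allow \
    --region "$region"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: print the command for a placeholder ACL ID.
# allow_ephemeral_egress acl-EXAMPLE
```

A rule created this way would not be tracked in Terraform state, so treat it as a way to read the fix rather than as a replacement for the Terraform change.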

During apply, Terraform replaced the EC2 instance because the AL2023 AMI changed.

After the fix, Stakpak verified that:

* The new instance i-04244ee1e1e4ef422 was running
* The new URL was <http://ec2-44-223-99-238.compute-1.amazonaws.com>
* The NACL allowed outbound ephemeral traffic
* /health returned HTTP/1.1 200 OK

Now everything is working 🥳

Let's ask it to set up Stakpak [Autopilot](/docs/how-it-works/autopilot.md) so we avoid waking up at 3 am because of an incident 🤡

{% hint style="info" %}
Stakpak Autopilot monitors your apps 24/7, detects unexpected changes, fixes what’s safe, and only alerts you when it actually matters.
{% endhint %}

### Monitoring

<figure><img src="/files/DwAc7r3dux5mJG7gKacd" alt=""><figcaption></figcaption></figure>

That's it. Now it won't haunt us in our nightmares at 3 am.

## Extra Resources

### Related Use Cases

* [Containerize a Python App](/docs/tutorial/containerize-a-python-app.md)
* [Load Test to Optimize Cloud Costs](/docs/tutorial/load-test-to-optimize-cloud-costs.md)
* [Free TLS with Caddy Web Server on AWS EC2 with Let's Encrypt](/docs/tutorial/free-tls-with-caddy-web-server-on-aws-ec2-with-lets-encrypt.md)

and more...

### References

* [Install Stakpak](/docs/get-started/install-stakpak.md)
* [Configure Stakpak](/docs/get-started/configure-stakpak.md)
* [Configuration and credential file settings in the AWS CLI](https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-files.html)
* [Autopilot](/docs/how-it-works/autopilot.md)
* [Handling Secrets](/docs/how-it-works/handling-secrets.md)
* [Warden Guardrails](/docs/how-it-works/warden-guardrails.md)

