Load Testing to Optimize Performance & Cloud Costs

How to load test your app to find the sweet spot between performance and cost.

Overview

Most load testing tools focus on performance metrics like response times and throughput. They don't show you, in real time, how much your scaling decisions cost.

This guide shows you how to use Stakpak to see both performance & cost during load tests. You'll learn exactly how much each scaling decision costs, so you can find the cheapest way to keep your app running smoothly.

Problem

Companies face several challenges when trying to optimize cloud costs through load testing:

The limitations of traditional load testing

  • Standard load testing tools focus on performance metrics such as response times, throughput, and error rates.

  • Teams then use these results to make architectural decisions such as scaling services, adding replicas, or reconfiguring infrastructure, without considering the financial impact.

  • They only realize the true cost of their architectural decisions once the cloud bill arrives.

This disconnect makes it difficult to align technical performance goals with cost optimization. Performance and cost live in two different universes.

Business Impact

  • Unpredictable monthly cloud bills

  • Inefficient resource utilization (often 20-30% waste)

  • Difficulty in capacity planning and budgeting

  • Competitive disadvantage due to higher operational costs

How Stakpak Helps

With the Infrastructure Cost Estimation Rule Books, you can ask Stakpak to measure not just performance but also the cost of your architecture decisions. Instead of waiting for the bill, you can forecast costs during testing and make architecture decisions that are both performance-driven and cost-aware.

You also don't need to remember any CLI or tool commands.

Step-by-Step Guide

Prerequisites

  1. Cloud provider credentials configured

  2. Application deployed and accessible

  3. Basic understanding of your application's architecture

  4. An endpoint chosen for testing (a staging or ephemeral environment)

  5. Explicit permission to run load tests

In this guide, we will load test hackathon-judge-app. Let's look at the cloud architecture and what the app does.

Architecture

This setup deploys the hackathon-judge-app on Amazon ECS Fargate in the eu-north-1 region. Traffic flows through an Application Load Balancer (80/443 → Target Group 8501) into ECS tasks running across two availability zones for high availability.

  • Networking: VPC (10.0.0.0/16) with public subnets (NAT Gateways) and private app subnets.

  • Compute: ECS Fargate cluster with auto scaling (1–2 tasks, CPU 70%, memory 80%).

  • Registry: Amazon ECR stores the container images.

  • Observability: CloudWatch Logs (7 days) and Container Insights enabled.

Now, let's take a look at the app.

Application

Hackathon Judge App

A Streamlit web application for judging hackathon pitches. Designed to be accessible through mobile web browsers with persistent data storage.

Features

  • Judge selection: Each judge can select their name before scoring teams

  • Team scoring: Judges can score teams based on configurable criteria

  • Score persistence: All scores are saved to a local JSON file

  • Mobile-friendly design: Optimized for use on mobile devices

  • Configurable through YAML: Easy to adjust teams, judges, and judging criteria

  • Custom branding: Add your event logo and title for a personalized experience

  • Authentication: Password protection for judges to secure the scoring process

Now that we understand the app and the architecture, we can start load testing.

  1. Open your terminal

  2. Open Stakpak by typing stakpak

In this guide, we will use Apache Bench to load test our app.

  3. Ask Stakpak to load test our app [insert app link] with Apache Bench and monitor its resource utilization

That's it. Stakpak will automatically figure out what to do and how to use Apache Bench.

In this initial test, CPU utilization peaked at 17.88%, memory remained stable at 15%, and auto scaling wasn't triggered.

Now let's run the high-load test: 200 concurrent users over 120 seconds.
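For reference, a run like this maps onto a short Apache Bench command line. The sketch below is a minimal illustration, not necessarily the command Stakpak generates: it assembles the ab arguments from the standard -c (concurrency), -t (time limit in seconds), and -n (request count) flags. The helper name build_ab_command and the URL https://staging.example.com/ are hypothetical placeholders.

```python
import shlex
from typing import List, Optional


def build_ab_command(
    url: str,
    concurrency: int,
    duration_s: Optional[int] = None,
    requests: Optional[int] = None,
) -> List[str]:
    """Assemble an Apache Bench (ab) argument list.

    -c sets the number of concurrent requests, -t a time limit in
    seconds, and -n the total number of requests.
    """
    if duration_s is None and requests is None:
        raise ValueError("set duration_s, requests, or both")
    cmd = ["ab", "-c", str(concurrency)]
    if duration_s is not None:
        cmd += ["-t", str(duration_s)]
    if requests is not None:
        # -n is placed after -t so an explicit request count is kept.
        cmd += ["-n", str(requests)]
    cmd.append(url)  # ab expects a URL that includes a path, e.g. a trailing slash
    return cmd


# Hypothetical staging URL; replace it with an endpoint you are allowed to test.
stress_cmd = build_ab_command(
    "https://staging.example.com/", concurrency=200, duration_s=120
)
print(shlex.join(stress_cmd))
```

According to the ab manual, -t implies an internal limit of 50,000 requests, so a fast endpoint may finish before the time limit unless -n is also set to a larger value.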

Even under stress testing, the highest CPU spike was 42.53%, far below the 70% auto-scaling threshold. This shows that the current infrastructure is over-provisioned for the tested workload and has plenty of headroom before scaling becomes necessary.
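The headroom can be checked with simple arithmetic. The sketch below compares the two CPU peaks observed in this guide (17.88% and 42.53%) with the 70% CPU scaling threshold. It is a simplification: the function name cpu_headroom is hypothetical, and real ECS auto scaling evaluates metrics over a sustained period, not a single peak.

```python
CPU_SCALE_OUT_THRESHOLD_PCT = 70.0  # CPU target from the auto scaling policy in this guide


def cpu_headroom(
    peak_cpu_pct: float,
    threshold_pct: float = CPU_SCALE_OUT_THRESHOLD_PCT,
) -> dict:
    """Compare an observed CPU peak with the scale-out threshold."""
    return {
        "peak_cpu_pct": peak_cpu_pct,
        # Distance to the threshold, in percentage points.
        "headroom_pct_points": round(threshold_pct - peak_cpu_pct, 2),
        # Simplified check: a single peak at or above the threshold.
        "would_scale_out": peak_cpu_pct >= threshold_pct,
    }


baseline = cpu_headroom(17.88)  # initial test
stress = cpu_headroom(42.53)    # 200 concurrent users over 120 seconds

for label, result in (("baseline", baseline), ("stress", stress)):
    print(label, result)
```

Under these numbers, the stress run still leaves 27.47 percentage points of CPU headroom before the scaling threshold.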

If you want to see this in action, watch our Stakpak Ship It session, where we ran these tests live.

Now that we’ve confirmed our infrastructure is over-provisioned, the next step is to evaluate cost efficiency. With Stakpak, you can go beyond performance testing and ask it to:

  • Estimate cloud costs for the current setup (it will use the Infrastructure Cost Estimation Rule Books)

  • Generate a detailed report breaking down where resources (CPU, memory, storage, networking) are underutilized

  • Provide actionable recommendations for cost optimization, for example rightsizing instances, adjusting auto-scaling policies, or switching to more efficient pricing models
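To make the cost side concrete, the arithmetic behind a Fargate compute estimate can be sketched as follows. Fargate bills per vCPU-hour and per GB of memory per hour, so monthly compute cost scales with task count and task size. This is not how Stakpak or its rule books compute costs. The task size (0.5 vCPU, 1 GB) and both hourly prices are placeholder assumptions, so substitute your own task definition and the current eu-north-1 rates. The load balancer, NAT Gateways, logs, and data transfer are left out.

```python
HOURS_PER_MONTH = 730  # common approximation of the hours in a month


def fargate_monthly_compute_cost(
    tasks: int,
    vcpu_per_task: float,
    memory_gb_per_task: float,
    price_per_vcpu_hour: float,
    price_per_gb_hour: float,
    hours: float = HOURS_PER_MONTH,
) -> float:
    """Estimate monthly compute cost for always-on Fargate tasks."""
    per_task_hour = (
        vcpu_per_task * price_per_vcpu_hour
        + memory_gb_per_task * price_per_gb_hour
    )
    return tasks * per_task_hour * hours


# Placeholder values: NOT real AWS prices and NOT the app's actual task size.
ASSUMED = dict(
    vcpu_per_task=0.5,
    memory_gb_per_task=1.0,
    price_per_vcpu_hour=0.04,
    price_per_gb_hour=0.004,
)

one_task = fargate_monthly_compute_cost(tasks=1, **ASSUMED)
two_tasks = fargate_monthly_compute_cost(tasks=2, **ASSUMED)

print(f"1 task:  {one_task:.2f} per month")
print(f"2 tasks: {two_tasks:.2f} per month")
```

Comparing the one-task and two-task figures shows what the upper bound of the 1–2 task scaling range costs relative to the minimum, which is the kind of trade-off to weigh against the load test results.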

Extra Resources:

  • Performance Testing

  • Infrastructure Cost Estimation

