Detect and Fix Missing Backups for a PostgreSQL DB in Production
Overview
Running a PostgreSQL database in production without backups is one of the most common and dangerous mistakes.
Everything works fine… until it doesn’t.
In this guide, we will set up Stakpak Autopilot to continuously monitor our database, ensure backups are configured, and nag us on Slack if they’re missing.
Problem
A PostgreSQL database can run in production without any backups configured, and nothing will tell you.
Setting up backup monitoring manually requires:
Writing and scheduling backup jobs
Verifying backups actually run
Tracking retention and storage
Setting up alerts when backups fail or stop
In practice, this is often skipped or misconfigured.
Backups silently stop working, schedules drift, or alerts are never set up.
By the time you notice, it's usually too late.
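To make the "verifying backups actually run" step concrete, here is a minimal sketch of a freshness check. It is not how Stakpak implements its monitoring; the function names, the status strings, and the 24-hour threshold are all assumptions chosen for illustration. The timestamps would come from wherever your backups are stored (for example, object listing metadata).

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def newest_backup_age(backup_times: Iterable[datetime], now: datetime) -> Optional[timedelta]:
    """Return the age of the most recent backup, or None if no backups exist."""
    times = list(backup_times)
    if not times:
        return None
    return now - max(times)


def backup_status(
    backup_times: Iterable[datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed threshold, tune to your schedule
) -> str:
    """Classify the backup state as 'missing', 'stale', or 'ok' (hypothetical labels)."""
    age = newest_backup_age(backup_times, now)
    if age is None:
        return "missing"  # nothing was ever backed up, or everything was deleted
    if age > max_age:
        return "stale"    # backups exist, but the job has silently stopped
    return "ok"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(backup_status([], now))
    print(backup_status([now - timedelta(hours=2)], now))
```

Separating "missing" from "stale" matters because the two failure modes described above (never configured vs. silently stopped) usually call for different alerts.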
How Stakpak Helps
Stakpak uses /init to analyze your infrastructure and identify stateful services like PostgreSQL.
Then it:
Detects whether a backup strategy exists
Flags missing or unsafe configurations
Recommends a safe backup schedule
Offers to set up Stakpak Autopilot to monitor your infrastructure 24/7, detect unexpected changes, fix what’s safe, and only alert you when it actually matters.
Step-by-Step Guide
Prerequisites
Cloud provider credentials configured
Now we can start.
Open stakpak and type
/init
Now it will start exploring the local repos and the different cloud providers you have configured
Let's take a look at the apps.md that it created

First, we see that it found one app running on a t3.small EC2 instance, which hosts both Flask and PostgreSQL

Then it flagged the risks it found, one of which was that there are no backups at all

Then it recommended setting up Autopilot schedules. As you can see, one of them makes sure that the database is backed up
Now let's ask Stakpak to mitigate the critical risk and to set up Stakpak Autopilot

Then it will do its Stakpak magic ✨
Now, let's see what Stakpak did

As you can see, it mitigated the risks I asked it to, and it started the Autopilot schedules
Let's wait for the first schedule to fire in 5 minutes

Here, as you can see, it's working correctly🥳
Now, let's delete the backup and see what it does

As you can see, it found that there was no backup, so it ran the script and backed up the data to S3
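The detect-then-remediate behavior shown above can be sketched roughly as follows. This is an illustrative assumption, not the script Stakpak generated: the helper names, the S3 key layout, and the database name `appdb` are hypothetical. Only the `pg_dump` options (`--format=custom`, `--file`) and the `aws s3 cp` command are real CLI features.

```python
from datetime import datetime, timezone
from typing import Callable, List


def build_pg_dump_command(dbname: str, out_path: str) -> List[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    # --format=custom produces an archive that pg_restore can read.
    return ["pg_dump", "--format=custom", f"--file={out_path}", dbname]


def backup_key(dbname: str, now: datetime, prefix: str = "backups") -> str:
    """Hypothetical S3 key layout: <prefix>/<dbname>/<UTC timestamp>.dump"""
    return f"{prefix}/{dbname}/{now:%Y%m%dT%H%M%SZ}.dump"


def build_upload_command(local_path: str, bucket: str, key: str) -> List[str]:
    """Build an AWS CLI command that copies the dump to S3."""
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/{key}"]


def ensure_backup(backup_exists: Callable[[], bool], run_backup: Callable[[], None]) -> str:
    """Run the backup only when the check reports that none exists."""
    if backup_exists():
        return "ok"
    run_backup()
    return "backup_created"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Print the commands instead of executing them, so the sketch is safe to run.
    print(build_pg_dump_command("appdb", "/tmp/appdb.dump"))
    print(build_upload_command("/tmp/appdb.dump", "my-bucket", backup_key("appdb", now)))
```

In a real job, `run_backup` would execute the two commands (for example with `subprocess.run(..., check=True)`) and `backup_exists` would list the bucket prefix. Passing both in as callables keeps the decision logic testable without touching the database or S3.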
Extra Resources:
Related Use Cases
Monitor stateful services and ensure they’re safe (backups, persistence)
Detect disks filling up before they break production
Catch expired or misconfigured credentials
Detect infrastructure drift and risky changes
Investigate CI/CD failures and find root causes
Spot abnormal cloud costs and leaks
Monitor Kubernetes issues (OOMKilled, crashes)
Ensure services are actually running and reachable
Flag insecure configurations and risks
and more...