The $47,000 Mistake
In Q3 2024, we received an AWS bill for $47,000. Our average monthly spend was $3,200.
The culprit was a misconfigured S3 lifecycle policy that had been silently accruing Glacier retrieval costs for four months. Nobody had noticed because our cost alerting was set up at the account level, not the service level.
This post is about the infrastructure lessons we learned the hard way, so you don't have to.
The Five Decisions That Matter Most
1. Multi-region from Day One? No. Unless you have genuine compliance requirements for data residency, don't try to go multi-region as a startup. The operational complexity is enormous and the reliability gains are marginal compared to a well-architected single-region setup with proper availability zone distribution.
2. Managed Services Over Self-Hosting Every time we've tried to self-host something we "could easily manage ourselves" — Redis, Postgres, Elasticsearch — we've regretted it. The engineering hours spent on maintenance and incident response always outweigh the cost savings.
Use RDS. Use ElastiCache. Use OpenSearch. The margin on AWS managed services is your peace of mind.
3. IaC Before Your First Production Incident If you don't have your infrastructure in code (Terraform, Pulumi, CDK — take your pick), you are one catastrophic incident away from being unable to rebuild your environment.
4. Cost Tagging is Not Optional Tag every resource with at minimum: `environment`, `service`, and `owner`. Enable AWS Cost Explorer from day one. Set up billing alarms at the service level, not just the account level.
5. On-call Rotation Needs to Scale Before You Do An on-call rotation of one person is not a rotation. Invest in proper incident management tooling (PagerDuty or equivalent) and document your runbooks before you need them.
What We'd Do Differently
Start with a platform team mentality from day one, even if the "team" is one person whose job is 20% infrastructure. The cost of retrofitting operational maturity into a fast-growing product is always higher than the cost of building it in from the start.