Vedonyx

Cloud Infrastructure for Startups: What We Wish We Knew at $1M ARR

Esther Howard

Software Developer

March 17, 2026

The infrastructure decisions you make when you're small will haunt you or save you when you scale. A CTO's honest retrospective on what we got right and what we got catastrophically wrong.

The $47,000 Mistake

In Q3 2024, we received an AWS bill for $47,000. Our average monthly spend was $3,200.

The culprit was a misconfigured S3 lifecycle policy that had been silently accruing Glacier retrieval costs for four months. Nobody had noticed because our cost alerting was set up at the account level, not the service level.
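One way to catch this kind of drift earlier is to compare each service's spend against its own baseline instead of watching only the account total. The following is a minimal sketch, not what we actually ran: the function name, the `ratio` and `min_usd` thresholds, and the sample figures are all illustrative assumptions, and the per-service numbers would in practice come from a billing export.

```python
from statistics import mean


def find_spend_spikes(history, current, ratio=2.0, min_usd=50.0):
    """Return services whose current monthly spend exceeds `ratio` times
    their own trailing average.

    history: {service: [past monthly spend in USD, ...]}
    current: {service: this month's spend in USD}
    """
    spikes = {}
    for service, spend in current.items():
        past = history.get(service, [])
        baseline = mean(past) if past else 0.0
        # Skip tiny line items; a service with no history has baseline 0,
        # so any spend above min_usd is flagged.
        if spend >= min_usd and spend > ratio * baseline:
            spikes[service] = (baseline, spend)
    return spikes


# Illustrative numbers only (hypothetical, not the figures from our bill).
history = {
    "AmazonS3": [400.0, 420.0, 410.0],
    "AmazonEC2": [2000.0, 2100.0, 2050.0],
}
current = {"AmazonS3": 1900.0, "AmazonEC2": 2150.0}

spikes = find_spend_spikes(history, current)
print(spikes)
```

In this made-up data the account total rises by well under 2x, so an account-level threshold set at double the usual bill would stay quiet, while the S3 line item is several times its own baseline and gets flagged.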

This post is about the infrastructure lessons we learned the hard way, so you don't have to.

The Five Decisions That Matter Most

1. Multi-region from Day One? No.

Unless you have genuine compliance requirements for data residency, don't go multi-region as a startup. The operational complexity is enormous, and the reliability gains are marginal compared with a well-architected single-region setup spread across multiple availability zones.
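As a rough illustration of what "proper availability zone distribution" can mean in practice, here is a small check that flags a fleet concentrated in too few zones. It is a sketch under stated assumptions: the function name and the `min_zones` and `max_share` thresholds are hypothetical, and the input is just a list of zone names, one per instance, however you choose to collect them.

```python
from collections import Counter


def az_distribution_issues(instance_azs, min_zones=2, max_share=0.6):
    """Return a list of human-readable problems with how instances are
    spread across availability zones (an empty list means none found)."""
    counts = Counter(instance_azs)
    total = sum(counts.values())
    if total == 0:
        return ["no instances"]

    issues = []
    if len(counts) < min_zones:
        issues.append(
            f"only {len(counts)} zone(s) in use, want at least {min_zones}"
        )
    # Flag any single zone that holds more than max_share of the fleet.
    for az, n in sorted(counts.items()):
        if n / total > max_share:
            issues.append(f"{az} holds {n}/{total} instances")
    return issues


lopsided = ["us-east-1a", "us-east-1a", "us-east-1a", "us-east-1b"]
balanced = ["us-east-1a", "us-east-1b", "us-east-1c"]
print(az_distribution_issues(lopsided))
print(az_distribution_issues(balanced))
```

A check like this could run in CI against your infrastructure definitions or on a schedule against the live fleet; the right thresholds depend on how many zones you use and how much capacity you can afford to lose at once.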

2. Managed Services Over Self-Hosting

Every time we've tried to self-host something we "could easily manage ourselves" (Redis, Postgres, Elasticsearch), we've regretted it. The engineering hours spent on maintenance and incident response always outweigh the cost savings.

Use RDS. Use ElastiCache. Use OpenSearch. The margin you pay on AWS managed services buys your peace of mind.

3. IaC Before Your First Production Incident

If you don't have your infrastructure in code (Terraform, Pulumi, or CDK; take your pick), you are one catastrophic incident away from being unable to rebuild your environment.

4. Cost Tagging Is Not Optional

Tag every resource with, at minimum, `environment`, `service`, and `owner`, and activate those keys as cost allocation tags so they show up in your cost reports. Enable AWS Cost Explorer from day one. Set up billing alarms at the service level, not just the account level.
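Both halves of this advice can be sketched in a few lines. The first function checks a resource's tags against the three required keys; the second builds the arguments for a CloudWatch alarm on a single service's estimated charges. This is a sketch rather than a drop-in tool: the function names, the alarm naming scheme, the threshold, and the SNS topic ARN are hypothetical placeholders.

```python
REQUIRED_TAGS = ("environment", "service", "owner")


def missing_tags(tags):
    """Return the required tag keys that are absent or blank."""
    return [k for k in REQUIRED_TAGS if not str(tags.get(k, "")).strip()]


def service_billing_alarm(service_name, threshold_usd, sns_topic_arn):
    """Build keyword arguments for CloudWatch put_metric_alarm that alarm
    on one service's estimated charges rather than the account total."""
    return {
        "AlarmName": f"billing-{service_name}-over-{int(threshold_usd)}usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [
            {"Name": "ServiceName", "Value": service_name},
            {"Name": "Currency", "Value": "USD"},
        ],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours; billing metrics update infrequently
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Placeholder ARN for illustration only.
params = service_billing_alarm(
    "AmazonS3", 500, "arn:aws:sns:us-east-1:123456789012:billing-alerts"
)
print(missing_tags({"environment": "prod", "owner": ""}))
print(params["AlarmName"])
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)`. Billing metrics are published only in us-east-1, and only after billing alerts are enabled in the account's billing preferences.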

5. On-call Rotation Needs to Scale Before You Do

An on-call rotation of one person is not a rotation. Invest in proper incident management tooling (PagerDuty or equivalent) and document your runbooks before you need them.
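To make the "one person is not a rotation" point concrete, here is a minimal sketch of a weekly schedule builder that refuses to run with fewer than two engineers and always pairs a primary with a secondary. The function name, the engineer names, and the weekly cadence are assumptions for illustration; a real schedule would live in your incident management tool.

```python
from datetime import date, timedelta


def build_rotation(engineers, start, weeks):
    """Return a list of (week_start, primary, secondary) tuples.

    Raises ValueError for fewer than two people, since a single person
    cannot hand off to anyone.
    """
    if len(engineers) < 2:
        raise ValueError("an on-call rotation needs at least two people")
    schedule = []
    for i in range(weeks):
        primary = engineers[i % len(engineers)]
        # Next person in the list backs up the primary and takes over
        # as primary the following week.
        secondary = engineers[(i + 1) % len(engineers)]
        schedule.append((start + timedelta(weeks=i), primary, secondary))
    return schedule


# Hypothetical team and start date.
rotation = build_rotation(["alice", "bob", "carol"], date(2026, 3, 16), 4)
for week_start, primary, secondary in rotation:
    print(week_start.isoformat(), primary, secondary)
```

Having this week's secondary become next week's primary means whoever picks up the pager has already seen the previous week's incidents.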

What We'd Do Differently

Start with a platform team mentality from day one, even if the "team" is one person whose job is 20% infrastructure. The cost of retrofitting operational maturity into a fast-growing product is always higher than the cost of building it in from the start.

