Microservices vs Monolith: When to Split Your Architecture
I’ve rebuilt CitizenApp’s backend twice. Once I split it into microservices because “that’s what production systems do.” I was wrong. We spent 6 months debugging distributed tracing logs instead of shipping features. Then we merged back into a monolith, and velocity tripled.
This post isn’t romantic. It’s a decision framework for when distributed systems actually pay off—and when they’re just complexity theater.
The Monolith Default
Start monolithic. Every successful SaaS I know—Stripe, GitHub, Figma—began as a single codebase. This isn’t laziness; it’s pragmatism.
A monolith with Astro + FastAPI gives you:
- Single deployment pipeline. One GitHub Actions workflow deploys everything.
- Shared database transactions. ACID guarantees without eventual consistency nightmares.
- Simpler debugging. Stack traces tell the whole story.
- Lower operational overhead. One container, one set of logs, one database.
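To make the shared-transaction point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table and the `transfer_credits` helper are hypothetical, not CitizenApp code; they only illustrate how a single process with a single database can roll back a multi-step change as one unit.

```python
import sqlite3


def transfer_credits(conn, from_user, to_user, amount):
    """Move credits between two rows atomically (hypothetical example)."""
    # Used as a context manager, the connection commits on success and
    # rolls back if an exception escapes the block.
    with conn:
        conn.execute(
            "UPDATE accounts SET credits = credits - ? WHERE user_id = ?",
            (amount, from_user),
        )
        row = conn.execute(
            "SELECT credits FROM accounts WHERE user_id = ?", (from_user,)
        ).fetchone()
        if row is None or row[0] < 0:
            raise ValueError("unknown user or insufficient credits")
        conn.execute(
            "UPDATE accounts SET credits = credits + ? WHERE user_id = ?",
            (amount, to_user),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (user_id TEXT PRIMARY KEY, credits INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

transfer_credits(conn, "alice", "bob", 30)  # succeeds and commits
try:
    transfer_credits(conn, "alice", "bob", 500)  # fails, debit is rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT user_id, credits FROM accounts"))
```

Across two services with two databases, the failed transfer would need a compensating action or a saga instead of a plain rollback.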
For CitizenApp’s first 18 months, we ran everything in a single FastAPI app on Render. 50K users. 9 AI features. One database. We shipped features faster than teams with “proper” microservices architecture.
Cost matters too. A monolith on Render costs ~$50/month to start. A first microservices setup adds orchestration, service mesh tooling, and monitoring agents, which puts you at a minimum of 3-4x the infrastructure cost.
When Splitting Actually Wins
Stop generalizing about “scale.” Microservices solve specific problems:
1. Independent Deployment Cycles
Your ML pipeline team needs to redeploy inference workers 20x daily. Your API team ships weekly. A monolith couples them.
Split when: Different services have different deployment frequencies or risk profiles.
# BAD: Monolith (one team blocks another)
# db, User, and claude_client are assumed to be defined elsewhere in the app
class FastAPIApp:
    def get_user(self, user_id: str):
        return db.query(User).filter(User.id == user_id).first()

    def generate_embedding(self, text: str):
        # If this crashes, entire API goes down
        return claude_client.create_embedding(text)

# GOOD: Separate services
# api/main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/users/{user_id}")
async def get_user(user_id: str):
    return {"id": user_id, "name": "John"}

# ml/embedding_service.py
from fastapi import FastAPI

app = FastAPI()

@app.post("/embed")
async def embed_text(text: str):
    # Independent scaling, independent deploys
    return await claude_client.create_embedding(text)
2. Resource Isolation
One feature consumes all database connections and starves the rest. A monolith makes this a crisis.
Split when: You need to isolate resource usage for reliability.
I watched CitizenApp’s document processing eat 40 DB connections while user login requests queued. We gave background jobs their own connection pool:
from sqlalchemy import create_engine

# DATABASE_URL is assumed to be loaded from the app's configuration

# Dedicated PostgreSQL pool for background jobs
background_pool = create_engine(
    DATABASE_URL,
    pool_size=5,
    max_overflow=10,
    pool_pre_ping=True,  # Verify connections are alive
)

# Main API keeps its own pool
api_pool = create_engine(
    DATABASE_URL,
    pool_size=20,
    max_overflow=20,
)
This was a config fix, not a service split. Know the difference.
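Another in-process way to get the same isolation is a concurrency cap on the noisy workload. The sketch below is an assumption for illustration, not something CitizenApp is described as doing: `run_with_limit` and `fake_document_job` are hypothetical names, and the sleep stands in for a query that holds a database connection.

```python
import asyncio


async def run_with_limit(jobs, max_concurrent):
    """Run coroutine factories with at most max_concurrent in flight.

    Returns the job results and the peak concurrency that was observed.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_document_job():
    await asyncio.sleep(0.01)  # stands in for work that holds a DB connection
    return "done"


# 40 queued jobs, but never more than 5 running at the same time
results, peak = asyncio.run(run_with_limit([fake_document_job] * 40, max_concurrent=5))
```

With the cap in place, the 40 document jobs still finish, but they can never hold more than 5 connections at once, so login requests keep their share.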
3. Technology Lock-in
Your team knows Python. A new requirement needs Node.js + WebSockets. You hire specialists who want to own their stack.
Split when: The technology genuinely constrains your team, not when ego does.
At CitizenApp, our real-time collaboration engine became a separate Node.js service. Python developers built the core product. Node developers owned the socket layer. Clear boundaries. Clear ownership.
4. Scale-Specific Optimization
You’re serving 100K concurrent WebSocket connections for real-time features. Your API serves REST requests with an average response time of 200ms.
These have opposite optimization strategies. A shared codebase gets tangled.
Split when: Different services optimize for different performance profiles.
The Checklist: Stay Monolithic Unless…
□ Different services deploy 5+ times per day independently?
□ Resource contention causes production incidents monthly?
□ You need different tech stacks that teams prefer to own?
□ One service requires horizontal scaling others don't?
□ Your database is a bottleneck (not your app logic)?
□ You have 50+ engineers who can't maintain a coherent monolith?
If you checked fewer than 3 boxes: stay monolithic.
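The checklist can be encoded as a small function so the decision is explicit in an architecture review. This is only a sketch: the field names are my own shorthand for the six boxes, and the "consider splitting" label for three or more boxes is an assumption, since the rule above only says what to do below the threshold.

```python
from dataclasses import dataclass, fields


@dataclass
class SplitSignals:
    """One boolean per checklist box (field names are illustrative)."""
    independent_deploys_5x_daily: bool = False
    monthly_contention_incidents: bool = False
    distinct_tech_stacks: bool = False
    uneven_horizontal_scaling: bool = False
    database_is_bottleneck: bool = False
    fifty_plus_engineers: bool = False


def recommend(signals: SplitSignals, threshold: int = 3) -> str:
    """Count the checked boxes and apply the fewer-than-3 rule."""
    checked = sum(bool(getattr(signals, f.name)) for f in fields(signals))
    return "consider splitting" if checked >= threshold else "stay monolithic"


two_boxes = SplitSignals(
    independent_deploys_5x_daily=True,
    distinct_tech_stacks=True,
)
three_boxes = SplitSignals(
    independent_deploys_5x_daily=True,
    monthly_contention_incidents=True,
    uneven_horizontal_scaling=True,
)
```

Here `recommend(two_boxes)` returns "stay monolithic" and `recommend(three_boxes)` returns "consider splitting".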
Cost Reality
Monolith (Render, 10K users): $50/month
Basic microservices (Kubernetes, observability): $2K+/month
Production microservices (service mesh, distributed tracing, on-call rotations): $10K+/month
The last line item is often invisible. Microservices require:
- On-call rotations across services
- Distributed tracing (Datadog, New Relic)
- API gateway and load balancing
- Service mesh (Istio/Linkerd) at scale
- Chaos engineering to test failure scenarios
Gotcha: The False Signal
Most teams split too early because of vanity architecture.
I’ve seen startups choose Kubernetes on day 1. “We want to scale.” They’re running 3 Docker containers. Kubernetes doesn’t make you Netflix; shipping features does.
What actually burned me: splitting CitizenApp’s authentication into a separate service “for security.” The results:
- JWT validation in two places, which led to inconsistency bugs
- A network call to the auth service on every request
- Distributed auth flows that took 3x longer to debug
We eventually merged it back.
The real cost of microservices is cognitive load. Every service boundary is a place to debug, secure, test, and monitor. You’re paying that cost immediately.
The Right Approach
- Start monolithic. Ship features. Measure what actually breaks.
- Only split services that are provably constrained. Database? Compute? Deployment frequency?
- Use clear contracts. If you split, define OpenAPI schemas, versioning strategy, and timeout policies upfront.
- Monitor the boundary costs. If inter-service calls become a bottleneck, you split too aggressively.
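One way to make the contract point tangible is to keep the version and timeout for each boundary in a single object that every caller uses. The sketch below uses only the standard library. `ServiceContract`, the `ml.internal` hostname, the `v1` prefix, and the 2-second timeout are all hypothetical values chosen for illustration.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceContract:
    """Client-side view of one service boundary (hypothetical example)."""
    base_url: str
    api_version: str
    timeout_seconds: float

    def build_request(self, path: str, payload: dict) -> urllib.request.Request:
        # The version lives in the URL so old and new callers can coexist
        url = f"{self.base_url}/{self.api_version}/{path.lstrip('/')}"
        body = json.dumps(payload).encode("utf-8")
        return urllib.request.Request(
            url,
            data=body,
            method="POST",
            headers={"Content-Type": "application/json"},
        )

    def call(self, path: str, payload: dict) -> dict:
        # Every call carries an explicit timeout; it is never left to defaults
        request = self.build_request(path, payload)
        with urllib.request.urlopen(request, timeout=self.timeout_seconds) as resp:
            return json.loads(resp.read())


EMBEDDING_CONTRACT = ServiceContract("http://ml.internal:8000", "v1", 2.0)
req = EMBEDDING_CONTRACT.build_request("/embed", {"text": "hello"})
```

Wrapping `call` with a timer is then a natural place to record how much of each request is spent crossing the boundary.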
For CitizenApp, we stayed monolithic for 18 months, split the ML pipeline (legitimate reason), and kept everything else together. The result: we shipped 9 AI features faster than competitors with “proper” distributed systems.
Shipped features are worth more than architectural purity.