Lesson 3: Stateless vs Stateful Architecture
What this lesson covers
- Why serverless functions must be stateless
- Where state lives: external stores vs in-process
- Consistency models: strong vs eventual
- Designing data flows across ephemeral invocations
Read time: 10–12 minutes
What you'll learn
- Ephemeral execution model: A function invocation may run in a brand-new process, so nothing held in memory is guaranteed to survive between calls.
- State externalization: All mutable state moves to external stores (databases, caches, object storage).
- Consistency trade-off: Serverless systems often operate under eventual consistency, because many external stores replicate asynchronously by default.
Simple Explanation
What it is
Stateless means each request is handled as if it is the first time: no memory from previous calls. Stateful systems keep data in memory between requests, like a long-running server process that remembers users and sessions.
Why we need it
Serverless scales by creating many short-lived instances on demand. If each instance kept its own memory, you would get inconsistent behavior and data loss. Stateless design makes every instance safe to run anywhere.
Benefits
- Simple scaling because any instance can handle any request.
- Fewer hidden bugs from stale in-memory state.
- Better resilience because failed instances do not take state with them.
Tradeoffs
- External storage required for sessions, caches, and progress.
- Extra latency when reading or writing state remotely.
- Eventual consistency when state replicates across regions.
Real-world examples (architecture only)
- Login session stored in database instead of memory.
- Shopping cart stored in cache or database, not in a server process.
- File processing stores progress in a table for retries.
Core Concept: Ephemeral Functions and External State
The Traditional Model: Stateful Servers
A traditional web server is a persistent resource. It runs continuously, and state lives in its memory.
This server is stateful. The next request from the same user may encounter the same server, finding their session still in memory.
The Serverless Model: Ephemeral Functions
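A serverless function is ephemeral. Its execution environment is guaranteed to exist only while an invocation runs. After the function returns, the platform may freeze the environment, reuse it for a later invocation, or discard it without notice.
Why? Serverless platforms scale to thousands of concurrent invocations by creating environments on demand and reclaiming idle ones to save resources. A given request may therefore land on a new process or on a reused one.
Implication: You cannot rely on in-process state. All state that must outlive an invocation must be external.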
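A small simulation can make this implication concrete. The sketch below models two execution environments as separate objects, each with its own in-memory counter, next to a dictionary standing in for an external store. `ExecutionEnvironment` and `external_store` are illustrative names, not part of any platform API, and a real store would need an atomic increment.

```python
# Simulation only: each ExecutionEnvironment stands in for one function
# instance that the platform may create or discard at any time.

external_store = {"visits": 0}  # stand-in for a database or cache


class ExecutionEnvironment:
    def __init__(self):
        self.local_visits = 0  # in-process state, private to this instance

    def handle_request(self):
        self.local_visits += 1         # lost when the instance is recycled
        external_store["visits"] += 1  # visible to every instance
        return self.local_visits, external_store["visits"]


env_a = ExecutionEnvironment()
env_b = ExecutionEnvironment()

first = env_a.handle_request()   # served by instance A
second = env_b.handle_request()  # the platform routes to instance B
third = env_a.handle_request()   # back on instance A

print(first, second, third)
```

The local counters disagree with each other (instance B reports 1 on the second request), while the external counter reflects all three requests.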
Where State Lives: The Externalization Principle
In serverless systems, every piece of mutable state moves to an external store:
| State Type | Traditional | Serverless |
|---|---|---|
| User sessions | In-process HashMap | DynamoDB, Firestore, Redis |
| Database connections | Connection pool | Connection reused across warm invocations, or an external connection proxy |
| Cached data | In-process cache | Redis, Memcached, Firestore |
| Temporary files | Local /tmp | S3, Cloud Storage, or recompute (local /tmp may persist between invocations but is unreliable) |
| Request context | Stack variable | Event parameter or external store |
Example: User Authentication
Traditional:
# Server starts once, session cache lives in memory
session_cache = {}

def login(request):
    user = authenticate(request.json)
    session_cache[user.id] = {
        "permissions": ["read", "write"],
        "logged_in_at": now_ms(),
    }
    return {"userId": user.id}

def get_profile(request):
    session = session_cache.get(request.user_id)
    if session:
        return {"permissions": session["permissions"]}
Serverless:
# Each invocation is fresh; no in-memory cache
def login(event, context):
    user = authenticate(event.get("body"))
    # Store session in an external database
    db.put(
        table="sessions",
        item={
            "user_id": user.id,
            "permissions": ["read", "write"],
            "logged_in_at": now_ms(),
            "ttl": now_unix() + 86400,
        },
    )
    return {"userId": user.id}

def get_profile(event, context):
    user_id = event.get("userId")
    result = db.get(table="sessions", key={"user_id": user_id})
    if not result:
        raise ValueError("Session not found")
    return {"permissions": result["permissions"]}
What this does: The serverless version externalizes session state so any invocation can read it safely, which enables horizontal scaling.
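The `db` object in the example above is left abstract. For local experiments, a minimal in-memory stand-in could look like the sketch below. `InMemoryStore`, its `key_field` parameter, and the injectable `clock` are assumptions made for illustration; they do not correspond to any real client library.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in for the `db` object used in the examples."""

    def __init__(self, clock=time.time):
        self._tables = {}
        self._clock = clock  # injectable so tests can control time

    def put(self, table, item, key_field="user_id"):
        self._tables.setdefault(table, {})[item[key_field]] = dict(item)

    def get(self, table, key):
        (key_value,) = key.values()  # expects a single-field key
        item = self._tables.get(table, {}).get(key_value)
        if item is None:
            return None
        # Treat items past their "ttl" (Unix seconds) as expired.
        if "ttl" in item and item["ttl"] <= self._clock():
            return None
        return dict(item)


fake_now = [1_000.0]
db = InMemoryStore(clock=lambda: fake_now[0])

db.put("sessions", {"user_id": "u1", "permissions": ["read"], "ttl": 1_060})
fresh = db.get("sessions", {"user_id": "u1"})    # within the TTL

fake_now[0] = 2_000.0                             # advance the clock
expired = db.get("sessions", {"user_id": "u1"})  # past the TTL

print(fresh, expired)
```

Checking the TTL on read is a deliberate choice here: some managed stores remove expired items with a delay, so readers often verify expiry themselves.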
Consistency Models: Strong vs Eventual
A consistency model defines what clients can expect from data.
Strong Consistency
Definition: After a write, all subsequent reads see the new value.
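How it works: Synchronous coordination. The store acknowledges a write only after the copies that serve reads agree on it (for example, a single primary or a quorum of replicas).
Cost: Higher write latency because of that coordination, and lower availability when replicas cannot reach each other.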
Traditional: Single database, strong consistency.
Eventual Consistency
Definition: After a write, reads eventually see the new value. There's a brief window where different clients see different data.
How it works: Write goes to one place immediately. Replication to other places happens asynchronously.
Advantage: Faster (no synchronous replication wait), more available (works even if some replicas are down).
Cost: Brief inconsistency windows. Complex conflict resolution.
Serverless: Functions often operate under eventual consistency because:
- External state stores may replicate asynchronously by default
- Synchronous writes to multiple places are expensive (slower function invocation)
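The difference between the two models can be sketched with a toy store that has one primary and one asynchronously updated replica. `ReplicatedStore` and its `consistent` flag are hypothetical names chosen for this illustration; real stores expose similar options under their own APIs.

```python
from collections import deque


class ReplicatedStore:
    """Toy model: one primary, one replica fed by an asynchronous log."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value            # acknowledged immediately
        self._pending.append((key, value))   # replicated later

    def read(self, key, consistent=False):
        source = self.primary if consistent else self.replica
        return source.get(key)

    def replicate(self):
        """Apply all pending writes; a real system does this in the background."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value


store = ReplicatedStore()
store.write("email", "new@example.com")

stale = store.read("email")                    # replica has not caught up
strong = store.read("email", consistent=True)  # served by the primary
store.replicate()
converged = store.read("email")                # replica now agrees

print(stale, strong, converged)
```

The window between `write` and `replicate` is the inconsistency window described above: default reads return the old value until replication completes.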
Designing Data Flows Under Eventual Consistency
With eventual consistency, you must design systems expecting temporary inconsistency.
Pattern 1: Read-Your-Own-Writes (RYOW)
Goal: Ensure the client sees their own writes immediately.
How:
- Return the written value in the response
- Or write it to a short-lived location (such as a cache) that the client reads from first
Example: Creating an Order
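from uuid import uuid4

def create_order(event, context):
    order = {
        "id": str(uuid4()),  # string form so the ID serializes cleanly
        "customer_id": event.get("customerId"),
        "items": event.get("items"),
        "status": "pending",
        "created_at": now_ms(),
    }
    db.put(table="orders", item=order)
    # Return the full order so the client does not need an immediate re-read
    return order

def get_order(event, context):
    order_id = event.get("orderId")
    # A default read may lag behind a very recent write
    result = db.get(table="orders", key={"id": order_id})
    return result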
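One way to extend this pattern to later reads is a version token: the writer returns the version it produced, and the reader falls back to the primary copy when the replica is behind that version. The sketch below assumes a toy `OrderStore` with explicit `primary` and `replica` dictionaries and a `min_version` parameter; all of these names are illustrative.

```python
class OrderStore:
    """Toy store with a primary copy and a lagging replica (illustrative only)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def put(self, order):
        self.primary[order["id"]] = dict(order)

    def sync(self):
        self.replica = {k: dict(v) for k, v in self.primary.items()}


def update_status(store, order_id, status):
    current = store.primary[order_id]
    updated = {**current, "status": status, "version": current["version"] + 1}
    store.put(updated)
    # Return the written item so the caller never needs an immediate re-read.
    return updated


def get_order(store, order_id, min_version=0):
    """Read from the replica, falling back to the primary if the replica
    is behind the version the caller has already seen."""
    order = store.replica.get(order_id)
    if order is None or order["version"] < min_version:
        order = store.primary.get(order_id)
    return order


store = OrderStore()
store.put({"id": "o1", "status": "pending", "version": 1})
store.sync()

written = update_status(store, "o1", "paid")             # version 2, primary only
plain_read = get_order(store, "o1")                       # replica is still at version 1
ryow_read = get_order(store, "o1", written["version"])    # sees its own write

print(plain_read["status"], ryow_read["status"])
```

Only the client that made the write pays for the more expensive primary read; other clients keep reading from the replica.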
Pattern 2: Versioning and Timestamps
Track when data was written. Clients can decide to use cached (older) data or wait for fresh data.
Example: Profile Updates
# Store: {"version": 1, "email": "old@example.com", "updated_at": 1000}
def update_profile(event, context):
    user_id = event.get("userId")
    new_email = event.get("newEmail")
    result = db.update(
        table="profiles",
        key={"user_id": user_id},
        updates={
            "email": new_email,
            "version": {"op": "inc", "value": 1},
            "updated_at": now_ms(),
        },
        return_values=True,
    )
    return result

def get_profile(event, context):
    user_id = event.get("userId")
    profile = db.get(table="profiles", key={"user_id": user_id})
    if profile is None:
        raise ValueError("Profile not found")
    # Time since the last recorded update. A replica may still lag behind a
    # newer write, so clients should compare "version" to detect stale reads.
    age_ms = now_ms() - profile.get("updated_at", 0)
    log_info(f"Profile version {profile.get('version')} updated {age_ms}ms ago")
    return profile
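Version numbers also help on the write path. A minimal sketch of optimistic concurrency follows: a write is applied only if the stored version still matches the one the writer read. The `VersionConflict` exception and the in-memory `profiles` dictionary are assumptions for illustration; a real store would perform the check atomically, for example through a conditional write.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the writer read."""


profiles = {"u1": {"email": "old@example.com", "version": 1}}


def update_email(user_id, new_email, expected_version):
    current = profiles[user_id]
    # Compare-and-set: apply the write only if nobody else updated the
    # record since this writer read it.
    if current["version"] != expected_version:
        raise VersionConflict(
            f"expected v{expected_version}, found v{current['version']}"
        )
    profiles[user_id] = {"email": new_email, "version": expected_version + 1}
    return profiles[user_id]


# Two writers both read version 1.
first = update_email("u1", "a@example.com", expected_version=1)

try:
    update_email("u1", "b@example.com", expected_version=1)
    conflict = False
except VersionConflict:
    conflict = True  # the second writer must re-read and retry

print(first, conflict)
```

The losing writer is told explicitly that its view was stale, instead of silently overwriting the first update.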
Pattern 3: Idempotent Writes with Retry
Design writes so they're safe to retry even if replicas are out of sync.
def transfer_funds(event, context):
    from_account = event.get("fromAccount")
    to_account = event.get("toAccount")
    amount = event.get("amount")
    transfer_id = event.get("transferId")  # client-supplied idempotency key

    # Record the transfer first. The condition makes this put fail if the
    # transfer_id already exists, so a duplicate request stops here instead
    # of debiting the account twice.
    db.put(
        table="transfers",
        item={
            "transfer_id": transfer_id,
            "from_account": from_account,
            "to_account": to_account,
            "amount": amount,
            "status": "pending",
            "created_at": now_ms(),
        },
        condition="attribute_not_exists(transfer_id)",
    )

    # Step 1: debit the source account.
    try:
        db.update(
            table="accounts",
            key={"account_id": from_account},
            updates={"balance": {"op": "inc", "value": -amount}},
        )
    except Exception:
        # Nothing was moved; remove the record so a retry can start cleanly.
        db.delete(table="transfers", key={"transfer_id": transfer_id})
        raise

    # Step 2: credit the destination account.
    try:
        db.update(
            table="accounts",
            key={"account_id": to_account},
            updates={"balance": {"op": "inc", "value": amount}},
        )
    except Exception:
        # Compensate: refund the debit, then clear the record so a retry
        # is not rejected by the condition above.
        db.update(
            table="accounts",
            key={"account_id": from_account},
            updates={"balance": {"op": "inc", "value": amount}},
        )
        db.delete(table="transfers", key={"transfer_id": transfer_id})
        raise

    db.update(
        table="transfers",
        key={"transfer_id": transfer_id},
        updates={"status": "completed"},
    )
    return {"status": "success"}
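The core of the idempotency idea can be shown in isolation: record the outcome under the request's ID and return that outcome on any repeat. This is a single-process sketch using in-memory dictionaries (`accounts` and `transfers` are illustrative stand-ins); in a real store the "check then record" step would need to be atomic, for instance via a conditional write or a transaction.

```python
accounts = {"alice": 100, "bob": 20}
transfers = {}  # transfer_id -> recorded outcome


def transfer_funds(transfer_id, from_account, to_account, amount):
    # A repeated request with the same ID returns the recorded outcome
    # instead of moving money a second time.
    if transfer_id in transfers:
        return transfers[transfer_id]

    if accounts[from_account] < amount:
        result = {"status": "rejected", "reason": "insufficient funds"}
    else:
        accounts[from_account] -= amount
        accounts[to_account] += amount
        result = {"status": "completed"}

    transfers[transfer_id] = result
    return result


first = transfer_funds("t-1", "alice", "bob", 30)
retry = transfer_funds("t-1", "alice", "bob", 30)  # e.g. the client timed out and retried

print(first, retry, accounts)
```

Returning the stored outcome on a retry, instead of raising an error, lets a client that never saw the first response learn what happened.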
State Storage Options: Trade-Offs
| Store | Latency | Consistency | Cost | Use Case |
|---|---|---|---|---|
| DynamoDB / Firestore | Varies by workload | Configurable | Higher | Primary application state |
| Redis / Memcached | Low (in-memory) | Key-level atomic | Medium | Caches, sessions, counters |
| S3 / Cloud Storage | Varies by object size | Strong read-after-write | Lower | Large objects, backups |
| Relational DB (RDS, Cloud SQL) | Varies by workload | Strong (within limits) | Higher | Complex queries, ACID needed |
| Elasticsearch | Varies by workload | Eventual | Medium | Full-text search, analytics |
Choosing the Right Store
- Session data: Redis (fast, temporary)
- Order data: DynamoDB (durable, scalable; reads are eventually consistent by default, strongly consistent on request)
- User profile: Firestore (rich queries, real-time sync)
- Logs/metrics: S3 + Athena (cheap, bulk analysis)
- Complex reporting: BigQuery, Redshift (offline batch processing)
Connection Management in Serverless
Database connections are expensive. Each connection ties up memory and resources.
Anti-Pattern: Opening/Closing Connections Per Invocation
# SLOW (connection overhead per invocation)
def handler(event, context):
    connection = mysql_connect(config)
    result = connection.query("SELECT ...")
    connection.close()
    return result
Pattern: Connection Pooling (or Connection Reuse)
# GOOD (reuse connections across invocations)
import os

pool = None  # module-level, so it survives across warm invocations

def get_pool():
    global pool
    if pool is None:
        # Created lazily on the first invocation in this environment
        pool = mysql_pool(host=os.environ.get("DB_HOST"), limit=5)
    return pool

def handler(event, context):
    connection = get_pool().get_connection()
    try:
        result = connection.query("SELECT ...")
        return result
    finally:
        # Always hand the connection back, even if the query fails
        connection.release()
What this does: Creates the pool lazily on the first invocation and reuses it across warm invocations, which removes the connection setup cost from most requests.
Caveat: Lambda may reuse a container between invocations, so connections survive. They can still time out or die while the container is idle, so always have retry logic.
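The retry logic mentioned in the caveat can be sketched with the standard-library `sqlite3` module standing in for a networked database. The helper names (`get_connection`, `reset_connection`, `run_query`) are illustrative. With a real database driver, the exception to catch would be that driver's own connection error type, not `sqlite3.ProgrammingError`.

```python
import sqlite3

_connection = None  # module-level: survives across warm invocations


def get_connection():
    global _connection
    if _connection is None:
        _connection = sqlite3.connect(":memory:")
    return _connection


def reset_connection():
    global _connection
    _connection = None


def run_query(sql, retries=1):
    """Run a query, reconnecting if the cached connection is unusable."""
    for attempt in range(retries + 1):
        try:
            return get_connection().execute(sql).fetchall()
        except sqlite3.ProgrammingError:
            # sqlite3 raises this when the connection has been closed.
            if attempt == retries:
                raise
            reset_connection()


def handler(event, context):
    return run_query("SELECT 1")


warm = handler({}, None)
get_connection().close()           # simulate a connection that died while idle
after_failure = handler({}, None)  # reconnects and retries once

print(warm, after_failure)
```

Bounding the number of retries keeps a genuinely unavailable database from turning one failed invocation into an endless loop.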
Common Mistakes in State Management
- Assuming in-process state persists: It may not. Any invocation can land on a new process, so don't keep state that must survive in variables.
- Writing to local /tmp and expecting persistence: /tmp may persist between invocations, but it is size-limited, unreliable, and not shared across instances. Use external storage.
- Ignoring concurrency: Two invocations may write the same record simultaneously. Use atomic updates or versioning.
- Synchronous waits for consistency: After a write, don't immediately read expecting the new value. Either get it from the response, or use eventual consistency patterns.
- Opening too many connections: Connection pools leak. Close/release connections properly in finally blocks.
When to Use Eventual Consistency vs Strong Consistency
Use Eventual Consistency (common in serverless systems):
- Most application state (orders, profiles, displayed stock levels)
- You can tolerate brief windows of stale data
- You can implement compensating actions if conflicts occur
Use Strong Consistency (requires synchronous coordination):
- Financial transactions
- Inventory reservations (to prevent overselling)
- Atomic operations across multiple entities
Practical: Combine both. Use strong consistency for critical operations (payment), eventual consistency for non-critical (email notification).
What Comes Next
With stateless design and external state understood, you're ready for:
- Lesson 4: Loose coupling (designing independent, scalable services)
- Lesson 5: Compute concepts (cold starts, concurrency limits, memory)
- Lesson 6: Real-world use cases (what actually works in serverless)
Key Takeaway: Serverless functions are ephemeral. All state that must persist lives externally. Design for eventual consistency and idempotent operations. This is not a limitation; it is the foundation for scalable systems.
Project (Cloud-Agnostic)
Design a serverless session system where login writes to an external store and profile reads from it.
Deliverables:
- Describe the vendor-neutral architecture (event source, compute, state, observability).
- Map each component to AWS or GCP services.
- Explain why each service fits the consistency and latency needs.
If you want feedback, email your write-up to maarifaarchitect@gmail.com.
References
- AWS Lambda Developer Guide: https://docs.aws.amazon.com/lambda/latest/dg/welcome.html
- AWS DynamoDB: https://docs.aws.amazon.com/amazondynamodb/
- AWS S3: https://docs.aws.amazon.com/s3/
- Google Cloud Functions: https://docs.cloud.google.com/functions/docs
- Google Cloud Firestore: https://cloud.google.com/firestore/docs
- Google Cloud Storage: https://cloud.google.com/storage/docs