# Event-Driven Serverless Architectures
Event-driven architectures decouple producers and consumers via asynchronous messaging. This lesson covers patterns, multi-cloud messaging, and design trade-offs.
## Simple Explanation

### What it is

Event-driven architecture means one service announces that something happened (an event), and other services react to it independently, without the producer knowing who is listening.

### Why we need it

Because the producer never waits on its consumers, each side scales independently and you avoid long, fragile chains of synchronous calls.
### Benefits
- Independent scaling for each consumer.
- Resilience when one service is slow or down.
- Easy extensibility when you add new consumers later.
### Trade-offs
- Eventual consistency across services.
- More complex debugging across async flows.
### Real-world examples (architecture only)
- Order created -> Billing -> Inventory -> Email.
- File uploaded -> Virus scan -> Thumbnail -> Metadata.
## Core Pattern: Event Bus / Pub-Sub

### AWS Implementation (SNS + SQS)
SNS (Pub/Sub, fan-out):
```python
import boto3
import json

sns = boto3.client('sns')
topic_arn = 'arn:aws:sns:us-east-1:123456789012:order-events'

# Publish an event
sns.publish(
    TopicArn=topic_arn,
    Subject='Order Created',
    Message=json.dumps({
        'order_id': '12345',
        'customer': 'Alice',
        'items': ['item1', 'item2'],
    })
)

# Lambda subscriber (automatically invoked for each published message)
def handle_order_created(event, context):
    message = json.loads(event['Records'][0]['Sns']['Message'])
    order_id = message['order_id']
    print(f"Processing order {order_id}")
```
SQS (Queue, pull-based):
```python
import boto3
import json

sqs = boto3.client('sqs')

# Publish to queue
sqs.send_message(
    QueueUrl='https://sqs.us-east-1.amazonaws.com/123456789012/order-queue',
    MessageBody=json.dumps({'order_id': '12345'})
)

# Lambda polls the queue (via an event source mapping)
def handle_sqs_message(event, context):
    for record in event['Records']:
        message = json.loads(record['body'])
        print(f"Processing: {message}")
```
### Google Cloud Implementation (Pub/Sub)
```python
import base64
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'order-events')

# Publish; the client returns a future that resolves to the message ID
future = publisher.publish(
    topic_path,
    json.dumps({
        'order_id': '12345',
        'customer': 'Alice',
    }).encode('utf-8')
)
future.result()  # block until the publish is acknowledged

# Cloud Function subscriber (HTTP push endpoint; Pub/Sub base64-encodes the payload)
def handle_order_event(request):
    envelope = request.get_json()
    payload = base64.b64decode(envelope['message']['data'])
    message = json.loads(payload)
    order_id = message['order_id']
    print(f"Processing order {order_id}")
    return 'OK', 200
```
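Pub/Sub also supports pull consumers outside of Cloud Functions. A minimal streaming-pull sketch, assuming a placeholder subscription named `order-events-sub`:

```python
import json

from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
# 'order-events-sub' is a placeholder subscription name
subscription_path = subscriber.subscription_path('my-project', 'order-events-sub')

def callback(message):
    payload = json.loads(message.data.decode('utf-8'))
    print(f"Processing order {payload['order_id']}")
    message.ack()  # acknowledge so Pub/Sub stops redelivering

# Blocks until cancelled or an error occurs
streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull.result()
```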
## Messaging Trade-offs

| Aspect | SNS (Pub/Sub) | SQS (Queue) | Pub/Sub (GCP) |
|---|---|---|---|
| Fan-out | Yes (1-to-many) | No (each message goes to one consumer) | Yes (one subscription per consumer) |
| Delivery semantics | At-least-once | At-least-once | At-least-once |
| Dead-letter queue | Per-subscription redrive to SQS | Native DLQ (redrive policy) | Native dead-letter topic |
| Message retention | None (delivery retries only) | Up to 14 days | 7 days by default |
| Cost model | Per publish + delivery | Per request | Per message volume |
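On AWS, the usual way to get both fan-out and durable, pull-based consumption is SNS-to-SQS fan-out: one topic, one queue per consumer. A sketch, where the topic and queue ARNs are placeholders:

```python
import boto3

sns = boto3.client('sns')

# Each consumer gets its own queue subscribed to the shared topic, so every
# consumer sees every event but drains its queue at its own pace.
for queue_arn in [
    'arn:aws:sqs:us-east-1:123456789012:billing-queue',
    'arn:aws:sqs:us-east-1:123456789012:inventory-queue',
]:
    sns.subscribe(
        TopicArn='arn:aws:sns:us-east-1:123456789012:order-events',
        Protocol='sqs',
        Endpoint=queue_arn,
        Attributes={'RawMessageDelivery': 'true'},  # deliver the body without the SNS envelope
    )
```

Each queue also needs a queue policy that allows the topic to send messages to it; that step is omitted here.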
## Best Practices

### 1. Dead-Letter Queues (DLQs)

Record failed messages for audit and replay, then re-raise so the messaging service can retry and eventually route the message to a DLQ:
```python
# Failure handling in an SQS-triggered Lambda with a DLQ configured
import time

import boto3

table = boto3.resource('dynamodb').Table('failed-orders')

def handle_order_error(event, context):
    for record in event['Records']:
        try:
            process_order(record)  # your business logic
        except Exception as e:
            # Record the failure for manual inspection and replay
            table.put_item(Item={
                'timestamp': int(time.time()),
                'message': record['body'],
                'error': str(e),
            })
            raise  # SQS retries; after maxReceiveCount the message moves to the DLQ
```
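The DLQ itself is attached to the source queue with a redrive policy. A minimal sketch, with placeholder queue names and an illustrative retry limit of 5:

```python
import json

import boto3

sqs = boto3.client('sqs')

# Create the dead-letter queue first ('order-queue-dlq' is a placeholder name)
dlq_url = sqs.create_queue(QueueName='order-queue-dlq')['QueueUrl']
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=['QueueArn']
)['Attributes']['QueueArn']

# Messages that fail 5 receives are moved to the DLQ automatically
sqs.create_queue(
    QueueName='order-queue',
    Attributes={
        'RedrivePolicy': json.dumps({
            'deadLetterTargetArn': dlq_arn,
            'maxReceiveCount': '5',
        })
    }
)
```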
### 2. Idempotency

At-least-once delivery means duplicates will happen, so ensure events can be processed multiple times safely:
```python
import time

import boto3

# Placeholder table name; any store with a uniqueness guarantee works
table = boto3.resource('dynamodb').Table('processed-events')

def handle_event(event, context):
    event_id = event['event_id']

    # Check whether this event was already processed
    response = table.get_item(Key={'event_id': event_id})
    if 'Item' in response:
        return 'Already processed'

    # Process and record the result
    result = do_work(event)
    table.put_item(Item={
        'event_id': event_id,
        'result': result,
        'timestamp': int(time.time()),
    })
    return 'OK'
```
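The get-then-put check above has a race window between the read and the write if two invocations handle the same event concurrently. A DynamoDB conditional write closes it; a sketch, using the same placeholder table:

```python
import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('processed-events')

def claim_event(event_id):
    """Return True if this call claimed the event, False if it is a duplicate."""
    try:
        table.put_item(
            Item={'event_id': event_id},
            # The write succeeds atomically only if no item with this key exists
            ConditionExpression='attribute_not_exists(event_id)',
        )
        return True
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return False  # another invocation already claimed this event
        raise
```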
### 3. Batch Processing

Reduce per-message overhead:
```python
import json

def handle_batch(event, context):
    batch = [json.loads(r['body']) for r in event['Records']]
    # Process all records at once (e.g. one bulk insert instead of N writes)
    bulk_insert(batch)
    return {'batchItemFailures': []}
```
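Returning an empty `batchItemFailures` list tells Lambda the whole batch succeeded. With `ReportBatchItemFailures` enabled on the event source mapping, you can instead report only the records that failed, so successful ones are not redelivered. A sketch, where `process_message` is a hypothetical handler:

```python
import json

def handle_batch_partial(event, context):
    failures = []
    for record in event['Records']:
        try:
            process_message(json.loads(record['body']))  # hypothetical handler
        except Exception:
            # Only this record is retried; the successful ones are deleted
            failures.append({'itemIdentifier': record['messageId']})
    return {'batchItemFailures': failures}
```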
## When to Use Event-Driven

- ✓ Asynchronous workflows (can tolerate delays)
- ✓ Fan-out patterns (one event → multiple consumers)
- ✓ Decoupled microservices
- ✓ High-throughput batch processing
- ✓ Cost-optimized workloads (pay per throughput, not per request)

- ✗ Synchronous, low-latency requirements
- ✗ Simple request-response patterns
- ✗ Single-step workflows
For hands-on implementation details, see Level 2 — Lesson 3: Event Sources & Triggers.
## Next Steps
- Lesson 4: Push vs Pull event models.
- Lesson 5: Error handling and retries in async systems.
- Lesson 6: Orchestration vs choreography for complex workflows.