# Event-Driven Serverless Architectures
Event-driven architectures decouple producers and consumers via asynchronous messaging. This lesson covers patterns, multi-cloud messaging, and design trade-offs.
## Simple Explanation

### What it is

Event-driven architecture means one service announces that something happened (an event), and other services react to it independently, without the producer knowing who is listening.

### Why we need it

Because the producer never waits on its consumers, each side scales independently and you avoid long, fragile chains of synchronous calls.
### Benefits
- Independent scaling for each consumer.
- Resilience when one service is slow or down.
- Easy extensibility when you add new consumers later.
### Trade-offs
- Eventual consistency across services.
- More complex debugging across async flows.
### Real-world examples (architecture only)
- Order created -> Billing -> Inventory -> Email.
- File uploaded -> Virus scan -> Thumbnail -> Metadata.
## Core Pattern: Event Bus / Pub-Sub

### AWS Implementation (SNS + SQS)
SNS (Pub/Sub, fan-out):
```python
import boto3
import json

sns = boto3.client('sns')
topic_arn = 'arn:aws:sns:us-east-1:123456789012:order-events'

# Publish an event
sns.publish(
    TopicArn=topic_arn,
    Subject='Order Created',
    Message=json.dumps({
        'order_id': '12345',
        'customer': 'Alice',
        'items': ['item1', 'item2'],
    })
)

# Lambda subscriber (automatically invoked for each published message)
def handle_order_created(event, context):
    message = json.loads(event['Records'][0]['Sns']['Message'])
    order_id = message['order_id']
    print(f"Processing order {order_id}")
```
SQS (Queue, pull-based):
```python
import boto3
import json

sqs = boto3.client('sqs')

# Publish to queue
sqs.send_message(
    QueueUrl='https://sqs.us-east-1.amazonaws.com/123456789012/order-queue',
    MessageBody=json.dumps({'order_id': '12345'})
)

# Lambda polls the queue (via an event source mapping)
def handle_sqs_message(event, context):
    for record in event['Records']:
        message = json.loads(record['body'])
        print(f"Processing: {message}")
```
### Google Cloud Implementation (Pub/Sub)
```python
import base64
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'order-events')

# Publish; the client returns a future that resolves to the message ID
future = publisher.publish(
    topic_path,
    json.dumps({
        'order_id': '12345',
        'customer': 'Alice',
    }).encode('utf-8')
)
future.result()  # block until the publish is acknowledged

# Cloud Function subscriber (HTTP push endpoint; Pub/Sub base64-encodes the payload)
def handle_order_event(request):
    envelope = request.get_json()
    payload = base64.b64decode(envelope['message']['data'])
    message = json.loads(payload)
    order_id = message['order_id']
    print(f"Processing order {order_id}")
    return 'OK', 200
```
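Pub/Sub also supports pull consumers outside of Cloud Functions. A minimal streaming-pull sketch, assuming a placeholder subscription named `order-events-sub`:

```python
import json

from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
# 'order-events-sub' is a placeholder subscription name
subscription_path = subscriber.subscription_path('my-project', 'order-events-sub')

def callback(message):
    payload = json.loads(message.data.decode('utf-8'))
    print(f"Processing order {payload['order_id']}")
    message.ack()  # acknowledge so Pub/Sub stops redelivering

# Blocks until cancelled or an error occurs
streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull.result()
```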
## Messaging Trade-offs

| Aspect | SNS (Pub/Sub) | SQS (Queue) | Pub/Sub (GCP) |
|---|---|---|---|
| Fan-out | Yes (1-to-many) | No (each message goes to one consumer) | Yes (one subscription per consumer) |
| Delivery semantics | At-least-once | At-least-once | At-least-once |
| Dead-letter queue | Per-subscription redrive to SQS | Native DLQ (redrive policy) | Native dead-letter topic |
| Message retention | None (delivery retries only) | Up to 14 days | 7 days by default |
| Cost model | Per publish + delivery | Per request | Per message volume |
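On AWS, the usual way to get both fan-out and durable, pull-based consumption is SNS-to-SQS fan-out: one topic, one queue per consumer. A sketch, where the topic and queue ARNs are placeholders:

```python
import boto3

sns = boto3.client('sns')

# Each consumer gets its own queue subscribed to the shared topic, so every
# consumer sees every event but drains its queue at its own pace.
for queue_arn in [
    'arn:aws:sqs:us-east-1:123456789012:billing-queue',
    'arn:aws:sqs:us-east-1:123456789012:inventory-queue',
]:
    sns.subscribe(
        TopicArn='arn:aws:sns:us-east-1:123456789012:order-events',
        Protocol='sqs',
        Endpoint=queue_arn,
        Attributes={'RawMessageDelivery': 'true'},  # deliver the body without the SNS envelope
    )
```

Each queue also needs a queue policy that allows the topic to send messages to it; that step is omitted here.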
## Best Practices

### 1. Dead-Letter Queues (DLQs)

Record failed messages for audit and replay, then re-raise so the messaging service can retry and eventually route the message to a DLQ:
```python
# Failure handling in an SQS-triggered Lambda with a DLQ configured
import time

import boto3

table = boto3.resource('dynamodb').Table('failed-orders')

def handle_order_error(event, context):
    for record in event['Records']:
        try:
            process_order(record)  # your business logic
        except Exception as e:
            # Record the failure for manual inspection and replay
            table.put_item(Item={
                'timestamp': int(time.time()),
                'message': record['body'],
                'error': str(e),
            })
            raise  # SQS retries; after maxReceiveCount the message moves to the DLQ
```
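The DLQ itself is attached to the source queue with a redrive policy. A minimal sketch, with placeholder queue names and an illustrative retry limit of 5:

```python
import json

import boto3

sqs = boto3.client('sqs')

# Create the dead-letter queue first ('order-queue-dlq' is a placeholder name)
dlq_url = sqs.create_queue(QueueName='order-queue-dlq')['QueueUrl']
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=['QueueArn']
)['Attributes']['QueueArn']

# Messages that fail 5 receives are moved to the DLQ automatically
sqs.create_queue(
    QueueName='order-queue',
    Attributes={
        'RedrivePolicy': json.dumps({
            'deadLetterTargetArn': dlq_arn,
            'maxReceiveCount': '5',
        })
    }
)
```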
### 2. Idempotency

At-least-once delivery means duplicates will happen, so ensure events can be processed multiple times safely:
```python
import time

import boto3

# Placeholder table name; any store with a uniqueness guarantee works
table = boto3.resource('dynamodb').Table('processed-events')

def handle_event(event, context):
    event_id = event['event_id']

    # Check whether this event was already processed
    response = table.get_item(Key={'event_id': event_id})
    if 'Item' in response:
        return 'Already processed'

    # Process and record the result
    result = do_work(event)
    table.put_item(Item={
        'event_id': event_id,
        'result': result,
        'timestamp': int(time.time()),
    })
    return 'OK'
```
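The get-then-put check above has a race window between the read and the write if two invocations handle the same event concurrently. A DynamoDB conditional write closes it; a sketch, using the same placeholder table:

```python
import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('processed-events')

def claim_event(event_id):
    """Return True if this call claimed the event, False if it is a duplicate."""
    try:
        table.put_item(
            Item={'event_id': event_id},
            # The write succeeds atomically only if no item with this key exists
            ConditionExpression='attribute_not_exists(event_id)',
        )
        return True
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return False  # another invocation already claimed this event
        raise
```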
### 3. Batch Processing

Reduce per-message overhead:
```python
import json

def handle_batch(event, context):
    batch = [json.loads(r['body']) for r in event['Records']]
    # Process all records at once (e.g. one bulk insert instead of N writes)
    bulk_insert(batch)
    return {'batchItemFailures': []}
```
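Returning an empty `batchItemFailures` list tells Lambda the whole batch succeeded. With `ReportBatchItemFailures` enabled on the event source mapping, you can instead report only the records that failed, so successful ones are not redelivered. A sketch, where `process_message` is a hypothetical handler:

```python
import json

def handle_batch_partial(event, context):
    failures = []
    for record in event['Records']:
        try:
            process_message(json.loads(record['body']))  # hypothetical handler
        except Exception:
            # Only this record is retried; the successful ones are deleted
            failures.append({'itemIdentifier': record['messageId']})
    return {'batchItemFailures': failures}
```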
## When to Use Event-Driven

- ✓ Asynchronous workflows (can tolerate delays)
- ✓ Fan-out patterns (one event → multiple consumers)
- ✓ Decoupled microservices
- ✓ High-throughput batch processing
- ✓ Cost-optimized workloads (pay per throughput, not per request)

- ✗ Synchronous, low-latency requirements
- ✗ Simple request-response patterns
- ✗ Single-step workflows
For hands-on implementation details, see Level 2 — Lesson 3: Event Sources & Triggers.
## Next Steps
- Lesson 4: Push vs Pull event models.
- Lesson 5: Error handling and retries in async systems.
- Lesson 6: Orchestration vs choreography for complex workflows.