Cost Optimization: AWS & GCP
Serverless is cost-efficient by design—you pay only for what you use. But without optimization, costs grow quickly. Both AWS and GCP offer ways to reduce bills: right-sizing memory, optimizing code, caching, and monitoring. Understanding the pricing models helps you choose between platforms and keep costs low.
Simple Explanation
What it is
Cost optimization is the process of keeping serverless bills small without harming performance or reliability.
Why we need it
Serverless costs scale with usage. If code is slow or over-provisioned, you pay for waste every time it runs.
Benefits
- Lower monthly spend with the same user experience.
- Better predictability by tracking cost drivers.
- Informed tradeoffs between speed and price.
Tradeoffs
- Requires measurement and profiling work.
- Optimizations can add complexity if done too early.
Real-world examples (architecture only)
- Reduce cold start time -> lower duration cost per request.
- Cache common responses -> fewer database reads.
Part 1: AWS Lambda Cost Optimization
Lambda Pricing Model
Total Cost = Requests + Compute Time
- Requests: $0.20 per 1 million
- Compute: Based on GB-seconds (memory in GB × duration in seconds)
Price Calculation
Scenario:
- 1 million invocations/month
- 256 MB (0.25 GB) memory
- 100 ms (0.1 sec) average duration
Compute cost = 0.25 GB × 0.1 sec × 1M invocations × $0.0000166667/GB-sec
= 0.25 × 0.1 × 1M × $0.0000166667
= $0.42/month
Request cost = 1M × ($0.20 / 1M)
= $0.20/month
Total = $0.62/month (~$7.44/year)
Right-Sizing Memory
More memory = faster CPU = lower execution time. But more memory = higher per-ms cost.
Test at different levels:
# Test memory levels: 128, 256, 512, 1024 MB
def calculate_cost(memory_mb, duration_ms, invocations_per_month):
    gb = memory_mb / 1024
    seconds = duration_ms / 1000
    gb_sec = gb * seconds * invocations_per_month
    return gb_sec * 0.0000166667  # AWS compute price per GB-second
print("128 MB:", calculate_cost(128, 2000, 1_000_000), "≈ $4.17/month")
print("256 MB:", calculate_cost(256, 800, 1_000_000), "≈ $3.33/month")
print("512 MB:", calculate_cost(512, 450, 1_000_000), "≈ $3.75/month")
print("1024 MB:", calculate_cost(1024, 350, 1_000_000), "≈ $5.83/month")
256 MB is often the sweet spot for Python. Test your function at different memory levels to find the optimal balance.
Test AWS Function Performance
for memory in 128 256 512 1024; do
  echo "Testing $memory MB..."
  aws lambda update-function-configuration \
    --function-name myfunction \
    --memory-size $memory
  aws lambda wait function-updated --function-name myfunction
  # Invoke the deployed function and read the billed duration from the log tail
  aws lambda invoke --function-name myfunction \
    --payload file://event.json --cli-binary-format raw-in-base64-out \
    --log-type Tail --query 'LogResult' --output text /dev/null | base64 --decode | grep "Billed Duration"
done
Optimize Code for Speed
Faster = cheaper.
import boto3
ddb = boto3.client("dynamodb")
# ❌ Expensive: Scan the whole table, then filter in application code
result = ddb.scan(TableName="Items")
filtered = [item for item in result.get("Items", []) if item.get("status", {}).get("S") == "active"]
# ✅ Cheaper: Query a secondary index so only matching items are read
# ("status" is a DynamoDB reserved word, so alias it via ExpressionAttributeNames)
result = ddb.query(
    TableName="Items",
    IndexName="status-index",
    KeyConditionExpression="#s = :s",
    ExpressionAttributeNames={"#s": "status"},
    ExpressionAttributeValues={":s": {"S": "active"}},
)
Reduce Code Bundle Size
Smaller packages → faster initialization → lower cold start → lower cost
# View package size
du -sh .venv/
# List all dependencies
python -m pip list
# Remove unused packages
python -m pip uninstall -y unused_package
# Replace heavy packages with lighter alternatives
# pandas → built-in csv for simple parsing
# dateutil → datetime for basic timestamps
# requests → urllib3 for minimal HTTP use cases
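As a concrete example of the pandas → csv swap suggested above, a minimal sketch (the file name and column are made up for illustration):
import csv
# Hypothetical input file; replace the path and column name with your own
with open("orders.csv", newline="") as f:
    reader = csv.DictReader(f)
    # Sum a numeric column without pulling in pandas and its dependencies
    total = sum(float(row["amount"]) for row in reader)
print(f"Total: {total:.2f}")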
Lazy-Load Dependencies
Don't import heavy libraries at module load time if most invocations never use them; defer the import until it's actually needed:
# ❌ Imports at module load (cold start cost)
# import boto3
# import requests
# ✅ Lazy-load only when needed
boto3 = None
requests = None
def handler(event, context):
global boto3, requests
if boto3 is None:
import boto3 as _boto3
boto3 = _boto3
if requests is None:
import requests as _requests
requests = _requests
s3 = boto3.client("s3")
# ...
Log Retention
CloudWatch log storage costs add up. SAM doesn't expose a retention property on the function itself, so attach an explicit log group with a retention limit:
MyFunctionLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    # MyFunction is the AWS::Serverless::Function defined elsewhere in the template
    LogGroupName: !Sub /aws/lambda/${MyFunction}
    RetentionInDays: 7  # Not forever
Retention settings:
- Development: 1 day
- Staging: 7 days
- Production (non-critical): 14 days
- Production (audit required): 90 days
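If you prefer to apply these retention values with a script rather than in the template, a minimal boto3 sketch (the function name and environment mapping below are illustrative):
import boto3
logs = boto3.client("logs")
# Illustrative mapping based on the environments listed above
RETENTION_DAYS = {"dev": 1, "staging": 7, "prod": 14}
def set_retention(function_name: str, environment: str) -> None:
    # Cap CloudWatch log retention for one Lambda function's log group
    logs.put_retention_policy(
        logGroupName=f"/aws/lambda/{function_name}",
        retentionInDays=RETENTION_DAYS[environment],
    )
set_retention("my-function", "staging")  # hypothetical function name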
DynamoDB Optimization
Choose the right billing mode:
On-Demand (best for variable traffic):
- $1.25 per million write units
- $0.25 per million read units
- Auto-scales instantly
- No provisioning
Provisioned (best for known/stable traffic):
- ~$0.47 per write capacity unit per month ($0.00065 per WCU-hour)
- ~$0.09 per read capacity unit per month ($0.00013 per RCU-hour)
- Manual scaling
- Lower cost for sustained traffic
When to use:
- On-demand: New apps, unpredictable traffic, one-off services
- Provisioned: Stable production apps with predictable traffic
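To compare the two modes for your own workload, a rough break-even sketch using the prices listed above (it assumes perfectly steady traffic, which real workloads rarely have, so treat it as an estimate only):
# Monthly cost of a steady write workload under each DynamoDB billing mode
HOURS_PER_MONTH = 730
def on_demand_cost(writes_per_month: int) -> float:
    return writes_per_month / 1_000_000 * 1.25
def provisioned_cost(writes_per_second: float) -> float:
    # One WCU handles one 1 KB write per second
    return writes_per_second * 0.47
writes_per_second = 10
writes_per_month = int(writes_per_second * 3600 * HOURS_PER_MONTH)
print("On-demand:   $", round(on_demand_cost(writes_per_month), 2))   # ≈ $32.85
print("Provisioned: $", round(provisioned_cost(writes_per_second), 2))  # ≈ $4.70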
Caching
Fewer queries = lower cost.
In-memory cache (lives in the warm execution environment and is lost on cold start):
import time
cache = {}
CACHE_TTL = 60 # 1 minute
def get_cached_item(item_id):
cached = cache.get(item_id)
if cached and (time.time() - cached["time"]) < CACHE_TTL:
return cached["value"]
value = ddb.get_item(TableName=TABLE_NAME, Key={"id": item_id})
cache[item_id] = {"value": value, "time": time.time()}
return value
ElastiCache Redis:
import json
import os
import redis
client = redis.Redis(host=os.environ.get("REDIS_ENDPOINT"), port=6379)
def query_with_cache(item_id):
cached = client.get(f"item-{item_id}")
if cached:
return json.loads(cached)
item = ddb.get_item(**params)
client.setex(f"item-{item_id}", 3600, json.dumps(item))
return item
Remove Unused Functions
Delete functions you're not using:
# List functions by last modified date
aws lambda list-functions --query 'Functions[].{Name:FunctionName,Modified:LastModified}' --output table
# Delete unused
aws lambda delete-function --function-name old-function
Provisioned Concurrency (Use Sparingly)
Keeps a fixed number of execution environments initialized at all times, but you pay for that reserved capacity around the clock, whether or not requests arrive:
CriticalFunction:
  Type: AWS::Serverless::Function
  Properties:
    AutoPublishAlias: live  # provisioned concurrency attaches to a published version/alias
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 10  # Keep 10 environments warm
When to use: Critical user-facing APIs only (not background jobs)
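For a rough sense of what the configuration above costs before a single request is served, a sketch using an approximate us-east-1 provisioned-concurrency rate (an assumption; verify the current rate on the Lambda pricing page):
# Approximate monthly cost of keeping N environments provisioned 24/7
PROVISIONED_RATE_PER_GB_SEC = 0.0000041667  # assumed us-east-1 rate; verify before relying on it
SECONDS_PER_MONTH = 730 * 3600
def provisioned_concurrency_cost(instances: int, memory_gb: float) -> float:
    return instances * memory_gb * SECONDS_PER_MONTH * PROVISIONED_RATE_PER_GB_SEC
# 10 warm environments at 512 MB ≈ $55/month before any invocations run
print(round(provisioned_concurrency_cost(10, 0.5), 2))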
Part 2: Google Cloud Functions Cost Optimization
Cloud Functions Pricing Model
Pricing structure:
- Requests: First 2 million free/month, then $0.40 per million
- Compute: Duration-based, $0.000002400/GB-sec regardless of runtime (the memory tier also determines the vCPU share)
- Memory: $0.0000000325/GB-hour (negligible in practice)
Price Calculation
Scenario:
- 1 million invocations/month
- 256 MB (0.25 GB) memory
- 100 ms (0.1 sec) execution
Compute cost = 0.25 GB × 0.1 sec × 1M × $0.000002400
= 0.25 × 0.1 × 1M × $0.000002400
= $0.06/month (Much cheaper than AWS!)
Request cost = $0 (1M invocations/month is within the 2M free tier)
Memory cost = 0.25 GB × 730 hours × $0.0000000325 ≈ $0.000006 (negligible)
Total ≈ $0.06/month (~$0.72/year)
Right-Sizing Memory
Same principle as AWS: find the sweet spot.
def calculate_gcp_cost(memory_mb, duration_ms, invocations_per_month):
gb = memory_mb / 1024
seconds = duration_ms / 1000
gb_sec = gb * seconds * invocations_per_month
compute_cost = gb_sec * 0.000002400
request_cost = 0
if invocations_per_month > 2_000_000:
request_cost = (invocations_per_month - 2_000_000) * (0.40 / 1_000_000)
return {"computeCost": compute_cost, "requestCost": request_cost, "total": compute_cost + request_cost}
print("128 MB:", calculate_gcp_cost(128, 2000, 1_000_000))
print("256 MB:", calculate_gcp_cost(256, 800, 1_000_000))
print("512 MB:", calculate_gcp_cost(512, 450, 1_000_000))
Optimize Code
Same principles as AWS:
from google.cloud import firestore
db = firestore.Client()
# ❌ Slow: Get all docs then filter
def list_items(request):
snapshot = db.collection("items").get()
active = [doc.to_dict() for doc in snapshot if doc.to_dict().get("status") == "active"]
return (active, 200)
# ✅ Fast: Filter in query
def list_items_fast(request):
snapshot = db.collection("items").where("status", "==", "active").get()
return ([doc.to_dict() for doc in snapshot], 200)
Firestore Optimization
On-Demand vs. Provisioned:
On-Demand (default):
- $0.06 per 100k reads
- $0.18 per 100k writes
- Auto-scales (no provisioning needed)
Provisioned throughput:
- $0.04 per 100 read ops-hour
- $0.12 per 100 write ops-hour
- Better for predictable traffic
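A quick sketch of what the on-demand rates above add up to for a given monthly volume (the read and write counts are illustrative):
# Estimate monthly Firestore cost from the on-demand rates above
def firestore_cost(reads: int, writes: int) -> float:
    return reads / 100_000 * 0.06 + writes / 100_000 * 0.18
# e.g. 10M reads and 1M writes per month
print(round(firestore_cost(10_000_000, 1_000_000), 2))  # ≈ $7.80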
Cloud Functions Memory Performance
Python performance doesn't scale linearly with memory:
128 MB: cold start ~1000ms, execution 500ms
256 MB: cold start ~600ms, execution 200ms ← Sweet spot often here
512 MB: cold start ~400ms, execution 150ms
1 GB: cold start ~300ms, execution 140ms
For most apps, 256-512 MB is optimal.
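Feeding those illustrative execution times into the GCP compute rate used earlier makes the tradeoff concrete (these are the sample numbers from the list above, not measurements of your own function):
# Execution times from the list above, priced at the GCP compute rate used earlier
for memory_mb, duration_ms in [(128, 500), (256, 200), (512, 150), (1024, 140)]:
    gb_sec = (memory_mb / 1024) * (duration_ms / 1000) * 1_000_000
    print(f"{memory_mb} MB: ${gb_sec * 0.0000024:.2f}/month for 1M invocations")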
Min Instances (Prewarming)
Keep instances ready to avoid cold starts:
gcloud functions deploy my-function \
--min-instances=5 \
--runtime python312 \
--trigger-http
Cost: ~$5-10/month per min-instance kept warm. Use only for critical APIs.
Log Retention
Cloud Logging has generous free tier (50 GB/month), but set retention:
# Keep only the last 7 days in the default log bucket
gcloud logging buckets update _Default \
  --location=global \
  --retention-days=7
Caching
Same strategies as AWS:
import time
cache = {}
CACHE_TTL = 600 # 10 min
def get_user(request):
user_id = request.args.get("id")
cached = cache.get(user_id)
if cached and (time.time() - cached["time"]) < CACHE_TTL:
return (cached["value"], 200)
doc = db.collection("users").document(user_id).get()
data = doc.to_dict()
cache[user_id] = {"value": data, "time": time.time()}
return (data, 200)
AWS Lambda vs. Google Cloud Functions: Cost Comparison
| Metric | AWS Lambda | Google Cloud Functions |
|---|---|---|
| Request pricing | $0.20/million | Free first 2M, then $0.40/million |
| Compute (per GB-sec) | $0.0000166667 | $0.000002400 |
| Example cost (1M invocations, 256MB, 100ms) | $0.62/month | ~$0.06/month |
| For 5M invocations (same memory/duration) | ~$3.08/month | ~$1.50/month (GCP still cheaper) |
| Memory pricing | Included in compute | Separate (negligible) |
| Min instances | Via provisioned concurrency ($$$) | Native, much cheaper |
| Free tier | 1M requests + 400k GB-sec | 2M invocations + 400k GB-sec + 50 GB logging |
Key Pricing Differences
- Per-request: AWS charges $0.20/million (1M free on the free tier); GCP gives 2M free, then $0.40/million
- Per-compute: GCP compute is ~7x cheaper than AWS
- For low traffic (< 2M/month): GCP is significantly cheaper
- For high traffic (> 5M/month): GCP remains cheaper for equivalent workloads
- Min instances: GCP's min-instances are cheaper than AWS provisioned concurrency
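To run the same comparison for your own traffic, a small helper that mirrors the calculators above (AWS requests charged from the first invocation, GCP's 2M free requests applied; the AWS free tier and the tiny GCP memory line item are ignored for simplicity):
def compare_monthly_cost(invocations: int, memory_mb: int, duration_ms: int) -> dict:
    gb_sec = (memory_mb / 1024) * (duration_ms / 1000) * invocations
    aws = gb_sec * 0.0000166667 + invocations / 1_000_000 * 0.20
    gcp = gb_sec * 0.0000024 + max(invocations - 2_000_000, 0) / 1_000_000 * 0.40
    return {"aws": round(aws, 2), "gcp": round(gcp, 2)}
print(compare_monthly_cost(1_000_000, 256, 100))  # ≈ {'aws': 0.62, 'gcp': 0.06}
print(compare_monthly_cost(5_000_000, 256, 100))  # ≈ {'aws': 3.08, 'gcp': 1.5}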
Multi-Cloud Cost Strategy
# Cost-conscious deployment strategy
# For development/low-traffic apps: GCP
# - Free tier covers most usage
# - Cheaper for < 5M invocations/month
# For high-traffic production: Compare both
# - Calculate actual costs for your workload
# - AWS has larger ecosystem (more third-party integrations)
# - GCP has better Firestore pricing
# For critical APIs: Use min-instances/provisioned concurrency on whichever platform
# Hybrid: Use different platforms for different workloads
# - User-facing APIs: GCP (cheaper compute)
# - Batch jobs: AWS (more mature SQS and queue-driven integration)
# - Data processing: GCP (BigQuery integration)
Cost Monitoring & Optimization Checklist
General:
- Enabled cost alerts/budgets
- Monthly cost review scheduled
- Identified most expensive functions
Code Optimization:
- Removed unused dependencies
- Optimized database queries
- Implemented caching layer
- Lazy-loaded heavy libraries
Configuration:
- Right-sized memory (tested all levels)
- Set appropriate log retention
- Chose the correct database billing mode (on-demand vs. provisioned)
- Removed unused functions
Infrastructure:
- Min-instances/provisioned concurrency only on critical paths
- VPC access disabled if not needed
- Unused API Gateways/HTTP triggers removed
Hands-On: Multi-Cloud Cost Comparison
AWS
# Calculate cost
aws ce get-cost-and-usage \
--time-period Start=2026-01-01,End=2026-02-01 \
--granularity MONTHLY \
--metrics BlendedCost \
--filter file://lambda-filter.json
# Set budget alert
aws budgets create-budget \
--account-id YOUR_ACCOUNT \
--budget file://budget.json
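If you prefer boto3 over the CLI (and to see the kind of filter that goes into lambda-filter.json), an equivalent sketch using the same illustrative dates:
import boto3
ce = boto3.client("ce")  # Cost Explorer
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2026-01-01", "End": "2026-02-01"},
    Granularity="MONTHLY",
    Metrics=["BlendedCost"],
    # The same filter you would put in lambda-filter.json
    Filter={"Dimensions": {"Key": "SERVICE", "Values": ["AWS Lambda"]}},
)
print(response["ResultsByTime"][0]["Total"]["BlendedCost"]["Amount"])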
GCP
# View cost breakdown
gcloud billing accounts list
gcloud billing budgets create \
--billing-account=ACCOUNT_ID \
--display-name="Monthly Budget" \
--budget-amount=500USD \
  --threshold-rule=percent=0.8  # alert at 80% of the budget
Key Takeaway
Serverless is already cheaper than provisioned infrastructure. Make it even cheaper by right-sizing memory, optimizing queries, caching aggressively, and choosing the platform that fits your workload. Monitor costs monthly to catch runaway bills early.