
Cost Optimization: AWS & GCP

Serverless is cost-efficient by design—you pay only for what you use. But without optimization, costs grow quickly. Both AWS and GCP offer ways to reduce bills: right-sizing memory, optimizing code, caching, and monitoring. Understanding the pricing models helps you choose between platforms and keep costs low.


Simple Explanation

What it is

Cost optimization is the process of keeping serverless bills small without harming performance or reliability.

Why we need it

Serverless costs scale with usage. If code is slow or over-provisioned, you pay for waste every time it runs.

Benefits

  • Lower monthly spend with the same user experience.
  • Better predictability by tracking cost drivers.
  • Informed tradeoffs between speed and price.

Tradeoffs

  • Requires measurement and profiling work.
  • Optimizations can add complexity if done too early.

Real-world examples (architecture only)

  • Reduce cold start time -> lower duration cost per request.
  • Cache common responses -> fewer database reads.

Part 1: AWS Lambda Cost Optimization

Lambda Pricing Model

Total Cost = Requests + Compute Time

  1. Requests: $0.20 per 1 million
  2. Compute: Based on gigabyte-seconds (GB-seconds = memory × duration)

Price Calculation

Scenario:
- 1 million invocations/month
- 256 MB (0.25 GB) memory
- 100 ms (0.1 sec) average duration

Compute cost = 0.25 GB × 0.1 sec × 1M invocations × $0.0000166667/GB-sec
= 25,000 GB-sec × $0.0000166667
= $0.42/month

Request cost = 1M × ($0.20 / 1M)
= $0.20/month

Total = $0.62/month (~$7.44/year)

Right-Sizing Memory

More memory = faster CPU = lower execution time. But more memory = higher per-ms cost.

Test at different levels:

# Test memory levels: 128, 256, 512, 1024 MB
def calculate_cost(memory_mb, duration_ms, invocations_per_month):
    gb = memory_mb / 1024
    seconds = duration_ms / 1000
    gb_sec = gb * seconds * invocations_per_month
    return gb_sec * 0.0000166667  # AWS compute price per GB-second


print("128 MB:", calculate_cost(128, 2000, 1_000_000))   # ~$4.17
print("256 MB:", calculate_cost(256, 800, 1_000_000))    # ~$3.33
print("512 MB:", calculate_cost(512, 450, 1_000_000))    # ~$3.75
print("1024 MB:", calculate_cost(1024, 350, 1_000_000))  # ~$5.83

256 MB is often the sweet spot for Python. Test your function at different memory levels to find the optimal balance.

Test AWS Function Performance

for memory in 128 256 512 1024; do
  echo "Testing $memory MB..."

  aws lambda update-function-configuration \
    --function-name myfunction \
    --memory-size $memory

  # Wait for the configuration change to finish applying
  aws lambda wait function-updated --function-name myfunction

  # Invoke the deployed function (wall-clock time includes network;
  # the REPORT line in CloudWatch Logs shows the billed duration)
  time aws lambda invoke \
    --function-name myfunction \
    --payload fileb://event.json \
    /dev/null
done

Optimize Code for Speed

Faster = cheaper.

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Items")

# ❌ Expensive: Scan the entire table, then filter in application code
result = table.scan()
filtered = [item for item in result.get("Items", []) if item.get("status") == "active"]

# ✅ Cheaper: Let DynamoDB filter via a query against an index
result = table.query(
    IndexName="status-index",
    KeyConditionExpression=Key("status").eq("active"),
)
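
To verify the savings on a live table, DynamoDB can report the read capacity each call consumed; a quick sketch reusing the hypothetical table and index above:

# The scan bills capacity for every item touched; the query only for matches
scan_result = table.scan(ReturnConsumedCapacity="TOTAL")
query_result = table.query(
    IndexName="status-index",
    KeyConditionExpression=Key("status").eq("active"),
    ReturnConsumedCapacity="TOTAL",
)

print("Scan RCUs: ", scan_result["ConsumedCapacity"]["CapacityUnits"])
print("Query RCUs:", query_result["ConsumedCapacity"]["CapacityUnits"])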

Reduce Code Bundle Size

Smaller packages → faster initialization → lower cold start → lower cost

# View package size
du -sh .venv/

# List all dependencies
python -m pip list

# Remove unused packages
python -m pip uninstall -y unused_package

# Replace heavy packages with lighter alternatives
# pandas → built-in csv for simple parsing
# dateutil → datetime for basic timestamps
# requests: use urllib3 for minimal HTTP use cases
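
For instance, if a function only parses a small CSV, the standard library csv module is enough; a minimal sketch (the file name is hypothetical):

import csv

# Parse a small CSV without pulling in pandas (saves tens of MB of
# package size and the cold-start cost of importing it)
with open("report.csv", newline="") as f:
    rows = list(csv.DictReader(f))

active = [row for row in rows if row.get("status") == "active"]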

Lazy-Load Dependencies

Don't import everything at module load time:

# ❌ Imports at module load (paid on every cold start)
# import boto3
# import requests

# ✅ Lazy-load only when needed
boto3 = None
requests = None


def handler(event, context):
    global boto3, requests
    if boto3 is None:
        import boto3 as _boto3
        boto3 = _boto3
    if requests is None:
        import requests as _requests
        requests = _requests

    s3 = boto3.client("s3")
    # ...

Log Retention

CloudWatch storage costs add up. Set appropriate retention:

# SAM functions don't take a retention property directly;
# attach an explicit log group to the function instead:
MyFunctionLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: !Sub /aws/lambda/${MyFunction}
    RetentionInDays: 7  # Not forever

Retention settings:

  • Development: 1 day
  • Staging: 7 days
  • Production (non-critical): 14 days
  • Production (audit required): 90 days
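
Retention can also be applied to existing log groups without redeploying; a sketch with boto3 (the log group name and stage mapping are illustrative):

import boto3

logs = boto3.client("logs")

# Hypothetical mapping of the retention guidance above
RETENTION_DAYS = {"dev": 1, "staging": 7, "prod": 14}

logs.put_retention_policy(
    logGroupName="/aws/lambda/myfunction",
    retentionInDays=RETENTION_DAYS["staging"],
)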

DynamoDB Optimization

Choose the right billing mode:

On-Demand (best for variable traffic):

  • $1.25 per million write units
  • $0.25 per million read units
  • Auto-scales instantly
  • No provisioning

Provisioned (best for known/stable traffic):

  • ~$0.47 per write capacity unit per month
  • ~$0.09 per read capacity unit per month
  • Manual scaling
  • Lower cost for sustained traffic

When to use:

  • On-demand: New apps, unpredictable traffic, one-off services
  • Provisioned: Stable production apps with predictable traffic
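
A rough break-even sketch using the prices above (the traffic numbers are hypothetical, and real sizing also depends on item sizes and peak-to-average ratios):

# Compare billing modes for a hypothetical, perfectly flat workload
reads_per_month = 50_000_000
writes_per_month = 5_000_000

# On-demand: pay per request
on_demand = reads_per_month / 1e6 * 0.25 + writes_per_month / 1e6 * 1.25

# Provisioned: pay per capacity unit per month (~$0.09/RCU, ~$0.47/WCU),
# sized here as 1 unit ≈ 1 request/sec sustained (a simplification)
seconds_per_month = 730 * 3600
provisioned = (reads_per_month / seconds_per_month) * 0.09 \
    + (writes_per_month / seconds_per_month) * 0.47

print(f"On-demand:   ${on_demand:.2f}/month")    # $18.75
print(f"Provisioned: ${provisioned:.2f}/month")  # ~$2.61

For steady traffic like this, provisioned comes out roughly 7x cheaper; for spiky or unknown traffic, paying for idle provisioned capacity erases that advantage.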

Caching

Fewer queries = lower cost.

In-memory cache:

import os
import time

import boto3

ddb = boto3.client("dynamodb")
TABLE_NAME = os.environ["TABLE_NAME"]

cache = {}
CACHE_TTL = 60  # 1 minute


def get_cached_item(item_id):
    cached = cache.get(item_id)
    if cached and (time.time() - cached["time"]) < CACHE_TTL:
        return cached["value"]

    # Low-level client keys are typed: {"S": ...} for strings
    value = ddb.get_item(TableName=TABLE_NAME, Key={"id": {"S": item_id}})
    cache[item_id] = {"value": value, "time": time.time()}
    return value

ElastiCache Redis:

import json
import os

import boto3
import redis

ddb = boto3.client("dynamodb")
TABLE_NAME = os.environ["TABLE_NAME"]
client = redis.Redis(host=os.environ.get("REDIS_ENDPOINT"), port=6379)


def query_with_cache(item_id):
    cached = client.get(f"item-{item_id}")
    if cached:
        return json.loads(cached)

    # Cache miss: read from DynamoDB, then cache for an hour
    item = ddb.get_item(TableName=TABLE_NAME, Key={"id": {"S": item_id}})["Item"]

    client.setex(f"item-{item_id}", 3600, json.dumps(item))
    return item

Remove Unused Functions

Delete functions you're not using:

# List functions by last modified date
aws lambda list-functions \
  --query 'Functions[].{Name:FunctionName,Modified:LastModified}' \
  --output table

# Delete unused
aws lambda delete-function --function-name old-function
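
To find candidates automatically, a sketch that flags functions whose code or config hasn't changed in 90 days (the threshold is arbitrary; LastModified tracks deployments, not invocations, so check invocation metrics before deleting anything):

from datetime import datetime, timedelta, timezone

import boto3

lam = boto3.client("lambda")
cutoff = datetime.now(timezone.utc) - timedelta(days=90)

# Page through every function and flag stale ones
for page in lam.get_paginator("list_functions").paginate():
    for fn in page["Functions"]:
        # LastModified looks like "2026-01-15T10:00:00.000+0000"
        modified = datetime.strptime(fn["LastModified"], "%Y-%m-%dT%H:%M:%S.%f%z")
        if modified < cutoff:
            print("Stale:", fn["FunctionName"], modified.date())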

Provisioned Concurrency (Use Sparingly)

Keeps initialized instances ready at all times, so cold starts disappear, but you pay for that capacity around the clock, even when no requests arrive:

CriticalFunction:
  Type: AWS::Serverless::Function
  Properties:
    AutoPublishAlias: live  # Provisioned concurrency requires a version/alias
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 10  # Keep 10 instances warm

When to use: Critical user-facing APIs only (not background jobs)
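
A back-of-the-envelope sketch of the idle cost, assuming the commonly published rate of roughly $0.0000041667 per GB-second for provisioned concurrency (verify against current pricing for your region):

# Idle cost of keeping 10 instances of a 256 MB function warm all month,
# assuming ~$0.0000041667 per GB-second (check current regional pricing)
instances = 10
gb = 256 / 1024
seconds_per_month = 730 * 3600

idle_cost = instances * gb * seconds_per_month * 0.0000041667
print(f"~${idle_cost:.2f}/month before a single request")  # ≈ $27.38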


Part 2: Google Cloud Functions Cost Optimization

Cloud Functions Pricing Model

Pricing structure:

  • Requests: first 2 million free per month, then $0.40 per million
  • Compute: duration-based, $0.000002400 per GB-second (same rate across runtimes)
  • Memory: $0.0000000325 per GB-hour (negligible in practice)

Price Calculation

Scenario:
- 1 million invocations/month
- 256 MB (0.25 GB) memory
- 100 ms (0.1 sec) execution

Compute cost = 0.25 GB × 0.1 sec × 1M × $0.000002400
= 25,000 GB-sec × $0.000002400
= $0.06/month (much cheaper than AWS)

Request cost = $0 (1M invocations is within the 2M free tier)

Memory cost = 0.25 GB × 730 hours × $0.0000000325
= $0.0000059 (negligible)

Total ≈ $0.06/month (~$0.72/year)

Right-Sizing Memory

Same principle as AWS: find the sweet spot.

def calculate_gcp_cost(memory_mb, duration_ms, invocations_per_month):
    gb = memory_mb / 1024
    seconds = duration_ms / 1000
    gb_sec = gb * seconds * invocations_per_month
    compute_cost = gb_sec * 0.000002400

    request_cost = 0
    if invocations_per_month > 2_000_000:
        request_cost = (invocations_per_month - 2_000_000) * (0.40 / 1_000_000)

    return {"computeCost": compute_cost, "requestCost": request_cost, "total": compute_cost + request_cost}


print("128 MB:", calculate_gcp_cost(128, 2000, 1_000_000))
print("256 MB:", calculate_gcp_cost(256, 800, 1_000_000))
print("512 MB:", calculate_gcp_cost(512, 450, 1_000_000))

Optimize Code

Same principles as AWS:

import json

from google.cloud import firestore

db = firestore.Client()


# ❌ Slow: Fetch every document, then filter in application code
def list_items(request):
    snapshot = db.collection("items").get()
    active = [doc.to_dict() for doc in snapshot if doc.to_dict().get("status") == "active"]
    return (json.dumps(active), 200)


# ✅ Fast: Let Firestore filter in the query
def list_items_fast(request):
    snapshot = db.collection("items").where("status", "==", "active").get()
    return (json.dumps([doc.to_dict() for doc in snapshot]), 200)

Firestore Optimization

On-Demand vs. Provisioned:

On-Demand (default):

  • $0.06 per 100k reads
  • $0.18 per 100k writes
  • Auto-scales (no provisioning needed)

Provisioned throughput:

  • $0.04 per 100 read ops-hour
  • $0.12 per 100 write ops-hour
  • Better for predictable traffic
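
A quick sketch of what the on-demand read rate means for a cached vs. uncached workload (the traffic volume and hit rate are made-up inputs):

# Monthly Firestore read cost at $0.06 per 100k document reads
reads_per_month = 10_000_000
cache_hit_rate = 0.8  # hypothetical: 80% of reads served from cache

cost_uncached = reads_per_month / 100_000 * 0.06    # $6.00
cost_cached = cost_uncached * (1 - cache_hit_rate)  # $1.20

print(f"Without cache: ${cost_uncached:.2f}/month")
print(f"With cache:    ${cost_cached:.2f}/month")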

Cloud Functions Memory Performance

Performance doesn't scale linearly with memory:

128 MB:   cold start ~1000 ms, execution ~500 ms
256 MB:   cold start ~600 ms,  execution ~200 ms  ← sweet spot often here
512 MB:   cold start ~400 ms,  execution ~150 ms
1024 MB:  cold start ~300 ms,  execution ~140 ms

For most apps, 256-512 MB is optimal.
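
Plugging those rough timings into the compute rate above shows the cost side of the tradeoff (requests stay inside the free tier here, so only compute is counted):

# Compute cost per 1M invocations at each tier, at $0.000002400/GB-sec
for memory_mb, exec_ms in [(128, 500), (256, 200), (512, 150), (1024, 140)]:
    gb_sec = (memory_mb / 1024) * (exec_ms / 1000) * 1_000_000
    print(f"{memory_mb:>4} MB: ${gb_sec * 0.000002400:.2f}/month")

# 128 MB: $0.15   256 MB: $0.12   512 MB: $0.18   1024 MB: $0.34

With these numbers, 256 MB is both near the latency knee and the cheapest per invocation.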

Min Instances (Prewarming)

Keep instances ready to avoid cold starts:

gcloud functions deploy my-function \
  --runtime python312 \
  --trigger-http \
  --min-instances=5

Cost: ~$5-10/month per min-instance kept warm. Use only for critical APIs.

Log Retention

Cloud Logging has a generous free tier (50 GB/month), but still set retention:

# Keep only the last 7 days in the default log bucket
gcloud logging buckets update _Default \
  --location=global \
  --retention-days=7

Caching

Same strategies as AWS:

import time

from google.cloud import firestore

db = firestore.Client()

cache = {}
CACHE_TTL = 600  # 10 minutes


def get_user(request):
    user_id = request.args.get("id")
    cached = cache.get(user_id)
    if cached and (time.time() - cached["time"]) < CACHE_TTL:
        return (cached["value"], 200)

    doc = db.collection("users").document(user_id).get()
    data = doc.to_dict()
    cache[user_id] = {"value": data, "time": time.time()}
    return (data, 200)

AWS Lambda vs. Google Cloud Functions: Cost Comparison

Metric                                      AWS Lambda                      Google Cloud Functions
Request pricing                             $0.20/million                   First 2M free, then $0.40/million
Compute (per GB-sec)                        $0.0000166667                   $0.000002400
Example (1M invocations, 256 MB, 100 ms)    $0.62/month                     $0.06/month (requests within free tier)
5M invocations (over free tier)             $3.08/month                     $1.50/month (GCP cheaper)
Memory pricing                              Included in compute             Separate (negligible)
Min instances                               Provisioned concurrency ($$$)   Native, much cheaper
Free tier                                   1M requests + 400k GB-sec       2M invocations + 400k GB-sec + 50 GB logging

Key Pricing Differences

  • Per-request: AWS charges for every invocation; GCP gives 2M free
  • Per-compute: GCP compute is ~7x cheaper than AWS
  • For low traffic (< 2M/month): GCP is significantly cheaper
  • For high traffic (> 5M/month): GCP remains cheaper for equivalent workloads
  • Min instances: GCP's min-instances are cheaper than AWS provisioned concurrency
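
A sketch that compares both bills across traffic levels using the rates above (256 MB, 100 ms per request; the free compute tiers on both platforms are ignored for simplicity):

def monthly_cost(invocations, gb=0.25, seconds=0.1):
    gb_sec = gb * seconds * invocations
    aws = invocations / 1e6 * 0.20 + gb_sec * 0.0000166667
    gcp_requests = max(invocations - 2_000_000, 0) / 1e6 * 0.40
    gcp = gcp_requests + gb_sec * 0.000002400
    return aws, gcp

for n in [1_000_000, 5_000_000, 50_000_000]:
    aws, gcp = monthly_cost(n)
    print(f"{n:>10,} invocations: AWS ${aws:6.2f}  GCP ${gcp:6.2f}")

# 1M:  AWS $0.62   GCP $0.06
# 5M:  AWS $3.08   GCP $1.50
# 50M: AWS $30.83  GCP $22.20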

Multi-Cloud Cost Strategy

# Cost-conscious deployment strategy

# For development/low-traffic apps: GCP
# - Free tier covers most usage
# - Cheaper for < 5M invocations/month

# For high-traffic production: Compare both
# - Calculate actual costs for your workload
# - AWS has larger ecosystem (more third-party integrations)
# - GCP has better Firestore pricing

# For critical APIs: Use min-instances/provisioned concurrency on whichever platform

# Hybrid: Use different platforms for different workloads
# - User-facing APIs: GCP (cheaper compute)
# - Batch jobs: AWS (Lambda@Edge, SQS integration better)
# - Data processing: GCP (BigQuery integration)

Cost Monitoring & Optimization Checklist

General:

  • Enabled cost alerts/budgets
  • Monthly cost review scheduled
  • Identified most expensive functions

Code Optimization:

  • Removed unused dependencies
  • Optimized database queries
  • Implemented caching layer
  • Lazy-loaded heavy libraries

Configuration:

  • Right-sized memory (tested all levels)
  • Set appropriate log retention
  • Chose the correct database billing mode (on-demand vs. provisioned)
  • Removed unused functions

Infrastructure:

  • Min-instances/provisioned concurrency only on critical paths
  • VPC access disabled if not needed
  • Unused API Gateways/HTTP triggers removed

Hands-On: Multi-Cloud Cost Comparison

AWS

# Calculate Lambda spend for the month
aws ce get-cost-and-usage \
  --time-period Start=2026-01-01,End=2026-02-01 \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --filter file://lambda-filter.json

# Set a budget alert
aws budgets create-budget \
  --account-id YOUR_ACCOUNT \
  --budget file://budget.json

GCP

# View billing accounts, then create a budget with an 80% alert threshold
gcloud billing accounts list

gcloud billing budgets create \
  --billing-account=ACCOUNT_ID \
  --display-name="Monthly Budget" \
  --budget-amount=500USD \
  --threshold-rule=percent=0.8

Key Takeaway

Serverless is already cheaper than provisioned infrastructure. Make it even cheaper by right-sizing memory, optimizing queries, caching aggressively, and choosing the platform that fits your workload. Monitor costs monthly to catch runaway bills early.