Choosing between AWS ECS Fargate and EKS is one of the most consequential infrastructure decisions for modern cloud applications. Industry surveys report container adoption growing roughly 67% year over year, with a majority of teams now favoring managed container services to reduce operational overhead. Teams that choose well can reduce infrastructure costs by 30-50%, cut deployment time substantially, and free engineers to focus on features instead of cluster management. The wrong choice, however, can lead to vendor lock-in, unnecessary complexity, or escalating costs.
This comprehensive guide provides the decision framework, cost analysis, and practical implementation patterns you need to choose between ECS Fargate and EKS for your specific workload in 2025.
Understanding AWS Container Orchestration Options
AWS offers two primary container orchestration services, ECS and EKS, each of which can run on two launch types, giving four combinations with distinct tradeoffs:
AWS ECS (Elastic Container Service)
ECS is Amazon's proprietary container orchestration service, deeply integrated with AWS services. It provides a simpler, more AWS-native experience than Kubernetes.
Key characteristics:
- AWS-native API: Purpose-built for AWS, with first-class integration for ALB, CloudWatch, IAM, Secrets Manager
- Task definitions: Define containers, resources, networking, and volumes in JSON templates
- Service management: Automatically maintains desired container count, handles health checks, integrates with service discovery
- No control plane costs: Unlike EKS, you only pay for compute resources, not cluster management
- Fargate launch type: Serverless containers with zero infrastructure management
AWS EKS (Elastic Kubernetes Service)
EKS is Amazon's managed Kubernetes service, running standard Kubernetes with AWS-specific enhancements. It provides portability and access to the vast Kubernetes ecosystem.
Key characteristics:
- Standard Kubernetes: Full compatibility with Kubernetes APIs, tools, and ecosystem
- Managed control plane: AWS handles control plane upgrades, patching, and high availability
- Flexible compute: Support for EC2 instances, Fargate, and hybrid on-premises nodes
- CNCF ecosystem: Access to thousands of Kubernetes-native tools, operators, and integrations
- Multi-cloud portability: Easier migration between cloud providers or on-premises environments
Launch Type: EC2 vs Fargate
Both ECS and EKS support two launch types with different management tradeoffs:
EC2 Launch Type:
- You manage EC2 instances, patching, scaling, and capacity
- Lower per-task cost for high-utilization workloads
- More control over instance types, AMIs, and customization
- Requires capacity planning and cluster management
Fargate Launch Type:
- AWS manages all infrastructure—no servers to provision or scale
- Pay only for the vCPU and memory allocated to each task, for as long as it runs
- Automatic scaling without capacity planning
- Ideal for variable workloads and smaller teams
Cost Analysis: ECS vs EKS
Understanding the true cost requires looking beyond compute to include management overhead and operational complexity.
ECS Pricing Model
ECS itself is free—you only pay for underlying AWS resources:
Fargate Pricing (us-east-1 rates; no infrastructure management):
vCPU: $0.04048 per vCPU per hour
Memory: $0.004445 per GB per hour
Example: 1 vCPU, 2GB RAM task running 24/7
Monthly cost: (0.04048 * 1 + 0.004445 * 2) * 730 hours
= (0.04048 + 0.00889) * 730
= $36.04 per month per task
EC2 Launch Type (you manage instances):
Example: 3x t3.large instances (2 vCPU, 8GB each)
Cost: $0.0832/hour * 3 * 730 hours = $182.21/month
Plus: EBS volumes, data transfer, load balancers
Can run many containers per instance if utilization is high
EKS Pricing Model
EKS charges for both control plane and compute:
Control Plane Cost:
$0.10 per cluster per hour
= $73 per month per cluster (fixed cost regardless of size)
Compute Options:
Option 1: Fargate (same pricing as ECS Fargate):
1 vCPU, 2GB task: $36.04/month per pod
No EC2 management, but higher per-unit cost
Option 2: EC2 Managed Node Groups:
Same EC2 costs as ECS EC2 launch type
Plus $73/month for EKS control plane
You manage capacity, patching, and scaling
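The arithmetic above generalizes into a small cost model. A minimal sketch in Python, using the us-east-1 rates quoted in this guide (rates vary by region and change over time; the function names are illustrative):

```python
# Minimal cost model for the pricing above. Rates are the us-east-1 Fargate
# prices quoted in this guide; they vary by region and change over time.
FARGATE_VCPU_HOUR = 0.04048    # USD per vCPU-hour
FARGATE_GB_HOUR = 0.004445     # USD per GB-hour
EKS_CONTROL_PLANE_HOUR = 0.10  # USD per cluster-hour
HOURS_PER_MONTH = 730

def fargate_task_monthly(vcpu, memory_gb):
    """Monthly cost of one Fargate task running 24/7."""
    return (vcpu * FARGATE_VCPU_HOUR + memory_gb * FARGATE_GB_HOUR) * HOURS_PER_MONTH

def eks_fargate_monthly(task_count, vcpu, memory_gb):
    """EKS on Fargate: same per-task rate plus the fixed control plane fee."""
    return (task_count * fargate_task_monthly(vcpu, memory_gb)
            + EKS_CONTROL_PLANE_HOUR * HOURS_PER_MONTH)

print(f"1 vCPU / 2 GB task: ${fargate_task_monthly(1, 2):.2f}/month")
print(f"10 pods on EKS Fargate: ${eks_fargate_monthly(10, 1, 2):.2f}/month")
```

The same two functions reproduce the per-task and per-cluster figures used in the comparison tables below.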
Real-World Cost Comparison
Scenario 1: Small Application (10 containers, 1 vCPU, 2GB each)
| Option | Monthly Cost | Management Overhead |
|---|---|---|
| ECS Fargate | $360 | Minimal (hours/month) |
| EKS + Fargate | $433 ($360 + $73) | Low (2-5 hours/month) |
| ECS + EC2 | $200-250 | Medium (10-15 hours/month) |
| EKS + EC2 | $270-320 | High (15-25 hours/month) |
Winner: ECS Fargate - lowest total cost of ownership
Scenario 2: Medium Application (50 containers, varying sizes)
| Option | Monthly Cost | Management Overhead |
|---|---|---|
| ECS Fargate | $1,800 | Minimal |
| EKS + Fargate | $1,873 | Low |
| ECS + EC2 | $600-800 | Medium |
| EKS + EC2 | $670-870 | High |
Winner: ECS + EC2 if the team has DevOps expertise; ECS Fargate for smaller teams
Scenario 3: Large Application (200+ containers, high utilization)
| Option | Monthly Cost | Management Overhead |
|---|---|---|
| ECS Fargate | $7,200+ | Minimal |
| EKS + Fargate | $7,273+ | Low-Medium |
| ECS + EC2 | $2,000-3,000 | Medium-High |
| EKS + EC2 | $2,070-3,070 | High |
Winner: EC2 launch type for cost efficiency if you have a dedicated DevOps team
Hidden Costs to Consider:
- EKS: Requires Kubernetes expertise (training, certifications, specialized hiring)
- EC2: Patching, security updates, capacity planning, monitoring costs
- Tools: Observability and ecosystem tooling (Datadog, New Relic, etc.) may carry licensing costs
- Networking: Data transfer, NAT gateway, load balancer costs are similar across options
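These hidden costs can be folded into the comparison by pricing the management overhead from the tables above alongside compute. A rough sketch for Scenario 1, assuming an illustrative $75/hour loaded engineering rate and rounded midpoints of each cost and hour range (all assumptions, not measurements):

```python
# Rough TCO sketch for Scenario 1 (10 small containers): compute cost from the
# table above plus management overhead priced at an assumed engineering rate.
# The $75/hour rate and the ops-hour figures are illustrative assumptions.
ENGINEER_RATE = 75.0  # USD per hour, assumed

def tco(compute_monthly, ops_hours_monthly):
    """Total monthly cost: AWS bill plus the cost of engineer time spent on ops."""
    return compute_monthly + ops_hours_monthly * ENGINEER_RATE

scenario_1 = {
    "ECS Fargate":   tco(360, 2),    # minimal overhead, assume ~2 h/month
    "EKS + Fargate": tco(433, 4),    # low overhead, roughly 2-5 h/month
    "ECS + EC2":     tco(225, 12),   # midpoints of $200-250 and 10-15 h
    "EKS + EC2":     tco(295, 20),   # midpoints of $270-320 and 15-25 h
}
for option, cost in sorted(scenario_1.items(), key=lambda kv: kv[1]):
    print(f"{option}: ${cost:.0f}/month all-in")
```

Under these assumptions the cheaper EC2 compute is more than offset by engineer time at small scale, which is why the Scenario 1 table names ECS Fargate the winner.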
When to Choose ECS Fargate
ECS Fargate excels in these scenarios:
1. AWS-Native Applications
If your architecture heavily uses AWS services, ECS provides tighter integration:
# ECS Task Definition with Native AWS Integration
{
"family": "web-app",
"taskRoleArn": "arn:aws:iam::account:role/ecsTaskRole",
"executionRoleArn": "arn:aws:iam::account:role/ecsExecutionRole",
"networkMode": "awsvpc",
"containerDefinitions": [
{
"name": "app",
"image": "account.dkr.ecr.region.amazonaws.com/app:latest",
"cpu": 512,
"memory": 1024,
"essential": true,
"portMappings": [
{
"containerPort": 8080,
"protocol": "tcp"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/web-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"secrets": [
{
"name": "DATABASE_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:region:account:secret:db-password"
}
],
"environment": [
{
"name": "AWS_REGION",
"value": "us-east-1"
}
]
}
],
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024"
}
Perfect for:
- Applications using RDS, DynamoDB, S3, SQS, SNS extensively
- Teams already familiar with CloudFormation and AWS APIs
- Microservices with straightforward deployment patterns
- Projects without multi-cloud requirements
2. Smaller Teams Without Kubernetes Expertise
ECS has a significantly lower learning curve:
# Deploy a service in ECS (simple AWS CLI)
aws ecs create-service \
--cluster production \
--service-name web-app \
--task-definition web-app:1 \
--desired-count 3 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={
subnets=[subnet-abc123,subnet-def456],
securityGroups=[sg-xyz789],
assignPublicIp=DISABLED
}" \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=app,containerPort=8080"
# No need to understand Kubernetes concepts like:
# - Pods, ReplicaSets, Deployments
# - ConfigMaps, Secrets (distinct from AWS Secrets Manager)
# - Ingress Controllers, Service Mesh
# - RBAC, Pod Security Policies
# - CRDs, Operators, Helm charts
Perfect for:
- Teams with 1-5 developers
- Startups prioritizing speed over flexibility
- Organizations without dedicated DevOps/platform teams
- Projects with straightforward container deployments
3. Variable or Unpredictable Workloads
Fargate's serverless model shines with variable traffic:
# Auto-scaling configuration for ECS Fargate
import boto3
ecs = boto3.client('ecs')
autoscaling = boto3.client('application-autoscaling')
# Register scalable target
autoscaling.register_scalable_target(
ServiceNamespace='ecs',
ResourceId='service/production/web-app',
ScalableDimension='ecs:service:DesiredCount',
MinCapacity=2,
MaxCapacity=50, # Scale from 2 to 50 tasks automatically
RoleARN='arn:aws:iam::account:role/ecsAutoscaleRole'
)
# Target tracking based on CPU
autoscaling.put_scaling_policy(
PolicyName='cpu-scaling',
ServiceNamespace='ecs',
ResourceId='service/production/web-app',
ScalableDimension='ecs:service:DesiredCount',
PolicyType='TargetTrackingScaling',
TargetTrackingScalingPolicyConfiguration={
'TargetValue': 70.0, # Target 70% CPU
'PredefinedMetricSpecification': {
'PredefinedMetricType': 'ECSServiceAverageCPUUtilization'
},
'ScaleInCooldown': 300,
'ScaleOutCooldown': 60
}
)
Benefits:
- No capacity planning or over-provisioning
- Pay only for actual usage during traffic spikes
- Automatic scale-down during low traffic periods
- No wasted capacity on idle instances
Perfect for:
- API services with unpredictable traffic patterns
- Batch processing with variable job volumes
- Development and staging environments
- Event-driven architectures with sporadic workloads
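To see why the serverless model pays off here, compare a spiky workload on Fargate against fixed capacity sized for peak. A sketch with an assumed traffic shape (50 tasks during a 4-hour daily peak, 5 tasks otherwise, 1 vCPU / 2 GB each; all numbers illustrative):

```python
# Why pay-per-use wins for spiky traffic: a service that needs 50 tasks during
# a 4-hour daily peak but only 5 tasks otherwise, each task 1 vCPU / 2 GB.
# The traffic shape and the 30-day month are illustrative assumptions.
TASK_HOUR = 0.04048 * 1 + 0.004445 * 2  # us-east-1 Fargate rate per task-hour

peak_task_hours = 50 * 4 * 30        # 50 tasks for 4 h/day over 30 days
off_peak_task_hours = 5 * 20 * 30    # 5 tasks the remaining 20 h/day
fargate_monthly = (peak_task_hours + off_peak_task_hours) * TASK_HOUR

# Fixed capacity sized for the peak pays for 50 task-equivalents around the clock
provisioned_monthly = 50 * 24 * 30 * TASK_HOUR

print(f"Fargate (scale to demand):   ${fargate_monthly:.2f}")
print(f"Fixed capacity at peak size: ${provisioned_monthly:.2f}")
```

With this traffic shape, scaling to demand costs roughly a quarter of provisioning for peak; the steeper the spike, the larger the gap.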
4. Rapid Deployment Requirements
ECS offers faster time-to-production:
// Infrastructure as Code with AWS CDK for ECS
import * as cdk from 'aws-cdk-lib';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';
export class WebAppStack extends cdk.Stack {
constructor(scope: cdk.App, id: string) {
super(scope, id);
// Create Fargate service with ALB in ~50 lines
const loadBalancedService = new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'WebApp', {
taskImageOptions: {
image: ecs.ContainerImage.fromRegistry('nginx'),
containerPort: 80,
environment: {
ENVIRONMENT: 'production'
},
},
cpu: 512,
memoryLimitMiB: 1024,
desiredCount: 3,
publicLoadBalancer: true
});
// Auto-scaling based on requests
const scaling = loadBalancedService.service.autoScaleTaskCount({
minCapacity: 2,
maxCapacity: 10
});
scaling.scaleOnRequestCount('RequestScaling', {
requestsPerTarget: 1000,
targetGroup: loadBalancedService.targetGroup
});
}
}
Deployment time comparison:
- ECS Fargate: 15-30 minutes from zero to production
- EKS: 2-4 hours (cluster creation + configuration + deployment)
When to Choose EKS
EKS is the right choice in these scenarios:
1. Multi-Cloud or Hybrid Cloud Strategy
Kubernetes provides portability across cloud providers:
# Standard Kubernetes Deployment (works on EKS, GKE, AKS, on-prem)
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
namespace: production
spec:
replicas: 3
selector:
matchLabels:
app: web-app
template:
metadata:
labels:
app: web-app
version: v1.2.0
spec:
containers:
- name: app
image: myapp:1.2.0
ports:
- containerPort: 8080
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 1000m
memory: 1Gi
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: web-app-service
spec:
type: LoadBalancer
selector:
app: web-app
ports:
- port: 80
targetPort: 8080
Benefits:
- Same manifests work on any Kubernetes cluster
- Avoid vendor lock-in to AWS-specific APIs
- Easier migration between cloud providers
- Support for edge/on-premises hybrid deployments
Perfect for:
- Organizations with multi-cloud policies
- Companies planning future cloud migrations
- Hybrid architectures with on-premises components
- Enterprises requiring cloud portability for contract negotiations
2. Complex Microservices with Advanced Networking
Kubernetes ecosystem provides sophisticated networking capabilities:
# Service Mesh with Istio on EKS
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: web-app-routes
spec:
hosts:
- web-app.example.com
http:
- match:
- headers:
x-user-type:
exact: premium
route:
- destination:
host: web-app
subset: v2
weight: 100
- route:
- destination:
host: web-app
subset: v1
weight: 90
- destination:
host: web-app
subset: v2
weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: web-app-circuit-breaker
spec:
host: web-app
trafficPolicy:
connectionPool:
tcp:
maxConnections: 100
http:
http1MaxPendingRequests: 50
http2MaxRequests: 100
outlierDetection:
consecutive5xxErrors: 5
interval: 30s
baseEjectionTime: 30s
maxEjectionPercent: 50
subsets:
- name: v1
labels:
version: v1
- name: v2
labels:
version: v2
Advanced capabilities:
- Service mesh (Istio, Linkerd) for mTLS, traffic shaping, observability
- Advanced ingress controllers (NGINX, Traefik, Ambassador)
- Network policies for fine-grained security
- Cross-cluster service discovery and federation
Perfect for:
- 20+ microservices with complex inter-service communication
- Applications requiring canary deployments, A/B testing, traffic splitting
- Zero-trust security models with mTLS between services
- Organizations needing advanced observability and tracing
3. Existing Kubernetes Investment
If your team already knows Kubernetes, EKS provides familiarity:
# Standard kubectl commands work identically
kubectl get pods -n production
kubectl logs -f deployment/web-app
kubectl exec -it web-app-pod-xyz -- /bin/bash
kubectl port-forward svc/web-app 8080:80
kubectl apply -f manifests/
kubectl rollout status deployment/web-app
kubectl rollout undo deployment/web-app
# Existing Helm charts work without modification
helm install my-app ./charts/web-app \
--namespace production \
--values production-values.yaml
# CI/CD pipelines require minimal changes
# All existing Kubernetes tools, scripts, and automation work
Perfect for:
- Teams with Certified Kubernetes Administrators (CKA)
- Organizations with existing Kubernetes tooling investments
- Companies migrating from self-managed Kubernetes
- Projects using complex Helm charts and Kubernetes operators
4. Batch Processing and ML Workloads
Kubernetes excels at complex job scheduling and GPU workloads:
# Batch processing with Kubernetes Jobs
apiVersion: batch/v1
kind: Job
metadata:
name: data-processing-job
spec:
parallelism: 10 # Run 10 pods in parallel
completions: 100 # Process 100 tasks total
template:
spec:
containers:
- name: processor
image: data-processor:1.0
resources:
requests:
cpu: 2
memory: 4Gi
nvidia.com/gpu: 1 # Request GPU
limits:
cpu: 4
memory: 8Gi
nvidia.com/gpu: 1
env:
- name: TASK_INDEX
valueFrom:
fieldRef:
fieldPath: metadata.name
restartPolicy: OnFailure
nodeSelector:
workload-type: compute-intensive
---
# CronJob for scheduled tasks
apiVersion: batch/v1
kind: CronJob
metadata:
name: nightly-report
spec:
schedule: "0 2 * * *" # Run at 2 AM daily
jobTemplate:
spec:
template:
spec:
containers:
- name: report-generator
image: report-tool:latest
command: ["python", "generate_report.py"]
restartPolicy: OnFailure
Perfect for:
- ML training pipelines (Kubeflow, MLflow)
- Data processing jobs (Apache Spark on Kubernetes)
- Scheduled batch workloads (ETL, reports, data sync)
- GPU workloads requiring specialized instance types
Migration Strategies
Migrating from ECS to EKS
If you need to migrate, follow this phased approach:
# Phase 1: Create EKS cluster
import boto3
eks = boto3.client('eks')
# Create EKS cluster
cluster = eks.create_cluster(
name='production-eks',
version='1.28',
roleArn='arn:aws:iam::account:role/eks-cluster-role',
resourcesVpcConfig={
'subnetIds': ['subnet-abc123', 'subnet-def456', 'subnet-ghi789'],
'securityGroupIds': ['sg-xyz789'],
'endpointPublicAccess': False,
'endpointPrivateAccess': True
},
logging={
'clusterLogging': [
{
'types': ['api', 'audit', 'authenticator', 'controllerManager', 'scheduler'],
'enabled': True
}
]
}
)
print(f"Cluster creating: {cluster['cluster']['name']}")
# Phase 2: Convert ECS task definition to Kubernetes Deployment
# ECS Task Definition (original)
# {
# "family": "web-app",
# "cpu": "512",
# "memory": "1024",
# "containerDefinitions": [{
# "name": "app",
# "image": "myapp:1.0",
# "portMappings": [{"containerPort": 8080}]
# }]
# }
# Kubernetes Deployment (equivalent)
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
spec:
replicas: 3
selector:
matchLabels:
app: web-app
template:
metadata:
labels:
app: web-app
spec:
containers:
- name: app
image: myapp:1.0
ports:
- containerPort: 8080
resources:
requests:
cpu: 500m # 512 CPU units = 0.5 vCPU
memory: 1024Mi # 1024 MB
limits:
cpu: 500m
memory: 1024Mi
Migration phases:
- Parallel run (1-2 months): Run both ECS and EKS with traffic split
- Service-by-service migration: Move non-critical services first
- Data layer sync: Ensure databases, caches work with both
- Gradual traffic shift: Use ALB weighted targets or Route53
- ECS decommission: Once EKS is stable and validated
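The weighted Route 53 shift in the last two phases can be sketched as a helper that builds the change batch for two weighted alias records, one per load balancer. Every identifier below (domain, ALB DNS names, hosted zone IDs) is a placeholder; the resulting dict is what you would pass to boto3's route53.change_resource_record_sets:

```python
# Hypothetical helper for the "gradual traffic shift" phase: build a Route 53
# change batch with two weighted ALIAS records, one pointing at the old (ECS)
# ALB and one at the new (EKS) ALB. All names and zone IDs are placeholders.
def weighted_alias_changes(record_name, targets):
    """targets: {set_identifier: (alb_dns_name, alb_hosted_zone_id, weight)}"""
    changes = []
    for set_id, (dns_name, alb_zone_id, weight) in targets.items():
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "SetIdentifier": set_id,
                "Weight": weight,  # relative share of traffic
                "AliasTarget": {
                    "DNSName": dns_name,
                    "HostedZoneId": alb_zone_id,  # the ALB's zone, not yours
                    "EvaluateTargetHealth": True,
                },
            },
        })
    return {"Comment": "gradual ECS -> EKS traffic shift", "Changes": changes}

# 90% of traffic stays on ECS while EKS takes 10%; adjust weights per phase.
batch = weighted_alias_changes("app.example.com", {
    "ecs": ("ecs-alb-1234.us-east-1.elb.amazonaws.com", "ZALBZONEPLACEHOLDER", 90),
    "eks": ("eks-alb-5678.us-east-1.elb.amazonaws.com", "ZALBZONEPLACEHOLDER", 10),
})
# route53 = boto3.client("route53")
# route53.change_resource_record_sets(HostedZoneId="ZYOURZONE", ChangeBatch=batch)
```

Re-running the helper with shifted weights (70/30, 50/50, 0/100) implements the gradual cutover; health-checked aliases let Route 53 fail back automatically if the new target goes unhealthy.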
Starting Fresh: Decision Framework
Use this decision tree for new projects:
START
|
v
Do you need multi-cloud portability?
|
+-- YES --> Choose EKS
|
+-- NO
|
v
Do you have Kubernetes expertise on your team?
|
+-- YES --> Choose EKS
|
+-- NO
|
v
Do you have 20+ microservices with complex networking?
|
+-- YES --> Invest in Kubernetes, choose EKS
|
+-- NO
|
v
Do you need batch/ML workloads with GPU?
|
+-- YES --> Choose EKS
|
+-- NO
|
v
Is your team < 10 developers?
|
+-- YES --> Choose ECS Fargate
|
+-- NO
|
v
Is cost optimization critical (high utilization)?
|
+-- YES --> Choose ECS with EC2 launch type
|
+-- NO --> Choose ECS Fargate
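The tree above can be encoded directly as a function, which is handy for documenting the decision in a runbook. A sketch (question order and outcomes follow the tree; the parameter names are illustrative):

```python
def choose_platform(multi_cloud, k8s_expertise, complex_microservices,
                    gpu_batch_ml, small_team, cost_critical):
    """Walk the decision tree above; each boolean answers one question in order."""
    if multi_cloud:
        return "EKS"
    if k8s_expertise:
        return "EKS"
    if complex_microservices:
        return "EKS"  # 20+ services with complex networking: invest in Kubernetes
    if gpu_batch_ml:
        return "EKS"
    if small_team:
        return "ECS Fargate"  # team of fewer than 10 developers
    if cost_critical:
        return "ECS + EC2"    # high utilization justifies managing instances
    return "ECS Fargate"

print(choose_platform(False, False, False, False, True, False))
```

Because the tree short-circuits, any "yes" to the first four questions resolves to EKS before team size or cost optimization is ever considered.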
Deployment Best Practices
ECS Fargate Deployment Pattern
# CloudFormation template for ECS Fargate
AWSTemplateFormatVersion: '2010-09-09'
Description: 'ECS Fargate Service with ALB'
Resources:
Cluster:
Type: AWS::ECS::Cluster
Properties:
ClusterName: production
ClusterSettings:
- Name: containerInsights
Value: enabled
TaskDefinition:
Type: AWS::ECS::TaskDefinition
Properties:
Family: web-app
NetworkMode: awsvpc
RequiresCompatibilities:
- FARGATE
Cpu: '512'
Memory: '1024'
ExecutionRoleArn: !GetAtt ExecutionRole.Arn
TaskRoleArn: !GetAtt TaskRole.Arn
ContainerDefinitions:
- Name: app
Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/web-app:latest'
PortMappings:
- ContainerPort: 8080
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-group: !Ref LogGroup
awslogs-region: !Ref AWS::Region
awslogs-stream-prefix: ecs
Environment:
- Name: ENVIRONMENT
Value: production
Secrets:
- Name: DB_PASSWORD
ValueFrom: !Sub 'arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:db-password'
Service:
Type: AWS::ECS::Service
DependsOn: LoadBalancerListener
Properties:
ServiceName: web-app
Cluster: !Ref Cluster
TaskDefinition: !Ref TaskDefinition
DesiredCount: 3
LaunchType: FARGATE
NetworkConfiguration:
AwsvpcConfiguration:
SecurityGroups:
- !Ref ServiceSecurityGroup
Subnets:
- !Ref PrivateSubnet1
- !Ref PrivateSubnet2
AssignPublicIp: DISABLED
LoadBalancers:
- ContainerName: app
ContainerPort: 8080
TargetGroupArn: !Ref TargetGroup
HealthCheckGracePeriodSeconds: 60
AutoScalingTarget:
Type: AWS::ApplicationAutoScaling::ScalableTarget
Properties:
MaxCapacity: 10
MinCapacity: 2
ResourceId: !Sub 'service/${Cluster}/${Service.Name}'
RoleARN: !GetAtt AutoScalingRole.Arn
ScalableDimension: ecs:service:DesiredCount
ServiceNamespace: ecs
AutoScalingPolicy:
Type: AWS::ApplicationAutoScaling::ScalingPolicy
Properties:
PolicyName: cpu-scaling
PolicyType: TargetTrackingScaling
ScalingTargetId: !Ref AutoScalingTarget
TargetTrackingScalingPolicyConfiguration:
TargetValue: 70.0
PredefinedMetricSpecification:
PredefinedMetricType: ECSServiceAverageCPUUtilization
ScaleInCooldown: 300
ScaleOutCooldown: 60
EKS Deployment Pattern with Terraform
# Terraform configuration for EKS cluster
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 19.0"
cluster_name = "production-eks"
cluster_version = "1.28"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
# Enable IRSA (IAM Roles for Service Accounts)
enable_irsa = true
# Managed node groups
eks_managed_node_groups = {
general = {
desired_size = 3
min_size = 2
max_size = 10
instance_types = ["t3.large"]
capacity_type = "ON_DEMAND"
labels = {
workload-type = "general"
}
taints = []
}
compute_intensive = {
desired_size = 1
min_size = 0
max_size = 5
instance_types = ["c5.2xlarge"]
capacity_type = "SPOT"
labels = {
workload-type = "compute-intensive"
}
taints = [{
key = "workload-type"
value = "compute-intensive"
effect = "NoSchedule"
}]
}
}
# Fargate profiles for serverless pods
fargate_profiles = {
default = {
name = "default"
selectors = [
{
namespace = "kube-system"
labels = {
k8s-app = "kube-dns"
}
},
{
namespace = "staging"
}
]
}
}
# Cluster addons
cluster_addons = {
coredns = {
most_recent = true
}
kube-proxy = {
most_recent = true
}
vpc-cni = {
most_recent = true
}
aws-ebs-csi-driver = {
most_recent = true
}
}
# CloudWatch logging
cluster_enabled_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]
tags = {
Environment = "production"
Terraform = "true"
}
}
# Install AWS Load Balancer Controller
resource "helm_release" "aws_load_balancer_controller" {
name = "aws-load-balancer-controller"
repository = "https://aws.github.io/eks-charts"
chart = "aws-load-balancer-controller"
namespace = "kube-system"
set {
name = "clusterName"
value = module.eks.cluster_name
}
set {
name = "serviceAccount.create"
value = "true"
}
set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = aws_iam_role.aws_load_balancer_controller.arn
}
}
# Cluster Autoscaler
resource "helm_release" "cluster_autoscaler" {
name = "cluster-autoscaler"
repository = "https://kubernetes.github.io/autoscaler"
chart = "cluster-autoscaler"
namespace = "kube-system"
set {
name = "autoDiscovery.clusterName"
value = module.eks.cluster_name
}
set {
name = "awsRegion"
value = var.aws_region
}
set {
name = "rbac.serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = aws_iam_role.cluster_autoscaler.arn
}
}
Monitoring and Observability
ECS Monitoring
# CloudWatch monitoring for ECS with boto3
import boto3
from datetime import datetime, timedelta
cloudwatch = boto3.client('cloudwatch')
# Get ECS service metrics
response = cloudwatch.get_metric_statistics(
Namespace='AWS/ECS',
MetricName='CPUUtilization',
Dimensions=[
{'Name': 'ServiceName', 'Value': 'web-app'},
{'Name': 'ClusterName', 'Value': 'production'}
],
StartTime=datetime.utcnow() - timedelta(hours=1),
EndTime=datetime.utcnow(),
Period=300, # 5 minutes
Statistics=['Average', 'Maximum']
)
for datapoint in response['Datapoints']:
print(f"Time: {datapoint['Timestamp']}, Avg CPU: {datapoint['Average']:.2f}%")
# Create CloudWatch alarm
cloudwatch.put_metric_alarm(
AlarmName='ecs-high-cpu',
ComparisonOperator='GreaterThanThreshold',
EvaluationPeriods=2,
MetricName='CPUUtilization',
Namespace='AWS/ECS',
Period=300,
Statistic='Average',
Threshold=80.0,
ActionsEnabled=True,
AlarmActions=['arn:aws:sns:region:account:ops-alerts'],
AlarmDescription='Alert when ECS service CPU exceeds 80%',
Dimensions=[
{'Name': 'ServiceName', 'Value': 'web-app'},
{'Name': 'ClusterName', 'Value': 'production'}
]
)
EKS Monitoring
# Prometheus monitoring for EKS
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
namespace: monitoring
data:
prometheus.yml: |
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
# Kubernetes API server
- job_name: 'kubernetes-apiservers'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https
# Kubernetes nodes
- job_name: 'kubernetes-nodes'
kubernetes_sd_configs:
- role: node
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
# Kubernetes pods
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
action: replace
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
target_label: __address__
# Application metrics
- job_name: 'web-app'
static_configs:
- targets: ['web-app-service:8080']
metrics_path: '/metrics'
---
# Grafana dashboard for EKS
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboard-eks
namespace: monitoring
data:
eks-cluster.json: |
{
"dashboard": {
"title": "EKS Cluster Overview",
"panels": [
{
"title": "Pod CPU Usage",
"targets": [{
"expr": "sum(rate(container_cpu_usage_seconds_total{pod!=\"\"}[5m])) by (pod)"
}]
},
{
"title": "Pod Memory Usage",
"targets": [{
"expr": "sum(container_memory_working_set_bytes{pod!=\"\"}) by (pod) / 1024 / 1024"
}]
},
{
"title": "Network I/O",
"targets": [
{
"expr": "sum(rate(container_network_receive_bytes_total[5m])) by (pod)"
},
{
"expr": "sum(rate(container_network_transmit_bytes_total[5m])) by (pod)"
}
]
}
]
}
}
Conclusion
Choosing between AWS ECS Fargate and EKS isn't about picking the "best" option—it's about matching the right tool to your specific requirements. ECS Fargate delivers the fastest path to production with minimal operational overhead, making it ideal for AWS-native applications, smaller teams, and straightforward container deployments. EKS provides the flexibility, portability, and advanced capabilities needed for complex microservices, multi-cloud strategies, and organizations with existing Kubernetes expertise.
For most teams starting fresh on AWS without multi-cloud requirements, ECS Fargate offers the lowest total cost of ownership when factoring in engineering time and operational complexity. Teams with Kubernetes experience or requiring advanced networking capabilities should choose EKS. Large-scale, high-utilization workloads benefit from EC2 launch types (available in both ECS and EKS) to optimize compute costs.
The decision ultimately comes down to three factors: team expertise, architectural complexity, and long-term portability requirements. Use the decision framework and cost models in this guide to make an informed choice based on your specific situation rather than industry hype or vendor recommendations.
Next Steps
- Assess your requirements using the decision framework provided in this guide
- Calculate total cost of ownership including engineering time, not just compute costs
- Start with a pilot project deploying a non-critical service to validate your choice
- Measure key metrics like deployment frequency, time to production, and operational overhead
- Document your decision with rationale for future reference and team alignment
- Plan for evolution knowing you can migrate between options as requirements change
- Invest in training to ensure your team has expertise in your chosen platform