The Curious Case of Request and Limit on AWS EKS Fargate

November 2, 2025

If you’re running containers on AWS EKS with Fargate, you might have encountered a surprising reality about how billing works. Let me share a scenario that trips up many engineers:

The Scenario

You have a pod with the following resource specifications:

resources:
  requests:
    cpu: 100m
    memory: 500Mi
  limits:
    cpu: 4000m
    memory: 8000Mi

At first glance, you might think: “Great! My pod will use minimal resources (100m CPU, 500Mi memory) but can burst up to 4 vCPUs and 8GB when needed.”

But here’s the twist: On AWS Fargate, you don’t pay for what you use—you pay for what you request.

How Fargate Billing Actually Works

Unlike traditional Kubernetes nodes where you pay for the entire EC2 instance, Fargate billing is based on the vCPU and memory resources requested by your pod. Here’s what makes it curious:

The Reality Check

With the configuration above:

  • Request: 100m CPU (0.1 vCPU), 500Mi memory
  • Limit: 4000m CPU (4 vCPU), 8000Mi memory

You might expect to pay for 0.1 vCPU and 500Mi of memory. However, Fargate has a catch.

Fargate’s Pod Size Calculation

Fargate doesn’t provision resources based on individual container requests. Instead, it:

  1. Sums up all container requests in your pod
  2. Adds 256 MB of memory on top, reserved for the Kubernetes components Fargate runs alongside your containers (kubelet, kube-proxy, containerd)
  3. Rounds up to the nearest supported Fargate configuration

The supported configurations follow specific vCPU and memory combinations (a code sketch of the sizing logic follows the table):

| vCPU      | Memory Options                 |
|-----------|--------------------------------|
| 0.25 vCPU | 0.5 GB, 1 GB, 2 GB             |
| 0.5 vCPU  | 1 GB - 4 GB (1 GB increments)  |
| 1 vCPU    | 2 GB - 8 GB (1 GB increments)  |
| 2 vCPU    | 4 GB - 16 GB (1 GB increments) |
| 4 vCPU    | 8 GB - 30 GB (1 GB increments) |
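To make the sizing concrete, here is a minimal Python sketch of the logic described above. The fargate_size helper is my own illustration, not an AWS API, and it glosses over the GB/GiB distinction:

# A minimal sketch of Fargate's pod sizing: sum the pod's requests, add the
# 256 MB overhead, then round up to the first supported configuration.
FARGATE_TIERS = [
    (0.25, (0.5, 1, 2)),          # 0.25 vCPU: 0.5, 1, or 2 GB
    (0.5, tuple(range(1, 5))),    # 0.5 vCPU: 1-4 GB
    (1.0, tuple(range(2, 9))),    # 1 vCPU: 2-8 GB
    (2.0, tuple(range(4, 17))),   # 2 vCPU: 4-16 GB
    (4.0, tuple(range(8, 31))),   # 4 vCPU: 8-30 GB
]
OVERHEAD_GB = 0.25                # 256 MB reserved for Kubernetes components

def fargate_size(cpu_vcpu, mem_gb):
    """Return the (vCPU, GB) combination Fargate would provision and bill."""
    needed = mem_gb + OVERHEAD_GB
    for vcpu, mem_options in FARGATE_TIERS:
        if cpu_vcpu > vcpu:
            continue              # CPU request doesn't fit this tier
        for mem in mem_options:
            if needed <= mem:
                return vcpu, mem
        # memory doesn't fit this tier either; try the next one
    raise ValueError("requests exceed the largest configuration in the table")

print(fargate_size(0.1, 0.5))    # this post's example -> (0.25, 1)
print(fargate_size(0.251, 0.5))  # just over a tier boundary -> (0.5, 1)

The second call shows the tier cliff discussed later: a CPU request just over 250m jumps the pod to the 0.5 vCPU tier.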

What You Actually Pay For

In our example:

  • CPU request: 100m = 0.1 vCPU → rounds up to 0.25 vCPU
  • Memory request: 500Mi ≈ 0.5 GB, plus the 256 MB overhead ≈ 0.75 GB → rounds up to 1 GB

You’ll be billed for: 0.25 vCPU and 1 GB memory

But here’s where it gets interesting…

The Limit Doesn’t Matter (For Billing)

Your limit of 4 vCPU and 8GB memory is essentially ignored for billing purposes. Fargate provisions based on requests, not limits.

However, there’s an important implication:

Performance Impact

Since Fargate provisions 0.25 vCPU for your pod, even though you set a limit of 4 vCPU:

  • Your container cannot actually use 4 vCPUs
  • It’s constrained by what Fargate provisioned (0.25 vCPU)
  • The limit becomes effectively meaningless

The same applies to memory:

  • Fargate provisions 1 GB (the 256 MB overhead pushes your 500Mi request past the 0.5 GB option)
  • Your 8GB limit won’t help if the pod tries to use more than provisioned
  • OOMKilled errors will occur at the provisioned boundary, not at your limit

Best Practices for Fargate

1. Set Requests = Limits

Since limits don’t give you burst capacity on Fargate, it’s cleaner to set them equal:

resources:
  requests:
    cpu: 250m      # 0.25 vCPU
    memory: 512Mi  # 0.5 GB
  limits:
    cpu: 250m
    memory: 512Mi

2. Understand the Rounding

Always check which Fargate configuration your requests will round up to, and remember that the 256 MB overhead counts toward memory: the 512Mi request above is provisioned as 1 GB, not 0.5 GB. On the CPU side, a request of 251m will round up to 0.5 vCPU, roughly doubling your vCPU cost (see the second call in the sketch above).

3. Right-Size Your Pods

Monitor actual resource usage and set requests accordingly. Over-requesting wastes money; under-requesting causes performance issues or OOM errors.

4. Consider Multiple Containers

If your pod has multiple containers, Fargate sums all their requests before adding the overhead and rounding (a short sketch follows the comment block):

# Container 1: 100m CPU, 256Mi memory
# Container 2: 100m CPU, 256Mi memory
# Total: 200m CPU, 512Mi memory (+ 256Mi overhead ≈ 768Mi)
# Fargate provisions: 0.25 vCPU, 1 GB (rounds up to the nearest supported config)
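Reusing the fargate_size sketch from earlier makes the multi-container math explicit:

# Two containers at 100m CPU / 256Mi each: Fargate sums requests per pod,
# then adds the 256 MB overhead once before rounding up
requests = [(0.1, 0.25), (0.1, 0.25)]        # (vCPU, GB) per container
total_cpu = sum(cpu for cpu, _ in requests)  # 0.2 vCPU
total_mem = sum(mem for _, mem in requests)  # 0.5 GB
print(fargate_size(total_cpu, total_mem))    # -> (0.25, 1)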

The Cost Implication

Let’s look at a real cost scenario (using approximate us-east-1 pricing: about $0.04048 per vCPU-hour and $0.004445 per GB-hour):

Scenario 1: Optimal Configuration

  • Request/Limit: 0.25 vCPU, 0.5 GB
  • Actual Fargate provision (after overhead): 0.25 vCPU, 1 GB
  • Cost: ~$0.015/hour per pod

Scenario 2: Misunderstood Configuration

  • Request: 0.1 vCPU, 0.5 GB (thinking you’re optimizing)
  • Actual Fargate provision: 0.25 vCPU, 1 GB
  • Cost: ~$0.015/hour per pod (same as Scenario 1!)

Scenario 3: Accidental Waste

  • Request: 0.3 vCPU, 0.6 GB
  • Actual Fargate provision: 0.5 vCPU, 1 GB (the 0.3 vCPU request crosses into the next tier!)
  • Cost: ~$0.025/hour per pod (the vCPU charge doubles!)
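You can sanity-check these numbers with a few lines of Python (the rates are the approximate us-east-1 on-demand prices quoted above; always check current pricing):

# Fargate bills per provisioned vCPU-hour and GB-hour
VCPU_HOUR = 0.04048   # approximate us-east-1 rate
GB_HOUR = 0.004445    # approximate us-east-1 rate

def hourly_cost(vcpu, gb):
    return vcpu * VCPU_HOUR + gb * GB_HOUR

print(f"Scenarios 1 & 2 (0.25 vCPU, 1 GB): ${hourly_cost(0.25, 1):.4f}/hr")
print(f"Scenario 3 (0.5 vCPU, 1 GB): ${hourly_cost(0.5, 1):.4f}/hr")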

When Should You Actually Use EKS Fargate?

Given Fargate’s unique billing model, it’s not a one-size-fits-all solution. Here are the scenarios where Fargate shines and where it doesn’t:

✅ Perfect Use Cases for Fargate

1. Batch Jobs with Predictable Resource Needs

Fargate is ideal for batch processing workloads where you know exactly how much CPU and memory you need:

# Example: Nightly ETL job
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing-job
spec:
  template:
    spec:
      restartPolicy: Never        # Jobs require Never or OnFailure
      containers:
      - name: processor
        image: registry.example.com/etl-processor:latest  # placeholder image
        resources:
          requests:
            cpu: 1000m
            memory: 2Gi
          limits:
            cpu: 1000m
            memory: 2Gi

Why it works:

  • Jobs run to completion and terminate
  • You pay only for the duration of execution
  • No idle infrastructure costs
  • Predictable resource usage = no wasted provisioning

2. Self-Hosted CI/CD Runners

Jenkins agents, GitLab runners, GitHub Actions runners, or Argo Workflows on Fargate are excellent choices:

Benefits:

  • Runners spin up on-demand for each build
  • Terminate immediately after job completion
  • No need to maintain a pool of idle EC2 instances
  • Each build gets isolated, dedicated resources
  • Perfect for security-sensitive environments

Example use case:

# Ephemeral CI runner
resources:
  requests:
    cpu: 2000m      # 2 vCPU for fast builds
    memory: 4Gi     # 4 GB for build artifacts
  limits:
    cpu: 2000m
    memory: 4Gi

A typical build might take 5-10 minutes, so you only pay for that duration rather than keeping an EC2 instance running 24/7.
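The arithmetic, assuming the same approximate us-east-1 rates (the 4Gi request plus the 256 MB overhead lands in the 5 GB step of the 2 vCPU tier):

# Approximate cost of a single 10-minute CI build on Fargate
hourly = 2 * 0.04048 + 5 * 0.004445  # 2 vCPU, 5 GB -> ~$0.103/hour
print(f"per build: ${hourly * 10 / 60:.3f}")  # ~$0.017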

3. Microservices with Predictable, Steady Load

If your service has consistent traffic patterns without significant bursts:

Characteristics:

  • Stable request rates throughout the day
  • Predictable resource consumption
  • No need for aggressive autoscaling
  • Clear understanding of CPU/memory requirements

Example: Internal APIs, admin dashboards, scheduled report generators

4. Isolated, Secure Workloads

When you need strong workload isolation:

  • Multi-tenant applications requiring tenant-level isolation
  • Compliance-heavy workloads (each pod gets its own VM boundary)
  • Different customers’ workloads with security requirements

5. Development and Staging Environments

For non-production environments where:

  • You want to minimize operational overhead
  • Resource requirements are well-understood
  • Cost predictability matters more than absolute cost optimization
  • You can terminate pods during non-business hours

❌ Where Fargate Doesn’t Make Sense

1. Highly Variable or Bursty Workloads

If your application has unpredictable spikes:

Problem: You must request resources for peak load, meaning you overpay during low-traffic periods.

Better alternative: EC2-based node groups with Cluster Autoscaler or Karpenter, where you can:

  • Set low requests with high limits
  • Burst into available node capacity
  • Scale nodes down during off-peak

Example of bad fit:

# E-commerce site with 10x traffic during flash sales
# You'd need to request for peak (expensive)
# Or risk OOMKills during spikes (unreliable)

2. Long-Running Services with Consistent Traffic

For services running 24/7 with steady usage:

Cost comparison:

  • Fargate (smallest billable configuration): 0.25 vCPU, 0.5 GB ≈ $9/month per pod
  • EC2 t3.small: 2 vCPU, 2 GB = ~$15/month (can run 8+ pods efficiently)

At scale, EC2 nodes with efficient bin-packing become more cost-effective.
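The monthly math behind that comparison, as a rough sketch (both rates are approximate us-east-1 on-demand prices):

HOURS_PER_MONTH = 730
fargate_pod = 0.25 * 0.04048 + 0.5 * 0.004445  # smallest config, ~$0.0123/hour
t3_small_node = 0.0208                         # 2 vCPU, 2 GB EC2 instance
print(f"8 Fargate pods: ${8 * fargate_pod * HOURS_PER_MONTH:.0f}/month")  # ~$72
print(f"1 t3.small:     ${t3_small_node * HOURS_PER_MONTH:.0f}/month")    # ~$15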

3. Stateful Applications Requiring Node-Level Access

Fargate limitations:

  • No DaemonSets (can’t run node-level agents)
  • No privileged containers by default
  • Limited access to underlying infrastructure
  • No direct EBS volume mounting (only EFS)

Not suitable for:

  • Databases requiring local SSD access
  • Monitoring agents that need node-level metrics
  • Custom networking or storage drivers

4. GPU or Specialized Hardware Workloads

Fargate doesn’t support:

  • GPU instances
  • High-performance networking requirements
  • Custom instance types

Machine learning training, rendering, or HPC workloads need EC2-based nodes.

5. Cost-Sensitive, High-Density Workloads

If you’re running hundreds of small microservices:

The math:

  • 100 pods × (0.25 vCPU, 0.5 GB) on Fargate ≈ $900/month
  • The same workload bin-packed onto 3-4 optimized EC2 instances = significant savings

Bin-packing efficiency matters at scale.

Decision Framework: Fargate vs EC2 Nodes

Ask yourself these questions:

| Question             | Fargate If…                    | EC2 Nodes If…                    |
|----------------------|--------------------------------|----------------------------------|
| Workload duration    | Short-lived, intermittent      | Long-running, 24/7               |
| Resource patterns    | Predictable, steady            | Variable, bursty                 |
| Scale                | <50 pods                       | >100 pods                        |
| Operational overhead | Want zero node management      | Have ops team/automation         |
| Cost priority        | Simplicity > absolute cost     | Optimize every dollar            |
| Isolation needs      | Strong isolation required      | Standard Kubernetes isolation OK |
| Startup speed        | Can tolerate 30-60s cold start | Need <10s pod scheduling         |

Hybrid Approach: Best of Both Worlds

Many organizations use both:

# Fargate for CI/CD and batch jobs. Pods are routed to Fargate when they
# match a Fargate profile's namespace and label selectors; there is no
# Fargate nodeSelector (see the boto3 sketch below).
apiVersion: v1
kind: Pod
metadata:
  namespace: ci
  labels:
    workload: batch      # matched by a Fargate profile selector
---
# EC2 nodes for core services
apiVersion: v1
kind: Pod
metadata:
  labels:
    workload: production
spec:
  nodeSelector:
    node.kubernetes.io/instance-type: t3.large
This gives you:

  • Cost efficiency for steady workloads (EC2)
  • Operational simplicity for ephemeral workloads (Fargate)
  • Flexibility to optimize each workload type
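In practice, the Fargate half of this setup is wired up with a Fargate profile rather than anything in the pod spec. Here is a sketch using boto3’s create_fargate_profile; the cluster name, role ARN, and subnet ID are placeholders:

import boto3

eks = boto3.client("eks")
eks.create_fargate_profile(
    clusterName="my-cluster",             # placeholder
    fargateProfileName="batch-workloads",
    podExecutionRoleArn="arn:aws:iam::123456789012:role/FargatePodRole",  # placeholder
    subnets=["subnet-0123456789abcdef0"], # Fargate requires private subnets
    selectors=[{"namespace": "ci", "labels": {"workload": "batch"}}],
)

Any pod created in the ci namespace with the workload: batch label then lands on Fargate automatically.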

Conclusion

The curious case of requests and limits on AWS EKS Fargate teaches us an important lesson: requests are king. Unlike traditional Kubernetes nodes where limits can provide burst capacity, Fargate provisions based solely on requests, adds a fixed memory overhead, and rounds up to a supported configuration.

Understanding this behavior is crucial for:

  • Cost optimization: Avoiding unnecessary rounding up
  • Performance tuning: Setting realistic expectations for your pods
  • Capacity planning: Knowing exactly what resources your applications have

When to use Fargate:

  • Batch jobs with fixed, predictable resource needs
  • Self-hosted CI/CD runners that spin up on-demand
  • Deployments with steady, predictable load patterns
  • Workloads requiring strong isolation
  • When operational simplicity outweighs cost optimization

When to avoid Fargate:

  • Highly variable or bursty workloads
  • Long-running services at scale
  • Cost-sensitive, high-density deployments
  • Stateful apps requiring node-level access

Remember: On Fargate, your limits are more of a documentation tool than a functional resource boundary. Set your requests wisely, choose the right compute type for each workload, and you’ll master the art of Fargate cost management.