When you deploy applications behind a Network Load Balancer (NLB) in AWS, you usually expect perfect traffic distribution — fast, fair, and stateless.
But what if your backend holds stateful sessions — like in-memory login sessions, caching, or WebSocket connections — and you need a given client to keep hitting the same target every time?
That’s where NLB sticky sessions (also called connection stickiness or source IP affinity) come in. They’re powerful but also misunderstood — and misconfiguring them can lead to uneven load, dropped connections, or mysterious client “resets.”
Let’s break down exactly how they work, how to set them up, what to watch for, and how to troubleshoot the tricky edge cases that appear in production.
1. What Are Sticky Sessions on an NLB?
At a high level, sticky sessions ensure that traffic from the same client consistently lands on the same target (EC2 instance, IP, or container) behind your NLB.
Unlike the Application Load Balancer (ALB) — which uses HTTP cookies for stickiness — the NLB operates at Layer 4 (TCP/UDP).
That means it doesn’t look inside your packets. Instead, it bases stickiness on network-level parameters like:
- Source IP address
- Destination IP and port
- Source port (sometimes included in the hash)
- Protocol (TCP, UDP, or TLS passthrough)
AWS refers to this as “source IP affinity.”
When enabled, the NLB creates a flow-hash mapping that ties the client to a backend target.
As long as the hash remains the same, the same client gets routed to the same target — even across multiple connections.
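As a purely conceptual sketch of that idea (this is not AWS's actual algorithm, and the addresses below are made up): hash the flow tuple, then use the hash to pick a target.

```bash
# Conceptual illustration only -- not AWS's real flow-hash algorithm.
# Hash a flow tuple and map it onto one of three hypothetical targets.
TUPLE="tcp:203.0.113.10:54321:10.0.1.25:443"   # protocol:src_ip:src_port:dst_ip:dst_port
HASH=$(printf '%s' "$TUPLE" | md5sum | cut -c1-8)
TARGET_INDEX=$(( 16#${HASH} % 3 ))
echo "Flow $TUPLE -> target-$TARGET_INDEX"
```

The same tuple always yields the same index; change one element (say, the source port) and the index can change — which is exactly the behavior detailed in section 3.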
2. Enabling Sticky Sessions on an AWS NLB
Stickiness is configured per target group, not at the NLB level.
Step-by-Step via AWS Console
- Go to EC2 → Load Balancers → Target Groups
  Find the target group your NLB listener uses.
- Select the Target Group → Attributes tab
- Under Attributes, set:
  `stickiness.enabled = true`
  `stickiness.type = source_ip`
- Save changes and confirm the attributes are updated.
Step-by-Step via AWS CLI
```bash
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:acct:targetgroup/mytg/abc123 \
  --attributes Key=stickiness.enabled,Value=true Key=stickiness.type,Value=source_ip
```
How to Verify:
```bash
aws elbv2 describe-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:acct:targetgroup/mytg/abc123
```
Sample Output:
```json
{
  "Attributes": [
    { "Key": "stickiness.enabled", "Value": "true" },
    { "Key": "stickiness.type", "Value": "source_ip" }
  ]
}
```
3. How NLB Stickiness Actually Works (Under the Hood)
The NLB’s flow hashing algorithm calculates a hash from several parameters — often the “five-tuple”:
<protocol, source IP, source port, destination IP, destination port>
The hash is used to choose a target. When stickiness is enabled, NLB remembers this mapping for some time (typically a few minutes to hours, depending on flow expiration).
Key Behavior Points:
- If the same client connects again using the same IP and port, the hash matches → same backend target.
- If any part of that tuple changes (e.g. client source port changes), the hash may change → client might hit a different target.
- NLBs maintain this mapping in memory; if the NLB node restarts or fails over, the mapping is lost.
- Sticky mappings can also be lost when cross-zone load balancing or target health status changes.
Not Cookie-Based
Because NLBs don’t inspect HTTP traffic, there’s no cookie involved.
This means:
- You can’t set session duration or expiry time like in ALB stickiness.
- Stickiness only works as long as the same network path and source IP persist.
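For comparison, ALB cookie stickiness is also set through target-group attributes but does have a configurable duration. A rough sketch (the target-group ARN is a placeholder):

```bash
# ALB target group: cookie-based stickiness with an explicit duration (1 hour here).
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:acct:targetgroup/my-alb-tg/def456 \
  --attributes Key=stickiness.enabled,Value=true \
               Key=stickiness.type,Value=lb_cookie \
               Key=stickiness.lb_cookie.duration_seconds,Value=3600
```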
4. Known Limitations & Edge Cases
Sticky sessions on NLBs are helpful but brittle. Here’s what can go wrong:
| Issue | Cause | Effect |
|---|---|---|
| Client source IP changes | NAT, VPN, mobile switching networks | Hash changes → new target |
| Different source port | Client opens multiple sockets or reconnects | Each connection may map differently |
| TLS termination at NLB | NLB terminates TLS | Stickiness not supported on TLS listeners (TCP/UDP only) |
| Unhealthy target | Health check fails | Mapping breaks; NLB reroutes |
| Cross-zone load balancing toggled | Distribution rules change | May break existing sticky mappings |
| DNS round-robin at client | NLB has multiple IPs per AZ | Client DNS resolver may change NLB node |
| UDP behavior | Stateless packets; different flow hash | Stickiness unreliable for UDP |
| Scaling up/down | New targets added | Hash table rebalanced; some clients remapped |
🧠 Tip: If you rely on stickiness, keep your clients stable (same IP) and avoid frequent target registration changes.
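Cross-zone load balancing is an attribute on the load balancer itself, so it is worth checking before you change anything (the ARN below is a placeholder):

```bash
# Inspect the NLB's current cross-zone setting before toggling it.
aws elbv2 describe-load-balancer-attributes \
  --load-balancer-arn arn:aws:elasticloadbalancing:region:acct:loadbalancer/net/my-nlb/abc123 \
  --query "Attributes[?Key=='load_balancing.cross_zone.enabled']"
```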
5. Troubleshooting Sticky Session Problems
When things go wrong, these are the most common patterns you’ll see:
1. “Stickiness not working”
- Check target group attributes:
  `aws elbv2 describe-target-group-attributes --target-group-arn <arn>`
  Ensure `stickiness.enabled` is `true`.
- Make sure your listener protocol is TCP, not TLS.
- Confirm that client IPs aren’t being rewritten by NAT or proxy.
- Check CloudWatch metrics → if one target gets all the traffic, stickiness might be too “sticky” due to limited source IP variety.
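To see whether traffic is actually skewed, you can pull flow metrics from CloudWatch. A sketch, assuming GNU `date` and a placeholder load-balancer dimension:

```bash
# Active flows over the last hour, in 5-minute buckets (AWS/NetworkELB namespace).
aws cloudwatch get-metric-statistics \
  --namespace AWS/NetworkELB \
  --metric-name ActiveFlowCount \
  --dimensions Name=LoadBalancer,Value=net/my-nlb/abc123 \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time   "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 \
  --statistics Average
```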
2. “Some clients lose session state randomly”
- Verify client network stability — mobile clients or corporate proxies can rotate IPs.
- Confirm health checks aren’t flapping targets.
- Review your application session design — if session data lives in memory, consider an external session store (Redis, DynamoDB, etc.).
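A quick way to check whether health checks are flapping is to look at current target health (the ARN is a placeholder):

```bash
# Shows each registered target's state (healthy, unhealthy, draining, etc.) and a reason code.
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:region:acct:targetgroup/mytg/abc123
```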
3. “Load imbalance — one instance overloaded”
- This can happen when many users share one public IP (common in offices or behind ISP NAT).
  All those clients hash to the same backend.
- Mitigate by:
- Disabling stickiness if not strictly required.
- Using ALB with cookie-based stickiness (more granular).
- Scaling target capacity.
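If you decide stickiness isn't needed, turning it off is the same attribute call with the flag flipped:

```bash
# Disable source-IP stickiness on the target group (ARN is a placeholder).
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:acct:targetgroup/mytg/abc123 \
  --attributes Key=stickiness.enabled,Value=false
```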
4. “Connections drop after some time”
- NLB may remove stale flow mappings.
- Check TCP keepalive settings on clients and targets. Ensure `keepalive_time` is below the NLB idle timeout (350 seconds for TCP) to prevent connection resets. Linux commands below:
```bash
# Check keepalive time (seconds before sending the first keepalive probe)
sysctl net.ipv4.tcp_keepalive_time
# Check keepalive interval (seconds between probes)
sysctl net.ipv4.tcp_keepalive_intvl
# Check keepalive probes (number of probes before giving up)
sysctl net.ipv4.tcp_keepalive_probes
# View all at once
sysctl -a | grep tcp_keepalive
```
- Verify idle timeout on backend apps (e.g., web servers closing connections too early).
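If the keepalive timer is above the 350-second window, you can lower it. The values below are illustrative, not prescriptive:

```bash
# Send the first keepalive probe after 5 minutes of idle time (below the NLB's 350s timeout).
sudo sysctl -w net.ipv4.tcp_keepalive_time=300
# Persist the setting across reboots.
echo "net.ipv4.tcp_keepalive_time = 300" | sudo tee /etc/sysctl.d/99-nlb-keepalive.conf
sudo sysctl --system
```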
6. Observability & Testing
You can validate sticky behavior with:
- CloudWatch metrics:
  `ActiveFlowCount`, `NewFlowCount`, and per-target-group metrics such as `HealthyHostCount`.
- VPC Flow Logs: confirm that repeated requests from the same client IP go to the same backend ENI.
- Packet captures: use `tcpdump` or `ss` on your backend instances to see if the same source IP consistently connects.
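For example, a capture filter like the following on a backend instance will show which client IPs keep arriving there (port 80 assumed; adjust for your listener):

```bash
# Watch inbound TCP connection attempts (SYNs) and note the source IPs (Ctrl-C to stop).
sudo tcpdump -nn -i any 'tcp port 80 and tcp[tcpflags] & tcp-syn != 0'
```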
Quick test with curl:
```bash
for i in {1..100}; do
  echo "=== Request $i at $(date) ===" | tee -a curl_test.log
  curl http://<nlb-dns-name>/ -v 2>&1 | tee -a curl_test.log
  sleep 0.5
done
```
Run it from the same host and check which backend responds (log hostname on each instance).
Then try from another IP or VPN — you’ll likely see a different target.
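One quick, throwaway way to make each target identify itself during the test (assumes Python 3 on the instances; not something to run in production):

```bash
# On each backend instance: serve a page containing the instance's hostname on port 8080.
hostname > index.html
python3 -m http.server 8080
```

Pointing the curl loop at the NLB then shows which hostname answers each request.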
7. Best Practices
- Only enable stickiness if necessary.
  Stateless applications scale better without it.
- If using TLS: terminate TLS at the backend, or use an ALB if you need session affinity.
- Use shared session stores.
  Tools like ElastiCache (Redis) or DynamoDB make scaling simpler and safer (a quick sketch follows this list).
- Avoid toggling cross-zone load balancing during traffic, since it resets the sticky map.
- Set up proper health checks — unhealthy targets break affinity immediately.
- Monitor uneven load — large NAT’d user groups can overload a single instance.
- For UDP — consider designing idempotent stateless processing; sticky sessions may not behave reliably.
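As a sketch of the shared-session-store idea (the Redis endpoint, key, and TTL below are all made up):

```bash
# Write session data to a shared ElastiCache (Redis) endpoint with a 1-hour TTL,
# so any backend behind the NLB can read it regardless of which target the client hits.
redis-cli -h my-sessions.xxxxxx.ng.0001.use1.cache.amazonaws.com \
  SET "session:abc123" '{"user":"alice","cart":["item-42"]}' EX 3600

# Any other target can fetch the same session.
redis-cli -h my-sessions.xxxxxx.ng.0001.use1.cache.amazonaws.com GET "session:abc123"
```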
8. Example Architecture Pattern
Scenario: A multiplayer game server behind an NLB.
Each player connects via TCP to the game backend that stores their in-memory state.
✅ Recommended setup:
- Enable `stickiness.enabled = true` and `stickiness.type = source_ip`
- Disable TLS termination at the NLB
- Keep targets in the same AZ with cross-zone load balancing disabled to maintain stable mapping
- Maintain external health and scaling logic to avoid frequent re-registrations
This setup ensures that the same player IP always lands on the same backend server, as long as their network path is stable.
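Put together as CLI calls, the setup above might look roughly like this (ARNs, the port, and the VPC ID are placeholders):

```bash
# 1. TCP target group for the game servers (port 7777 is an arbitrary example).
aws elbv2 create-target-group \
  --name game-tg --protocol TCP --port 7777 --vpc-id vpc-0123456789abcdef0

# 2. Source-IP stickiness on that target group.
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:region:acct:targetgroup/game-tg/abc123 \
  --attributes Key=stickiness.enabled,Value=true Key=stickiness.type,Value=source_ip

# 3. Keep cross-zone load balancing off so mappings stay within the AZ.
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn arn:aws:elasticloadbalancing:region:acct:loadbalancer/net/game-nlb/def456 \
  --attributes Key=load_balancing.cross_zone.enabled,Value=false
```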
9. Summary Table
| Attribute | Supported Value | Notes |
|---|---|---|
| `stickiness.enabled` | true / false | Enables sticky sessions |
| `stickiness.type` | `source_ip` | Only option for NLB |
| Supported Protocols | TCP, UDP (limited) | Not supported for TLS listeners |
| Persistence Duration | Until flow reset | Not configurable |
| Cookie-based Stickiness | ❌ No | Use ALB for cookie-based |
| Best for | Stateful TCP apps | e.g. games, custom protocols |
10. When to Use ALB Instead
If you’re dealing with HTTP/HTTPS applications that manage user sessions via cookies or tokens, you’ll be much happier using an Application Load Balancer.
It offers:
- Configurable cookie duration
- Per-application stickiness
- Layer-7 routing and metrics
The NLB should be reserved for high-performance, low-latency, or non-HTTP workloads that need raw TCP/UDP handling.
11. Closing Thoughts
AWS NLB sticky sessions are a great feature — but they’re not magic glue.
They work well when your network topology and client IPs are predictable, and your app genuinely needs flow affinity.
However, if your environment involves NATs, mobile networks, or frequent scale-ups, expect surprises.
When in doubt:
→ Keep your app stateless,
→ Let the load balancer do its job, and
→ Use stickiness only as a last resort for legacy or session-bound systems.