- ⚠️ DNAT rewrites can silently break TCP handshakes if socket bindings don't match post-NAT IPs.
- 🧠 Netfilter applies DNAT before the routing decision, so DNAT rules for incoming traffic belong in the PREROUTING chain.
- 🚫 TCP RSTs often result when the kernel can't associate incoming packets with a listening socket.
- 🐛 Misaligned conntrack entries can prevent symmetric packet rewriting and kill connections.
- 🛠️ Tools like tcpdump, conntrack, and eBPF are essential for diagnosing NAT handshake failures.
When Simple Routing Gets Complicated
You deploy your backend app, then add an iptables rule that redirects incoming TCP traffic with destination NAT (DNAT). But nothing connects. Packet traces show the SYN reaching the server, yet the client gets a reset (RST) instead of a SYN-ACK. It might look like the server is broken, but the cause is usually subtler. When Netfilter rewrites the destination address, the handshake only succeeds if the NAT rule, the connection tracking state, and the listening socket all agree, and the app has no visibility into the first two. This post shows why TCP connections fail after DNAT and how to fix them.
How TCP Handshake Works
Before we talk about how Netfilter or DNAT affect connections, let's review how the TCP handshake works.
1. SYN: The Client Starts the Conversation
The TCP connection starts when a client sends a packet with the SYN (synchronize) flag set. This packet indicates a request to establish a connection and carries an Initial Sequence Number (ISN).
2. SYN-ACK: The Server Acknowledges
If a server is actively listening on the destination IP and port, it replies with a SYN-ACK packet. This response both acknowledges receipt of the client's SYN and includes the server's own ISN.
3. ACK: The Client Confirms
The handshake completes when the client returns an ACK packet, acknowledging the server's ISN. At this point, both sides consider the connection established and can begin exchanging data.
TCP State Machine Primer
TCP tracks each connection through states such as LISTEN, SYN_SENT, SYN_RECEIVED, and ESTABLISHED. If something goes wrong along the way, especially if a SYN never reaches a matching listening socket, the handshake fails, and the kernel typically answers with a RST.
These transitions must happen in order on both ends, which matters most when middleboxes such as NATs rewrite packets in flight.
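The transitions above can be sketched as a toy state machine. This is a simplified model for illustration only: the table and function names are made up here, and real TCP has more states, timers, and retransmissions.

```python
# Toy model of the handshake states named above (not a real TCP implementation).
SERVER_TRANSITIONS = {
    ("LISTEN", "SYN"): ("SYN_RECEIVED", "SYN-ACK"),
    ("SYN_RECEIVED", "ACK"): ("ESTABLISHED", None),
}
CLIENT_TRANSITIONS = {
    ("SYN_SENT", "SYN-ACK"): ("ESTABLISHED", "ACK"),
    ("SYN_SENT", "RST"): ("CLOSED", None),
}


def step(table, state, segment):
    """Return (next_state, reply); an unexpected segment is answered with RST."""
    return table.get((state, segment), ("CLOSED", "RST"))


def handshake(server_listening=True):
    """Run one handshake attempt and return the final (client, server) states."""
    client = "SYN_SENT"
    server = "LISTEN" if server_listening else "CLOSED"
    server, segment = step(SERVER_TRANSITIONS, server, "SYN")
    client, segment = step(CLIENT_TRANSITIONS, client, segment)
    if segment == "ACK":
        server, _ = step(SERVER_TRANSITIONS, server, segment)
    return client, server
```

With no listener, the SYN falls outside the server's table, so the model answers with a RST and both sides end up CLOSED.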
The Role of Netfilter in DNAT
Netfilter is a core part of the Linux networking stack, allowing for firewalling, packet filtering, and network address translation via tools like iptables and nftables.
Netfilter Hook Chain Overview
Netfilter exposes five hooks, which iptables presents as built-in chains:
- PREROUTING: Processes packets as soon as they arrive, before route lookups.
- INPUT: Handles packets destined for the local machine.
- FORWARD: Manages packets passing through the system (acting as a router).
- OUTPUT: Handles locally generated packets before they are transmitted.
- POSTROUTING: Alters packets before they exit the system.
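As a rough sketch of how these hooks combine, the mapping below lists which chains a packet crosses on each of the three common paths. The path labels and function names are illustrative, not kernel identifiers.

```python
# Which built-in chains a packet traverses, by path through the host.
HOOK_PATHS = {
    "incoming_local": ["PREROUTING", "INPUT"],
    "forwarded": ["PREROUTING", "FORWARD", "POSTROUTING"],
    "outgoing_local": ["OUTPUT", "POSTROUTING"],
}

# Chains of the nat table in which a DNAT rule is valid.
DNAT_CHAINS = ("PREROUTING", "OUTPUT")


def hooks_for(path):
    """Return the ordered list of chains for a given packet path."""
    return HOOK_PATHS[path]


def dnat_hook(path):
    """Return the first chain on the path where DNAT can apply, or None."""
    for hook in hooks_for(path):
        if hook in DNAT_CHAINS:
            return hook
    return None
```

Both incoming paths start at PREROUTING, which is why a single DNAT rule there covers traffic delivered locally as well as traffic forwarded to another host.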
DNAT in the PREROUTING Chain
Destination NAT (DNAT) rewrites a packet's destination IP (and optionally its port) before the kernel makes its routing decision. For incoming traffic it is configured in the PREROUTING chain of the nat table. DNAT redirects traffic addressed to one IP to a different one, which is useful for reverse proxies, container port publishing, and virtual IP routing.
Example iptables DNAT Rule:
iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 192.168.0.100:8080
This rule redirects incoming TCP traffic on port 80 to port 8080 on the internal host 192.168.0.100. Outside clients can reach the backend service without knowing its real address.
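To make the effect of the rule concrete, here is a small model of the rewrite it performs. The Packet type and apply_dnat function are hypothetical stand-ins for what Netfilter does in the kernel, and the client address 203.0.113.7 used in the usage note is a documentation-range placeholder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str = "tcp"


def apply_dnat(pkt, match_port=80, to_ip="192.168.0.100", to_port=8080):
    """Mimic the rule above: rewrite only the destination of matching TCP packets."""
    if pkt.proto == "tcp" and pkt.dst_port == match_port:
        return replace(pkt, dst_ip=to_ip, dst_port=to_port)
    return pkt  # non-matching traffic passes through untouched
```

A SYN built as Packet("203.0.113.7", 51000, "20.30.40.50", 80) comes out addressed to 192.168.0.100:8080, while its source address and port stay exactly as the client sent them.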
DNAT and TCP: Why the Handshake Is Fragile
When DNAT changes packets while they are on their way, it creates problems. Let's look at what can go wrong during what looks like a simple TCP handshake.
Socket Binding Mismatch
The main problem is that a listening TCP socket is often bound to one specific IP and port.
Example:
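struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(8080), .sin_addr.s_addr = inet_addr("10.0.0.20") };
bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)); /* accepts only traffic addressed to 10.0.0.20:8080; needs <arpa/inet.h> */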
Say a client connects to a public IP, like 20.30.40.50, and Netfilter DNATs that traffic to 10.0.0.20:8080. The server's TCP stack then looks for a socket listening on 10.0.0.20:8080, not on 20.30.40.50.
DNAT can rewrite the destination correctly and the connection still fails: if nothing is listening on the rewritten IP/port combination, the TCP stack finds no matching socket and sends a RST back.
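The lookup the TCP stack performs on an incoming SYN can be approximated as follows. This is a deliberately simplified sketch (the real kernel lookup also considers interfaces, network namespaces, and SO_REUSEPORT groups), and lookup_listener is a name invented for this example.

```python
WILDCARD = "0.0.0.0"


def lookup_listener(listeners, dst_ip, dst_port):
    """Return the stack's answer to a SYN for dst_ip:dst_port.

    listeners is a set of (ip, port) pairs that have listening sockets.
    """
    for ip, port in listeners:
        if port == dst_port and ip in (dst_ip, WILDCARD):
            return "SYN-ACK"
    return "RST"  # no socket matches the post-NAT destination
```

With a listener on ("10.0.0.20", 8080), a SYN rewritten to 10.0.0.20:8080 is answered with a SYN-ACK, while one rewritten to any other address or port gets a RST.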
The App Sees the Rewritten IP, Not the Original
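Once DNAT rewrites the packet's destination, your application sees only the rewritten address: the accepted socket's local address is the post-NAT IP and port, not the one the client dialed. This gap between what developers expect and what actually arrives at the socket is a common source of confusion.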
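If an application does need the address the client originally dialed, Linux lets it ask Netfilter through the SO_ORIGINAL_DST socket option. The sketch below assumes Linux, IPv4, and a connection tracked by conntrack. The constant is defined by hand (its value comes from linux/netfilter_ipv4.h) in case Python's socket module does not export it, and the helper names are illustrative.

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # value from <linux/netfilter_ipv4.h>; defined by hand here


def parse_sockaddr_in(raw):
    """Decode a 16-byte struct sockaddr_in into (ip, port)."""
    # Layout: 2 bytes family (skipped), 2 bytes port in network order,
    # 4 bytes IPv4 address, 8 bytes padding.
    port, addr = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(addr), port


def original_destination(conn):
    """Return the pre-DNAT (ip, port) of an accepted IPv4 TCP socket (Linux only)."""
    sol_ip = getattr(socket, "SOL_IP", 0)
    raw = conn.getsockopt(sol_ip, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

A server would call original_destination(conn) on a socket returned by accept(). The call raises OSError when the connection has no conntrack entry, for example when it never passed through NAT.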
Connection Tracking (conntrack): Helper or Hurdle?
Linux's connection tracking system, conntrack, maintains a state table for the connections passing through the system. It keeps translations consistent between requests and responses.
What conntrack Tracks
conntrack matches flows using a 5-tuple:
- Source IP
- Destination IP
- Source Port
- Destination Port
- Protocol (e.g., TCP)
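When DNAT rewrites an incoming packet, conntrack records both the original tuple and the translated one and attaches the NAT mapping to the entry. Return packets are then rewritten in reverse, so the client sees replies from the address it originally contacted.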
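The bookkeeping described above can be modeled with two tuples per connection. This is a hedged sketch of the idea, not conntrack's actual data structure; make_entry and translate_reply are hypothetical names.

```python
def make_entry(orig, dnat_to):
    """Build a conntrack-like entry.

    orig is (src_ip, src_port, dst_ip, dst_port) as the client sent it.
    dnat_to is the (ip, port) the DNAT rule rewrites the destination to.
    """
    src_ip, src_port, _, _ = orig
    # The reply tuple is what the backend's answer will look like on the wire.
    reply = (dnat_to[0], dnat_to[1], src_ip, src_port)
    return {"orig": orig, "reply": reply}


def translate_reply(entry, pkt):
    """Rewrite a return packet's source back to the address the client dialed."""
    if pkt != entry["reply"]:
        return None  # no matching entry, so the packet cannot be un-NATed
    _, _, dialed_ip, dialed_port = entry["orig"]
    return (dialed_ip, dialed_port, pkt[2], pkt[3])
```

The second function shows why a missing or mismatched entry is fatal: without it, the reply keeps the backend's private source address and the client does not recognize it as part of its connection.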
When conntrack Causes Problems
But problems appear when conntrack is missing or out of sync:
- 🔧 Module Not Loaded: On minimal setups, nf_conntrack or a NAT module that depends on it (e.g., nf_nat) isn't loaded, so state tracking fails.
- 🧼 Stale Entries: Old or incomplete entries linger in the conntrack table and misrepresent the connection's state.
- 🔄 Asymmetric Routing: If traffic returns through a different interface, conntrack may not see both sides of the communication.
Each of these cases typically leads to one-sided NAT translation or unexpected resets during the handshake.
Socket Binding and NAT Compatibility
When binding to an IP address or port, your application must match the address that DNAT produces, not the public address that clients dial.
Solution: Bind to All IPs
listener, err := net.Listen("tcp", "0.0.0.0:8080") // wildcard: every local IPv4 address
if err != nil { log.Fatal(err) }
This makes the application listen on every local IPv4 address, so the handshake no longer depends on the listener's address matching the DNAT target exactly.
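A quick way to see wildcard binding in action is to bind a listener and connect to it through a specific local address. The sketch below uses loopback and a kernel-chosen port. The function name is made up for this example, and it only demonstrates the socket side, not NAT itself.

```python
import socket


def accepts_on_loopback(bind_ip="0.0.0.0"):
    """Bind a listener to bind_ip and report whether 127.0.0.1 can connect to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((bind_ip, 0))  # port 0 lets the kernel choose a free port
        srv.listen(1)
        port = srv.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.settimeout(2)
            try:
                # The kernel completes the handshake from the listen backlog,
                # so no accept() call is needed for connect() to succeed.
                cli.connect(("127.0.0.1", port))
                return True
            except OSError:
                return False
```

A listener bound to the wildcard address answers on 127.0.0.1 even though it never named that address, which is the same property that lets it answer on whatever address a DNAT rule targets.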
Use IP_FREEBIND for Early Binding
Some applications (especially those managed by init systems) try to bind to an IP address before it’s assigned to an interface. To allow this, use the IP_FREEBIND socket option.
int optval = 1;
setsockopt(sockfd, IPPROTO_IP, IP_FREEBIND, &optval, sizeof(optval));
This tells the kernel: "I trust this IP will exist soon—let me bind to it now."
Debugging RST Failures with Tcpdump and Conntrack
When the TCP handshake fails, the best way to find the problem is by tracing the full connection attempt.
Use tcpdump to Trace Live Packets
sudo tcpdump -nn -i eth0 'tcp[tcpflags] & (tcp-syn|tcp-rst) != 0'
Focus on:
- Whether the initial SYN reaches the system.
- Whether a SYN-ACK is sent in response.
- Whether a RST is sent instead (and who sends it).
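The three checks above amount to classifying the TCP flags of each captured packet. A minimal sketch, assuming you have extracted the flags byte of each packet in capture order (the function names are illustrative):

```python
# TCP flag bits as they appear in the flags byte of the TCP header.
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def classify(flags):
    """Name the handshake role of a packet from its TCP flags byte."""
    if flags & RST:
        return "RST"
    if (flags & SYN) and (flags & ACK):
        return "SYN-ACK"
    if flags & SYN:
        return "SYN"
    if flags & ACK:
        return "ACK"
    return "other"


def handshake_outcome(flag_bytes):
    """Summarize a capture: 'established', 'reset', or 'incomplete'."""
    seen = [classify(f) for f in flag_bytes]
    if "RST" in seen:
        return "reset"
    if seen[:3] == ["SYN", "SYN-ACK", "ACK"]:
        return "established"
    return "incomplete"
```

A capture that shows a SYN followed by a packet with flags 0x14 (RST and ACK set together) is the typical signature of a SYN that reached a host with no matching listener.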
Use conntrack to Inspect Internal States
sudo conntrack -L
Look for:
- An entry for your connection’s 5-tuple.
- Its current state (look for SYN_SENT, SYN_RECV, or ESTABLISHED).
- Whether the reply tuple reflects the translation your NAT rule should produce.
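Each line of conntrack -L output carries the state plus two tuples, the original direction first and the reply direction second. The parser below is a rough sketch for TCP entries. The sample line is illustrative rather than captured from a real system, and field details can vary between conntrack-tools versions.

```python
import re

# Illustrative TCP entry: original tuple first, reply tuple second.
SAMPLE = (
    "tcp      6 431999 ESTABLISHED "
    "src=203.0.113.7 dst=20.30.40.50 sport=51000 dport=80 "
    "src=10.0.0.20 dst=203.0.113.7 sport=8080 dport=51000 [ASSURED] mark=0 use=1"
)


def parse_conntrack_line(line):
    """Extract state and both tuples from one TCP line of `conntrack -L`."""
    fields = line.split()
    pairs = re.findall(r"(src|dst|sport|dport)=(\S+)", line)
    orig, reply = dict(pairs[:4]), dict(pairs[4:8])
    # With DNAT, the reply's source differs from the original destination.
    dnat = (reply["src"], reply["sport"]) != (orig["dst"], orig["dport"])
    return {"proto": fields[0], "state": fields[3],
            "orig": orig, "reply": reply, "dnat": dnat}
```

If the reply tuple's source is not the address your application is bound to, the DNAT rule and the listener disagree.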
Review kernel logs:
dmesg | grep nf_conntrack
This can surface otherwise hidden failures, such as a full conntrack table ("nf_conntrack: table full, dropping packet") or modules that failed to load.
Common Dev Pitfalls in TCP Handshake with DNAT
Mistakes show up in many ways in practice. Here are some DNAT-related issues that developers often face.
🐳 Docker Binding Failures
By default, Docker maps container ports to host ports with DNAT. If the container app only listens on 127.0.0.1, external connections will fail.
Solution: Bind to 0.0.0.0 inside the container and publish the port explicitly, for example with --publish 8080:8080.
🔁 Hairpin NAT
When clients inside the same network access a service's external IP (instead of its internal address), the NAT device has to do a "loopback" DNAT+SNAT.
Make sure:
- NAT reflection is enabled.
- Internal routing lets redirected traffic go in and back out.
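Why hairpin traffic needs SNAT in addition to DNAT can be shown with a small model. It assumes the client and server share a LAN behind the same router. The function name and the example addresses in the usage note are hypothetical.

```python
def client_accepts_reply(client_ip, dialed_ip, server_ip, router_ip, snat):
    """Return True if the SYN-ACK's source matches the address the client dialed."""
    # After DNAT, the server sees the SYN as coming from the router (with SNAT)
    # or directly from the client (without SNAT).
    syn_source = router_ip if snat else client_ip
    if syn_source == router_ip:
        # The reply goes back through the router, which reverses both rewrites.
        reply_source = dialed_ip
    else:
        # Same LAN: the server answers the client directly, bypassing the router,
        # so the reply still carries the server's internal address.
        reply_source = server_ip
    return reply_source == dialed_ip
```

Called with a client at 192.168.1.5, a server at 192.168.1.10, a router at 192.168.1.1, and a dialed address of 20.30.40.50, the model returns False without SNAT and True with it. In the first case the client receives a SYN-ACK from an address it never contacted and rejects it.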
⚙️ IP Forwarding Disabled
DNAT changes where the packet is routed. If the new destination is another host, the kernel forwards the packet only when IP forwarding is enabled:
sysctl -w net.ipv4.ip_forward=1
Otherwise, forwarded packets are dropped at the routing decision that follows PREROUTING.
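The same setting can be checked from a script by reading its procfs file. A minimal sketch, assuming a Linux host (the helper names are illustrative):

```python
from pathlib import Path

IP_FORWARD_PATH = "/proc/sys/net/ipv4/ip_forward"


def parse_ip_forward(text):
    """Interpret the contents of the ip_forward procfs file."""
    return text.strip() == "1"


def ip_forward_enabled(path=IP_FORWARD_PATH):
    """Return True if the kernel is currently forwarding IPv4 packets."""
    return parse_ip_forward(Path(path).read_text())
```

Running ip_forward_enabled() before debugging anything else rules out the simplest cause of DNAT traffic that arrives but never reaches the backend.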
🚫 Interface Binding Mismatch
Make sure the application listens on an interface and address that actually receive the traffic after NAT, especially on hosts with multiple interfaces or VLANs.
Correct Sequence: DNAT First, Then Route
DNAT is applied in PREROUTING, before the routing decision. The kernel therefore routes on the rewritten destination, so the route to the DNAT target (not to the original address) must be valid.
iptables -t nat -A PREROUTING -p tcp --dport 8080 -j DNAT --to-destination 192.168.1.100:8000
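To make sure replies take the same path back:
- Make sure conntrack is active.
- Set up SNAT (or MASQUERADE) in POSTROUTING if the backend would otherwise reply by a route that bypasses the NAT host.
- Add FORWARD rules that accept the translated traffic in both directions.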
Tips for Solid TCP and NAT Setups
- 🌐 Bind to 0.0.0.0 when possible.
- 🧪 Use IP_FREEBIND when the service may start before its IP address is assigned.
- 🔍 Confirm socket binds using ss -lntp or netstat.
- 🛡️ Avoid binding to pre-NAT IPs unless your NAT uses transparent proxy methods.
- 🧯 Consider a Layer 7 Load Balancer, like NGINX or Envoy, to hide IP/port mismatches.
- 🧠 Actively manage conntrack entries with conntrack-tools.
Advanced Tools for Low-Level Tracing
When problems persist even after checking tcpdump and conntrack, dig deeper:
- nftables trace: Mark packets with meta nftrace set 1 and watch every rule they match with nft monitor trace.
- iptables TRACE: Logs each rule a packet traverses, using the TRACE target in the raw table.
- eBPF via bpftrace: Watch what sockets and syscalls do as the system runs.
- SystemTap or perf: Help track TCP state changes, NAT rewrites, or syscall chains.
- Grafana/Prometheus: Track TCP state metrics over time. Set alerts for too many SYN or RST packets.
NAT + TCP Failure Diagnostic Checklist
Use these steps to find handshake problems:
- ✅ Did the SYN reach the server? (tcpdump)
- ✅ Did the server respond, or was there a RST?
- ✅ Is there a matching conntrack entry in the correct state?
- ✅ Is the server socket bound to the post-NAT IP/port?
Working through each step in order isolates the failing layer and avoids fixes based on guesswork.
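The checklist can also be written down as a small decision function. The inputs are observations you collect by hand with the tools above. The function name and the wording of the messages are illustrative, not output from any real tool.

```python
def diagnose(syn_seen, reply, conntrack_state, listener_matches):
    """Walk the checklist in order and return the first failing step.

    syn_seen: True if tcpdump showed the SYN arriving at the server.
    reply: "SYN-ACK", "RST", or None if nothing came back.
    conntrack_state: state string from conntrack -L, or None if no entry exists.
    listener_matches: True if a socket is bound to the post-NAT IP/port.
    """
    if not syn_seen:
        return "SYN never arrived: check upstream routing and firewall rules"
    if not listener_matches:
        return "no socket on the post-NAT IP/port: fix the bind address or DNAT target"
    if conntrack_state is None:
        return "no conntrack entry: check that connection tracking is active"
    if reply == "RST":
        return "RST despite a matching listener: inspect filter rules and return path"
    if reply is None:
        return "no reply seen: check the return route and any SNAT rules"
    return "handshake path looks consistent"
```

Ordering the checks this way means the cheapest observation (did the SYN arrive at all) is ruled out before anything that requires inspecting kernel state.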
Case Study: DNAT Fails in Production… Until It Doesn’t
A backend developer sets up DNAT to send public_ip:8080 to 10.10.0.25:8000. They expect the app in the container to reply.
Problem:
- App is bound to 10.10.0.20 via property or cloud-config.
- DNAT rule targets 10.10.0.25.
Symptom:
- tcpdump shows SYN arriving.
- conntrack shows zero entries.
- Immediate RST returned.
Fix:
- Change the binding or DNAT destination to match.
- Confirm conntrack records active connection.
- Result: Immediate success with no app changes required.
Understanding Your Stack = Fewer Surprises
Knowing how Netfilter, destination NAT, and the TCP handshake interact helps you fix problems faster. It also lets you build more robust apps that work well behind NAT.
When in doubt, always check:
- Where your application listens.
- Where DNAT sends traffic.
- What the TCP stack sees: use tcpdump, conntrack, and logging tools to show you this.
The handshake is only as reliable as the configuration beneath it.