K3s Networking: Solving the Source IP SNAT Issue

WARNING

This article was written entirely by Google Gemini and supervised by a human to ensure technical accuracy and real-world applicability. It serves as a testament to how AI can assist developers in documenting complex infrastructure challenges and sharing solutions with the global community.

Issue

When transitioning a K3s cluster from a combined 'all-in-one' node to a dedicated Control Plane and Agent architecture, many developers encounter a frustrating hurdle: the loss of the client’s original source IP address. Suddenly, every request to your backend appears to originate from the cluster’s internal node IP, breaking geolocation, rate-limiting, and logging.

This post explores why Kubernetes' default networking behavior (SNAT) masks the source IP during inter-node hops and provides a step-by-step guide to fixing it. We’ll dive into configuring Traefik as a DaemonSet, leveraging hostNetwork mode, and adjusting externalTrafficPolicy to ensure your application sees the real user behind the request, not just the cluster’s internal proxy.

The Problem: The "Mysterious" Node IP

Everything was working perfectly. My K3s cluster was a single-node powerhouse where the Control Plane and Agent roles lived together. My Node.js backend accurately logged client IPs using X-Forwarded-For.

Then, I decided to grow. I separated the roles: one dedicated Control Plane and one Agent node.

Immediately, my logs broke. Instead of seeing the real user’s IP, every single request appeared to come from the internal IP of the Control Plane node. If you’ve ever tried to implement rate-limiting or geolocation, you know this is a nightmare.

Why did the IP disappear?

In a standard Kubernetes setup, when traffic hits a node that doesn’t host the destination Pod, kube-proxy forwards the packet to a node that does. On the way it applies Source Network Address Translation (SNAT), rewriting the packet’s source address to the forwarding node’s own IP.

When my setup was a single node, the traffic hit the node and stayed there. No hops, no SNAT. Once I moved the Pod to a separate Agent node, the Control Plane had to "forward" the traffic. To ensure the response could find its way back, the Control Plane replaced the Client’s IP with its own.

The Solution: Bringing Traefik to the Edge

To fix this, we need to ensure that the entry point (Traefik) is present on every node and that it talks directly to the host’s network.

Transform Traefik into a DaemonSet

By default, K3s installs Traefik as a Deployment. This means it might only run on one node. If traffic hits Node A but Traefik is on Node B, you get SNAT.

By changing it to a DaemonSet, we ensure a Traefik instance runs on every single node, including your Control Plane.

Bypass the NAT with HostNetwork

We want Traefik to listen directly on the host’s ports (80 and 443) rather than through the Kubernetes virtual networking proxy. This allows Traefik to see the "wire" IP of the packet before Kubernetes touches it.
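To make the idea concrete, here is a minimal sketch of what host networking looks like at the Pod level. It is not the manifest the Traefik chart renders; the Pod name, the traefik/whoami test image, and port 8080 are assumptions chosen for illustration only.

```yaml
# Illustrative Pod only; names and port are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hostnet-demo
spec:
  hostNetwork: true                   # share the node's network namespace
  dnsPolicy: ClusterFirstWithHostNet  # keep cluster DNS resolution working
  containers:
    - name: whoami
      image: traefik/whoami
      args: ["--port=8080"]           # assumes port 8080 is free on the node
      ports:
        - containerPort: 8080
```

With hostNetwork enabled the container binds to the node's own interfaces, so a connection to the node's IP on that port reaches the process without passing through a Service.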

Set the Local Traffic Policy

We need to tell the Service to only route traffic to pods on the same node that received the request. This preserves the source IP because it eliminates the inter-node hop.
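The policy described above is a single field on the Service spec. The following is a sketch of how a Service carrying it might look; the real Traefik Service is rendered by the Helm chart, so the selector label and port names here are assumptions, not a copy of the chart's output.

```yaml
# Illustrative Service only; selector and port names are assumed.
apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: kube-system
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # only deliver to Pods on the receiving node
  selector:
    app.kubernetes.io/name: traefik
  ports:
    - name: web
      port: 80
      targetPort: web
    - name: websecure
      port: 443
      targetPort: websecure
```

The trade-off of Local is that a node without a matching Pod drops the traffic instead of forwarding it, which is one reason it pairs well with a DaemonSet.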

The Configuration

In K3s, the cleanest way to apply this is by creating a HelmChartConfig. Edit (or create) the following file on your Control Plane:

/var/lib/rancher/k3s/server/manifests/traefik.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    deployment:
      kind: DaemonSet  # (1)
    hostNetwork: true  # (2)
    service:
      spec:
        externalTrafficPolicy: Local  # (3)
    additionalArguments:
      - "--entryPoints.web.forwardedHeaders.insecure=true"  # (4)
      - "--entryPoints.websecure.forwardedHeaders.insecure=true"
1 Runs Traefik on every node.
2 Binds Traefik directly to the host’s network interface.
3 Forces the packet to stay within the node, preserving the IP.
4 Not sure if required but… Note that this option makes Traefik trust X-Forwarded-* headers from any sender.

Once the file is saved, K3s redeploys Traefik automatically on all nodes.
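One way to check the result is to deploy a small echo service and look at the headers it receives. The sketch below assumes the traefik/whoami image, the hypothetical hostname whoami.example.com, and that the cluster's IngressClass is named traefik; adjust all three to your environment.

```yaml
# Test workload sketch; hostname and ingress class name are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami   # echoes the request headers it receives
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
spec:
  selector:
    app: whoami
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami
spec:
  ingressClassName: traefik       # assumed class name
  rules:
    - host: whoami.example.com    # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: whoami
                port:
                  number: 80
```

After applying it with kubectl apply -f and requesting the hostname from a machine outside the cluster, the X-Forwarded-For value in the response should show that machine's address rather than a node IP.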

Conclusion

Separating your Control Plane and Agents is a great step for cluster stability, but it changes the rules of the game for networking.

By moving Traefik to a DaemonSet and using HostNetwork, you remove the middleman that’s masking your data.

Now, my logs are back to normal, and I can see where my users are actually coming from.

Happy hacking!
