Bug #3666
PMTUD is broken for NATed traffic (closed)
0%
Description
When a system has an interface with a lower MTU than its other interfaces, pf is enabled, and you send traffic larger than the MTU of the egress interface, you end up with a PMTUD black hole.
For example, simple LAN and WAN setup:
- set LAN to 1500 MTU and WAN to 1000
- send something from a host on LAN destined to something on WAN that's bigger than WAN's MTU with DF set, such as with hping:
hping3 -y -d 1400 -2 -p 12345 $dest_IP
where 12345 is the destination port and $dest_IP is the IP or hostname to send the traffic to.
With pf enabled, the packet just disappears and the client gets nothing back. Disable pf, and you get back the appropriate "frag needed, DF set" ICMP error.
The above description and symptoms are specific to 2.2/10-STABLE, as of the most recent snapshot available at the time of this writing.
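For reference, the reply the client should be getting can be sketched as follows. This is only an illustration of the ICMP "destination unreachable, fragmentation needed" message (type 3, code 4, carrying the next-hop MTU), not code from pf or the FreeBSD kernel. The function names are hypothetical. With the hping3 command above, the IPv4 packet is 1400 bytes of UDP payload plus 8 bytes of UDP header and 20 bytes of IP header, so 1428 bytes against a 1000-byte WAN MTU.

```python
import struct

ICMP_DEST_UNREACH = 3   # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4    # ICMP code: fragmentation needed and DF set


def inet_checksum(data: bytes) -> int:
    """Standard Internet checksum (one's complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def frag_needed_error(packet_len: int, df_set: bool, egress_mtu: int,
                      original_head: bytes):
    """Return the ICMP error bytes a router is expected to send back,
    or None when the packet fits or may be fragmented.

    original_head stands in for the offending packet's IP header plus
    the first 8 bytes of its payload, which the error quotes.
    """
    if packet_len <= egress_mtu or not df_set:
        return None
    # Layout: type, code, checksum, unused (16 bits), next-hop MTU (16 bits).
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         0, 0, egress_mtu)
    csum = inet_checksum(header + original_head)
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         csum, 0, egress_mtu)
    return header + original_head


# 1428-byte packet with DF set, 1000-byte egress MTU (placeholder header bytes).
msg = frag_needed_error(1428, True, 1000, bytes(28))
```

In the black hole case described here, no such message ever reaches the LAN host, so the sender has no way to learn that it should lower its packet size.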
Files
Updated by Ermal Luçi over 10 years ago
- Status changed from New to Feedback
Patch put in to try to handle this case.
For the record, this happens because NAT is applied to the packets first, so the generated ICMP error is targeted at the pfSense machine itself.
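A toy model of that ordering problem, for illustration only: it is not pf's code, and the `Packet` type, the function name, and the addresses (documentation ranges) are all made up for this sketch. It assumes the only difference between the working and broken cases is whether the MTU check sees the packet before or after source NAT has rewritten it.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    length: int
    df: bool


def icmp_error_destination(pkt: Packet, wan_addr: str, wan_mtu: int,
                           check_mtu_before_nat: bool) -> Optional[str]:
    """Return the address a 'frag needed' error would be sent to, or None."""
    translated = replace(pkt, src=wan_addr)  # source NAT rewrites the source
    seen = pkt if check_mtu_before_nat else translated
    if seen.length > wan_mtu and seen.df:
        # The error goes to whatever source address the offending packet shows.
        return seen.src
    return None


pkt = Packet(src="192.168.1.10", dst="203.0.113.5", length=1428, df=True)
good = icmp_error_destination(pkt, "198.51.100.1", 1000, check_mtu_before_nat=True)
bad = icmp_error_destination(pkt, "198.51.100.1", 1000, check_mtu_before_nat=False)
```

In this model `good` is the LAN host's address, while `bad` is the firewall's own WAN address, which matches the symptom of the client never seeing the error.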
Updated by Chris Buechler over 10 years ago
- Subject changed from enabling pf breaks PMTUD to PMTUD is broken for NATed traffic
- Status changed from Feedback to New
no change. I did confirm it's specific to NATed traffic and updated subject accordingly. Send any packet > egress interface's MTU with pf enabled where it's NATing the traffic, and you get no reply (and it doesn't leave any other interface on the box either). Turn off NAT, or disable pf, and you get the appropriate dest unreachable, frag needed reply.
Updated by Chris Buechler over 10 years ago
Additional data point. This seemingly isn't an issue in stock FreeBSD 10-STABLE. One I had handy:
FreeBSD m6600-vbox-10stable 10.0-STABLE FreeBSD 10.0-STABLE #0 r265018: Sun Apr 27 19:01:41 UTC 2014 root@grind.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
Same scenario with NAT, that system works fine. Seems a regression related to changes we make.
Updated by Ermal Luçi over 10 years ago
Did you use the same ruleset on stock FreeBSD as on pfSense?
Updated by Chris Buechler over 10 years ago
not identical, no. Had the same basic components - scrub all, pass all, nat on. I can throw the completely identical ruleset on there if that'd help. Can't reach that system at this instant to try it.
Updated by Chris Buechler over 10 years ago
- File broken-rules.debug added
- File broken-pfctl-vvsr.txt added
- File broken-pfctl-vvsn.txt added
I think you're on to something there. This:
scrub all
nat on em0 from 192.168.0.0/16 to any -> (em0)
pass in all
pass out all
seems to work fine.
The attached, a pretty stock allow all ruleset for what we generate, doesn't work. I haven't tried to narrow it down any further, suggestions on likely culprits?
Updated by Ermal Luçi over 10 years ago
- Status changed from New to Feedback
I think the sysctl that was activated should fix this.
Updated by Ermal Luçi over 10 years ago
- % Done changed from 0 to 100
Applied in changeset 415b71f1d41c886b06dfc83d8bc2cb906be78509.
Updated by Chris Buechler about 10 years ago
- Status changed from Feedback to New
no change in behavior
Updated by Ermal Luçi about 10 years ago
Actually, try out an upcoming snapshot. I found a quick hack that will help even for the ICMP case.
Updated by Ermal Luçi about 10 years ago
- Status changed from New to Feedback
Did you try sending a TCP/UDP packet rather than an ICMP one?
Updated by Chris Buechler about 10 years ago
- Status changed from Feedback to Confirmed
no change. Ermal, msg me and we can both take a look at my test setup.
Updated by Chris Buechler about 10 years ago
Ermal - no change with the kernel you built. I have a test setup up now that you can reach. /msg me for info.
Updated by Ermal Luçi about 10 years ago
- Status changed from Confirmed to Feedback
Applied in changeset c46f9695ec7baf6dcfcc5a488fe0dd5dd6f4a00f.
Updated by Ermal Luçi about 10 years ago
The reply from interface was not being set properly.
Works for me now.
Updated by Chris Buechler about 10 years ago
- Status changed from Feedback to Confirmed
- % Done changed from 100 to 0
no change. Test setup on dev ESX is fully in place now, info on chaos wiki.
Updated by Chris Buechler about 10 years ago
- Status changed from Confirmed to Resolved
scratch that, the test box wasn't rebooted post-gitsync, and gitsync doesn't apply the relevant change on the fly. This works now.