PMTUD is broken for NATed traffic
When a system has an interface with a lower MTU than its other interfaces and pf is enabled, sending traffic larger than the MTU of the egress interface results in a PMTUD black hole.
For example, with a simple LAN and WAN setup:
- set LAN to 1500 MTU and WAN to 1000
- send something from a host on LAN destined to something on WAN that's bigger than WAN's MTU with DF set, such as with hping:
hping3 -y -d 1400 -2 -p 12345 $dest_IP
where 12345 is the destination port and $dest_IP is the IP address or hostname to send the traffic to.
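As a rough check of why this test packet cannot fit, the sizes in the hping3 example can be added up as below. This is only a sketch: the 20-byte IPv4 header assumes no IP options, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sizes from the hping3 example: -2 selects UDP, -d 1400 sets the payload size.
ip_header=20      # IPv4 header without options (assumption)
udp_header=8      # fixed UDP header size
payload=1400      # from "-d 1400"
egress_mtu=1000   # WAN MTU from the example setup

packet_len=$((ip_header + udp_header + payload))
echo "IP packet length: ${packet_len} bytes"

if [ "$packet_len" -gt "$egress_mtu" ]; then
    # With DF set (-y) the router may not fragment, so it should answer with
    # ICMP type 3 code 4 (fragmentation needed) instead of forwarding.
    verdict="too big for egress MTU ${egress_mtu}: expect ICMP frag needed"
else
    verdict="fits: forwarded without fragmentation"
fi
echo "$verdict"
```

The packet is well over the WAN MTU, so a correctly working router has to send the ICMP error back to the LAN host rather than drop the packet silently.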
With pf enabled, the packet just disappears and the client gets nothing back. Disable pf, and the client gets back the appropriate "frag needed, DF set" ICMP error.
The above description and symptoms are specific to 2.2/10-STABLE, as of the most recent snapshot available at the time of this writing.
Fixes #3666. Set the sysctl net.inet.icmp.reply_from_interface to 1 to send the ICMP reply from the incoming interface. This relies on another part of the pf patch that undoes NAT if it was already performed.
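For reference, checking and setting that sysctl at runtime would look roughly like this. It is a sketch only: it assumes a root shell on the firewall and a kernel that includes the pf part of the patch, and it does not make the setting persistent across reboots.

```shell
# Show the current value (0 means the reply is not tied to the incoming interface).
sysctl net.inet.icmp.reply_from_interface

# Send ICMP replies from the interface the offending packet arrived on.
sysctl net.inet.icmp.reply_from_interface=1
```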
#2 Updated by Chris Buechler about 4 years ago
- Subject changed from enabling pf breaks PMTUD to PMTUD is broken for NATed traffic
- Status changed from Feedback to New
No change. I did confirm it's specific to NATed traffic and updated the subject accordingly. Send any packet larger than the egress interface's MTU with pf enabled where it's NATing the traffic, and you get no reply (and the packet doesn't leave any other interface on the box either). Turn off NAT, or disable pf, and you get the appropriate "dest unreachable, frag needed" reply.
#3 Updated by Chris Buechler about 4 years ago
Additional data point. This seemingly isn't an issue in stock FreeBSD 10-STABLE. One I had handy:
FreeBSD m6600-vbox-10stable 10.0-STABLE FreeBSD 10.0-STABLE #0 r265018: Sun Apr 27 19:01:41 UTC 2014 email@example.com:/usr/obj/usr/src/sys/GENERIC amd64
In the same scenario with NAT, that system works fine. This seems to be a regression related to changes we make.
#6 Updated by Chris Buechler about 4 years ago
I think you're on to something there. This:
scrub all
nat on em0 from 192.168.0.0/16 to any -> (em0)
pass in all
pass out all
seems to work fine.
The attached ruleset, a pretty stock allow-all ruleset of the kind we generate, doesn't work. I haven't tried to narrow it down any further. Any suggestions on likely culprits?