Bug #14290
ICMPv6 Path MTU Discovery breaks with NPT
Status: closed (100% done)
Description
I have the following setup:
Tunnel via HE.net
Internal Prefix on LAN: 2001:db8:1::1/64
Routed /48 from HE: 2001:db8:2::/48
NPt is set up on the GIF interface like this:
Internal Prefix: 2001:db8:1::/48
External Prefix: 2001:db8:2::/48
MTU Set to 1452 on HE and local tunnel interface, LAN interface set to 1500
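A minimal sketch of what the NPt mapping above does to an address, assuming a plain prefix swap that keeps the host bits unchanged. RFC 6296 additionally defines a checksum-neutral adjustment, which this sketch omits. `translate_prefix` is a hypothetical helper name, not a pfSense or pf function.

```python
import ipaddress


def translate_prefix(addr, internal, external):
    """Swap the internal prefix for the external one, keeping the host bits.

    Illustrative only: this is a plain prefix swap, not the checksum-neutral
    algorithm from RFC 6296.
    """
    ip = ipaddress.IPv6Address(addr)
    src = ipaddress.IPv6Network(internal)
    dst = ipaddress.IPv6Network(external)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("internal and external prefixes must have equal length")
    if ip not in src:
        # Address is outside the internal prefix: leave it untouched.
        return ip
    host_bits = int(ip) & int(src.hostmask)
    return ipaddress.IPv6Address(int(dst.network_address) | host_bits)


# Prefixes from the report; the host part ::1234 is made up for the example.
print(translate_prefix("2001:db8:1::1234", "2001:db8:1::/48", "2001:db8:2::/48"))
```

Under that assumption a LAN host in 2001:db8:1::/64 leaves the tunnel with a source address in 2001:db8:2::/48, which is the address any ICMPv6 error about that packet would have to be mapped back from.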
I checked test-ipv6.com and it complains about MTU issues. I then tested PMTU with tracepath from a Linux machine while watching tcpdump for icmp6, and I see no packet too big messages.
When I disable NPt I immediately get the correct ICMPv6 packet too big replies. The same happens when I have tracepath running while changing config on the tunnel interface and hit apply.
But as soon as NPt is enabled, there are no more ICMPv6 packet too big messages.
pfSense 23.01-RELEASE (arm) on SG-3100
Updated by John S over 1 year ago
I can confirm I have this exact same issue on 23.05.1-RELEASE. However, it's not limited to GIF tunnels: the issue can also be recreated with IPv6 over L2TP.
If NPt is used, ICMPv6 packet too big messages stop being sent. They do work for a few seconds whenever the WAN interface is restarted, but then stop. After that, they are still sent for any IPv6 addresses where NPt is not configured.
Updated by Kristof Provost 10 months ago
I believe I've reproduced this. It looks like the problem is in the icmp6_error() code, which tries to do a route lookup on the translated prefix, fails, and as a result doesn't produce the packet too big error.
In my test adding a route for the translated prefix makes it successfully send the icmp6 error. However, that still doesn't fix the problem fully, because we then fail to translate the icmp6 error's destination address back. I'm continuing to investigate.
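The first half of that behaviour can be pictured with a toy longest-prefix-match table. This is only a model under stated assumptions: it is not the FreeBSD icmp6_error() code, the table contents (a single LAN route, no default route) are invented for the example, and `lookup_route` and `can_send_error` are hypothetical names.

```python
import ipaddress


def lookup_route(routes, addr):
    """Longest-prefix match over a {prefix: interface} dict; None if no match."""
    ip = ipaddress.IPv6Address(addr)
    best = None
    for prefix, ifname in routes.items():
        net = ipaddress.IPv6Network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, ifname)
    return best[1] if best else None


def can_send_error(routes, error_dst):
    """In this model the error is only emitted when the lookup succeeds."""
    return lookup_route(routes, error_dst) is not None


# Assumed table: only the internal LAN prefix is known.
routes = {"2001:db8:1::/64": "lan"}
translated_src = "2001:db8:2::1234"  # source address after NPt (example value)

print(can_send_error(routes, translated_src))   # lookup fails in this model

# Adding a route for the translated prefix makes the lookup succeed.
routes["2001:db8:2::/48"] = "lan"
print(can_send_error(routes, translated_src))
```

The model stops at the lookup. It does not cover the second problem described above, where the error's destination address is not translated back to the internal prefix.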
Updated by Kristof Provost 10 months ago
We can work around the problem by having pf perform the packet-too-big check and generate the icmp6 too big error:
- https://reviews.freebsd.org/D43499
- https://reviews.freebsd.org/D43500
Updated by Marcos M 10 months ago
- Subject changed from ICMPv6 path mtu discovery breaks with enabled NPt to ICMPv6 Path MTU Discovery breaks with NPT
- Status changed from New to Feedback
- Assignee set to Kristof Provost
- Target version set to 2.8.0
- % Done changed from 0 to 100
- Plus Target Version set to 24.03
- Affected Architecture deleted (SG-3100)
Updated by John S 9 months ago
May I ask if this is included in 24.03-DEVELOPMENT (amd64)? I have tested again on the latest build, 24.03.a.20240222.2304, and it's not yet resolved: I don't see ICMP packet too big returned for all routes.
I have two IPv6 /48 WAN networks. I have a route to send 2001:4860:4860::8844 out of WAN1 and a route to send 2001:4860:4860::8888 out of WAN2. NPT is used only for WAN2; my LAN is on a /64 subnet of WAN1. LAN has an MTU of 9000, for jumbo frames. WAN1 and WAN2 have an MTU of 1392.
"traceroute -6 --mtu -I 2001:4860:4860::8844" generates "ICMP6, packet too big, mtu 1392, length 1240" from pfSense.
"traceroute -6 --mtu -I 2001:4860:4860::8888" produces no packet too big.
Updated by Kristof Provost 9 months ago
Yes, the fix is included in that snapshot build.
I had a theory about why it might not be working for you, but it does not match your description of your setup. Briefly, the current fix avoids the problem by checking the destination interface MTU before doing any pf processing. That'll work, unless the packet hits a route-to rule that sends it to an interface with a lower MTU than the original destination interface.
If you say that both WAN1 and WAN2 have the same MTU this should just work.
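The limitation described here can be sketched as a small decision function. This is a toy model of the behaviour as described in this comment, not the actual pf code; `early_mtu_check` and its parameters are hypothetical names, and the packet sizes are example values.

```python
def early_mtu_check(packet_len, routed_if_mtu, route_to_if_mtu=None):
    """Model an MTU check done against the routed interface before pf runs.

    routed_if_mtu:   MTU of the interface the routing table selected.
    route_to_if_mtu: MTU of the interface a route-to rule redirects to,
                     or None when no route-to rule matches.
    """
    if packet_len > routed_if_mtu:
        # The early check sees the oversized packet and reports it.
        return "packet-too-big sent"
    final_mtu = routed_if_mtu if route_to_if_mtu is None else route_to_if_mtu
    if packet_len > final_mtu:
        # The packet passed the early check but does not fit the interface
        # that route-to picked afterwards, so no error was generated.
        return "missed"
    return "forwarded"


# Both WANs at 1392 (as in the report): the early check catches the packet.
print(early_mtu_check(1500, routed_if_mtu=1392, route_to_if_mtu=1392))
# Hypothetical case: route-to redirects to an interface with a lower MTU.
print(early_mtu_check(1450, routed_if_mtu=1500, route_to_if_mtu=1392))
```

In this model the "missed" result only appears when the route-to interface has a lower MTU than the originally routed one, which is why equal WAN MTUs would be expected to work.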
Also, are you running the traceroute commands on the pfSense box itself? That's an entirely different setup, where the icmp error should be produced by the IP stack, not by pf. The fix only affects forwarded traffic.
Updated by John S 9 months ago
OK, thank you. Unfortunately it doesn't seem to be working in my setup.
No, I'm not running the traceroute on pfSense; I'm running it on a different box on the local LAN.
I can try changing the MTU to be different per WAN, which is actually my goal. I set the MTUs to the same value as a workaround, with MSS enabled. Sadly, this workaround only helps TCP traffic, so it is not ideal.
Anything you would like me to try?
Updated by Kristof Provost 9 months ago
To be clear: I'd expect things to just work if both of your WANs have the same MTU, and maybe not if they don't.
You could try this dtrace snippet to see if we trigger an icmp6 error or not.
Given your description I'd expect to see a stack trace from the first traceroute command and not from the second, but we've already seen my expectations do not match your tests.
dtrace -n 'fbt::icmp6_error:entry { stack(); }'
(Terminate with Ctrl+C when done)
Updated by John S 9 months ago
traceroute -6 --mtu -I 2001:4860:4860::8844 which did return a packet too big response, gave:
dtrace: description 'fbt::icmp6_error:entry ' matched 1 probe
CPU ID FUNCTION:NAME
1 66482 icmp6_error:entry
kernel`pf_route6+0x916
kernel`pf_test6+0xf90
kernel`pf_check6_in+0x5e
kernel`pfil_mbuf_in+0x38
kernel`ip6_input+0x654
kernel`swi_net+0x138
kernel`ithread_loop+0x257
kernel`fork_exit+0x7f
kernel`0xffffffff8128409e
traceroute -6 --mtu -I 2001:4860:4860::8888 which did not, gave:
CPU ID FUNCTION:NAME
0 66482 icmp6_error:entry
kernel`pf_route6+0x916
kernel`pf_test6+0xf90
kernel`pf_check6_in+0x5e
kernel`pfil_mbuf_in+0x38
kernel`ip6_input+0x654
kernel`netisr_dispatch_src+0x22c
kernel`ether_demux+0x149
kernel`ether_nh_input+0x36d
kernel`netisr_dispatch_src+0xaf
kernel`ether_input+0x69
kernel`ether_demux+0x8e
kernel`ether_nh_input+0x36d
kernel`netisr_dispatch_src+0xaf
kernel`ether_input+0x69
kernel`vtnet_rxq_eof+0x6d1
kernel`vtnet_rx_vq_process+0xbc
kernel`ithread_loop+0x257
kernel`fork_exit+0x7f
kernel`0xffffffff8128409e
So something different is happening.
On the second attempt, I can tell no ICMPv6 packet too big was returned, since the traceroute stalled trying an MTU of 9000, and I also ran tcpdump to capture all ICMP6 traffic.
I have repeated this test several times now, also swapping the order of the two tests, and the results are always exactly the same.
Just a thought: I am using policy-based routing, so a gateway entry in the firewall. Could this be a factor?
Updated by Kristof Provost 9 months ago
So those backtraces are functionally identical. That would suggest that the reason you're not getting the icmp error in the second case is due to the NPT translation. (That's known, icmp6_error() fails to find the outbound interface because it doesn't know the translated prefix).
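One way to check that claim mechanically is to compare only the frames between icmp6_error's caller and ip6_input, ignoring how each packet reached ip6_input. A small sketch, with the leading frames copied from the two dtrace outputs above (the remaining frames are left out); `leading_frames` is a hypothetical helper.

```python
# Leading frames copied from the two dtrace outputs in this thread.
TRACE_8844 = """
kernel`pf_route6+0x916
kernel`pf_test6+0xf90
kernel`pf_check6_in+0x5e
kernel`pfil_mbuf_in+0x38
kernel`ip6_input+0x654
kernel`swi_net+0x138
"""

TRACE_8888 = """
kernel`pf_route6+0x916
kernel`pf_test6+0xf90
kernel`pf_check6_in+0x5e
kernel`pfil_mbuf_in+0x38
kernel`ip6_input+0x654
kernel`netisr_dispatch_src+0x22c
"""


def leading_frames(trace, last_symbol="ip6_input"):
    """Return the frames from the top of the stack down to last_symbol."""
    frames = []
    for line in trace.strip().splitlines():
        frame = line.strip()
        frames.append(frame)
        # "kernel`ip6_input+0x654" -> "ip6_input"
        symbol = frame.split("`", 1)[-1].split("+", 1)[0]
        if symbol == last_symbol:
            break
    return frames


same = leading_frames(TRACE_8844) == leading_frames(TRACE_8888)
print("pf path identical up to ip6_input:", same)
```

The frames below ip6_input differ only in how the packet was dispatched to the IPv6 input path (software interrupt thread versus direct dispatch from the driver).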
However, in both cases the check in pf_test6() ought to be generating the icmp error, and it's not doing so for either case.
That's presumably because we're testing inbound traffic, and that check explicitly checks for outbound (or to be exact, forwarded) traffic.
We don't run that check until we do the outbound firewall check. That's done post-routing.
However, given that we're handling route-to (i.e. policy-based routing), processing is a little different here. There's no network stack routing decision; pf immediately outputs the packet itself.
We do still perform an outbound rules check, but it doesn't flag the traffic as being forwarded, which I suspect is what's causing the pf_test6() packet-too-big check to be skipped.
I'm going to try to reproduce this locally. If my understanding is correct and complete this may be an easy fix.
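The suspected gating can be written down as a toy predicate. This only models the reasoning in this comment; it is not the pf_test6() source, and `Packet`, `ptb_check_fires` and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    length: int
    direction: str    # "in" or "out"
    forwarded: bool   # whether the packet has been flagged as forwarded


def ptb_check_fires(pkt, out_if_mtu):
    """Model a packet-too-big check that only runs for outbound, forwarded traffic."""
    if pkt.direction != "out":
        return False          # the inbound pass never runs the check
    if not pkt.forwarded:
        return False          # suspected route-to case: flag not set, check skipped
    return pkt.length > out_if_mtu


# Normal forwarding: outbound pass, flagged as forwarded, check fires.
print(ptb_check_fires(Packet(9000, "out", True), 1392))
# Suspected route-to path: outbound rules are checked but the flag is missing.
print(ptb_check_fires(Packet(9000, "out", False), 1392))
```

If that understanding is right, the oversized route-to packet falls through the second branch and no error is produced. How the upstream change actually addresses this is not shown here.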
Updated by Kristof Provost 9 months ago
I've managed to reproduce (what I believe is) your problem in a test case, and the expected fix also fixes that.
That's gone upstream in https://cgit.freebsd.org/src/commit/?id=9566d9272600a472be3a608ea79197c57cad1dc3 and has been cherry-picked into our trees. The next snapshot build should work for you.