Bug #16103
PPPoE WAN loses IPv4 addresses on ``IPV6CP`` ``LayerDown`` events
100%
Description
A dual-stack pfSense instance that uses PPP on its WAN interface (PPPoE in my case) loses its v4 address if IPV6CP should, for any reason, transition to LayerDown state after the v4 address has been assigned.
The cause appears to be in `/usr/local/sbin/ppp-linkdown`. Commit:7610a39 removes all v4 addresses (see #11629) without testing that the script was invoked in response to a v4 LayerDown event, so the removal also runs when the script is invoked in response to a v6 LayerDown event.
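A minimal sketch of the kind of guard that appears to be missing. It assumes the link-down hook receives the address family (`inet` or `inet6`) as its second argument, which is a reading of the mpd5 down-script convention and has not been checked against the actual pfSense script. The helper name `linkdown_family_action` is hypothetical.

```shell
#!/bin/sh
# Sketch only: linkdown_family_action is a hypothetical helper, not code
# taken from /usr/local/sbin/ppp-linkdown.
# Assumption: the hook is called with the address family as $2
# ("inet" for an IPCP LayerDown, "inet6" for an IPV6CP LayerDown).

# Print which cleanup, if any, is appropriate for the family that went down.
linkdown_family_action() {
    case "$1" in
        inet)  echo "remove-v4" ;;   # IPCP went down: v4 cleanup is justified
        inet6) echo "remove-v6" ;;   # IPV6CP went down: leave v4 alone
        *)     echo "none" ;;        # unknown family: do nothing
    esac
}

# Possible use inside a link-down hook:
#   if [ "$(linkdown_family_action "$2")" = "remove-v4" ]; then
#       : # existing v4 address removal would go here
#   fi
```

The point of the dispatch is only that v4 addresses are touched when the family argument says v4 went down, and never as a side effect of a v6 event.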
Forum thread at forum#196820 [1] (be sure to see also the update at post 4), though there's not much discussion there at this point. It is very likely that forum#190097 [2] has the same cause for reasons I'd rather not make public (but am happy to explain privately).
In my case, the trigger for the IPV6CP LayerDown event appears to be a duplicate ConfigureRequest (i.e. same message, same Identifier), possibly due to a timeout, received after `mpd5` has ACK'd the first ConfigureRequest and transitioned from Ack-Rcvd to Opened, with v6 in LayerUp state. This seems to cause `mpd5` to decide that something is wrong and to resynchronise v6 negotiation by transitioning to LayerDown state (which invokes `ppp-linkdown`) and renegotiating v6. The renegotiation succeeds, but by then the v4 address is gone. `mpd5` doesn't know that and doesn't attempt to renegotiate v4.
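For reference, the sequence described above looks consistent with the option negotiation automaton in RFC 1661, which the network control protocols reuse. The sketch below encodes only the three transitions relevant here, based on a reading of the RFC 1661 state transition table (section 4.1). It is not derived from the `mpd5` source, and the function name `fsm_step` is hypothetical. Abbreviations follow the RFC: `RCR+` is a received acceptable Configure-Request, `RCA` a received Configure-Ack, `sca`/`scr` send Configure-Ack/Request, `tlu`/`tld` are This-Layer-Up/Down.

```shell
#!/bin/sh
# Sketch only: a three-row excerpt of the RFC 1661 automaton as I read it.
# fsm_step STATE EVENT  ->  prints "ACTIONS NEXT_STATE"
fsm_step() {
    case "$1:$2" in
        Ack-Rcvd:RCR+) echo "sca,tlu Opened" ;;        # first request: ack it, layer up
        Opened:RCR+)   echo "tld,scr,sca Ack-Sent" ;;  # request while Opened: layer down, renegotiate
        Ack-Sent:RCA)  echo "irc,tlu Opened" ;;        # peer acks our request: layer up again
        *)             echo "unhandled $1" ;;          # everything else is outside this excerpt
    esac
}

# Walk the sequence from the report: first request, duplicate request, ack.
fsm_step Ack-Rcvd RCR+
fsm_step Opened RCR+
fsm_step Ack-Sent RCA
```

Under this reading, any Configure-Request that arrives in Opened state produces a This-Layer-Down followed by renegotiation, which is the point at which `ppp-linkdown` would be invoked for v6.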
I have pcap traces showing both successful and unsuccessful PPP session negotiations which I can make available confidentially, if wanted. They show that true sequencing is a bit different from that implied by ppp.log, including an ACK delay of 1046ms and 1079ms where negotiation failed (and 1041ms when it succeeded). They also show that fail cases include a double termination request from `mpd5`, though it's unclear whether that is significant.
Attached are PPP logs showing success and failure cases, and also an excerpt of the failure case showing only IPV6CP activity.
REPRODUCTION
May be difficult given the involvement of behaviour of an external party, but one possible approach is to capture a single inbound configuration request from the ISP and inject it using `ngctl` or by some other means. How to inject it as if it were received from the ISP is more than I know how to do, but here's how to capture an inbound IPV6CP configuration request (requires bouncing the WAN connection):
    PCAP_FILE=pppoe-ipv6cp-inbound-config.pcap # alter if desired
    WAN_IF=igc3 # alter as required
    mymac=`ifconfig $WAN_IF|grep ether|awk '{print $2}'`
    tcpdump -i $WAN_IF -w $PCAP_FILE "ether dst $mymac and pppoes and ppp proto 0x8057 and ppp[2:1]=0x01" &
    pfSctl -c 'interface reload wan'
Foreground and kill tcpdump. Verify that there is a single configuration request packet (`tcpdump -r $PCAP_FILE -nve`). If you have more than one, use `editcap -r` to keep only one of them. Mine looks like:
    11:16:22.344264 b0:70:0d:1b:29:47 > 90:ec:77:90:18:e2, ethertype PPPoE S (0x8864), length 60: PPPoE [ses 0x635] IP6CP (0x8057), length 16: IP6CP, Conf-Request (0x01), id 0, length 16
        encoded length 14 (=Option(s) length 10)
        0x0000:  8057 0100 000e
          Interface-ID Option (0x01), length 10: 9e89:1eff:fe2f:0000
            0x0000:  9e89 1eff fe2f 0000
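As a cross-check on the capture, the six header bytes in the dump (`8057 0100 000e`) can be split by hand. The sketch below assumes the layout implied by the tcpdump output: a 2-byte PPP protocol number, then a 1-byte code, a 1-byte identifier and a 2-byte length. The function name `decode_ppp_header` is hypothetical.

```shell
#!/bin/sh
# Sketch only: split a 12-digit hex string into PPP protocol, code,
# identifier and length. Assumed layout: proto(2) code(1) id(1) length(2).
decode_ppp_header() {
    hex="$1"
    proto=$(printf '%s' "$hex" | cut -c1-4)    # e.g. 8057 = IPV6CP
    code=$(printf '%s' "$hex" | cut -c5-6)     # 01 = Configure-Request
    id=$(printf '%s' "$hex" | cut -c7-8)       # Identifier
    len=$(printf '%s' "$hex" | cut -c9-12)     # length of code..options
    echo "proto=0x$proto code=$((0x$code)) id=$((0x$id)) length=$((0x$len))"
}

decode_ppp_header 80570100000e
# prints: proto=0x8057 code=1 id=0 length=14
```

With this layout the code byte sits at offset 2 from the start of the protocol field, which is what the `ppp[2:1]=0x01` term in the capture filter selects, and the decoded length of 14 agrees with the "encoded length 14" reported by tcpdump.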
Injecting the packet should cause `mpd5` to behave as shown in the attached logs. It will also cause `mpd5` to send an ACK which could potentially confuse the ISP's LNS.
[2] https://forum.netgate.com/topic/190097/loss-of-ipv4-address-on-pppoe-interface-after-reboot
Files