Bug #12947

open

DHCP6 client does not take any action if the interface IPv6 address changes during renewal

Added by David Myers about 2 years ago. Updated 6 months ago.

Status: Feedback
Priority: Normal
Assignee: -
Category: DHCP (IPv6)
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version: 24.03
Release Notes: Default
Affected Version:
Affected Architecture:

Description

I recently started using T-Mobile 5G Home Internet. The gateway device you're required to use is almost completely unconfigurable. You can't even change DHCP or DNS settings, let alone use a bridge mode or any sort of DMZ or IP bypass. So for IPv6 I need to use NAT, which actually works pretty well.

The problem is that occasionally the IPv6 address will change without the interface going down. First the Router Advertisements change and NAT and dpinger break. Eventually dhcp6c notices the change but does nothing about it:

Mar 11 03:44:26 router dhcp6c[32738]: Sending Renew
Mar 11 03:44:27 router dhcp6c[32738]: dhcp6c Received INFO
Mar 11 03:44:27 router dhcp6c[32738]: remove an address 2607:fb90:5128:15d0:24dc:4e2:d0b6:df11/128 on igb2
Mar 11 03:44:27 router dhcp6c[32738]: add an address 2607:fb90:5120:e987:24dc:4e2:d0b6:df11/128 on igb2
Mar 11 03:44:27 router dhcp6c[32738]: T1(1125) and/or T2(1800) is locally determined

Manually restarting dpinger and reloading the filters gets things working again, but what probably really needs to happen is to run /etc/rc.newwanipv6.

I'm thinking of patching (the generation of) /var/etc/dhcp6c_opt1_script.sh to run /etc/rc.newwanipv6 if the current interface IPv6 address does not match /var/db/opt1_ipv6 after a DHCPv6 INFO or RENEW. Is this a reasonable short-term fix?
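A minimal sketch of the comparison described above, in POSIX sh. It assumes the cache file holds a single address on its first line (the actual format of /var/db/opt1_ipv6 is not confirmed here). The helper names `normalize_addr` and `addr_differs` are hypothetical and not part of the generated dhcp6c script.

```shell
#!/bin/sh
# Hypothetical helpers; names and cache-file format are assumptions.

# Strip whitespace and lower-case the hex digits so that textual
# differences in case do not count as an address change. This does not
# canonicalize zero compression ("::").
normalize_addr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr 'A-F' 'a-f'
}

# Return 0 (true) when the address cached in file $1 differs from the
# current address $2, or when the cache file cannot be read.
addr_differs() {
    cache_file=$1
    [ -r "$cache_file" ] || return 0
    cached=$(normalize_addr "$(head -n 1 "$cache_file")")
    current=$(normalize_addr "$2")
    [ "$cached" != "$current" ]
}

# Example wiring (commented out; paths are the ones named in this report,
# and how $cur is obtained is left open):
# if addr_differs /var/db/opt1_ipv6 "$cur"; then
#     /etc/rc.newwanipv6 igb2
# fi
```

Treating an unreadable cache file as "changed" errs on the side of running the hook once too often rather than missing a change.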


Files

dhcp6c-renew-12947.diff (3.78 KB), Jim Pingle, 03/16/2022 02:36 PM
dhcpd.log (31.7 KB), David Myers, 03/18/2022 01:59 PM
system.log (884 Bytes), David Myers, 03/18/2022 01:59 PM

Related issues

Related to Regression #11570: Gateway monitoring services is not always restarted on interface events, which may prevent a WAN from recovering back to an online state (Closed)

Actions #1

Updated by Jim Pingle about 2 years ago

  • Assignee set to Jim Pingle
  • Target version set to 2.7.0
  • Plus Target Version set to 22.05

For that to trigger, the client would have to fire the script at the moment the change occurs. It may not, but it's hard to say for sure based on the logs you have. For starters, go to System > Advanced, open the Networking tab, check "DHCP6 Debug", and see what it logs at the time.

It's possible the script would be triggered in the RENEW case, but it's not certain. The next problem is that, at least according to the documentation, the script doesn't appear to get any environment variables populated with data it could use to compare old and new IP addresses, as is done for the IPv4 client. It might be viable to read the address out of the /var/db/<if>_ipv6 file and compare, as you suggest, but it would require testing to confirm.

Actions #2

Updated by Jim Pingle about 2 years ago

  • Subject changed from IPv6 address change not noticed to DHCP6 client does not take any action if the interface IPv6 address changes during renewal
Actions #3

Updated by Jim Pingle about 2 years ago

I tried altering the script so it would fire during a renew, with mixed success, and found another odd behavior. If I change the IP address assigned to the device in upstream DHCP, it picks up the new address on the next renew. However, the new address gets added to the interface, so both the old and new are present. The old address isn't removed until its lease expires, even though the client is supposed to be using the new address instead. It's marked deprecated in the meantime:

vtnet0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    description: WAN
    options=800b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
    ether ca:1d:62:6c:c6:9c
    inet6 fe80::c81d:62ff:fe6c:c69c%vtnet0 prefixlen 64 scopeid 0x1
    inet6 2001:db8::103:1 prefixlen 128 deprecated
    inet6 2001:db8::103:2 prefixlen 128
    inet 198.51.100.103 netmask 0xffffff00 broadcast 198.51.100.255
    media: Ethernet 10Gbase-T <full-duplex>
    status: active
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

The functions that fetch and check the interface address only get the old address which remains on the interface, not the additional address that came through.
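The mix of a deprecated old address and a preferred new one can be made visible with a small filter over ifconfig output. This is only an illustrative POSIX sh sketch: `classify_inet6` is a hypothetical name, not a pfSense function, and it assumes the FreeBSD ifconfig line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper; assumes FreeBSD-style "inet6 ADDR prefixlen N ..." lines.

# Read ifconfig-style text on stdin and print "address/prefixlen state"
# for every non-link-local inet6 line, where state is "deprecated" when
# the line carries that flag and "preferred" otherwise.
classify_inet6() {
    awk '$1 == "inet6" && $2 !~ /^fe80:/ {
        state = ($0 ~ /deprecated/) ? "deprecated" : "preferred"
        print $2 "/" $4, state
    }'
}

# Example (commented out): /sbin/ifconfig vtnet0 | classify_inet6
```

A lookup that simply returns the first inet6 address on the interface would pick the deprecated entry here, which matches the behavior described above.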

When picking up the new address, dhcp6c also seems to get stuck in a bit of a loop of REQUEST and RENEW messages (which I've seen anywhere from ~15 to ~60 seconds apart), but that may be caused by the changes made here rather than something else. It's been an issue in the past, though; see #11100 and #9634.

Attached is a patch to try. It's not very well optimized, but it should at least show a difference in behavior.

Clearing the targets and assignment since there is a bit more happening here than I thought.

Actions #4

Updated by David Myers about 2 years ago

The patch didn't work.

I applied the patch to my 2.5.2 system then enabled DHCP6 client debug mode and saved the interface in order to force the dhcp6c client script to get regenerated. I did not reboot.

The IPv6 address changed at approximately 13:07 in the attached logs, which are edited from about 13:00 until rc.newwanipv6 ran a couple times, and with DHCPv4 messages removed.

One thing I don't understand is why there are no messages in the logs from the calls to logger from the dhcp6c client script. The logger command works fine from the command line.

While IPv6 was not working ifconfig showed this:

igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1420
        description: LTE
        options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:67:22:fc:52
        inet6 fe80::2e0:67ff:fe22:fc52%igb2 prefixlen 64 scopeid 0x3
        inet6 2607:fb90:5120:e987:2e0:67ff:fe22:fc52 prefixlen 64 detached deprecated autoconf
        inet6 2607:fb90:1b7d:9992:2e0:67ff:fe22:fc52 prefixlen 64 autoconf
        inet6 2607:fb90:1b7d:9992:24dc:4e2:d0b6:df11 prefixlen 128
        inet 192.168.12.245 netmask 0xffffff00 broadcast 192.168.12.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

In order to recover IPv6 I needed to release and renew the interface, at which point it looked like this:

igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1420
        description: LTE
        options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:e0:67:22:fc:52
        inet6 fe80::2e0:67ff:fe22:fc52%igb2 prefixlen 64 scopeid 0x3
        inet6 2607:fb90:1b7d:9992:2e0:67ff:fe22:fc52 prefixlen 64 autoconf
        inet6 2607:fb90:1b7d:9992:24dc:4e2:d0b6:df11 prefixlen 128
        inet 192.168.12.245 netmask 0xffffff00 broadcast 192.168.12.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

I hadn't looked at ifconfig before and was surprised to see two addresses. pfSense uses the first one.

I've backed out the patch.

Actions #5

Updated by David Myers almost 2 years ago

I neglected to mention that I was using "Disable Gateway Monitoring Action" on my gateways when the above issues occurred. Perhaps that isn't relevant. In any event, I've re-enabled actions because I'm now using a Gateway Group.

Actions #6

Updated by → luckman212 almost 2 years ago

David Myers: I believe I'm facing this exact issue. Take a look at https://forum.netgate.com/topic/172849/rtsold-not-running-ipv6-wan-dhcp-keeps-losing-connectivity/

I'm highly motivated to try to get this fixed!

Actions #7

Updated by → luckman212 over 1 year ago

It appears we are out of luck on having devd fire events for IP address changes. There is a commit, https://reviews.freebsd.org/rGa75819461ec7c7d8468498362f9104637ff7c9e9, but it seems like it might not even be in FreeBSD 13 (much less 12.3). Maybe we can get this backported to 12.x somehow? It seems like it would be immensely useful in these cases.

My poor man's solution is hashing the output of ifconfig on all DHCP6 interfaces with cron and firing an event if the hash changes. But this places unnecessary load on the system and also has a potential for up to 1 minute of downtime before the change is picked up. Still, it's the best I can come up with for now until devd can trigger on address changes. I have a PR for this that I will submit later today.
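The hash-and-compare idea can be sketched roughly as follows in POSIX sh. This is not the code from the PR. The function names and the state-file path are hypothetical, and `cksum` is used only because it is available everywhere; a stronger hash would work the same way.

```shell
#!/bin/sh
# Hypothetical sketch of hash-based change detection.

# Reduce ifconfig-style text on stdin to its inet6 lines (whitespace
# normalized, sorted) and print a checksum of them.
inet6_fingerprint() {
    awk '$1 == "inet6" { $1 = $1; print }' | sort | cksum | awk '{ print $1 }'
}

# Store fingerprint $2 in state file $1. Return 0 only when a previous
# fingerprint existed and differs from the new one.
fingerprint_changed() {
    state_file=$1
    new=$2
    old=""
    [ -r "$state_file" ] && old=$(cat "$state_file")
    printf '%s\n' "$new" > "$state_file"
    [ -n "$old" ] && [ "$old" != "$new" ]
}

# Example cron wiring (commented out; interface and path are placeholders):
# fp=$(/sbin/ifconfig igb2 | inet6_fingerprint)
# fingerprint_changed /tmp/igb2.inet6.fp "$fp" && /etc/rc.newwanipv6 igb2
```

Because whole inet6 lines are hashed, a flag change such as an address becoming deprecated also counts as a change. The first run only records a baseline and reports nothing.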

Actions #8

Updated by → luckman212 over 1 year ago

Just updated PR #4595 with the new mitigation changes. Testers & feedback wanted.

Actions #9

Updated by → luckman212 over 1 year ago

I posted on the PR that since Reid Linnemann has just deprecated pfSense_getall_interface_addresses(), this should probably be updated to use the new pfSense_get_ifaddrs() function, where possible.

However, I do want to note for the record that my patch has been running for over a week now and has 100% fixed my issue.

Actions #10

Updated by Jim Pingle over 1 year ago

  • Status changed from New to Pull Request Review
  • Target version set to 2.7.0
  • Plus Target Version set to 22.09
Actions #11

Updated by Jim Pingle over 1 year ago

  • Plus Target Version changed from 22.09 to 22.11
Actions #12

Updated by Jim Pingle over 1 year ago

  • Plus Target Version changed from 22.11 to 23.01
Actions #13

Updated by Jim Pingle over 1 year ago

  • Status changed from Pull Request Review to Feedback

This needs to be re-tested, since snapshots are now on FreeBSD 14-CURRENT (main) and the change noted above is in the tree. I checked and the relevant commit is present in the branch(es) used to build dev snapshots.

If it still requires changes to the pfSense source we'll need an updated PR and to move this ahead to 23.05 since the current PR does not apply, and additional changes will need more time than we have for the 23.01 release.

Actions #14

Updated by Jim Pingle over 1 year ago

  • Plus Target Version changed from 23.01 to 23.05
Actions #15

Updated by Jim Pingle 11 months ago

  • Plus Target Version changed from 23.05 to 23.09

Still waiting on feedback from someone who can reproduce this to test against a 2.7.0 snap, 23.01 release, or a 23.05 snap.

Actions #16

Updated by David Myers 11 months ago

Jim Pingle wrote in #note-15:

Still waiting on feedback from someone who can reproduce this to test against a 2.7.0 snap, 23.01 release, or a 23.05 snap.

I'm pretty sure this is still happening on 23.01. With T-Mobile Home Internet I can go weeks without the IPv6 address changing, or it can change several times in one day, so I can't reproduce it on demand.

I've written a simple-minded script to detect and correct the problem and run it from cron every 5 minutes, and there's evidence in the logs of it taking action a few times.

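#!/usr/bin/env perl
#
# Delete deprecated IPv6 addresses from one interface, then re-run
# rc.newwanipv6 if anything was deleted.
#
use v5.30;

my $if = 'igb2';    # interface to check
my $deleted = 0;    # number of delete attempts made

# List-form open/system avoids passing the arguments through a shell.
open(my $ifconfig, "-|", "/sbin/ifconfig", $if, "inet6") || die "Can't run /sbin/ifconfig: $!";
while (<$ifconfig>) {
    # Capture address and prefix length of any inet6 line flagged "deprecated".
    if (/inet6 ([0-9a-fA-F.:]+) prefixlen (\d+) .*deprecated/) {
        system("/sbin/ifconfig", $if, "inet6", "$1/$2", "delete");
        $deleted++;
    }
}
close($ifconfig);

# Only trigger the reconfiguration when something was removed.
if ($deleted) {
    system("/etc/rc.newwanipv6", $if);
}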

In fact I think this was the script firing off early this morning. I've edited out messages from NUT:

Apr 17 01:21:49 router rc.gateway_alarm[72711]: >>> Gateway alarm: TMHI_DHCP6 (Addr:2620:fe::10 Alarm:1 RTT:53.447ms RTTsd:9.178ms Loss:41%)
Apr 17 01:21:49 router check_reload_status[401]: updating dyndns TMHI_DHCP6
Apr 17 01:21:49 router check_reload_status[401]: Restarting IPsec tunnels
Apr 17 01:21:49 router check_reload_status[401]: Restarting OpenVPN tunnels/interfaces
Apr 17 01:21:49 router check_reload_status[401]: Reloading filter
Apr 17 01:21:51 router php-fpm[65938]: /rc.filter_configure_sync: MONITOR: TMHI_DHCP6 has packet loss, omitting from routing group TMHI_FAILOVER6
Apr 17 01:21:51 router php-fpm[65938]: 2620:fe::10|2607:fb90:7522:c4a9:a484:c093:675:c336|TMHI_DHCP6|52.851ms|8.805ms|44%|down|highloss
Apr 17 01:25:00 router php-cgi[83652]: rc.newwanipv6: rc.newwanipv6: Info: starting on igb2.
Apr 17 01:25:00 router php-cgi[83652]: rc.newwanipv6: rc.newwanipv6: on (IP address: 2607:fb90:759b:d695:2e0:67ff:fe26:47ea) (interface: opt1) (real interface: igb2).
Apr 17 01:25:01 router php-cgi[83652]: rc.newwanipv6: Removing static route for monitor 2620:fe::fe:10 and adding a new route through fdfd:10:167:100::1
Apr 17 01:25:01 router php-cgi[83652]: rc.newwanipv6: Removing static route for monitor 9.9.9.10 and adding a new route through 192.168.12.1
Apr 17 01:25:01 router php-cgi[83652]: rc.newwanipv6: Removing static route for monitor 2620:fe::10 and adding a new route through fe80::c6e5:32ff:fed7:634e%igb2
Apr 17 01:25:01 router php-cgi[83652]: rc.newwanipv6: Removing static route for monitor 149.112.112.10 and adding a new route through 10.155.98.1
Apr 17 01:25:02 router check_reload_status[401]: Reloading filter
Apr 17 01:25:02 router php-cgi[83652]: rc.newwanipv6: The command '/sbin/ifconfig igb2 inet6 2607:fb90:7522:c4a9:2e0:67ff:fe26:47ea delete' returned exit code '1', the output was 'ifconfig: ioctl (SIOCDIFADDR): Can't assign requested address' 
Apr 17 01:25:02 router php-cgi[83652]: rc.newwanipv6: Resyncing OpenVPN instances for interface TMHI.
Apr 17 01:25:02 router php-cgi[83652]: rc.newwanipv6: Creating rrd update script
Apr 17 01:25:02 router php-cgi[83652]: rc.newwanipv6: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - 2607:fb90:7522:c4a9:2e0:67ff:fe26:47ea -> 2607:fb90:759b:d695:2e0:67ff:fe26:47ea - Restarting packages.
Apr 17 01:25:02 router check_reload_status[401]: Starting packages
Apr 17 01:25:02 router check_reload_status[401]: Reloading filter
Apr 17 01:25:03 router php-fpm[79152]: /rc.start_packages: Restarting/Starting all packages.
Apr 17 01:25:03 router php-fpm[79152]: /rc.start_packages: Stopping service nut
Apr 17 01:25:03 router php-fpm[79152]: /rc.start_packages: Starting service nut
Apr 17 01:25:03 router php-fpm[7842]: /rc.filter_configure_sync: MONITOR: TMHI_DHCP6 is available now, adding to routing group TMHI_FAILOVER6
Apr 17 01:25:03 router php-fpm[7842]: 2620:fe::10|2607:fb90:759b:d695:2e0:67ff:fe26:47ea|TMHI_DHCP6|43.574ms|1.428ms|0.0%|online|none
Actions #17

Updated by Jim Pingle 9 months ago

  • Status changed from Feedback to New
  • Target version changed from 2.7.0 to CE-Next
Actions #18

Updated by Jim Pingle 6 months ago

  • Plus Target Version changed from 23.09 to 24.01
Actions #19

Updated by Marcos M 6 months ago

  • Status changed from New to Feedback

I tested this in 23.09 dev snapshots and am not able to reproduce the issue.

In the logs from a lease change and a manual release/renew, it is notable that an additional IPv6 autoconf address gets added.

After stopping the DHCP6 server (including RAs), the lease expires and the interface address is marked as deprecated. After starting DHCP6/RAs again, pfSense picks up the change.

Given the above, it looks like the original issue as described has been resolved - additional feedback is welcomed.

Actions #20

Updated by Jim Pingle 6 months ago

  • Plus Target Version changed from 24.01 to 24.03
Actions #21

Updated by Marcos M 5 months ago

  • Related to Regression #11570: Gateway monitoring services is not always restarted on interface events, which may prevent a WAN from recovering back to an online state added