Bug #9577

radvd send_ra_forall failed on interface / can't join ipv6-allrouters

Added by Manuel Piovan 7 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: IPv6 Router Advertisements (RADVD)
Target version:
Start date: 06/07/2019
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.5.0
Affected Architecture:
Description

https://forum.netgate.com/topic/142363/ipv6-broken-radvd-can-t-join-ipv6-allrouters-on-interface/19

The log is full of:

Jun 6 18:57:34     radvd     66014     resuming normal operation
Jun 6 18:57:34     radvd     66014     attempting to reread config file
Jun 6 14:01:49     radvd     65728     version 2.17 started 
Jun 6 14:00:39     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 14:00:39     radvd     67952     can't join ipv6-allrouters on ath0_wlan0
Jun 6 14:00:33     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 14:00:22     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 14:00:20     radvd     67952     can't join ipv6-allrouters on ath0_wlan0
Jun 6 14:00:08     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 14:00:01     radvd     67952     can't join ipv6-allrouters on ath0_wlan0
Jun 6 13:59:53     radvd     67952     can't join ipv6-allrouters on ath0_wlan0
Jun 6 13:59:53     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 13:59:40     radvd     67952     can't join ipv6-allrouters on ath0_wlan0
Jun 6 13:59:36     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 13:59:30     radvd     67952     can't join ipv6-allrouters on igb2
Jun 6 13:59:25     radvd     67952     can't join ipv6-allrouters on ath0_wlan0
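
To see which interfaces are affected and how often, the failures can be tallied from a saved copy of the log. The following is a minimal Python sketch, not part of radvd or pfSense. The function name `count_join_failures` is made up for illustration, and it assumes each failure line ends with the interface name, as in the excerpt above.

```python
import re
from collections import Counter

# Matches the failure message shown in the log excerpt and captures the interface name.
JOIN_FAILURE = re.compile(r"can't join ipv6-allrouters on (\S+)")


def count_join_failures(lines):
    """Return a Counter mapping interface name -> number of join failures."""
    counts = Counter()
    for line in lines:
        match = JOIN_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing the lines of a saved log (for example, an open file object) to `count_join_failures` and printing `counts.most_common()` lists the interfaces with the most failures first.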

More debug output:

Jun 7 10:00:25     radvd     55719     polling for 16 second(s), next iface is ath0_wlan0
Jun 7 10:00:25     radvd     55719     igb1 next scheduled RA in 16 second(s)
Jun 7 10:00:25     radvd     55719     send_ra_forall failed on interface igb1
Jun 7 10:00:25     radvd     55719     not sending RA for igb1, interface is not ready
Jun 7 10:00:25     radvd     55719     can't join ipv6-allrouters on igb1
Jun 7 10:00:25     radvd     55719     igb1 address: fe80::a236:9fff:fe85:96f1
Jun 7 10:00:25     radvd     55719     igb1 address: xxxx:xxx:xx:xxx::1
Jun 7 10:00:25     radvd     55719     igb1 linklocal address: fe80::a236:9fff:fe85:96f1
Jun 7 10:00:25     radvd     55719     IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jun 7 10:00:25     radvd     55719     checking ipv6 forwarding of interface not supported
Jun 7 10:00:25     radvd     55719     prefix length for igb1 is 64
Jun 7 10:00:25     radvd     55719     link layer token length for igb1 is 48
Jun 7 10:00:25     radvd     55719     mtu for igb1 is 1500
Jun 7 10:00:25     radvd     55719     igb1 supports multicast or is point-to-point
Jun 7 10:00:25     radvd     55719     igb1 is running
Jun 7 10:00:25     radvd     55719     igb1 is up
Jun 7 10:00:25     radvd     55719     ioctl(SIOCGIFFLAGS) succeeded on igb1
Jun 7 10:00:25     radvd     55719     timer_handler called for igb1
Jun 7 10:00:25     radvd     55719     polling for 0 second(s), next iface is igb1
Jun 7 10:00:25     radvd     55719     igb2 next scheduled RA in 16 second(s)
Jun 7 10:00:25     radvd     55719     send_ra_forall failed on interface igb2
Jun 7 10:00:25     radvd     55719     not sending RA for igb2, interface is not ready
Jun 7 10:00:25     radvd     55719     can't join ipv6-allrouters on igb2
Jun 7 10:00:25     radvd     55719     igb2 address: fe80::a236:9fff:fe85:96f2
Jun 7 10:00:25     radvd     55719     igb2 address: xxxx:xxx:xxx:xxxx::1
Jun 7 10:00:25     radvd     55719     igb2 linklocal address: fe80::a236:9fff:fe85:96f2
Jun 7 10:00:25     radvd     55719     IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jun 7 10:00:25     radvd     55719     checking ipv6 forwarding of interface not supported
Jun 7 10:00:25     radvd     55719     prefix length for igb2 is 64
Jun 7 10:00:25     radvd     55719     link layer token length for igb2 is 48

Files

Example sequence.docx (19.5 KB): Log snippet of calls supporting the RADVD process. Ronald Schellberg, 10/10/2019 02:12 PM

History

#1 Updated by Manuel Piovan 7 months ago

The IPv6 gateway disappears from connected clients and IPv6 stops working. I need to restart radvd to make it work again for a while.

#2 Updated by Greg M 7 months ago

Now I have this as well:

Jun 29 07:17:29 radvd 62926 can't join ipv6-allrouters on hn0.10
Jun 29 07:15:22 radvd 62926 can't join ipv6-allrouters on hn0.10
Jun 29 07:15:00 radvd 62926 can't join ipv6-allrouters on hn0.9
Jun 29 07:13:07 radvd 62926 can't join ipv6-allrouters on hn0.7
Jun 29 07:12:47 radvd 62926 can't join ipv6-allrouters on hn0.10
Jun 29 07:11:25 radvd 62926 can't join ipv6-allrouters on hn0.8
Jun 29 07:11:23 radvd 62926 can't join ipv6-allrouters on hn0.9
Jun 29 07:10:22 radvd 62926 can't join ipv6-allrouters on hn0.10
Jun 29 07:08:10 radvd 62926 can't join ipv6-allrouters on hn0.10

#3 Updated by Greg M 6 months ago

Now I don't have the above any more, but I have this (everything is working just fine):

Jul 22 14:44:54 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:43:25 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:41:56 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:41:20 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:40:03 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:39:37 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:37:53 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:37:32 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:36:42 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:34:32 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:31:44 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:30:26 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:29:31 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway
Jul 22 14:29:21 radvd 40666 IPv6 forwarding on interface seems to be disabled, but continuing anyway

#4 Updated by Manuel Piovan 6 months ago

Greg M wrote:

Now I don't have the above any more, but I have this (everything is working just fine):

IPv6 forwarding on interface seems to be disabled, but continuing anyway

Confirming this, same here.
radvd is now 2.18.

#5 Updated by Greg M 4 months ago

Hi!

Can someone PLEASE take a look at this one?

Thanks!

#6 Updated by Ronald Schellberg 3 months ago

There are multiple issues, some easily solved. The "IPv6 forwarding on interface seems to be disabled" log message can be deleted, as it only indicates that the feature is stubbed out on FreeBSD. I can submit a RADVD patch file for interface.c that deletes 5 lines.

I have been bashing away at this for several weeks now and need some advice from Netgate on whether to continue with the 2.5 version or to focus more on the changes that have been made to stable/12.

I have tried incorporating some of stable/12, and the issue still exists, but to a lesser extent. Having seen that stable/12 doesn't solve the problem, I have switched back to 2.5.

What I have found is an issue with the FreeBSD in6p_leave_group. On every other call, it finds and removes the desired group. The subsequent call to in6p_join_group reinserts the group, but not correctly: the pointer to ifp should be listed in the last entry of the list but is NULL. On the next leave/join cycle in RADVD, in6p_leave_group fails to find the entry (the entry is NULL at this point), and since none is found, it exits. The subsequent call to in6p_join_group also does not find the entry, so the list is incremented and the entry is correctly added. This repeats until the "radvd can't join ipv6-allrouters" condition occurs (somewhere between 1000 and 2000 leave/join cycles, or about 24 hours for me). It would be nice if the leave/join implementation in RADVD were not necessary.
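
For reference, the leave/join cycle described above corresponds to the standard IPV6_LEAVE_GROUP and IPV6_JOIN_GROUP socket options applied to the all-routers group ff02::2. The following is a minimal Python sketch of that cycle from user space, not radvd's actual C code. The helper names `build_mreq` and `rejoin_all_routers` are hypothetical, and the sketch assumes a platform where Python exposes both socket options.

```python
import socket
import struct

ALL_ROUTERS = "ff02::2"  # link-local all-routers multicast group


def build_mreq(ifindex, group=ALL_ROUTERS):
    """Pack a struct ipv6_mreq: 16-byte group address + unsigned int interface index."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def rejoin_all_routers(sock, ifindex):
    """Leave, then join, the all-routers group on one interface.

    Returns True if the join succeeded, False otherwise.
    """
    mreq = build_mreq(ifindex)
    try:
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_LEAVE_GROUP, mreq)
    except OSError:
        pass  # not a member yet; leaving is best-effort in this sketch
    try:
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
    except OSError:
        # Presumably the condition radvd reports as "can't join ipv6-allrouters".
        return False
    return True
```

Calling `rejoin_all_routers(sock, socket.if_nametoindex("igb1"))` repeatedly on an IPv6 socket would exercise the same kernel leave/join path from a test script, which may help reproduce the failure faster than waiting for radvd's own timer.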

I attached an annotated Word document showing 4 RADVD leave/join cycles, with numerous added log messages that detail the above sequence.

I can continue to bash away at this on 2.5, but if the changes in stable/12 are going to be incorporated soon, before 2.5 is released, my time may be better spent testing and fixing that instead.

#7 Updated by Jim Pingle 3 months ago

2.5 will be moving to a 12.1 or stable/12 base, but that choice has not yet been made. It definitely will not stay on 12.0, though.

Even if 12.1 is selected, if specific changes to stable/12 after 12.1-RELEASE are beneficial, we can pick those back if needed.

#8 Updated by Ronald Schellberg 3 months ago

After several failed attempts at creating a 12.1 version, the process that worked was to create a new branch from pfSense/releng/12.1 and then cherry-pick commits made to the 2.5 branch since mid-February. I also applied my 6RD patch to this branch, as I need the stf changes to get IPv6 working for me.

That patch caused a kernel panic and a reboot on my bare-metal firewall, which was impossible to capture on the VGA console. So I switched tactics and created a Hyper-V VM instance on my build machine, which has two hardware network interfaces, but I needed an ISO with a serial console to capture the console spew. Read: multiple rebuilds over the last 20 days. Last night I finally got a version that successfully installs and boots.

With similar logging added to sys/netinet6/in6_mcast.c, I can confirm that releng/12.1 appears not to have the same issues that 2.5 has, since 12.1 rewrote the internals of in6_mcast.c. RADVD has been running for about 5 hours now, and I expect it to keep running as it does on the 2.4 branch. I can confirm tomorrow, as it would stop working for me after about 24 hours.

I would like to try removing the IPV6_LEAVE_GROUP call from the bsd44.c patch of RADVD to see if that is still necessary, but want to make sure this version is stable first.

#9 Updated by Ronald Schellberg 3 months ago

Ronald Schellberg wrote:

I can confirm tomorrow, as it would stop working for me after about 24 hours.

I would like to try removing the IPV6_LEAVE_GROUP call from the bsd44.c patch of RADVD to see if that is still necessary, but want to make sure this version is stable first.

I rebuilt a clean version (without logging and debug), and it has been running on the VM for almost 2 days. I have now installed it on my bare-metal main router.

On a side note, why has this issue been dropped from the 2.5 issue list?

#10 Updated by Jim Pingle about 2 months ago

  • Target version set to 2.5.0

Ronald Schellberg wrote:

On a side note, why has this issue been dropped from the 2.5 issue list?

It was never assigned a target version, so it was never on that list, so it couldn't be "dropped" from the list.

I've added it now. It definitely needs to be addressed before release, but from the looks of the other info here and in the forum thread, it may solve itself once we move the base to 12.1.

The workaround from the forum thread isn't pretty, but it does work. Add a cron job for:

0    *    *    *    *    root    /usr/bin/killall radvd && /bin/sleep 5 && /usr/local/sbin/radvd -p /var/run/radvd.pid -C /var/etc/radvd.conf -m syslog

I haven't tested it, but this would probably also work:

/usr/local/sbin/pfSsh.php playback svc stop radvd && /bin/sleep 5 && /usr/local/sbin/pfSsh.php playback svc start radvd
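
If an unconditional hourly restart feels too blunt, the decision could be driven by the log instead. The following Python sketch is an untested assumption, not something from the forum thread. The function name `needs_restart` is hypothetical, it expects log lines in chronological order (oldest first), and it treats a "version ... started" line as the marker of a fresh radvd process.

```python
def needs_restart(lines):
    """Return True if a join failure was logged after the most recent radvd start."""
    failed = False
    for line in lines:
        if "radvd" not in line:
            continue  # ignore messages from other daemons
        if "version" in line and "started" in line:
            failed = False  # a fresh process; earlier failures no longer matter
        elif "can't join ipv6-allrouters" in line:
            failed = True
    return failed
```

When it returns True, a wrapper would run one of the restart commands above. Whether Python is available on the firewall, and where the radvd log lines are stored, are also assumptions to verify first.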
