Bug #10558
closed
Multicast daemons work at boot, but fail if restarted
Added by Maarten Hendrix over 4 years ago. Updated about 4 years ago.
Description
Problem:
IGMPProxy (and PIMD) will not start after pfSense update on 05-02-2020.
Error message:
MRT_ADD_VIF; Errno(48): Address already in use
Full log:
[2.5.0-DEVELOPMENT][admin@router]/root: igmpproxy -dvvvvvvvv /var/etc/igmpproxy.conf
Searching for config file at '/var/etc/igmpproxy.conf'
Config: Quick leave mode enabled.
Config: Got a phyint token.
Config: IF: Config for interface igb2.4.
Config: IF: Got upstream token.
Config: IF: Got ratelimit token '0'.
Config: IF: Got threshold token '1'.
Config: IF: Got altnet token 213.75.0.0/16.
Config: IF: Altnet: Parsed altnet to 213.75/16.
Config: IF: Got altnet token 10.0.0.0/8.
Config: IF: Altnet: Parsed altnet to 10/8.
Config: IF: Got altnet token 217.166.0.0/16.
Config: IF: Altnet: Parsed altnet to 217.166/16.
IF name : igb2.4
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 1
Allowednet ptr : 644000
Config: Got a phyint token.
Config: IF: Config for interface igb2.44.
Config: IF: Got downstream token.
Config: IF: Got ratelimit token '0'.
Config: IF: Got threshold token '1'.
Config: IF: Got altnet token 172.19.44.0/24.
Config: IF: Altnet: Parsed altnet to 172.19.44/24.
IF name : igb2.44
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 2
Allowednet ptr : 644030
Config: Got a phyint token.
Config: IF: Config for interface pppoe0.
Config: IF: Got disabled token.
IF name : pppoe0
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 0
Allowednet ptr : 0
Config: Got a phyint token.
Config: IF: Config for interface igb1.1.
Config: IF: Got disabled token.
IF name : igb1.1
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 0
Allowednet ptr : 0
Config: Got a phyint token.
Config: IF: Config for interface igb1.20.
Config: IF: Got disabled token.
IF name : igb1.20
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 0
Allowednet ptr : 0
buildIfVc: Interface lo0 Addr: 127.0.0.1, Flags: 0xffff8049, Network: 127/8
buildIfVc: Interface igb1.1 Addr: 172.19.0.1, Flags: 0xffff8843, Network: 172.19.0/24
buildIfVc: Interface igb2.4 Addr: 10.86.116.244, Flags: 0xffff8843, Network: 10.86.112/21
buildIfVc: Interface igb2.44 Addr: 172.19.44.1, Flags: 0xffff8843, Network: 172.19.44/24
buildIfVc: Interface igb1.20 Addr: 203.0.113.1, Flags: 0xffff8843, Network: 203.0.113/24
buildIfVc: Interface pppoe0 Addr: 62.251.92.176, Flags: 0xffff88d1, Network: 62.251.92.176/32
Found config for igb1.1
Found config for igb2.4
Found config for igb2.44
Found config for igb1.20
Found config for pppoe0
Found upstrem IF #0, will assing as upstream Vif 22
adding VIF, Ix 0 Fl 0x0 IP 0xf474560a igb2.4, Threshold: 1, Ratelimit: 0
Network for [igb2.4] : 10.86.112/21
Network for [igb2.4] : 213.75/16
Network for [igb2.4] : 10/8
Network for [igb2.4] : 217.166/16
adding VIF, Ix 1 Fl 0x0 IP 0x012c13ac igb2.44, Threshold: 1, Ratelimit: 0
Network for [igb2.44] : 172.19.44/24
Network for [igb2.44] : 172.19.44/24
MRT_ADD_VIF; Errno(48): Address already in use
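A note on reading the log above: the "IP 0x..." value on each "adding VIF" line appears to be the interface address printed as a raw 32-bit integer on a little-endian machine. The following is a minimal Python sketch under that assumption (the helper name decode_vif_ip is made up for illustration and is not part of igmpproxy) that turns such a value back into dotted-quad form so it can be matched against the buildIfVc lines:

```python
import socket
import struct


def decode_vif_ip(value):
    """Convert the hex 'IP' field from an 'adding VIF' log line to dotted-quad.

    Assumption: the value is a network-order address that was printed as a
    native integer on a little-endian host, so the bytes are packed
    least-significant first.
    """
    return socket.inet_ntoa(struct.pack("<I", value))


# Values taken from the log in this report.
print(decode_vif_ip(0xF474560A))
print(decode_vif_ip(0x012C13AC))
```

With the values from this report, the results line up with the addresses that buildIfVc reports for igb2.4 and igb2.44, which suggests the VIFs are being added with the expected local addresses.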
IGMPProxy config:
##------------------------------------------------------
## Enable Quickleave mode (Sends Leave instantly)
##------------------------------------------------------
quickleave
phyint igb2.4 upstream ratelimit 0 threshold 1
altnet 213.75.0.0/16
altnet 10.0.0.0/8
altnet 217.166.0.0/16
phyint igb2.44 downstream ratelimit 0 threshold 1
altnet 172.19.44.0/24
phyint pppoe0 disabled
phyint igb1.1 disabled
phyint igb1.20 disabled
As discussed on the forum, more users are facing this problem:
https://forum.netgate.com/topic/153228/igmpproxy-is-not-starting-after-update-to-latest-2-5-0-dev
Updated by Jim Pingle over 4 years ago
- Category changed from Unknown to IGMP Proxy
- Status changed from New to Feedback
If you have been tracking 2.5.0 snapshots since before early May, first make sure that igmpproxy gets reinstalled forcefully since it may not match the running kernel:
pkg update -f; pkg upgrade -yf igmpproxy
Do the same with pimd if you have it installed.
Updated by Maarten Hendrix over 4 years ago
After running that and restarting the pfSense box, IGMPProxy still won't start.
[2.5.0-DEVELOPMENT][admin@router.hendrix.network]/root: igmpproxy -h
Usage: igmpproxy [-h] [-n] [-d] [-v [-v]] <configfile>
-h Display this help screen
-n Do not run as a daemon
-d Run in debug mode. Output all messages on stderr. Implies -n.
-v Be verbose. Give twice to see even debug messages.
igmpproxy 0.2.1
[2.5.0-DEVELOPMENT][admin@router.hendrix.network]/root: uname -ar
FreeBSD router.hendrix.network 12.1-STABLE FreeBSD 12.1-STABLE 967e7a34ab2(devel-12) pfSense amd64
Is there any other information I can provide to help?
Updated by Maarten Hendrix over 4 years ago
Still same today:
[2.5.0-DEVELOPMENT][admin@router.hendrix.network]/root: pkg update -f ; pkg upgrade -yf igmpproxy
Updating pfSense-core repository catalogue...
Fetching meta.txz: 100% 912 B 0.9kB/s 00:01
Fetching packagesite.txz: 100% 2 KiB 1.8kB/s 00:01
Processing entries: 100%
pfSense-core repository update completed. 7 packages processed.
Updating pfSense repository catalogue...
Fetching meta.conf: 100% 163 B 0.2kB/s 00:01
Fetching packagesite.txz: 100% 136 KiB 139.8kB/s 00:01
Processing entries: 0%
Newer FreeBSD version for package zip:
To ignore this error set IGNORE_OSVERSION=yes
- package: 1201516
- running kernel: 1200086
Ignore the mismatch and continue? [Y/n]: Processing entries: 100%
pfSense repository update completed. 500 packages processed.
All repositories are up to date.
Updating pfSense-core repository catalogue...
pfSense-core repository is up to date.
Updating pfSense repository catalogue...
pfSense repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):
Installed packages to be REINSTALLED:
igmpproxy-0.2.1_1,1 [pfSense]
Number of packages to be reinstalled: 1
22 KiB to be downloaded.
[1/1] Fetching igmpproxy-0.2.1_1,1.txz: 100% 22 KiB 22.4kB/s 00:01
Checking integrity... done (0 conflicting)
[1/1] Reinstalling igmpproxy-0.2.1_1,1...
[1/1] Extracting igmpproxy-0.2.1_1,1: 100%
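The two numbers in the mismatch warning above are FreeBSD OSVERSION values. As a rough aid for reading them, here is a small Python sketch that assumes the usual encoding (major version times 100000, plus minor version times 1000, plus a counter). The function name decode_osversion is hypothetical, and the sketch only splits the number; it says nothing about which kernel was actually booted:

```python
def decode_osversion(value):
    """Split a FreeBSD OSVERSION integer into (major, minor, counter).

    Assumption: value = major * 100000 + minor * 1000 + counter, which is
    how recent FreeBSD branches appear to number themselves.
    """
    major = value // 100000
    minor = (value // 1000) % 100
    counter = value % 1000
    return major, minor, counter


# The two values reported by pkg in the output above.
for label, osversion in (("package", 1201516), ("running kernel", 1200086)):
    print(label, decode_osversion(osversion))
```

Read this way, the package was built for a 12.1-based tree while pkg believes the running kernel is 12.0-based, which is the kind of mismatch the forced reinstall was meant to rule out.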
Updated by Maarten Hendrix over 4 years ago
I also did a full reinstall. Nothing changed.
Updated by Jens Leinenbach over 4 years ago
Maarten Hendrix wrote:
Problem:
IGMPProxy (and PIMD) will not start after pfSense update on 05-02-2020.
Does IGMPProxy run if you uninstall PIMD? The two services would not run at the same time for me. Uninstalling PIMD worked for me.
I also tried both packages and noticed the same some weeks ago. It took some time until I noticed that although IGMPProxy can use IGMPv3 on the WAN side, it does not talk IGMPv3 on the LAN side, so it is not compatible with the MagentaTV box by German Telekom.
Updated by Maarten Hendrix over 4 years ago
I tested with PIMD because it does a similar job.
I tested with it installed and without it installed. Both the same result.
Updated by Jens Leinenbach over 4 years ago
Maarten Hendrix wrote:
I tested with PIMD because it does a similar job.
I tested with it installed and without it installed. Both the same result.
That is interesting. According to my logs, igmpproxy was running properly until I ran "igmpproxy -dvvvvvvvv /var/etc/igmpproxy.conf"
Maybe I also disabled and enabled the service from the GUI.
Now I have the same issue.
What I can say is that I upgraded my hardware between 05-02-2020 and three days ago. My pfSense devices were both up-to-date and igmpproxy was running.
I imported the configuration to the new device and igmpproxy was still running.
Although my igmpproxy service is running, I have the same error message.
"MRT_ADD_VIF; Errno(48): Address already in use"
Additionally, I noted this: After running:
service -e
I get the following error message:
May 20 08:04:41 admin 89232 /usr/sbin/service: WARNING: $igmpproxy_enable is not set properly - see rc.conf(5).
...and more for other services.
Did you notice this line appears twice? Maybe the configuration file is not parsed properly:
Network for [igb2.44] : 172.19.44/24
Although the networks 192.168.1/24 and 192.168.0/24 appear twice, there is no error message.
But when it seems to try to add 193.158/15 a second time, it breaks.
adding VIF, Ix 0 Fl 0x0 IP 0x0101a8c0 lagg0, Threshold: 3, Ratelimit: 0
Network for [lagg0] : 192.168.1/24
Network for [lagg0] : 192.168.1/24
Network for [lagg0] : 239.35/16
Found upstrem IF #0, will assing as upstream Vif 20
adding VIF, Ix 1 Fl 0x0 IP 0x0200a8c0 lagg1, Threshold: 3, Ratelimit: 0
Network for [lagg1] : 192.168.0/24
Network for [lagg1] : 192.168.0/24
Network for [lagg1] : 224/4
Network for [lagg1] : 87.128/10
Network for [lagg1] : 193.158/15
Network for [lagg1] : 80.157/16
Network for [lagg1] : 62.155/16
Network for [lagg1] : 217.0/13
Network for [lagg1] : 193.158/15
MRT_ADD_VIF; Errno(48): Address already in use
It seems that the configuration file is not parsed properly.
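To check this observation against a longer debug log without counting lines by eye, a short Python sketch like the one below can list every interface/network pair that igmpproxy reports more than once. The regular expression is based only on the "Network for [...] : ..." lines quoted in this report, and the function name duplicate_networks is made up for illustration:

```python
import re
from collections import Counter

# Matches lines such as: Network for [igb2.44] : 172.19.44/24
NETWORK_LINE = re.compile(r"Network for \[([^\]]+)\] : (\S+)")


def duplicate_networks(log_text):
    """Return {(interface, network): count} for pairs logged more than once."""
    counts = Counter(NETWORK_LINE.findall(log_text))
    return {pair: n for pair, n in counts.items() if n > 1}


# Excerpt from the debug output quoted in this report.
sample = (
    "adding VIF, Ix 1 Fl 0x0 IP 0x012c13ac igb2.44, Threshold: 1, Ratelimit: 0\n"
    "Network for [igb2.44] : 172.19.44/24\n"
    "Network for [igb2.44] : 172.19.44/24\n"
    "MRT_ADD_VIF; Errno(48): Address already in use\n"
)
print(duplicate_networks(sample))
```

Note that in the lagg0/lagg1 log above, 192.168.1/24 and 192.168.0/24 are also listed twice without an error, so a repeated network on its own may not be what triggers the failure.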
Updated by Maarten Hendrix over 4 years ago
Looks the same indeed:
Found upstrem IF #0, will assing as upstream Vif 22
adding VIF, Ix 0 Fl 0x0 IP 0x3b73560a igb0.4, Threshold: 1, Ratelimit: 0
Network for [igb0.4] : 10.86.112/21
Network for [igb0.4] : 213.75/16
Network for [igb0.4] : 10/8
Network for [igb0.4] : 217.166/16
adding VIF, Ix 1 Fl 0x0 IP 0x012c13ac igb2.44, Threshold: 1, Ratelimit: 0
Network for [igb2.44] : 172.19.44/24
Network for [igb2.44] : 172.19.44/24
MRT_ADD_VIF; Errno(48): Address already in use
Updated by Jens Leinenbach over 4 years ago
Maarten Hendrix wrote:
Looks the same indeed:
[...]
I disabled the service and the debug mode, updated pfsense, including php, rebooted and enabled the service again. The service is running again now.
If an update does not help, try to disable the service and the debug mode, too, reboot, enable the service and save, and do not try to start it on command line. Then see if your service is running (Status/Services).
Updated by Maarten Hendrix over 4 years ago
That indeed looks like it started again.
Will it still work after a reboot, or do I need to disable it every time I update / reboot?
Updated by Jim Pingle over 4 years ago
It might be that it only runs the first time after a reboot and anything that triggers the service to restart may make it fail. If that's the case, it almost sounds like it's not leaving its multicast groups when it exits, or something along those lines.
Updated by Jens Leinenbach over 4 years ago
I concur: I just tried to restart the service via Status/Services and it fails.
Updated by Maarten Hendrix over 4 years ago
Same result here. After a restart of the service it fails. After that if you reboot it still fails to start.
I then stopped the service in the settings and rebooted again. After the reboot I enabled the service again, enabled logging, and it started without a problem.
Updated by Jim Pingle over 4 years ago
- Category changed from IGMP Proxy to Operating System
- Status changed from Feedback to Confirmed
Updated by Jim Pingle over 4 years ago
- Subject changed from IGMPProxy is not starting to Multicast daemons work at boot, but fail if restarted
Updated by Marc J over 4 years ago
Jim Pingle wrote:
It might be that it only runs the first time after a reboot and anything that triggers the service to restart may make it fail. If that's the case, it almost sounds like it's not leaving its multicast groups when it exits, or something along those lines.
Seems possibly related to this Kernel bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246629
No way of working around this until the freebsd kernel is patched I am assuming...
-Basically, I am asking for confirmation that we don't have any pfsense 2.5.0 snapshots on FreeBSD Kernel version: 12.1-p5? (Or 11.2?) (As those are mentioned to be issue free...)
Also saw people mention that reverting to an older version of pfSense (April) resolves the issue, can you confirm what kernel patch level it's running?
Updated by Louis B over 4 years ago
Hello,
I completely agree that this problem is almost certainly related to the FreeBSD bug
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246629
So I added some input to that bug report, hoping that helps. The current state of that FreeBSD bug report is "Importance: --- Affects Only Me", which is definitely not correct.
In general, I wonder if the pfSense developers file FreeBSD bug reports and/or add extra input to existing reports, which would hopefully speed things up :)
Louis
Updated by Jim Pingle over 4 years ago
That FreeBSD bug report does appear to be related, we'll try to draw some attention to that.
-Basically, I am asking for confirmation that we don't have any pfsense 2.5.0 snapshots on FreeBSD Kernel version: 12.1-p5? (Or 11.2?) (As those are mentioned to be issue free...)
No, current snapshots are using 12.1-STABLE and there is no other base for 2.5.0 at the moment, the older ones have cycled off the snapshot servers.
Also saw people mention that reverting to an older version of pfSense (April) resolves the issue, can you confirm what kernel patch level it's running?
That was before the snapshots moved to a 12.1-STABLE base.
Updated by Marc J over 4 years ago
Understood..
Thanks for the follow up and info. Anything you can do from your side to draw some attention to it would always be appreciated Jim.
Updated by Marc J over 4 years ago
Jim Pingle wrote:
That FreeBSD bug report does appear to be related, we'll try to draw some attention to that.
-Basically, I am asking for confirmation that we don't have any pfsense 2.5.0 snapshots on FreeBSD Kernel version: 12.1-p5? (Or 11.2?) (As those are mentioned to be issue free...)
No, current snapshots are using 12.1-STABLE and there is no other base for 2.5.0 at the moment, the older ones have cycled off the snapshot servers.
Also saw people mention that reverting to an older version of pfSense (April) resolves the issue, can you confirm what kernel patch level it's running?
That was before the snapshots moved to a 12.1-STABLE base.
Hey Jim,
We got a response on that FreeBSD Bug report, and a request to test.
I am not very familiar with this myself. I understand you may not have time, but if you do have a bit of time and a better understanding of their request, maybe you could help us review it?
Thanks,
Updated by Jim Pingle over 4 years ago
I know, I was talking with that developer directly. We would need to test that change locally first before bringing it into snapshots, but we are working on it. A few other more pressing matters ahead of it in the queue, though.
Updated by Marc J over 4 years ago
Jim Pingle wrote:
I know, I was talking with that developer directly. We would need to test that change locally first before bringing it into snapshots, but we are working on it. A few other more pressing matters ahead of it in the queue, though.
Hey Jim!
That's fantastic news. Love to hear that you two are already in touch. I'll let you do you, as you clearly have a steady hold of it.
Let me know if you have any donation links for the project. The progress is amazing to see.
Updated by Jim Pingle over 4 years ago
Per bz, Fix works and is awaiting review upstream and will be committed to HEAD, then stable/12. Once it's in stable/12 we'll pick it back into snapshots.
Updated by Louis B over 4 years ago
Jim,
Very good news!
Is there an option to test it here on my system running the latest snapshot build?
(yep I did fix the PPPOE-problem, it works ...... when MTU field is filled ...)
Louis
Updated by Marc J over 4 years ago
Louis van Breda wrote:
Jim,
Very good news!
Is there an option to test it here on my system running latest snapshotbuild!
(yep I did fix the PPPOE-problem, it works ...... when MTU field is filled ...)
Louis
I think we would need to wait for it to at least be committed into the HEAD before we can try to apply a patch.
Maybe Jim would be able to give a better idea. Either way I am super happy to see how fast the process is on this one. Massive appreciation to the team here!
Updated by Marc J over 4 years ago
Jim Pingle wrote:
I already answered that in comment 23
I myself read his question as: "Is there an option to test it here on my system now before running latest snapshot-build?"
As in, he is aware the newest snapshots will have the fix, he is looking for an option "now" on the snapshot he has installed :)
Your response only mentions the upcoming builds. It does not answer whether he can somehow test now on his current system.
Updated by Louis B over 4 years ago
Yep,
Exactly, now we have the momentum to get things fixed. If we find a bug later on, the momentum and the time slot are gone.
Louis
Updated by Jim Pingle over 4 years ago
It requires a new kernel, so no way to reliably test outside of snapshots. We'll pick up the change soon.
Updated by Luiz Souza over 4 years ago
- Status changed from Confirmed to Feedback
The fix was merged to pfSense sources.
Please test with the next snapshot.
Updated by Louis B over 4 years ago
- File config-pfSense.lan-20200620144617_NoPW.xml added
- File 20200620 PS_A_PIMDrenamed_IMGPproxyEnabled.txt added
- File 20200620 PS_A_PIMDrenamed.txt added
- File 20200620 BootWithPIMD_RenamedAndIMGPproxyEnabled.txt added
- File 20200620 PIMD_OnlyRunningStrangeAllSituation.JPG added
- File 20200620 PS_A.txt added
- File 20200620 BootWithPIMD_DisabledAndIMGPproxyEnabled.txt added
- File 20200620 BootWithPIMD_Enabled.txt added
Hello,
I did a lot of tests related to IGMP proxy and PIMD using snapshot 2.5.0.a.20200620.0050
Despite what I had hoped, the conclusion is that it is not working:
- not IGMP proxy, and
- not PIMD either
I tried multiple different configs.
Be aware that IGMP proxy and PIMD cannot run at the same time. I started testing with PIMD enabled and IGMP proxy disabled. There were multiple issues there. Then I disabled PIMD and enabled IGMP proxy.
However, even though I had disabled PIMD, it was still trying to start (strange). To prevent that, I renamed pimd to pind_DONOTSTART :) I tested the proxy under that condition, but .... not working.
Attached you will find many attachments:
- my config file
- the IGMP proxy config
- one of the PIMD configs used (tried many, none OK)
- boot log starting PIMD
- boot log starting IGMP proxy
- files showing running applications (PS -A)
Note that the only config with which PIMD was running, although in a very, very incorrect way, was the one with the option bind to all.
Since PIMD is the more modern application and has better logging, I assume the best way to solve the underlying issues is to concentrate on PIMD (my feeling is that if PIMD is running, IGMP proxy will be running as well).
In the PIMD boot log I marked the things going wrong with remarks starting with ">>".
Louis
Updated by Louis B over 4 years ago
- File var_etc_igmpproxy.conf var_etc_igmpproxy.conf added
- File var_erc_pimd.conf var_erc_pimd.conf added
Oops,
I forgot to add two config examples (I did test other PIMD configs as well).
So here they are.
Louis
Updated by Louis B over 4 years ago
Hello,
I am not the only one noticing that there is still a problem :) So the problem was updated in the FreeBSD bugzilla (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246629), and hopefully solved. The corrections should be in the next (current?) FreeBSD snapshot.
However, I think there is at least one issue at the pfSense level: pimd starting (twice!) at boot time even if PIMD is not activated in the GUI. I do not expect that to be on purpose!?
Reading my own commented bootlog, I noticed:
1a) >> Only the three VLANs below will have vifs later. Something wrong already at this point !!!!
1b) >> The VLANs below should have VIFs but do not get them !!!
=> could be related to the indicated FreeBSD bug
2) >> something wrong here, pimd starting twice !!
=> see comment above, this is pfSense related
3) >> The fact that the interfaces below get Invalid phyint address is not OK !!
=> this is not OK either. I do not know what and where the problem is. It could be FreeBSD, but if so .... is it the same issue or another issue !!??
=> it could be pfSense as well, .... my feeling is that it is more likely in the OS (but not sure)
4) >> I did try many PIMD config options. Apart from "Bind to all", nothing worked, and even that one did not work properly
=> my feeling is that this is, at least for a significant part, related to the indicated FreeBSD bug
So my suggestion is to assume for now that 1) and 4) are related to the specified FreeBSD bug, and in parallel investigate issues 2) and 3). In case issue 3) is related neither to the FreeBSD bug nor to pfSense, another FreeBSD bug report should be created.
Louis
Updated by Jim Pingle over 4 years ago
- Status changed from Feedback to New
An additional fix has been added to FreeBSD that we need to pull into snapshots.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246629#c20
https://svnweb.freebsd.org/base?view=revision&revision=362494
Updated by Louis B over 4 years ago
Hello,
Be aware there were multiple things fixed in FreeBSD and placed in the snapshots. The latest message I got from FreeBSD is
--- Comment #20 from commit-hook@freebsd.org ---
A commit references this bug:
Author: bz
Date: Mon Jun 22 10:52:31 UTC 2020
New revision: 362494
URL: https://svnweb.freebsd.org/changeset/base/362494
Log:
MFC r362472:
Rather than zeroing MAXVIFS times size of pointer [r362289] (still better than
sizeof pointer before [r354857]), we need to zero MAXVIFS times the size of
the struct. All good things come in threes; I hope this is it on this one.
PR: 246629, 206583
Reported by: kib
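The commit message above describes clearing a table of MAXVIFS entries with the wrong byte count. The sketch below is not the kernel code; it is a small Python/ctypes illustration of that class of mistake, with a made-up Vif structure layout and an assumed MAXVIFS of 32. It fills a table with non-zero bytes, zeroes it with each of the three sizes the commit message mentions, and counts how many stale bytes survive:

```python
import ctypes

MAXVIFS = 32  # assumed table size, used here only for illustration


class Vif(ctypes.Structure):
    # Hypothetical layout; the real kernel structure has different fields.
    _fields_ = [
        ("flags", ctypes.c_ubyte),
        ("threshold", ctypes.c_ubyte),
        ("lcl_addr", ctypes.c_uint32),
        ("rmt_addr", ctypes.c_uint32),
        ("ifp", ctypes.c_void_p),
        ("pkt_in", ctypes.c_ulong),
        ("pkt_out", ctypes.c_ulong),
    ]


def stale_bytes_after_zeroing(nbytes):
    """Fill a Vif table with 0xFF, zero the first nbytes, count leftovers."""
    table = (Vif * MAXVIFS)()
    ctypes.memset(table, 0xFF, ctypes.sizeof(table))  # simulate old entries
    ctypes.memset(table, 0, nbytes)                   # the "cleanup" step
    return sum(1 for b in bytes(table) if b != 0)


pointer_size = ctypes.sizeof(ctypes.c_void_p)
struct_size = ctypes.sizeof(Vif)

print("sizeof(pointer):", stale_bytes_after_zeroing(pointer_size))
print("MAXVIFS * sizeof(pointer):", stale_bytes_after_zeroing(MAXVIFS * pointer_size))
print("MAXVIFS * sizeof(struct):", stale_bytes_after_zeroing(MAXVIFS * struct_size))
```

Only the last variant leaves no stale bytes. Leftover entries from a previous run would be consistent with a daemon that starts cleanly once after boot and then gets "Address already in use" on restart, although that link is an inference from this thread rather than something the commit message states.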
So be sure to pull the very latest FreeBSD version, which should have the indicated patch level.
Also, note that I am in direct contact with the FreeBSD developer. With the input I provided, he noted that there is probably also another issue, related to the formatting of the pimd.conf file. I am not sure about that, but one thing is certain: the interface naming is, for some reason or other, not consistent between FreeBSD, pimd.conf and the logging. Together we have to find out what the exact issue is.
The fact that PIMD is starting multiple times (I think even if PIMD is not enabled) is of course a pfSense issue.
Louis
Updated by Jim Pingle over 4 years ago
We are aware, and are in direct communication with the FreeBSD developer who made the commits. I mentioned above already that there was another fix and even linked to the copy you commented. We have already pulled the latest fix into the source tree earlier this morning, it will be in the next snapshot (building now).
Updated by Jim Pingle over 4 years ago
Anything not directly related to the specific multicast issue caused by the FreeBSD bug does not belong on this issue, so let's keep the chatter to a minimum.
Updated by Jim Pingle over 4 years ago
- Status changed from New to Feedback
The most recent snapshot has the latest fix and it appears to work. I can stop and restart pimd without errors. Leaving open in Feedback state for additional confirmation.
Reminder: This is only for the original issue where a daemon could not be restarted after the first time it ran at boot, not for other issues with igmpproxy/pimd/etc configurations. Restrict feedback to only the original problem.
Updated by Anonymous about 4 years ago
- Status changed from Feedback to Resolved