Bug #10792
Crash when switching interface off and on again in combination with multicast
Description
Hello,
There are still crashes when switching (VLAN) interfaces off and on. One of those crashes seems to be triggered by PIMD/multicast, so I opened a FreeBSD bug for it: FreeBSD bug 248243.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248243
Note that, in general, the number of crashes related to switching (VLAN) interfaces off and on is much lower now than it was in the past months. So someone has already fixed one or more issues :)
For info, monitoring and follow-up.
Louis
Copied from the FreeBSD bug report:
I am trying to get PIMD running on the latest pfSense release. One of the two problems I am facing is that FreeBSD crashes as soon as I switch one of the interfaces off and on again, especially the interface being used as the PIMD register_vif.
Here are the relevant lines from the core dump:
curthread = 0xfffff8000439f000: pid 12 tid 100040 "swi1: netisr 2"
Tracing pid 12 tid 100040 td 0xfffff8000439f000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe00004de4b0
vpanic() at vpanic+0x197/frame 0xfffffe00004de500
panic() at panic+0x43/frame 0xfffffe00004de560
trap_fatal() at trap_fatal+0x391/frame 0xfffffe00004de5c0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00004de610
trap() at trap+0x286/frame 0xfffffe00004de720
calltrap() at calltrap+0x8/frame 0xfffffe00004de720
--- trap 0xc, rip = 0xffffffff80e934f5, rsp = 0xfffffe00004de7f0, rbp = 0xfffffe00004de7f0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00004de7f0
if_simloop() at if_simloop+0xd1/frame 0xfffffe00004de830
pim_input() at pim_input+0x409/frame 0xfffffe00004de890
encap_input() at encap_input+0xd1/frame 0xfffffe00004de900
encap4_input() at encap4_input+0x28/frame 0xfffffe00004de930
ip_input() at ip_input+0x168/frame 0xfffffe00004de9e0
swi_net() at swi_net+0x12b/frame 0xfffffe00004dea50
ithread_loop() at ithread_loop+0x23c/frame 0xfffffe00004deab0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00004deaf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004deaf0
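In a ddb backtrace like the one above, the frames listed directly below the `--- trap` marker show where the fault occurred. Trap number 0xc (12) is the page-fault trap on amd64, which matches the trap_pfault() frame. The following is a minimal sketch, not part of the original report, that extracts the trap number and the first frames after the marker. The helper name `faulting_chain` and the regular expressions are illustrative assumptions, and the sample text is an excerpt of the trace above.

```python
import re

# Matches ddb frames such as
# "pim_input() at pim_input+0x409/frame 0xfffffe00004de890".
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")
# Matches the trap marker, e.g. "--- trap 0xc, rip = ...".
TRAP_RE = re.compile(r"--- trap (0x[0-9a-f]+|\d+), rip = ")

# Excerpt of the backtrace from this report.
BACKTRACE = """\
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00004de610
trap() at trap+0x286/frame 0xfffffe00004de720
calltrap() at calltrap+0x8/frame 0xfffffe00004de720
--- trap 0xc, rip = 0xffffffff80e934f5, rsp = 0xfffffe00004de7f0, rbp = 0xfffffe00004de7f0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00004de7f0
if_simloop() at if_simloop+0xd1/frame 0xfffffe00004de830
pim_input() at pim_input+0x409/frame 0xfffffe00004de890
encap_input() at encap_input+0xd1/frame 0xfffffe00004de900
"""


def faulting_chain(text, depth=3):
    """Return (trap_number, [function names]) for the first trap marker.

    The function names are the first `depth` frames that follow the marker,
    i.e. the code that was running when the trap was raised.
    """
    trap_no = None
    chain = []
    for line in text.splitlines():
        line = line.strip()
        if trap_no is None:
            # Skip everything until the trap marker is found.
            match = TRAP_RE.search(line)
            if match:
                trap_no = int(match.group(1), 0)  # accepts "0xc" and "12"
            continue
        match = FRAME_RE.match(line)
        if match:
            chain.append(match.group(1))
            if len(chain) == depth:
                break
    return trap_no, chain


trap_no, chain = faulting_chain(BACKTRACE)
print(f"trap {trap_no}: " + " <- ".join(chain))
# prints "trap 12: if_inc_counter <- if_simloop <- pim_input"
```

Read this way, the page fault was raised inside if_inc_counter(), called from if_simloop(), called from pim_input().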
IMHO the problem is like this:
- pim is asking for some info from the underlying core
- that is translated into a trap
- the trap is not handled correctly and/or the parameters are not correct
- the trap is handled by thread 12 "swi1: netisr 2"
- which leads to a fatal crash
Louis
PS: a related, but different, problem is #248103.
Updated by Louis B over 4 years ago
I retested interface stability. The situation is much better now; I can no longer reproduce the crashes.
Updated by Renato Botelho over 4 years ago
- Status changed from New to Closed
- Target version deleted (2.5.0)
Awesome! Thanks for reporting
Updated by Marcos M over 2 years ago
- Status changed from Closed to New
This happened after renaming the description of a VLAN on an LACP LAGG consisting of ix0 and ix1 on a Netgate 7100 running 22.05. The PIMD package is installed and configured with the interfaces on the referenced lagg.
db:0:kdb.enter.default> bt
Tracing pid 0 tid 100028 td 0xfffff80005368740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe00400d7190
vpanic() at vpanic+0x194/frame 0xfffffe00400d71e0
panic() at panic+0x43/frame 0xfffffe00400d7240
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe00400d72a0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00400d7300
calltrap() at calltrap+0x8/frame 0xfffffe00400d7300
--- trap 0xc, rip = 0xffffffff80e97ea5, rsp = 0xfffffe00400d73d0, rbp = 0xfffffe00400d73d0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00400d73d0
if_simloop() at if_simloop+0xcd/frame 0xfffffe00400d7410
pim_input() at pim_input+0x3fa/frame 0xfffffe00400d7480
encap_input() at encap_input+0xd1/frame 0xfffffe00400d74f0
encap4_input() at encap4_input+0x28/frame 0xfffffe00400d7520
ip_input() at ip_input+0x16e/frame 0xfffffe00400d75d0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d7620
ether_demux() at ether_demux+0x16a/frame 0xfffffe00400d7650
ether_nh_input() at ether_nh_input+0x33b/frame 0xfffffe00400d76b0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d7700
ether_input() at ether_input+0x89/frame 0xfffffe00400d7760
vlan_input() at vlan_input+0x23b/frame 0xfffffe00400d77c0
ether_demux() at ether_demux+0x153/frame 0xfffffe00400d77f0
ether_nh_input() at ether_nh_input+0x33b/frame 0xfffffe00400d7850
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d78a0
ether_input() at ether_input+0x89/frame 0xfffffe00400d7900
iflib_rxeof() at iflib_rxeof+0xaa6/frame 0xfffffe00400d79e0
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe00400d7a20
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x121/frame 0xfffffe00400d7a80
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xd2/frame 0xfffffe00400d7ab0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00400d7af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00400d7af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
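This backtrace and the one from the original report can be compared mechanically to see how much of the faulting call path they share. The sketch below is an illustration, not something run by the reporters. The helper names `functions_after_trap` and `common_prefix` are made up for this example, and both samples are abbreviated excerpts of the traces quoted in this issue.

```python
import re


def functions_after_trap(text):
    """Function names of the frames that follow the first '--- trap' marker."""
    _, _, tail = text.partition("--- trap")
    # Works for one-frame-per-line and for single-line pasted traces alike.
    return re.findall(r"(\w+)\(\) at ", tail)


def common_prefix(first, second):
    """Longest run of identical leading items in two lists."""
    shared = []
    for a, b in zip(first, second):
        if a != b:
            break
        shared.append(a)
    return shared


# Excerpt from the original report (netisr thread).
ORIGINAL_REPORT = """\
--- trap 0xc, rip = 0xffffffff80e934f5, rsp = 0xfffffe00004de7f0, rbp = 0xfffffe00004de7f0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00004de7f0
if_simloop() at if_simloop+0xd1/frame 0xfffffe00004de830
pim_input() at pim_input+0x409/frame 0xfffffe00004de890
encap_input() at encap_input+0xd1/frame 0xfffffe00004de900
encap4_input() at encap4_input+0x28/frame 0xfffffe00004de930
ip_input() at ip_input+0x168/frame 0xfffffe00004de9e0
swi_net() at swi_net+0x12b/frame 0xfffffe00004dea50
"""

# Excerpt from the 22.05 report (Netgate 7100).
REPORT_22_05 = """\
--- trap 0xc, rip = 0xffffffff80e97ea5, rsp = 0xfffffe00400d73d0, rbp = 0xfffffe00400d73d0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00400d73d0
if_simloop() at if_simloop+0xcd/frame 0xfffffe00400d7410
pim_input() at pim_input+0x3fa/frame 0xfffffe00400d7480
encap_input() at encap_input+0xd1/frame 0xfffffe00400d74f0
encap4_input() at encap4_input+0x28/frame 0xfffffe00400d7520
ip_input() at ip_input+0x16e/frame 0xfffffe00400d75d0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d7620
"""

shared = common_prefix(
    functions_after_trap(ORIGINAL_REPORT),
    functions_after_trap(REPORT_22_05),
)
print(" <- ".join(shared))
```

Both traces fault in if_inc_counter() and share the path up through ip_input(). They differ only in how the packet reached ip_input(): via swi_net() in the original report, and via netisr_dispatch_src() directly from the receive path here.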
Updated by Louis B over 2 years ago
Hello,
Just for info:
Related to PIMD
- I am still a happy PIMD user; however, the very old released version of PIMD does not work under FreeBSD!
- To make it work, you have to compile the current PIMD sources on FreeBSD 12, as I did (https://github.com/troglobit/pimd)
- And combine that build with the previously available Netgate PIMD GUI package.
- I have tried several times to convince the PIMD maintainer to formally release the current PIMD sources, which, unlike the old version, work without any problem (as far as I can see).
Related to pfSense interface issues
- At the time I wrote this issue there were, IMHO, lots of issues. At this moment there are none as far as I know, at least not in my setup (running 2.7.0 CE).
I can try again to persuade Joachim (the pimd maintainer) to formally release the code, but he will not do that because he has no time for further support. Still, the current pimd code works without issues, and at least in my setup there are no pfSense issues in this area.
Updated by Louis B over 2 years ago
I replaced my pfSense disk (SSD), so I had to reinstall pfSense. After installing CE 2.7.0 (version Fri Aug 12 00:02:48 UTC 2022) and copying and installing my pimd builds:
- pkg install pimd-3.0.b1.txz
- pkg install pfSense-pkg-pimd-3.0.1.txz
- cp pimd.conf /var/etc/pimd/pimd.conf
I noticed that PIMD was available under Services as expected, but was not working properly. The config is shown, but the status is not, and pimd does not start via the GUI. I do not know why, but of course this setup is not supported at the moment.
However, I can still start PIMD from the command line (pimd) and see the current status using the pimctl command. After starting PIMD that way, it still works as expected.
Updated by Louis B over 2 years ago
I probably made a mistake. Everything is still working, including the GUI. Note that there seem to be two copies of pimd.conf:
one in /var/etc/pimd/pimd.conf and one in /usr/local/etc/pimd.conf (I do not know why).
I copied my initial config to both locations.
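With two copies of pimd.conf in play, a quick check that they have not drifted apart can be useful. The sketch below is only an illustration: the two paths are the ones mentioned above, the helper name `compare_configs` is made up, and nothing here establishes which of the two files pimd or the GUI package actually reads.

```python
import filecmp
from pathlib import Path

# The two locations mentioned in this comment.
CONFIG_PATHS = (
    Path("/var/etc/pimd/pimd.conf"),
    Path("/usr/local/etc/pimd.conf"),
)


def compare_configs(first, second):
    """Return 'identical', 'different', or a 'missing: ...' message."""
    first, second = Path(first), Path(second)
    missing = [str(p) for p in (first, second) if not p.is_file()]
    if missing:
        return "missing: " + ", ".join(missing)
    # shallow=False forces a byte-by-byte comparison instead of
    # trusting file size and modification time.
    if filecmp.cmp(first, second, shallow=False):
        return "identical"
    return "different"


if __name__ == "__main__":
    print(compare_configs(*CONFIG_PATHS))
```

If the result is "different", the two files need to be reconciled by hand; the script does not modify either of them.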