Bug #10792: Crash when switching interface off and on again in cohesion with multicast - pfSense - pfSense bugtracker

Added by Louis B almost 6 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assignee:
Category: Operating System
Target version:
Start date: 07/28/2020
Due date:
% Done:
Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
Affected Architecture:

Description

Hello,

There are still crashes when switching (VLAN) interfaces off and on. One of those crashes seems to be triggered by PIMD/multicast, so I opened a FreeBSD bug for it: FreeBSD bug 248243.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248243

Note that in general the number of crashes related to switching (VLAN) interfaces off and on is much lower now than it was in the past months. So someone has already fixed one or more issues :)

For info, monitoring and follow up.

Louis

Copied from FreeBSD

I am trying to get PIMD running on the latest pfSense release. One of the two problems I am facing is that FreeBSD crashes as soon as I switch one of the interfaces off and on again, especially the interface being used as the PIMD register_vif.

Here are the relevant lines from the core dump:

curthread = 0xfffff8000439f000: pid 12 tid 100040 "swi1: netisr 2"

Tracing pid 12 tid 100040 td 0xfffff8000439f000

kdb_enter() at kdb_enter+0x37/frame 0xfffffe00004de4b0
vpanic() at vpanic+0x197/frame 0xfffffe00004de500
panic() at panic+0x43/frame 0xfffffe00004de560
trap_fatal() at trap_fatal+0x391/frame 0xfffffe00004de5c0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00004de610
trap() at trap+0x286/frame 0xfffffe00004de720
calltrap() at calltrap+0x8/frame 0xfffffe00004de720
--- trap 0xc, rip = 0xffffffff80e934f5, rsp = 0xfffffe00004de7f0, rbp = 0xfffffe00004de7f0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00004de7f0
if_simloop() at if_simloop+0xd1/frame 0xfffffe00004de830
pim_input() at pim_input+0x409/frame 0xfffffe00004de890
encap_input() at encap_input+0xd1/frame 0xfffffe00004de900
encap4_input() at encap4_input+0x28/frame 0xfffffe00004de930
ip_input() at ip_input+0x168/frame 0xfffffe00004de9e0
swi_net() at swi_net+0x12b/frame 0xfffffe00004dea50
ithread_loop() at ithread_loop+0x23c/frame 0xfffffe00004deab0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00004deaf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004deaf0

IMHO the problem goes like this:
- PIM asks for some info from the underlying core
- that is translated into a trap
- the trap is not handled correctly and/or the parameters are not correct
- the trap is handled by thread 12 "swi1: netisr 2"
- which leads to a fatal crash

Louis
PS: a related but different problem is #248103

Updated by Louis B almost 6 years ago

I retested interface stability. The situation is much better now. I cannot reproduce the crashes anymore.

Updated by Renato Botelho almost 6 years ago

Status changed from New to Closed
Target version deleted (~~2.5.0~~)

Awesome! Thanks for reporting

Updated by Marcos M almost 4 years ago

Status changed from Closed to New

This happened after renaming the description of a VLAN on an LACP LAGG consisting of ix0 and ix1 on a Netgate 7100 running 22.05. The PIMD package is installed and configured with the interfaces on the referenced lagg.

db:0:kdb.enter.default>  bt
Tracing pid 0 tid 100028 td 0xfffff80005368740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe00400d7190
vpanic() at vpanic+0x194/frame 0xfffffe00400d71e0
panic() at panic+0x43/frame 0xfffffe00400d7240
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe00400d72a0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00400d7300
calltrap() at calltrap+0x8/frame 0xfffffe00400d7300
--- trap 0xc, rip = 0xffffffff80e97ea5, rsp = 0xfffffe00400d73d0, rbp = 0xfffffe00400d73d0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00400d73d0
if_simloop() at if_simloop+0xcd/frame 0xfffffe00400d7410
pim_input() at pim_input+0x3fa/frame 0xfffffe00400d7480
encap_input() at encap_input+0xd1/frame 0xfffffe00400d74f0
encap4_input() at encap4_input+0x28/frame 0xfffffe00400d7520
ip_input() at ip_input+0x16e/frame 0xfffffe00400d75d0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d7620
ether_demux() at ether_demux+0x16a/frame 0xfffffe00400d7650
ether_nh_input() at ether_nh_input+0x33b/frame 0xfffffe00400d76b0
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d7700
ether_input() at ether_input+0x89/frame 0xfffffe00400d7760
vlan_input() at vlan_input+0x23b/frame 0xfffffe00400d77c0
ether_demux() at ether_demux+0x153/frame 0xfffffe00400d77f0
ether_nh_input() at ether_nh_input+0x33b/frame 0xfffffe00400d7850
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe00400d78a0
ether_input() at ether_input+0x89/frame 0xfffffe00400d7900
iflib_rxeof() at iflib_rxeof+0xaa6/frame 0xfffffe00400d79e0
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe00400d7a20
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x121/frame 0xfffffe00400d7a80
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xd2/frame 0xfffffe00400d7ab0
fork_exit() at fork_exit+0x7e/frame 0xfffffe00400d7af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00400d7af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
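
For triage, both backtraces in this issue can be read the same way: the frames listed directly after the first "--- trap" line are the code that was running when the fault hit, and in both traces that is if_inc_counter() called from if_simloop() called from pim_input(). The following is only a rough Python sketch for pulling that path out of a pasted DDB backtrace. The helper name faulting_path and the regular expressions are my own assumptions based on the line layout shown above; they are not part of any pfSense or FreeBSD tool.

```python
import re

# Assumed layout of a DDB frame line, e.g.
#   if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00400d73d0
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+")
# Assumed layout of a trap marker line, e.g.
#   --- trap 0xc, rip = 0xffffffff80e97ea5, rsp = ..., rbp = ... ---
TRAP_RE = re.compile(r"--- trap (0x[0-9a-f]+|\d+), rip = ")


def faulting_path(backtrace, depth=3):
    """Return (trap_number, [function names]) for the first trap marker.

    The function names are the first `depth` frames listed after the
    marker, i.e. the innermost callers at the time of the fault.
    Returns (None, []) if the text contains no trap marker.
    """
    lines = backtrace.splitlines()
    for i, line in enumerate(lines):
        trap = TRAP_RE.search(line)
        if trap is None:
            continue
        trap_no = int(trap.group(1), 0)  # accepts "0xc" as well as "0"
        funcs = []
        for later in lines[i + 1:]:
            frame = FRAME_RE.match(later.strip())
            if frame:
                funcs.append(frame.group(1))
                if len(funcs) == depth:
                    break
        return trap_no, funcs
    return None, []


# Excerpt copied from the 22.05 backtrace in this comment.
SAMPLE = """\
calltrap() at calltrap+0x8/frame 0xfffffe00400d7300
--- trap 0xc, rip = 0xffffffff80e97ea5, rsp = 0xfffffe00400d73d0, rbp = 0xfffffe00400d73d0 ---
if_inc_counter() at if_inc_counter+0x15/frame 0xfffffe00400d73d0
if_simloop() at if_simloop+0xcd/frame 0xfffffe00400d7410
pim_input() at pim_input+0x3fa/frame 0xfffffe00400d7480
encap_input() at encap_input+0xd1/frame 0xfffffe00400d74f0
"""

print(faulting_path(SAMPLE))
# (12, ['if_inc_counter', 'if_simloop', 'pim_input'])
```

Trap number 12 (0xc) matches the trap_pfault() frame in the trace, i.e. a page fault. The sketch only extracts names; it does not say why the pointer used in if_inc_counter() was bad.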

Updated by Louis B almost 4 years ago

Hello,

Just for info:

Related to PIMD
- I am still a happy PIMD user; however, the very old released version of PIMD does not work under FreeBSD!
- To make it work you have to compile the current PIMD sources on FreeBSD 12 (https://github.com/troglobit/pimd), as I did,
- and combine that with the previously available Netgate PIMD GUI package.
- I have tried multiple times to convince the PIMD maintainer to formally release the current PIMD sources which, in contrast to the old version, work without any problem (as far as I can see).

Related to pfSense interface issues
- At the time I wrote this issue there were, IMHO, lots of issues; at the moment there are none as far as I know, at least not in my setup (running 2.7.0 CE).

I can try again to persuade Joachim (the pimd maintainer) to formally release the code, but he will not do that formally because he has no time for further support. But again, the current pimd code works without issues, and at least in my setup there are no pfSense issues in this area.

Updated by Louis B almost 4 years ago

I replaced my pfSense disk (SSD), so I had to reinstall pfSense. After installing CE 2.7.0 (version Fri Aug 12 00:02:48 UTC 2022), I copied and installed my pimd builds:
- pkg install pimd-3.0.b1.txz
- pkg install pfSense-pkg-pimd-3.0.1.txz
- cp pimd.conf /var/etc/pimd/pimd.conf

I noticed that PIMD was available under Services as expected, but was not working properly. The config is shown, but the status is not, and pimd does not start via the GUI. I do not know why, but of course this setup is not supported at the moment.

However, I can still start PIMD from the command line (pimd) and see the current status using the pimctl command. After starting PIMD this way, it still works as expected.

Updated by Louis B almost 4 years ago

I probably made a mistake. Everything is still working, including the GUI. Note that there seem to be two copies of pimd.conf:
one in /var/etc/pimd/pimd.conf and one in /usr/local/etc/pimd.conf (I do not know why)

I copied my initial config to both locations
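
Since it is unclear which of the two files pimd or the GUI actually reads, one way to rule out a mismatch is to check that both copies are byte-for-byte identical. This is only a small Python sketch under that assumption; the helper names sha256_of and configs_match are made up for illustration, and only the two paths come from my setup as described above.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, or None if it is missing."""
    p = Path(path)
    if not p.is_file():
        return None
    return hashlib.sha256(p.read_bytes()).hexdigest()


def configs_match(paths):
    """Return (all_identical, {path: digest}) for the given config files.

    all_identical is True only if every file exists and all digests agree.
    """
    digests = {str(p): sha256_of(p) for p in paths}
    present = [d for d in digests.values() if d is not None]
    same = len(present) == len(digests) and len(set(present)) == 1
    return same, digests


if __name__ == "__main__":
    # The two locations mentioned in this comment.
    same, digests = configs_match(
        ["/var/etc/pimd/pimd.conf", "/usr/local/etc/pimd.conf"]
    )
    for path, digest in digests.items():
        print(path, digest or "missing")
    print("identical" if same else "different or missing")
```

A missing file is reported as "missing" rather than raising an error, so the check can also be run on a box where only one of the two locations exists.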
