Regression #11470
Panic when using CBQ traffic shaping (closed)
Description
A couple of users have reported a panic when using CBQ traffic shaping. Triggering it may also require using CBQ on VLAN interfaces.
db:0:kdb.enter.default> bt
Tracing pid 12 tid 100039 td 0xfffff800053bf000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe000043e610
vpanic() at vpanic+0x197/frame 0xfffffe000043e660
panic() at panic+0x43/frame 0xfffffe000043e6c0
trap_fatal() at trap_fatal+0x391/frame 0xfffffe000043e720
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe000043e770
trap() at trap+0x286/frame 0xfffffe000043e880
calltrap() at calltrap+0x8/frame 0xfffffe000043e880
--- trap 0xc, rip = 0xffffffff80ec014e, rsp = 0xfffffe000043e950, rbp = 0xfffffe000043e980 ---
ether_8021q_frame() at ether_8021q_frame+0x2e/frame 0xfffffe000043e980
vlan_transmit() at vlan_transmit+0xc8/frame 0xfffffe000043e9f0
vlan_altq_start() at vlan_altq_start+0xb4/frame 0xfffffe000043ea20
cbqrestart() at cbqrestart+0x64/frame 0xfffffe000043ea50
rmc_restart() at rmc_restart+0x6f/frame 0xfffffe000043ea80
softclock_call_cc() at softclock_call_cc+0x141/frame 0xfffffe000043eb30
softclock() at softclock+0x79/frame 0xfffffe000043eb50
ithread_loop() at ithread_loop+0x23c/frame 0xfffffe000043ebb0
fork_exit() at fork_exit+0x7e/frame 0xfffffe000043ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000043ebf0
Attached is a textdump archive from a separate user with the same backtrace.
Files
Updated by Jim Pingle over 3 years ago
That doesn't look like the same issue; the backtrace is quite a bit different despite both mentioning CBQ. They could be related, but they aren't close enough that I'd call them the same yet.
Updated by Jim Pingle over 3 years ago
- Plus Target Version set to 21.05
Would be nice to fix soon if we can, but not a blocker at the moment.
Updated by Jim Pingle over 3 years ago
- Plus Target Version changed from 21.05 to 21.09
Updated by Steve Wheeler over 3 years ago
If anyone can provide steps to replicate this, please do so. It's 'just working' for me locally.
Updated by Reymond Rivera over 3 years ago
- File textdump.tar.0 textdump.tar.0 added
- File info.0 info.0 added
I believe I am hitting the same issue. I have included the dump files that were generated.
I have enabled CBQ on 7 interfaces on my pfSense. Before switching to CBQ I was using PRIQ and encountered no issues. I opened a forum discussion, where stephenw10 suggested removing the last interface that I had added to CBQ. I did so and the issue stopped. pfSense has now been running for 5 days straight without crashing.
Updated by Kristof Provost about 3 years ago
- Status changed from New to Feedback
I've not been able to reproduce this yet. I'd expect it to happen around the borrowing code of CBQ, where it starts or stops borrowing and handles a delayed packet. I'm not entirely clear on when the relevant code gets called (hence the inability to reproduce it so far), but the panic itself looks to be pretty obvious.
From the backtrace and code I'm fairly confident that the problem is that we don't have a vnet context set. We enter the code path through a callout, which won't have vnet context, but then we (potentially) transmit packets, and then die fairly early on in ether_8021q_frame(). One of the first things that function does is to access a vnet-local variable (V_soft_pad), which will then explode. That's a fairly common sort of bug, and happily easily fixed.
I've pushed the fix to devel-12 as 9fa5a825c272d9e60314960829843e9c3456bb67
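The missing-context failure described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the committed fix: `struct vnet`, `struct ifnet`, `curvnet`, and every `model_*` / `MODEL_*` name here are simplified stand-ins invented for illustration. In the FreeBSD kernel the corresponding pattern uses the `CURVNET_SET()` and `CURVNET_RESTORE()` macros, and an access without a context faults instead of returning an error code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct vnet  { int soft_pad; };            /* holds one "vnet-local" variable */
struct ifnet { struct vnet *if_vnet; };    /* interface remembers its vnet */

/* Stand-in for the per-thread current vnet; a callout starts with none. */
static struct vnet *curvnet = NULL;

/* Modeled on the kernel's CURVNET_SET()/CURVNET_RESTORE() pattern. */
#define MODEL_CURVNET_SET(v)   struct vnet *saved_vnet = curvnet; curvnet = (v)
#define MODEL_CURVNET_RESTORE() curvnet = saved_vnet

/*
 * Models an early vnet-local read such as V_soft_pad.
 * Returns -1 where the real kernel would take a page fault.
 */
static int model_read_soft_pad(void)
{
    if (curvnet == NULL)
        return -1;
    return curvnet->soft_pad;
}

/* Unfixed path: the timer handler transmits without setting a context. */
static int model_restart_unfixed(struct ifnet *ifp)
{
    (void)ifp;
    return model_read_soft_pad();
}

/* Fixed path: borrow the vnet from the interface being processed. */
static int model_restart_fixed(struct ifnet *ifp)
{
    assert(ifp != NULL && ifp->if_vnet != NULL);
    MODEL_CURVNET_SET(ifp->if_vnet);
    int ret = model_read_soft_pad();
    MODEL_CURVNET_RESTORE();
    return ret;
}
```

In this model, calling `model_restart_unfixed()` on an interface reports the failure, while `model_restart_fixed()` returns the interface's `soft_pad` value and leaves `curvnet` as it found it, which is the property the set/restore pairing is meant to guarantee.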
Updated by Jim Pingle about 3 years ago
- Target version changed from CE-Next to 2.6.0
Updated by Max Leighton about 3 years ago
- File issue-11470-config.xml issue-11470-config.xml added
Please see the attached sanitized interfaces/shaper config from a 5100 that has this issue; it may help in reproducing the problem.
Updated by Jim Pingle about 3 years ago
- Plus Target Version changed from 21.09 to 22.01
Updated by Jim Pingle almost 3 years ago
- Status changed from Feedback to Resolved
- Assignee set to Kristof Provost
No recent reports. Can always reopen it if someone manages to reproduce it again with the current fix in place.