Bug #15413

closed

Kernel panic in HA nodes when under high load

Added by Steve Wheeler 7 months ago. Updated 10 days ago.

Status: Resolved
Priority: Normal
Category: FreeBSD
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version: 24.11
Release Notes: Default
Affected Version: 2.7.2
Affected Architecture: All

Description

Two 1541s running 23.09.1 in this example:

db:1:pfs> bt
Tracing pid 12 tid 100124 td 0xfffffe00e24d1560
kdb_enter() at kdb_enter+0x32/frame 0xfffffe01066f8820
vpanic() at vpanic+0x163/frame 0xfffffe01066f8950
panic() at panic+0x43/frame 0xfffffe01066f89b0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe01066f8a10
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe01066f8a70
calltrap() at calltrap+0x8/frame 0xfffffe01066f8a70
--- trap 0xc, rip = 0xffffffff80fb5e29, rsp = 0xfffffe01066f8b40, rbp = 0xfffffe01066f8ba0 ---
pf_test_state_udp() at pf_test_state_udp+0x2a9/frame 0xfffffe01066f8ba0
pf_test() at pf_test+0x110a/frame 0xfffffe01066f8d40
pf_check_in() at pf_check_in+0x27/frame 0xfffffe01066f8d60
pfil_mbuf_in() at pfil_mbuf_in+0x38/frame 0xfffffe01066f8d90
ip_input() at ip_input+0x3ae/frame 0xfffffe01066f8df0
swi_net() at swi_net+0x128/frame 0xfffffe01066f8e60
ithread_loop() at ithread_loop+0x257/frame 0xfffffe01066f8ef0
fork_exit() at fork_exit+0x7f/frame 0xfffffe01066f8f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01066f8f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db:1:pfs>  show registers
cs                        0x20
ds                        0x3b
es                        0x3b
fs                        0x13
gs                        0x1b
ss                        0x28
rax                       0x12
rcx         0xffffffff814591d9
rdx         0xffffffff844195ff
rbx                      0x100
rsp         0xfffffe01066f8820
rbp         0xfffffe01066f8820
rsi         0xfffffe01066f8290
rdi         0xffffffff82d40298  vt_conswindow+0x10
r8                        0x10
r9                        0x10
r10                        0xf
r11                       0x10
r12                          0
r13                          0
r14         0xffffffff813dc4fe
r15         0xfffffe00e24d1560
rip         0xffffffff80d38d62  kdb_enter+0x32
rflags                    0x82
kdb_enter+0x32: movq    $0,0x2344aa3(%rip)
db:1:pfs>  show pcpu
cpuid        = 8
dynamic pcpu = 0xfffffe00b4a6af00
curthread    = 0xfffffe00e24d1560: pid 12 tid 100124 critnest 1 "swi1: netisr 11" 
curpcb       = 0xfffffe00e24d1a80
fpcurthread  = none
idlethread   = 0xfffffe00e23fce40: tid 100011 "idle: cpu8" 
self         = 0xffffffff84018000
curpmap      = 0xffffffff83021ab0
tssp         = 0xffffffff84018384
rsp0         = 0xfffffe01066f9000
kcr3         = 0x8000000003f6f002
ucr3         = 0xffffffffffffffff
scr3         = 0x69eb1e858
gs32p        = 0xffffffff84018404
ldt          = 0xffffffff84018444
tss          = 0xffffffff84018434
curvnet      = 0xfffff80005383440

Crashing repeatedly. Multiple crash reports available.

Also see: https://netgate.slack.com/archives/C4GUL8CKF/p1671555158444819
Similar crash in 22.05.
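
Since multiple crash reports are available, it may help to check whether they all fault in the same place. The following is a minimal sketch, not an existing tool: it assumes each report contains a ddb `bt` listing in the same format as the one above, and the names `faulting_frame`, `FRAME_RE` and `TRAP_RE` are hypothetical. It returns the trap number and the first frame printed after the first `--- trap` line, which in the trace above is pf_test_state_udp+0x2a9 with trap 0xc (page fault on amd64).

```python
import re

# Matches ddb frame lines such as:
#   pf_test_state_udp() at pf_test_state_udp+0x2a9/frame 0xfffffe01066f8ba0
FRAME_RE = re.compile(r"^(\w+)\(\) at (\w+)\+(0x[0-9a-f]+)/frame (0x[0-9a-f]+)$")

# Matches the trap separator line, e.g. "--- trap 0xc, rip = 0x..., ..."
TRAP_RE = re.compile(r"^--- trap (0x[0-9a-f]+|\d+), rip = (0x[0-9a-f]+|\d+),")


def faulting_frame(backtrace):
    """Return (trap_number, function, offset) for the first frame that
    follows the first trap line, or None if no such frame is found."""
    trap = None
    for line in backtrace.splitlines():
        line = line.strip()
        if trap is None:
            # Still looking for the trap separator.
            m = TRAP_RE.match(line)
            if m:
                trap = int(m.group(1), 0)
            continue
        # Trap line seen: the next frame line is where the fault occurred.
        m = FRAME_RE.match(line)
        if m:
            return trap, m.group(2), int(m.group(3), 16)
    return None


# Excerpt of the backtrace from this report.
sample = """\
calltrap() at calltrap+0x8/frame 0xfffffe01066f8a70
--- trap 0xc, rip = 0xffffffff80fb5e29, rsp = 0xfffffe01066f8b40, rbp = 0xfffffe01066f8ba0 ---
pf_test_state_udp() at pf_test_state_udp+0x2a9/frame 0xfffffe01066f8ba0
pf_test() at pf_test+0x110a/frame 0xfffffe01066f8d40
"""
print(faulting_frame(sample))  # (12, 'pf_test_state_udp', 681)
```

Running this over each crash report and comparing the returned tuples would show whether the panics share one faulting frame or differ between reports and versions.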
