Bug #14075
Closed
Using the ``Transparent ClientIP`` option in HAProxy results in kernel panics
Description
Report from a Netgate 7100 after upgrading to 23.01. Before disabling the Transparent ClientIP option in HAProxy, the system would crash with the following:
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 18
fault virtual address = 0x37fe
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff81318330
stack pointer = 0x28:0xfffffe002a76ada0
frame pointer = 0x28:0xfffffe002a76ada0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (if_io_tqg_3)
rdi: fffff8011c392798 rsi: 37fe rdx: 3c
rcx: 3c r8: fffff8011c38ef9a r9: fffffe002a76b19c
rax: fffff8011c392798 rbx: 3c rbp: fffffe002a76ada0
r10: fffffe00aa2b46d0 r11: fffff8011c392700 r12: 68
r13: fffff80584106100 r14: fffff8011c392700 r15: 6
trap number = 12
panic: page fault
cpuid = 3
time = 1676952890
KDB: enter: panic
db:1:pfs> bt
Tracing pid 0 tid 100010 td 0xfffffe00d5069560
kdb_enter() at kdb_enter+0x32/frame 0xfffffe002a76ab60
vpanic() at vpanic+0x182/frame 0xfffffe002a76abb0
panic() at panic+0x43/frame 0xfffffe002a76ac10
trap_fatal() at trap_fatal+0x409/frame 0xfffffe002a76ac70
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe002a76acd0
calltrap() at calltrap+0x8/frame 0xfffffe002a76acd0
--- trap 0xc, rip = 0xffffffff81318330, rsp = 0xfffffe002a76ada0, rbp = 0xfffffe002a76ada0 ---
memmove_erms() at memmove_erms+0x30/frame 0xfffffe002a76ada0
m_pullup() at m_pullup+0x19f/frame 0xfffffe002a76ade0
ipfw_chk() at ipfw_chk+0x1082/frame 0xfffffe002a76b020
ipfw_check_frame() at ipfw_check_frame+0x13c/frame 0xfffffe002a76b100
pfil_run_hooks() at pfil_run_hooks+0x97/frame 0xfffffe002a76b140
ether_output_frame() at ether_output_frame+0x94/frame 0xfffffe002a76b170
ether_output() at ether_output+0x66a/frame 0xfffffe002a76b200
pf_route() at pf_route+0x81c/frame 0xfffffe002a76b2c0
pf_test() at pf_test+0xc6b/frame 0xfffffe002a76b440
pf_check_out() at pf_check_out+0x1f/frame 0xfffffe002a76b460
pfil_run_hooks() at pfil_run_hooks+0x97/frame 0xfffffe002a76b4a0
ip_output() at ip_output+0xa13/frame 0xfffffe002a76b5a0
tcp_default_output() at tcp_default_output+0x1d2b/frame 0xfffffe002a76b770
tcp_output() at tcp_output+0x10/frame 0xfffffe002a76b790
tcp_do_segment() at tcp_do_segment+0x3164/frame 0xfffffe002a76b860
tcp_input_with_port() at tcp_input_with_port+0x100d/frame 0xfffffe002a76b9c0
tcp_input() at tcp_input+0xb/frame 0xfffffe002a76b9d0
ip_input() at ip_input+0x229/frame 0xfffffe002a76ba30
netisr_dispatch_src() at netisr_dispatch_src+0x2a6/frame 0xfffffe002a76ba80
ether_demux() at ether_demux+0x144/frame 0xfffffe002a76bab0
ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe002a76bb10
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe002a76bb60
ether_input() at ether_input+0x69/frame 0xfffffe002a76bbc0
ether_demux() at ether_demux+0x9e/frame 0xfffffe002a76bbf0
ether_nh_input() at ether_nh_input+0x353/frame 0xfffffe002a76bc50
netisr_dispatch_src() at netisr_dispatch_src+0xb9/frame 0xfffffe002a76bca0
ether_input() at ether_input+0x69/frame 0xfffffe002a76bd00
iflib_rxeof() at iflib_rxeof+0xbdb/frame 0xfffffe002a76be00
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe002a76be40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe002a76bec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc3/frame 0xfffffe002a76bef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe002a76bf30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe002a76bf30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
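Reading the trace outward from the first trap marker, the fault is raised in memmove_erms(), called from m_pullup() inside ipfw_chk(), which was reached through ipfw_check_frame() and pfil_run_hooks() after pf_route() passed the packet to ether_output(). As a rough aid for reading traces like this one, the following Python sketch extracts that chain. The helper names (frames, faulting_frames) and the SAMPLE excerpt are illustrative assumptions, not part of any pfSense or FreeBSD tooling.

```python
import re

# Matches ddb backtrace entries such as
# "m_pullup() at m_pullup+0x19f/frame 0xfffffe002a76ade0".
FRAME_RE = re.compile(r"(\w+)\(\) at \1\+0x[0-9a-f]+/frame 0x[0-9a-f]+")


def frames(backtrace):
    """Return function names in the order they appear (innermost first)."""
    return [m.group(1) for m in FRAME_RE.finditer(backtrace)]


def faulting_frames(backtrace):
    """Return the frames listed after the first '--- trap' marker.

    Entries before the marker belong to the panic handling itself;
    entries after it are the code that was running when the fault hit.
    """
    _, _, after = backtrace.partition("--- trap")
    return frames(after)


# Excerpt copied from the trace in this report.
SAMPLE = """
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe002a76acd0
calltrap() at calltrap+0x8/frame 0xfffffe002a76acd0
--- trap 0xc, rip = 0xffffffff81318330, rsp = 0xfffffe002a76ada0, rbp = 0xfffffe002a76ada0 ---
memmove_erms() at memmove_erms+0x30/frame 0xfffffe002a76ada0
m_pullup() at m_pullup+0x19f/frame 0xfffffe002a76ade0
ipfw_chk() at ipfw_chk+0x1082/frame 0xfffffe002a76b020
ipfw_check_frame() at ipfw_check_frame+0x13c/frame 0xfffffe002a76b100
pfil_run_hooks() at pfil_run_hooks+0x97/frame 0xfffffe002a76b140
ether_output_frame() at ether_output_frame+0x94/frame 0xfffffe002a76b170
ether_output() at ether_output+0x66a/frame 0xfffffe002a76b200
pf_route() at pf_route+0x81c/frame 0xfffffe002a76b2c0
"""

if __name__ == "__main__":
    # Print the call chain with the faulting function first.
    print(" <- ".join(faulting_frames(SAMPLE)))
```

Because the pattern is applied with finditer, the same helpers work whether the trace is pasted one frame per line or flattened onto a single line.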
Updated by Christian McDonald almost 2 years ago
- Target version set to 2.7.0
- Plus Target Version set to 23.05
This is likely a bug in ipfw, which was included in 23.01. 23.05 does not contain the ipfw kernel module.
23.01:
pkg info -l pfSense-kernel-pfSense-23.01 | grep -c ipfw
6
23.05:
pkg info -l pfSense-kernel-pfSense-23.01 | grep -c ipfw
0
From what I can tell, 23.05 shouldn't crash, but the Transparent ClientIP feature is totally broken there. We would need to see if the new L2 features of pf can be used instead of ipfw.
Is this feature even needed?
Updated by Christian McDonald almost 2 years ago
I have returned ipfw to development snapshots so we can work on replicating and testing there. It is not possible to fix this in 23.01 without shipping a point release, so any further investigation should take place on tomorrow's (3/15) development snapshots.
Updated by Christian McDonald almost 2 years ago
- Assignee set to Christian McDonald
Updated by Marcos M over 1 year ago
- Status changed from New to Feedback
The original report was from a customer's system; however, I have not been able to reproduce this on either 23.01 or Plus snapshots. Steps to reproduce would be helpful.
Updated by Christian McDonald over 1 year ago
- Status changed from Feedback to Not a Bug