Bug #13938

closed

Kernel panic accessing the GUI over IPsec in certain environments when using nginx ``sendfile`` with unmapped mbufs

Added by Jim Pingle almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Category: FreeBSD
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Plus Target Version: 23.05
Release Notes: Default
Affected Version: 2.7.0
Affected Architecture:

Description

Under conditions that have not yet been identified, a kernel panic can occur on FreeBSD main/14.0-CURRENT builds (e.g. Plus 23.01) when accessing the GUI over an IPsec tunnel. So far we have received only two reports, and we have not been able to reproduce the panic in lab conditions.

A community member tracked it down to the use of sendfile in nginx in combination with unmapped mbufs (kern.ipc.mb_use_ext_pgs=1). Both are enabled by default.

Users encountering this crash can take either of two actions:

1. Disable unmapped mbufs by adding a tunable to set kern.ipc.mb_use_ext_pgs=0
OR
2. Disable sendfile in nginx as described in https://forum.netgate.com/post/1084590
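
For reference, the first workaround can be sketched from a root shell as shown below. This is a minimal sketch, not a tested procedure: the sysctl name is taken from the description above, and it assumes the value can be changed at runtime on the affected build. A runtime change does not survive a reboot, which is why the recommendation is to add a tunable. For the second workaround, follow the forum post for the exact steps; the underlying nginx directive is `sendfile off;`.

```shell
# Show the current value (1 means unmapped mbufs are enabled)
sysctl kern.ipc.mb_use_ext_pgs

# Disable unmapped mbufs for the running system only.
# Assumption: the sysctl is writable at runtime; add a tunable to make it persistent.
sysctl kern.ipc.mb_use_ext_pgs=0
```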

Full details are on the forum thread, including backtraces and textdump archives:

https://forum.netgate.com/topic/176974/web-gui-crashes-after-upgrade-from-22-05-to-23-01

Actions #1

Updated by Steve Wheeler almost 2 years ago

To make searching easier, the backtrace this generates is:

Tracing pid 3765 tid 100406 td 0xfffffe00c65a4900
kdb_enter() at kdb_enter+0x32/frame 0xfffffe00c3d6f320
vpanic() at vpanic+0x182/frame 0xfffffe00c3d6f370
panic() at panic+0x43/frame 0xfffffe00c3d6f3d0
trap_fatal() at trap_fatal+0x409/frame 0xfffffe00c3d6f430
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00c3d6f490
calltrap() at calltrap+0x8/frame 0xfffffe00c3d6f490
--- trap 0xc, rip = 0xffffffff813187ba, rsp = 0xfffffe00c3d6f560, rbp = 0xfffffe00c3d6f560 ---
memcpy_erms() at memcpy_erms+0x10a/frame 0xfffffe00c3d6f560
m_unshare() at m_unshare+0x3de/frame 0xfffffe00c3d6f5e0
esp_output() at esp_output+0x186/frame 0xfffffe00c3d6f6d0
ipsec4_perform_request() at ipsec4_perform_request+0x1d2/frame 0xfffffe00c3d6f760
ipsec4_common_output() at ipsec4_common_output+0xa2/frame 0xfffffe00c3d6f7a0
ip_output() at ip_output+0x99d/frame 0xfffffe00c3d6f8a0
tcp_default_output() at tcp_default_output+0x1d2b/frame 0xfffffe00c3d6fa70
tcp_usr_ready() at tcp_usr_ready+0x1a1/frame 0xfffffe00c3d6fad0
sendfile_iodone() at sendfile_iodone+0x11c/frame 0xfffffe00c3d6fb10
vn_sendfile() at vn_sendfile+0x1663/frame 0xfffffe00c3d6fd70
sys_sendfile() at sys_sendfile+0xf7/frame 0xfffffe00c3d6fe00
amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe00c3d6ff30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00c3d6ff30
--- syscall (393, FreeBSD ELF64, sys_sendfile), rip = 0x8254b84ba, rsp = 0x8209bed68, rbp = 0x8209bf640 ---

Actions #2

Updated by Steve Wheeler almost 2 years ago

The workarounds used here also seem to apply at least partially to connections over OpenVPN tunnels.
See: https://forum.netgate.com/post/1090118

In that case there is no kernel panic, but nginx appears to crash, requiring a reboot.

Actions #3

Updated by Danilo Zrenjanin almost 2 years ago

I can confirm that applying the patch from the forum fixed the issues with connections over IPsec.
https://forum.netgate.com/topic/176974/web-gui-crashes-after-upgrade-from-22-05-to-23-01/62

Actions #4

Updated by Christian McDonald almost 2 years ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100

Actions #5

Updated by Christian McDonald almost 2 years ago

  • Assignee changed from Mateusz Guzik to Christian McDonald

We will now disable sendfile mode. Sendfile has little to no benefit for us on pfSense.

This feature of nginx has been problematic upstream for a while, with it being broken and fixed several times.

Sendfile is really only useful when serving static files from a UFS filesystem.

Actions #6

Updated by Jim Pingle almost 2 years ago

  • Status changed from Feedback to Resolved

sendfile is off in all nginx configurations now, for the GUI and Captive Portal.

Actions #7

Updated by Mateusz Guzik over 1 year ago

Seeing as this is a bug in mbuf handling, I would argue the thing to do is to turn unmapped mbuf support off. There may be other programs out there using sendfile, and there is no point in letting them crash the system.

Actions #9

Updated by Christian McDonald over 1 year ago

  • Status changed from Resolved to Feedback

Actions #10

Updated by Christian McDonald over 1 year ago

  • Status changed from Feedback to Resolved

kern.ipc.mb_use_ext_pgs has been disabled for 2 weeks now.

Marking as resolved.
