Bug #12547

open

Unscheduled system reboot/crash

Added by Evgeny Korostelev 2 months ago. Updated about 2 months ago.

Status: Feedback
Priority: Normal
Assignee:
Category: Operating System
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes: Default
Affected Version:
Affected Architecture: All

Description

pfSense Community Edition 2.5.2
Navigate to the menu "Diagnostics" -> "Routes".
The system then crashes and reboots. After boot, a text system dump is available (attached to this report).


Files

info (1).0 (398 Bytes), Evgeny Korostelev, 11/28/2021 07:16 AM
textdump.tar (1).0 (154 KB), Evgeny Korostelev, 11/28/2021 07:16 AM
info.0 (398 Bytes), Evgeny Korostelev, 11/28/2021 07:16 AM
textdump.tar.0 (154 KB), Evgeny Korostelev, 11/28/2021 07:16 AM
pfgwhome.home.local - Diagnostics_ Routes.pdf (274 KB), Diagnostic Routes PDF Print, Evgeny Korostelev, 11/28/2021 08:03 AM
Actions #1

Updated by Evgeny Korostelev 2 months ago

This does not happen every time.
After 45 minutes I got a successful result.

Actions #2

Updated by Jim Pingle 2 months ago

  • Status changed from New to Feedback

This is not a general problem but one specific to your install or environment.

The backtrace in both cases is identical:

db:0:kdb.enter.default>  bt
Tracing pid 39711 tid 100715 td 0xfffff801d0b5e740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe006631f1c0
vpanic() at vpanic+0x197/frame 0xfffffe006631f210
panic() at panic+0x43/frame 0xfffffe006631f270
trap_fatal() at trap_fatal+0x391/frame 0xfffffe006631f2d0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe006631f320
trap() at trap+0x286/frame 0xfffffe006631f430
calltrap() at calltrap+0x8/frame 0xfffffe006631f430
--- trap 0xc, rip = 0xffffffff80eebdf2, rsp = 0xfffffe006631f500, rbp = 0xfffffe006631f640 ---
sysctl_dumpentry() at sysctl_dumpentry+0x1a2/frame 0xfffffe006631f640
rn_walktree() at rn_walktree+0x98/frame 0xfffffe006631f670
sysctl_rtsock() at sysctl_rtsock+0x20d/frame 0xfffffe006631f8a0
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x8a/frame 0xfffffe006631f8e0
sysctl_root() at sysctl_root+0x220/frame 0xfffffe006631f960
userland_sysctl() at userland_sysctl+0x178/frame 0xfffffe006631fa10
sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe006631fac0
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe006631fbf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe006631fbf0
--- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80046ce2a, rsp = 0x7fffffffea38, rbp = 0x7fffffffea70 ---

In the past, sysctl panics have sometimes been related to BIOS/hardware quirks and are not necessarily a software problem.

Since you seem to be able to repeat this, the first step is to move it up to a 2.6.0 development snapshot and see if the problem still happens there.

Actions #3

Updated by Mateusz Guzik about 2 months ago

I found the panicking instruction:
0xffffffff80eebdf2 <+418>: mov (%rcx),%rcx

corresponds to:
info.rti_info[RTAX_IFP] = rt->rt_ifp->if_addr->ifa_addr;

But given the panic, if_addr is probably NULL.

That is, the route at hand is either in the process of being destroyed OR is not fully constructed yet and the code was unlucky enough to find it in this state.

A quick hack to work around the problem would be to NULL check the pointer. The correct fix will require finding out which of the two cases we are dealing with here and plugging that.
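
For illustration only, the kind of NULL check described above might look roughly like the following userland sketch. The `*_stub` structures and the `route_ifp_addr` helper are hypothetical, simplified stand-ins for the kernel types behind `rt->rt_ifp->if_addr->ifa_addr`. This is not the actual kernel code or a submitted patch, and it does nothing about the underlying lifetime problem.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; NOT the real definitions. */
struct sockaddr_stub { int sa_family; };
struct ifaddr_stub   { struct sockaddr_stub *ifa_addr; };
struct ifnet_stub    { struct ifaddr_stub *if_addr; };
struct rtentry_stub  { struct ifnet_stub *rt_ifp; };

/*
 * Hypothetical helper: follow rt->rt_ifp->if_addr->ifa_addr, returning NULL
 * instead of faulting when any link in the chain is missing.
 */
struct sockaddr_stub *
route_ifp_addr(const struct rtentry_stub *rt)
{
	if (rt == NULL || rt->rt_ifp == NULL || rt->rt_ifp->if_addr == NULL)
		return (NULL);
	return (rt->rt_ifp->if_addr->ifa_addr);
}
```

A caller would store the result where the original line stores `ifa_addr` and treat NULL as "no interface address to report". A check like this only hides the symptom: without proper locking or reference counting, the pointer could still change between the check and the use.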

I'm going to take a day or two with it and if I don't find any suspects I'll submit the workaround for the time being.

Actions #4

Updated by Mateusz Guzik about 2 months ago

  • Assignee set to Mateusz Guzik
  • Affected Architecture All added
  • Affected Architecture deleted (amd64)