Bug #4403

closed

Enabling SNMP causes kernel panic with APU with empty SD card slot

Added by Andreas Walther almost 10 years ago. Updated almost 9 years ago.

Status: Resolved
Priority: High
Category: SNMP
Target version:
Start date: 02/10/2015
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: 2.2.x
Affected Architecture:

Description

Hi all,

I am not sure whether this is a hardware problem, but I am using a PC Engines APU.1C (2 GB) board, which works fine until I try to enable SNMP via the web interface.
The APU.1C should be the same as your recommended hardware, the VK-T40E Desktop Firewall Router Appliance (https://www.pfsense.org/hardware/pfsense-store.html#vkt40e).
I tried this twice with pfSense 2.2: once after an update from 2.1 and once after a fresh 2.2 install.
The system works without any problem until I try to enable SNMP with the following settings.

Webconfigurator > Services > SNMP

SNMP Daemon
  • Enable: checked
  • Read Community string: public34tr497g429tr20ztg

Interface Binding
  • Bind Interface: LAN

Submitting the form crashes the system.

After a power reset, this is the boot output:

ugen2.1: <ATI> at usbus2
uhub2: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <ATI> at usbus3
uhub3: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 12Mbps Full Speed USB v1.0
usbus6: 480Mbps High Speed USB v2.0
ugen4.1: <ATI> at usbus4
uhub4: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
ugen5.1: <ATI> at usbus5
uhub5: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
ugen6.1: <ATI> at usbus6
uhub6: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus6
uhub4: 2 ports with 2 removable, self powered
uhub0: 5 ports with 5 removable, self powered
uhub2: 5 ports with 5 removable, self powered
uhub5: 4 ports with 4 removable, self powered
uhub6: 4 ports with 4 removable, self powered
uhub1: 5 ports with 5 removable, self powered
uhub3: 5 ports with 5 removable, self powered
ugen6.2: <Generic> at usbus6
umass0: <Generic Flash Card ReaderWriter, class 0/0, rev 2.01/1.00, addr 2> on usbus6
ugen3.2: <HUAWEI Technology> at usbus3
u3g0: <HUAWEI Technology HUAWEI MOBILE WCDMA EM770W, class 0/0, rev 2.00/0.00, addr 2> on usbus3
u3g0: Found 6 ports.
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <KINGSTON SMS200S330G 541ABBF0> ATA-8 SATA 3.x device
ada0: Serial Number 50026B724B0A8XXX
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 28626MB (58626288 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
da0: <Multiple Card  Reader 1.00> Removable Direct Access SCSI-4 device
da0: Serial Number 058F63666XXX
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da0: quirks=0x2<NO_6_BYTE>
SMP: AP CPU #1 Launched!
Timecounter "TSC" frequency 1000019445 Hz quality 800
Trying to mount root from ufs:/dev/ada0s1a [rw]...
WARNING: / was not properly dismounted
Configuring crash dumps...
Using /dev/ada0s1b for dump device.
Mounting filesystems...
** /dev/ada0s1a
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=562046 (8 should be 0)
CORRECT? yes

INCORRECT BLOCK COUNT I=562054 (8 should be 0)
CORRECT? yes

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
LINK COUNT FILE I=562181  OWNER=0 MODE=100644
SIZE=442 MTIME=Feb  7 14:28 2015  COUNT 2 SHOULD BE 1
ADJUST? yes

UNREF FILE I=2327427  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327428  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327429  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327430  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327431  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327432  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327437  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327438  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327439  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327440  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327441  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE I=2327442  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
CLEAR? yes

UNREF FILE  I=2327452  OWNER=0 MODE=100644
SIZE=0 MTIME=Feb  7 14:28 2015
RECONNECT? yes

NO lost+found DIRECTORY
CREATE? yes

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

6313 files, 65408 used, 6012987 free (451 frags, 751567 blocks, 0.0% fragmentation)

***** FILE SYSTEM STILL DIRTY *****

***** FILE SYSTEM WAS MODIFIED *****

***** PLEASE RERUN FSCK *****
** /dev/ada0s1a
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
6313 files, 65408 used, 6012987 free (451 frags, 751567 blocks, 0.0% fragmentation)

***** FILE SYSTEM MARKED CLEAN *****
Disabling APM on /dev/ad4
pwd_mkdb: root gid is incorrect
pwd_mkdb: at line #1
pwd_mkdb: /etc/master.passwd: Inappropriate file type or format

     ___
 ___/ f \
/ p \___/ Sense
\___/   \
    \___/

Welcome to pfSense 2.2-RELEASE  ...

savecore: reboot
savecore: writing core to /var/crash/textdump.tar.1
Creating symlinks......ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
32-bit compatibility ldconfig path: /usr/lib32
done.
Feb  7 14:30:44 system[253]: [ERROR] [pool lighty] cannot get uid for user 'root'
[ERROR] [pool lighty] cannot get uid for user 'root'
Feb  7 14:30:44 system[253]: [ERROR] FPM initialization failed
[ERROR] FPM initialization failed
fcgicli: Could not connect to server(/var/run/php-fpm.socket).
Launching the init system... done.
Initializing...................... done.
Starting device manager (devd)...
Warning: chown(): Unable to find uid for root in /etc/inc/config.lib.inc on line 867

Warning: chgrp(): Unable to find gid for proxy in /etc/inc/config.lib.inc on line 868
done.
Loading configuration......done.
Updating configuration...done.
Cleaning backup cache.................................done.
Setting up extended sysctls...done.
Setting timezone...done.
Configuring loopback interface...done.
Starting syslog...done.
Starting Secure Shell Services...done.
Setting up polling defaults...done.
Setting up interfaces microcode...done.
Configuring loopback interface...done.
Creating wireless clone interfaces...done.
Configuring LAGG interfaces...done.
Configuring VLAN interfaces...done.
Configuring QinQ interfaces...done.
Configuring WAN interface...done.
Configuring MODEMACCESS interface...done.
Configuring LAN interface...Starting DNS Resolver...done.
Starting DHCPv6 service...done.
done.
Configuring CARP settings...done.
Syncing OpenVPN settings...done.
Configuring firewall......done.
Starting PFLOG...done.
Setting up gateway monitors...done.
Synchronizing user settings...done.
Starting webConfigurator...done.
Configuring CRON...done.
Starting DNS Resolver...done.
Starting NTP time client...done.
pgrep: Invalid pid in file `/var/dhcpd/var/run/dhcpd.pid'
Starting DHCP service...done.
Starting DHCPv6 service...done.
Configuring firewall......done.
Starting SNMP daemon... done.
Generating RRD graphs...
Warning: chown(): Unable to find uid for nobody in /etc/inc/rrd.inc on line 289

Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer     = 0x20:0xffffffff80b6d4e5
stack pointer           = 0x28:0xfffffe003609f840
frame pointer           = 0x28:0xfffffe003609f850
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 32124 (bsnmpd)
[ thread pid 32124 tid 100104 ]
Stopped at      strlcpy+0x25:   movb    (%rax),%dl
db:0:kdb.enter.default> textdump set
textdump set
db:0:kdb.enter.default>  capture on
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command
db:1:locks>  show alllocks
No such command
db:1:alllocks>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 1
dynamic pcpu = 0xfffffe0098d5a700
curthread    = 0xfffff8000a4ea490: pid 32124 "bsnmpd" 
curpcb       = 0xfffffe003609fcc0
fpcurthread  = 0xfffff8000a4ea490: pid 32124 "bsnmpd" 
idlethread   = 0xfffff8000320e920: tid 100004 "idle: cpu1" 
curpmap      = 0xfffff800032199f8
tssp         = 0xffffffff8218d078
commontssp   = 0xffffffff8218d078
rsp0         = 0xfffffe003609fcc0
gs32p        = 0xffffffff8218ead0
ldt          = 0xffffffff8218eb10
tss          = 0xffffffff8218eb00
db:0:kdb.enter.default>  bt
Tracing pid 32124 tid 100104 td 0xfffff8000a4ea490
strlcpy() at strlcpy+0x25/frame 0xfffffe003609f850
sysctl_rman() at sysctl_rman+0x1e1/frame 0xfffffe003609f930
sysctl_root() at sysctl_root+0x232/frame 0xfffffe003609f980
userland_sysctl() at userland_sysctl+0x1d8/frame 0xfffffe003609fa30
sys___sysctl() at sys___sysctl+0x74/frame 0xfffffe003609fae0
amd64_syscall() at amd64_syscall+0x351/frame 0xfffffe003609fbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe003609fbf0
--- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x800fb598a, rsp = 0x7fffffffa3d8, rbp = 0x7fffffffa410 ---
db:0:kdb.enter.default>  ps
  pid  ppid  pgrp   uid   state   wmesg         wchan        cmd
32124     1 32124     0  Rs      CPU 1                       bsnmpd
31400 23140    21     0  S+      kqread   0xfffff8000aad2300 ntpdate
23140     1    21     0  S+      wait     0xfffff8000a8f8980 sh
21737     1 21444     0  S       kqread   0xfffff8000a353100 lighttpd
16868     1 16868     0  Ss      select   0xfffff8000a3704c0 inetd
16061     1 16061     0  Ss      bpf      0xfffff8000a32a600 filterlog
 6840     1  6840     0  Ss      (threaded)                  mpd5
100106                   S       select   0xfffff8000a3732c0 mpd5
 5966     1  5966     0  Ss      select   0xfffff8000a3717c0 syslogd
  269     1   269     0  Ss      select   0xfffff8000a3719c0 devd
  261    21    21     0  R+      CPU 0                       php
  258   256   256     0  S       kqread   0xfffff8000a31fa00 check_reload_status
  256     1   256     0  Ss      kqread   0xfffff8000a320300 check_reload_status
   67     0     0     0  DL      mdwait   0xfffff8000a2f7000 [md0]
   21     1    21     0  Ss+     pause    0xfffff8000a34a0a8 sh
   20     0     0     0  DL      syncer   0xffffffff81faef08 [syncer]
   19     0     0     0  DL      vlruwt   0xfffff8000a34a980 [vnlru]
   18     0     0     0  DL      psleep   0xffffffff81fae104 [bufdaemon]
   17     0     0     0  DL      pgzero   0xffffffff82100e8c [pagezero]
    9     0     0     0  DL      pollid   0xffffffff81f5c8f0 [idlepoll]
    8     0     0     0  DL      psleep   0xffffffff821005c0 [vmdaemon]
    7     0     0     0  DL      psleep   0xffffffff8218c384 [pagedaemon]
    6     0     0     0  DL      waiting_ 0xffffffff8217cdf0 [sctp_iterator]
    5     0     0     0  DL      pftm     0xffffffff80cff710 [pf purge]
   16     0     0     0  DL      (threaded)                  [usb]
100072                   D       -        0xfffffe0000f93010 [ucom]
100063                   D       -        0xfffffe0000976e18 [usbus6]
100062                   D       -        0xfffffe0000976dc0 [usbus6]
100061                   D       -        0xfffffe0000976d68 [usbus6]
100060                   D       -        0xfffffe0000976d10 [usbus6]
100059                   D       -        0xfffffe0000981460 [usbus5]
100058                   D       -        0xfffffe0000981408 [usbus5]
100057                   D       -        0xfffffe00009813b0 [usbus5]
100056                   D       -        0xfffffe0000981358 [usbus5]
100055                   D       -        0xfffffe000096d460 [usbus4]
100054                   D       -        0xfffffe000096d408 [usbus4]
100053                   D       -        0xfffffe000096d3b0 [usbus4]
100052                   D       -        0xfffffe000096d358 [usbus4]
100049                   D       -        0xfffffe0000962e18 [usbus3]
100048                   D       -        0xfffffe0000962dc0 [usbus3]
100047                   D       -        0xfffffe0000962d68 [usbus3]
100046                   D       -        0xfffffe0000962d10 [usbus3]
100045                   D       -        0xfffffe0000959460 [usbus2]
100044                   D       -        0xfffffe0000959408 [usbus2]
100043                   D       -        0xfffffe00009593b0 [usbus2]
100042                   D       -        0xfffffe0000959358 [usbus2]
100041                   D       -        0xfffffe000092ce18 [usbus1]
100040                   D       -        0xfffffe000092cdc0 [usbus1]
100039                   D       -        0xfffffe000092cd68 [usbus1]
100038                   D       -        0xfffffe000092cd10 [usbus1]
100036                   D       -        0xfffffe0000923460 [usbus0]
100035                   D       -        0xfffffe0000923408 [usbus0]
100034                   D       -        0xfffffe00009233b0 [usbus0]
100033                   D       -        0xfffffe0000923358 [usbus0]
    4     0     0     0  DL      (threaded)                  [cam]
100071                   D       -        0xffffffff81e96ac0 [scanner]
100027                   D       -        0xffffffff81e96c80 [doneq0]
    3     0     0     0  DL      crypto_r 0xffffffff820fea90 [crypto returns]
    2     0     0     0  DL      crypto_w 0xffffffff820fe938 [crypto]
   15     0     0     0  DL      -        0xffffffff81eb4180 [rand_harvestq]
   14     0     0     0  DL      (threaded)                  [geom]
100013                   D       -        0xffffffff82171560 [g_down]
100012                   D       -        0xffffffff82171558 [g_up]
100011                   D       -        0xffffffff82171550 [g_event]
   13     0     0     0  DL      (threaded)                  [ng_queue]
100010                   D       sleep    0xffffffff81e54fc8 [ng_queue1]
100009                   D       sleep    0xffffffff81e54fc8 [ng_queue0]
   12     0     0     0  WL      (threaded)                  [intr]
100080                   I                                   [swi1: netisr 1]
100069                   I                                   [swi1: pfsync]
100067                   I                                   [swi1: pf send]
100064                   I                                   [swi0: uart uart]
100051                   I                                   [irq15: ata1]
100050                   I                                   [irq14: ata0]
100037                   I                                   [irq17: ehci0 ehci1+]
100032                   I                                   [irq18: ohci0 ohci1*]
100031                   I                                   [irq19: ahci0]
100030                   I                                   [irq261: re2]
100029                   I                                   [irq260: re1]
100028                   I                                   [irq259: re0]
100025                   I                                   [swi5: fast taskq]
100023                   I                                   [swi6: Giant taskq]
100021                   I                                   [swi6: task queue]
100008                   I                                   [swi3: vm]
100007                   I                                   [swi4: clock]
100006                   I                                   [swi4: clock]
100005                   I                                   [swi1: netisr 0]
   11     0     0     0  RL      (threaded)                  [idle]
100004                   CanRun                              [idle: cpu1]
100003                   CanRun                              [idle: cpu0]
    1     0     1     0  SLs     wait     0xfffff800032084c0 [init]
   10     0     0     0  DL      audit_wo 0xffffffff82183970 [audit]
    0     0     0     0  DLs     (threaded)                  [kernel]
100070                   D       -        0xfffff800032b1000 [CAM taskq]
100065                   D       -        0xfffff8000a054900 [mca taskq]
100026                   D       -        0xfffff800032b1200 [kqueue taskq]
100024                   D       -        0xfffff800032b1700 [thread taskq]
100022                   D       -        0xfffff800032b1c00 [ffs_trim taskq]
100020                   D       -        0xfffff800032b2400 [acpi_task_2]
100019                   D       -        0xfffff800032b2400 [acpi_task_1]
100018                   D       -        0xfffff800032b2400 [acpi_task_0]
100014                   D       -        0xfffff800031fa500 [firmware taskq]
100000                   D       swapin   0xffffffff82171658 [swapper]
db:0:kdb.enter.default>  alltrace

Tracing command bsnmpd pid 32124 tid 100104 td 0xfffff8000a4ea490
strlcpy() at strlcpy+0x25/frame 0xfffffe003609f850
sysctl_rman() at sysctl_rman+0x1e1/frame 0xfffffe003609f930
sysctl_root() at sysctl_root+0x232/frame 0xfffffe003609f980
userland_sysctl() at userland_sysctl+0x1d8/frame 0xfffffe003609fa30
sys___sysctl() at sys___sysctl+0x74/frame 0xfffffe003609fae0
amd64_syscall() at amd64_syscall+0x351/frame 0xfffffe003609fbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe003609fbf0
--- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x800fb598a, rsp = 0x7fffffffa3d8, rbp = 0x7fffffffa410 ---
...

Files

pac_PID5779_n2.txt (68 KB), Jim Pingle, 02/11/2015 12:41 PM
pfSense_pagefault_bsnmpd.txt (65.8 KB), Marcel Janicki, 02/14/2015 08:20 AM
snmp_bug_boot_log.txt (41.9 KB), Console output when APU board boots with SNMP daemon enabled, Stefan Nunninger, 03/09/2015 07:12 AM
crash-log.zip (9.25 KB), crash log, Guillaume Leroy, 03/11/2015 06:42 AM
snmp-apu-hostres.diff (765 Bytes), Jim Pingle, 11/23/2015 03:39 PM
upgrade.log (173 KB), Guillaume Leroy, 11/23/2015 05:12 PM
Actions #1

Updated by Chris Buechler almost 10 years ago

  • Subject changed from Enabling SNMP Freezes System / Unbootable after Power Reset to Enabling SNMP causes kernel panic with APU in some circumstance
  • Status changed from New to Confirmed
  • Priority changed from Urgent to High

Enough people have reported this that it's clearly an issue in some circumstance. I'm not sure what that circumstance is, though. For some it's as simple as just enabling SNMP, without even having to query it, and it kernel panics. It seems to be exclusive to the APU, but it didn't happen to me when enabling SNMP on an APU.

There are two differences between my APU and what's shown here: mine was running from SD when I tried it, whereas the SD slot is empty here (I think this is the likely cause), and this one has a 3G/4G/LTE modem in it. I'll install on an mSATA in my APU, remove the SD card, and see if that's replicable.

Actions #2

Updated by Jim Pingle almost 10 years ago

I can reproduce it on my APU now as well. Fresh install on mSATA, no SD card inserted, using the factory image.

Seems to only happen if the "Host Resources" module is selected, which requires mibII, but mibII alone is not enough to trigger the crash.

Backtrace looks about the same.

It wouldn't be the first time that a drive present with no media caused a crash via bsnmpd (CD devices without media mounted on ESX come to mind).

Actions #3

Updated by Ermal Luçi almost 10 years ago

does sysctl hw.bus return a result?

Actions #4

Updated by Jim Pingle almost 10 years ago

: sysctl hw.bus
hw.bus.devctl_disable: 0
hw.bus.devctl_queue: 1000

sysctl -a works also.

Actions #5

Updated by Andreas Walther almost 10 years ago

Well, the first crash, after the update from 2.1 to 2.2, was with an SD card as the disk and a Mini PCIe 3G modem installed.
The second time was with an mSATA module and a Mini PCIe 3G modem, without an SD card.

So I think it has nothing to do with the empty SD card slot.
Jim P, did you have a Mini PCIe card installed?

As far as I understand, Chris Buechler did not have a crash like this with an "empty" board and an SD card.
So maybe it has something to do with a populated PCI Express bus?

Actions #6

Updated by Jim Pingle almost 10 years ago

I don't have an SD card in, but I do have a Mini-PCIe wireless card.

Actions #7

Updated by Chris Buechler almost 10 years ago

  • Subject changed from Enabling SNMP causes kernel panic with APU in some circumstance to Enabling SNMP causes kernel panic with APU with empty SD card slot
  • Target version set to 2.2.1

The only scenario we've been able to replicate is with no SD card installed. It's easily replicable by just removing the SD card. I have a wifi card in mine. I haven't tried without it installed, but it doesn't seem to be specific to any particular add-on hardware.

There might be some other circumstance with the same symptoms.

Andreas: could you do some experimentation with your combination of hardware? See if it's the same with only the mSATA on the board. Add the cellular card and see what happens. Then put a SD card in it (with or without the mSATA SSD) and see what happens there.

Actions #8

Updated by Andreas Walther almost 10 years ago

Chris Buechler wrote:

Andreas: could you do some experimentation with your combination of hardware? See if it's the same with only the mSATA on the board. Add the cellular card and see what happens. Then put a SD card in it (with or without the mSATA SSD) and see what happens there. Just running "sysctl -a" will suffice to replicate it, and won't leave you with a bricked setup like enabling SNMP does.

Yes, I can do that.
I will do it on the weekend, so you will have to wait a little.

Actions #9

Updated by Andreas Walther almost 10 years ago

I just started testing which combinations of hardware trigger this crash, but the command "sysctl -a" does not crash the system.

Actions #10

Updated by Marcel Janicki almost 10 years ago

Same for me during the upgrade from 2.1.5 (amd64) to 2.2 (amd64) on an APU.1C4 (4 GB).
I retried the upgrade successfully with bsnmpd disabled beforehand. After the upgrade I enabled the SNMP daemon and the kernel crashed instantly.
pfSense is on an SD card, with no mSATA SSD and no 3G/4G/LTE modem, but with a Compex WLE200NX WLAN card.

Actions #11

Updated by Chris Buechler almost 10 years ago

I misunderstood JimP's earlier comment, running 'sysctl -a' won't panic it in the way enabling SNMP will.

Actions #12

Updated by Guillaume Leroy almost 10 years ago

I am running on a SD card (and without any other card) and I am encountering the problem.

Actions #13

Updated by Chris Buechler over 9 years ago

  • Assignee set to Ermal Luçi
Actions #14

Updated by Stefan Nunninger over 9 years ago

I am experiencing the same bug.
I have attached a log of the console output.
I am running pfSense 2.2 on an Alix APU board without an SD card. The system is stored on an mSATA SSD.

Could somebody please explain how to work around the boot loop?
I can boot a recovery system with pfSense from a USB stick.
Where can I disable the SNMP daemon to prevent the system from hanging during boot?

Actions #15

Updated by Jim Pingle over 9 years ago

To work around it, you can rename the bsnmpd binary or otherwise disable it. For example:

mv /usr/sbin/bsnmpd /usr/sbin/bsnmpd.tmp
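
As a rough sketch only, the same rename can be wrapped so that it also works when the installed system is mounted somewhere other than / (for example from a rescue environment). The function name `disable_bsnmpd` and its `root` argument are hypothetical and not part of pfSense; only the /usr/sbin/bsnmpd path comes from the command above.

```sh
#!/bin/sh
# Hypothetical helper (not part of pfSense): rename the bsnmpd binary so the
# boot scripts cannot start it.
# $1 is the directory where the installed system's root filesystem is
# mounted; it defaults to / for a running system.
disable_bsnmpd() {
    root="${1:-/}"
    # Strip one trailing slash so "/" and "/mnt/" both produce a clean path.
    bin="${root%/}/usr/sbin/bsnmpd"
    if [ ! -f "$bin" ]; then
        echo "bsnmpd not found at $bin" >&2
        return 1
    fi
    mv "$bin" "$bin.tmp"
}

# Example (assumes the installed root filesystem is mounted on /mnt):
#   disable_bsnmpd /mnt
```

The function refuses to do anything when the binary is not where it expects it, which makes a wrong mount point or a wrong path visible immediately.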
Actions #16

Updated by Stefan Nunninger over 9 years ago

Jim P wrote:

To work around it, you can rename the bsnmpd binary or otherwise disable it. For example:

mv /usr/sbin/bsnmpd /usr/sbin/bsnmpd.tmp

I tried to do so. This is what I did precisely:

  • Boot with pfsense from USB stick
  • login on console
  • fsck -y -t ufs /dev/ad4s1 # fix the filesystem otherwise cannot mount it
  • mount /dev/ad4s1 /mnt
  • mv /mnt/sbin/bsnmpd /mnt/sbin/bsnmpd.tmp
  • umount /mnt
  • halt
  • wait until system has stopped
  • remove USB stick
  • powercycle system to start again

Now the boot does not crash anymore, but the console asks for a password and does not accept the admin password.
This is strange because I did not use a password on the console before.

Logging in on the web interface returns: "503 - Service Not Available"

I reset the password according to this description:
https://doc.pfsense.org/index.php/I_locked_myself_out_of_the_WebGUI,_help!#Forgotten_Password_with_Locked_Console

However, I still cannot log in with admin/pfsense on the console.

It seems renaming the file /usr/sbin/bsnmpd had some strange side effects.

Can somebody please give me a hint as to what might have gone wrong?

Actions #17

Updated by Jim Pingle over 9 years ago

The filesystem in /etc was likely corrupted as a result of the repeated panics. Reinstalling is the safest recovery method. Restore a backup with SNMP disabled. Follow up on the forum for further assistance, it's too far removed from this bug report to discuss it more here.

Actions #18

Updated by Chris Buechler over 9 years ago

  • Assignee changed from Ermal Luçi to Renato Botelho

A quick, low-risk workaround for this is to use the APU detection to skip starting SNMP on an APU that doesn't have a SD card installed (or isn't running from one), and log an error instead. That should suffice for 2.2.1, at least to avoid crashing the system. Post-2.2.1 it'll need further review to fix the root cause and remove the workaround.

Renato, go ahead with that workaround.

Actions #19

Updated by Jim Pingle over 9 years ago

From my testing it should be enough to skip only the hostres module on APU, other SNMP modules appeared to be OK, and that way people could still get some use out of SNMP on APU in the meantime.

Actions #20

Updated by Guillaume Leroy over 9 years ago

I don't agree: we noticed that the problem also occurs with SD-card-based APU setups. This is my case.

And by the way, I would like to have SNMP working on my APU devices too, as it did in previous releases, at least the standard MIBs/branches (especially interfaces).
Isn't this kernel panic only caused by a specific SNMP agent module that we could unload / remove from the compilation or from bsnmpd configuration ?

Actions #21

Updated by Guillaume Leroy over 9 years ago

Jim P wrote:

From my testing it should be enough to skip only the hostres module on APU, other SNMP modules appeared to be OK, and that way people could still get some use out of SNMP on APU in the meantime.

+1 :-)

Actions #22

Updated by Jim Pingle over 9 years ago

Guillaume Leroy wrote:

Isn't this kernel panic only caused by a specific SNMP agent module that we could unload / remove from the compilation or from bsnmpd configuration ?

I mentioned that finding way back in Note 2 above (#4403-2)

Actions #23

Updated by Renato Botelho over 9 years ago

  • Status changed from Confirmed to Feedback

Added a conditional to skip hostres on APU for now
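
For illustration only, the idea behind such a conditional can be sketched in shell. The real check is PHP code in services.inc and is not reproduced here; the function name `emit_snmp_modules`, the "apu" platform string, and the assumption that every module lives at /usr/lib/snmp_<name>.so (true for mibII and hostres as quoted in this report, not for every module) are hypothetical.

```sh
#!/bin/sh
# Hypothetical sketch, not the actual pfSense code: print one
# begemotSnmpdModulePath line per requested module, leaving out hostres
# when the platform argument is "apu".
# Usage: emit_snmp_modules PLATFORM MODULE...
emit_snmp_modules() {
    platform="$1"
    shift
    for mod in "$@"; do
        if [ "$mod" = "hostres" ] && [ "$platform" = "apu" ]; then
            # Report the skip on stderr instead of loading the module
            # that triggers the panic.
            echo "skipping hostres on APU" >&2
            continue
        fi
        # Assumed path pattern; modules under /usr/local/lib would need
        # separate handling.
        printf 'begemotSnmpdModulePath."%s" = "/usr/lib/snmp_%s.so"\n' "$mod" "$mod"
    done
}
```

Skipping only the one module keeps the remaining SNMP modules usable on the affected boards.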

Actions #24

Updated by Guillaume Leroy over 9 years ago

Jim P wrote:

Guillaume Leroy wrote:

Isn't this kernel panic only caused by a specific SNMP agent module that we could unload / remove from the compilation or from bsnmpd configuration ?

I mentioned that finding way back in Note 2 above (#4403-2)

Very good work indeed!
I am now running 2.2 fine with bsnmpd enabled but the Host Resources MIB module disabled.
Thanks.

Actions #25

Updated by Chris Buechler over 9 years ago

  • Status changed from Feedback to Confirmed
  • Assignee changed from Renato Botelho to Ermal Luçi
  • Target version changed from 2.2.1 to 2.2.2

Confirmed that works around it for now, moving this to 2.2.2 for a proper fix.

Actions #26

Updated by Chris Buechler over 9 years ago

  • Target version changed from 2.2.2 to 2.2.3
Actions #27

Updated by Matt Meyer over 9 years ago

I've just hit this issue myself using an ALIX 2D13. There are no other devices except for the CF card.

Actions #28

Updated by Chris Buechler over 9 years ago

Matt: haven't heard of it on ALIX but same could impact it also. does disabling the host resources MIB prevent the issue for you?

Actions #29

Updated by Chris Buechler over 9 years ago

  • Target version changed from 2.2.3 to 2.3
Actions #30

Updated by Ermal Luçi over 9 years ago

Actions #31

Updated by Matt Meyer over 9 years ago

Chris Buechler wrote:

Matt: haven't heard of it on ALIX but same could impact it also. does disabling the host resources MIB prevent the issue for you?

In my case this didn't help. I have since reverted to 2.1.5 but could upgrade again to test any fixes.

Actions #33

Updated by Jim Thompson about 9 years ago

  • Assignee changed from Ermal Luçi to Renato Botelho

reassign to Renato. Maybe this is fixed in FreeBSD 10.2

Actions #34

Updated by Jim Thompson about 9 years ago

  • Assignee changed from Renato Botelho to Chris Buechler

now reassigned to cmb

Actions #35

Updated by Chris Buechler about 9 years ago

can anyone still replicate this? Going back to 2.2.0-REL, full install on mSATA, no SD card, with or without an ath card, it doesn't happen. OP's APU is 2 GB and I'm on a 4 GB one, so tried setting hw.physmem in loader.conf to make it look like it has 2 GB RAM in case that was related, no dice. On 2.2.5 after removing the code that omits hostres on APUs it's fine as well. Also no issue on 2.3, but given I can't replicate where it was happening to others, that's not helpful.

Actions #36

Updated by Marcel Janicki almost 9 years ago

Configuration: pfSense 2.2.5 amd64 on a SD card, no mSATA SSD, APU.1C4, 4 GB, no 3G/4G/LTE modem but a Compex WLE200NX WLAN card (Atheros).
I've re-enabled the SNMP module Host Resources and the kernel didn't crash.
Thank you!

Actions #37

Updated by Guillaume Leroy almost 9 years ago

Almost the same config on my side: an APU 1c with the nanobsd-based system running on an SD card and nothing else.
I re-enabled the Host Resources module in the config and the system didn't crash. However, I am not able to get any reply when polling the Host MIB either.
So I suppose the module is not really enabled, probably because of the anti-crash safeguard added when the issue was reported on the APU.

And indeed I don't see the module in the snmpd config file:

# cat /var/etc/snmpd.conf
...
snmpEnableAuthenTraps = 2
begemotSnmpdModulePath."mibII"  = "/usr/lib/snmp_mibII.so" 
begemotSnmpdModulePath."netgraph" = "/usr/lib/snmp_netgraph.so" 
%netgraph
begemotNgControlNodeName = "snmpd" 
begemotSnmpdModulePath."pf"     = "/usr/lib/snmp_pf.so" 
begemotSnmpdModulePath."ucd"     = "/usr/local/lib/snmp_ucd.so" 
begemotSnmpdModulePath."regex"     = "/usr/local/lib/snmp_regex.so" 
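
A quick way to make the same check without reading the file by eye is sketched below. The function name `hostres_configured` is hypothetical; the default path is the one shown in the listing above, and the match assumes that lines starting with # are comments.

```sh
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if the given bsnmpd config
# contains an uncommented line that loads the hostres module.
# $1 is the config file; it defaults to the path used in this comment.
hostres_configured() {
    conf="${1:-/var/etc/snmpd.conf}"
    # Anchoring at the start of the line skips lines commented out with '#'.
    grep -Eq '^[[:space:]]*begemotSnmpdModulePath\."hostres"' "$conf"
}

# Example:
#   hostres_configured && echo "hostres will be loaded" || echo "hostres is not configured"
```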

Marcel Janicki wrote:

I've re-enabled the SNMP module Host Resources and the kernel didn't crash.

Marcel, did you actually check that you are now able to poll the Host MIB?

Actions #38

Updated by Guillaume Leroy almost 9 years ago

OK, I've just run a manual test, as I don't know how to force loading the module with pfSense.
I stopped the bsnmpd instance started by pfSense and manually started another instance loading the system /etc/snmpd.config file with the Host Resources module enabled:

begemotSnmpdModulePath."hostres" = "/usr/lib/snmp_hostres.so" 

And the system crashed immediately.
So the problem is still there in 2.2.5 with APU devices.
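
For anyone who wants to repeat this kind of manual test, a minimal sketch of preparing the config is given below. It only builds a copy of an existing config with the hostres line from above added; it does not start bsnmpd. The function name `make_hostres_test_conf` and both file arguments are hypothetical, and on an affected system starting a daemon with the resulting file is expected to panic the kernel.

```sh
#!/bin/sh
# Hypothetical helper: copy an existing bsnmpd config to a new file and make
# sure the copy loads hostres. The original file is left untouched.
# $1 = source config, $2 = destination for the test copy.
# Prints the path of the copy on success.
make_hostres_test_conf() {
    src="$1"
    dst="$2"
    cp "$src" "$dst" || return 1
    # Append the module line only if no uncommented hostres line exists yet.
    if ! grep -Eq '^[[:space:]]*begemotSnmpdModulePath\."hostres"' "$dst"; then
        printf 'begemotSnmpdModulePath."hostres" = "/usr/lib/snmp_hostres.so"\n' >> "$dst"
    fi
    echo "$dst"
}
```

Working on a copy keeps the generated config intact, so the system boots normally again after a crash.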

Actions #39

Updated by Chris Buechler almost 9 years ago

Guillaume: could you try that same test on latest 2.3 and report back please?

Marcel: stock 2.2.1 and newer won't crash because hostres is skipped on APUs automatically, so you have to either remove that code or manually enable it.

Actions #40

Updated by Jim Pingle almost 9 years ago

I tested this on 2.3 today. I removed the APU check and confirmed hostres was present in the config. snmp started up and I was able to perform a full snmpwalk against it. No problems at all. Rebooted the unit and it still was running OK. Can wait for additional confirmation but it looks to me like this has been solved in FreeBSD 10.2.

The attached patch can be applied to remove the APU-specific check in services.inc so that hostres will be enabled on APU for those who want to test on 2.3.

I was able to easily reproduce the panic on 2.2.x on the same hardware (see my earlier notes)

Actions #41

Updated by Guillaume Leroy almost 9 years ago

Good news...
However, on my side, I have not been able to complete the upgrade to 2.3-ALPHA: a 45-minute upgrade process left me with a broken system and a lot of PHP errors (see attached if interested). :-(

Actions #42

Updated by Chris Buechler almost 9 years ago

  • Status changed from Feedback to Resolved
  • Affected Version changed from 2.2 to 2.2.x

Guillaume: start a thread on the 2.3 board on the forum and we can review that.

Since JimP can confirm, we're good here.

Actions #43

Updated by Marcel Janicki almost 9 years ago

Guillaume, I didn't realize that the hostres module is still skipped on APUs.
So I removed the APU check in services.inc, and the kernel crashed instantly (2.2.5, FreeBSD 10.1).
Good to hear that it's fixed with 10.2.
