Project

General

Profile

Bug #9058

crash in l2tp retransmit

Added by Bianco Veigel 8 months ago. Updated 22 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
L2TP
Target version:
Start date:
10/23/2018
Due date:
% Done:

0%

Estimated time:
Affected Version:
All
Affected Architecture:
amd64

Description

I'm using a Multilink L2TP WAN. After a fresh reinstall of 2.4.4 and completely new config (no import) it crashes regularly. The crashdump always has the same stacktrace showing a page fault while copying a packet for retransmitting after a timeout.

Fatal trap 12: page fault while in kernel mode
...
current process        = 13 (ng_queue1)
...
Tracing pid 13 tid 100022 td 0xfffff80003551620
m_copypacket() at m_copypacket+0x16/frame 0xfffffe00797d3ab0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x164/frame 0xfffffe00797d3af0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe00797d3b70
ngthread() at ngthread+0x10a/frame 0xfffffe00797d3bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe00797d3bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00797d3bf0
info.0 (539 Bytes) info.0 Bianco Veigel, 10/23/2018 06:38 AM
textdump.tar.0 (101 KB) textdump.tar.0 Bianco Veigel, 10/23/2018 06:38 AM

History

#1 Updated by Jim Thompson 8 months ago

  • Priority changed from High to Normal

#2 Updated by Jim Pingle 8 months ago

I saw a crash with a backtrace like that once on a test VM with an L2TP WAN but only one time, not repeatedly, so I chalked it up to a random power or other VM-specific event.

Approximately how often is "regularly"? Have you noticed any pattern to when it happens? Is it after a certain amount of traffic, certain amount of time, or randomly?

#3 Updated by Bianco Veigel 8 months ago

Right now it happens at least once a day, but at random times. I'll check if the amount of traffic might be related.

#4 Updated by Bianco Veigel 8 months ago

After a few more crashes with different error messages, I ran a memory test, which showed errors. RAM is replaced and I will report back if the error occurs again, but right now I think this can be closed.

#5 Updated by Jim Pingle 8 months ago

  • Status changed from New to Feedback

OK, we'll wait for some more feedback here to see what happens.

#6 Updated by Bianco Veigel 8 months ago

Thanks for waiting. My pfsense crashed two times in the last two days. From the monitoring (telegraf, 300s interval) it seems not related to
  • Time (20:00 & 03:10)
  • Uptime (56h & 31h)
  • Packout count (10/9M & 7/5M)
  • Byte count (4.1/2.5GB & 3.1/1.1GB)

The byte/s and packet/s also have much higher peeks between the crashes.

#7 Updated by Jim Pingle 8 months ago

  • Status changed from Feedback to New

OK, and is the backtrace in the crash report always the same?

I have not seen a recurrence of this on my local setup but it does not transmit much of anything over L2TP.

#8 Updated by Bianco Veigel 8 months ago

yes it's always the same (except the hex addresses)

Tracing pid 13 tid 100022 td 0xfffff800039b4000
m_copypacket() at m_copypacket+0x16/frame 0xfffffe01195f7ab0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x164/frame 0xfffffe01195f7af0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe01195f7b70
ngthread() at ngthread+0x10a/frame 0xfffffe01195f7bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe01195f7bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01195f7bf0

#9 Updated by Bianco Veigel 8 months ago

This seems to be an upstream bug in FreeBSD mpd5 - today I got the same crash on my L2TP Server (FreeBSD 11.2-RELEASE-p2 / mpd5 5.8):

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address      = 0x1c
fault code         = supervisor read data, page not present
instruction pointer        = 0x20:0xffffffff80b79f06
stack pointer              = 0x28:0xfffffe0094ab8a60
frame pointer              = 0x28:0xfffffe0094ab8aa0
code segment               = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process            = 619 (ng_queue2)
trap number                = 12
panic: page fault
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff80b3d567 at kdb_backtrace+0x67
#1 0xffffffff80af6b07 at vpanic+0x177
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#4 0xffffffff80f78029 at trap_pfault+0x49
#5 0xffffffff80f777f7 at trap+0x2c7
#6 0xffffffff80f57dac at calltrap+0x8
#7 0xffffffff82835bf4 at ng_l2tp_seq_rack_timeout+0x164
#8 0xffffffff8282868d at ng_apply_item+0xcd
#9 0xffffffff8282b084 at ngthread+0x1a4
#10 0xffffffff80aba083 at fork_exit+0x83
#11 0xffffffff80f58cce at fork_trampoline+0xe

#10 Updated by Jim Pingle 22 days ago

  • Target version set to 2.5.0
  • Affected Version changed from 2.4.4 to All

This is still happening on FreeBSD 12/pfSense 2.5.0. Same backtrace:

db:0:kdb.enter.default>  show pcpu
cpuid        = 1
dynamic pcpu = 0xfffffe007db41c80
curthread    = 0xfffff80004166000: pid 13 tid 100023 "ng_queue1" 
curpcb       = 0xfffffe000c798cc0
fpcurthread  = none
idlethread   = 0xfffff8000411d580: tid 100004 "idle: cpu1" 
curpmap      = 0xffffffff82c884c8
tssp         = 0xffffffff82db3988
commontssp   = 0xffffffff82db3988
rsp0         = 0xfffffe000c798cc0
gs32p        = 0xffffffff82dba5c0
ldt          = 0xffffffff82dba600
tss          = 0xffffffff82dba5f0
curvnet      = 0xfffff8000406d540
db:0:kdb.enter.default>  bt
Tracing pid 13 tid 100023 td 0xfffff80004166000
m_copypacket() at m_copypacket+0x16/frame 0xfffffe000c798aa0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x1aa/frame 0xfffffe000c798ae0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe000c798b60
ngthread() at ngthread+0x12a/frame 0xfffffe000c798bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe000c798bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000c798bf0

I have a few saved crash reports if needed. Seems to happen once every couple weeks for me, so it's difficult to know if it's solved.

Also available in: Atom PDF