Project

General

Profile

Bug #9058

crash in l2tp retransmit

Added by Bianco Veigel 12 months ago. Updated 14 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
L2TP
Target version:
Start date:
10/23/2018
Due date:
% Done:

0%

Estimated time:
Affected Version:
All
Affected Architecture:
amd64

Description

I'm using a Multilink L2TP WAN. After a fresh reinstall of 2.4.4 and completely new config (no import) it crashes regularly. The crashdump always has the same stacktrace showing a page fault while copying a packet for retransmitting after a timeout.

Fatal trap 12: page fault while in kernel mode
...
current process        = 13 (ng_queue1)
...
Tracing pid 13 tid 100022 td 0xfffff80003551620
m_copypacket() at m_copypacket+0x16/frame 0xfffffe00797d3ab0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x164/frame 0xfffffe00797d3af0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe00797d3b70
ngthread() at ngthread+0x10a/frame 0xfffffe00797d3bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe00797d3bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00797d3bf0
info.0 (539 Bytes) info.0 Bianco Veigel, 10/23/2018 06:38 AM
textdump.tar.0 (101 KB) textdump.tar.0 Bianco Veigel, 10/23/2018 06:38 AM

History

#1 Updated by Jim Thompson 12 months ago

  • Priority changed from High to Normal

#2 Updated by Jim Pingle 12 months ago

I saw a crash with a backtrace like that once on a test VM with an L2TP WAN but only one time, not repeatedly, so I chalked it up to a random power or other VM-specific event.

Approximately how often is "regularly"? Have you noticed any pattern to when it happens? Is it after a certain amount of traffic, certain amount of time, or randomly?

#3 Updated by Bianco Veigel 12 months ago

Right now it happens at least once a day, but at random times. I'll check if the amount of traffic might be related.

#4 Updated by Bianco Veigel 12 months ago

After a few more crashes with different error messages, I ran a memory test, which showed errors. RAM is replaced and I will report back if the error occurs again, but right now I think this can be closed.

#5 Updated by Jim Pingle 12 months ago

  • Status changed from New to Feedback

OK, we'll wait for some more feedback here to see what happens.

#6 Updated by Bianco Veigel 12 months ago

Thanks for waiting. My pfsense crashed two times in the last two days. From the monitoring (telegraf, 300s interval) it seems not related to
  • Time (20:00 & 03:10)
  • Uptime (56h & 31h)
  • Packout count (10/9M & 7/5M)
  • Byte count (4.1/2.5GB & 3.1/1.1GB)

The byte/s and packet/s also have much higher peeks between the crashes.

#7 Updated by Jim Pingle 12 months ago

  • Status changed from Feedback to New

OK, and is the backtrace in the crash report always the same?

I have not seen a recurrence of this on my local setup but it does not transmit much of anything over L2TP.

#8 Updated by Bianco Veigel 12 months ago

yes it's always the same (except the hex addresses)

Tracing pid 13 tid 100022 td 0xfffff800039b4000
m_copypacket() at m_copypacket+0x16/frame 0xfffffe01195f7ab0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x164/frame 0xfffffe01195f7af0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe01195f7b70
ngthread() at ngthread+0x10a/frame 0xfffffe01195f7bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe01195f7bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01195f7bf0

#9 Updated by Bianco Veigel 12 months ago

This seems to be an upstream bug in FreeBSD mpd5 - today I got the same crash on my L2TP Server (FreeBSD 11.2-RELEASE-p2 / mpd5 5.8):

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address      = 0x1c
fault code         = supervisor read data, page not present
instruction pointer        = 0x20:0xffffffff80b79f06
stack pointer              = 0x28:0xfffffe0094ab8a60
frame pointer              = 0x28:0xfffffe0094ab8aa0
code segment               = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process            = 619 (ng_queue2)
trap number                = 12
panic: page fault
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff80b3d567 at kdb_backtrace+0x67
#1 0xffffffff80af6b07 at vpanic+0x177
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#4 0xffffffff80f78029 at trap_pfault+0x49
#5 0xffffffff80f777f7 at trap+0x2c7
#6 0xffffffff80f57dac at calltrap+0x8
#7 0xffffffff82835bf4 at ng_l2tp_seq_rack_timeout+0x164
#8 0xffffffff8282868d at ng_apply_item+0xcd
#9 0xffffffff8282b084 at ngthread+0x1a4
#10 0xffffffff80aba083 at fork_exit+0x83
#11 0xffffffff80f58cce at fork_trampoline+0xe

#10 Updated by Jim Pingle 5 months ago

  • Target version set to 2.5.0
  • Affected Version changed from 2.4.4 to All

This is still happening on FreeBSD 12/pfSense 2.5.0. Same backtrace:

db:0:kdb.enter.default>  show pcpu
cpuid        = 1
dynamic pcpu = 0xfffffe007db41c80
curthread    = 0xfffff80004166000: pid 13 tid 100023 "ng_queue1" 
curpcb       = 0xfffffe000c798cc0
fpcurthread  = none
idlethread   = 0xfffff8000411d580: tid 100004 "idle: cpu1" 
curpmap      = 0xffffffff82c884c8
tssp         = 0xffffffff82db3988
commontssp   = 0xffffffff82db3988
rsp0         = 0xfffffe000c798cc0
gs32p        = 0xffffffff82dba5c0
ldt          = 0xffffffff82dba600
tss          = 0xffffffff82dba5f0
curvnet      = 0xfffff8000406d540
db:0:kdb.enter.default>  bt
Tracing pid 13 tid 100023 td 0xfffff80004166000
m_copypacket() at m_copypacket+0x16/frame 0xfffffe000c798aa0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x1aa/frame 0xfffffe000c798ae0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe000c798b60
ngthread() at ngthread+0x12a/frame 0xfffffe000c798bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe000c798bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000c798bf0

I have a few saved crash reports if needed. Seems to happen once every couple weeks for me, so it's difficult to know if it's solved.

Also available in: Atom PDF