
Bug #9058

crash in l2tp retransmit

Added by Bianco Veigel 5 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assignee: -
Category: L2TP
Target version: -
Start date: 10/23/2018
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.4.4
Affected Architecture: amd64

Description

I'm using a Multilink L2TP WAN. After a fresh reinstall of 2.4.4 with a completely new config (no import), it crashes regularly. The crash dump always has the same stack trace, showing a page fault while copying a packet for retransmission after a timeout.

Fatal trap 12: page fault while in kernel mode
...
current process        = 13 (ng_queue1)
...
Tracing pid 13 tid 100022 td 0xfffff80003551620
m_copypacket() at m_copypacket+0x16/frame 0xfffffe00797d3ab0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x164/frame 0xfffffe00797d3af0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe00797d3b70
ngthread() at ngthread+0x10a/frame 0xfffffe00797d3bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe00797d3bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00797d3bf0
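
For context, the faulting frame is FreeBSD's netgraph L2TP retransmit handler. Below is a minimal sketch of the suspected path, paraphrased from sys/netgraph/ng_l2tp.c (simplified, not the verbatim upstream source): ng_l2tp_seq_rack_timeout() copies the oldest unacknowledged packet in the transmit window with m_copypacket() before resending it. If an acknowledgement drains the window while the timeout is being delivered, that window slot would be NULL and the copy would fault right at the top of m_copypacket(), which would match m_copypacket+0x16 in the trace.

/*
 * Simplified paraphrase of the retransmit timeout path in
 * sys/netgraph/ng_l2tp.c (not the verbatim upstream source).
 */
static void
ng_l2tp_seq_rack_timeout(node_p node, hook_p hook, void *arg1, int arg2)
{
        const priv_p priv = NG_NODE_PRIVATE(node);
        struct l2tp_seq *const seq = &priv->seq;
        struct mbuf *m;

        /* ... retry accounting elided ... */

        /*
         * Copy the oldest unacknowledged packet for retransmission.
         * m_copypacket() dereferences the mbuf header immediately, so
         * if seq->xwin[0] is NULL (e.g. the window was drained by an
         * ACK racing with this callout) this is where the page fault
         * in the backtrace would occur.
         */
        m = m_copypacket(seq->xwin[0], M_NOWAIT);
        if (m == NULL) {
                priv->stats.memoryFailures++;
                return;
        }

        /* ... re-stamp sequence numbers and resend the copy ... */
}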
info.0 (539 Bytes) Bianco Veigel, 10/23/2018 06:38 AM
textdump.tar.0 (101 KB) Bianco Veigel, 10/23/2018 06:38 AM

History

#1 Updated by Jim Thompson 5 months ago

  • Priority changed from High to Normal

#2 Updated by Jim Pingle 5 months ago

I saw a crash with a backtrace like that once on a test VM with an L2TP WAN, but only one time, not repeatedly, so I chalked it up to a random power event or some other VM-specific issue.

Approximately how often is "regularly"? Have you noticed any pattern to when it happens? Is it after a certain amount of traffic, certain amount of time, or randomly?

#3 Updated by Bianco Veigel 5 months ago

Right now it happens at least once a day, but at random times. I'll check if the amount of traffic might be related.

#4 Updated by Bianco Veigel 5 months ago

After a few more crashes with different error messages, I ran a memory test, which showed errors. The RAM has been replaced, and I will report back if the error occurs again; for now I think this can be closed.

#5 Updated by Jim Pingle 5 months ago

  • Status changed from New to Feedback

OK, we'll wait for some more feedback here to see what happens.

#6 Updated by Bianco Veigel 5 months ago

Thanks for waiting. My pfSense crashed twice in the last two days. Judging from the monitoring (telegraf, 300 s interval), it does not seem related to:
  • Time of day (20:00 & 03:10)
  • Uptime (56 h & 31 h)
  • Packet count (10/9M & 7/5M)
  • Byte count (4.1/2.5 GB & 3.1/1.1 GB)

The bytes/s and packets/s also show much higher peaks between the crashes.

#7 Updated by Jim Pingle 5 months ago

  • Status changed from Feedback to New

OK, and is the backtrace in the crash report always the same?

I have not seen a recurrence of this on my local setup, but it does not transmit much of anything over L2TP.

#8 Updated by Bianco Veigel 5 months ago

Yes, it's always the same (except for the hex addresses):

Tracing pid 13 tid 100022 td 0xfffff800039b4000
m_copypacket() at m_copypacket+0x16/frame 0xfffffe01195f7ab0
ng_l2tp_seq_rack_timeout() at ng_l2tp_seq_rack_timeout+0x164/frame 0xfffffe01195f7af0
ng_apply_item() at ng_apply_item+0x8f/frame 0xfffffe01195f7b70
ngthread() at ngthread+0x10a/frame 0xfffffe01195f7bb0
fork_exit() at fork_exit+0x83/frame 0xfffffe01195f7bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01195f7bf0

#9 Updated by Bianco Veigel 5 months ago

This seems to be an upstream bug in FreeBSD/mpd5: today I got the same crash on my L2TP server (FreeBSD 11.2-RELEASE-p2 / mpd5 5.8):

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address      = 0x1c
fault code         = supervisor read data, page not present
instruction pointer        = 0x20:0xffffffff80b79f06
stack pointer              = 0x28:0xfffffe0094ab8a60
frame pointer              = 0x28:0xfffffe0094ab8aa0
code segment               = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = interrupt enabled, resume, IOPL = 0
current process            = 619 (ng_queue2)
trap number                = 12
panic: page fault
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff80b3d567 at kdb_backtrace+0x67
#1 0xffffffff80af6b07 at vpanic+0x177
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#4 0xffffffff80f78029 at trap_pfault+0x49
#5 0xffffffff80f777f7 at trap+0x2c7
#6 0xffffffff80f57dac at calltrap+0x8
#7 0xffffffff82835bf4 at ng_l2tp_seq_rack_timeout+0x164
#8 0xffffffff8282868d at ng_apply_item+0xcd
#9 0xffffffff8282b084 at ngthread+0x1a4
#10 0xffffffff80aba083 at fork_exit+0x83
#11 0xffffffff80f58cce at fork_trampoline+0xe
