Bug #8499

closed

IPv6 fragment logging causes panic in some circumstances

Added by Steve Wheeler almost 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category:
-
Target version:
Start date: 05/07/2018
Due date:
% Done: 100%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version: 2.4.3
Affected Architecture: All

Description

From customer ticket #4934.

The system crashes repeatedly with near-identical backtraces:

db:0:kdb.enter.default> bt
Tracing pid 12 tid 100074 td 0xfffff800036f55c0
strlen() at strlen+0x1f/frame 0xfffffe0059be7c90
kvprintf() at kvprintf+0x9dc/frame 0xfffffe0059be7d90
vlog() at vlog+0x9b/frame 0xfffffe0059be7e70
log() at log+0x3f/frame 0xfffffe0059be7ed0
ip6_forward() at ip6_forward+0xc3/frame 0xfffffe0059be8020
pf_refragment6() at pf_refragment6+0x17a/frame 0xfffffe0059be80e0
pf_test6() at pf_test6+0x14ba/frame 0xfffffe0059be8360
pf_test6() at pf_test6+0x1edb/frame 0xfffffe0059be85e0
pf_check6_in() at pf_check6_in+0x30/frame 0xfffffe0059be8600
pfil_run_hooks() at pfil_run_hooks+0x83/frame 0xfffffe0059be8690
ip6_input() at ip6_input+0xc75/frame 0xfffffe0059be8770
netisr_dispatch_src() at netisr_dispatch_src+0xa0/frame 0xfffffe0059be87c0
ether_demux() at ether_demux+0x16d/frame 0xfffffe0059be87f0
ether_nh_input() at ether_nh_input+0x337/frame 0xfffffe0059be8850
netisr_dispatch_src() at netisr_dispatch_src+0xa0/frame 0xfffffe0059be88a0
ether_input() at ether_input+0x26/frame 0xfffffe0059be88c0
vlan_input() at vlan_input+0x1f0/frame 0xfffffe0059be8940
ether_demux() at ether_demux+0x156/frame 0xfffffe0059be8970
ether_nh_input() at ether_nh_input+0x337/frame 0xfffffe0059be89d0
netisr_dispatch_src() at netisr_dispatch_src+0xa0/frame 0xfffffe0059be8a20
ether_input() at ether_input+0x26/frame 0xfffffe0059be8a40
igb_rxeof() at igb_rxeof+0x6ac/frame 0xfffffe0059be8ad0
igb_msix_que() at igb_msix_que+0x114/frame 0xfffffe0059be8b20
intr_event_execute_handlers() at intr_event_execute_handlers+0xec/frame 0xfffffe0059be8b60
ithread_loop() at ithread_loop+0xd6/frame 0xfffffe0059be8bb0
fork_exit() at fork_exit+0x85/frame 0xfffffe0059be8bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0059be8bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db:0:kdb.enter.default> ps

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x60
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80ccc93f
stack pointer   = 0x28:0xfffffe0059be7c90
frame pointer   = 0x28:0xfffffe0059be7c90
code segment    = base 0x0, limit 0xfffff, type 0x1b
   = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process = 12 (irq256: igb0:que 0)
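
Since the crashes are described as having near-identical backtraces, a small script can reduce each DDB trace to its chain of function names so that reports can be compared without the frame addresses getting in the way. This is only a rough sketch: the names `FRAME_RE`, `call_chain` and `same_crash` are hypothetical helpers, not part of pfSense or FreeBSD, and the line format is assumed from the trace above.

```python
import re

# Matches DDB backtrace lines of the form seen in this report, e.g.
#   ip6_forward() at ip6_forward+0xc3/frame 0xfffffe0059be8020
# (format assumed from the trace above; other DDB output may differ)
FRAME_RE = re.compile(r"^(\w+)\(\) at (\w+)\+0x([0-9a-f]+)/frame 0x[0-9a-f]+$")


def call_chain(trace_text):
    """Return [(function, offset), ...] for each frame line, innermost first."""
    chain = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            chain.append((m.group(2), int(m.group(3), 16)))
    return chain


def same_crash(trace_a, trace_b):
    """Treat two traces as the same crash if the function names match in order."""
    names_a = [name for name, _ in call_chain(trace_a)]
    names_b = [name for name, _ in call_chain(trace_b)]
    return bool(names_a) and names_a == names_b


# First frames of the trace from this report.
sample = """\
strlen() at strlen+0x1f/frame 0xfffffe0059be7c90
kvprintf() at kvprintf+0x9dc/frame 0xfffffe0059be7d90
vlog() at vlog+0x9b/frame 0xfffffe0059be7e70
log() at log+0x3f/frame 0xfffffe0059be7ed0
ip6_forward() at ip6_forward+0xc3/frame 0xfffffe0059be8020
pf_refragment6() at pf_refragment6+0x17a/frame 0xfffffe0059be80e0
"""

# Print the chain from the faulting function outward to its callers.
print(" <- ".join(name for name, _ in call_chain(sample)))
```

Comparing only the function names ignores stack addresses, which change from crash to crash. Offsets are kept in `call_chain` in case a stricter match is wanted.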

This looks very similar to https://redmine.pfsense.org/issues/5428 except here no bridges are configured.

A PPPoE interface is configured with IPv6:

        <opt10>
            <descr><![CDATA[xxxxx]]></descr>
            <if>pppoe0</if>
            <spoofmac></spoofmac>
            <enable></enable>
            <ipaddr>pppoe</ipaddr>
            <ipaddrv6>dhcp6</ipaddrv6>
            <dhcp6-duid></dhcp6-duid>
            <dhcp6-ia-pd-len>0</dhcp6-ia-pd-len>
            <dhcp6usev4iface></dhcp6usev4iface>
            <dhcp6cvpt>bk</dhcp6cvpt>
            <adv_dhcp6_prefix_selected_interface>wan</adv_dhcp6_prefix_selected_interface>
        </opt10>

IPv6 is currently not functional on that interface. In any case, the PPPoE link runs on igb2.

The crash reports have all been on igb0, which is configured with a number of VLANs, some of which have IPv6 configured.

Also see: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220611#c4

Actions #1

Updated by Anonymous over 5 years ago

  • Assignee set to Luiz Souza
Actions #2

Updated by Luiz Souza over 5 years ago

  • Status changed from New to In Progress
Actions #3

Updated by Luiz Souza over 5 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 0 to 100

I committed our old fix for now. Once the kp@ fix on the PR is tested, I'll apply his fix.

Please check with the next snapshot.

Actions #4

Updated by Steve Wheeler over 5 years ago

I've never been able to replicate that locally. It's going to be very difficult to test.

Actions #5

Updated by Constantine Kormashev over 5 years ago

This looks like a PPPoE-related issue. I do not see a problem with fragmented IPv6 and logging when forwarding IPv6 over Ethernet.
But we need more tests with PPPoE, because the issue is hard to reproduce.

Actions #6

Updated by Jim Pingle over 5 years ago

  • Target version changed from 2.4.4 to 2.4.4-GS

Still waiting on someone who can reproduce it to confirm whether it still happens. It may be fixed, but we won't know for certain until we get confirmation from a user who hit the original bug.

Actions #7

Updated by Anonymous over 5 years ago

  • Target version changed from 2.4.4-GS to 2.4.4-p1
Actions #8

Updated by Anonymous over 5 years ago

  • Target version changed from 2.4.4-p1 to 2.4.4-GS
Actions #9

Updated by Luiz Souza over 5 years ago

  • Target version changed from 2.4.4-GS to 2.4.4-p1
Actions #10

Updated by Renato Botelho over 5 years ago

  • Status changed from Feedback to Resolved

It should be resolved now, but it's hard to reproduce. We can revisit if the bug shows up again.
