Bug #7940
Disabling LAGG causes system reboot on 2.4
Status: closed
% Done: 100%
Description
It looks very similar to this - https://redmine.pfsense.org/issues/7119
When the lagg interface goes down:
<6>carp: 5@lagg0_vlan56: MASTER -> INIT (hardware interface down)
<6>carp: demoted by 240 to 240 (interface down)
<7>ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan56: 3
<7>ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan56: 3
<6>carp: demoted by -240 to 0 (vhid removed)
<6>lagg0_vlan56: promiscuous mode disabled
<6>vlan15: changing name to 'lagg0_vlan56'
<6>lagg0_vlan56: promiscuous mode enabled
<6>carp: 5@lagg0_vlan56: INIT -> BACKUP (initialization complete)
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x200
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff80d81f14
stack pointer = 0x28:0xfffffe044dbf98b0
frame pointer = 0x28:0xfffffe044dbf9900
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (0))
FreeBSD 11.1-RELEASE-p1 #83 r313908+d77c47fe50c(RELENG_2_4_0): Tue Oct 10 06:48:42 CDT 2017
root@buildbot2.netgate.com:/builder/ce-240/tmp/obj/builder/ce-240/tmp/FreeBSD-src/sys/pfSense
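As a reading aid for the trap output above, here is a minimal sketch that pulls the `key = value` fields out of the message and checks whether the fault address falls in the first page of memory. The helper names `parse_trap` and `looks_like_null_deref` are hypothetical, and the one-page threshold is an assumed heuristic: a low fault address such as 0x200 is usually, but not always, a sign that a structure field was written through a NULL pointer.

```python
import re

# Trap lines copied from the report above. parse_trap and
# looks_like_null_deref are hypothetical helpers, not pfSense/FreeBSD tools.
TRAP_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x200
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff80d81f14
current process = 12 (swi4: clock (0))
"""

def parse_trap(text):
    """Collect 'key = value' pairs from a FreeBSD trap message."""
    fields = {}
    for line in text.splitlines():
        # A line such as 'cpuid = 0; apic id = 00' holds two pairs.
        for part in line.split(";"):
            key, sep, value = part.partition("=")
            if sep and key.strip():
                fields[key.strip()] = value.strip()
    return fields

def looks_like_null_deref(fields, page_size=4096):
    """Assumed heuristic: a fault address inside the first page suggests
    a field access through a NULL struct pointer."""
    addr = int(fields["fault virtual address"], 16)
    return addr < page_size

fields = parse_trap(TRAP_TEXT)
print(fields["fault virtual address"], looks_like_null_deref(fields))
```

The check only narrows down the kind of fault. The backtrace is still needed to identify which pointer was NULL.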
Ticket for reference - https://customercare.netgate.com/requests/show/index/id/30787
Updated by Jim Pingle about 7 years ago
- Assignee set to Luiz Souza
- Target version set to 2.4.1
That does look almost identical to #7119; we should check whether those patches need any adjustments for FreeBSD 11.1.
Updated by Jim Pingle about 7 years ago
- Target version changed from 2.4.1 to 2.4.2
Updated by Luiz Souza about 7 years ago
Could you please post the backtrace of this crash (or upload the crash dump text file)?
I can't reproduce this crash on my test devices. Are there any specific details that I could try here?
Updated by Luiz Souza about 7 years ago
- Status changed from New to Confirmed
Ok, I found a way to reproduce this.
It is not really related to lagg; it is a race that happens when an interface is detached while there are pending link-layer requests.
It also happens with other interfaces (I can reproduce the same crash with VLANs at least).
I will continue to investigate.
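The class of race described here can be illustrated with a small, deterministic user-space model. This is only a sketch under stated assumptions: it is not the kernel code and not the fix that was committed, and every name in it (`Interface`, `Request`, `detach_unsafe`, `detach_safe`, `run_timer`) is hypothetical. It shows a deferred request that still references an interface after detach, and one common remedy, cancelling pending requests before the interface state is torn down.

```python
class Interface:
    """Toy stand-in for a network interface (hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.attached = True
        self.pending = []  # deferred link-layer requests still queued

class Request:
    """A deferred request that keeps a reference to its interface."""
    def __init__(self, ifp):
        self.ifp = ifp
        self.cancelled = False

def queue_request(ifp):
    req = Request(ifp)
    ifp.pending.append(req)
    return req

def detach_unsafe(ifp):
    # Tears down the interface but leaves queued requests pointing at it.
    ifp.attached = False

def detach_safe(ifp):
    # Cancel and drop every pending request before tearing down.
    for req in ifp.pending:
        req.cancelled = True
    ifp.pending.clear()
    ifp.attached = False

def run_timer(req):
    """Simulate the deferred handler firing later."""
    if req.cancelled:
        return "skipped"
    if not req.ifp.attached:
        # In a kernel this is where stale state would be dereferenced.
        return "use-after-detach"
    return "ok"

unsafe_if = Interface("lagg0_vlan56")
unsafe_req = queue_request(unsafe_if)
detach_unsafe(unsafe_if)
print(run_timer(unsafe_req))  # use-after-detach

safe_if = Interface("lagg0_vlan56")
safe_req = queue_request(safe_if)
detach_safe(safe_if)
print(run_timer(safe_req))    # skipped
```

In the model the ordering is forced, so the bad outcome is reproducible every time. In a real kernel the handler and the detach run concurrently, so the crash depends on timing and is hard to trigger on demand.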
Updated by Luiz Souza about 7 years ago
- Status changed from Confirmed to Feedback
- % Done changed from 0 to 100
Fixed.
The fix will be available on the next snapshot.
Testing this issue is non-trivial, but I would still appreciate some testing.
Thanks!
Updated by Anonymous about 7 years ago
Could not replicate on 2.4.2.a.20171103.1355, though not tested on HA.
Updated by Luiz Souza about 7 years ago
- Status changed from Feedback to Resolved